SBM-RGBD dataset

Updated on June 16th, 2017

The SBM-RGBD dataset has been created in order to evaluate and compare scene background modelling methods for moving object detection on RGBD videos. It provides all facilities (data, ground truths, and evaluation scripts) for the SBM-RGBD Challenge, organized in conjunction with the Workshop on Background Learning for Detection and Tracking from RGBD Videos (RGBD2017).
The dataset and the results of the SBM-RGBD Challenge will remain available after the competition, as a reference for future methods.

Video categories
Ground truths
Performance evaluation
How to cite the dataset
Dataset organizers


The SBM-RGBD dataset provides a diverse set of groundtruthed synchronized color and depth sequences acquired by the Microsoft Kinect. The dataset consists of 33 videos (~15000 frames) representative of typical indoor visual data captured in video surveillance and smart environment scenarios, selected to cover a wide range of scene background modeling challenges for moving object detection. The videos come from our personal collections as well as from the following public datasets:

  1. GSM dataset, described in: G. Moyà-Alcover, A. Elgammal, A. Jaume-i-Capó, J. Varona, Modeling depth for nonparametric foreground segmentation using RGBD devices, Pattern Recognition Letters, in press, 2017
  2. MULTIVISION, described in E.J. Fernandez-Sanchez, J. Diaz, E. Ros, Background Subtraction Based on Color and Depth Using Active Sensors, Sensors 13, 2013, Pages 8895-8915.
  3. Princeton Tracking Benchmark, described in S. Song and J. Xiao. Tracking Revisited using RGBD Camera: Unified Benchmark and Baselines, Proceedings of 14th IEEE International Conference on Computer Vision (ICCV2013).
  4. RGB-D object detection dataset, described in M. Camplani, L. Salgado, Background Foreground segmentation with RGB-D Kinect data: an efficient combination of classifiers, Journal of Visual Communication and Image Representation 25(1), 2014, Pages 122-136.
  5. UR Fall Detection Dataset, described in B. Kwolek, M. Kepski, Human fall detection on embedded platform using depth maps and wireless accelerometer, Computer Methods and Programs in Biomedicine 117(3), 2014, Pages 489-501.
The videos have 640x480 spatial resolution; depths are recorded at either 16 or 8 bits. The length of the videos varies from 70 to 1400 frames.

Video categories

The RGBD videos span 7 categories, selected to include diverse scene background modelling challenges for moving object detection related only to the RGB channels (RGB), only to the depth channel (D), or related to all the channels (RGB+D):
  1. Illumination Changes (RGB): Videos containing strong and mild illumination changes. Sequences: ChairBox, genSeq1, Ls_ds, TimeOfDay_ds. Updated on June 9th, 2017, to fix problems with ROI and GT images of the Ls_ds sequence. If you already downloaded this category, you may want to download only the new ROI and GT images for sequence Ls_ds.
  2. Color Camouflage (RGB): Videos including foreground objects that are very close in color to the background. Sequences: Cespatx_ds, colorCam1y, colorCam2, Hallway. Updated on June 9th, 2017.
  3. Depth Camouflage (D): Videos including foreground objects very close in depth to the background. Sequences: DCamSeq1, DCamSeq2, Despatx_ds, Wall.
  4. Intermittent Motion (RGB+D): Videos with scenarios known for causing “ghosting” artifacts in the detected motion, i.e., abandoned foreground objects or removed foreground objects. Foreground objects should be detected even if they stop moving (abandoned object) or if they were initially stationary and then start moving (removed object). Sequences: abandoned1, abandoned2, movedBackground1, movedBackground2, Shelves, Sleeping_ds.
  5. Out of Sensor Range (D): Videos including foreground or background objects that are too close to or too far from the sensor. Sequences: MultiPeople1, MultiPeople2, TopViewLab1, TopViewLab2, TopViewLab3. Updated on June 9th, 2017.
  6. Shadows (RGB+D): Videos showing shadows caused by foreground objects. These can be visible-light shadows in the RGB channels or IR shadows in the depth channel. Sequences: fall01cam1, genSeq2, shadows1, shadows2, Shadows_ds. Updated on June 16th, 2017, to scale and register depth images of the fall01cam1 sequence. If you already downloaded this category, you may want to download only the new fall01cam1 depth images, GTs, and ROI.
  7. Bootstrapping (RGB+D): Videos including foreground objects in all their frames. Sequences: adl24cam0, bear_front, Bootstrapping_ds, fall01cam0, fall20cam0. Updated on June 16th, 2017, to scale and register depth images of the adl24cam0, fall01cam0, and fall20cam0 sequences and to fix problems with some ROI and GT images.

Details on each video are available.

Ground truths

To enable a precise quantitative comparison and ranking of various algorithms for moving object detection from RGBD videos, all the videos come with pixel-wise ground truth foreground segmentations. A foreground region is intended as anything that does not belong to the background, including abandoned objects and still persons, but excluding light reflections, shadows, etc. The ground truth images, some of which were created using the GroundTruther software, contain 4 labels.

Only a subset of the available ground truths is made publicly available for testing. However, for the SBM-RGBD Challenge, the evaluation is made across all the ground truths for all the videos. We hope this will reduce the possibility of overtuning method parameters.

Performance evaluation

The SBM-RGBD dataset also comes with tools to compute performance metrics for moving object detection from RGBD videos, and thus identify algorithms that are robust across various challenges. The Matlab scripts that compute all the performance metrics have been adapted from previously available evaluation scripts. The metrics are defined in terms of TP, FP, FN, and TN, which indicate the numbers of True Positive, False Positive, False Negative, and True Negative pixels, respectively.
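
The exact list of metrics is not reproduced in this text, so the following Python sketch is only an illustration of how pixel-level metrics are typically derived from the four counts. The function name `confusion_metrics` and the selection of metrics (recall, specificity, false positive and false negative rates, percentage of wrong classifications, precision, F-measure) are assumptions, not the content of the official Matlab scripts.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Common pixel-level metrics from confusion counts (illustrative selection)."""

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    recall = ratio(tp, tp + fn)        # fraction of foreground pixels found
    specificity = ratio(tn, tn + fp)   # fraction of background pixels kept
    precision = ratio(tp, tp + fp)     # fraction of detections that are correct
    return {
        "recall": recall,
        "specificity": specificity,
        "fpr": ratio(fp, fp + tn),     # false positive rate
        "fnr": ratio(fn, tp + fn),     # false negative rate
        "pwc": 100.0 * ratio(fp + fn, tp + fp + fn + tn),  # % wrong classifications
        "precision": precision,
        "f_measure": ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in confusion_metrics(tp=8, fp=2, fn=2, tn=88).items():
        print(f"{name}: {value:.4f}")
```

Returning NaN for undefined ratios (for example, precision when nothing is detected) keeps such cases visible rather than silently reporting zero.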


All video files within one category can be downloaded as a single .zip file. Each video file, when uncompressed, becomes a directory which contains the following:
  1. a sub-directory named "input" containing a separate .png file for each frame of the input video.
  2. a sub-directory named "groundtruth" containing a ground truth mask .png file for some of the video frames.
  3. a file named "ROI.bmp" showing the spatial region of interest.
  4. a file named "temporalROI.txt" containing two frame numbers. Only the frames in this range are used to evaluate the results submitted to the SBM-RGBD Challenge.
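
As a minimal sketch of how this layout could be read programmatically, the Python snippet below collects the frame and ground truth files of one uncompressed video directory and parses the temporal range. The helper name `describe_video_dir` is hypothetical, and the code assumes that "temporalROI.txt" stores its two frame numbers as whitespace-separated integers; no assumption is made about how the .png files are named.

```python
from pathlib import Path


def describe_video_dir(video_dir):
    """Summarize one uncompressed video directory laid out as described above."""
    video_dir = Path(video_dir)
    frames = sorted((video_dir / "input").glob("*.png"))
    masks = sorted((video_dir / "groundtruth").glob("*.png"))
    # Assumption: the two frame numbers are whitespace-separated integers.
    tokens = (video_dir / "temporalROI.txt").read_text().split()
    first, last = int(tokens[0]), int(tokens[1])
    return {
        "frames": frames,               # one .png per input frame
        "masks": masks,                 # ground truth masks (a subset of frames)
        "roi": video_dir / "ROI.bmp",   # spatial region of interest
        "temporal_roi": (first, last),  # frame range used for evaluation
    }
```

A call such as `describe_video_dir("Shadows/genSeq2")` (the path is only an example) would return the file lists together with the evaluation frame range.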

How to cite the dataset

If you use the SBM-RGBD dataset to test and report results in any publication, we request that you acknowledge the dataset webpage.

Dataset organizers

University of Bristol, UK
National Research Council, Italy
Universitat de les Illes Balears, Spain
University of Naples Parthenope, Italy
Universidad Politécnica de Madrid & Universidad Autónoma de Madrid, Spain