Vision meets Robotics: The KITTI Dataset - Cvlibs

Andreas Geiger, Philip Lenz, Christoph Stiller and Raquel Urtasun

Abstract: We present a novel dataset captured from a VW station wagon for use in mobile robotics and autonomous driving research. In total, we recorded 6 hours of traffic scenarios at 10-100 Hz using a variety of sensor modalities such as high-resolution color and grayscale stereo cameras, a Velodyne 3D laser scanner and a high-precision GPS/IMU inertial navigation system.

The scenarios are diverse, capturing real-world traffic situations, and range from freeways over rural areas to inner-city scenes with many static and dynamic objects. Our data is calibrated, synchronized and timestamped, and we provide the rectified and raw image sequences. Our dataset also contains object labels in the form of 3D tracklets, and we provide online benchmarks for stereo, optical flow, object detection and other tasks. This paper describes our recording platform, the data format and the utilities that we provide.

Index Terms: dataset, autonomous driving, mobile robotics, field robotics, computer vision, cameras, laser, GPS, benchmarks, stereo, optical flow, SLAM, object detection, tracking, KITTI

I. INTRODUCTION

The KITTI dataset has been recorded from a moving platform (Fig. 1) while driving in and around Karlsruhe, Germany (Fig. 2). It includes camera images, laser scans, high-precision GPS measurements and IMU accelerations from a combined GPS/IMU system. The main purpose of this dataset is to push forward the development of computer vision and robotic algorithms targeted to autonomous driving [1]-[7]. While our introductory paper [8] mainly focuses on the benchmarks, their creation and use for evaluating state-of-the-art computer vision methods, here we complement this information by providing technical details on the raw data itself.

We give precise instructions on how to access the data and comment on sensor limitations and common pitfalls. The dataset is available for download. For a review on related work, we refer the reader to [8].

II. SENSOR SETUP

Our sensor setup is illustrated in Fig. 3:

- 2 PointGray Flea2 grayscale cameras (FL2-14S3M-C), 1/2" Sony ICX267 CCD, global shutter
- 2 PointGray Flea2 color cameras (FL2-14S3C-C), 1/2" Sony ICX267 CCD, global shutter
- 4 Edmund Optics lenses, 4 mm, opening angle 90°, vertical opening angle of region of interest (ROI) 35°
- 1 Velodyne HDL-64E rotating 3D laser scanner, 10 Hz, 64 beams, 2 cm distance accuracy, field of view: 360° horizontal, range: 120 m
- 1 OXTS RT3003 inertial and GPS navigation system, 6 axis, 100 Hz, L1/L2 RTK

Fig. 1. Our VW Passat station wagon is equipped with four video cameras (two color and two grayscale cameras), a rotating 3D laser scanner and a combined GPS/IMU inertial navigation system. Figure labels: Velodyne HDL-64E laser scanner, Point Gray Flea 2 video cameras, OXTS RT 3003 GPS/IMU.

A. Geiger, P. Lenz and C. Stiller are with the Department of Measurement and Control Systems, Karlsruhe Institute of Technology, Germany. R. Urtasun is with the Toyota Technological Institute at Chicago.

Note that the color cameras lack in terms of resolution due to the Bayer pattern interpolation process and are less sensitive to light. This is the reason why we use two stereo camera rigs, one for grayscale and one for color. The baseline of both stereo camera rigs is approximately 54 cm. The trunk of our vehicle houses a PC with two six-core Intel Xeon X5650 processors and a shock-absorbed RAID 5 hard disk storage with a capacity of 4 Terabytes.

Our computer runs Ubuntu Linux (64 bit) and a real-time database [9] to store the incoming data.

III. DATASET

The raw data described in this paper can be accessed online and contains 25% of our overall recordings. The reason for this is that primarily data with 3D tracklet annotations has been put online, though we will make more data available upon request. Furthermore, we have removed all sequences which are part of our benchmark test sets. The raw data set is divided into the categories 'Road', 'City', 'Residential', 'Campus' and 'Person'.

Example frames are illustrated in Fig. 5. For each sequence, we provide the raw data, object annotations in the form of 3D bounding box tracklets and a calibration file, as illustrated in Fig. 4. Our recordings have taken place on the 26th, 28th, 29th and 30th of September and on the 3rd of October 2011 during daytime.

Fig. 2. This figure shows the GPS traces of our recordings in the metropolitan area of Karlsruhe, Germany. Colors encode the GPS signal quality: Red tracks have been recorded with highest precision using RTK corrections, blue denotes the absence of correction signals. The black runs have been excluded from our data set as no GPS signal has been available.

A. Data Description

All sensor readings of a sequence are zipped into a single file whose name is built from date and drive, placeholders for the recording date and the sequence number. The directory structure is shown in Fig. 4. Besides the raw recordings ('raw data'), we also provide post-processed data ('synced data'), i.e., rectified and synchronized video streams. Per-frame sensor readings are provided in the corresponding data sub-folders.

Each line of a timestamp file is composed of the date and time in hours, minutes and seconds. As the Velodyne laser scanner has a 'rolling shutter', three timestamp files are provided for this sensor: one for the start position of a spin, one for the end position of a spin, and one for the time at which the laser scanner is facing forward and triggering the cameras. The data format in which each sensor stream is stored is as follows:

a) Images: Both color and grayscale images are stored with lossless compression using 8-bit PNG files.
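The timestamp files described above can be read with a few lines of Python. The following is a minimal sketch, not the authors' tooling: it assumes each line looks like `2011-09-26 13:02:25.964389445` (date, then hours:minutes:seconds with a fractional part), and the helper names `parse_timestamp` and `point_time` are hypothetical. The second helper illustrates one possible use of the start and end timestamps of a Velodyne spin, under the additional assumption that the scanner rotates at constant speed within a spin.

```python
from datetime import datetime, timedelta


def parse_timestamp(line: str) -> datetime:
    """Parse one timestamp line.

    Assumed layout (not confirmed by the text above):
    'YYYY-MM-DD HH:MM:SS' optionally followed by '.' and fractional seconds.
    """
    head, _, frac = line.strip().partition(".")
    stamp = datetime.strptime(head, "%Y-%m-%d %H:%M:%S")
    if frac:
        # datetime only stores microseconds, so keep the first six
        # fractional digits and pad shorter fractions with zeros.
        stamp += timedelta(microseconds=int(frac[:6].ljust(6, "0")))
    return stamp


def point_time(start: datetime, end: datetime, fraction: float) -> datetime:
    """Estimate when a laser point was measured within one spin.

    `fraction` is the share of the spin completed (0.0 = start position,
    1.0 = end position). Linear interpolation assumes constant rotation
    speed, which is an illustrative simplification.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return start + (end - start) * fraction


if __name__ == "__main__":
    # Made-up example values for a single spin at roughly 10 Hz.
    spin_start = parse_timestamp("2011-09-26 13:02:25.900000000")
    spin_end = parse_timestamp("2011-09-26 13:02:26.000000000")
    print(point_time(spin_start, spin_end, 0.5).isoformat())
```

Truncating the fractional part to microseconds keeps the example within the standard library; code that needs the full precision of the recorded values would have to store the fraction separately.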
