
Bags of Binary Words for Fast Place Recognition in Image ...

IEEE TRANSACTIONS ON ROBOTICS, VOL. , NO. , MONTH, YEAR. SHORT PAPER

Bags of Binary Words for Fast Place Recognition in Image Sequences

Dorian Gálvez-López and Juan D. Tardós, Member, IEEE

Abstract: We propose a novel method for visual place recognition using bag of words obtained from FAST+BRIEF features. For the first time, we build a vocabulary tree that discretizes a binary descriptor space, and use the tree to speed up correspondences for geometrical verification. We present competitive results with no false positives in very different datasets, using exactly the same vocabulary and settings. The whole technique, including feature extraction, requires 22 ms per frame in a sequence with 26300 images, being one order of magnitude faster than previous approaches.

Index Terms: Place Recognition, Bag of Words, SLAM.

I. INTRODUCTION

One of the most significant requirements for long-term visual SLAM (Simultaneous Localization and Mapping) is robust place recognition.




After an exploratory period, when areas not observed for a long time are re-observed, standard matching algorithms fail. When they are robustly detected, loop closures provide correct data association to obtain consistent maps. The same methods used for loop detection can be used for robot relocation after the track is lost, due for example to sudden motions, severe occlusions or motion blur. In [1] we concluded that, for small environments, map-to-image methods achieve good performance, but for large environments, image-to-image (or appearance-based) methods such as FAB-MAP [2] scale better.

The basic technique consists in building a database from the images collected online by the robot, so that the most similar one can be retrieved when a new image is acquired.

If they are similar enough, a loop closure is detected. In recent years, many algorithms that exploit this idea have appeared [2]-[6], basing the image matching on comparing them as numerical vectors in the bag-of-words space [7]. Bags of words result in very effective and quick image matchers [8], but they are not a perfect solution for closing loops, due mainly to perceptual aliasing [6]. For this reason, a verification step is performed later by checking that the matching images are geometrically consistent, which requires feature correspondences. The bottleneck of loop closure algorithms is usually the extraction of features, which is around ten times more expensive in computation cycles than the rest of the steps.

This may cause SLAM algorithms to run in two decoupled threads: one to perform the main SLAM functionality, and the other just to detect loop closures, as in [5].

In this paper, we present a novel algorithm to detect loops and establish point correspondences between images in real time, with a conventional CPU and a single camera. Our approach is based on bag of words and geometrical check, with several important novelties that make it much faster than current approaches. The main speed improvement comes from the use of a slightly modified version of the BRIEF descriptor [9] with FAST keypoints [10], as explained in Section III. The BRIEF descriptor is a binary vector where each bit is the result of an intensity comparison between a given pair of pixels around the keypoint.
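To make the idea of a binary descriptor concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 256-bit descriptor, a 48-pixel patch, uniformly random test pairs and no patch smoothing; all function names are hypothetical. Each bit records whether the first pixel of a pair is darker than the second, and two descriptors are compared with the Hamming distance.

```python
import random


def make_test_pairs(n_bits=256, patch_size=48, seed=0):
    """Draw n_bits random pairs of pixel offsets inside a square patch.

    The bit count, patch size and uniform sampling are assumptions of this sketch.
    """
    rng = random.Random(seed)
    half = patch_size // 2

    def offset():
        return (rng.randint(-half, half), rng.randint(-half, half))

    return [(offset(), offset()) for _ in range(n_bits)]


def brief_descriptor(image, keypoint, pairs):
    """Return a BRIEF-like descriptor packed into a Python int.

    Bit i is 1 when the first pixel of pair i is darker than the second.
    `image` is a list of rows of gray levels; the keypoint must lie at least
    half a patch away from the image border.
    """
    row, col = keypoint
    bits = 0
    for i, ((dr1, dc1), (dr2, dc2)) in enumerate(pairs):
        if image[row + dr1][col + dc1] < image[row + dr2][col + dc2]:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Number of differing bits between two descriptors."""
    return bin(a ^ b).count("1")


# Usage on a synthetic 64x64 image with a keypoint at its center.
rng = random.Random(1)
image = [[rng.randint(0, 255) for _ in range(64)] for _ in range(64)]
pairs = make_test_pairs()
descriptor = brief_descriptor(image, (32, 32), pairs)
print(hamming(descriptor, descriptor))
```

Because every bit depends only on the ordering of two intensities, adding a constant brightness offset to the whole image leaves the descriptor unchanged, and comparing two descriptors reduces to an XOR followed by a bit count.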

Although BRIEF descriptors are hardly invariant to scale and rotation, our experiments show that they are very robust for loop closing with planar camera motions, the usual case in mobile robotics, offering a good compromise between distinctiveness and computation time.

We introduce a bag of words that discretizes a binary space, and augment it with a direct index, in addition to the usual inverse index, as explained in Section IV.

This research has been partly funded by the European Union under project RoboEarth FP7-ICT-248942, the Dirección General de Investigación of Spain under projects DPI2009-13710, DPI2009-07130 and the Ministerio de Educación (scholarship FPU-AP2008-02272). The authors are with the Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, María de Luna 1, 50018 Zaragoza, Spain. {dorian, ...}

To the best of our knowledge, this is the first time a binary vocabulary is used for loop detection. The inverse index is used for fast retrieval of images potentially similar to a given one. We show a novel use of the direct index to efficiently obtain point correspondences between images, speeding up the geometrical check during the loop verification.

The complete loop detection algorithm is detailed in Section V. Similarly to our previous work [5,6], to decide that a loop has been closed, we verify the temporal consistency of the image matches obtained. One of the novelties in this paper is a technique to prevent images collected in the same place from competing among them when the database is queried.

We achieve this by grouping together those images that depict the same place during the matching.

Section VI contains the experimental evaluation of our work, including a detailed analysis of the relative merits of the different parts in our algorithm. We present comparisons between the effectiveness of BRIEF and two versions of SURF features [11], the descriptor most used for loop closing. We also analyze the performance of the temporal and geometrical consistency tests for loop verification. We finally present the results achieved by our technique after evaluating it in five public datasets with trajectories up to 4 Km long. We demonstrate that we can run the whole loop detection procedure, including the feature extraction, in 52 ms in 26300 images (22 ms on average), outperforming previous techniques by more than one order of magnitude.

A preliminary version of this work was presented in [12].

In the current paper we enhance the direct index technique and extend the experimental evaluation of our approach. We also report results in new datasets and make a comparison with the state-of-the-art FAB-MAP algorithm [13].

II. RELATED WORK

Place recognition based on appearance has obtained great attention in the robotics community because of the excellent results achieved [4,5,13,14]. An example of this is the FAB-MAP system [13], which detects loops with an omnidirectional camera, obtaining a recall of ... and ..., with no false positives, in trajectories 70 Km and 1000 Km in length. FAB-MAP represents images with a bag of words, and uses a Chow Liu tree to learn offline the words' co-visibility probability.

FAB-MAP has become the gold standard regarding loop detection, but its robustness decreases when the images depict very similar structures for a long time, which can be the case when using frontal cameras [5]. In the work of Angeli et al. [4], two visual vocabularies (for appearance and color) are created online in an incremental fashion. The two bag-of-words representations are used together as input of a Bayesian filter that estimates the matching probability between two images, taking into account the matching probability of previous cases. In contrast to these probabilistic approaches, we rely on a temporal consistency check to consider previous matches and enhance the reliability of the detections.

This technique has proven successful in our previous works [5,6]. Our work also differs from the ones above in that we use a bag of binary words for the first time, as well as propose a technique to prevent images collected close in time and depicting the same place from competing between them during the matching, so that we can work at a higher frequency.

To verify loop closing candidates, a geometrical check is usually performed. We apply an epipolar constraint to the best matching candidate as done in [4], but we take advantage of a direct index to calculate correspondence points faster.
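The roles of the two indexes can be illustrated with a small sketch. It is written under simplifying assumptions and is not the authors' implementation: the vocabulary is a flat list of binary words rather than a tree, candidates are ranked by the number of shared words rather than by a weighted score, and the names `Database`, `candidates`, `correspondences` and `max_distance` are hypothetical. The inverse index maps each word to the images that contain it, and the direct index maps each image to its words and the features assigned to them, so that correspondences are only searched among features that share a word.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary word closest to the descriptor in Hamming distance."""
    return min(range(len(vocabulary)), key=lambda w: hamming(descriptor, vocabulary[w]))


class Database:
    def __init__(self, vocabulary):
        self.vocabulary = vocabulary
        self.inverse_index = defaultdict(set)  # word id -> ids of images containing it
        self.direct_index = {}                 # image id -> {word id: [feature indices]}
        self.descriptors = {}                  # image id -> list of descriptors

    def add(self, image_id, descriptors):
        words = defaultdict(list)
        for i, d in enumerate(descriptors):
            w = nearest_word(d, self.vocabulary)
            words[w].append(i)
            self.inverse_index[w].add(image_id)
        self.direct_index[image_id] = dict(words)
        self.descriptors[image_id] = list(descriptors)

    def candidates(self, descriptors):
        """Inverse index lookup: images sharing words with the query, with shared-word counts."""
        votes = defaultdict(int)
        for w in {nearest_word(d, self.vocabulary) for d in descriptors}:
            for image_id in self.inverse_index.get(w, ()):
                votes[image_id] += 1
        return dict(votes)

    def correspondences(self, id_a, id_b, max_distance=2):
        """Direct index lookup: match only features that fall in the same word."""
        words_a, words_b = self.direct_index[id_a], self.direct_index[id_b]
        desc_a, desc_b = self.descriptors[id_a], self.descriptors[id_b]
        matches = []
        for w in words_a.keys() & words_b.keys():
            for i in words_a[w]:
                j = min(words_b[w], key=lambda k: hamming(desc_a[i], desc_b[k]))
                if hamming(desc_a[i], desc_b[j]) <= max_distance:
                    matches.append((i, j))
        return sorted(matches)


# Usage with toy 8-bit descriptors and a three-word vocabulary.
db = Database([0b00000000, 0b11111111, 0b00001111])
db.add(0, [0b00000001, 0b11111110, 0b00001111])
db.add(1, [0b11111111, 0b00000010])
print(db.candidates([0b00001110]))
print(db.correspondences(0, 1))
```

Restricting the search to features that share a word avoids comparing every feature of one image against every feature of the other. The resulting point pairs are what a geometrical check such as the epipolar constraint would take as input.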

