
Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the Algorithm




Transcription of Pyramidal Implementation of the Lucas Kanade Feature Tracker

Pyramidal Implementation of the Lucas Kanade Feature Tracker
Description of the Algorithm

Jean-Yves Bouguet
Intel Corporation, Microprocessor Research

Problem Statement

Let $I$ and $J$ be two 2D grayscale images. The two quantities $I(\mathbf{x}) = I(x,y)$ and $J(\mathbf{x}) = J(x,y)$ are the grayscale values of the two images at the location $\mathbf{x} = [x\ y]^T$, where $x$ and $y$ are the two pixel coordinates of a generic image point $\mathbf{x}$. The image $I$ will sometimes be referenced as the first image, and the image $J$ as the second image. For practical issues, the images $I$ and $J$ are discrete functions (or arrays), and the upper left corner pixel coordinate vector is $[0\ 0]^T$. Let $n_x$ and $n_y$ be the width and height of the two images. Then the lower right pixel coordinate vector is $[n_x - 1\ \ n_y - 1]^T$. Consider an image point $\mathbf{u} = [u_x\ u_y]^T$ on the first image $I$. The goal of feature tracking is to find the location $\mathbf{v} = \mathbf{u} + \mathbf{d} = [u_x + d_x\ \ u_y + d_y]^T$ on the second image $J$ such that $I(\mathbf{u})$ and $J(\mathbf{v})$ are similar.

The vector $\mathbf{d} = [d_x\ d_y]^T$ is the image velocity at $\mathbf{x}$, also known as the optical flow at $\mathbf{x}$. Because of the aperture problem, it is essential to define the notion of similarity in a 2D neighborhood sense. Let $\omega_x$ and $\omega_y$ be two integers. We define the image velocity $\mathbf{d}$ as being the vector that minimizes the residual function $\epsilon$ defined as follows:

$$\epsilon(\mathbf{d}) = \epsilon(d_x, d_y) = \sum_{x=u_x-\omega_x}^{u_x+\omega_x}\ \sum_{y=u_y-\omega_y}^{u_y+\omega_y} \left( I(x,y) - J(x+d_x, y+d_y) \right)^2. \qquad (1)$$

Observe that, following that definition, the similarity function is measured on an image neighborhood of size $(2\omega_x+1) \times (2\omega_y+1)$. This neighborhood will also be called the integration window. Typical values for $\omega_x$ and $\omega_y$ are 2, 3, 4, 5, 6, 7 pixels.

Description of the Tracking Algorithm

The two key components of any feature tracker are accuracy and robustness. The accuracy component relates to the local sub-pixel accuracy attached to tracking. Intuitively, a small integration window would be preferable in order not to smooth out the details contained in the images (small values of $\omega_x$ and $\omega_y$).
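As a concrete illustration, the residual of equation (1) can be evaluated directly for integer displacements. The sketch below uses plain nested lists for images and a hypothetical helper name `residual`; the real tracker evaluates $J$ at sub-pixel positions via bilinear interpolation, which is omitted here.

```python
def residual(I, J, u, d, wx=2, wy=2):
    """Sum of squared differences between a (2*wx+1) x (2*wy+1) window
    of I centred at u and the corresponding window of J shifted by d."""
    ux, uy = u
    dx, dy = d
    err = 0.0
    for x in range(ux - wx, ux + wx + 1):
        for y in range(uy - wy, uy + wy + 1):
            err += (I[y][x] - J[y + dy][x + dx]) ** 2
    return err
```

For identical images the residual is zero at d = (0, 0) and grows as the window is displaced, which is exactly what the minimization exploits.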

That is especially required at occluding areas in the images, where two patches potentially move with very different velocities. The robustness component relates to the sensitivity of tracking with respect to changes of lighting, size of image motion, and so on. In particular, in order to handle large motions, it is intuitively preferable to pick a large integration window. Indeed, considering only equation (1), it is preferable to have $d_x \le \omega_x$ and $d_y \le \omega_y$ (unless some prior matching information is available). There is therefore a natural tradeoff between local accuracy and robustness when choosing the integration window size. In order to provide a solution to that problem, we propose a pyramidal implementation of the classical Lucas-Kanade algorithm. An iterative implementation of the Lucas-Kanade optical flow computation provides sufficient local tracking accuracy.

2.1 Image Pyramid Representation

Let us define the pyramid representation of a generic image $I$ of size $n_x \times n_y$.

Let $I^0 = I$ be the "zeroth" level image. This image is essentially the highest resolution image (the raw image). The image width and height at that level are defined as $n^0_x = n_x$ and $n^0_y = n_y$. The pyramid representation is then built in a recursive fashion: compute $I^1$ from $I^0$, then compute $I^2$ from $I^1$, and so on. Let $L = 1, 2, \ldots$ be a generic pyramidal level, and let $I^{L-1}$ be the image at level $L-1$. Denote $n^{L-1}_x$ and $n^{L-1}_y$ the width and height of $I^{L-1}$. The image $I^L$ is then defined as follows:

$$I^L(x,y) = \frac{1}{4} I^{L-1}(2x,2y) + \frac{1}{8}\left( I^{L-1}(2x-1,2y) + I^{L-1}(2x+1,2y) + I^{L-1}(2x,2y-1) + I^{L-1}(2x,2y+1) \right) + \frac{1}{16}\left( I^{L-1}(2x-1,2y-1) + I^{L-1}(2x+1,2y-1) + I^{L-1}(2x-1,2y+1) + I^{L-1}(2x+1,2y+1) \right). \qquad (2)$$

For simplicity in the notation, let us define dummy image values one pixel around the image $I^{L-1}$ (for $0 \le x \le n^{L-1}_x - 1$ and $0 \le y \le n^{L-1}_y - 1$):

$$I^{L-1}(-1, y) \doteq I^{L-1}(0, y),$$
$$I^{L-1}(x, -1) \doteq I^{L-1}(x, 0),$$
$$I^{L-1}(n^{L-1}_x, y) \doteq I^{L-1}(n^{L-1}_x - 1, y),$$
$$I^{L-1}(x, n^{L-1}_y) \doteq I^{L-1}(x, n^{L-1}_y - 1),$$
$$I^{L-1}(n^{L-1}_x, n^{L-1}_y) \doteq I^{L-1}(n^{L-1}_x - 1, n^{L-1}_y - 1).$$
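A direct transcription of equation (2), with the dummy border values realized as index clamping; `pyr_down` is a hypothetical name and plain lists of floats stand in for images.

```python
def pyr_down(img):
    """One pyramid level of equation (2): weight 1/4 on the centre,
    1/8 on the four edge neighbours, 1/16 on the four corner
    neighbours; out-of-range indices are clamped to the border,
    matching the dummy-value convention."""
    h, w = len(img), len(img[0])

    def px(x, y):  # clamped pixel lookup
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    nh, nw = (h + 1) // 2, (w + 1) // 2  # sizes per eqs. (3) and (4)
    out = [[0.0] * nw for _ in range(nh)]
    for y in range(nh):
        for x in range(nw):
            out[y][x] = (0.25 * px(2 * x, 2 * y)
                + 0.125 * (px(2 * x - 1, 2 * y) + px(2 * x + 1, 2 * y)
                           + px(2 * x, 2 * y - 1) + px(2 * x, 2 * y + 1))
                + 0.0625 * (px(2 * x - 1, 2 * y - 1) + px(2 * x + 1, 2 * y - 1)
                            + px(2 * x - 1, 2 * y + 1) + px(2 * x + 1, 2 * y + 1)))
    return out
```

Since the nine weights sum to 1, a constant image stays constant under subsampling, which is a quick sanity check on the kernel.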

Then, equation (2) is only defined for values of $x$ and $y$ such that $0 \le 2x \le n^{L-1}_x - 1$ and $0 \le 2y \le n^{L-1}_y - 1$. Therefore, the width $n^L_x$ and height $n^L_y$ of $I^L$ are the largest integers that satisfy the two conditions:

$$n^L_x \le \frac{n^{L-1}_x + 1}{2}, \qquad (3)$$
$$n^L_y \le \frac{n^{L-1}_y + 1}{2}. \qquad (4)$$

Equations (2), (3) and (4) are used to construct recursively the pyramidal representations of the two images $I$ and $J$: $\{I^L\}_{L=0,\ldots,L_m}$ and $\{J^L\}_{L=0,\ldots,L_m}$. The value $L_m$ is the height of the pyramid (picked heuristically). Practical values of $L_m$ are 2, 3, 4. For typical image sizes, it makes no sense to go above a level 4. For example, for an image $I$ of size $640 \times 480$, the images $I^1$, $I^2$, $I^3$ and $I^4$ are of respective sizes $320 \times 240$, $160 \times 120$, $80 \times 60$ and $40 \times 30$. Going beyond level 4 does not make much sense in most cases. The central motivation behind pyramidal representation is to be able to handle large pixel motions (larger than the integration window sizes $\omega_x$ and $\omega_y$). Therefore the pyramid height $L_m$ should also be picked appropriately according to the maximum expected optical flow in the image.
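The level sizes from equations (3) and (4) amount to taking the floor of $(n + 1)/2$ at each step, which reproduces the 640 x 480 example above:

```python
def level_size(n):
    # Largest integer n_L satisfying n_L <= (n_{L-1} + 1) / 2 (eqs. 3-4).
    return (n + 1) // 2

sizes = [(640, 480)]
for _ in range(4):
    w, h = sizes[-1]
    sizes.append((level_size(w), level_size(h)))
# sizes -> [(640, 480), (320, 240), (160, 120), (80, 60), (40, 30)]
```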

The next section, describing the tracking operation in detail, will let us understand that concept better. Final observation: equation (2) suggests that the lowpass filter $[1/4\ 1/2\ 1/4] \times [1/4\ 1/2\ 1/4]^T$ is used for image anti-aliasing before image subsampling. In practice however (in the C code), a larger anti-aliasing filter kernel is used for pyramid construction, $[1/16\ 1/4\ 3/8\ 1/4\ 1/16] \times [1/16\ 1/4\ 3/8\ 1/4\ 1/16]^T$. The formalism remains the same.

Pyramidal Feature Tracking

Recall the goal of feature tracking: for a given point $\mathbf{u}$ in image $I$, find its corresponding location $\mathbf{v} = \mathbf{u} + \mathbf{d}$ in image $J$, or alternatively find its pixel displacement vector $\mathbf{d}$ (see equation 1). For $L = 0, \ldots, L_m$, define $\mathbf{u}^L = [u^L_x\ u^L_y]^T$, the corresponding coordinates of the point $\mathbf{u}$ on the pyramidal images $I^L$. Following our definition of the pyramid representation (equations (2), (3) and (4)), the vectors $\mathbf{u}^L$ are computed as follows:

$$\mathbf{u}^L = \frac{\mathbf{u}}{2^L}. \qquad (5)$$
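The 2D weights of equation (2) are exactly the outer product of the 1D lowpass filter $[1/4\ 1/2\ 1/4]$ with itself, which is a quick way to check the centre/edge/corner coefficients (variable names here are illustrative):

```python
k = [0.25, 0.5, 0.25]                    # the 1-D filter [1/4 1/2 1/4]
K = [[a * b for b in k] for a in k]      # separable 2-D kernel
# K == [[1/16, 1/8, 1/16],
#       [1/8,  1/4, 1/8 ],
#       [1/16, 1/8, 1/16]]  -- centre 1/4, edges 1/8, corners 1/16, as in eq. (2)
```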

The division operation in equation (5) is applied to both coordinates independently (so will be the multiplication operations appearing in subsequent equations). Observe that in particular, $\mathbf{u}^0 = \mathbf{u}$. The overall pyramidal tracking algorithm proceeds as follows: first, the optical flow is computed at the deepest pyramid level $L_m$. Then, the result of that computation is propagated to the upper level $L_m - 1$ in the form of an initial guess for the pixel displacement (at level $L_m - 1$). Given that initial guess, the refined optical flow is computed at level $L_m - 1$, and the result is propagated to level $L_m - 2$, and so on up to level 0 (the original image).

Let us now describe the recursive operation between two generic levels $L+1$ and $L$ in more mathematical detail. Assume that an initial guess for optical flow at level $L$, $\mathbf{g}^L = [g^L_x\ g^L_y]^T$, is available from the computations done from level $L_m$ to level $L+1$. Then, in order to compute the optical flow at level $L$, it is necessary to find the residual pixel displacement vector $\mathbf{d}^L = [d^L_x\ d^L_y]^T$ that minimizes the new image matching error function $\epsilon^L$:

$$\epsilon^L(\mathbf{d}^L) = \epsilon^L(d^L_x, d^L_y) = \sum_{x=u^L_x-\omega_x}^{u^L_x+\omega_x}\ \sum_{y=u^L_y-\omega_y}^{u^L_y+\omega_y} \left( I^L(x,y) - J^L(x + g^L_x + d^L_x,\ y + g^L_y + d^L_y) \right)^2. \qquad (6)$$
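The coarse-to-fine recursion can be sketched as follows; `lucas_kanade_step` is a hypothetical stand-in for the single-level solver described in the iterative Lucas-Kanade section, and the pyramids are plain lists with level 0 at full resolution.

```python
def pyramidal_track(pyr_I, pyr_J, u, lucas_kanade_step):
    """Coarse-to-fine tracking of point u from pyramid pyr_I to pyr_J.
    lucas_kanade_step(I_level, J_level, point, guess) must return the
    residual flow d^L at one level."""
    Lm = len(pyr_I) - 1
    g = (0.0, 0.0)                                   # g^{Lm} = [0 0]^T (eq. 8)
    for L in range(Lm, -1, -1):
        uL = (u[0] / 2**L, u[1] / 2**L)              # eq. (5)
        d = lucas_kanade_step(pyr_I[L], pyr_J[L], uL, g)
        if L > 0:
            g = (2 * (g[0] + d[0]), 2 * (g[1] + d[1]))   # eq. (7)
    return (g[0] + d[0], g[1] + d[1])                # d = g^0 + d^0 (eq. 9)
```

With a per-level residual of (1.0, 0.5) on a three-level pyramid, the loop accumulates (1 + 2 + 4) = 7 times that residual, illustrating the extended form of equation (10).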

Observe that the window of integration is of constant size $(2\omega_x+1) \times (2\omega_y+1)$ for all values of $L$. Notice that the initial guess flow vector $\mathbf{g}^L$ is used to pre-translate the image patch in the second image $J$. That way, the residual flow vector $\mathbf{d}^L = [d^L_x\ d^L_y]^T$ is small and therefore easy to compute through a standard Lucas-Kanade step. The details of the computation of the residual optical flow $\mathbf{d}^L$ will be described in the next section. For now, let us assume that this vector is computed (to close the main loop of the algorithm). Then, the result of this computation is propagated to the next level $L-1$ by passing the new initial guess $\mathbf{g}^{L-1}$ of expression:

$$\mathbf{g}^{L-1} = 2\,(\mathbf{g}^L + \mathbf{d}^L). \qquad (7)$$

The next level optical flow residual vector $\mathbf{d}^{L-1}$ is then computed through the same procedure. This vector, computed by optical flow computation (described in the next section), minimizes the functional $\epsilon^{L-1}(\mathbf{d}^{L-1})$ (equation 6). This procedure goes on until the finest image resolution is reached ($L = 0$).

The algorithm is initialized by setting the initial guess for level $L_m$ to zero (no initial guess is available at the deepest level of the pyramid):

$$\mathbf{g}^{L_m} = [0\ 0]^T. \qquad (8)$$

The final optical flow solution $\mathbf{d}$ (refer to equation 1) is then available after the finest optical flow computation:

$$\mathbf{d} = \mathbf{g}^0 + \mathbf{d}^0. \qquad (9)$$

Observe that this solution may be expressed in the following extended form:

$$\mathbf{d} = \sum_{L=0}^{L_m} 2^L\, \mathbf{d}^L. \qquad (10)$$

The clear advantage of a pyramidal implementation is that each residual optical flow vector $\mathbf{d}^L$ can be kept very small while computing a large overall pixel displacement vector $\mathbf{d}$. Assuming that each elementary optical flow computation step can handle pixel motions up to $d_{max}$, the overall pixel motion that the pyramidal implementation can handle becomes $d_{max\ final} = (2^{L_m+1} - 1)\, d_{max}$. For example, for a pyramid depth of $L_m = 3$, this means a maximum pixel displacement gain of 15! This enables large pixel motions, while keeping the size of the integration window small.

Iterative Optical Flow Computation (Iterative Lucas-Kanade)

Let us now describe the core optical flow computation.
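The $(2^{L_m+1} - 1)$ gain follows directly from summing the per-level contributions $2^L d_{max}$ of equation (10); a quick arithmetic check for the $L_m = 3$ example (variable names are illustrative):

```python
Lm, d_max = 3, 1.0
# each level L can contribute up to 2**L * d_max to the total flow (eq. 10)
gain = sum(2**L for L in range(Lm + 1))   # 1 + 2 + 4 + 8 = 15
max_displacement = gain * d_max           # = (2**(Lm+1) - 1) * d_max
```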

At every level $L$ in the pyramid, the goal is to find the vector $\mathbf{d}^L$ that minimizes the matching function $\epsilon^L$ defined in equation (6). Since the same type of operation is performed for all levels $L$, let us now drop the superscripts $L$ and define the new images $A$ and $B$ as follows:

$$\forall (x,y) \in [p_x - \omega_x - 1,\ p_x + \omega_x + 1] \times [p_y - \omega_y - 1,\ p_y + \omega_y + 1], \quad A(x,y) \doteq I^L(x,y), \qquad (11)$$
$$\forall (x,y) \in [p_x - \omega_x,\ p_x + \omega_x] \times [p_y - \omega_y,\ p_y + \omega_y], \quad B(x,y) \doteq J^L(x + g^L_x,\ y + g^L_y). \qquad (12)$$

Observe that the domains of definition of $A(x,y)$ and $B(x,y)$ are slightly different. Indeed, $A(x,y)$ is defined over a window of size $(2\omega_x+3) \times (2\omega_y+3)$ instead of $(2\omega_x+1) \times (2\omega_y+1)$. This difference will become clear when computing spatial derivatives of $A(x,y)$ using the central difference operator (eqs. 19 and 20). For clarity purposes, let us change the name of the displacement vector, $\bar{\nu} = [\nu_x\ \nu_y]^T = \mathbf{d}^L$, as well as the image position vector, $\mathbf{p} = [p_x\ p_y]^T = \mathbf{u}^L$. Following that new notation, the goal is to find the displacement vector $\bar{\nu} = [\nu_x\ \nu_y]^T$ that minimizes the matching function:

$$\epsilon(\bar{\nu}) = \epsilon(\nu_x, \nu_y) = \sum_{x=p_x-\omega_x}^{p_x+\omega_x}\ \sum_{y=p_y-\omega_y}^{p_y+\omega_y} \left( A(x,y) - B(x + \nu_x,\ y + \nu_y) \right)^2.$$
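The reason $A$ carries the extra one-pixel ring can be seen from central-difference derivatives (the eqs. 19 and 20 referenced above fall outside this excerpt); a minimal sketch with a hypothetical helper name:

```python
def derivatives(A, x, y):
    """Central-difference spatial derivatives of image A at (x, y).
    Evaluating these at every pixel of the (2*wx+1) x (2*wy+1)
    integration window is what requires A's one-pixel margin (eq. 11)."""
    Ix = (A[y][x + 1] - A[y][x - 1]) / 2.0
    Iy = (A[y + 1][x] - A[y - 1][x]) / 2.0
    return Ix, Iy
```

On a linear ramp A(x, y) = 3x + 2y the derivatives recover the slopes (3, 2) exactly, since central differences are exact for affine images.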

