
An Introduction to Locally Linear Embedding


Lawrence K. Saul
AT&T Labs – Research, 180 Park Ave, Florham Park, NJ 07932 USA
lsaul@research.att.com

Sam T. Roweis
Gatsby Computational Neuroscience Unit, UCL, 17 Queen Square, London WC1N 3AR, UK
roweis@gatsby.ucl.ac.uk

Abstract

Many problems in information processing involve some form of dimensionality reduction. Here we describe locally linear embedding (LLE), an unsupervised learning algorithm that computes low dimensional, neighborhood preserving embeddings of high dimensional data. LLE maps its inputs into a single global coordinate system of lower dimensionality, and its optimizations, though capable of generating highly nonlinear embeddings, do not involve local minima.


Transcription of An Introduction to Locally Linear Embedding

Many problems in statistical pattern recognition begin with the preprocessing of multidimensional signals. Often, the goal of preprocessing is some form of dimensionality reduction: to compress the signals in size and to discover compact representations of their variability. Two popular forms of dimensionality reduction are the methods of principal component analysis (PCA) [1] and multidimensional scaling (MDS) [2]. Both PCA and MDS are eigenvector methods designed to model linear variabilities in high dimensional data. In PCA, one computes the linear projections of greatest variance from the top eigenvectors of the data covariance matrix. In classical (or metric) MDS, one computes the low dimensional embedding that best preserves pairwise distances between data points; if these distances correspond to Euclidean distances, the results of metric MDS are equivalent to PCA. Both methods are simple to implement, and their optimizations do not involve local minima.

Recently, we introduced an eigenvector method, called locally linear embedding (LLE), for the problem of nonlinear dimensionality reduction [4]. In the example of Figure 1, the dimensionality reduction by LLE succeeds in identifying the underlying structure of the manifold. Like PCA and MDS, our algorithm is simple to implement, and its optimizations do not involve local minima; unlike them, however, it is capable of generating highly nonlinear embeddings. Note that mixture models for local dimensionality reduction [5, 6], which cluster the data and perform PCA within each cluster, do not address the problem considered here, namely how to map high dimensional data into a single global coordinate system of lower dimensionality. In this paper, we review the LLE algorithm in its most basic form and illustrate a potential application to audiovisual speech synthesis [3].
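For reference, the PCA baseline described above, projecting onto the top eigenvectors of the data covariance matrix, can be written in a few lines. This is a minimal numpy sketch with our own function and variable names, not code from the paper:

```python
import numpy as np

def pca(X, d):
    """Project the rows of the N x D matrix X onto the top d
    principal components, i.e. the eigenvectors of the data
    covariance matrix with the largest eigenvalues."""
    Xc = X - X.mean(axis=0)              # center the data
    C = (Xc.T @ Xc) / len(Xc)            # D x D covariance matrix
    evals, evecs = np.linalg.eigh(C)     # eigenvalues in ascending order
    top = evecs[:, ::-1][:, :d]          # top d eigenvectors
    return Xc @ top                      # N x d linear projection
```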

Figure 1: The problem of nonlinear dimensionality reduction, as illustrated for three dimensional data (B) sampled from a two dimensional manifold (A). An unsupervised learning algorithm must discover the global internal coordinates of the manifold without signals that explicitly indicate how the data should be embedded in two dimensions (C).

The LLE algorithm is based on simple geometric intuitions. Suppose the data consist of N real-valued vectors \vec{X}_i, each of dimensionality D. Provided there is sufficient data (such that the manifold is well-sampled), we expect each data point and its neighbors to lie on or close to a locally linear patch of the manifold. In the simplest formulation of LLE, one identifies K nearest neighbors per data point, as measured by Euclidean distance. (Alternatively, one can identify neighbors by choosing all points within a ball of fixed radius, or by using more sophisticated rules based on local metrics.) Reconstruction errors are then measured by the cost function:

\varepsilon(W) = \sum_i \left| \vec{X}_i - \sum_j W_{ij} \vec{X}_j \right|^2    (1)

The weights W_{ij} summarize the contribution of the jth data point to the reconstruction of the ith. To compute the weights, we minimize the cost function subject to two constraints: first, that each data point \vec{X}_i is reconstructed only from its neighbors, enforcing W_{ij} = 0 if \vec{X}_j does not belong to this set; second, that the rows of the weight matrix sum to one, \sum_j W_{ij} = 1.
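Neighbor selection and the cost in eq. (1) are straightforward to express in code. The sketch below is our own minimal, brute-force version, assuming Euclidean distance and a dense weight matrix:

```python
import numpy as np

def nearest_neighbors(X, K):
    """Indices of the K nearest neighbors of each row of the
    N x D data matrix X, by Euclidean distance."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # N x N squared distances
    np.fill_diagonal(d2, np.inf)                         # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :K]                 # N x K neighbor indices

def reconstruction_cost(X, W):
    """The cost of eq. (1) for a full N x N weight matrix W whose
    rows sum to one and are zero outside each neighborhood."""
    return np.sum((X - W @ X) ** 2)
```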

The reason for the sum-to-one constraint will become clear shortly. The optimal weights W_{ij} subject to these constraints are found by solving a least squares problem, as discussed in Appendix A.

The constrained weights that minimize these reconstruction errors obey an important symmetry: for any particular data point, they are invariant to rotations, rescalings, and translations of that data point and its neighbors. The invariance to rotations and rescalings follows immediately from the form of eq. (1); the invariance to translations is enforced by the sum-to-one constraint. A consequence of this symmetry is that the reconstruction weights characterize intrinsic geometric properties of each neighborhood, as opposed to properties that depend on a particular frame of reference. (A numerical check of this invariance is sketched after this passage.)

Suppose the data lie on or near a smooth nonlinear manifold of dimensionality d << D. To a good approximation, then, there exists a linear mapping, consisting of a translation, rotation, and rescaling, that maps the high dimensional coordinates of each neighborhood to global internal coordinates on the manifold. We therefore expect that the same weights W_{ij} that reconstruct the ith data point in D dimensions should also reconstruct its embedded manifold coordinates in d dimensions. (Informally, imagine taking a pair of scissors, cutting out locally linear patches of the underlying manifold, and placing them in the low dimensional embedding space. If this is done in a way that preserves local geometry, the transplantation of each patch involves no more than a translation, rotation, and rescaling of its data, exactly the operations to which the weights are invariant. Thus, when the patch arrives at its low dimensional destination, we expect the same weights to reconstruct each data point from its neighbors.)
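The invariance can be verified numerically. The following sketch, entirely our own construction, computes the optimal weights for one neighborhood (using the closed-form solver of Appendix A, with a conditioning term proportional to the trace so that the regularized solution shares the same invariance) and checks that they survive a random rotation, rescaling, and translation:

```python
import numpy as np

def local_weights(x, eta, reg=1e-3):
    """Sum-to-one weights reconstructing the point x from its
    neighbors eta (K x D); solves C w = 1 as in Appendix A,
    with conditioning proportional to trace(C) (cf. eq. (6))."""
    G = x - eta                                 # K x D displacements
    C = G @ G.T                                 # local covariance, eq. (4)
    C = C + reg * np.trace(C) * np.eye(len(C))  # scale-covariant conditioning
    w = np.linalg.solve(C, np.ones(len(C)))
    return w / w.sum()                          # rescale to sum to one

rng = np.random.default_rng(0)
x, eta = rng.normal(size=3), rng.normal(size=(5, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))    # a random rotation
s, t = 2.5, rng.normal(size=3)                  # a rescaling and a translation
w1 = local_weights(x, eta)
w2 = local_weights(s * (x @ Q) + t, s * (eta @ Q) + t)
print(np.allclose(w1, w2))                      # True: the weights are invariant
```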

LLE constructs a neighborhood preserving mapping based on the above idea. In the final step of the algorithm, each high dimensional observation \vec{X}_i is mapped to a low dimensional vector \vec{Y}_i representing global internal coordinates on the manifold. This is done by choosing the d-dimensional coordinates \vec{Y}_i to minimize the embedding cost function:

\Phi(Y) = \sum_i \left| \vec{Y}_i - \sum_j W_{ij} \vec{Y}_j \right|^2    (2)

This cost function, like the previous one, is based on locally linear reconstruction errors, but here we fix the weights W_{ij} while optimizing the coordinates \vec{Y}_i. The embedding cost in eq. (2) defines a quadratic form in the vectors \vec{Y}_i. Subject to constraints that make the problem well-posed, it can be minimized by solving a sparse N × N eigenvector problem, whose bottom d nonzero eigenvectors provide an ordered set of orthogonal coordinates centered on the origin. The embedding coordinates are thus computed by a single global operation that couples all data points; note too that adding further embedding dimensions amounts to computing additional bottom eigenvectors of the same problem, one at a time, independently of the dimensions already found.

Implementation of the algorithm is fairly straightforward, as it has only one free parameter: the number of neighbors per data point, K. A compact sketch of the complete procedure is given below.
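The three steps can be collected into a short, didactic implementation. This is a dense-matrix sketch under our own naming, written for clarity rather than efficiency (the appendices derive the closed-form weight solution, the conditioning of the local covariance, and the sparse eigensolver tricks used in practice):

```python
import numpy as np

def lle(X, K, d, reg=1e-3):
    """Locally linear embedding of the N x D data X into d dimensions."""
    N = len(X)
    # Step 1: K nearest neighbors of each point, by Euclidean distance.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :K]
    # Step 2: reconstruction weights, one constrained fit per point (eq. (1)).
    W = np.zeros((N, N))
    for i in range(N):
        G = X[i] - X[nbrs[i]]                    # K x D displacements
        C = G @ G.T                              # local covariance, eq. (4)
        C += reg * np.trace(C) * np.eye(K)       # conditioning, cf. eq. (6)
        w = np.linalg.solve(C, np.ones(K))
        W[i, nbrs[i]] = w / w.sum()              # rows sum to one
    # Step 3: bottom eigenvectors of M = (I - W)^T (I - W) (eqs. (2), (8), (11)).
    I = np.eye(N)
    M = (I - W).T @ (I - W)
    evals, evecs = np.linalg.eigh(M)             # ascending eigenvalues
    return evecs[:, 1:d + 1]                     # drop the constant bottom eigenvector
```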

Figure 2 summarizes the LLE algorithm, mapping high dimensional data points \vec{X}_i to low dimensional embedding vectors \vec{Y}_i: (1) compute the K nearest neighbors of each data point; (2) compute the weights W_{ij} that best reconstruct each data point from its neighbors, minimizing the cost in eq. (1) by constrained linear fits; (3) compute the vectors \vec{Y}_i best reconstructed by the weights, minimizing the quadratic form in eq. (2) by its bottom nonzero eigenvectors. Once neighbors are chosen, the optimal weights W_{ij} and coordinates \vec{Y}_i are computed by standard methods in linear algebra, and the algorithm finds global minima of the reconstruction and embedding costs in eqs. (1) and (2) in a single pass. As discussed in Appendix A, in the unusual case where the neighbors outnumber the input dimensionality (K > D), the least squares problem for finding the weights does not have a unique solution, and a regularization term, for example one that penalizes the squared magnitudes of the weights, must be added to the reconstruction cost.

The algorithm, as described above, takes as input the N high dimensional vectors \vec{X}_i. In many settings, however, the user may not have access to data of this form, but only to measurements of dissimilarity or pairwise distance between data points. A simple variation of LLE, described in Appendix C, can be applied to input of this form. In this way, matrices of pairwise distances can be analyzed by LLE just as easily as by MDS [2]; in fact, only a small fraction of all possible pairwise distances (those between neighboring points and their respective neighbors) is required.

The embeddings discovered by LLE are easiest to visualize for intrinsically two dimensional manifolds. In Figure 1, for example, the input to LLE consisted of data points sampled off a two dimensional manifold curled up in three dimensions; the algorithm successfully unraveled the underlying two dimensional structure. Figure 3 shows another two dimensional manifold, this one living in a much higher dimensional space. Here, we generated examples, shown in the middle panel of the figure, by translating the image of a single face across a larger background of noise, yielding a two-dimensional manifold parameterized by the face's center of mass. The input to LLE consisted of N = 961 grayscale images, with each image containing a 28 × 20 face superimposed on a 59 × 51 background of noise.
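As a toy analogue of the experiment in Figure 1, the lle sketch given earlier can be run on points sampled from a curled-up two dimensional sheet. The data generator below is entirely our own synthetic example, not the paper's:

```python
import numpy as np

# Sample a "swiss roll": 2D coordinates (t, h) curled into 3D.
rng = np.random.default_rng(1)
t = 1.5 * np.pi * (1 + 2 * rng.random(1000))            # angle along the curl
h = 20 * rng.random(1000)                               # position across the sheet
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])  # N x 3 inputs

Y = lle(X, K=12, d=2)   # reuses the lle() sketch from above
print(Y.shape)          # (1000, 2): roughly a flattened rectangle in (t, h)
```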

Note that while simple to visualize, the manifold of translated faces is highly nonlinear in the high dimensional (D = 3009) vector space of pixel coordinates. The bottom portion of Figure 3 shows the first two components discovered by LLE; the top portion shows the first two components discovered by PCA. It is clear that the manifold structure in this example is much better modeled by LLE.

Finally, in addition to these examples, for which the true manifold structure was known, we also applied LLE to images of lips used in the animation of talking heads [3]. Our database contained N = 15960 color (RGB) images of lips, of dimensionality D = 65664. Figure 4 shows the first two components discovered, respectively, by PCA and LLE. If the lip images described a nearly linear manifold, these two methods would yield similar results; the significant differences between the embeddings thus reveal the presence of nonlinear structure. While the linear projection by PCA has a somewhat uniform distribution about its mean, the locally linear embedding has a distinctly spiny structure, with the tips of the spines corresponding to extremal configurations of the lips.

It is worth noting that many popular iterative methods for nonlinear dimensionality reduction, such as hill-climbing for autoencoder neural networks [7, 8], self-organizing maps [9], and latent variable models [10], do not have the same guarantees of global optimality or convergence; they also tend to involve many more free parameters, such as learning rates, convergence criteria, and architectural specifications.

In Step 1 of the LLE algorithm, computing nearest neighbors scales (in the worst case) as O(DN^2): linearly in the input dimensionality, D, and quadratically in the number of data points, N.

Figure 3: The results of PCA (top) and LLE (bottom), applied to images of a single face translated across a two dimensional background of noise. LLE maps the extreme translations to the corners of its two dimensional embedding, while PCA fails to preserve the neighborhood structure of the images.

Figure 4: Images of lips mapped into the embedding space described by the first two coordinates of PCA (top) and LLE (bottom).

For many data distributions, however, and especially for data distributed on a thin submanifold of the observation space, constructions such as K-D trees can be used to compute the neighbors in O(N log N) time [13]; a sketch of this appears after this passage. In Step 2, computing the reconstruction weights scales as O(DNK^3); this is the number of operations required to solve a K × K set of linear equations for each data point. In Step 3, computing the bottom eigenvectors scales as O(dN^2): linearly in the number of embedding dimensions, d, and quadratically in the number of data points, N. Methods for sparse eigenproblems [14], however, can be used to reduce the complexity to sub-quadratic in N. Note that as more dimensions are added to the embedding space, the existing ones do not change, so that LLE does not have to be rerun to compute higher dimensional embeddings. The storage requirements of LLE are limited by the weight matrix, of size N × K.

LLE illustrates a general principle of manifold learning, elucidated by Tenenbaum et al. [11], that overlapping local neighborhoods, collectively analyzed, can provide information about global geometry.
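In practice, the O(N log N) neighbor search can lean on standard library structures. Here is a minimal sketch using scipy's cKDTree, which is our tooling choice rather than anything prescribed by the paper:

```python
import numpy as np
from scipy.spatial import cKDTree

def neighbors_kdtree(X, K):
    """K nearest neighbors of each row of X via a K-D tree,
    O(N log N) in favorable regimes instead of brute-force O(D N^2)."""
    tree = cKDTree(X)
    _, idx = tree.query(X, k=K + 1)  # k+1: each point matches itself first
    return idx[:, 1:]                # drop the self-match
```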

Many virtues of LLE are shared by the Isomap algorithm [11], an extension of MDS in which embeddings are optimized to preserve geodesic distances between pairs of data points. A virtue of LLE is that it avoids the need to solve large dynamic programming problems; it also tends to accumulate very sparse matrices, whose structure can be exploited for savings in time and space. An interesting and important question for future work is how to learn a parametric mapping between the observation and embedding spaces, given the results of LLE; advances along these lines would make LLE broadly useful in many areas of information processing.

Appendix A: Constrained least squares problem

Consider a particular data point \vec{x} with K nearest neighbors \vec{\eta}_j and reconstruction weights w_j that sum to one. We can write the reconstruction error as:

\varepsilon = \left| \vec{x} - \sum_j w_j \vec{\eta}_j \right|^2 = \left| \sum_j w_j (\vec{x} - \vec{\eta}_j) \right|^2 = \sum_{jk} w_j w_k C_{jk}    (3)

where in the first identity we have exploited the fact that the weights sum to one, and in the second identity we have introduced the local covariance matrix,

C_{jk} = (\vec{x} - \vec{\eta}_j) \cdot (\vec{x} - \vec{\eta}_k)    (4)

This error can be minimized in closed form, using a Lagrange multiplier to enforce the constraint \sum_j w_j = 1. In terms of the inverse local covariance matrix, the optimal weights are given by:

w_j = \frac{\sum_k C^{-1}_{jk}}{\sum_{lm} C^{-1}_{lm}}    (5)

The solution, as written in eq. (5), appears to require an explicit inversion of the local covariance matrix. In practice, a more efficient way to minimize the error is simply to solve the linear system of equations \sum_k C_{jk} w_k = 1 and then to rescale the weights so that they sum to one (which yields the same result).
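The agreement between the closed form of eq. (5) and the solve-then-rescale route can be checked directly. A small numpy sketch of ours:

```python
import numpy as np

rng = np.random.default_rng(2)
x, eta = rng.normal(size=4), rng.normal(size=(3, 4))  # K = 3 neighbors in D = 4
G = x - eta
C = G @ G.T                                  # local covariance matrix, eq. (4)

Cinv = np.linalg.inv(C)
w_closed = Cinv.sum(axis=1) / Cinv.sum()     # eq. (5): w_j proportional to sum_k C^-1_jk

w_solve = np.linalg.solve(C, np.ones(3))     # solve C w = 1 ...
w_solve /= w_solve.sum()                     # ... then rescale to sum to one

print(np.allclose(w_closed, w_solve))        # True: the two routes agree
```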

By construction, the local covariance matrix in eq. (4) is symmetric and semipositive definite. If it is singular or nearly singular, as arises, for example, when there are more neighbors than input dimensions (K > D), or when the data points are not in general position, it can be conditioned (before solving the system) by adding a small multiple of the identity matrix,

C_{jk} \leftarrow C_{jk} + \frac{\Delta^2}{K} \delta_{jk}    (6)

where \Delta^2 is small compared to the trace of C.

Appendix B: Eigenvector problem

The embedding vectors \vec{Y}_i are found by minimizing the cost function of eq. (2) for fixed weights W_{ij}:

\Phi(Y) = \sum_i \left| \vec{Y}_i - \sum_j W_{ij} \vec{Y}_j \right|^2    (7)

Note that the cost defines a quadratic form,

\Phi(Y) = \sum_{ij} M_{ij} \, (\vec{Y}_i \cdot \vec{Y}_j)

involving inner products of the embedding vectors and the N × N matrix M:

M_{ij} = \delta_{ij} - W_{ij} - W_{ji} + \sum_k W_{ki} W_{kj}    (8)

where \delta_{ij} is 1 if i = j and 0 otherwise. (This identity is spot-checked numerically after this passage.) The minimization is performed subject to constraints that make the problem well-posed. Since the coordinates \vec{Y}_i can be translated by a constant displacement without affecting the cost \Phi(Y), we remove this degree of freedom by requiring the coordinates to be centered on the origin:

\sum_i \vec{Y}_i = \vec{0}    (9)

Also, to avoid degenerate solutions, we constrain the embedding vectors to have unit covariance, with outer products that satisfy

\frac{1}{N} \sum_i \vec{Y}_i \vec{Y}_i^{\top} = I    (10)

where I is the d × d identity matrix.
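The equivalence between eq. (7) and the quadratic form built from eq. (8) is easy to confirm numerically; the sketch below is our own spot check on random data:

```python
import numpy as np

rng = np.random.default_rng(3)
N, d = 6, 2
W = rng.random((N, N))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)     # rows sum to one
Y = rng.normal(size=(N, d))

cost_direct = np.sum((Y - W @ Y) ** 2)               # eq. (7)

M = np.eye(N) - W - W.T + W.T @ W                    # eq. (8)
cost_quadratic = np.einsum('ij,ij->', M, Y @ Y.T)    # sum_ij M_ij (Y_i . Y_j)

print(np.allclose(cost_direct, cost_quadratic))      # True
```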

There is no loss of generality in constraining the covariance of the embedding coordinates to be diagonal and of order unity, since the cost function in eq. (2) is invariant to rotations and homogeneous rescalings of the embedding space. The optimal embedding, up to a global rotation of the embedding space, is found by computing the bottom d + 1 eigenvectors of the matrix M; this is a version of the Rayleigh-Ritz theorem [12]. The bottom eigenvector of this matrix, which we discard, is the unit vector with all equal components; it represents a free translation mode with eigenvalue zero. Discarding it enforces the constraint that the embeddings have zero mean, since the components of the other eigenvectors must sum to zero, by virtue of orthogonality. The remaining d eigenvectors form the d embedding coordinates found by LLE.

Note that the bottom d + 1 eigenvectors of the matrix M (that is, those corresponding to its smallest d + 1 eigenvalues) can be found without performing a full matrix diagonalization [14]. Moreover, the matrix M can be stored and manipulated as the sparse symmetric matrix

M = (I - W)^{\top} (I - W)    (11)

giving substantial computational savings for large values of N. In particular, left multiplication by M (the subroutine required by most sparse eigensolvers) can be performed as

M \vec{v} = (\vec{v} - W \vec{v}) - W^{\top} (\vec{v} - W \vec{v})    (12)

requiring just one multiplication by W and one multiplication by W^{\top}, both of which are extremely sparse. Thus, the matrix M never needs to be explicitly created or stored; it is sufficient to store and multiply the matrix W.
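This matrix-free strategy maps directly onto standard sparse eigensolvers. The sketch below assumes scipy's ARPACK wrapper eigsh and wraps the matvec of eq. (12) in a LinearOperator; the simple which='SM' mode shown here can converge slowly, and shift-invert is the usual faster choice in production:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def embedding_from_sparse_weights(W, d):
    """Bottom-eigenvector embedding from a sparse N x N weight matrix W
    (rows summing to one), without ever forming M = (I - W)^T (I - W)."""
    N = W.shape[0]

    def matvec(v):
        r = v - W @ v            # (I - W) v
        return r - W.T @ r       # eq. (12): (I - W)^T (I - W) v

    M = LinearOperator((N, N), matvec=matvec, dtype=float)
    vals, vecs = eigsh(M, k=d + 1, which='SM')   # smallest d + 1 eigenpairs
    order = np.argsort(vals)
    return vecs[:, order[1:]]    # discard the constant bottom eigenvector
```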

