Computer Vision - University of Cambridge

Computer Vision
Computer Science Tripos: 16 Lectures by J G Daugman

1. Goals of computer vision; why they are so difficult [...]
Image sensing, pixel arrays, [...]
Biological visual mechanisms, from retina to [...]
Edge detection operators; [...]
Higher brain visual mechanisms; streaming; [...]
Texture, colour, stereo, [...]
[...]; [...]ers; [...] a set of [...]: face detection and recognition; [...]

[...] this course are to introduce the principles, models and applications of computer vision, as well as some mechanisms used in biological visual systems that may inspire design of artificial [...]: image formation, structure, and coding; edge and feature detection; neural operators for image analysis; texture, colour, stereo, motion; wavelet methods for visual coding and analysis.

Interpretation of surfaces, solids, and shapes; data fusion; probabilistic classifiers; visual inference and learning; [...]

Goals of computer vision; why they are so difficult. [...] images are formed, and the ill-posed problem of making 3D inferences from them about objects and their properties.
Image sensing, pixel arrays, [...]
Biological visual mechanisms from retina to [...]; receptive field profiles; spike trains; [...]; convolution; [...]
Edge detection operators; the information revealed by [...]'s Theorem. [...] visual primitives.
Higher level visual operations in [...]; streaming and divisions of labour; reciprocal feedback through the visual system.
Texture, colour, stereo, [...] of invariances. Discounting the illuminant when inferring 3D structure and surface properties.

Shape from shading; surface geometry. Boundary descriptors; codons; superquadrics and the "[...]" sketch.
[...] see. Lessons from neurological trauma and visual de[...] imply about how vision works.
Bayesian inference in vision; knowledge-driven [...] in vision.
Vision as a set of inverse problems; mathematical methods for solving them: energy minimisation, relaxation, regularisation.
Approaches to face detection, face recognition, [...]

[...] the end of the course students should:
- understand visual processing from both "bottom-up" (data oriented) and "top-down" (goals oriented) perspectives
- be able to decompose visual tasks into sequences of image analysis operations, representations, specific algorithms, and inference principles
- understand the roles of image transformations and their invariances in pattern recognition and classification
- be able to analyse the robustness, brittleness, generalisability, and performance of different approaches in computer vision
- be able to describe key aspects of how biological visual systems encode, analyse, and represent visual information
- be able to think of ways in which biological visual strategies might be implemented in machine vision.

Despite the enormous differences in hardware [...]
- understand in depth at least one important application domain, such as face recognition, detection, or interpretation

Recommended book
Shapiro, L. & Stockman, G. (2001). Computer Vision. [...]

[...]: "The Computer Vision Homepage" (Carnegie Mellon University): [...]

[...] computer vision; why they are so difficult

[...] generate intelligent and useful descriptions of visual scenes and sequences, and of the objects that populate them, by performing operations on the signals received [...]

[...] computer vision applications and goals:
- automatic face recognition, and interpretation of expression
- visual guidance of autonomous vehicles
- automated medical image analysis, interpretation, and diagnosis
- robotic manufacturing: manipulation, grading, and assembly of parts
- OCR: recognition of printed or handwritten characters and words
- agricultural robots: visual grading and harvesting of produce
- smart offices: tracking of persons and objects.

- understanding gestures
- biometric-based visual identification of persons
- visually endowed robotic helpers
- security monitoring and alerting; detection of anomaly
- intelligent interpretive prostheses for the blind
- tracking of moving objects; collision avoidance; stereoscopic depth
- object-based (model-based) compression of video streams
- general scene understanding

In many respects, computer vision is an "AI-complete" problem: building general-purpose vision machines would entail, or require, solutions to most of the general goals of artificial [...] would require finding ways of building flexible and robust visual representations of the world, maintaining and updating them, and interfacing them with attention, [...]

[...] other problems in AI, the challenge of vision can be described in terms of building a signal-to-symbol [...] itself only as physical signals on sensory surfaces (such as videocamera, retina, [...]), which explicitly express very little of the information required for intelligent understanding of the environment.

These signals must be converted ultimately into symbolic representations whose manipulation allows the machine or organism to [...] such an effortless and immediate faculty for humans and other animals, it has proven exceedingly difficult to [...]:

1. An image is a two-dimensional optical projection, but the world we wish to make sense of visually is [...] this respect, vision is "inverse optics:" we need to invert the 3D → 2D projection in order to recover world properties (object properties in space); but the 2D → 3D inversion of such a projection is, strictly, [...] another respect, vision is "inverse graphics:" graphics begins with a 3D world description (in terms of object and illuminant properties, viewpoint, etc.), and "merely" computes the resulting 2D image, with its occluded surfaces, shading and shadows, gradients, perspective, [...] this process!

A classical and central problem in computer vision is [...] effortlessly, rapidly, reliably, and unconsciously. (We don't even know quite how we do it; like so many tasks for which our neural resources are so formidable, we have little "cognitive penetrance" or understanding of how we actually perform face recognition.) Consider these three facial images (from Pawan Sinha, MIT, 2002):

Which two pictures show the same person? Most algorithms for computer vision select 1 and 2 as the same person, since those images are more similar than 1 [...]

Very few visual tasks can be successfully performed in a purely data-driven way ("bottom-up" image analysis).
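The difficulty of inverting the 3D → 2D projection described above can be illustrated with a minimal sketch. It assumes an idealised pinhole camera; the pinhole model, the function name `project`, and the sample points are illustrative assumptions, not taken from the lecture notes. Two distinct 3D points lying on the same ray through the optical centre map to the same image point, so the image alone cannot tell them apart.

```python
def project(point, focal_length=1.0):
    """Pinhole (perspective) projection of a 3D point (X, Y, Z) onto the image plane.

    Assumption: idealised pinhole camera at the origin looking along +Z.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


# Two different 3D points on the same ray through the optical centre
# (hypothetical coordinates chosen for illustration).
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same direction, twice the distance

image_near = project(near)
image_far = project(far)

# Both points land on the same image coordinates: depth is lost.
print(image_near, image_far)
```

Because every point along a ray collapses onto one image location, recovering depth needs extra constraints or prior knowledge beyond the single image.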

Consider the next image example: the foxes are well camouflaged by their textured backgrounds; the foxes occlude each other; they appear in several different poses and perspective angles; [...] can there possibly exist mathematical operators for such an image that can:
- perform the figure-ground segmentation of the scene (into its objects and background)
- infer the 3D arrangements of objects from their mutual occlusions
- infer surface properties (texture, colour) from the 2D image statistics
- infer volumetric object properties from their 2D image projections
- and do all of this in "real time?" (This matters quite a lot in the natural world "red in tooth and claw," since survival depends on it.)

Consider now the actual image data of a face, shown as a pixel array with luminance plotted as a function of (X,Y) [...] this image, or even segment the face from its background, let alone recognize the face?
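As a rough illustration of the pixel-array view of an image, the sketch below stores luminance as a small 2D array indexed by (X, Y) and applies a naive global threshold as a purely bottom-up figure-ground segmentation. The 4x4 array, the threshold value, and the function name `threshold_segment` are hypothetical and chosen only for illustration; thresholding is not a method proposed in the notes.

```python
def threshold_segment(image, threshold):
    """Label each pixel 1 (figure) if its luminance exceeds threshold, else 0 (ground)."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


# Hypothetical 4x4 luminance array, indexed as image[y][x], values in 0..255.
image = [
    [10, 12, 11, 13],
    [11, 200, 210, 12],
    [10, 205, 198, 11],
    [12, 11, 10, 13],
]

mask = threshold_segment(image, threshold=128)
for row in mask:
    print(row)
```

This only works because the toy object is uniformly brighter than its background. In a scene like the camouflaged foxes, figure and ground have overlapping luminance values, so no single threshold separates them.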

In this form, the image reveals both the complexity of the problem and the poverty of [...] "counsel of despair" can be given a more formal statement:

Most of the problems we need to solve in vision are ill-posed, in Hadamard's sense that a well-posed problem must have the following set of properties:
- its solution exists;
- its solution is unique;
- [...]

[...], few of the tasks we need to solve in vision are well-posed problems in Hadamard's [...]:
- inferring depth properties from an image
- inferring surface properties from image properties
- inferring colours in an illuminant-invariant manner
- inferring structure from motion, shading, texture, shadows, ...
- inferring a 3D shape unambiguously from a 2D line drawing:
- interpreting the mutual occlusions of objects, and stereo disparity
- recognizing a 3D object regardless of its rotations about its three axes in space ([...] chair seen from many different angles)
- understanding an object that has never been seen before:

[...], pixel arrays, CCD cameras, [...]

[...] CCD video camera contains a dense array of independent sensors, which convert incident photons focused by the lens onto each point into a charge proportional to the light [...] "coupled" (hence CCD) capacitively to allow a voltage (V=Q/C) to be read out in a sequence scanning the array.
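The readout relation V = Q/C mentioned above can be made concrete with a small worked sketch. The electron count and the capacitance below are hypothetical values chosen only to show the arithmetic; they are not figures from the notes, and real sensors differ widely.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs per electron (exact SI value)


def readout_voltage(num_electrons, capacitance_farads):
    """Voltage V = Q / C for a given number of collected electrons."""
    if capacitance_farads <= 0:
        raise ValueError("capacitance must be positive")
    charge = num_electrons * ELEMENTARY_CHARGE  # Q in coulombs
    return charge / capacitance_farads


# Hypothetical pixel: 10,000 collected electrons read out across 50 femtofarads.
voltage = readout_voltage(10_000, 50e-15)
print(f"{voltage * 1000:.2f} mV")
```

Since the collected charge is proportional to the incident light, the voltage read out for each pixel is also proportional to it under these assumptions.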

The number of pixels (picture elements) ranges from a few 100,000 to many millions ([...] MegaPixel) in an imaging array that is about 1 cm² in size, so each pixel sensing element is only about 3 [...] flux into such small catchment areas is a factor limiting further increases in resolution by [...] microns is only 6 times larger than the wavelength of a photon of light in the visible spectrum (yellow 500 nanometers or nm).

Spatial resolution of the image is thus determined both by the density of elements in the CCD array, and by the properties of the lens which is [...] (the number of distinguishable grey levels) is determined by the number of bits per pixel resolved by the digitizer, and by the inherent signal-to-noise ratio of [...] (conceptually if not literally) from three separate CCD arrays preceded by different colour filters, or mutually embedded as sub-populations within a single CCD array.
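The arithmetic behind these figures can be checked with a short sketch. It assumes a square array fully tiled by square pixels, which is a simplification introduced here, and the function names are illustrative. Under that assumption a 1 cm² array with ten million pixels has a pixel pitch of roughly 3 microns, about 6 times a 500 nm wavelength, and a digitizer resolving b bits per pixel can distinguish at most 2^b grey levels, before noise is taken into account.

```python
import math


def pixel_pitch_microns(num_pixels, array_area_cm2=1.0):
    """Side length of one pixel, assuming a square array fully tiled by square pixels."""
    side_microns = math.sqrt(array_area_cm2) * 1e4  # 1 cm = 10,000 microns
    pixels_per_side = math.sqrt(num_pixels)
    return side_microns / pixels_per_side


def grey_levels(bits_per_pixel):
    """Upper bound on distinguishable grey levels for a given bit depth."""
    return 2 ** bits_per_pixel


pitch = pixel_pitch_microns(10_000_000)  # ten million pixels on about 1 cm^2
ratio_to_wavelength = pitch / 0.5        # 500 nm = 0.5 microns

print(f"pitch = {pitch:.2f} microns, {ratio_to_wavelength:.1f} x wavelength")
print(f"8 bits per pixel -> {grey_levels(8)} grey levels")
```

The bit depth is only an upper bound: as the text notes, the usable number of grey levels is also limited by the signal-to-noise ratio.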
