Lecture 11: Detection and Segmentation

Fei-Fei Li & Justin Johnson & Serena Yeung, Lecture 11, May 10, 2017

Administrative
Midterms being graded
Please don't discuss midterms until next week - some students have not yet taken it
A2 being graded
Project milestones due Tuesday 5/16

HyperQuest
Will post more details on Piazza this afternoon

Last Time: Recurrent Networks
Figure from Karpathy et al, "Deep Visual-Semantic Alignments for Generating Image Descriptions", CVPR 2015.

Figure copyright IEEE, for educational purposes.

Last Time: Recurrent Networks (example generated captions)
A cat sitting on a suitcase on the floor
A cat is sitting on a tree branch
Two people walking on the beach with surfboards
A tennis player in action on the court
A woman is holding a cat in her hand
A person holding a computer mouse on a desk

Last Time: Recurrent Networks
Vanilla RNN (also called Simple RNN or Elman RNN): Elman, "Finding Structure in Time", Cognitive Science, 1990
Long Short Term Memory (LSTM): Hochreiter and Schmidhuber, "Long Short-Term Memory", Neural Computation, 1997

Today: Segmentation, Localization, Detection

Image Classification
Image -> Vector: 4096 -> Fully-Connected: 4096 to 1000 -> Class Scores (Cat: ..., ...)
This image is CC0 public domain

Other Computer Vision Tasks
Semantic Segmentation: GRASS, CAT, TREE, SKY (no objects, just pixels)
Classification + Localization: CAT (single object)
Object Detection: DOG, DOG, CAT (multiple objects)
Instance Segmentation: DOG, DOG, CAT (multiple objects)

Semantic Segmentation
Label each pixel in the image with a category label
Don't differentiate instances, only care about pixels
Example labels: Cow, Grass, Sky, Trees (first image); Cat, Grass, Sky, Trees (second image)

Semantic Segmentation Idea: Sliding Window
Full image -> extract patch -> classify center pixel with CNN (Cow, Cow, Grass)
Problem: Very inefficient! Not reusing shared features between overlapping patches
Farabet et al, "Learning Hierarchical Features for Scene Labeling", TPAMI 2013
Pinheiro and Collobert, "Recurrent Convolutional Neural Networks for Scene Labeling", ICML 2014

Semantic Segmentation Idea: Fully Convolutional
Design a network as a bunch of convolutional layers to make predictions for pixels all at once!
Input: 3 x H x W -> Conv, Conv, Conv, Conv (Convolutions: D x H x W) -> Scores: C x H x W -> argmax -> Predictions: H x W
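The last step of the fully convolutional pipeline, turning C x H x W scores into an H x W label map with an argmax over classes, can be sketched in plain Python. This is a minimal illustration rather than the lecture's code: the function name `predict_labels`, the nested-list layout `scores[c][y][x]`, and the score values are all assumptions made for the example.

```python
def predict_labels(scores):
    """Per-pixel argmax over class scores.

    scores is a nested list indexed as scores[c][y][x] (C x H x W).
    Returns an H x W nested list of class indices.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]


# Hypothetical scores for C = 3 classes on a 2 x 2 image.
example_scores = [
    [[0.1, 0.7], [0.2, 0.1]],  # class 0
    [[0.8, 0.2], [0.3, 0.1]],  # class 1
    [[0.1, 0.1], [0.5, 0.8]],  # class 2
]
example_labels = predict_labels(example_scores)
print(example_labels)
```

Every pixel gets exactly one label and nothing distinguishes separate instances of the same class, which matches the definition of semantic segmentation given earlier.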

Problem: convolutions at original image resolution will be very expensive ...

Semantic Segmentation Idea: Fully Convolutional
Design network as a bunch of convolutional layers, with downsampling and upsampling inside the network!
Input: 3 x H x W -> High-res: D1 x H/2 x W/2 -> Med-res: D2 x H/4 x W/4 -> Low-res: D3 x H/4 x W/4 -> Med-res: D2 x H/4 x W/4 -> High-res: D1 x H/2 x W/2 -> Predictions: H x W
Long, Shelhamer, and Darrell, "Fully Convolutional Networks for Semantic Segmentation", CVPR 2015
Noh et al, "Learning Deconvolution Network for Semantic Segmentation", ICCV 2015
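To see roughly why convolving at full resolution is expensive, one can count multiply-accumulate operations for a single convolution layer. The sketch below assumes a stride-1 convolution with padding chosen so the output keeps the input's H x W; the function name `conv_macs` and the sizes (224 x 224 image, 64 channels) are illustrative assumptions, not figures from the lecture.

```python
def conv_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulates for one stride-1, size-preserving convolution.

    Each of the height * width * c_out outputs is a dot product over
    a kernel x kernel x c_in window.
    """
    return height * width * c_out * kernel * kernel * c_in


# Hypothetical sizes: same layer applied at full and at 1/4 resolution.
full_res = conv_macs(224, 224, 64, 64)
quarter_res = conv_macs(224 // 4, 224 // 4, 64, 64)
print(full_res, quarter_res, full_res // quarter_res)
```

Halving the resolution twice cuts the cost of an otherwise identical layer by a factor of 16, which is the motivation for doing most of the work at low resolution and upsampling at the end.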

Downsampling: Pooling, strided convolution
Upsampling: ???

In-Network upsampling: Unpooling

Nearest Neighbor
Input: 2 x 2
1 2
3 4
Output: 4 x 4
1 1 2 2
1 1 2 2
3 3 4 4
3 3 4 4

Bed of Nails
Input: 2 x 2
1 2
3 4
Output: 4 x 4
1 0 2 0
0 0 0 0
3 0 4 0
0 0 0 0

In-Network upsampling: Max Unpooling

Max Pooling: Remember which element was max!
Input: 4 x 4
1 2 6 3
3 5 2 1
1 2 2 1
7 3 4 8
Output:
5 6
7 8

Max Unpooling: Use positions from pooling layer
Input: 2 x 2
1 2
3 4
Output: 4 x 4
0 0 2 0
0 1 0 0
0 0 0 0
3 0 0 4
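The three unpooling variants above can be reproduced on the slide's own numbers with a short pure-Python sketch. The function names and the nested-list representation are assumptions made for illustration; a real network would use a framework's tensor operations instead.

```python
def nearest_neighbor(x, factor=2):
    """Copy each input value into a factor x factor block."""
    return [[x[i // factor][j // factor]
             for j in range(len(x[0]) * factor)]
            for i in range(len(x) * factor)]


def bed_of_nails(x, factor=2):
    """Put each input value in the top-left of its block, zeros elsewhere."""
    out = [[0] * (len(x[0]) * factor) for _ in range(len(x) * factor)]
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            out[i * factor][j * factor] = value
    return out


def max_pool_with_indices(x, size=2):
    """Non-overlapping max pooling that also records where each max was."""
    pooled, indices = [], []
    for i in range(0, len(x), size):
        pooled_row, index_row = [], []
        for j in range(0, len(x[0]), size):
            window = [(x[a][b], (a, b))
                      for a in range(i, i + size)
                      for b in range(j, j + size)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(x, indices, out_height, out_width):
    """Place each input value at the position remembered by max pooling."""
    out = [[0] * out_width for _ in range(out_height)]
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            a, b = indices[i][j]
            out[a][b] = value
    return out


small = [[1, 2], [3, 4]]
big = [[1, 2, 6, 3], [3, 5, 2, 1], [1, 2, 2, 1], [7, 3, 4, 8]]
pooled, positions = max_pool_with_indices(big)
print(nearest_neighbor(small))
print(bed_of_nails(small))
print(pooled)
print(max_unpool(small, positions, 4, 4))
```

Max unpooling differs from bed of nails only in where the non-zero values land: it reuses the positions saved by the matching pooling layer instead of always using the top-left corner.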

Max pooling output: 2 x 2 -> rest of the network -> max unpooling
Corresponding pairs of downsampling and upsampling layers

Learnable Upsampling: Transpose Convolution
Recall: Typical 3 x 3 convolution, stride 1 pad 1
Input: 4 x 4, Output: 4 x 4
Dot product between filter and input

Recall: Normal 3 x 3 convolution, stride 2 pad 1
Input: 4 x 4, Output: 2 x 2
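The two recalled cases can be checked with a naive single-channel convolution, where each output value is the dot product between the filter and one window of the zero-padded input. This is a sketch under simplifying assumptions (square input, square filter, one channel, no bias); `conv2d` here is a hypothetical helper, not a library function.

```python
def conv2d(x, w, stride=1, pad=0):
    """Naive 2D convolution (cross-correlation) on nested lists."""
    n, k = len(x), len(w)
    size = n + 2 * pad
    # Zero-pad the input on all four sides.
    padded = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            padded[i + pad][j + pad] = x[i][j]
    # Standard output-size formula: floor((n + 2*pad - k) / stride) + 1.
    out_n = (size - k) // stride + 1
    out = [[0] * out_n for _ in range(out_n)]
    for i in range(out_n):
        for j in range(out_n):
            # Dot product between the filter and the current window.
            out[i][j] = sum(padded[i * stride + a][j * stride + b] * w[a][b]
                            for a in range(k) for b in range(k))
    return out


ones_input = [[1] * 4 for _ in range(4)]
ones_filter = [[1] * 3 for _ in range(3)]
same_size = conv2d(ones_input, ones_filter, stride=1, pad=1)
downsampled = conv2d(ones_input, ones_filter, stride=2, pad=1)
print(len(same_size), len(downsampled))
```

With an all-ones input and filter, each output simply counts how many input pixels its window covers, which makes the effect of padding at the borders easy to see.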

Stride 2 convolution (Input: 4 x 4, Output: 2 x 2)
Dot product between filter and input
Filter moves 2 pixels in the input for every one pixel in the output
Stride gives ratio between movement in input and output

Learnable Upsampling: Transpose Convolution
3 x 3 transpose convolution, stride 2 pad 1
Input: 2 x 2, Output: 4 x 4
Input gives weight for filter
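A minimal pure-Python sketch of the transpose convolution described above: each input value weights a copy of the filter, copies are placed `stride` pixels apart in the output, and overlapping contributions are summed. It assumes a single channel and square input and filter; `conv_transpose2d` and its `out_size` argument are illustrative names. The output size is passed explicitly because a 3 x 3, stride 2, pad 1 convolution maps both 3 x 3 and 4 x 4 inputs to 2 x 2, so the reverse direction is ambiguous.

```python
def conv_transpose2d(x, w, stride, pad, out_size):
    """Naive single-channel transpose convolution on nested lists."""
    n, k = len(x), len(w)
    size = out_size + 2 * pad
    # The scattered filter copies must fit inside the padded output buffer.
    assert (n - 1) * stride + k <= size, "out_size too small for this input"
    buffer = [[0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            # The input value gives the weight for one copy of the filter.
            for a in range(k):
                for b in range(k):
                    buffer[i * stride + a][j * stride + b] += x[i][j] * w[a][b]
    # Crop the padding border, mirroring the padding of the forward conv.
    return [row[pad:pad + out_size] for row in buffer[pad:pad + out_size]]


small_input = [[1, 2], [3, 4]]
ones_filter = [[1] * 3 for _ in range(3)]
upsampled = conv_transpose2d(small_input, ones_filter, stride=2, pad=1, out_size=4)
for row in upsampled:
    print(row)
```

Where neighbouring filter copies overlap, their values add, so the centre of the output mixes contributions from several input pixels. Frameworks expose the output-size ambiguity through an extra argument, for example `output_padding` in PyTorch's `ConvTranspose2d`.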
