
Deep Multi-Scale Convolutional Neural Network for …




Deep Multi-Scale Convolutional Neural Network for Dynamic Scene Deblurring

Seungjun Nah, Tae Hyun Kim, Kyoung Mu Lee
Department of ECE, ASRI, Seoul National University, 151-742, Seoul, Korea

Abstract

Non-uniform blind deblurring for general dynamic scenes is a challenging computer vision problem, as blurs arise not only from multiple object motions but also from camera shake and scene depth variation. To remove these complicated motion blurs, conventional energy optimization based methods rely on simple assumptions, such as the blur kernel being partially uniform or locally linear. Moreover, recent machine learning based methods also depend on synthetic blur datasets generated under these assumptions. This makes conventional deblurring methods fail to remove blurs where the blur kernel is difficult to approximate or parameterize (e.g., object motion boundaries). In this work, we propose a multi-scale convolutional neural network that restores sharp images in an end-to-end manner where blur is caused by various sources.

Together, we present a multi-scale loss function that mimics conventional coarse-to-fine approaches. Furthermore, we propose a new large-scale dataset that provides pairs of a realistic blurry image and the corresponding ground truth sharp image, obtained by a high-speed camera. With the proposed model trained on this dataset, we demonstrate empirically that our method achieves state-of-the-art performance in dynamic scene deblurring not only qualitatively, but also quantitatively.

1. Introduction

Motion blur is one of the most commonly arising types of artifacts when taking photos. Camera shake and fast object motion degrade image quality and produce undesired blurry images. Furthermore, various causes such as depth variation and occlusion at motion boundaries make blurs even more complex. The single image deblurring problem is to estimate the unknown sharp image given a blurry image. Earlier studies focused on removing blurs caused by simple translational or rotational camera motions.

More recent works try to handle general non-uniform blurs caused by depth variation, camera shake and object motions in dynamic environments. Most of these approaches are based on the following blur model [28,10,13,11]:

B = KS + n,   (1)

where B, S and n are the vectorized blurry image, latent sharp image, and noise, respectively. K is a large sparse matrix whose rows each contain a local blur kernel acting on S to generate a blurry pixel. In practice, the blur kernel is unknown. Thus, blind deblurring methods try to estimate the latent sharp image S and the blur kernel K simultaneously.

Finding a blur kernel for every pixel is a severely ill-posed problem. Thus, some approaches tried to parametrize blur models with simple assumptions on the sources of blurs. In [28,10], blur is assumed to be caused by 3D camera motion only. However, in dynamic scenes, kernel estimation is more challenging, as there are multiple moving objects as well as camera motion. Thus, Kim et al. [14] proposed a dynamic scene deblurring method that jointly segments and deblurs a non-uniformly blurred image, allowing the estimation of a complex (non-linear) kernel within a segment.
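The blur model in Eq. (1) can be illustrated with a small 1D sketch. This is only an illustration under stated assumptions: the helper names `build_blur_matrix` and `apply_blur`, the dense (rather than sparse) storage of K, the border clamping, and the toy kernels are all hypothetical and do not come from the paper.

```python
def build_blur_matrix(kernels, n):
    """Build an n x n matrix K whose i-th row holds the local kernel of pixel i.

    kernels[i] is an odd-length list of weights centred on pixel i.
    K is stored densely here for readability; in practice it would be sparse.
    """
    K = [[0.0] * n for _ in range(n)]
    for i, kernel in enumerate(kernels):
        radius = len(kernel) // 2
        for offset, weight in enumerate(kernel):
            # Clamp indices at the borders (an assumed boundary rule).
            j = min(max(i + offset - radius, 0), n - 1)
            K[i][j] += weight
    return K


def apply_blur(K, sharp, noise=None):
    """Compute B = K S + n for a vectorized signal S."""
    if noise is None:
        noise = [0.0] * len(sharp)
    return [sum(k * s for k, s in zip(row, sharp)) + e
            for row, e in zip(K, noise)]


sharp = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
# Spatially varying blur: the left half stays sharp,
# the right half is blurred by a 3-tap box kernel.
kernels = [[1.0]] * 3 + [[1 / 3, 1 / 3, 1 / 3]] * 3
K = build_blur_matrix(kernels, len(sharp))
blurry = apply_blur(K, sharp)
```

Because every pixel may have its own row in K, a blind method has to recover far more unknowns than there are observed pixels, which is why the problem is described as severely ill-posed.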

In addition, Kim and Lee [15] approximated the blur kernel to be locally linear and proposed an approach that jointly estimates both the latent image and the locally linear motions. However, these blur kernel approximations are still inaccurate, especially in cases of abrupt motion discontinuities and occlusions. Note that such erroneous kernel estimation directly affects the quality of the latent image, resulting in undesired ringing artifacts.

Recently, CNNs (Convolutional Neural Networks) have been applied to numerous computer vision problems, including the deblurring problem, and have shown promising results [29,25,26,1]. Since no pairs of real blurry images and ground truth sharp images are available for supervised learning, these methods commonly used blurry images generated by convolving synthetic blur kernels. In [29,25,1], synthesized blur images with uniform blur kernels are used for training. In [26], a classification CNN is trained to estimate locally linear blur kernels.

Thus, CNN-based models are still suited only to some specific types of blurs, and there are restrictions on more common spatially varying blurs.

Figure 1. (a) Input blurry image. (b) Result of Sun et al. [26]. (c) Our deblurring result. Our results show clear object boundaries.

All the existing methods still have many problems before they could be generalized and used in practice. These are mainly due to the use of simple and unrealistic blur kernel models. Thus, to solve those problems, in this work we propose a novel end-to-end deep learning approach for dynamic scene deblurring.

First, we propose a multi-scale CNN that directly restores latent images without assuming any restricted blur kernel model. In particular, the multi-scale architecture is designed to mimic conventional coarse-to-fine optimization methods. Unlike other approaches, our method does not estimate explicit blur kernels. Accordingly, our method is free from artifacts that arise from kernel estimation errors.

Second, we train the proposed model with a multi-scale loss that is appropriate for the coarse-to-fine architecture and greatly enhances convergence. In addition, we further improve the results by employing adversarial loss [9]. Third, we propose a new realistic blurry image dataset with ground truth sharp images. To obtain a kernel-model-free dataset for training, we employ the dataset acquisition method introduced in [17]. As the blurring process can be modeled by the integration of sharp images during shutter time [17,21,16], we captured a sequence of sharp frames of a dynamic scene with a high-speed camera and averaged them to generate a blurry image by considering gamma correction.

By training with the proposed dataset and adding proper augmentation, our model can handle general local blur kernels implicitly. As the loss term optimizes the result to resemble the ground truth, it even restores occluded regions where the blur kernel is extremely complex, as shown in Figure 1. We trained our model with millions of pairs of image patches and achieved significant improvements in dynamic scene deblurring.
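The exact form of the multi-scale loss is not given in this part of the text, so the following is only a minimal sketch of the general idea: compare the output of each scale level with the sharp image resized to the same scale, and average the per-level errors. The 2x2 average-pooling pyramid, the equal weighting of levels, and the function names `downsample2`, `pyramid`, `mse` and `multi_scale_loss` are assumptions for illustration, not the authors' formulation.

```python
def downsample2(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks (assumed scheme)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)]
            for y in range(h)]


def pyramid(img, levels):
    """Return the image at each scale, ordered from finest to coarsest."""
    out = [img]
    for _ in range(levels - 1):
        out.append(downsample2(out[-1]))
    return out


def mse(a, b):
    """Mean squared error between two images of identical size."""
    count = len(a) * len(a[0])
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b)) / count


def multi_scale_loss(outputs, sharp, levels):
    """Average the per-scale MSE between each level's output and its sharp target."""
    targets = pyramid(sharp, levels)
    return sum(mse(o, t) for o, t in zip(outputs, targets)) / levels


sharp = [[float(4 * y + x) for x in range(4)] for y in range(4)]
perfect = pyramid(sharp, 3)  # outputs that match the target at every scale
shifted = [[[v + 0.5 for v in row] for row in level] for level in perfect]
loss_perfect = multi_scale_loss(perfect, sharp, 3)
loss_shifted = multi_scale_loss(shifted, sharp, 3)
```

Normalizing each level by its own pixel count keeps the coarse levels from being drowned out by the finest one, so every intermediate output receives a direct training signal.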

Extensive experimental results demonstrate that the performance of the proposed method is far superior to that of state-of-the-art dynamic scene deblurring methods in both qualitative and quantitative evaluations.

1.1. Related Works

There are several approaches that employed CNNs for deblurring [29,26,25,1].

Xu et al. [29] proposed an image deconvolution CNN to deblur a blurry image in a non-blind setting. They built a network based on the separable kernel property that the (inverse) blur kernel can be decomposed into a small number of significant filters. Additionally, they incorporated the denoising network [7] to reduce visual artifacts such as noise and color saturation by concatenating the module at the end of their proposed network.

On the other hand, Schuler et al. [25] proposed a blind deblurring method with a CNN. Their proposed network mimics conventional optimization-based deblurring methods and iterates the feature extraction, kernel estimation, and latent image estimation steps in a coarse-to-fine manner.

To obtain pairs of sharp and blurry images for network training, they generated uniform blur kernels using a Gaussian process and synthesized many blurry images by convolving them with sharp images collected from the ImageNet dataset [3]. However, they reported performance limits for large blurs due to their suboptimal architecture.

Similar to the work of Couzinie-Devy et al. [2], Sun et al. [26] proposed a sequential deblurring approach. First, they generated pairs of blurry and sharp patches with 73 candidate blur kernels. Next, they trained a classification CNN to measure the likelihood of a specific blur kernel for a local patch. A smoothly varying blur kernel is then obtained by optimizing an energy model composed of the CNN likelihoods and smoothness priors. The final latent image estimation is performed with a conventional optimization method [30].

Note that all these methods require an accurate kernel estimation step for restoring the latent sharp image.

In contrast, our proposed model is learned to produce the latent image directly without estimating blur kernels.

In other computer vision tasks, several forms of coarse-to-fine or multi-scale architecture have been applied [8,6,4,23,5]. However, not all multi-scale CNNs are designed to produce optimal results, similarly to [25]. In depth estimation, optical flow estimation, etc., networks usually produce outputs with a smaller resolution than the input image [8,6,5]. These methods have difficulties in handling long-range dependency even if a multi-scale architecture is used.

Therefore, we make a multi-scale architecture that preserves fine-grained detail information as well as long-range dependency from coarser scales. Furthermore, we make sure intermediate level networks help the final stage in an explicit way by training the network with multi-scale losses.

1.2. Kernel-Free Learning for Dynamic Scene Deblurring

Conventionally, it was essential to find the blur kernel before estimating the latent image.

CNN based methods were no exception [25,26]. However, estimating the kernel involves several problems. First, assuming simple kernel convolution cannot model several challenging cases such as occluded regions or depth variations. Second, the kernel estimation process is subtle and sensitive to noise and saturation unless the blur model is carefully designed. Furthermore, incorrectly estimated kernels give rise to artifacts in latent images. Third, finding a spatially varying kernel for every pixel in a dynamic scene requires a huge amount of memory and computation.

Therefore, we adopt kernel-free methods in both blur dataset generation and latent image estimation. In blurry image generation, we approximate the camera imaging process rather than assuming specific motions or designing a complex blur kernel. We capture successive sharp frames and integrate them to simulate the blurring process. The detailed procedure is described in a later section. Note that our dataset is composed of blurry and sharp image pairs only, and that the local kernel information is implicitly embedded in it.
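The frame-integration idea described above can be sketched as follows. This is a hedged illustration rather than the authors' pipeline: the gamma value of 2.2, the simple power-law camera response, and the function names `to_linear`, `to_display` and `synthesize_blur` are assumptions, and the detailed procedure in the paper may differ.

```python
GAMMA = 2.2  # assumed camera response exponent, not taken from the text above


def to_linear(frame, gamma=GAMMA):
    """Undo the (assumed) gamma curve to recover linear sensor intensities in [0, 1]."""
    return [[p ** gamma for p in row] for row in frame]


def to_display(frame, gamma=GAMMA):
    """Re-apply the (assumed) gamma curve to linear intensities."""
    return [[p ** (1.0 / gamma) for p in row] for row in frame]


def synthesize_blur(frames, gamma=GAMMA):
    """Average successive sharp frames in linear space, then re-apply gamma.

    Averaging approximates the integration of light during the shutter time,
    so no blur kernel is ever specified explicitly.
    """
    linear = [to_linear(f, gamma) for f in frames]
    n = len(linear)
    h, w = len(frames[0]), len(frames[0][0])
    mean = [[sum(f[y][x] for f in linear) / n for x in range(w)]
            for y in range(h)]
    return to_display(mean, gamma)


# A bright point moving one pixel per frame across a 1x3 image.
frames = [[[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]], [[0.0, 0.0, 1.0]]]
blurry = synthesize_blur(frames)

# A static scene should come out unchanged.
static = synthesize_blur([[[0.5, 0.25]]] * 4)
```

In the moving-point example the energy of the point is smeared evenly along its path, and the shape of that smear is determined entirely by the captured motion, which is the sense in which the kernel information is implicitly embedded in the data.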

