arXiv:1707.02921v1 [cs.CV] 10 Jul 2017

Enhanced Deep Residual Networks for Single Image Super-Resolution

Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, Kyoung Mu Lee
Department of ECE, ASRI, Seoul National University, 08826, Seoul, Korea

Abstract

Recent research on super-resolution has progressed with the development of deep convolutional neural networks (DCNN). In particular, residual learning techniques exhibit improved performance. In this paper, we develop an enhanced deep super-resolution network (EDSR) with performance exceeding that of current state-of-the-art SR methods. The significant performance improvement of our model is due to optimization by removing unnecessary modules in conventional residual networks. The performance is further improved by expanding the model size while we stabilize the training procedure. We also propose a new multi-scale deep super-resolution system (MDSR) and training method, which can reconstruct high-resolution images of different upscaling factors in a single model.

The proposed methods show superior performance over the state-of-the-art methods on benchmark datasets and prove their excellence by winning the NTIRE2017 Super-Resolution Challenge [26].

1. Introduction

The image super-resolution (SR) problem, particularly single image super-resolution (SISR), has gained increasing research attention for decades. SISR aims to reconstruct a high-resolution image I^SR from a single low-resolution image I^LR. Generally, the relationship between I^LR and the original high-resolution image I^HR can vary depending on the situation. Many studies assume that I^LR is a bicubic downsampled version of I^HR, but other degrading factors such as blur, decimation, or noise can also be considered for practical applications.

Recently, deep neural networks [11, 12, 14] provide significantly improved performance in terms of peak signal-to-noise ratio (PSNR) in the SR problem.

However, such networks exhibit limitations in terms of architecture optimality. First, the reconstruction performance of the neural network models is sensitive to minor architectural changes. Also, the same model achieves different levels of performance with different initialization and training techniques. Thus, carefully designed model architecture and sophisticated optimization methods are essential in training the neural networks.

[Figure 1: ×4 super-resolution result of our single-scale SR method (EDSR) compared with existing algorithms. Panels: image 0853 from DIV2K [26], HR, Bicubic, VDSR [11], SRResNet [14], EDSR+ (Ours), each labeled with PSNR / SSIM.]

Second, most existing SR algorithms treat super-resolution of different scale factors as independent problems without considering and utilizing mutual relationships among different scales in SR.

As such, those algorithms require many scale-specific networks that need to be trained independently to deal with various scales. Exceptionally, VDSR [11] can handle super-resolution of several scales jointly in a single network. Training the VDSR model with multiple scales boosts the performance substantially and outperforms scale-specific training, implying redundancy among scale-specific models. Nonetheless, the VDSR-style architecture requires a bicubic interpolated image as the input, which leads to heavier computation time and memory compared to architectures with a scale-specific upsampling method [5, 22, 14].

While SRResNet [14] successfully solved those time and memory issues with good performance, it simply employs the ResNet architecture from He et al. [9] without much modification. However, the original ResNet was proposed to solve higher-level computer vision problems such as image classification and detection. Therefore, applying the ResNet architecture directly to low-level vision problems like super-resolution can be suboptimal.

To solve these problems, based on the SRResNet architecture, we first optimize it by analyzing and removing unnecessary modules to simplify the network architecture. Training a network becomes nontrivial when the model is complex. Thus, we train the network with an appropriate loss function and careful model modification upon training. We experimentally show that the modified scheme produces better results.

Second, we investigate a model training method that transfers knowledge from a model trained at other scales. To utilize scale-independent information during training, we train high-scale models from pre-trained low-scale models.
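As a rough illustration of training a high-scale model from a pre-trained low-scale one, the sketch below copies parameters between two toy models stored as plain dictionaries. The layer names, the sizes, and the rule of copying only parameters whose name and size match are assumptions made for this example; the text above does not specify which parameters are transferred.

```python
import random


def new_model(scale, width=4, seed=0):
    """Build a toy parameter dictionary; all layer names are hypothetical."""
    rng = random.Random(seed)

    def rand(n):
        return [rng.uniform(-0.1, 0.1) for _ in range(n)]

    return {
        "head.weight": rand(width),
        "body.0.weight": rand(width),
        "body.1.weight": rand(width),
        # Assumption: only the upsampler's size depends on the scale.
        "upsample.weight": rand(width * scale * scale),
    }


def init_from_pretrained(target, pretrained):
    """Copy every parameter whose name and size match; return copied names."""
    copied = []
    for name, values in pretrained.items():
        if name in target and len(target[name]) == len(values):
            target[name] = list(values)
            copied.append(name)
    return copied


x2_model = new_model(scale=2, seed=1)  # stands in for a trained x2 network
x4_model = new_model(scale=4, seed=2)  # freshly initialized x4 network
copied = init_from_pretrained(x4_model, x2_model)
print(copied)  # scale-independent layers; the upsampler keeps its own init
```

In this toy setup the ×4 model would then continue training from the copied weights rather than from a random start.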

Furthermore, we propose a new multi-scale architecture that shares most of the parameters across different scales. The proposed multi-scale model uses significantly fewer parameters compared with multiple single-scale models but shows comparable performance.

We evaluate our models on the standard benchmark datasets and on a newly provided DIV2K dataset. The proposed single- and multi-scale super-resolution networks show state-of-the-art performance on all datasets in terms of PSNR and SSIM. Our methods ranked first and second, respectively, in the NTIRE 2017 Super-Resolution Challenge [26].

2. Related Works

To solve the super-resolution problem, early approaches use interpolation techniques based on sampling theory [1, 15, 34]. However, those methods exhibit limitations in predicting detailed, realistic textures.

Previous studies [25, 23] adopted natural image statistics to the problem to reconstruct better high-resolution images.

Advanced works aim to learn mapping functions between I^LR and I^HR image pairs. Those learning methods rely on techniques ranging from neighbor embedding [3, 2, 7, 21] to sparse coding [31, 32, 27, 33]. Yang et al. [30] introduced another approach that clusters the patch spaces and learns the corresponding functions. Some approaches utilize image self-similarities to avoid using external databases [8, 6, 29], and increase the size of the limited internal dictionary by geometric transformation of patches [10].

Recently, the powerful capability of deep neural networks has led to dramatic improvements in SR. Since Dong et al. [4, 5] first proposed a deep learning-based SR method, various CNN architectures have been studied for SR.

Kim et al. [11, 12] first introduced the residual network for training much deeper network architectures and achieved superior performance. In particular, they showed that skip connections and recursive convolution alleviate the burden of carrying identity information in the super-resolution network. Similarly to [20], Mao et al. [16] tackled the general image restoration problem with encoder-decoder networks and symmetric skip connections. In [16], they argue that those nested skip connections provide fast and improved convergence.

In many deep learning based super-resolution algorithms, an input image is upsampled via bicubic interpolation before it is fed into the network [4, 11, 12]. Rather than using an interpolated image as an input, training upsampling modules at the very end of the network is also possible, as shown in [5, 22, 14].
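The practical difference between the two designs is where the convolutions run: on the interpolated, high-resolution grid or on the low-resolution grid. The following minimal sketch counts multiply-accumulates for a single convolution layer in each case. The patch size, channel count, and scale are assumed values chosen only for illustration, and the cost of the upsampling module itself is ignored.

```python
def conv_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulates for one stride-1, same-padded convolution layer."""
    return height * width * c_in * c_out * kernel * kernel


# Assumed sizes, chosen only for illustration.
lr_h, lr_w, scale, channels = 48, 48, 4, 64

# Input upsampled first: the layer runs on the HR-sized grid.
pre = conv_macs(lr_h * scale, lr_w * scale, channels, channels)
# Upsampling at the end: the same layer runs on the low-resolution grid.
post = conv_macs(lr_h, lr_w, channels, channels)

ratio = pre // post
print(pre, post, ratio)
```

Under these assumptions the per-layer cost differs by the square of the scale factor, since both spatial dimensions shrink by that factor.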

With upsampling placed at the end of the network, one can reduce much of the computation without losing model capacity because the size of the features decreases. However, those kinds of approaches have one disadvantage: they cannot deal with the multi-scale problem in a single framework as in VDSR [11]. In this work, we resolve the dilemma of multi-scale training and computational efficiency. We not only exploit the inter-relation of learned features for each scale but also propose a new multi-scale model that efficiently reconstructs high-resolution images for various scales. Furthermore, we develop an appropriate training method that uses multiple scales for both single- and multi-scale models.

Several studies also have focused on the loss functions to better train network models. Mean squared error (MSE) or L2 loss is the most widely used loss function for general image restoration and is also the major performance measure (PSNR) for those problems.
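The link between the L2 loss and PSNR can be made concrete with a small sketch. It uses the standard definition PSNR = 10 log10(MAX^2 / MSE), so minimizing MSE directly maximizes PSNR. An L1 (mean absolute error) loss is included for comparison. The pixel values and the 8-bit peak of 255 are assumptions for this example, and the function names are illustrative.

```python
import math


def l2_loss(pred, target):
    """Mean squared error over two equal-length pixel sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1_loss(pred, target):
    """Mean absolute error over two equal-length pixel sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=255.0):
    """PSNR in dB, computed from the MSE; infinite for identical inputs."""
    mse = l2_loss(pred, target)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up pixel values, for illustration only.
target = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 30.0, 44.0]
print(l2_loss(pred, target), l1_loss(pred, target), psnr(pred, target))
```

Note that the squared term weights the single 4-level error more heavily than the two 2-level errors combined, whereas the L1 loss weights errors in proportion to their size.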

However, Zhao et al. [35] reported that training with L2 loss does not guarantee better performance compared to other loss functions in terms of PSNR and SSIM. In their experiments, a network trained with L1 achieved improved performance compared with the network trained with L2.

3. Proposed Methods

In this section, we describe the proposed model architectures. We first analyze a recently published super-resolution network and suggest an enhanced version of the residual network architecture with a simpler structure. We show that our network outperforms the original ones while exhibiting improved computational efficiency. In the following sections, we suggest a single-scale architecture (EDSR) that handles a specific super-resolution scale and a multi-scale architecture (MDSR) that reconstructs various scales of high-resolution images in a single model.

3.1. Residual blocks

Recently, residual networks [11, 9, 14] exhibit excellent performance in computer vision problems from low-level to high-level tasks.
