
ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks

Xintao Wang1, Ke Yu1, Shixiang Wu2, Jinjin Gu3, Yihao Liu4, Chao Dong2, Yu Qiao2, and Chen Change Loy5

1 CUHK-SenseTime Joint Lab, The Chinese University of Hong Kong
2 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
3 The Chinese University of Hong Kong, Shenzhen
4 University of Chinese Academy of Sciences
5 Nanyang Technological University

Abstract. The Super-Resolution Generative Adversarial Network (SRGAN) is a seminal work that is capable of generating realistic textures during single image super-resolution.





However, the hallucinated details are often accompanied with unpleasant artifacts. To further enhance the visual quality, we thoroughly study three key components of SRGAN: network architecture, adversarial loss, and perceptual loss, and improve each of them to derive an Enhanced SRGAN (ESRGAN). In particular, we introduce the Residual-in-Residual Dense Block (RRDB) without batch normalization as the basic network building unit. Moreover, we borrow the idea from relativistic GAN to let the discriminator predict relative realness instead of the absolute value.

Finally, we improve the perceptual loss by using the features before activation, which could provide stronger supervision for brightness consistency and texture recovery. Benefiting from these improvements, the proposed ESRGAN achieves consistently better visual quality with more realistic and natural textures than SRGAN, and won the first place in the PIRM2018-SR Challenge (region 3) with the best perceptual index. The code is available online.

1 Introduction

Single image super-resolution (SISR), as a fundamental low-level vision problem, has attracted increasing attention in the research community and AI companies.

SISR aims at recovering a high-resolution (HR) image from a single low-resolution (LR) one. Since the pioneering work of SRCNN proposed by Dong et al. [8], deep convolutional neural network (CNN) approaches have brought prosperous development. Various network architecture designs and training strategies have continuously improved the SR performance, especially the Peak Signal-to-Noise Ratio (PSNR) value [21,24,22,25,36,37,13,46,45]. However, these PSNR-oriented approaches tend to output over-smoothed results without sufficient high-frequency details, since the PSNR metric fundamentally disagrees with the subjective evaluation of human observers [25].

Fig. 1: The super-resolution results of x4 for SRGAN, the proposed ESRGAN and the ground-truth. ESRGAN outperforms SRGAN in sharpness and details.

Several perceptual-driven methods have been proposed to improve the visual quality of SR results. For instance, perceptual loss [19,7] is proposed to optimize super-resolution models in a feature space instead of pixel space. Generative adversarial networks [11] are introduced to SR by [25,33] to encourage the network to favor solutions that look more like natural images.

The semantic image prior is further incorporated to improve recovered texture details [40]. One of the milestones on the way to pursuing visually pleasing results is SRGAN [25]. The basic model is built with residual blocks [15] and optimized using perceptual loss in a GAN framework. With all these techniques, SRGAN significantly improves the overall visual quality of reconstruction over PSNR-oriented methods. However, there still exists a clear gap between SRGAN results and the ground-truth (GT) images, as shown in Fig. 1. In this study, we revisit the key components of SRGAN and improve the model in three aspects.

First, we improve the network structure by introducing the Residual-in-Residual Dense Block (RRDB), which is of higher capacity and easier to train. We also remove Batch Normalization (BN) [18] layers as in [26] and use residual scaling [35,26] and smaller initialization to facilitate training a very deep network. Second, we improve the discriminator using Relativistic average GAN (RaGAN) [20], which learns to judge whether one image is more realistic than another, rather than whether an image is real or fake. Our experiments show that this improvement helps the generator recover more realistic texture details.
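The relativistic average discriminator replaces the standard "real vs. fake" output with the probability that a real image is relatively more realistic than the average generated image, D_Ra(x_r, x_f) = sigmoid(C(x_r) - E[C(x_f)]), where C is the raw discriminator logit. A minimal sketch of this output (the function name and scalar-logit interface are illustrative, not from the paper's code):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relativistic_avg_output(c_real, c_fake_batch):
    """RaGAN-style discriminator output for one real image:
    D_Ra = sigmoid(C(x_r) - mean over the batch of C(x_f)).
    c_real: raw logit for a real image; c_fake_batch: logits for fakes."""
    mean_fake = sum(c_fake_batch) / len(c_fake_batch)
    return sigmoid(c_real - mean_fake)

# If the real logit equals the average fake logit, the output is exactly 0.5:
print(relativistic_avg_output(2.0, [1.0, 3.0]))  # 0.5
```

The symmetric term for fake images, sigmoid(C(x_f) - E[C(x_r)]), is used in the same way, so the loss pushes real and generated logits apart relative to each other rather than toward fixed targets.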

Third, we propose an improved perceptual loss by using the VGG features before activation instead of after activation as in SRGAN. We empirically find that the adjusted perceptual loss provides sharper edges and more visually pleasing results, as will be shown later.

Fig. 2: Perception-distortion plane on the PIRM self-validation dataset. We show the baselines of EDSR [26], RCAN [45] and EnhanceNet [33], and the submitted ESRGAN model. The blue dots are produced by image interpolation.

Extensive experiments show that the Enhanced SRGAN, termed ESRGAN, consistently outperforms state-of-the-art methods in both sharpness and details (see Fig. 1).

We take a variant of ESRGAN to participate in the PIRM-SR Challenge [5]. This challenge is the first SR competition that evaluates the performance in a perceptual-quality-aware manner based on [6]. The perceptual quality is judged by the no-reference measures of Ma's score [27] and NIQE [30], i.e., perceptual index = 1/2 ((10 - Ma) + NIQE). A lower perceptual index represents better perceptual quality. As shown in Fig. 2, the perception-distortion plane is divided into three regions defined by thresholds on the Root-Mean-Square Error (RMSE), and the algorithm that achieves the lowest perceptual index in each region becomes the regional champion.
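The perceptual index above is a simple combination of the two no-reference scores; a minimal sketch, assuming the Ma score (higher is better) and NIQE value (lower is better) have already been computed elsewhere:

```python
def perceptual_index(ma_score: float, niqe: float) -> float:
    """PIRM-SR perceptual index: 1/2 * ((10 - Ma) + NIQE). Lower is better."""
    return 0.5 * ((10.0 - ma_score) + niqe)

# A hypothetical result with Ma = 8.5 and NIQE = 3.0:
print(perceptual_index(8.5, 3.0))  # 2.25
```

Because Ma's score is inverted by the (10 - Ma) term, both summands are "lower is better", which is why the challenge ranks algorithms by the minimum index within each RMSE region.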

We mainly focus on region 3 as we aim to bring the perceptual quality to a new high. Thanks to the aforementioned improvements and some other adjustments discussed later, our proposed ESRGAN won the first place in the PIRM-SR Challenge (region 3) with the best perceptual index. In order to balance the visual quality and RMSE/PSNR, we further propose the network interpolation strategy, which can continuously adjust the reconstruction style and smoothness. An alternative is image interpolation, which directly interpolates images pixel by pixel.
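Network interpolation linearly blends the parameters of a PSNR-oriented model and a GAN-trained model, parameter by parameter, so one blending factor trades distortion against perceptual quality. A minimal sketch using plain dicts of scalar "weights" (real use would blend whole tensors in the two models' state dicts; the names below are illustrative):

```python
def interpolate_networks(psnr_params, gan_params, alpha):
    """Per-parameter blend: theta = (1 - alpha) * theta_PSNR + alpha * theta_GAN.
    alpha = 0 recovers the PSNR-oriented model; alpha = 1 the GAN model."""
    return {name: (1.0 - alpha) * psnr_params[name] + alpha * gan_params[name]
            for name in psnr_params}

psnr = {"conv1.weight": 1.0}
gan = {"conv1.weight": 3.0}
print(interpolate_networks(psnr, gan, 0.5))  # {'conv1.weight': 2.0}
```

Unlike image interpolation, which averages the two models' output pixels, this averages the weights once and then runs a single forward pass, and sweeping alpha traces a continuous path between smooth and textured reconstructions.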

