
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks


Mingxing Tan and Quoc V. Le
Google Research, Brain Team, Mountain View, CA
Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019

Abstract

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient.

We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet. To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets. In particular, our EfficientNet-B7 achieves state-of-the-art 84.3% top-1 accuracy on ImageNet, while being 8.4x smaller and 6.1x faster on inference than the best existing ConvNet. Our EfficientNets also transfer well and achieve state-of-the-art accuracy on CIFAR-100 (91.7%), Flowers (98.8%), and 3 other transfer learning datasets, with an order of magnitude fewer parameters.

Source code is available online.

1. Introduction

Scaling up ConvNets is widely used to achieve better accuracy. For example, ResNet (He et al., 2016) can be scaled up from ResNet-18 to ResNet-200 by using more layers; recently, GPipe (Huang et al., 2018) achieved 84.3% ImageNet top-1 accuracy by scaling up a baseline model four times larger. However, the process of scaling up ConvNets has never been well understood and there are currently many ways to do it.

Figure 1. Model Size vs. ImageNet Accuracy (number of parameters in millions vs. ImageNet top-1 accuracy in %, for ResNet-34/50/152, DenseNet-201, Inception-v2, Inception-ResNet-v2, NASNet-A, ResNeXt-101, Xception, AmoebaNet-A/C, SENet, and EfficientNet-B0 through B7). All numbers are for single-crop, single-model. Our EfficientNets significantly outperform other ConvNets. In particular, EfficientNet-B7 achieves new state-of-the-art 84.3% top-1 accuracy while being 8.4x smaller and 6.1x faster than GPipe. EfficientNet-B1 is 7.6x smaller and 5.7x faster than ResNet-152. Details are in Table 2 and 4.

The most common way is to scale up ConvNets by their depth (He et al., 2016) or width (Zagoruyko & Komodakis, 2016). Another less common, but increasingly popular, method is to scale up models by image resolution (Huang et al., 2018). In previous work, it is common to scale only one of the three dimensions: depth, width, or image size. Though it is possible to scale two or three dimensions arbitrarily, arbitrary scaling requires tedious manual tuning and still often yields sub-optimal accuracy and efficiency.

In this paper, we want to study and rethink the process of scaling up ConvNets. In particular, we investigate the central question: is there a principled method to scale up ConvNets that can achieve better accuracy and efficiency?

Our empirical study shows that it is critical to balance all dimensions of network width/depth/resolution, and surprisingly such balance can be achieved by simply scaling each of them with a constant ratio. Based on this observation, we propose a simple yet effective compound scaling method. Unlike conventional practice that arbitrarily scales these factors, our method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use 2^N times more computational resources, then we can simply increase the network depth by α^N, width by β^N, and image size by γ^N, where α, β, γ are constant coefficients determined by a small grid search on the original small model. Figure 2 illustrates the difference between our scaling method and conventional methods.

Figure 2. Model Scaling. (a) is a baseline network example; (b)-(d) are conventional scaling methods that only increase one dimension of network width, depth, or resolution; (e) is our proposed compound scaling method that uniformly scales all three dimensions with a fixed ratio.

Intuitively, the compound scaling method makes sense because if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image.
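To make the arithmetic concrete, here is a minimal Python sketch of the compound scaling rule described above. The constants α = 1.2, β = 1.1, γ = 1.15 are the grid-searched values the paper reports for EfficientNet-B0; the function name, the base resolution of 224, and the rounding of the image size are illustrative assumptions, not the official implementation.

```python
# Compound scaling sketch: depth, width, and resolution each grow by a
# constant ratio as the compound coefficient phi increases.
# alpha=1.2, beta=1.1, gamma=1.15 are the grid-searched constants the
# paper reports for EfficientNet-B0; everything else is an assumption.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # chosen so alpha * beta**2 * gamma**2 ~ 2

def compound_scale(phi: int, base_resolution: int = 224):
    """Return (depth_mult, width_mult, input_resolution) for coefficient phi."""
    depth_mult = ALPHA ** phi                            # multiply layer count
    width_mult = BETA ** phi                             # multiply channel count
    resolution = round(base_resolution * GAMMA ** phi)   # enlarge input images
    return depth_mult, width_mult, resolution

if __name__ == "__main__":
    for phi in range(8):
        d, w, r = compound_scale(phi)
        # ConvNet FLOPS grow roughly with depth * width**2 * resolution**2,
        # so each increment of phi costs about (alpha * beta**2 * gamma**2) ~ 2x.
        flops = (ALPHA * BETA**2 * GAMMA**2) ** phi
        print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
              f"input {r}px, ~{flops:.1f}x FLOPS")
```

Because ConvNet FLOPS scale roughly with d·w²·r², the constraint α·β²·γ² ≈ 2 means each increment of the compound coefficient roughly doubles total FLOPS, matching the "2^N times more resources" budget above.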

In fact, previous theoretical (Raghu et al., 2017; Lu et al., 2018) and empirical results (Zagoruyko & Komodakis, 2016) both show that there exists a certain relationship between network width and depth, but to our best knowledge, we are the first to empirically quantify the relationship among all three dimensions of network width, depth, and resolution.

We demonstrate that our scaling method works well on existing MobileNets (Howard et al., 2017; Sandler et al., 2018) and ResNet (He et al., 2016). Notably, the effectiveness of model scaling heavily depends on the baseline network.
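For intuition on how such multipliers are applied to an existing network, here is a minimal sketch in the style of common mobile-ConvNet codebases. The helper names (round_filters, round_repeats) and the round-to-a-multiple-of-8 rule are illustrative assumptions modeled on typical public implementations, not the paper's own code; the example coefficients (width 1.4, depth 1.8) are the pair public EfficientNet implementations use for B4.

```python
# Sketch: applying width/depth multipliers to a baseline ConvNet stage.
# Helper names and the divisor-of-8 rounding are assumptions modeled on
# common mobile-ConvNet codebases, not the paper's official code.
import math

def round_filters(filters: int, width_mult: float, divisor: int = 8) -> int:
    """Scale a channel count by width_mult, rounding to a hardware-friendly multiple."""
    filters *= width_mult
    new_filters = max(divisor, int(filters + divisor / 2) // divisor * divisor)
    if new_filters < 0.9 * filters:  # avoid rounding down by more than 10%
        new_filters += divisor
    return int(new_filters)

def round_repeats(repeats: int, depth_mult: float) -> int:
    """Scale the number of times a block is repeated by depth_mult."""
    return int(math.ceil(depth_mult * repeats))

# Example: a baseline stage of 3 blocks with 40 output channels, scaled
# with B4-style coefficients (width 1.4, depth 1.8).
print(round_filters(40, 1.4))   # -> 56 channels
print(round_repeats(3, 1.8))    # -> 6 blocks
```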

To go even further, we use neural architecture search (Zoph & Le, 2017; Tan et al., 2019) to develop a new baseline network, and scale it up to obtain a family of models, called EfficientNets. Figure 1 summarizes the ImageNet performance, where our EfficientNets significantly outperform other ConvNets. In particular, our EfficientNet-B7 surpasses the best existing GPipe accuracy (Huang et al., 2018), but using 8.4x fewer parameters and running 6.1x faster on inference. Compared to the widely used ResNet-50 (He et al., 2016), our EfficientNet-B4 improves the top-1 accuracy from 76.3% to 82.6% (+6.3%) with similar FLOPS. Besides ImageNet, EfficientNets also transfer well and achieve state-of-the-art accuracy on 5 out of 8 widely used datasets, while reducing parameters by up to 21x compared to existing ConvNets.

2. Related Work

ConvNet Accuracy: Since AlexNet (Krizhevsky et al., 2012) won the 2012 ImageNet competition, ConvNets have become increasingly more accurate by going bigger.

While the 2014 ImageNet winner GoogleNet (Szegedy et al., 2015) achieves 74.8% top-1 accuracy with about 6.8M parameters, the 2017 ImageNet winner SENet (Hu et al., 2018) achieves 82.7% top-1 accuracy with 145M parameters. Recently, GPipe (Huang et al., 2018) further pushed the state-of-the-art ImageNet top-1 validation accuracy to 84.3% using 557M parameters: it is so big that it can only be trained with a specialized pipeline parallelism library by partitioning the network and spreading each part to a different accelerator.

