EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Mingxing Tan (1)   Quoc V. Le (1)

(1) Google Research, Brain Team, Mountain View, CA. Correspondence to: Mingxing Tan. Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. Copyright 2019 by the author(s).

Abstract

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available. In this paper, we systematically study model scaling and identify that carefully balancing network depth, width, and resolution can lead to better performance. Based on this observation, we propose a new scaling method that uniformly scales all dimensions of depth/width/resolution using a simple yet highly effective compound coefficient. We demonstrate the effectiveness of this method on scaling up MobileNets and ResNet. To go even further, we use neural architecture search to design a new baseline network and scale it up to obtain a family of models, called EfficientNets, which achieve much better accuracy and efficiency than previous ConvNets.

In particular, our EfficientNet-B7 achieves state-of-the-art 84.4% top-1 / 97.1% top-5 accuracy on ImageNet, while being 8.4x smaller and 6.1x faster on inference than the best existing ConvNet. Our EfficientNets also transfer well and achieve state-of-the-art accuracy on CIFAR-100 (91.7%), Flowers (98.8%), and 3 other transfer learning datasets, with an order of magnitude fewer parameters.

1. Introduction

Scaling up ConvNets is widely used to achieve better accuracy. For example, ResNet (He et al., 2016) can be scaled up from ResNet-18 to ResNet-200 by using more layers; recently, GPipe (Huang et al., 2018) achieved 84.3% ImageNet top-1 accuracy by scaling up a baseline model four times larger. However, the process of scaling up ConvNets has never been well understood, and there are currently many ways to do it.

[Figure 1: Model Size vs. ImageNet Accuracy. Parameters (millions) vs. ImageNet top-1 accuracy (%) for ResNet, DenseNet, Inception, Inception-ResNet, ResNeXt, Xception, NASNet, AmoebaNet, SENet, GPipe, and EfficientNet-B0 through B7. All numbers are for single-crop, single-model. Our EfficientNets significantly outperform other ConvNets. In particular, EfficientNet-B7 achieves new state-of-the-art 84.4% top-1 accuracy while being 8.4x smaller and 6.1x faster than GPipe. EfficientNet-B1 is 7.6x smaller and 5.7x faster than ResNet-152. Details are in Table 2 and Table 4.]

The most common way is to scale up ConvNets by their depth (He et al., 2016) or width (Zagoruyko & Komodakis, 2016). Another less common, but increasingly popular, method is to scale up models by image resolution (Huang et al., 2018). In previous work, it is common to scale only one of the three dimensions: depth, width, or image size. Though it is possible to scale two or three dimensions arbitrarily, arbitrary scaling requires tedious manual tuning and still often yields sub-optimal accuracy and efficiency.

In this paper, we want to study and rethink the process of scaling up ConvNets. In particular, we investigate the central question: is there a principled method to scale up ConvNets that can achieve better accuracy and efficiency?

Our empirical study shows that it is critical to balance all dimensions of network width/depth/resolution, and surprisingly such balance can be achieved by simply scaling each of them with a constant ratio. Based on this observation, we propose a simple yet effective compound scaling method. Unlike conventional practice that arbitrarily scales these factors, our method uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. For example, if we want to use 2^N times more computational resources, then we can simply increase the network depth by α^N, width by β^N, and image size by γ^N, where α, β, γ are constant coefficients determined by a small grid search on the original small model. Figure 2 illustrates the difference between our scaling method and conventional methods, and the sketch below makes the rule concrete.

[Figure 2: Model Scaling. (a) is a baseline network example; (b)-(d) are conventional scaling methods that only increase one dimension of network width (#channels), depth (layers), or resolution (HxW); (e) is our proposed compound scaling method that uniformly scales all three dimensions with a fixed ratio.]
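As a concrete illustration, here is a minimal Python sketch of the compound scaling rule; it is not the paper's code. The values α = 1.2, β = 1.1, γ = 1.15 are the coefficients the paper reports for its EfficientNet-B0 baseline (found under the constraint α·β²·γ² ≈ 2), while the baseline depth/channels/resolution in the usage example are made-up illustration values.

```python
import math

# Coefficients the paper reports for the EfficientNet-B0 baseline,
# found by grid search under alpha * beta**2 * gamma**2 ~= 2.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(base_depth, base_channels, base_resolution, n):
    """Scale depth by ALPHA**n, width by BETA**n, resolution by GAMMA**n.

    Since FLOPS grow roughly with depth * width**2 * resolution**2, this
    multiplies compute cost by about (ALPHA * BETA**2 * GAMMA**2)**n ~= 2**n.
    """
    depth = math.ceil(base_depth * ALPHA ** n)            # more layers
    channels = math.ceil(base_channels * BETA ** n)       # wider layers
    resolution = math.ceil(base_resolution * GAMMA ** n)  # bigger input image
    return depth, channels, resolution

# Hypothetical baseline stage: 3 layers, 32 channels, 224x224 input.
# n = 1 roughly doubles the compute budget.
print(compound_scale(3, 32, 224, n=1))  # -> (4, 36, 258)
```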

Intuitively, the compound scaling method makes sense because if the input image is bigger, then the network needs more layers to increase the receptive field and more channels to capture more fine-grained patterns on the bigger image. In fact, previous theoretical (Raghu et al., 2017; Lu et al., 2018) and empirical results (Zagoruyko & Komodakis, 2016) both show that there exists a certain relationship between network width and depth, but to our best knowledge, we are the first to empirically quantify the relationship among all three dimensions of network width, depth, and resolution.

We demonstrate that our scaling method works well on existing MobileNets (Howard et al., 2017; Sandler et al., 2018) and ResNet (He et al., 2016). Notably, the effectiveness of model scaling heavily depends on the baseline network; to go even further, we use neural architecture search (Zoph & Le, 2017; Tan et al., 2019) to develop a new baseline network, and scale it up to obtain a family of models, called EfficientNets. Figure 1 summarizes the ImageNet performance, where our EfficientNets significantly outperform other ConvNets.

In particular, our EfficientNet-B7 surpasses the best existing GPipe accuracy (Huang et al., 2018), but using 8.4x fewer parameters and running 6.1x faster on inference. Compared to the widely used ResNet-50 (He et al., 2016), our EfficientNet-B4 improves the top-1 accuracy from 76.3% to 82.6% with similar FLOPS. Besides ImageNet, EfficientNets also transfer well and achieve state-of-the-art accuracy on 5 out of 8 widely used datasets, while reducing parameters by up to 21x compared to existing ConvNets.

2. Related Work

ConvNet Accuracy: Since AlexNet (Krizhevsky et al., 2012) won the 2012 ImageNet competition, ConvNets have become increasingly more accurate by going bigger: while the 2014 ImageNet winner GoogleNet (Szegedy et al., 2015) achieves 74.8% top-1 accuracy with about 6.8M parameters, the 2017 ImageNet winner SENet (Hu et al., 2018) achieves 82.7% top-1 accuracy with 145M parameters. Recently, GPipe (Huang et al., 2018) further pushes the state-of-the-art ImageNet top-1 validation accuracy to 84.3% using 557M parameters: it is so big that it can only be trained with a specialized pipeline parallelism library by partitioning the network and spreading each part to a different accelerator. While these models are mainly designed for ImageNet, recent studies have shown that better ImageNet models also perform better across a variety of transfer learning datasets (Kornblith et al., 2019), and other computer vision tasks such as object detection (He et al., 2016; Tan et al., 2019). Although higher accuracy is critical for many applications, we have already hit the hardware memory limit, and thus further accuracy gain needs better efficiency.

ConvNet Efficiency: Deep ConvNets are often over-parameterized.

Model compression (Han et al., 2016; He et al., 2018; Yang et al., 2018) is a common way to reduce model size by trading accuracy for efficiency. As mobile phones become ubiquitous, it is also common to hand-craft efficient mobile-size ConvNets, such as SqueezeNets (Iandola et al., 2016; Gholami et al., 2018), MobileNets (Howard et al., 2017; Sandler et al., 2018), and ShuffleNets (Zhang et al., 2018; Ma et al., 2018). Recently, neural architecture search has become increasingly popular in designing efficient mobile-size ConvNets (Tan et al., 2019; Cai et al., 2019), and achieves even better efficiency than hand-crafted mobile ConvNets by extensively tuning the network width, depth, and convolution kernel types and sizes. However, it is unclear how to apply these techniques to larger models, which have a much larger design space and much more expensive tuning cost.
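By contrast with full architecture search, the compound coefficients discussed above require only a small grid search on the small baseline. The sketch below is an illustrative reconstruction, not the paper's search code: the constraint α·β²·γ² ≈ 2 (so that FLOPS roughly double per unit of the compound coefficient) follows the paper, while the grid step, tolerance, and the `evaluate` callback (imagined as a short train-and-validate run of the scaled baseline) are assumptions.

```python
import itertools

def flops_multiplier(alpha, beta, gamma):
    # ConvNet FLOPS scale roughly with depth * width**2 * resolution**2,
    # so scaling by (alpha, beta, gamma) multiplies FLOPS by this factor.
    return alpha * beta ** 2 * gamma ** 2

def search_coefficients(evaluate, step=0.05, budget=2.0, tol=0.1):
    """Small grid search over (alpha, beta, gamma) >= 1 whose FLOPS
    multiplier is close to `budget`, keeping the triple that scores
    highest under `evaluate` (a user-supplied scoring function, e.g.
    validation accuracy of the scaled baseline after brief training)."""
    grid = [1.0 + i * step for i in range(11)]  # 1.00, 1.05, ..., 1.50
    best, best_score = None, float("-inf")
    for a, b, g in itertools.product(grid, repeat=3):
        if abs(flops_multiplier(a, b, g) - budget) > tol:
            continue  # violates alpha * beta**2 * gamma**2 ~= 2
        score = evaluate(a, b, g)
        if score > best_score:
            best, best_score = (a, b, g), score
    return best

# Usage (hypothetical): plug in a scoring function that builds the scaled
# baseline, trains it briefly, and returns validation accuracy, e.g.:
#   alpha, beta, gamma = search_coefficients(evaluate=short_training_run)
```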

