Nesterov
Found 4 free book(s)
Convex Optimization - Boyd & Vandenberghe 1. Introduction
web.stanford.edu
convex optimization (Nesterov & Nemirovski 1994) applications • before 1990: mostly in operations research; few in engineering • since 1990: many new applications in engineering (control, signal processing, communications, circuit design, ...); new problem classes (semidefinite and second-order cone programming, robust optimization)
Bag of Tricks for Image Classification with Convolutional ...
openaccess.thecvf.com
Nesterov Accelerated Gradient (NAG) descent [20] is used for training. Each model is trained for 120 epochs on 8 Nvidia V100 GPUs with a total batch size of 256. The learning rate is initialized to 0.1 and divided by 10 at the 30th, 60th, and 90th epochs. 2.2. Experiment Results We evaluate three CNNs: ResNet-50 [9], Inception-V3 [1], and MobileNet ...
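The schedule and optimizer named in this snippet can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's training code: the function names `step_lr`, `nag_step`, and `demo` are hypothetical, epochs are assumed to be 0-indexed, and the momentum value 0.9 is an assumption because the snippet does not state it.

```python
def step_lr(epoch, base_lr=0.1, milestones=(30, 60, 90), factor=0.1):
    """Step schedule: start at base_lr and multiply by factor at each milestone.

    Assumes 0-indexed epochs; the snippet does not say which convention is used.
    """
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops


def nag_step(w, v, grad_fn, lr, momentum=0.9):
    """One Nesterov accelerated gradient step on a scalar parameter.

    The gradient is evaluated at the look-ahead point w + momentum * v.
    momentum=0.9 is an assumed value, not taken from the snippet.
    """
    lookahead = w + momentum * v
    v_new = momentum * v - lr * grad_fn(lookahead)
    return w + v_new, v_new


def demo(steps=200):
    """Minimize the toy objective f(w) = 0.5 * w**2, whose gradient is w."""
    w, v = 1.0, 0.0
    for _ in range(steps):
        w, v = nag_step(w, v, grad_fn=lambda x: x, lr=0.1)
    return w
```

Calling `step_lr(epoch)` for epochs 0 through 119 reproduces the four learning-rate plateaus described above, and `demo()` drives the toy parameter toward zero.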
Non-convex optimization - University of British Columbia
www.cs.ubc.ca
Cubic regularization [Nesterov 2006]: gradient Lipschitz continuous, Hessian Lipschitz continuous. Local non-convex optimization: random stochastic gradient descent; sample noise r uniformly from unit sphere; escapes saddle points, but step size is difficult to determine
Written and edited by Liu Haoyang, Hu Jiang, Li Yongfeng, and Wen Zaiwen - pku.edu.cn
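bicmr.pku.edu.cn
... proximal gradient methods, Nesterov acceleration, the proximal point algorithm, block coordinate descent, dual algorithms, the alternating direction method of multipliers, and stochastic optimization algorithms. The book's main concepts are explained with detailed examples, and each major optimization algorithm is presented in three parts: algorithm description, application examples, and convergence analysis. In the algorithm descriptions, the book focuses on the algorithms ...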