Transcription of Alternating Direction Method of Multipliers
Alternating Direction Method of Multipliers
Ryan Tibshirani
Convex Optimization 10-725

Last time: dual ascent

Consider the problem
$$\min_x f(x) \;\; \text{subject to} \;\; Ax = b$$
where $f$ is strictly convex and closed. Denote the Lagrangian
$$L(x, u) = f(x) + u^T(Ax - b)$$
Dual gradient ascent repeats, for $k = 1, 2, 3, \ldots$
$$x^{(k)} = \operatorname*{argmin}_x \, L(x, u^{(k-1)})$$
$$u^{(k)} = u^{(k-1)} + t_k (Ax^{(k)} - b)$$
Good: the $x$ update decomposes when $f$ does. Bad: requires stringent assumptions (strong convexity of $f$) to ensure convergence.

Augmented Lagrangian method

The augmented Lagrangian method modifies the problem, for $\rho > 0$,
$$\min_x f(x) + \frac{\rho}{2}\|Ax - b\|_2^2 \;\; \text{subject to} \;\; Ax = b$$
uses a modified Lagrangian
$$L_\rho(x, u) = f(x) + u^T(Ax - b) + \frac{\rho}{2}\|Ax - b\|_2^2$$
and repeats, for $k = 1, 2, 3, \ldots$
$$x^{(k)} = \operatorname*{argmin}_x \, L_\rho(x, u^{(k-1)})$$
$$u^{(k)} = u^{(k-1)} + \rho(Ax^{(k)} - b)$$
Good: better convergence properties. Bad: we lose decomposability.

Alternating direction method of multipliers

The alternating direction method of multipliers, or ADMM, tries for the best of both methods. Consider a problem of the form:
$$\min_{x,z} f(x) + g(z) \;\; \text{subject to} \;\; Ax + Bz = c$$
We define the augmented Lagrangian, for a parameter $\rho > 0$,
$$L_\rho(x, z, u) = f(x) + g(z) + u^T(Ax + Bz - c) + \frac{\rho}{2}\|Ax + Bz - c\|_2^2$$
We repeat, for $k = 1, 2, 3, \ldots$
$$x^{(k)} = \operatorname*{argmin}_x \, L_\rho(x, z^{(k-1)}, u^{(k-1)})$$
$$z^{(k)} = \operatorname*{argmin}_z \, L_\rho(x^{(k)}, z, u^{(k-1)})$$
$$u^{(k)} = u^{(k-1)} + \rho(Ax^{(k)} + Bz^{(k)} - c)$$
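As a concrete illustration of the alternating updates, here is a minimal runnable sketch on a toy problem of my own choosing (not from the lecture): $f(x) = \frac{1}{2}(x-a)^2$, $g(z) = \frac{1}{2}(z-b)^2$ with constraint $x - z = 0$ (so $A = 1$, $B = -1$, $c = 0$), whose solution is $x = z = (a+b)/2$. Each step minimizes the augmented Lagrangian in closed form.

```python
# Hypothetical toy example (not from the slides): ADMM on
#   min (1/2)(x - a)^2 + (1/2)(z - b)^2  subject to  x - z = 0,
# i.e. A = 1, B = -1, c = 0, with solution x = z = (a + b)/2.

def admm_toy(a, b, rho=1.0, iters=100):
    x = z = u = 0.0
    for _ in range(iters):
        # x-update: argmin_x (1/2)(x-a)^2 + u*(x-z) + (rho/2)(x-z)^2
        x = (a - u + rho * z) / (1 + rho)
        # z-update: argmin_z (1/2)(z-b)^2 + u*(x-z) + (rho/2)(x-z)^2
        z = (b + u + rho * x) / (1 + rho)
        # dual update with step size rho
        u = u + rho * (x - z)
    return x, z, u

x, z, u = admm_toy(a=0.0, b=4.0)
print(x, z)  # both approach (a + b)/2 = 2
```

Note that each minimization touches only one block of variables, which is what preserves decomposability even with the quadratic penalty present.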
ADMM steps are "almost" like repeated soft-thresholding of ridge regression coefficients.

[Figure: comparison of algorithms for lasso regression on 100 random instances with $n = 200$, $p = 50$. Suboptimality $f^{(k)} - f^\star$ versus iteration $k$ for coordinate descent, proximal gradient, accelerated proximal gradient, and ADMM ($\rho = 50$).]
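Since the figure compares solvers on the lasso, a short sketch of ADMM specialized to the lasso shows where the soft-thresholding appears. This is my own illustrative implementation, assuming the standard splitting $f(x) = \frac{1}{2}\|y - Xx\|_2^2$, $g(z) = \lambda\|z\|_1$ with constraint $x - z = 0$; the default $\rho = 50$ simply echoes the figure, and all names are hypothetical.

```python
import numpy as np

def admm_lasso(X, y, lam, rho=50.0, iters=200):
    """Sketch of ADMM for the lasso  min (1/2)||y - Xb||^2 + lam*||b||_1,
    split as f(x) = (1/2)||y - Xx||^2, g(z) = lam*||z||_1, x - z = 0.
    (Illustrative code, not the lecture's implementation.)"""
    n, p = X.shape
    z = np.zeros(p)
    u = np.zeros(p)
    # The x-update matrix is fixed across iterations and could be factored once.
    XtX_rho = X.T @ X + rho * np.eye(p)
    Xty = X.T @ y
    for _ in range(iters):
        # x-update: argmin_x (1/2)||y - Xx||^2 + u^T(x - z) + (rho/2)||x - z||^2
        # -> a ridge-style linear solve
        x = np.linalg.solve(XtX_rho, Xty - u + rho * z)
        # z-update: soft-thresholding of x + u/rho at level lam/rho
        w = x + u / rho
        z = np.sign(w) * np.maximum(np.abs(w) - lam / rho, 0.0)
        # dual update
        u = u + rho * (x - z)
    return z
```

The $x$-update is a ridge regression solve with penalty $\rho$, and the $z$-update soft-thresholds it at level $\lambda/\rho$, which is exactly the "almost repeated soft-thresholding of ridge coefficients" description above.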