Transcription of Convex Optimization Overview (cnt’d)
1 Convex Optimization Overview (cnt d)Chuong B. DoNovember 29, 2009 During last week s section, we began our study ofconvex Optimization , the study ofmathematical Optimization problems of the form,minimizex Rnf(x)subject tox C.(1)In a Convex Optimization problem,x Rnis a vector known as theoptimization variable,f:Rn Ris aconvex functionthat we want to minimize, andC Rnis aconvex setdescribing the set of feasible solutions. From a computational perspective, Convex optimiza-tion problems are interesting in the sense that any locally optimal solution will always beguaranteed to be globally optimal.
2 Over the last several decades, general purpose methodsfor solving Convex Optimization problems have become increasingly reliable and these lecture notes, we continue our foray into the field of Convex Optimization . Inparticular, we explore a powerful concept in Convex Optimization theory known asLagrangeduality. We focus on the main intuitions and mechanics of Lagrange duality; in particular,we describe the concept of the Lagrangian, its relation to primal and dual problems, andthe role of the Karush-Kuhn-Tucker (KKT) conditions in providing necessary and sufficientconditions for optimality of a Convex Optimization Lagrange dualityGenerally speaking, the theory of Lagrange duality is the study of optimal solutions to convexoptimization problems.
3 As we saw previously in lecture, when minimizing a differentiableconvex functionf(x) with respect tox Rn, a necessary and sufficient condition forx Rnto be globally optimal is that xf(x ) =0. In the more general setting of convexoptimization problem with constraints, however, this simple optimality condition does notwork. One primary goal of duality theory is to characterize the optimal points of convexprograms in a mathematically rigorous these notes, we provide a brief introduction to Lagrange duality and its applications1to generic differentiable Convex Optimization problems of the form,minimizex Rnf(x)subject togi(x) 0, i= 1.
4 , m,hi(x) = 0, i= 1, .. , p,(OPT)wherex Rnis theoptimization variable,f:Rn Randgi:Rn Raredifferen-tiable Convex functions1, andhi:Rn Rareaffine The LagrangianIn this section, we introduce an artificial-looking construct called the Lagrangian whichis the basis of Lagrange duality theory. Given a Convex constrained minimization problemof the form (OPT), the (generalized)Lagrangianis a functionL:Rn Rm Rp R,defined asL(x, , ) =f(x) +m i=1 igi(x) +p i=1 ihi(x).(2)Here, the first argument of the Lagrangian is a vectorx Rn, whose dimensionality matchesthat of the Optimization variable in the original Optimization problem; by convention, we refertoxas theprimal variablesof the Lagrangian.
5 The second argument of the Lagrangianis a vector Rmwith one variable ifor each of themconvex inequality constraints inthe original Optimization problem. The third argument of the Lagrangian is a vector Rp,with one variable ifor each of thepaffine equality constraints in the original optimizationproblem. These elements of and are collectively known as thedual variablesof theLagrangian orLagrange , the Lagrangian can be thought of as a modified version of the objectivefunction to the original Convex Optimization problem (OPT) which accounts for each of theconstraints.
6 The Lagrange multipliers iand ican be thought of costs associated withviolating different constraints. The key intuition behind the theory of Lagrange duality isthe following:For any Convex Optimization problem, there always exist settings of the dual vari-ables such that the unconstrained minimum of the Lagrangian with respect to theprimal variables (keeping the dual variables fixed) coincides with the solution ofthe original constrained minimization formalize this intuition when we describe the KKT conditions in Section that a functionf.
7 S Ris Convex ifSis a Convex set, and for anyx, y Sand [0,1], wehavef( x+ (1 )y) f(x) + (1 )f(y). A functionfis concave if fis that an affine function is a function of the formf(x) =aTx+bfor somea Rn, b R. Since theHessian of an affine function is equal to the zero matrix ( , it is both positive semidefinite and negativesemidefinite), an affine function is both Convex and Primal and dual problemsTo show the relationship between the Lagrangian and the original Convex Optimization prob-lem (OPT), we introduce the notions of the primal and dual problems associated with aLagrangian:The primal problemConsider the Optimization problem,minx[max , : i 0, iL(x, , )] call this P(x)= minx P(x).
8 (P)In the equation above, the function P:Rn Ris called theprimalobjective, and the unconstrained minimization problem on the righthand side is known as theprimal problem. Generally, we say thata pointx Rnisprimal feasibleifgi(x) 0, i= 1, .. , mandhi(x) = 0, i= 1, .. , p. We typically use the vectorx Rnto denotethe solution of (P), and we letp = P(x ) denote the optimal valueof the primal dual problemBy switching the order of the minimization and maximization above,we obtain an entirelydifferentoptimization problem,max , : i 0, i[minxL(x, , )] call this D( , )= max , : i 0, i D( , ).
9 (D)Here, the function D:Rm Rp Ris called thedual objective,and the constrained maximization problem on the right hand side isknown as thedual problem. Generally, we say that ( , ) aredualfeasibleif i 0, i= 1, .. , m. We typically use the pair of vectors( , ) Rm Rpto denote the solution of (D), and we letd = D( , ) denote the optimal value of the dual Interpreting the primal problemFirst, observe that the primal objective, P(x), is a Convex function interpret theprimal problem, note that P(x) = max , : i 0, iL(x, , )(4)= max , : i 0, i[f(x) +m i=1 igi(x) +p i=1 ihi(x)](5)=f(x) + max , : i 0, i[m i=1 igi(x) +p i=1 ihi(x)](6)which follows from the fact thatf(x) does not depend on or.
10 Considering only thebracketed term, notice that If anygi(x)>0, then maximizing the bracketed expression involves making the cor-responding ian arbitrarily large positive number; however, ifgi(x) 0, then therequirement that ibe nonnegative means that the optimal setting of ito achievethe maximum is i= 0, so that the maximum value is 0. Similarly, if anyhi(x)6= 0, then maximizing the bracketed expression involves choosingthe corresponding ito have the same sign ashi(x) and arbitrarily large magnitude;however, ifhi(x) = 0, then the maximum value is 0, independent of these two cases together, we see that ifxis primal feasible ( ,gi(x) 0, i=1.)