Transcription of Lecture 2 Piecewise-linear optimization
1 L. VandenbergheEE236A (Fall 2013-14) Lecture 2 Piecewise-linear optimization Piecewise-linear minimization 1- and -norm approximation examples modeling software2 1 Linear and affine functionslinear function:a functionf:Rn Ris linear iff( x+ y) = f(x) + f(y) x, y Rn, , Rproperty:fis linear if and only iff(x) =aTxfor someaaffine function:a functionf:Rn Ris affine iff( x+ (1 )y) = f(x) + (1 )f(y) x, y Rn, Rproperty:fis affine if and only iff(x) =aTx+bfor somea,bPiecewise-linear optimization2 2 Piecewise-linear functionf:Rn Ris (convex)piecewise-linearif it can be expressed asf(x) = maxi=1.
2 ,m(aTix+bi)fis parameterized bym n-vectorsaiandmscalarsbixaTix+bif(x)(the termpiecewise-affineis more accurate but less common) Piecewise-linear optimization2 3 Piecewise-linear minimizationminimizef(x) = maxi=1,..,m(aTix+bi) equivalent LP(with variablesxand auxiliary scalar variablet)minimizetsubject toaTix+bi t, i= 1, .. , mto see equivalence, note that for fixedxthe optimaltist=f(x) LP in matrix notation:minimize cT xsubject to A x bwith x= xt , c= 01 , A= aT1 1 , b= bm Piecewise-linear optimization2 4 Minimizing a sum of Piecewise-linear functionsminimizef(x) +g(x) = maxi=1.
3 ,m(aTix+bi) + maxi=1,..,p(cTix+di) cost function is Piecewise-linear :maximum ofmpaffine functionsf(x) +g(x) = maxi=1,..,mj=1,..,p (ai+cj)Tx+ (bi+dj) equivalent LPwithm+pinequalitiesminimizet1+t2subjec t toaTix+bi t1, i= 1, .. , mcTix+di t2, i= 1, .. , pnote that for fixedx, optimalt1,t2aret1=f(x),t2=g(x)Piecewise- linear optimization2 5 equivalent LP in matrix notationminimize cT xsubject to A x bwith x= xt1t2 , c= 011 , A= aT1 1 1 0cT10 1 , b= bm dp Piecewise-linear optimization2 6 -Norm (Cheybshev) approximationminimizekAx bk withA Rm n,b Rm -norm (Chebyshev norm)ofm-vectoryiskyk = maxi=1.
4 ,m|yi|= maxi=1,..,mmax{yi, yi} equivalent LP(with variablesxand auxiliary scalar variablet)minimizetsubject to t1 Ax b t1(for fixedx, optimaltist=kAx bk ) Piecewise-linear optimization2 7 equivalent LP in matrix notationminimize 01 T xt subject to A 1 A 1 xt b b Piecewise-linear optimization2 8 1-Norm approximationminimizekAx bk1 1-normofm-vectoryiskyk1=mXi=1|yi|=mXi=1m ax{yi, yi} equivalent LP(with variablexand auxiliary vector variableu)minimizemPi=1uisubject to u Ax b u(for fixedx, optimaluisui=|(Ax b)i|,i= 1, .. , m) Piecewise-linear optimization2 9 equivalent LP in matrix notationminimize 01 T xu subject to A I A I xu b b Piecewise-linear optimization2 10 Comparison with least-squares solutionhistograms of residualsAx b, with randomly generatedA R200 80, forxls= argminkAx bk,x 1= argminkAx bk1 (Axls b)k (Ax 1 b)k 1-norm distribution iswiderwith ahigh peak at zeroPiecewise-linear optimization2 11 Robust curve fitting fit affine functionf(t) = + ttompoints(ti, yi) an approximation problemAx bwithA= ,x= ,b= 10 50510 20 15 10 50510152025tf(t) dashed.
5 MinimizekAx bk solid: minimizekAx bk1 1-norm approximation is morerobust against outliersPiecewise-linear optimization2 12 Sparse signal recovery via 1-norm minimization x Rnis unknown signal, known to be very sparse we make linear measurementsy=A xwithA Rm n,m < nestimation by 1-norm minimization :compute estimate by solvingminimizekxk1subject toAx=yestimate is signal with smallest 1-norm, consistent with measurementsequivalent LP(variablesx,u Rn)minimize1 Tusubject to u x uAx=yPiecewise-linear optimization2 13 Example exact signal x R1000 10 nonzero components02004006008001000k 2 1012 xkleast-norm solutions(randomly generatedA R100 1000)02004006008001000k 2 1012xkminimum 2-norm solution02004006008001000k 2 1012xkminimum 1-norm solution 1-norm estimate isexactPiecewise-linear optimization2 14 Exact recoverywhen are the following problems equivalent?
6 Minimizecard(x)subject toAx=yminimizekxk1subject toAx=y card(x)is cardinality (number of nonzero components) ofx depends onAand cardinality of sparsest solution ofAx=ywe sayAallowsexact recoveryofk-sparse vectors if x= argminAx=ykxk1wheny=A xandcard( x) k here,argminkxk1denotes the unique minimizer a property of (the nullspace) of the measurement matrix APiecewise-linear optimization2 15 Nullspace condition for exact recoverynecessary and sufficient condition for exact recovery ofk-sparse vectors1|z(1)|+ +|z(k)|<12kzk1 z nullspace(A)\ {0}here,z(i)denotes componentziin order of decreasing magnitude|z(1)| |z(2)| |z(n)| a bound on how concentrated nonzero vectors innullspace(A)can be impliesk < n/2 difficult to verify for generalA holds with high probability for certain distributions of randomA1 Feuer & Nemirovski (IEEE Trans.)
7 IT, 2003) and several other papers oncompressed optimization2 16 Proof of nullspace conditionnotation xhas supportI {1,2, .. , n}ifxi= 0fori6 I |I|is number of elements inI PIis projection matrix onn-vectors with supportI:PIis diagonal with(PI)jj= 1j I0otherwise Asatisfies the nullspace condition ifkPIzk1<12kzk1for all nonzerozinnullspace(A)and for all support setsIwith|I| kPiecewise-linear optimization2 17sufficiency:supposeAsatisfies the nullspace condition let xbek-sparse with supportI( , withPI x= x); definey=A x consider any feasiblex( , satisfyingAx=y), different from x definez=x x.
8 This is a nonzero vector innullspace(A)kxk1=k x+zk1 k x+z PIzk1 kPIzk1=Xk I| xk|+Xk6 I|zk| kPIzk1=k xk1+kzk1 2kPIzk1>k xk1(line 2 is the triangle inequality; the last line is the nullspace condition)therefore x= argminAx=ykxk1 Piecewise-linear optimization2 18necessity:supposeAdoes not satisfy the nullspace condition for some nonzeroz nullspace(A)and support setIwith|I| k,kPIzk1 12kzk1 define ak-sparse vector x= PIzandy=A x the vectorx= x+zsatisfiesAx=yand has 1-normkxk1=k PIz+zk1=kzk1 kPIzk1 2kPIzk1 kPIzk1=k xk1therefore xis not the unique 1-minimizerPiecewise-linear optimization2 19 Linear classification given a set of points{v1.}
9 , vN}with binary labelssi { 1,1} find hyperplane that strictly separates the two classesaTvi+b >0ifsi= 1aTvi+b <0ifsi= 1homogeneous ina,b, hence equivalent to the linear inequalities (ina,b)si(aTvi+b) 1, i= 1, .. , NPiecewise-linear optimization2 20 Approximate linear separation of non-separable setsminimizeNXi=1max{0,1 si(aTvi+b)} penalty1 si(aTivi+b)for misclassifying pointvi can be interpreted as a heuristic for minimizing #misclassified points a Piecewise-linear minimization problem with variablesa,bPiecewise-linear optimization2 21equivalent LP(variablesa Rn,b R,u RN)minimizeNPi=1uisubject to1 si(vTia+b) ui, i= 1.
10 , Nui 0, i= 1, .. , Nin matrix notation:minimize 001 T abu subject to s1vT1 s1 1 0 0 s2vT2 s20 1 sNvTN sN0 0 100 1 0 0000 1 0 1 1 Piecewise-linear optimization2 22 Modeling softwaremodeling tools simplify the formulation of LPs (and other problems) accept optimization problem in standard notation (max,k k1, .. ) recognize problems that can be converted to LPs express the problem in the input format required by a specificLP solverexamples of modeling packages AMPL, GAMS CVX, YALMIP (MATLAB) CVXPY, Pyomo, CVXOPT (Python) Piecewise-linear optimization2 23 CVX exampleminimizekAx bk1subject to0 xk 1, k= 1.