Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
COMP4121 Lecture Notes
Linear Programming
We now move to one of the most important cases of convex programming, called Linear Programming, (LP), in which the objective is a linear function and the convex domain over which the extremum is sought is defined by con- straints which are also linear functions. We follow closely our old COMP3121 textbook by Cormen, Leiserson, Rivest and Stein, introduction to Algorithms. We start by defining a common representations of Linear Programming opti- mization problems. In the standard form the objective to be maximized is given by f(x) = n∑ j=1 cj xj , and the constraints are of the form n∑ j=1 aijxj ≤ bi, 1 ≤ i ≤ m; (1) xj ≥ 0, 1 ≤ j ≤ n, (2) Note that x represents a vector, x> = 〈x1, . . . , xn〉; we assume that vectors are columns; thus, to write them as rows we have to transpose them. The numbers aij are assumed to be real. Clearly, a minimization problem of the form: minimize f(x) = n∑ j=1 cj xj , subject to the constraints of the form n∑ j=1 aijxj ≥ bi, 1 ≤ i ≤ m; (3) xj ≥ 0, 1 ≤ j ≤ n, (4) can easily be reduced to a corresponding maximization problem in the standard form maximize f∗(x) = n∑ j=1 c∗j xj , with constraints of the form n∑ j=1 a∗ijxj ≤ b∗i , 1 ≤ i ≤ m; xj ≥ 0, 1 ≤ j ≤ n, by taking c∗ = −c, a∗ij = −aij and b∗ = −b. 1 Example: Humans require 13 vitamins V1, . . . , V13 in total; the daily re- quired amounts d1, . . . , d13 for each of them are known. Your local grocery store caries 130 types of food staples, and for each food staple Sk and each of the 13 vitamins Vi the content of Vi per gram of Sk is also a known value, denoted by C(k, i). For each food staple Sk you are also given its price per gram, denoted by Pk. Your goal is to determine the quantities xk of each food staple Sk which you should buy, such that your daily requirements of all vitamins are met, but the total price you pay for your daily food provision is as small as possible. Solution: You have to: minimize the objective 130∑ k=1 Pk xk subject to 13 constraints 130∑ k=1 C(k, i)xk > di, 1 ≤ i ≤ 13, x1 ≥ 0, . . . , x130 ≥ 0. Somewhat messy formulation of an LP program we have given can be refor- mulated in a much more compact matrix form. To get a very compact represen- tation of linear programs, let us introduce a partial ordering on vectors x ∈ Rn by x ≤ y if and only if such inequalities hold coordinate-wise, i.e., if and only if xj ≤ yj for all 1 ≤ i ≤ n. Since vectors are written as a single column matrices, letting c> = 〈c1, . . . , cn〉 ∈ Rn and b> = 〈b1, . . . bm〉 ∈ Rm, and letting A be the matrix A = (aij)i,j of size m × n, we get that the above problem can be formulated simply as: maximize c>x subject to the following two (matrix-vector) constraints:1 Ax ≤ b and x ≥ 0. Thus, to specify a Linear Programming optimisation problem we just have to provide a triplet (A,b, c), and the information that the triplet represents an LP problem in the standard form. While in general the “natural formulation” of a problem into a Linear Pro- gram does not necessarily produce the non-negativity constrains (4) for all of the variables, in the standard form such constraints are indeed required for all of the variables. However, this poses no problem, because each occurrence of an unconstrained variable xj can be replaced by the expression x ′ j − x∗j where x′j , x ∗ j are new variables satisfying the constraints x ′ j ≥ 0, x∗j ≥ 0. Any vector x which satisfies the two constraints is called a feasible solution, regardless of what the corresponding objective value cTx might be. Note that the set of feasible solutions, i.e., the domain over which we seek to maximize the 1Note that the inequality involved in the constraints is the partial ordering on vectors which we have just introduced above. Also, in the non-negativity constraint, 0 represents a vector 0 ∈ Rn with all coordinates equal to 0. 2 objective, is a convex set because it is an intersection of the half spaces defined by each of the constraints. As an example, let us consider the following optimization problem: maximize 3x1 + x2 + 2x3 (5) subject to the constraints x1 + x2 + 3x3 ≤ 30 (6) 2x1 + 2x2 + 5x3 ≤ 24 (7) 4x1 + x2 + 2x3 ≤ 36 (8) x1, x2, x3 ≥ 0 (9) How large can the value of the objective z(x1, x2, x3) = 3x1 + x2 + 2x3 be, without violating the constraints? If we add inequalities (6) and (7), we get 3x1 + 3x2 + 8x3 ≤ 54 Since all variables are constrained to be non-negative, we are assured that 3x1 + x2 + 2x3 ≤ 3x1 + 3x2 + 8x3 ≤ 54 The far left hand side of this equation is just the objective (5); thus z(x1, x2, x3) is bounded above by 54, i.e., z(x1, x2, x3) ≤ 54. Can we obtain a tighter bound? We could try to look for coefficients y1, y2, y3 ≥ 0 to be used to for a linear combination of the constraints: y1(x1 + x2 + 3x3) ≤ 30y1 y2(2x1 + 2x2 + 5x3) ≤ 24y2 y3(4x1 + x2 + 2x3) ≤ 36y3 Then, summing up all these inequalities and factoring, we get x1(y1 + 2y2 + 4y3) + x2(y1 + 2y2 + y3) + x3(3y1 + 5y2 + 2y3) ≤ 30y1 + 24y2 + 36y3 (10) If we compare this with our objective (5) we see that if we choose y1, y2 and y3 so that: y1 + 2y2 + 4y3 ≥ 3 y1 + 2y2 + y3 ≥ 1 3y1 + 5y2 + 2y3 ≥ 2 then x1(y1 + 2y2 + 4y3) + x2(y1 + 2y2 + y3) + x3(3y1 + 5y2 + 2y3) ≥ 3x3 + x2 + 2x3 3 Combining this with (10) we get: 30y1 + 24y2 + 36y3 ≥ 3x1 + x2 + 2x3 = z(x1, x2, x3) Consequently, in order to find an as tight as possible upper bound for our objective z(x1, x2, x3), we have to look for y1, y2, y3 which produce the smallest possible value of z∗(y1, y2, y3) = 30y1 + 24y2 + 36y3, but which do not violate the constraints y1 + 2y2 + 4y3 ≥ 3 (11) y1 + 2y2 + y3 ≥ 1 (12) 3y1 + 5y2 + 2y3 ≥ 2 (13) y1, y2, y3 ≥ 0 (14) Thus, trying to find the best upper bound for our objective z(x1, x2, x3) obtained by forming a linear combination of the constraints only reduces the original maximization problem to a minimization problem: minimize 30y1 + 24y2 + 36y3 subject to the constraints (11-14) Such minimization problem P ∗ is called the dual problem of the initial problem P . Let us now repeat the whole procedure with P ∗ in place of P , i.e., let us find the dual program (P ∗)∗ of P ∗. We are now looking for non negative multipliers z1, z2, z3 ≥ 0 to multiply inequalities (11-13) and obtain z1(y1 + 2y2 + 4y3) ≥ 3z1 z2(y1 + 2y2 + y3) ≥ z2 z3(3y1 + 5y2 + 2y3) ≥ 2z3 Summing these up and factoring produces y1(z1 + z2 + 3z3) + y2(2z1 + 2z2 + 5z3) + y3(4z1 + z2 + 2z3) ≥ 3z1 + z2 + 2z3 (15) If we choose these multiples so that z1 + z2 + 3z3 ≤ 30 (16) 2z1 + 2z2 + 5z3 ≤ 24 (17) 4z1 + z2 + 2z3 ≤ 36 (18) we will have: y1(z1 + z2 + 3z3) + y2(2z1 + 2z2 + 5z3) + y3(4z1 + z1 + 2z3) ≤ 30y1 + 24y2 + 36y3 Combining this with (15) we get 3z1 + z2 + 2z3 ≤ 30y1 + 24y2 + 36y3 Consequently, finding the dual program (P ∗)∗ of P ∗ amounts to maximizing the objective 3z1 + z2 + 2z3 subject to the constraints (16-18). But notice that, 4 except for having different variables, (P ∗)∗ is exactly our starting program P . Thus, the dual program (P ∗)∗ for program P ∗ is just P itself, i.e., (P ∗)∗ = P . So, at the first sight, looking for the multipliers y1, y2, y3 did not help much, because it only reduced a maximization problem to an equally hard minimization problem. Well, perhaps it still maybe easier to somehow solve both problems at the same, and this is precisely what the SIMPLEX algorithms does.2 To find the maximal value of 3x1 + x2 + 2x3 subject to the constraints x1 + x2 + 3x3 ≤ 30 2x1 + 2x2 + 5x3 ≤ 24 4x1 + x2 + 2x3 ≤ 36 let us start with x1 = x2 = x3 = 0 and ask ourselves: how much can we increase x1 without violating the constraints? Since x1 +x2 +3x3 ≤ 30 we can introduce a new variable x4 such that x4 ≥ 0 and x4 = 30− (x1 + x2 + 3x3) (19) Since such variable measures how much “slack” we got between the actual value of x1+x2+3x3 and its upper limit 30, such x4 is called a slack variable. Similarly we introduce new variables x5, x6 requiring them to satisfy x5, x6 ≥ 0 and let x5 = 24− 2x1 − 2x2 − 5x3 (20) x6 = 36− 4x1 − x2 − 2x3 (21) Since we started with the values x1 = x2 = x3 = 0, this implies that these new slack variables must have values x4 = 30, x5 = 24, x6 = 36, or as a single vector, the initial basic feasible solution is ( x1 0 , x2 0 , x3 0 , x4 30, x5 24, x6 36). Note that “fea- sible” refers merely to the fact that all of the constraints are satisfied.3 Now we see that (19) implies that x1 cannot exceed 30, and (20) implies that 2x1 ≤ 24, i.e., x1 ≤ 12, while (21) implies that 4x1 ≤ 36, i.e., x1 ≤ 9. Since all of these conditions must be satisfied we conclude that x1 cannot exceed 9, which is the upper limit coming from the constraint (21). If we set x1 = 9, this forces x6 = 0. We now swap the roles of x1 and x6: since we cannot increase x1 any more, we eliminate x1 from the right hand sides of the equations (19-21) and from the objective, introducing instead variable x6 to the right hand side of the constraints and into the objective. To do that, we solve equation (21) for x1: x1 = 9− x2 4 − x3 2 − x6 4 2It is now useful to remember how we proved that the Ford - Fulkerson Max Flow algorithm in fact produces a maximal flow, by showing that it terminates only when the flow reaches the capacity of a (minimal) cut! 3Clearly, setting all variables to 0 does not always produce a basic feasible solution because this might violate some of the constraints; this would happen, for example, if we had a constraint of the form −x1 +x2 +x3 ≤ −3; choosing an initial basic feasible solution requires a separate algorithm to “bootstrap” the SIMPLEX algorithm - see for the details our (C-L- R-S) textbook. 5 and eliminate x1 from the right hand side of the remaining constraints and the objective to get: z = 3 ( 9− x2 4 − x3 2 − x6 4 ) + x2 + 2x3 = 27− 3 4 x2 − 3 2 x3 − 3 4 x6 + x2 + 2x3 = 27 + 1 4 x2 + 1 2 x3 − 3 4 x6 x5 = 24− 2 ( 9− x2 4 − x3 2 − x6 4 ) − 2x2 − 5x3 = 6 + x2 2 + x3 + x6 2 − 2x2 − 5x3 = 6− 3 2 x2 − 4x3 + x6 2 x4 = 30− ( 9− x2 4 − x3 2 − x6 4 ) − x2 − 3x3 = 21 + x2 4 + x3 2 + x6 4 − x2 − 3x3 = 21− 3 4 x2 + 5 2 x3 + x6 4 To summarise: the “new” objective is z = 27 + 1 4 x2 + 1 2 x3 − 3 4 x6 and the “new constraints” are x1 = 9− x2 4 − x3 2 − x6 5 (22) x4 = 21− 3 4 x2 − 5 2 x3 + x6 4 (23) x5 = 6− 3 2 x2 − 4x3 + x6 2 (24) Our new basic feasible solution replacing ( x1 0 , x2 0 , x3 0 , x4 30, x5 24, x6 36) is obtained by set- ting all the variables on the right to zero, thus obtaining ( x1 9 , x2 0 , x3 0 , x4 21, x5 6 , x6 0 ). NOTE: These are EQUIVALENT constraints and objectives; the old ones were only transformed to an equivalent form. Any values of the variables will produce exactly the same value in both forms of the objective and they will satisfy the first set of constraints if and only if they satisfy the second set. So x1 and x6 have switched their roles; x1 acts as a new basic variable, and the new basic feasible solution is: ( x1 9 , x2 0 , x3 0 , x4 21, x5 6 , x6 0 ); the new value of the objective is z = 27 + 140 + 1 20− 340 = 27. We will continue this process of find- ing basic feasible solutions which increase the value of the objective, switching which variables are used to measure the slack and which are on the right hand side of the constraint equations and in the objective. The variables on the left are called the basic variables and the variables on the right are the non basic variables. 6 We now choose another variable with a positive coefficient in the objective, say x3 (we could also have chosen x2). How much can we increase it? From (22) we see that x32 must not exceed 9, otherwise x1 will become negative. Thus x3 cannot be larger than 18. Similarly, 5 2x3 cannot exceed 21, otherwise x4 will become negative, and so x3 ≤ 425 ; similarly, 4x3 cannot exceed 6, ie, x3 ≤ 32 . Thus, in order for all constraints to remain valid, x3 cannot exceed 3 2 . Thus, we increase x3 to 3 2 ; equation (24) now forces x5 to zero.