COMP4691-8691 Cut Generation and Convex Optimisation
Cut Generation and Convex Optimisation
1 Cut Generation for TSP [50 marks]
The objective of the travelling salesman problem (TSP) is to find the shortest
tour of a set of cities C := {1, 2, . . . , n}. The distance between each pair of
cities is given by di,j ≥ 0 ∀i, j ∈ C where di,j = dj,i.
A valid tour is a sequence of cities, representing the order they are visited,
where each city appears exactly once. The total length of a tour is the sum
of distances between adjacent cities in the sequence, where we also consider
the first and last city in the sequence as being adjacent (the tour must finish
in the same city as it started in).
A key difficulty in TSP-like problems is simultaneously modelling:
• the requirement for a single large tour covering all cities (instead of
multiple subtours as shown in figure 1); and
• the tour length.
1
2000 1500 1000 500 0 500 1000 1500 2000
2000
1500
1000
500
0
500
1000
Figure 1: Example of an invalid TSP solution, which consists of 10 subtours
rather than a single large one.
City Sequence Variables
One encoding of the problem is to consider decision variables that indicate
the position of each city in the sequence xu,i ∈ {0, 1} ∀u, i ∈ C. Here the
index u represents the sequence position, and i the city. To ensure we only
visit each city exactly once, and that we can only visit one city at a time in
the sequence we can impose the following constraints:∑
i∈C
xu,i = 1 ∀u ∈ C (1)∑
u∈C
xu,i = 1 ∀i ∈ C (2)
The downside of this representation is that we cannot easily represent the
2
tour length.
Link Variables
A second approach is to instead have decision variables for each possible link
in the tour (each pair of cities) yi,j ∈ {0, 1} ∀i 6= j ∈ C where yi,j = yj,i. A
tour then consists of a set of links, where each city must appear in exactly
two links (one for the incoming trip, the other for the outgoing):∑
j∈C\i
yi,j = 2 ∀i ∈ C (3)
In this representation it is easy to write out the objective:∑
i∈C
∑
j>i∈C
di,jyi,j (4)
The sticking point with this representation is that our constraints to represent
a valid tour are insufficient, as we haven’t prevented subtours from forming.
1.1 Combining Representations [5 marks]
One option is to utilise both representations and implement constraints to
ensure the two sets of variables are consistent. Mathematically formulate a
set of linear constraints (should result in of the order of n3 constraints) to
connect the city sequencing variables to the link variables.
Unfortunately, because this requires n3 constraints the problem formulations
quickly get too large to solve efficiently (we’ll solve a problem with n =
216 shortly), and these constraints tend to have a relatively weak linear
relaxation.
1.2 Subtour Elimination and Strength [10 marks]
An alternative to combining the representations, is to stick with just the link
variables, and directly implement constraints on these variables to eliminate
subtours. Let S represent the set of all city subsets with size greater than
2 and less than n: S ⊂ C, 2 < |S| < n ∀S ∈ S. Each of these city subsets
3
S ∈ S represents a potential subtour, that we want to prevent from forming
by enforcing the constraint:∑
iyi,j ≤ |S| − 1 ∀S ∈ S (5)
Come up with and explain a minimal example that demonstrates that a
formulation that utilises link variables (3)–(4) and subtour elimination con-
straints (5) results in a stronger linear relaxation than combining the city
sequence variables (1)–(2) and link variables (3)–(4) with the constraints
you derived in section 1.1.
1.3 Subtour Cut Generation [35 marks]
The major downside of the this formulation with link variables and subtour
elimination constraints is that the set S grows exponentially with the number
of cities.
In order to take advantage of the strength of these constraints, without blow-
ing out the formulation size, we are going to generate the subtour elimination
constraints on demand. The expectation is that we will hopefully only need
to generate a small subset of the subtour elimination constraints in order to
find the optimal tour.
The steps of our overall algorithm are:
1. Initialise the subtour set S ′ to ∅.
2. Solve MILP consisting of (3)–(4), and (5) using S ′.
3. If the solution to the MILP is a single valid tour, return optimal.
4. Otherwise add subtours present in MILP solution to S ′.
5. Repeat from 2 with updated S ′.
Explain why we can be confident we have the optimal to the overall TSP
problem if we get a single valid tour at step 3 [5 marks].
Implement the above algorithm in tsp.py. Report the results including the
resulting plot, the objective, the overall solve time, the number of iterations
and the number of subtour cuts generated [30 marks].
4
Hint: If you are implementing constraints or objectives in PuLP that have
expressions with many terms, you will likely want to use the pulp functions
pulp.lpSum and pulp.lpDot. These are many times faster that iterating
through a loop or calling python’s inbuilt sum function.
Hint: You will likely want to set a time limit for the subproblem MILP
solves (i.e. using the CBC maxSeconds option through PuLP). The idea is
that you don’t need to solve every subproblem optimally (only certain ones),
and it might in fact be better to stop early, generate more cuts, and then start
working on the next MILP. You sholud make this time limit adaptive, so that
the algorithm remains complete (e.g., you don’t want to get stuck returning
infeasible results or solutions that don’t lead to useful subtour cuts).
Hint: The PuLP status output will be zero when the solver times out,
regardless of whether or not an integer feasible solution has been found yet.
Depending on how tight the time limits are you choose, you might need to
check for yourself whether the returned value is integer feasible and satisfies
the important tour requirements, before attempting to generate cuts from it.
Hint: You should be able to solve the full 216 city instance in around 2
minutes. If you want small instances to work with, the script allows you to
filter out cities below a certain size. E.g., pass --min-population 1e6 to
limit the cities to those over 1 million in population.
Note: If you are keen to push your implementation even further, you can
take a look at some of the problem instances available here: http://comopt.
ifi.uni-heidelberg.de/software/TSPLIB95/tsp/.
2 Dual Gradient Ascent [50 marks]
Dual gradient ascent is an algorithm for solving strictly convex constrained
optimisation problems. In this question you will implement the approach,
and use it to solve several problems.
The problems.py file contains two constrained optimisation problems hard-
coded into classes Problem1 and Problem2. These classes offer functions to
evaluate the objective, the gradient of the objective, the constraints and the
constraint jacobian. This file also contains a more general class to represent
quadratically constrained quadratic programs (QCQP). It can load a specific
5
problem instance from file.
Your code should go in the file dual ascent.py. This can be called as a
script. If you provide it with a JSON file it will solve the corresponding
QCQP, otherwise it will solve the two hardcoded Problem1 and Problem2.
2.1 Gradient Descent [15 marks]
As a subcomponent, we need to be able to first solve gradient descent. Im-
plement the gradient descent algorithm in the gradient descent function,
of the dual ascent.py file.
Solve unconstrained versions of Problem1 and Problem2 (just ignore the
constraints) using your gradient descent algorithm. Report the number of
iterations required and the solution.
Hint: You’ll likely want to make use of one of the rule-based step sizes, e.g.,
one of those by Barzilai and Borwein from the lectures.
Hint: Implement a stopping criteria that checks whether the gradient is
within a provided tolerance of being zero.
2.2 Dual Gradient Ascent [25 marks]
Next implement the dual ascent algorithm in dual ascent, making use of
gradient descent.
Solve the constrained versions of Problem1 and Problem2 using your algo-
rithm. Report the number of iterations required and the solution.
Hint: Implement a stopping criteria based on satisfaction of the KKT con-
ditions to within a tolerance.
2.3 More General Quadratic Problems [10 marks]
Report the results you get by running your dual ascent algorithm on instances
qcqp-{1,2,3,4}.json. Based on the dual variables you obtain, explain
which constraints are active for each instance [5 marks]. You will likely
struggle to solve the instance qcqp-3.json. If so explain why [5 marks].
6