Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
COMP26020 - Lab exercise for Part III (Compilers)
Register Allocation using Graph Colouring
Background
Computer programs, regardless of the programming language, often use many more variables
than the number of variables that can fit in all CPU registers. When a program is compiled for
execution on a given processor, the compiler needs to consider what variables will stay in
registers and for how long. If we think that moving data from the memory takes several cycles,
there is a performance benefit if the compiler can minimise such transfers. How to do this? By
doing some ‘clever’ register allocation, for example, by making sure that the most frequently used
variables are placed in registers.
To understand the problem, consider the following piece of code:
1. r1=x
2. r2=y
3. r3=r1*r2
4. r4=z
5. r5=r4+r2
6. r6=w
7. r7=r5+r6
8. r8=r7*r3
9. r9=r8+r1
In this piece of code, the programmer has used 9 variables. However, does this mean that 9
registers are needed? To find the answer, let us define the notion of a live range. For any given
variable, there is a live range that starts from the point where a value is assigned to this variable
and lasts until the last time this particular value is used. Note that if a new value is assigned to
the same variable, a new live range starts. For example, a value for r2 is defined in instruction 2.
The last time it is used is in instruction 5, hence, the live range is between 2 and 5. However, if
instruction 4 was r2=z, the live range would be from 2 to 3 and another live range would start at
instruction 4 and end at instruction 5.
To practice, you may want to find all live ranges of the code above. The answer is given: r1:[1,9],
r2:[2,5], r3:[3,8], r4:[4,5], r5:[5,7], r6:[6,7], r7:[7,8], r8:[8,9], r9:[9,9].
Live ranges are important because they indicate how many values need to be live at any given
instruction. For example, the live ranges above tell us that at instruction 6 four values need to be
live. Clearly, the maximum number of values that need to be live at any instruction indicates how
many registers we need to have so that all values (live ranges) can be placed in registers.
However, most importantly, live ranges can guide register allocation: two live ranges that do not
overlap or interfere can use the same register. For example, with the live ranges above, r2 and r6
can share the same register as the corresponding values are needed (or are ‘live’) at different
parts of the code.
Different algorithms have been developed to find how to allocate different live ranges to registers.
This problem is known as register allocation. It is an NP-complete problem, which means that
most of the different solutions proposed over the years are based on heuristics. For additional
information you can refer to Chapter 13 of the ‘Engineering a Compiler’ recommended textbook:
https://www.sciencedirect.com/science/article/pii/B978012088478000013X
Among the different approaches, register allocation using graph colouring is a common
approach. In register allocation using graph colouring, live ranges are used to create an
interference graph. In this graph, every live range corresponds to a node. There is an edge
between two nodes if the live ranges overlap. Then, register allocation becomes equivalent to the
problem of graph colouring. This is a well-known graph theory problem where the aim is to colour
all nodes of the graph so that two adjacent nodes do not share the same colour. Typically the
goal is to find the smallest number of colours. Every colour corresponds to a register and the
colour of a node corresponds to the register that should be used for a particular live range. There
are various algorithms to colour a graph. Here, we are going to focus on a simple (heuristic)
algorithm, which is known as top-down colouring.