In a graph G = (V; E), an independent set is a set of vertices V 0 2 V , such that 8u; v 2 V 0, v and u are not
neighbors. A maximal independent set (MIS) is an independent set S such that for any v 2 V nS, S [ fvg
is not an independent set. This means that adding any one more node into S will make it not independent.
Note that MIS of a graph is not unique. Assume the graph is maintained using CSR, which means you can
directly read the degree of each vertex and the neighbor list of each vertex.
Sequentially, there is a simple greedy algorithm to find one maximal independent set. First of all, let’s
assign a unique priority to each vertex, uniformly at random. Let p(v) be the priority of vertex v. We
process all vertices in the order of their priority, highest the first. We start with S = ;. All vertices are
initialized as active. When processing a pivot v, we check if it’s still active. If not, we skip to the next
vertex. If so, we add v into S, and mark v and all its neighbors as inactive. This is because we chose v in S,
so all its neighbors cannot be chosen in S then. We then continue to the next vertex, until all vertices have
been processed. See Figure 1 as an example. The numbers indicates the priority of the vertices. The orange
vertices are those selected in MIS. We start with O1 and add it to S. It will set O6 and O5 as inactive.
Then we process O2 , and add it to S. Then we process O3 , and we found it’s already inactive (set inactive
by O2 ), so we skip it. O4 is still active now and can be added to S. O5 , O6 , and O6 are inactive because
of O1 and O2 . O8 is active and can be added. We can follow the process and finally, S = f1; 2; 4; 8; 10; 13g.
This does not work in parallel, since it requires O(jV j) rounds to process all vertices. In this problem, we
will design a parallel algorithm to find an MIS. Although directly parallelizing the above algorithm seems
hard, some techniques in class that might be useful. In particular, we will use the idea we learnt in the
lecture about deterministic parallelism. The core idea is to find all vertices that do not need to wait for any
other vertices, and just process them all in parallel.
In the class about deterministic parallelism, we’ve seen how to parallelize an iterative algorithm, where
the algorithm is a list of tasks performed one after the other. The key idea is to make the parallel version
of the algorithm to have exactly the same effect (and execution) as the sequential version. If
there are iterations that do not dependent on other iterations (or, all those it depends on has been finished),
we say an iteration is ready. In each round, we execute all ready iterations (tasks). In particular, we
can consider the following (you can plug in the random permutation algorithm to help you understand the
general idea):
For a task i, is there any unfinished task j before task i that could block task i? If not, or if all such
tasks have finished, we say task i is ready. We can then start performing task i immediately, instead
of waiting until task i − 1 finish.
When multiple tasks are ready at the same time, we can execute them all in parallel.
Answer the following questions:
1. (5pts) How can we generate unique priorities to each vertex uniformly at random? How can we do
that in parallel, and what are the work and span?
2. (5pts) Consider a vertex v in the graph with priority p(v), what is the condition that makes v ready
such that v can be used as the pivot right away? Hint: consider v’s priority and all its neighbors’
priority.
3. (8pts) Based on your idea above, in Figure 1, which vertices can be processed and added in round 1,
2, 3, …? For general cases, which vertices will be executed in parallel in each round (in this question,
you don’t need to specify how to find such vertices)?
4. (10pts) Describe a parallel algorithm based on your idea in Problem 3. You need to specify more
details about how to find the vertices to be processed in each round. You do not need to prove the
cost of your algorithm.
5. (7pts) Prove that your algorithm has span O(D log n), where D is the longest chain in the graph with
decreasing priority. In other words, D is the length of the longest chain v1; v2; :::; vD in the graph such
that p(v1) > p(v2) > · · · > p(vD)
6. (bonus, 1 candy and 3pts) Assume you have a constant-cost atomic operation fetch-and-add. De
sign a parallel algorithm for the MIS problem with work O(m), and span bound the same as above
(O(D log n)).