CMPSC 497: Advanced Algorithms
Problem Set 3
Notice: Type your answers using LaTeX and make sure to upload the answer file on Gradescope before the deadline.
For more details and instructions, read the syllabus.
Problem 1.
In the MAXIMUM CUT problem we are given an undirected graph G = (V, E) with a weight w(e) on each edge, and we wish to separate the vertices into two sets S and V − S so that the total weight of the edges between the two sets is as large as possible. For each S ⊆ V, define w(S) to be the sum of w(e) over all edges e = {u, v} such that |S ∩ {u, v}| = 1. In other words, MAX CUT asks us to maximize w(S) over all subsets S of V. Consider the following local search algorithm for MAX CUT:
start with any S ⊆ V
while there is a subset S′ ⊆ V such that |(S′ ∪ S) − (S′ ∩ S)| = 1 and w(S′) > w(S) do:
    set S = S′
(a) Show that this is an approximation algorithm for MAX CUT with ratio 2.
(b) What is the time complexity of this algorithm?
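For concreteness, the following is a minimal Python sketch of the local search above; the adjacency-dict input format, comparable (e.g., integer) vertex labels, and the choice of the empty set as the starting S are assumptions made purely for illustration.

def cut_weight(graph, S):
    # Total weight of edges with exactly one endpoint in S.
    total = 0.0
    for u in graph:
        for v, w in graph[u].items():
            if u < v and ((u in S) != (v in S)):  # count each undirected edge once
                total += w
    return total

def local_search_max_cut(graph):
    S = set()  # start with any S; here, the empty set
    improved = True
    while improved:
        improved = False
        for v in graph:
            T = S ^ {v}  # S' differs from S in exactly one vertex
            if cut_weight(graph, T) > cut_weight(graph, S):
                S = T
                improved = True
                break
    return S

# Example: a weighted triangle; the search ends at a locally optimal cut of weight 5.
G = {0: {1: 1.0, 2: 2.0}, 1: {0: 1.0, 2: 3.0}, 2: {0: 2.0, 1: 3.0}}
print(local_search_max_cut(G))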
Problem 2. Tree Jumper
In the MINIMUM STEINER TREE problem, the input consists of a complete graph G = (V, E) with distances d_{uv} between all pairs of nodes, and a distinguished set of terminal nodes V′ ⊆ V. The goal is to find a minimum-cost tree that includes the vertices V′. This tree may or may not include nodes in V − V′.
Suppose the distances in the input are a metric (recall the definition on textbook page 273). Show that an efficient ratio-2 approximation algorithm for MINIMUM STEINER TREE can be obtained by ignoring the nonterminal nodes and simply returning the minimum spanning tree on V′. (Hint: Recall our approximation algorithm for the TSP.)
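As a point of reference, here is a minimal Python sketch of the algorithm being analyzed: it runs Prim's algorithm on the terminals only, ignoring nonterminals. The dist function and the list-of-terminals input format are assumptions made for illustration.

def steiner_via_terminal_mst(terminals, dist):
    # Minimum spanning tree on the terminal set V' only (Prim's algorithm).
    terminals = list(terminals)
    if not terminals:
        return []
    in_tree = {terminals[0]}
    edges = []
    while len(in_tree) < len(terminals):
        # Cheapest edge from the partial tree to a terminal outside it.
        u, v = min(((a, b) for a in in_tree for b in terminals if b not in in_tree),
                   key=lambda e: dist(*e))
        in_tree.add(v)
        edges.append((u, v))
    return edges

# Example with the metric |x - y| on points of the real line.
print(steiner_via_terminal_mst([1, 4, 9], lambda x, y: abs(x - y)))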
Problem 3. Partial Independence
Give a polynomial-time 1/(d + 1)-approximation algorithm for the MAXIMUM INDEPENDENT SET problem when the input graph G is undirected and the degree of every vertex of G is at most d ∈ N.
Problem 4. Bad Packing
Consider the following bin packing problem: given n items of sizes a_1, a_2, ..., a_n, find a way to pack them into unit-sized bins so that the number of bins needed is minimized. Consider a greedy algorithm: process all items in an arbitrary order; for the current k-th item, if there exists a partially packed bin that can still fit it, then put it in that bin; otherwise, open a new bin and put the k-th item in it. Show that the approximation ratio of the above greedy algorithm is at most 2. Design an instance on which the above algorithm uses at least (5/3) · OPT bins.
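A minimal Python sketch of this greedy procedure, included only to make the packing rule concrete (the list-of-sizes input format is an assumption for illustration):

def greedy_bin_packing(sizes):
    bins = []        # bins[j] = remaining capacity of (unit-sized) bin j
    assignment = []  # assignment[k] = index of the bin that received item k
    for a in sizes:
        for j, free in enumerate(bins):
            if a <= free:            # some partially packed bin can still fit the item
                bins[j] = free - a
                assignment.append(j)
                break
        else:                        # no open bin fits: open a new bin
            bins.append(1.0 - a)
            assignment.append(len(bins) - 1)
    return len(bins), assignment

# Example: four items packed into two bins.
print(greedy_bin_packing([0.6, 0.5, 0.3, 0.5]))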
Problem 5. Lightning Punch
Consider a minimum-weight hitting set problem defined as follows. We are given a set U = {u_1, u_2, ..., u_n} and a collection {B_1, B_2, ..., B_m} of subsets of U, i.e., each B_j is a subset of U. Also, each element u_i ∈ U has a weight w_i ≥ 0. The problem is to find a hitting set H ⊆ U such that the total weight of the elements in H is as small as possible, i.e., to minimize ∑_{u_i ∈ H} w_i. Note that we say H is a hitting set if H ∩ B_j ≠ ∅ for every 1 ≤ j ≤ m. Let b = max_{1 ≤ j ≤ m} |B_j| be the maximum size of any of the sets B_1, B_2, ..., B_m.
(a) Design a b-approximation algorithm for the above problem using the LP + rounding schema.
(b) Design a b-approximation algorithm for the above problem using the primal-dual schema.
Problem 6. Streaming
Design an algorithm to return the ε-approximate ∆-quantile of a stream of numbers in [n]. The ∆-quantile of [n], for ∆ ∈ (0, 1], is some number k ∈ [n] such that ∆n ≤ k. The ε-approximate ∆-quantile is some k ∈ [n] such that (1 − ε)∆n ≤ k ≤ (1 + ε)∆n.
Problem 7. Reservoir Sampling
1. Let us now assume that while doing reservoir sampling, we wish to maintain a random set of k elements from our stream at any point in time, i.e., the probability of any specific k-set being stored is 1/\binom{n}{k}. Recall that in class we only stored a single element at a time, i.e., k = 1 (a sketch of that basic procedure is given after this list). Describe an algorithm for generalized reservoir sampling, and show that it works.
2. Given data streams S_1 and S_2, of lengths l_1 and l_2, we use reservoir sampling to store k-length samples k_1 and k_2. If we now consider the concatenated data stream S = S_1 ◦ S_2, how can we generate the k-length sample k^* for S using k_1, k_2, l_1, and l_2?
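For reference, a minimal Python sketch of the basic k = 1 reservoir sampling recalled in part 1 (this is only the single-element version discussed in class, not an answer to either part; the stream can be any iterable):

import random

def reservoir_sample_one(stream):
    # Keep one stored element that is uniform over all items seen so far.
    sample = None
    for i, x in enumerate(stream, start=1):
        if random.randrange(i) == 0:  # replace the stored element with probability 1/i
            sample = x
    return sample

# Example: a uniformly random element of 1..100, read as a stream.
print(reservoir_sample_one(range(1, 101)))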
Problem 8. Streaming III
1. Given a data stream D = d_1, d_2, ..., d_m such that d_i ∈ [n] for all i, we define the k-th frequency moment of D as
   F_k = ∑_{i ∈ [n]} (f_i)^k
   where f = ⟨f_1, f_2, ..., f_n⟩ is the vector that encodes the frequency with which each distinct element appears in the stream (a small worked example is given after part 2). Describe an algorithm that correctly estimates F_k in expectation. How much space does it need?
2. Let us now assume that we have the same frequency vector f as above (initialized to 0) and the data stream is of the form D = {(i_j, k_j) : j ∈ [1, m], i_j ∈ [1, n]}, where i_j is an index into f and k_j is the amount by which f_{i_j} is increased. Describe an algorithm to estimate the k-th frequency moment of D.
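To make the definition in part 1 concrete, here is a small worked example (the stream below is illustrative only): for n = 3 and the stream D = 3, 1, 3, 2, 3, the frequency vector is f = ⟨1, 1, 3⟩, so F_1 = 1 + 1 + 3 = 5 = m and F_2 = 1^2 + 1^2 + 3^2 = 11.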
Problem 9. Streaming IV
We define the k-majority elements of a data stream D of size n as the elements that occur more than n/k times in the stream (when k = 2, this is the majority element). Design an algorithm to return the k-majority elements of D, and prove that it works. How much space do you need?