Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: THEend8_
CS 333 Algorithms in the Real World
HW2 Due 3:30 pm on Tuesday, September 14 Submission. You will submit materials for homework number n on Gradescope under two separate as- signments: HWn Report and HWn Code. You should turn in a pdf containing all of your written solutions for the report, and a source code file containing all of the code you wrote for the assignment. When you submit your code, the interface may mention an autograder; you can ignore this, as the assignment will not be autograded. Your grade will be recorded on the report; the source code is to show your work and for reference during grading. Report. The pdf you submit for the report should be typed, and solutions to individual questions should be clearly labeled. Show your work or explain your reasoning for your answers. You should use LaTeX (which is free) or some other editor of your choice (Microsoft Word, Google Docs, etc.) to prepare your reports. If you use an editor like Microsoft Word, make sure to convert the final document to a pdf, confirm that the symbolic math from the equation editor is properly formatted, and submit the pdf. Collaboration. You may complete this assignment independently or in a group of two students. If you work with another student you should submit a single report and a single code file with both of your names, and should use the group feature when submitting on gradescope to indicate that you worked as a group. Do not split up the assignment, and each only complete half of the problems. Instead, complete each portion of the assignment by working together synchronously (for example, by pair programming with screen sharing over zoom or some other similar service) or by working independently and then coming together to merge solutions and check one another’s work. Allowed Materials. You can use any standard library functions and data structures in your programming language of choice (Java or Python 3 are recommended). You may also use any slides or notes from class or reference materials posted on the course website. You may search the internet for basic definitions, terminology, and language documentation (for example, checking the syntax for array slicing in python), but you may not use anyone else’s code (either another student’s or from the internet), nor may you search the internet for solutions or descriptions of solutions to the homework problems. Problem 1 (Discrete Cosine Transform, 20 points). The discrete cosine transform of N values x0, x1, . . . , xN−1 is given by DCT (x0, x1, . . . , xN−1) = X0, X1, . . . , XN−1 where Xk = N−1∑ n=0 xn cos [ pi N ( n+ 1 2 ) k ] . Again for N values, the inverse discrete cosine transform that maps the indices n from 0 to N − 1 back to their original values is given by the function f(n) = 2 N ( 1 2 X0 + N−1∑ k=1 Xk cos [ pi N k ( n+ 1 2 )]) . (a) Suppose you are given the following 3 pixel intensity values: [1, 0, 1]. Apply the discrete cosine transform to compute the coefficients corresponding to the cosine interpolation of these values. Show your work. (b) Using the coefficients you computed in part (a), write the function f(n) that maps the indices [0, 1, 2] back onto the intensity values you started with. Evaluate your function at these indices to confirm it is correct. Show your work. 1 (c) Suppose that you are given a set of N equal intensity values x0 = x1 · · · = xN−1. Show that X1 = 0. For simplicity, you may assume that N is even. [hint: You may find it useful to recall the trigonometric identity that cos(θ) = − cos(pi− θ).] If you are having difficulty showing this formally, try to explain the intuition for why X1 should be 0. (d) In fact, if x0 = x1 · · · = xN−1 then Xk = 0 for all k > 0. Suppose an 8x8 block of pixels in an image has all equal intensity values; what does this imply for compression? Specifically, about how many bytes will be required to store the entire 8 × 8 block after quantization in a sparse matrix representation? Problem 2 (Reasoning about Runtimes, 18 points). The 2 dimensional discrete cosine transform of an N ×M matrix (N rows by M columns) x is another N ×M matrix where the row j column k entry is given by Xj,k = N−1∑ n=0 M−1∑ m=0 xn,m cos [ pi N ( n+ 1 2 ) j ] cos [ pi M ( m+ 1 2 ) k ] . (a) Argue that the transform coefficients (that is, all of the values in the transform matrix X above) can be computed in O(NM(N+M)) time. [hint: we mentioned the idea in class; argue that this idea computes the above coefficients correctly and takes the stated runtime] (b) Recall that the overall algorithm for compression we have studied scans over every 8 × 8 pixel block of an image, computes its discrete cosine transform, quantizes that block, and then stores the quantized coefficients with a sparse representation. What is the overall asymptotic running time of the compression algorithm for a W ×H image? For simplicity you can assume that W and H are both divisible by 8, it does not impact the runtime complexity. (c) It turns out there is a more efficient way to compute the 2 dimensional discrete cosine transform based on the recursive fast Fourier transform algorithm. Instead of taking O(NM(N+M)) time for an N×M matrix as in part (a), this faster method takes O(NM(log(N) + log(M))) time. About what factor speedup would you expect this to yield for the overall time it takes to perform compression on a W ×H image, the process described in part (b)? For simplicity, assume that the runtime of the first method is exactly c × NM(N + M) for some constant c, and the runtime of the second method is exactly c×NM(log2(N) + log2(M)) for the same constant c. Problem 3 (Implementing Wavelet Compression, 28 points). In this problem, you will implement the basic compression scheme we have discussed (except without run-length encoding, which we will discuss in the next problem). Note that in addition to standard arrays in Java and lists in Python, you are welcome to use the Python Numpy library (the most prolific library for numerical computing in Python) if you prefer to represent matrices as two-dimensional Numpy arrays and use vectorized calculations to implement the transform. You will need to scan over every 8 × 8 pixel block of an image (represented as a matrix / two-dimensional array / nested list of integer intensity values between 0 and 255) and compute the following functions (as always, feel free to implement additional helper functions as necessary for organizing your code). • DCT(). Should take as input a pixel block and return the matrix (of the same dimension as the input) of two-dimensional discrete cosine transform coefficients (floating point numbers in general). You should implement the O(NM(N + M)) algorithm from problem 2 part (a). You may find it helpful to separately implement the one dimensional discrete cosine transform as a helper function. For example, the two-dimensional DCT of the 2 × 2 block 1 2 3 4 should be 2 10.0 -1.414 -2.828 0.0 up to numerical error (e.g., instead of 0.0 you may see something extremely small, smaller than 1×10−10 due to numerical rounding error). To further your implementation, you can also optionally implement the inverse DCT and confirm that for a given 8 × 8 block of transform coefficients you can recover the original values. Note that the inverse DCT for a single dimension is given by evaluating the function f() defined in Problem 1 at the appropriate index values, and the 2 dimensional inverse can be computed by applying the one dimensional inverse on all of the rows and then on all of the columns. • Quantize(). Should take as input an integer quality factor and an 8 × 8 matrix of discrete cosine transform coefficients (floating point numbers in general) and return a quantized 8 × 8 matrix of integer values. To perform quantization, divide element-wise by the quantization matrix provided in the “quant.csv” file. Further divide each element by the quality factor and then cast each resulting element as an integer. Problem 4 (Testing Wavelet Compression and Run Length Encoding, 20 points). HW2 contains a file “parrot.csv”1 that contains the matrix representation of a grayscale image of a parrot that is 200 by 144 pixels. In particular, row i of the csv file contains the pixel values for row i of the image, with the intensity values for columns separated by commas. (a) Apply wavelet compression as described in Problem 3 for three different values of the quality factor used for quantization: 1, 2, and 4 (note that we divide by the quality factor during quantization, so lower numbers mean higher quality and higher numbers imply greater compression). For each, report the frac- tion of the total quantized coefficients that are nonzero. Recall that one sparse matrix representation we discussed is to store three lists: one for values, one for row indices, and one for column indices. Assuming for simplicity that each individual number stored in these lists is one byte, report the compression ratio (number of bytes to store the original uncompressed image divided by the number of bytes needed to store the sparse compressed coefficients) for each of the results. To validate your result, you are encouraged (though not required) to attempt recovering the image by applying the inverse DCT block-wise. You may also need to re-multiply element-wise by the quantization matrix and quality factor in order to get the values back on the original scale. You should get something similar to ‘parrot.csv”, particularly if you use a quality factor of 1. You can also view the recovered image to inspect visually whether it resembles the original image and how the quality factor affects the visual quality of the image. If you increase the quality factor dramatically (to, say, 32) you should eventually see the distinct 8 by 8 blocks distorting the recovered image. (b) In class, we discussed an alternative method of representing a sparse matrix based on run-length encoding. We flatten the matrix into one long vector (consisting of the first row, then the second row, and so on). Then, rather than storing each value separately, we store (value, repetition) pairs. For example, (3, 2) (0, 4), (5, 1) would code for 3 3 0 0 0 0 5. Compute the run-length encoding for each of the three coefficient matrices you generated in part (a) (corresponding to the three different quality factors). For each, report the total number of pairs. Assuming for simplicity that a given pair requires 1.5 bytes to store on average2, report the compression ratio (number of bytes to store the original uncompressed image divided by the number of bytes needed to store the sparse compressed coefficients) for each of the results. 1Original image retrieved from 2This is a rough but reasonable estimate; typically Huffman compression is applied to efficiently store the repetition numbers in the pairs using less than a byte on average, since small numbers like 1 and 2 tend to be very common