COMP-381 Introduction to Machine Learning
1. Image Compression: Convert a photo of yours to .png format. Then load it with Python and compute its SVD
compression as explained in the lecture notes and in class. Choose the number of singular values to keep (the truncation threshold) and
explain why you chose this value below. How much compression have you achieved?
Answer this question below and hand in all code and your image before and after compression.
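A minimal sketch of the truncated-SVD step in Python, assuming NumPy and Pillow are available; the file name photo.png and the rank k = 20 are placeholders, not a recommended truncation choice:

```python
import numpy as np
from PIL import Image

# Load the image and convert it to a grayscale float matrix (hypothetical file name).
A = np.asarray(Image.open("photo.png").convert("L"), dtype=float)

# Full (thin) SVD: A = U @ diag(s) @ Vt, with singular values s in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the top-k singular values (k is a placeholder truncation threshold).
k = 20
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Storage needed for the truncated factors versus the original matrix.
m, n = A.shape
compression_ratio = (k * (m + n + 1)) / (m * n)
print(f"rank-{k} approximation stores {compression_ratio:.1%} of the original entries")

# Save the reconstructed image, clipping to valid pixel values.
Image.fromarray(np.clip(A_k, 0, 255).astype(np.uint8)).save("photo_compressed.png")
```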
2. PCA: As explained in class in the context of digit images (the images of 2’s), implement a PCA projection
of about 10 or 20 data instances of your choosing (images, text documents,
objects in an Excel document, Wikipedia entities, etc.) to a 2D PCA layout (like the ones for countries, hand
gestures and numbers in the lectures). If the data instances are images, transform them to be of the same size,
say 40 by 40 pixels, and grayscale. For help on generating 2D PCA layouts, see the posting of Bobak Shahriari on “annotating scatter plots with images”
in the Google group or visit. Hand in all the code and your 2D PCA layout. For example: for a hotel browsing app, I would select 10 Vancouver
hotels on TripAdvisor. As the attributes for each hotel, I would use the 5 traveler ratings (excellent, very good, average, poor, terrible),
the TripAdvisor rank and the cheapest room price. I would then form a matrix of 10 rows by 7 columns, and use the SVD to compute
the 2D PCA components $U_2 \Sigma_2$. Finally, I would do a scatter plot of the hotels and, at the location of the hotel point in 2D,
I would insert either the name of the hotel or the image of the hotel. Be creative and enjoy the exercise!
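A minimal sketch of the 2D PCA layout computed via the SVD, assuming the instances are already collected in a NumPy matrix X with one row per instance; the random data and the instance names are placeholders, and you would swap in matplotlib's OffsetImage/AnnotationBbox if you want images instead of text labels:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data matrix: one row per instance (e.g. 10 hotels x 7 attributes).
X = np.random.rand(10, 7)
names = [f"hotel {i}" for i in range(10)]  # hypothetical labels

# Center the columns, then take the thin SVD of the centered matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# 2D PCA coordinates U_2 Sigma_2: first two left singular vectors scaled by singular values.
Z = U[:, :2] * s[:2]

# Scatter plot with a label at each instance's 2D location.
fig, ax = plt.subplots()
ax.scatter(Z[:, 0], Z[:, 1])
for (x, y), name in zip(Z, names):
    ax.annotate(name, (x, y))
ax.set_xlabel("PC 1")
ax.set_ylabel("PC 2")
plt.savefig("pca_layout.png")
```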
3. Learning Bayesian networks: For the Fritz network that was discussed in the lecture notes, the joint distribution of the i-th observation is
$P(T_i, M_i, S_i, F_i \mid \theta, \alpha, \gamma_{1:2}, \beta_{1:4}) = P(T_i \mid S_i, F_i, \beta_{1:4})\, P(F_i \mid M_i, \gamma_{1:2})\, P(M_i \mid \theta)\, P(S_i \mid \alpha),$
where each distribution is Bernoulli:
$P(M_i \mid \theta) = \theta^{I(M_i=1)} (1-\theta)^{I(M_i=0)}$
$P(S_i \mid \alpha) = \alpha^{I(S_i=1)} (1-\alpha)^{I(S_i=0)}$
$P(F_i \mid M_i=0, \gamma_1) = \gamma_1^{I(F_i=1 \mid M_i=0)} (1-\gamma_1)^{I(F_i=0 \mid M_i=0)}$
$P(F_i \mid M_i=1, \gamma_2) = \gamma_2^{I(F_i=1 \mid M_i=1)} (1-\gamma_2)^{I(F_i=0 \mid M_i=1)}$
$P(T_i \mid S_i=0, F_i=0, \beta_1) = \beta_1^{I(T_i=1 \mid S_i=0, F_i=0)} (1-\beta_1)^{I(T_i=0 \mid S_i=0, F_i=0)}$
$P(T_i \mid S_i=0, F_i=1, \beta_2) = \beta_2^{I(T_i=1 \mid S_i=0, F_i=1)} (1-\beta_2)^{I(T_i=0 \mid S_i=0, F_i=1)}$
$P(T_i \mid S_i=1, F_i=0, \beta_3) = \beta_3^{I(T_i=1 \mid S_i=1, F_i=0)} (1-\beta_3)^{I(T_i=0 \mid S_i=1, F_i=0)}$
$P(T_i \mid S_i=1, F_i=1, \beta_4) = \beta_4^{I(T_i=1 \mid S_i=1, F_i=1)} (1-\beta_4)^{I(T_i=0 \mid S_i=1, F_i=1)}$
(a) Derive the ML estimates of $\theta, \alpha, \gamma_{1:2}, \beta_{1:4}$ for the dataset in the slides. Hint: we did some of these estimates already in class.
(b) Derive the posterior mean estimates of $\theta, \alpha, \gamma_{1:2}, \beta_{1:4}$ assuming that we use the same Beta(1, 1) prior for each of the parameters. Start by writing each of the 8 posterior distributions. For example,
$P(\theta \mid M_{1:5}) \propto \prod_{i=1}^{5} P(M_i \mid \theta)\, p(\theta) \propto \theta^{4}(1-\theta)^{1}\, \theta^{1-1}(1-\theta)^{1-1} = \theta^{5-1}(1-\theta)^{2-1}$
and, consequently, the posterior mean estimate of $\theta$ is $E(\theta \mid M_{1:5}) = \frac{5}{5+2} = 5/7$.
(c)
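For parts (a) and (b), each estimate reduces to counting how often the relevant variable equals 1 in the relevant subset of observations. A minimal sketch of those count formulas in Python; the counts below are placeholders, since only the M counts (4 ones out of 5) are implied by the example above, and the remaining counts must be read off the dataset in the slides:

```python
def ml_estimate(n_ones, n_total):
    # Maximum likelihood estimate of a Bernoulli parameter: fraction of ones.
    return n_ones / n_total

def posterior_mean(n_ones, n_total, a=1, b=1):
    # Posterior mean under a Beta(a, b) prior; Beta(1, 1) is the uniform prior.
    return (n_ones + a) / (n_total + a + b)

# Reproducing the theta calculation from part (b): 4 ones out of 5 observations of M.
print(ml_estimate(4, 5))      # 0.8, the ML estimate
print(posterior_mean(4, 5))   # 0.714... = 5/7, the posterior mean with a Beta(1, 1) prior
```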