Computational Statistics & Data Science
MAST90083 Computational Statistics & Data Science
Reading time: 30 minutes — Writing time: 3 hours — Upload time: 30 minutes
This exam consists of 4 pages (including this page)
Permitted Materials
This exam and/or an offline electronic PDF reader, blank loose-leaf paper and a non-programmable calculator.
No books or other material are allowed. Only one double-sided A4 page of notes (handwritten or printed) is allowed.
Instructions to Students
If you have a printer, print the exam. If using an electronic PDF reader to read the
exam, it must be disconnected from the internet. Its screen must be visible in Zoom. No
mathematical or other software on the device may be used. No file other than the exam
paper may be viewed.
Ask the supervisor if you want to use the device running Zoom.
Writing
There are 6 questions with marks as shown. The total number of marks available is 55.
Write your answers on A4 paper. Page 1 should only have your student number, the
subject code and the subject name. Write on one side of each sheet only. Each question
should be on a new page. The question number must be written at the top of each page.
Scanning
Put the pages in question order and all the same way up. Use a scanning app to scan all
pages to PDF. Scan directly from above. Crop pages to A4. Make sure that you upload
the correct PDF file and that your PDF file is readable.
Submitting
You must submit while in the Zoom room. No submissions will be accepted after
you have left the Zoom room.
Go to the Gradescope window. Choose the Canvas assignment for this exam. Submit your file. Wait for the Gradescope email confirming your submission. Tell your supervisor when you have received it.
Question 1 (10 marks)
Given the model
$y = X\beta + \epsilon$
where $y \in \mathbb{R}^n$, $X \in \mathbb{R}^{n \times p}$ is full rank $p$ and $\epsilon \in \mathbb{R}^n \sim N(0, \sigma^2 I_n)$. Let $X = (x_1, \ldots, x_p)$ be the column representation of $X$, where we further assume that the columns are mutually orthogonal.
(a) Derive the expression of $\hat{\beta}_j$, the $j$th component of the least squares estimate of $\beta$, as a function of $x_j$.
(b) Is the least squares estimate of $\beta_j$ modified if any of the other components $\beta_l$ ($l \neq j$) is forced to zero?
(c) Provide the expression of the residual sum of squares and discuss its change when a component is set to zero, $\beta_j = 0$.
Assume now that, instead of having orthogonal columns, the $x_{ij}$ are standardized so that for $j = 1, \ldots, p$
$$\sum_{i=1}^{n} x_{ij} = 0 \quad \text{and} \quad \sum_{i=1}^{n} x_{ij}^2 = c$$
(d) Derive the expression of the covariance of the least squares estimator of $\beta$.
(e) Derive the expression of $\sum_{j=1}^{p} \mathrm{var}(\hat{\beta}_j)$ as a function of $\sigma^2$ and $\lambda_j$, $j = 1, \ldots, p$, the eigenvalues of $C = X^\top X$.
(f) Use these results to show that $\sum_{j=1}^{p} \mathrm{var}(\hat{\beta}_j)$ is minimized when $X$ is orthogonal.
Note: For parts (a)-(c), the columns of $X$ are assumed orthogonal and not orthonormal; i.e., $x_i^\top x_j = 0$ for $i \neq j$ but $x_i^\top x_i = \|x_i\|^2 \neq 1$.
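As a quick numerical illustration of the orthogonal-columns setting in parts (a)-(b), the sketch below (hypothetical simulated data, NumPy assumed) checks that each least-squares component depends only on its own column, so it is unchanged if the other components are dropped:

```python
# Minimal sketch: with mutually orthogonal columns, each least-squares
# component depends only on its own column. Data are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 3
Q, _ = np.linalg.qr(rng.normal(size=(n, p)))   # orthonormal columns
X = Q * np.array([2.0, 3.0, 5.0])              # rescaled: orthogonal, not orthonormal
beta = np.array([1.0, -2.0, 0.5])
y = X @ beta + rng.normal(scale=0.1, size=n)

# Full least-squares fit versus the per-column formula x_j'y / ||x_j||^2
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
beta_onebyone = X.T @ y / np.sum(X**2, axis=0)
print(np.allclose(beta_full, beta_onebyone))   # True: the components decouple
```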
Question 2 (9 marks)
Consider a positive sample $x_1, \ldots, x_n$ from an exponential distribution
$$f(x \mid \theta) = \theta e^{-\theta x}, \quad x \geq 0, \ \theta > 0.$$
Suppose we have observed $x_1 = y_1, \ldots, x_m = y_m$ and $x_{m+1} > c, \ldots, x_n > c$, where $m$ is given, $m < n$, and $y_1, \ldots, y_m$ are given numerical values. This implies that $x_1, \ldots, x_m$ are completely observed whereas $x_{m+1}, \ldots, x_n$ are partially observed in that they are right-censored. We want to use an EM algorithm to find the MLE of $\theta$.
(a) Find the complete-data log-likelihood function $\ell(\theta) = \log L(\theta)$.
(b) In the E-step, we calculate
$$Q(\theta, \theta^{(k)}) = E\!\left[\ln L(\theta) \mid x_1 = y_1, \ldots, x_m = y_m, x_{m+1} > c, \ldots, x_n > c; \theta^{(k)}\right]$$
where $\theta^{(k)}$ is the current estimate of $\theta$. Show that
$$Q(\theta, \theta^{(k)}) = n \log \theta - \theta \left[ \sum_{i=1}^{m} y_i + (n-m)\left(\frac{c\theta^{(k)} + 1}{\theta^{(k)}}\right) e^{-\theta^{(k)} c} \right]$$
(c) In the M-step, we maximise $Q(\theta, \theta^{(k)})$ with respect to $\theta$ to find an update $\theta^{(k+1)}$ from $\theta^{(k)}$. Show that
$$\theta^{(k+1)} = n \left[ \sum_{i=1}^{m} y_i + (n-m)\left(\frac{c\theta^{(k)} + 1}{\theta^{(k)}}\right) e^{-\theta^{(k)} c} \right]^{-1}$$
(d) Suppose the sequence $\{\theta^{(k)}; k = 1, 2, \ldots\}$ converges to the MLE $\hat{\theta}$ as $k \to \infty$. Establish the equation allowing the derivation of $\hat{\theta}$.
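The update in part (c) can be iterated directly. The sketch below (hypothetical observed values and censoring point, NumPy assumed) runs the stated recursion until it stabilises; its fixed point is the equation asked for in part (d):

```python
# Minimal sketch of the EM update stated in part (c), on hypothetical data.
import numpy as np

def em_exponential(y, n, c, theta0=1.0, tol=1e-10, max_iter=500):
    """Iterate theta_new = n / (sum(y) + (n-m)*((c*theta+1)/theta)*exp(-theta*c))."""
    m, theta = len(y), theta0
    for _ in range(max_iter):
        denom = np.sum(y) + (n - m) * ((c * theta + 1.0) / theta) * np.exp(-theta * c)
        theta_new = n / denom
        if abs(theta_new - theta) < tol:
            break
        theta = theta_new
    return theta

# Hypothetical example: m = 5 fully observed values, n = 8, censoring point c = 2
y_obs = np.array([0.3, 1.1, 0.7, 1.6, 0.9])
print(em_exponential(y_obs, n=8, c=2.0))
```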
Question 3 (9 marks)
Consider scatterplot data $(x_i, y_i)$, $1 \leq i \leq n$, such that
$$y_i = f(x_i) + \epsilon_i$$
where $y_i \in \mathbb{R}$, $x_i \in \mathbb{R}$, $\epsilon_i \in \mathbb{R} \sim N(0, \sigma^2)$ and are i.i.d. The function
$$f(x) = E(y \mid x)$$
characterizing the underlying trend in the data is some unspecified smooth function that needs to be estimated from $(x_i, y_i)$, $1 \leq i \leq n$. For approximating $f$ we propose to use a quadratic spline basis with truncated quadratic functions $1, x, x^2, (x - k_1)_+^2, \ldots, (x - k_K)_+^2$.
(a) Provide the quadratic spline model for $f$ and define the set of unknown parameters that need to be estimated.
(b) Derive the matrix form of the model and the associated penalized spline fitting criterion.
(c) Derive the expression for the penalized least squares estimator for the unknown parameters of the model and the associated expression for the best fitted values.
(d) Find the degrees of freedom of the fit (effective number of parameters) obtained with the proposed model and its extreme or limit values when the regularization parameter varies from $0$ to $+\infty$.
(e) Find the optimism of the fit and its relation with the degrees of freedom.
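For parts (a) to (d), a minimal numerical sketch (hypothetical simulated data, knots and smoothing parameter, NumPy assumed): it builds the truncated quadratic basis, computes a ridge-type penalized least squares fit, and reports the trace of the smoother matrix as the effective degrees of freedom.

```python
# Minimal sketch: truncated quadratic basis 1, x, x^2, (x-k_1)_+^2, ..., (x-k_K)_+^2
# with a ridge-type penalty on the truncated-power coefficients. Data are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n, K, lam = 100, 10, 1.0
x = np.sort(rng.uniform(0, 1, n))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)
knots = np.linspace(0, 1, K + 2)[1:-1]

# Design matrix C = [1, x, x^2, (x-k_1)_+^2, ..., (x-k_K)_+^2]
C = np.column_stack([np.ones(n), x, x**2] +
                    [np.clip(x - k, 0, None)**2 for k in knots])

# Penalty matrix D acts only on the truncated-power coefficients
D = np.diag([0.0] * 3 + [1.0] * K)

# Penalized least squares estimate and smoother matrix S
theta_hat = np.linalg.solve(C.T @ C + lam * D, C.T @ y)
S = C @ np.linalg.solve(C.T @ C + lam * D, C.T)
print("effective degrees of freedom =", np.trace(S))
```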
Question 4 (10 marks)
Let $Y = (y_1, \ldots, y_n)$ be a set of $n$ vector observations of dimension $q$ such that $y_i = (y_{1i}, \ldots, y_{qi})^\top \in \mathbb{R}^q$. For modeling these observations we propose to use the parametric model given by
$$y_i = \Phi_1 y_{i-1} + \Phi_2 y_{i-2} + \cdots + \Phi_p y_{i-p} + \epsilon_i$$
where the $\epsilon_i$ are independent identically distributed normal random variables with mean vector zero and $q \times q$ variance-covariance matrix $\Sigma$ modeling the approximation errors, and the $\Phi_j$, $j = 1, \ldots, p$, are $q \times q$ coefficient or parameter matrices.
(a) How many vector observations need to be lost to work with this model? And what is the effective number of observations?
(b) Provide a linear matrix form for the model where the parameters are represented in a $(pq) \times q$ matrix form $\Phi = [\Phi_1, \ldots, \Phi_p]^\top$, derive the least squares estimator of $\Phi$ and the maximum likelihood estimate of $\Sigma$.
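For part (b), a minimal sketch (hypothetical simulated data, NumPy assumed) of one standard way to set up the computation: the first $p$ vector observations are lost to form the lagged regressors, after which $\Phi$ is estimated by least squares and $\Sigma$ from the residuals.

```python
# Minimal sketch of the stacked form: each y_i is regressed on its p lagged
# vectors, so the first p observations are lost. Data are hypothetical noise.
import numpy as np

rng = np.random.default_rng(2)
n, q, p = 200, 2, 2
Y = rng.normal(size=(n, q))            # one q-vector observation per row

# Row i of Z holds (y_{i-1}', ..., y_{i-p}') for i = p, ..., n-1
Z = np.column_stack([Y[p - j:n - j] for j in range(1, p + 1)])   # (n-p) x (pq)
Y_eff = Y[p:]                                                    # effective observations

# Least squares estimator of Phi ((pq) x q) and residual-based estimate of Sigma (q x q)
Phi_hat = np.linalg.solve(Z.T @ Z, Z.T @ Y_eff)
E = Y_eff - Z @ Phi_hat
Sigma_hat = E.T @ E / (n - p)
print(Phi_hat.shape, Sigma_hat.shape)   # (pq, q) and (q, q)
```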