ECA 5304 Homework 1
(Due by Friday February 18th 11pm to LumiNUS Homework 1 submission folder)
Instructions: If working in groups, submit one copy per group and indicate clearly the names of the
collaborators. You should use R to answer the computational questions. The submission format will be
as follows: 1) Merge all your handwritten works and typed up answers/report into a single PDF file;
2) Append your R code at the end of the same PDF file; 3) Name your file as your NUS recorded
name. E.g., if I were a student registered under "Tkachenko, Denis", my filename would be "Tkachenko
Denis.pdf". Therefore, you will submit ONE pdf file per student or group that contains all the
answers and the code appended at the end.
Verify that your code runs seamlessly as a whole and includes commands to load all the necessary
libraries. Where randomness is involved, remember to set the seed(s) of the random number
generator for replicability, and verify that your code produces the same answers when run
several times. Answers to computational questions should be formatted as a report: type/write up
your answers and supplement them with graphs, tables, and numbers as necessary. Do not just comment
answers between the lines of the R script (and do not use R Markdown either – it looks ugly and
makes the answers hard to follow), and do not screenshot the whole output when you only need one
or two numbers from it. Finally, read the hints carefully, and good luck!
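For example, a minimal seeding sketch (the seed value 123 is an arbitrary choice):

    set.seed(123)   # fix the RNG state so reruns are identical
    rnorm(3)        # produces the same three draws every time the script is run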
Question 1 (In-sample vs. out-of-sample MSE)
In this question you will establish an important and generally valid result in the simplest
possible setting. Consider data generated by the following constant-plus-noise process:
y_i = μ + ε_i,   ε_i ~ iid(0, σ²)
You plan to estimate the equation by OLS. Suppose you have a training dataset
y = (y_1, y_2, ..., y_N) and a test dataset y′ = (y′_1, y′_2, ..., y′_N) of the same size,
generated by the same process.
Hints: If you forgot, re-derive the OLS estimator for the model with only a constant. Also,
recall the key properties of OLS estimators – these will be useful in working out the answers.
1) Derive the in-sample and out-of-sample mean squared error (MSE) expressions.
2) Using the results in (1), argue that the in-sample MSE is always going to be less than or
equal to the out-of-sample MSE.
3) Explain what determines the difference between the two MSEs. Do you think this result
can be valid more generally? Discuss the significance and any potential usefulness of this
result.
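If you want to sanity-check your derivations numerically, a small Monte Carlo sketch along these
lines may help (all parameter values here – mu = 2, sigma = 1, N = 50, R = 5000 replications – are
arbitrary illustrative choices):

    set.seed(1)
    mu <- 2; sigma <- 1; N <- 50; R <- 5000        # illustrative values only
    mse_in <- mse_out <- numeric(R)
    for (r in 1:R) {
      y      <- mu + rnorm(N, 0, sigma)            # training sample
      y_test <- mu + rnorm(N, 0, sigma)            # test sample from the same process
      muhat  <- mean(y)                            # OLS estimate in a constant-only model
      mse_in[r]  <- mean((y - muhat)^2)            # in-sample MSE
      mse_out[r] <- mean((y_test - muhat)^2)       # out-of-sample MSE
    }
    c(in_sample = mean(mse_in), out_of_sample = mean(mse_out))

On average the in-sample MSE should come out below the out-of-sample MSE, consistent with the
result you are asked to establish analytically.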
Question 2 (Some intuition on ridge regression)
In this question you will establish an important property of ridge estimates in the simplest
possible setting. Suppose that N = 2, P = 2, and the design matrix X is such that
x_11 = x_12 = x_1 and x_21 = x_22 = x_2 (the indices refer to the row/column positions of the
elements of the X matrix).
Furthermore, assume that the variables are demeaned, so there is no intercept included in the
model and hence there is no constant in the design matrix.
1) Can you estimate the parameters using OLS in this setting? Explain.
2) State the ridge regression optimization problem in this setting.
3) Solve the problem in (2) and argue that β_1 and β_2 obtained from ridge estimation for a
given lambda will be equal in this setting. (Hint: you do not need to solve the problem
fully – derive the F.O.C.s and see whether there is something you can note that gives
away the answer. You also do not need to use matrix algebra here.)
4) Without using any derivation, what would you intuitively expect to happen to β_1 and β_2
in this setting if instead of ridge you used the LASSO penalty?
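As an optional numerical check of part (3) – a sketch only, not the requested derivation; the data
values and lambda below are arbitrary – the ridge solution can be computed directly from its
closed form (X'X + λI)⁻¹ X'y:

    x <- c(1, -1)                     # demeaned predictor: x_11 = x_12 = 1, x_21 = x_22 = -1
    X <- cbind(x, x)                  # two identical columns, as in the question
    y <- c(0.5, -0.7)                 # demeaned response (arbitrary values)
    lambda <- 0.1
    # t(X) %*% X is singular here, so plain OLS breaks down;
    # adding lambda * diag(2) restores invertibility
    beta_ridge <- solve(t(X) %*% X + lambda * diag(2), t(X) %*% y)
    beta_ridge                        # the two coefficients coincide

Rerunning this for any positive lambda gives β_1 = β_2, which is the property part (3) asks you
to prove; it also hints at the answer to part (1).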
Question 3 (Revisit the Boston housing data with new tools)
1) Load the ‘Boston’ dataset from the MASS package. Use the ‘?Boston’ command to
retrieve the description of the dataset and the variables – discuss each variable and provide
economic intuition on why it may be a useful predictor of the median house value (medv).
Explain which variables you intuitively expect to be the most important predictors of medv
(do not run any quantitative analysis yet). Randomly split the dataset into a training set of
400 observations and a test set of 106 observations.
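A minimal sketch of the load-and-split step (the seed value is an arbitrary choice; any fixed
seed gives a reproducible split):

    library(MASS)                            # contains the Boston dataset
    ?Boston                                  # open the variable descriptions
    set.seed(5304)                           # arbitrary seed for a reproducible split
    train_id <- sample(nrow(Boston), 400)    # indices of the 400 training observations
    train <- Boston[train_id, ]
    test  <- Boston[-train_id, ]             # the remaining 106 observations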