ECON7310: Elements of Econometrics
Problem set
Instructions
Answer all questions in a format similar to the answers to your tutorial questions. When
you use R to conduct empirical analysis, you should show your R script(s) and outputs (e.g.,
screenshots for commands, tables, and figures, etc.). You will lose 2 points whenever you fail to
provide R commands and outputs. When you are asked to explain or discuss something, your
response should be brief and compact. To facilitate tutors’ grading work, please clearly label
all your answers. You should upload your research report (in PDF or Word format) via the
“Turnitin” submission link (in the “Research Project 3” folder under “Assessment”) by 10:00
AM on the due date June 16, 2022. Do not hand in a hard copy. You are allowed to work on this
assignment in groups; that is, you can discuss how to answer these questions with your group
members. However, this is not a group assignment, which means that you must answer all the
questions in your own words and submit your report separately. The marking system will check
the similarity, and UQ’s student integrity and misconduct policies on plagiarism apply.
A. OLS and 2SLS (50 points)
You are to analyze the relationship between two variables, x1 (explanatory variable) and y
(dependent variable) in this problem. You run the following four regressions with a random
sample of 2000 observations and obtain corresponding estimated coefficients:
(1) OLS1 = lm(logy ~ x1, data = RV)
$\widehat{\log(y)} = \underset{(0.056)}{5.782} + \underset{(0.004)}{0.040} \times x_1, \quad R^2 = 0.047$
(2) OLS2 = lm(logy ~ x1 + x2 + x2sq, data = RV)
$\widehat{\log(y)} = \underset{(0.093)}{4.590} + \underset{(0.005)}{0.085} \times x_1 + \underset{(0.010)}{0.092} \times x_2 - \underset{(0.001)}{0.003} \times x_2^2, \quad R^2 = 0.156$
(3) TSLS1 = ivreg(logy ~ x1 + x2 + x2sq | z1 + x2 + x2sq, data = RV)
$\widehat{\log(y)} = \underset{(1.019)}{1.531} + \underset{(0.057)}{0.256} \times x_1 + \underset{(0.037)}{0.196} \times x_2 - \underset{(0.001)}{0.005} \times x_2^2$
(4) TSLS2 = ivreg(logy ~ x1 + x2 + x2sq | z1 + z2 + x2 + x2sq, data = RV)
$\widehat{\log(y)} = \underset{(1.050)}{0.976} + \underset{(0.058)}{0.287} \times x_1 + \underset{(0.038)}{0.215} \times x_2 - \underset{(0.001)}{0.005} \times x_2^2$
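The ivreg calls in regressions (3) and (4) perform two-stage least squares. As a rough illustration of what that involves, the sketch below runs the two stages by hand with lm on simulated data. The data-generating process, the coefficient values, and the name x1hat are assumptions made for this example; only the names logy, x1, x2, x2sq, z1, and RV come from the problem set. The standard errors that lm reports for the second stage are not valid 2SLS standard errors, so the sketch is meant for intuition about the point estimate only.

```r
# Simulated data only: names mirror the problem set, all values are invented.
set.seed(1)
n  <- 2000
z1 <- rnorm(n)                 # instrument, generated independently of e
x2 <- runif(n, 0, 20)          # exogenous control
e  <- rnorm(n)                 # structural error; also enters x1, making x1 endogenous
x1 <- 1 + 0.5 * z1 + 0.1 * x2 + 0.8 * e + rnorm(n)
logy <- 1 + 0.3 * x1 + 0.2 * x2 - 0.005 * x2^2 + e
RV <- data.frame(logy, x1, x2, x2sq = x2^2, z1)

# OLS: inconsistent here because x1 is correlated with e.
ols <- lm(logy ~ x1 + x2 + x2sq, data = RV)

# Stage 1: regress the endogenous regressor on the instrument and the controls.
stage1 <- lm(x1 ~ z1 + x2 + x2sq, data = RV)
RV$x1hat <- fitted(stage1)     # x1hat is a hypothetical helper column

# Stage 2: replace x1 with its first-stage fitted values.
stage2 <- lm(logy ~ x1hat + x2 + x2sq, data = RV)

b_ols  <- unname(coef(ols)["x1"])
b_2sls <- unname(coef(stage2)["x1hat"])
print(c(OLS = b_ols, TSLS = b_2sls))   # the simulated true coefficient on x1 is 0.3
```

In practice a dedicated routine such as ivreg is preferable, since it also computes appropriate standard errors.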
(a) (5 points) Using Regression (1), estimate the effect on y (in terms of percentage change) of an
extra unit of x1 and interpret the result.
(b) (5 points) What are the assumptions behind your interpretation?
(c) (5 points) You think x2 and x2sq should also be included in the regression. You want
to convince yourself with a joint hypothesis test. State your null hypothesis and calculate
the corresponding test statistic using the information provided in the above regressions.
(d) (5 points) A consultant to your project is worried that the OLS estimator is problematic
as x1 is very likely to be endogenous. If this is true, which assumption of the linear regression
model is violated? What is wrong with using OLS if x1 is indeed endogenous?
(e) (5 points) The consultant suggests that you should adopt 2SLS rather than OLS. In particular,
he proposes two instrumental variables, z1 and z2, for the endogenous variable
x1. What conditions must hold for z1 and z2 to be valid instrumental variables?
(f) (5 points) Assuming now that both z1 and z2 are valid instruments, which estimate from
which of the above regressions do you think is the best estimate of this effect? Be sure to
give the regression number and the particular coefficient, and briefly justify your choice.
(g) (5 points) Describe how to assess the strength of the instruments used in Regression (4).
(h) (5 points) What is the first least squares assumption for prediction?
(i) (5 points) What is the trade-off in increasing the power of x2?
(j) (5 points) What is the difference between forecasts and predicted values from OLS?
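For a joint test such as the one in (c), where only the R2 of the restricted and unrestricted regressions is reported, one common route is the homoskedasticity-only F-statistic. The sketch below codes that formula as a small function. The function name f_from_r2 and the numbers passed to it are hypothetical and are not taken from regressions (1) to (4); the formula is only appropriate if the errors are homoskedastic.

```r
# Homoskedasticity-only F-statistic computed from the R^2 of two nested regressions.
#   r2_u: R^2 of the unrestricted regression
#   r2_r: R^2 of the restricted regression
#   q:    number of restrictions being tested
#   n:    sample size
#   k_u:  number of regressors (excluding the intercept) in the unrestricted regression
f_from_r2 <- function(r2_u, r2_r, q, n, k_u) {
  ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k_u - 1))
}

# Hypothetical inputs, chosen only to show the call.
f_stat <- f_from_r2(r2_u = 0.30, r2_r = 0.25, q = 2, n = 500, k_u = 4)

# p-value from the F(q, n - k_u - 1) distribution.
p_val <- pf(f_stat, df1 = 2, df2 = 500 - 4 - 1, lower.tail = FALSE)
print(c(F = f_stat, p = p_val))
```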
B. Time Series Regression (50 points)
One version of the permanent income hypothesis (PIH) of consumption is that the growth in
consumption is unpredictable. Let $gc_t = \log(c_t) - \log(c_{t-1})$ be the growth in real per capita
consumption (of non-durable goods and services). Then the PIH implies that $E[gc_t \mid I_{t-1}] =
E[gc_t]$, where $I_{t-1}$ denotes information known at time $t-1$ (e.g., $gc_1, \ldots, gc_{t-1}$); in this case, $t$
denotes a year. Use the data in CONSUMP.csv to answer the questions below.
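As a small illustration of the definition above, the following sketch builds a growth series from consumption levels and aligns it with its own first lag. The numbers and the names c_level and lagged are made up for the example; the column names in CONSUMP.csv are not shown here, and if the file already contains the growth series this step is unnecessary.

```r
# Hypothetical annual consumption levels (not from CONSUMP.csv).
c_level <- c(100, 102, 105, 104, 108, 111)

# gc_t = log(c_t) - log(c_{t-1}); the first year is lost when differencing.
gc <- diff(log(c_level))

# Pair each gc_t with gc_{t-1}; one more observation is lost to the lag.
n_gc <- length(gc)
lagged <- data.frame(gc = gc[2:n_gc], gc_1 = gc[1:(n_gc - 1)])
print(lagged)
```

Each row of lagged holds a year's growth rate next to the previous year's, which is the layout a regression of growth on its own lag would need.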
(a) (5 points) Compute the first five autocorrelations of $gc_t$.
(b) (8 points) Test the PIH by estimating $gc_t = \beta_0 + \beta_1 gc_{t-1} + u_t$ (2 points).¹ Clearly state
the null and alternative hypotheses (4 points). What do you conclude (2 points)?
(c) (7 points) Estimate AR(p) models for p = 1, ..., 5 and report regression results (5 points).
What lag length is chosen by the BIC (1 point)? What lag length is chosen by the AIC
(1 point)?
(d) (12 points) Add variables $gy_{t-1}$, $i3_{t-1}$, and $inf_{t-1}$ to the AR model you chose in (c) by
BIC.² Report the new regression results (4 points). Are these new variables individually
or jointly significant at the 5% level (8 points)?
(e) (6 points) For the regression in (d), what happens to the p-value for the t-statistic on
$gc_{t-1}$ (2 points)? Does this mean the PIH is now supported by the data (1
point)? Explain your answer (3 points).
(f) (7 points) For the regression in (d), what is the F-statistic and its associated p-value
for joint significance of the four explanatory variables (3 points)? Does your conclusion
about the PIH now agree with what you found in (b) (1 point)? Explain your answer (3
points).
(g) (5 points) Explain the meaning of stationarity (3 points). Do we need to
worry about it in this question (2 points)?
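For the lag-length comparison in (c), one possible workflow is to estimate every AR(p) on the same observations and then compare information criteria. The sketch below does this on a simulated series standing in for the consumption growth data; the series g, the AR coefficient 0.5, and the names lags, ic, p_aic, and p_bic are assumptions for the example. R's AIC and BIC functions are likelihood-based, so their values differ from textbook formulas written in terms of ln(SSR/T), but on a common sample the ranking across p should be the same.

```r
set.seed(42)
# Simulated AR(1) series used as a stand-in; not the CONSUMP.csv data.
g <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))

max_p <- 5
# embed() puts g_t in column 1 and g_{t-1}, ..., g_{t-max_p} in columns 2..(max_p + 1),
# so every model below is estimated on the same observations.
lags <- embed(g, max_p + 1)
y <- lags[, 1]

# Fit AR(1), ..., AR(max_p) by OLS and collect both criteria in one table.
ic <- t(sapply(1:max_p, function(p) {
  fit <- lm(y ~ lags[, 2:(p + 1), drop = FALSE])
  c(p = p, AIC = AIC(fit), BIC = BIC(fit))
}))
print(ic)

# The chosen lag length is the one with the smallest criterion value.
p_aic <- ic[which.min(ic[, "AIC"]), "p"]
p_bic <- ic[which.min(ic[, "BIC"]), "p"]
print(c(AIC = p_aic, BIC = p_bic))
```

Holding the sample fixed matters because each extra lag would otherwise drop an observation, which makes the criteria not directly comparable across models.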
¹ You do not need to compute robust standard errors for any time series regressions in this question.
² $gy_t$ is the growth in real disposable income, $i3_t$ is the interest rate as measured by the return on three-month
T-bill rates, and $inf_t$ is the inflation rate based on the Consumer Price Index.