ECON 3210: Use of Economic Data
1 General Instructions
You will estimate econometric models of the following form:
y = β_1 + β_2 x + e    (1)
y = β_1 + β_2 x_2 + β_3 x_3 + e    (2)
You will choose a dependent variable (y), a key explanatory variable (x or x_2), and a secondary
explanatory variable (x_3) from the dataset surveydata.Rdata, posted on the course website. A
codebook for this dataset is also posted; it contains a list of the variables in surveydata.Rdata,
as well as a description of each of these variables. In addition to choosing a dependent variable
and two explanatory variables, you will choose a variable that divides the sample into two
groups (call these “sample A” and “sample B”). You will then answer questions 1-10 below.
You may work in groups of up to three students. If you choose to work in a group,
please submit only one assignment per group, and clearly indicate the names and student
numbers of all group members on the first page of your assignment. Your assignment should
be submitted through the course website. Please submit two files: (1) a text file with the
written answers to questions 1-10 below; (2) a text file containing your code.
2 Questions
1. Explain your choice of y and x. Why should these variables be related, in theory? Why
is it reasonable to think that x causes y, and not the other way around?
2. Do you think your model is likely to satisfy the assumptions of the simple linear regression
model? Why or why not? Be specific.
3. Present basic summary statistics for y and x (mean, standard deviation, maximum value,
minimum value).
4. Estimate model (1) using the entire sample, and present your results. Discuss the
following:
(a) Provide an economic interpretation for your estimate b_1, and comment on whether
or not it is statistically significant.
(b) Provide an economic interpretation for your estimate b_2, and comment on whether
or not it is statistically significant.
(c) Provide a 95% confidence interval for β_2.
(d) What is R^2? Do you consider this high or low?
5. Generate a diagnostic residual plot. Based on this plot, does your model appear correctly
specified? Explain.
6. Comment on why you think it is interesting to estimate model (1) separately for samples
A and B (in theory).
7. Estimate model (1) separately for sample A and sample B. Comment on the following
differences between the two sets of results:
(a) b_1: magnitude and significance
(b) b_2: magnitude and significance
(c) R^2
8. Explain your choice of x_3. Why do you think this is a useful addition to the model?
9. Estimate model (2) using the entire sample, and present your results. Discuss the
following:
(a) Provide an economic interpretation for your estimate b_3, and comment on whether
or not it is statistically significant.
(b) Comment on the difference between your estimate b_2 in model (2) and your estimate
b_2 in model (1).
(c) What is R^2? Do you consider this high or low?
10. Do you think b_2 (from either model (1) or model (2)) is a good estimate of the causal
effect of x on y in a nationally representative sample of the population? Why or why
not? Try to be as specific as possible.
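
As an illustration of the summary statistics requested in question 3, the following is a minimal sketch in Python using only the standard library. The numbers are made up for illustration and are not taken from surveydata.Rdata; the helper name summarize is hypothetical. The dataset's format suggests the course works in R, so treat this only as a language-neutral outline of the calculation.

```python
import statistics

# Made-up values standing in for the chosen y and x; not from surveydata.Rdata.
y = [12.0, 15.5, 9.0, 20.0, 17.5, 11.0]
x = [1.0, 2.0, 1.0, 4.0, 3.0, 2.0]

def summarize(values):
    """Return the mean, sample standard deviation, maximum, and minimum."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD, with an n - 1 denominator
        "max": max(values),
        "min": min(values),
    }

summary_y = summarize(y)
summary_x = summarize(x)

for name, stats in (("y", summary_y), ("x", summary_x)):
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The sample standard deviation (n - 1 denominator) is used here because the data are a sample rather than a full population.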
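
The quantities discussed in question 4 (the estimates b_1 and b_2, their standard errors and t-ratios, a 95% confidence interval for β_2, and R^2) can be sketched as follows. This is a hedged illustration, not a prescribed method: the data are made up, the function name simple_ols is hypothetical, and the interval uses a normal critical value (about 1.96) in place of the t(n - 2) value, which is an assumption that is only reasonable in large samples.

```python
import math
from statistics import NormalDist, mean

# Made-up observations for illustration only; not taken from surveydata.Rdata.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

def simple_ols(x, y, level=0.95):
    """Least-squares fit of y = b1 + b2*x with standard errors, CI, and R^2."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))  # residual sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)                     # total sum of squares
    sigma2_hat = sse / (n - 2)                                   # error variance estimate
    se_b2 = math.sqrt(sigma2_hat / sxx)
    se_b1 = math.sqrt(sigma2_hat * sum(xi ** 2 for xi in x) / (n * sxx))
    # Assumption: normal critical value instead of t(n - 2); large samples only.
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return {
        "b1": b1, "se_b1": se_b1, "t_b1": b1 / se_b1,
        "b2": b2, "se_b2": se_b2, "t_b2": b2 / se_b2,
        "ci_b2": (b2 - crit * se_b2, b2 + crit * se_b2),
        "r2": 1.0 - sse / sst,
    }

fit = simple_ols(x, y)
for key, value in fit.items():
    print(key, value)
```

With a small sample, the t(n - 2) critical value is wider than 1.96, so the interval printed here would understate the uncertainty.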
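
For the residual diagnostic in question 5, the sketch below shows what a misspecified model can look like. It assumes made-up data in which the true relationship is quadratic, fits the straight line of model (1), and prints the residuals against x as a crude text display. A real answer would pass the same (x, residual) pairs to a plotting tool; none is assumed here.

```python
from statistics import mean

# Made-up data: the true relationship is y = x^2, so a straight line is misspecified.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [xi ** 2 for xi in x]

# Least-squares estimates for y = b1 + b2*x.
x_bar, y_bar = mean(x), mean(y)
b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b1 = y_bar - b2 * x_bar

fitted = [b1 + b2 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Crude text display: one row per observation, bar length proportional to |residual|.
for xi, r in zip(x, residuals):
    sign = "-" if r < 0 else "+"
    bar = "#" * round(abs(r))
    print(f"x={xi:4.1f}  residual={r:+6.2f}  {sign}{bar}")
```

The residuals are positive at both ends of the x range and negative in the middle. A systematic pattern of this kind, rather than a patternless scatter around zero, is the usual sign that the functional form is wrong.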
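
Question 7 asks for model (1) to be estimated separately for samples A and B. One way to organise that is sketched below, assuming each observation carries a group label. The rows, the labels, and the helper name fit_line are all made up for illustration.

```python
from statistics import mean

# Made-up rows of (group, x, y); the grouping variable is hypothetical.
rows = [
    ("A", 1.0, 3.0), ("A", 2.0, 5.2), ("A", 3.0, 6.8), ("A", 4.0, 9.0),
    ("B", 1.0, 2.0), ("B", 2.0, 2.4), ("B", 3.0, 3.1), ("B", 4.0, 3.5),
]

def fit_line(x, y):
    """Least-squares estimates and R^2 for y = b1 + b2*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {"n": len(x), "b1": b1, "b2": b2, "r2": 1.0 - sse / sst}

# Fit the same model on each subsample and collect the results side by side.
results = {}
for label in ("A", "B"):
    xs = [x for g, x, _ in rows if g == label]
    ys = [y for g, _, y in rows if g == label]
    results[label] = fit_line(xs, ys)

for label, res in results.items():
    print(label, {key: round(value, 3) for key, value in res.items()})
```

Keeping the two sets of estimates in one structure makes it easier to compare b_1, b_2, and R^2 across the samples. Judging significance would also need standard errors, which this sketch omits.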
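
Question 9(b) compares b_2 across models (1) and (2). The sketch below illustrates why the two can differ. It assumes made-up data generated exactly as y = 1 + 2 x_2 + 3 x_3 with x_2 and x_3 positively correlated, and it solves the normal equations by Gaussian elimination. The function names are hypothetical, and the approach is only meant for small illustrative systems.

```python
def solve_linear_system(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def ols(columns, y):
    """OLS coefficients for y on an intercept plus the given regressor columns."""
    design = [[1.0] + list(values) for values in zip(*columns)]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up data: y depends on both regressors, and x2 and x3 move together.
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]
x3 = [1.0, 1.0, 2.0, 2.0, 3.0]
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x2, x3)]

b_model1 = ols([x2], y)      # [b1, b2] from model (1), omitting x3
b_model2 = ols([x2, x3], y)  # [b1, b2, b3] from model (2)

print("model (1):", [round(v, 4) for v in b_model1])
print("model (2):", [round(v, 4) for v in b_model2])
```

In this constructed example, model (1) attributes part of the effect of x_3 to x_2, so its b_2 is larger than the b_2 from model (2). In real data, the direction and size of the gap depend on how x_3 relates to both y and x_2.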