Final Exam – Data Project
ECON 6074: Causal Inference
Instructions
• Please submit your work in one zip file containing three files: a) a write-up of your answers
(including tables), b) a log file, and c) a do-file that works with your matched dataset.
• You will conduct empirical analysis using a dataset from the study by Duflo, Dupas, and
Kremer (2011) “Peer Effects, Teacher Incentives, and the Impact of Tracking: Evidence
from a Randomized Evaluation in Kenya” American Economic Review, 101(5): 1739-1774.
• Each student is assigned a unique dataset. Look up your matched Final Exam Dataset
ID number and download your unique dataset “Final_M2_2022_ID[#].dta”. If your
answers do not match your assigned dataset, you will receive 0 points.
1. [25 points] The Impact of Tracking on Students’ Achievements
a. (5 points) What is the authors’ identification strategy to estimate the impact of
tracking on students’ achievements? Explain how this method can identify the
treatment effect using the potential outcomes framework. What are the key
assumptions associated with this identification strategy?
b. (5 points) Conduct a balance check to examine whether students in tracking and
non-tracking schools are similar to each other.
c. (10 points) Report the OLS estimate of the effect of tracking on the standardized
endline test score [stdR_totalscore] with and without individual controls (control
variables: student’s gender [girl], age [agetest], being assigned to a contract teacher
[etpteacher], percentile in the initial distribution [percentile]).
d. (5 points) Provide a test of whether the effect of tracking on the standardized endline
test score differs across students at different percentiles of the initial distribution (e.g.
students in the bottom half versus students in the top half).
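The regressions in parts c and d can be sketched as follows. This is a minimal Python illustration on simulated data, not the assigned dataset and not a substitute for the required do-file. The variable names mirror the exam's, but `TRUE_EFFECT`, the data-generating process, and the helper `ols` are assumptions made for the example. The sketch reports conventional standard errors; because tracking varies at the school level in the actual study, standard errors clustered by school would be the more appropriate choice there.

```python
import numpy as np

def ols(y, X):
    """OLS with an intercept prepended; returns coefficients and
    conventional (homoskedastic, unclustered) standard errors."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

# Simulated data (hypothetical): tracking is randomly assigned.
rng = np.random.default_rng(0)
n = 4000
TRUE_EFFECT = 0.15  # assumed effect, chosen only for illustration
tracking = rng.integers(0, 2, n).astype(float)
girl = rng.integers(0, 2, n).astype(float)
percentile = rng.uniform(0, 100, n)
bottomhalf = (percentile < 50).astype(float)
score = (TRUE_EFFECT * tracking + 0.01 * percentile
         + 0.05 * girl + rng.normal(0, 1, n))

# Part c: effect of tracking without and with individual controls.
b_raw, se_raw = ols(score, tracking)
b_ctl, se_ctl = ols(score, np.column_stack([tracking, girl, percentile]))

# Part d: interact tracking with a bottom-half indicator; the t-statistic
# on the interaction tests whether the effect differs across halves.
X_int = np.column_stack(
    [tracking, bottomhalf, tracking * bottomhalf, girl, percentile]
)
b_int, se_int = ols(score, X_int)
t_interaction = b_int[3] / se_int[3]

print(f"no controls:   {b_raw[1]:.3f} (se {se_raw[1]:.3f})")
print(f"with controls: {b_ctl[1]:.3f} (se {se_ctl[1]:.3f})")
print(f"interaction t-statistic: {t_interaction:.2f}")
```

Under random assignment both specifications estimate the same effect; the controls mainly reduce residual variance and so tighten the standard error.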
2. [35 points] The Effect of Assignment to Upper Tracking Section on Achievement
a. (5 points) Propose an identification strategy to estimate the impact of being in
the upper (top half) tracking section on achievement. Use the potential outcomes
framework to illustrate the identification strategy and key assumptions.
b. (5 points) Create a graph which plots the standardized endline test score
[stdR_totalscore] against the assignment variable [percentile] and visually inspect
whether there is a discontinuity at the cutoff.
c. (5 points) Create graphs to examine whether covariates [girl; agetest; etpteacher]
are balanced around the cutoff point. Do students at the bottom of the upper section
look similar to students at the top of the lower section?
d. (10 points) Use a parametric approach to estimate the effect of being assigned to the
upper section [tophalf] on test score. That is, regress test score on an indicator for the upper
section, controlling for the assignment variable, individual controls [girl; agetest;
etpteacher], and school fixed effects [schoolid]. Use different polynomials of the
assignment variable: 1. Linear, 2. Quadratic, 3. Cubic. Report the estimated
coefficients on assignment to the upper section. How do the estimates vary across
different polynomials?
e. (10 points) Use local linear regression (nonparametric approach) to estimate the
effect of being assigned to the upper section. Report results from using at least three
different bandwidths. How does the estimate from the nonparametric approach
compare to the estimate from the parametric approach in part d?
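The two estimation approaches in parts d and e can be sketched as follows. This is a minimal Python illustration on simulated data, not the assigned dataset. The cutoff at the 50th percentile (`CUTOFF`), the size of the jump (`TRUE_JUMP`), the bandwidths, and the helper `fit` are all assumptions made for the example. For brevity the sketch omits the individual controls and school fixed effects that part d asks for, and the local linear fit uses a uniform (rectangular) kernel with separate slopes on each side of the cutoff.

```python
import numpy as np

def fit(y, X):
    """OLS with an intercept prepended; beta[1] is the first regressor."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Simulated data (hypothetical): scores jump by TRUE_JUMP at the cutoff.
rng = np.random.default_rng(1)
n = 20000
CUTOFF = 50.0    # assumed cutoff in the assignment variable
TRUE_JUMP = 0.2  # assumed discontinuity, chosen only for illustration
percentile = rng.uniform(0, 100, n)
tophalf = (percentile >= CUTOFF).astype(float)
r = percentile - CUTOFF  # assignment variable centered at the cutoff
score = TRUE_JUMP * tophalf + 0.01 * r + rng.normal(0, 0.5, n)

# Part d: global polynomial in the centered assignment variable.
# r is rescaled by 50 to keep higher powers numerically well behaved.
parametric = {}
for order in (1, 2, 3):
    powers = [(r / 50.0) ** k for k in range(1, order + 1)]
    parametric[order] = fit(score, np.column_stack([tophalf] + powers))[1]

# Part e: local linear regression within a bandwidth h of the cutoff,
# uniform kernel, slope allowed to differ on each side.
local_linear = {}
for h in (5.0, 10.0, 20.0):
    keep = np.abs(r) <= h
    X = np.column_stack([tophalf[keep], r[keep], tophalf[keep] * r[keep]])
    local_linear[h] = fit(score[keep], X)[1]

for order, est in parametric.items():
    print(f"polynomial order {order}: {est:.3f}")
for h, est in local_linear.items():
    print(f"bandwidth {h:>4}: {est:.3f}")
```

Narrower bandwidths use fewer observations, so the estimate relies less on the functional form away from the cutoff but is noisier; comparing estimates across bandwidths and polynomial orders is a check on how sensitive the result is to those choices.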