ECON7360 Causal Inference for Microeconometrics
Instructions
When you are asked to explain or discuss something, your response should be concise (no
more than four sentences). Please clearly label all your answers.
Use STATA to conduct the empirical analysis and include your do file as an “Appendix”
at the end of your report.
You should upload your work as a PDF file via the “Turnitin” submission link (in the
“Problem Set 2” folder under “Assessment”) by 4 pm on the due date, October 20, 2023.
You are allowed to work on this assignment in groups; that is, you can discuss how to
answer these questions with your group members. However, this is not a group assignment,
which means that you must answer all the questions in your own words and submit
your report separately. The marking system will check the similarity, and UQ’s student
integrity and misconduct policies on plagiarism apply.
The maximum possible mark allocated for this problem set is 100. Each subsection of
each question is worth five marks.
1 Panel Data Methods - RE
In a random effects model, define the composite error $v_{it} = a_i + u_{it}$, where $a_i$ is uncorrelated with
$u_{it}$ and the $u_{it}$ have constant variance $\sigma_u^2$ and are serially uncorrelated. Define $e_{it} = v_{it} - \theta \bar{v}_i$,
where $\bar{v}_i = \frac{1}{T} \sum_{t=1}^{T} v_{it}$ and
$$\theta = 1 - \sqrt{\frac{\sigma_u^2}{\sigma_u^2 + T \sigma_a^2}}.$$
(i) Show that $E(e_{it}) = 0$.
(ii) Show that $\mathrm{Var}(e_{it}) = \sigma_u^2$, $t = 1, \dots, T$.
(iii) Show that for $t \neq s$, $\mathrm{Cov}(e_{it}, e_{is}) = 0$.
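Before attempting the proofs, it can help to check the claims in (ii) and (iii) numerically. The sketch below is not a proof and is not part of the required Stata analysis. It assumes $\mathrm{Var}(a_i) = \sigma_a^2$, builds the covariance matrix of $(v_{i1}, \dots, v_{iT})$ implied by the stated assumptions, and applies the quasi-demeaning transformation. The function names and the parameter values (2.0, 3.0, T = 4) are made up for illustration.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def quasi_demeaned_cov(sigma_u2, sigma_a2, T):
    """Return the T x T covariance matrix of (e_i1, ..., e_iT)."""
    theta = 1.0 - math.sqrt(sigma_u2 / (sigma_u2 + T * sigma_a2))
    # Var(v_i): sigma_a2 + sigma_u2 on the diagonal, sigma_a2 off the diagonal.
    omega = [[sigma_a2 + (sigma_u2 if t == s else 0.0) for s in range(T)]
             for t in range(T)]
    # e_i = M v_i, where M subtracts theta times the time average of v_i.
    m = [[(1.0 if t == s else 0.0) - theta / T for s in range(T)]
         for t in range(T)]
    # M is symmetric, so Var(e_i) = M * Omega * M.
    return matmul(matmul(m, omega), m)


# Hypothetical parameter values, chosen only for illustration.
cov_e = quasi_demeaned_cov(sigma_u2=2.0, sigma_a2=3.0, T=4)
for row in cov_e:
    print([round(x, 10) for x in row])
```

Under these assumptions the printed matrix should have diagonal entries equal to $\sigma_u^2$ and off-diagonal entries equal to zero, up to floating-point rounding. Trying other values of `sigma_u2`, `sigma_a2`, and `T` is a quick way to see that the result does not depend on the particular numbers chosen here.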
2 Panel Data Methods - BE
With a single explanatory variable, the equation used to obtain the between estimator is
$$\bar{y}_i = \beta_0 + \beta_1 \bar{x}_i + a_i + \bar{u}_i \quad (1)$$
where the overbar represents the average over time. We can assume that $E(a_i) = 0$ because
we have included an intercept in the equation. Suppose that $\bar{u}_i$ is uncorrelated with $\bar{x}_i$, but
$\mathrm{Cov}(x_{it}, a_i) = \sigma_{xa}$ for all $t$ (and for all $i$, because of random sampling in the cross section).
(i) Letting $\tilde{\beta}_1$ be the between estimator, that is, the OLS estimator using the time averages,
show that
$$\operatorname{plim} \tilde{\beta}_1 = \beta_1 + \frac{\sigma_{xa}}{\mathrm{Var}(\bar{x}_i)}$$
where the probability limit is defined as $N \to \infty$.
(ii) Assume further that the $x_{it}$, for all $t = 1, 2, \dots, T$, are uncorrelated with constant
variance $\sigma_x^2$. Show that
$$\operatorname{plim} \tilde{\beta}_1 = \beta_1 + T \left( \frac{\sigma_{xa}}{\sigma_x^2} \right)$$
(iii) If the explanatory variables are not very highly correlated across time, what does part
(ii) suggest about whether the inconsistency in the between estimator is smaller when there are
more time periods?
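To get a feel for the expression in part (ii), the sketch below evaluates it for a few values of T. It relies on two steps that follow from the stated assumptions: the covariance between the time average of $x_{it}$ and $a_i$ equals $\sigma_{xa}$, and the variance of the average of T uncorrelated terms with variance $\sigma_x^2$ is $\sigma_x^2 / T$. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
def between_plim(beta1, sigma_xa, sigma_x2, T):
    """Probability limit of the between estimator under the part (ii) assumptions."""
    # Variance of the time average of T uncorrelated x_it with variance sigma_x2.
    var_xbar = sigma_x2 / T
    # Part (i) formula: beta1 + Cov(xbar_i, a_i) / Var(xbar_i).
    return beta1 + sigma_xa / var_xbar


# Hypothetical values: beta1 = 1.0, sigma_xa = 0.1, sigma_x2 = 2.0.
for T in (1, 2, 5, 10):
    print(T, between_plim(beta1=1.0, sigma_xa=0.1, sigma_x2=2.0, T=T))
```

Comparing the printed values across T is one way to start thinking about part (iii); the interpretation is left to your written answer.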
3 Panel Data Methods - POLS, FD, FE, RE
Use the data in WAGEPAN.DTA to answer the following questions.
(i) Using lwage as the dependent variable, estimate a model that only contains an intercept
and the year dummies d81 through d87. Use pooled OLS, RE, FE, and FD. What do you
conclude about the coefficients on the year dummies?
(ii) Add the time-constant variables educ, black, and hisp to the model, and estimate it by
OLS and RE. How do the coefficients compare? What happens if you estimate the equation by
FE?
(iii) What do you conclude about the four estimation methods when the model includes
only variables that change just across t or just across i?
(iv) Now estimate the equation
$$\mathit{lwage}_{it} = \alpha_t + \beta_1 \mathit{union}_{it} + \beta_2 \mathit{married}_{it} + \beta_3 \mathit{educ}_i + \beta_4 \mathit{black}_i + \beta_5 \mathit{hisp}_i + c_i + u_{it} \quad (2)$$
by random effects. Do the coefficients seem reasonable? How do the nonrobust and cluster-robust
standard errors compare?
(v) Now estimate the equation
$$\mathit{lwage}_{it} = \alpha_t + \beta_1 \mathit{union}_{it} + \beta_2 \mathit{married}_{it} + c_i + u_{it} \quad (3)$$
by fixed effects, being sure to include the full set of time dummies to reflect the different
intercepts. How do the estimates of $\beta_1$ and $\beta_2$ compare with those in part (iv)? Compute the
usual FE standard errors and the cluster-robust standard errors. How do they compare?
(vi) Obtain the robust, variable-addition Hausman test. What do you conclude about RE
versus FE?
(vii) Let educ have an interactive effect with union and married and estimate the model by
fixed effects. Are the interactions individually or jointly significant? Why are the coefficients
on union and married now imprecisely estimated?
4 DID and Panel Data Regression
The following empirical questions are based on:
McKinnish, T., 2005. “Importing the Poor: Welfare Magnetism and Cross-Border Welfare
Migration.” Journal of Human Resources, 40(1), pp. 57–76.
The research question is whether poor families in the U.S. migrate to states with higher welfare
benefits. The main empirical difficulty is to show that migrants to high-benefit states are moving
for welfare benefits, rather than other state amenities (such as strong labor markets) that tend
to be positively correlated with welfare benefits.
Most studies of welfare migration, including this one, focus on the Aid to Families with Dependent
Children (AFDC) program. AFDC was a welfare program that provided cash payments to
low-income single mothers.[1] Even though it was a federal welfare program, states set their own
benefit levels, generating sizeable differences in generosity across states. An important feature
for the purpose of this paper is that benefit levels are set at the state level. They do not vary
by county within a state.
Assume that the costs of between-state migration are lower for individuals located close to state
borders. Consider the simple example for a country with two states illustrated below. The top
state is the high-benefit state and the bottom state is the low-benefit state. Area HB contains
the counties of the high-benefit state that border on the other state, and area LB is likewise
defined for the low-benefit state. Areas HI and LI are the interiors of the two states. If the
assumption of differential migration costs is correct, then the border counties in area HB should
disproportionately draw migrants from the border counties in area LB. As a result, area LB
should exhibit significantly lower per capita AFDC expenditures than area LI, which can be
thought of as evidence of welfare migration.
To examine this, in what follows, you will need to conduct a series of DID-type analyses using
the data set welmig.dta, which contains observations for all counties in the 48 continental
states for the years 1970–1990. The variables included in the data set are:
AFDCExp: Log of per capita AFDC expenditures in county.
AFDCBen: Monthly AFDC benefit in state (benefit for a family of 4, in 100s of dollars).
[1] It was reformed and renamed Temporary Assistance for Needy Families (TANF) in 1996.
neardist: Distance from county to the closest neighbor state (in miles).
NeighborBen: Monthly AFDC benefit in the closest neighboring state (in 100s of dollars).
state: State ID code.
year: Year, 1970–1990.
Note that AFDCExp, neardist, and NeighborBen vary at the county level. AFDCBen and state
vary only at the state level.
(i) Define a border county as one with neardist ≤ 25. Create a dummy variable Border25
for border counties. Create a dummy variable BenDiff equal to 1 if the closest neighboring state’s
benefit is higher than the own state’s benefit, and 0 otherwise. Create a variable Border25xBenDiff as the
interaction between Border25 and BenDiff.
(ii) Using only 1979 observations, regress AFDCExp on AFDCBen, NeighborBen, Border25,
BenDiff, and Border25xBenDiff.[2] Which coefficient captures the welfare migration effect?
Can you find evidence of welfare migration?
(iii) Now re-run the regression in part (ii) but this time control for state fixed effects. A
convenient way to do so is to use STATA’s xtreg command with option “fe i(state)”. Can
the coefficient on AFDCBen be identified? Why?
(iv) Compare the estimation results obtained in part (ii) and part (iii). Comment on your
findings.
(v) Re-run the regression in part (iii) but this time use all observations (1970–1990). Can
the coefficient on AFDCBen be identified now? Why? What has happened to the SE of the
coefficient on BenDiff? What has happened to the coefficient on Border25xBenDiff?
[2] Compute clustered SEs by using the option cluster(state) in the part (ii) regression and in all the regressions that follow.
(vi) Create year fixed effects and add them to the regression model in part (v). Can you tell a
story for why the effect of AFDCBen changes a lot when controlling for year fixed effects? Explain
why the coefficient on Border25xBenDiff changes only a little.
(vii) Does the welfare migration effect change over time? Create an indicator variable
post1980 for the years 1980–1990 and interact it with all included regressors (other than the year
dummies) of the regression model in part (vi) to test this hypothesis.
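The variable definitions in part (i) of this question can be sketched on a few made-up rows to make the logic concrete. The submitted analysis must still be carried out in STATA; this Python sketch only illustrates how the three dummies relate to neardist, AFDCBen, and NeighborBen. The helper name `add_border_dummies` and the example rows are hypothetical and are not taken from welmig.dta.

```python
def add_border_dummies(row, cutoff=25.0):
    """Return a copy of a county-year row with the three part (i) dummies added."""
    out = dict(row)
    # Border county: at most `cutoff` miles from the closest neighboring state.
    out["Border25"] = 1 if row["neardist"] <= cutoff else 0
    # Neighboring state pays a strictly higher benefit than the own state.
    out["BenDiff"] = 1 if row["NeighborBen"] > row["AFDCBen"] else 0
    # Interaction: border county facing a higher-benefit neighbor.
    out["Border25xBenDiff"] = out["Border25"] * out["BenDiff"]
    return out


# Made-up rows for illustration only; these are not values from welmig.dta.
rows = [
    {"neardist": 10.0, "AFDCBen": 3.0, "NeighborBen": 4.5},
    {"neardist": 10.0, "AFDCBen": 4.5, "NeighborBen": 3.0},
    {"neardist": 80.0, "AFDCBen": 3.0, "NeighborBen": 4.5},
]
for r in rows:
    print(add_border_dummies(r))
```

Only the first made-up row is both a border county and faces a higher-benefit neighbor, so it is the only one with Border25xBenDiff equal to 1.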