EC 486
Instructions
1. This take-home exam includes one question worth 100 points.
2. Please read all questions before answering them.
3. The data are on Moodle.
4. The due date is March 28th, 2021 at 9am. Please upload a writeup of your answers and
your computer code to Moodle (ideally as a single .pdf file).
5. There is no single correct way to answer the questions below. Rather, I will be looking
for a coherent analysis (which does not mean a long answer).
6. If at any point in this exam you feel that anything is unclear, please make additional
assumptions that you feel are necessary and state them clearly.
7. I will post code with a suggested solution on Monday, March 29th at 9am.
8. Good luck!
Question 1
The goal of this exercise is to measure the effect of health risk on housing values by exploiting
a natural experiment. Specifically, prior to 1997, Churchill County, Nevada, United States
(population 23,982) had no history of pediatric leukemia. Since 1997, 15 children have been diagnosed
with acute lymphocytic leukemia and a sixteenth with acute myelogenous leukemia. Leukemia
incidence of this magnitude far exceeds the population mean. A joint investigation by the Nevada
Health Department and the U.S. Centers for Disease Control has been unable to determine the
cause of the increase. No common characteristic has been identified among the case families and
the cases have not been linked to occupational hazards, a certain neighborhood, or a particular
water source.
The dataset houses.dta reports transaction prices of all sales of single-family residences
between 1990 and 2002 in Churchill County (identified by the variable cc=1) and in Lyon County
(identified by the variable lc=1). Lyon County was chosen to act as a control because it lies
immediately to the west of Churchill County. The dataset also includes the characteristics of
these residences. Use the “describe” command for a brief description of the variable names.
1. The variable cases reports the cumulative number of leukemia cases. The variable
news reports the cumulative number of newspaper articles citing “leukemia” and “Churchill
County.”
(a) Graph the time series of cumulative cases in the two counties.
(b) In which year do you notice the largest increase in cases?
(c) Graph the time series of cumulative news in the two counties. How does it differ from the
graph of cases?
(d) Based on the two graphs, define a pre-period and a post-period. Which years did you include
in each period? Please justify your choice. Construct a corresponding
variable post that equals zero in the pre-period and one in the post-period.
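As a minimal sketch of item (d), the indicator can be built from the sale year once a cutoff has been chosen. The function name make_post and the cutoff value below are hypothetical placeholders, not a suggested answer; the cutoff should come from your own reading of the two graphs.

```python
def make_post(year, first_post_year):
    """Return 1 if the sale year is in the post-period, 0 otherwise."""
    return 1 if year >= first_post_year else 0

# Placeholder cutoff for illustration only; replace it with the year you justify.
FIRST_POST_YEAR = 2000

sale_years = [1996, 1999, 2000, 2002]
post = [make_post(y, FIRST_POST_YEAR) for y in sale_years]
print(post)  # [0, 0, 1, 1]
```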
2. Lyon County is chosen as the control group. Based on the data available to you, do you
think it is a good control group? Please explain.
3. The variable saleprice is the sale price of the property and nvhpi is the house price
index for the State of Nevada. Based on these two variables, construct the real sale price as
$$\text{real sale price} = \frac{100 \times \text{saleprice}}{\text{nvhpi}},$$
which is the price of the property adjusted for inflation.
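The deflation step can be illustrated with a short sketch. The function names and the numbers are made up for illustration; only the variable names saleprice and nvhpi come from the dataset description. The log transform is included because later questions use the log of the real sale price as the outcome.

```python
import math

def real_sale_price(saleprice, nvhpi):
    """Deflate a nominal sale price by the Nevada house price index."""
    return 100.0 * saleprice / nvhpi

def log_real_sale_price(saleprice, nvhpi):
    """Natural log of the real sale price."""
    return math.log(real_sale_price(saleprice, nvhpi))

# Made-up example: a 150,000 sale when the index stands at 120.
print(real_sale_price(150000, 120))  # 125000.0
```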
4. We want to calculate the difference-in-difference estimate of the effect of the leukemia cluster
on house prices using the log of the real sale price as our outcome variable.
(a) Write down your regression equation to estimate the difference-in-difference model,
without additional control variables.
(b) Estimate your difference-in-difference regression. Based on the coefficients, construct
a 2x2 difference-in-difference table of the mean log sales price.
(c) What is your difference-in-difference estimate of the effect of the leukemia cluster
on house prices? Explain your answer, with reference to the difference-in-difference
methodology.
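One common way to set this up, offered here as a sketch rather than the required answer, is the saturated regression log(real price) = b0 + b1*cc + b2*post + b3*cc*post + e. In that specification the four cell means are b0 (control, pre), b0 + b2 (control, post), b0 + b1 (treated, pre) and b0 + b1 + b2 + b3 (treated, post), so b3 equals the difference of the two within-group changes. The code below computes the 2x2 table and that difference directly from cell means; the function name did_table and the toy data are invented for illustration.

```python
from statistics import mean

def did_table(rows):
    """rows: iterable of (log_price, treated, post) tuples.

    Returns the cell means keyed by (treated, post) and the
    difference-in-difference estimate.
    """
    cells = {}
    for y, treated, post in rows:
        cells.setdefault((treated, post), []).append(y)
    means = {key: mean(values) for key, values in cells.items()}
    change_treated = means[(1, 1)] - means[(1, 0)]
    change_control = means[(0, 1)] - means[(0, 0)]
    return means, change_treated - change_control

# Toy data, not taken from houses.dta.
toy = [
    (11.0, 0, 0), (11.2, 0, 0),  # control, pre
    (11.3, 0, 1), (11.5, 0, 1),  # control, post
    (11.1, 1, 0), (11.3, 1, 0),  # treated, pre
    (11.2, 1, 1), (11.4, 1, 1),  # treated, post
]
means, did = did_table(toy)
print(round(did, 3))  # -0.2
```

With real data, the same number should match the coefficient on the interaction term from the regression without controls.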
5. We now want to include controls for house characteristics and a measurement of local health
risk in our regression equation.
The dataset includes observable housing characteristics such as lot size (acres),
interior floor space (sqft, in square feet, measured in 100s), building age (age), and overall
building condition (the categorical variable origclass), as well as year and month indicator
variables.
(a) Using the variable cases as a measure of risk, estimate a regression to evaluate the effect
of health risk on house prices. Please comment on the estimates of your coefficients.
What is the effect of peak risk on house prices?
(b) Perform a similar regression to the one in the previous point using the variable news
as a measure of risk. What is the effect of peak risk on house prices? Please comment
on the differences with the previous regression.
(c) What do you think are the advantages and disadvantages of the two measures
of health risk, i.e., cases and news? Please explain your answer.
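To make "the effect of peak risk" concrete, here is a sketch under the assumption that the risk measure enters the log-price regression linearly, so its coefficient is a change in log price per unit of risk. The peak effect is then the coefficient times the maximum of the risk variable, and exp(.) - 1 converts it to a proportional change. The function name and the numbers are hypothetical, not estimates from the data.

```python
import math

def peak_effect(coef_per_unit, peak_risk):
    """Return (log-point change, proportional change) at peak risk."""
    log_change = coef_per_unit * peak_risk
    return log_change, math.exp(log_change) - 1.0

# Hypothetical coefficient of -0.01 per cumulative case and a peak of 16 cases.
log_change, pct_change = peak_effect(-0.01, 16)
print(round(log_change, 2))  # -0.16
```

The same calculation applies to news, using its own coefficient and its own maximum.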
6. Using the variable sqft, construct four groups of homes: small homes (< 1,250 square feet),
medium homes (1,250-1,500 square feet), large homes (1,500-2,000 square feet), and very
large homes (> 2,000 square feet).
Using regressions, please explain whether or not the effect of health risk on house prices
differs between these groups of homes of different sizes.
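The grouping can be sketched as a simple binning function. Two assumptions are made here and should be checked against the data: the input is in square feet (if sqft is stored in hundreds, multiply by 100 first), and a home of exactly 1,500 square feet is placed in the large group, since the stated ranges overlap at that boundary. The function name size_group is hypothetical.

```python
def size_group(square_feet):
    """Assign a home to one of four size groups by interior floor space."""
    if square_feet < 1250:
        return "small"
    if square_feet < 1500:   # assumption: exactly 1,500 counts as large
        return "medium"
    if square_feet <= 2000:  # very large is strictly above 2,000
        return "large"
    return "very large"

sizes = [900, 1250, 1500, 2000, 2600]
print([size_group(s) for s in sizes])
# ['small', 'medium', 'large', 'large', 'very large']
```

One way to compare effects across groups is to run the risk regression separately within each group, or to interact the risk measure with group indicators in a single regression.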
7. Graphs are often very effective at conveying the results of regression analysis. Please
construct one chart that summarizes the effect of health risk on house prices in your dataset.