ECMT 6002/6702: Econometric Applications
Part I : Computing exercises
Information on the dataset
We will use the “ECONMATH” dataset (provided with Wooldridge’s textbook), which contains 856
observations on 17 variables:
• age : individual i’s age
• work : i’s hours of work (per week)
• study : i’s hours of study (per week)
• econhs : take 1 if individual i studied economics in high school
• colgpa : individual i’s college GPA, measured at the beginning of the semester
• hsgpa : individual i’s high school GPA
• acteng : individual i’s ACT English score
• actmth : individual i’s ACT math score
• act : individual i’s ACT composite
• mathscr : individual i’s math quiz score, 0-10
• male : take 1 if individual i is male
• calculus : take 1 if individual i took calculus course
• attexc : take 1 if individual i’s past attendance is “excellent”
• attgood : take 1 if individual i’s past attendance is “good”
• fathcoll : take 1 if individual i’s father has BA
• mothcoll : take 1 if individual i’s mother has BA
• score : individual i’s score in a certain course of interest, in percent
If you need a more detailed explanation of the variables, see Wooldridge’s textbook
“Introductory Econometrics: A Modern Approach”.
Questions
Consider the multiple linear regression model:
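\[
\underbrace{y_i}_{\text{score}} = \beta_1
  + \beta_2 \underbrace{x_{2,i}}_{\text{study}}
  + \beta_3 \underbrace{x_{3,i}}_{\text{econhs}}
  + \beta_4 \underbrace{x_{4,i}}_{\text{colgpa}}
  + \beta_5 \underbrace{x_{5,i}}_{\text{male}}
  + \beta_6 \underbrace{x_{6,i}}_{\text{calculus}}
  + \beta_7 \underbrace{x_{7,i}}_{\text{attgood}}
  + \beta_8 \underbrace{x_{8,i}}_{\text{attexc}}
  + u_i. \tag{1}
\]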
Answer the following questions.
(i) Report the OLS estimates β̂6 and β̂7 with their estimated standard errors (with no correction
for heteroskedasticity/autocorrelation). Interpret the estimation results. (Do these results
make sense?)
(ii) Test H0 : β6 = 0 (against H1 : β6 ≠ 0) and H0 : β7 = 0 (against H1 : β7 ≠ 0) based on the
estimation results given in (i). Interpret the test results.
(iii) Test H0 : β7 = β8 = 0 (against H1 : H0 is not true) using the Wald, LR, and LM tests.
Report the test statistics and test results. (Use the 95% quantile of the χ²(m) distribution with
m correctly specified.) Interpret these results. Is there any significant difference between
the Wald, LR and LM statistics?
(iv) Implement White’s test for heteroskedasticity. Report the test statistic and the test result.
Interpret the test result. (Use the 95% quantile of the χ²(m) distribution with m correctly
specified.)
(v) Report White’s robust standard errors of β̂6 and β̂7 and re-examine the hypotheses given in
(ii) using the computed standard errors.
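As a rough illustration of the mechanics behind (i) to (iii), the following Python sketch fits a model by OLS, computes classical standard errors and t-statistics, and forms Wald, LR and LM statistics for a joint zero restriction on two coefficients. It is only a sketch under stated assumptions: the data are simulated (not the ECONMATH dataset), NumPy is assumed to be available, the names ols, X and y are illustrative, and the three statistics use one common textbook form built from restricted and unrestricted sums of squared residuals under normal errors, which may differ from the definitions used in the course.

```python
import math
import numpy as np

def ols(X, y):
    """Return OLS coefficients, residuals, and the sum of squared residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return beta, resid, float(resid @ resid)

# Simulated stand-in data (NOT the ECONMATH dataset): a constant and three regressors.
rng = np.random.default_rng(0)
n = 856
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
# The last two true coefficients are zero, so H0 below holds by construction.
y = X @ np.array([1.0, 0.5, 0.0, 0.0]) + rng.normal(size=n)

# Unrestricted fit with classical (homoskedasticity-only) standard errors.
beta_u, resid_u, ssr_u = ols(X, y)
k = X.shape[1]
sigma2_hat = ssr_u / (n - k)
se = np.sqrt(np.diag(sigma2_hat * np.linalg.inv(X.T @ X)))
t_stats = beta_u / se  # t-statistics for H0: coefficient = 0, one at a time

# Restricted fit: drop the last m regressors, imposing H0: their coefficients are zero.
m = 2
_, _, ssr_r = ols(X[:, :-m], y)

# One common textbook form of the three statistics (normal errors, variance = SSR/n).
wald = n * (ssr_r - ssr_u) / ssr_u
lr = n * math.log(ssr_r / ssr_u)
lm = n * (ssr_r - ssr_u) / ssr_r

# 95% quantile of chi-square(2); this closed form is valid only for m = 2.
crit = -2.0 * math.log(0.05)

for name, stat in [("Wald", wald), ("LR", lr), ("LM", lm)]:
    print(f"{name}: {stat:.3f}, reject H0 at 5%: {stat > crit}")
```

With these particular forms the ordering Wald ≥ LR ≥ LM holds in every sample, so the three numbers can differ even when they lead to the same test decision. The closed-form critical value applies only to m = 2; for other m a chi-square table or a statistics library would be needed.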
Now include all the remaining variables as additional regressors.
(vi) Report the OLS estimates β̂6 (associated with calculus) and β̂7 (associated with attgood) with
their estimated standard errors (with no correction for heteroskedasticity/autocorrelation).
Compare the estimation results to those in (i) and interpret the difference.
(vii) Based on the results in (vi), discuss potential problems that can occur in the linear
regression model (1) when you are interested in the effect of calculus and attgood.
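One candidate issue when comparing a short and a long regression is omitted-variable bias; whether it is the relevant problem here is for you to judge from your own estimates. The Python sketch below illustrates the mechanism on simulated data only. The variable z is a hypothetical omitted control, x is a hypothetical dummy regressor correlated with it, and all coefficient values are made up for illustration.

```python
import numpy as np

# Simulated data only; none of these variables come from the ECONMATH dataset.
rng = np.random.default_rng(1)
n = 856
z = rng.normal(size=n)                            # hypothetical omitted control
x = (z + rng.normal(size=n) > 0).astype(float)    # dummy regressor correlated with z
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)  # made-up coefficients

def fit(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)
b_short = fit(np.column_stack([const, x]), y)     # regression that omits z
b_long = fit(np.column_stack([const, x, z]), y)   # regression that includes z
delta = fit(np.column_stack([const, x]), z)       # auxiliary regression of z on x

# In-sample identity: short slope = long slope + (coef. on z) * (slope of z on x).
implied = b_long[1] + b_long[2] * delta[1]
print(f"short: {b_short[1]:.3f}, long: {b_long[1]:.3f}, implied short: {implied:.3f}")
```

The short-regression slope differs from the long-regression slope by exactly the product of the omitted variable's coefficient and its auxiliary-regression slope on x, so the two coincide only if one of those factors is zero.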
Part II : Alcohol Abuse and Employment
Information on the dataset
The dataset for this exercise is obtained from the Journal of Applied Econometrics data archive.
This dataset was used to examine the effect of alcohol abuse on employment status in the paper
by Terza, J.V. (2002, Journal of Applied Econometrics). Do your own analysis.
Instructions
• The original paper considers an advanced nonlinear econometric model. You are NOT
encouraged to follow that approach. It will be sufficient if you construct your own multiple
linear regression model using the given variables and provide some relevant empirical analysis.
• The dataset contains various individual characteristics. Using these, you may construct your
own model and propose an empirical question. For example, you may
(i) do an analysis similar to that in the original paper (using linear regression).
(ii) focus on the effect(s) of some other variable(s).
(iii) implement various diagnostic checks of the considered linear model.
(iv) test some hypotheses of interest.
Briefly state what you want to do and why. Report the results.
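For illustration, one possible shape of such an analysis is a linear probability model (OLS with a 0/1 dependent variable) combined with a diagnostic check. The Python sketch below is a minimal example under stated assumptions: the data are simulated (not the Terza dataset), the single regressor educ and the probability formula are made up, and NumPy is assumed to be available. It computes classical and White (HC0) standard errors and a White test statistic. In a linear probability model the error variance equals p(1 − p), so heteroskedasticity is present by construction.

```python
import math
import numpy as np

# Simulated stand-in data (NOT the Terza dataset): a binary outcome and one regressor.
rng = np.random.default_rng(2)
n = 2000
educ = rng.integers(8, 19, size=n).astype(float)  # hypothetical schooling, 8 to 18 years
p = 0.10 + 0.045 * educ                           # assumed employment probability
employ = (rng.uniform(size=n) < p).astype(float)

# Linear probability model: OLS of the 0/1 outcome on a constant and educ.
X = np.column_stack([np.ones(n), educ])
beta = np.linalg.lstsq(X, employ, rcond=None)[0]
resid = employ - X @ beta

# Classical versus White (HC0) standard errors.
XtX_inv = np.linalg.inv(X.T @ X)
se_classical = np.sqrt(np.diag(resid @ resid / (n - X.shape[1]) * XtX_inv))
meat = X.T @ (X * resid[:, None] ** 2)            # X' diag(resid^2) X
se_hc0 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

# White test: regress squared residuals on levels, squares and cross products.
# With a single regressor the auxiliary regressors are educ and educ^2, so m = 2.
Z = np.column_stack([np.ones(n), educ, educ ** 2])
e2 = resid ** 2
fitted = Z @ np.linalg.lstsq(Z, e2, rcond=None)[0]
r2_aux = 1.0 - np.sum((e2 - fitted) ** 2) / np.sum((e2 - e2.mean()) ** 2)
white_stat = n * r2_aux
crit = -2.0 * math.log(0.05)  # 95% quantile of chi-square(2), valid only for m = 2

print("classical SE:", se_classical, "HC0 SE:", se_hc0)
print(f"White statistic: {white_stat:.2f}, reject homoskedasticity at 5%: {white_stat > crit}")
```

With more regressors, the auxiliary regression would include every distinct level, square and cross product, and m would be the number of those terms; the square of a dummy variable duplicates the dummy itself and would be dropped.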
Notes:
• Although there is no limit on the range of topics that you can choose, please summarize all the
results within about three pages. The computing code need not be attached.
• This is an econometric exercise. Make sure your results do not include any opinion
or interpretation that is outside the field of economics/econometrics.
Information on the dataset
• abuse : take 1 if individual i abuses alcohol
• status : take 1 if individual i is out of workforce, take 2 if unemployed, take 3 if employed.
• unemrate : unemployment rate for the state where individual i resides.
• age : individual i’s age.
• educ : individual i’s years of schooling.
• married : take 1 if individual i is married.
• famsize : individual i’s family size.
• exhealth : take 1 if individual i is in excellent health.
• vghealth : take 1 if individual i is in very good health.
• goodhealth : take 1 if individual i is in good health.
• fairhealth : take 1 if individual i is in fair health.
• northeast : take 1 if individual i lives in northeast US.
• midwest : take 1 if individual i lives in midwest US.
• south : take 1 if individual i lives in south US.
• centcity : take 1 if individual i lives in a central city of a metropolitan area.
• outercity : take 1 if individual i lives in an outer city of a metropolitan area.
• qrt1 (qrt2, qrt3) : take 1 if individual i was interviewed in the first (second, third) quarter.
• beertax : state (where individual i resides) beer tax, $ per gallon.
• cigtax : state (where individual i resides) cigarette tax, cents per pack.
• ethanol : state (where individual i resides) per-capita ethanol consumption.
• mothalc (fathalc) : take 1 if individual i’s mother (father) is an alcoholic.
• livealc : take 1 if individual i lives with an alcoholic.
• inwf : take 1 if individual i is in workforce (i.e., status > 1).
• employ : take 1 if individual i is employed.
• agesq : age².
• beertaxsq : beertax².
• cigtaxsq : cigtax².
• ethanolsq : ethanol².
• educsq : educ².
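The derived variables in this list follow mechanically from the others, which gives a simple consistency check after loading the data. The Python sketch below shows the construction on a few made-up values (not taken from the dataset), assuming NumPy is available and that status is coded 1, 2, 3 as described in the list.

```python
import numpy as np

# Made-up illustrative values, not observations from the dataset.
status = np.array([1, 2, 3, 3, 2])   # 1 = out of workforce, 2 = unemployed, 3 = employed
age = np.array([25.0, 40.0, 33.0, 58.0, 47.0])

inwf = (status > 1).astype(int)      # in workforce: unemployed or employed
employ = (status == 3).astype(int)   # employed
agesq = age ** 2                     # squared term, as in agesq

print(inwf, employ, agesq)
```

Comparing such reconstructed columns with the ones supplied in the dataset is one way to confirm that the file was read correctly.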