Computer Project 1
ECON5120: Bayesian Data Analysis
Credit: 20% of final grade
This project will have you work with limited dependent variable models. These models are the backbone of
many statistical applications in economics and finance. Furthermore, they will introduce you to the idea of
data augmentation, which forms the basis of the computational techniques for this class of models (as well
as of other important techniques such as the Expectation-Maximization algorithm).
Task: You are asked to program from scratch three variations of limited dependent variable models
and then apply your code to three real-life datasets. Each exercise carries an equal share of the final
grade (the three together add up to 20%). You will also receive bonus points if you provide the technical
details of the derivations for each model (1% per exercise, for a total of up to 3%). These points supplement
your score on the coding part, giving you a higher chance of achieving full marks. The details of each
model and what you are expected to submit are outlined below.
Submittables: For full credit, you must submit to the course’s Moodle page three computer programs
that execute posterior simulations for all three models, along with the results of applying your code
to each of the datasets. These must be in either Matlab or R. Points will be subtracted for incorrect,
unreadable, or uncommented code. For the bonus points, you must submit all technical derivations in
a typeset document (e.g., using LaTeX). No handwritten notes will be accepted.
The following context is common to all models: We will assume that the observed data $(y_1, x_1), \ldots, (y_n, x_n)$
come from a random sample of size $n$. There are $k$ covariates in $x$, and the dependent variable $y$ is limited in
the sense that the values it can take are constrained. From the observed data, we can construct the observation
matrices
\[
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
\qquad \text{and} \qquad
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1k} \\
x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk}
\end{pmatrix}
\]
where $y$ is an $n$-dimensional vector and $X$ is an $n \times k$ matrix. Remember to include a constant in all models
and in your applications by setting $x_{i1} = 1$ for all $i$ (that is, by letting the first column of $X$ be all ones).
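For concreteness, here is one way these matrices might be assembled in R (one of the two permitted languages). This is a minimal sketch that assumes the pension dataset from Model 1 has already been downloaded; the object names are illustrative:

    ## Illustrative sketch: build y and X from fringe.csv (the Model 1 dataset),
    ## prepending a column of ones so that the first column of X is all ones.
    dat <- read.csv("fringe.csv")
    y   <- dat$pension
    X   <- as.matrix(cbind(constant = 1,
                           dat[, c("exper", "age", "tenure", "educ",
                                   "depends", "married", "white", "male")]))
    n   <- nrow(X)   # sample size
    k   <- ncol(X)   # number of covariates, including the constant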
For all exercises, you can use the noninformative prior values
\[
B_0 = 1000 \cdot I_k, \qquad \beta_0 = 0_k, \qquad \alpha_0 = 0.002, \qquad \delta_0 = 0.002,
\]
where $I_k$ is the identity matrix of size $k \times k$ and $0_k$ is a vector of zeros of dimension $k$. Feel free to experiment
with these values in your application. Finally, for all posterior simulations you can use a burn-in period of
1,000 iterations plus 10,000 final iterations to estimate posterior means.
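In R, these settings might be encoded as follows (a sketch that assumes k = ncol(X) is already defined as above):

    ## Noninformative prior hyperparameters and simulation settings.
    B0     <- 1000 * diag(k)   # prior covariance of beta
    beta0  <- rep(0, k)        # prior mean of beta
    alpha0 <- 0.002            # sigma^2 prior is Inv-Gamma(alpha0/2, delta0/2)
    delta0 <- 0.002
    burn   <- 1000             # burn-in iterations
    iters  <- 10000            # retained iterations used for posterior means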
Model 1. Tobit: This model is useful when the outcome $y$ is censored, i.e., when a range of its values is actually
reported as a single value. Specifically, let
\[
y_i^* = x_i'\beta + u_i
\]
where $u_i \sim \mathcal{N}(0, \sigma^2)$ independently across units $i$. Then, we say that the outcome variable is
censored since the latent $y_i^*$ are observed only when $y_i^* > 0$, and all other values are set to 0.
Mathematically, this means that our observed data $y_i$ satisfy
\[
y_i =
\begin{cases}
y_i^* & \text{if } y_i^* > 0 \\
0 & \text{if } y_i^* \le 0
\end{cases}
=
\begin{cases}
x_i'\beta + u_i & \text{if } x_i'\beta + u_i > 0 \\
0 & \text{if } x_i'\beta + u_i \le 0.
\end{cases}
\]
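Before coding the sampler, it can help to see this censoring mechanism in action. The following R sketch simulates data from the Tobit model with arbitrary, illustrative parameter values; simulated data of this kind is also useful later for checking whether a sampler recovers parameters you know to be true:

    ## Simulate a Tobit sample: draw latent y*, then censor at zero.
    set.seed(1)
    n_sim     <- 500
    X_sim     <- cbind(1, rnorm(n_sim))     # a constant plus one covariate
    beta_true <- c(0.5, 1.0)                # illustrative "true" coefficients
    ystar_sim <- X_sim %*% beta_true + rnorm(n_sim)
    y_sim     <- as.vector(pmax(ystar_sim, 0))  # y_i = y*_i if y*_i > 0, else 0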
This model can be estimated by maximum likelihood by writing out the full likelihood and
maximizing it numerically. However, the Bayesian approach takes advantage of the idea of data
augmentation to simplify the problem. Instead of letting the unknown quantities be only $\beta$ and
$\sigma^2$, we introduce the $y_i^*$ as latent variables. We then write the likelihood of $y = (y_1, \ldots, y_n)$
in terms of $y^* = (y_1^*, \ldots, y_n^*)$, $\beta$, and $\sigma^2$. We will use as prior
\[
p(y^*, \beta, \sigma^2) = p(y^* \mid \beta, \sigma^2)\, p(\beta)\, p(\sigma^2),
\]
where
\[
\begin{aligned}
p(y^* \mid \beta, \sigma^2) &= \prod_{i=1}^{n} \mathcal{N}(y_i^* \mid x_i'\beta, \sigma^2) \\
p(\beta) &= \mathcal{N}_k(\beta \mid \beta_0, B_0) \\
p(\sigma^2) &= \text{Inv-Gamma}(\sigma^2 \mid \alpha_0/2, \delta_0/2).
\end{aligned}
\]
The idea behind data augmentation is that including the latent variables greatly simplifies the
resulting posterior (even though it increases the number of variables we must sample). Let $C$ be the
set of censored observations, i.e., $C = \{i : y_i = 0\}$. We can show that the conditional posterior
distributions in the Tobit case are given by
\[
\begin{aligned}
y_i^* \mid \beta, \sigma^2, y &\sim \mathcal{TN}_{(-\infty, 0]}(x_i'\beta, \sigma^2) \quad \text{for all } i \in C \\
\beta \mid \sigma^2, y^*, y &\sim \mathcal{N}_k(\beta_n, B_n) \\
\sigma^2 \mid \beta, y^*, y &\sim \text{Inv-Gamma}(\alpha_n/2, \delta_n/2)
\end{aligned}
\]
where $\mathcal{TN}_{(-\infty, 0]}$ denotes the normal distribution truncated to the interval $(-\infty, 0]$ (that is, a regular
normal density for values of $y_i^*$ between $-\infty$ and 0, and zero density for $y_i^* > 0$) and
\[
\begin{aligned}
B_n &= \left( \frac{X'X}{\sigma^2} + B_0^{-1} \right)^{-1} \\
\beta_n &= B_n \left( \frac{X'y^*}{\sigma^2} + B_0^{-1}\beta_0 \right) \\
\alpha_n &= \alpha_0 + n \\
\delta_n &= \delta_0 + (y^* - X\beta)'(y^* - X\beta).
\end{aligned}
\]
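To make the mechanics concrete, here is a minimal sketch in R of how the three conditional draws could be chained into a Gibbs sampler. It is illustrative, not a complete or validated solution: the function name is made up for this sketch, the truncated-normal draw uses an inverse-CDF trick (rather than, say, the truncnorm package), and you should verify all formulas against your own derivations.

    ## Sketch of a Tobit Gibbs sampler built from the conditionals above.
    gibbs_tobit <- function(y, X, beta0, B0, alpha0, delta0,
                            burn = 1000, iters = 10000) {
      n <- nrow(X); k <- ncol(X)
      B0inv <- solve(B0)
      cens  <- which(y == 0)           # the censored set C = {i : y_i = 0}
      ystar <- y                       # latent outcomes; uncensored entries equal y_i
      beta  <- rep(0, k); sigma2 <- 1  # arbitrary starting values
      draws <- matrix(NA_real_, iters, k + 1)
      for (s in 1:(burn + iters)) {
        ## 1. y*_i | beta, sigma2, y ~ TN_(-inf,0](x_i'beta, sigma2) for i in C,
        ##    drawn by inverting the CDF of a normal truncated to (-inf, 0]
        mu <- as.vector(X[cens, , drop = FALSE] %*% beta)
        u  <- runif(length(cens)) * pnorm(0, mu, sqrt(sigma2))
        ystar[cens] <- qnorm(u, mu, sqrt(sigma2))
        ## 2. beta | sigma2, y*, y ~ N_k(beta_n, B_n)
        Bn   <- solve(crossprod(X) / sigma2 + B0inv)
        bn   <- Bn %*% (crossprod(X, ystar) / sigma2 + B0inv %*% beta0)
        beta <- as.vector(bn + t(chol(Bn)) %*% rnorm(k))
        ## 3. sigma2 | beta, y*, y ~ Inv-Gamma(alpha_n/2, delta_n/2)
        resid  <- ystar - as.vector(X %*% beta)
        sigma2 <- 1 / rgamma(1, shape = (alpha0 + n) / 2,
                             rate  = (delta0 + sum(resid^2)) / 2)
        if (s > burn) draws[s - burn, ] <- c(beta, sigma2)
      }
      colMeans(draws)                  # posterior means of (beta, sigma2)
    }

On the simulated data sketched earlier, a call such as gibbs_tobit(y_sim, X_sim, rep(0, 2), 1000 * diag(2), 0.002, 0.002) should return estimates close to beta_true if the sampler is implemented correctly.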
Task: Program a Gibbs sampler that samples from the appropriate conditional posterior
distributions. Then, apply your program to the dataset on pension benefits (fringe.csv),
where the outcome variable is the amount of pension benefits (pension) and the covariates are
a constant, experience (exper), age (age), tenure duration (tenure), education level (educ),
number of dependents (depends), marital status (married), a dummy for white race (white),
and a dummy for sex (male). Comment on your results.
Bonus: Present the derivations showing how to obtain the conditional posterior distributions.
Model 2. Probit: The probit model is useful for binary outcome variables, i.e., where yi can only take the
values 0 or 1. In this case, we still define
\[
y_i^* = x_i'\beta + u_i
\]