STA 365: Final Examination
Instructions.
This is an individual, open-book, take-home examination.
You are allowed to use the course lecture notes, the recommended textbooks, and external
sources. However, you must cite the sources you rely on, and you should be using these
sources to inform/help you in developing your own answer, not copying them.
You must work on this assignment independently, and you should not discuss it with
colleagues.
Your final answers should be your own.
You are strongly encouraged to type out your answers in LaTeX or a word processor. If you
need to handwrite your responses, make sure that they are clear and legible, and that you
scan a high quality image. What cannot be read will be marked as incomplete.
Where a problem requires the use of R and/or JAGS, you must produce your associated R
and/or JAGS code.
© Boris Babic. This document is authorized for use only in STA 365: Applied Bayesian
Statistics (Winter 2023), in the Faculty of Arts & Sciences at the University of Toronto, by
Professor(s) Boris Babic. Copying, reprinting or posting is a copyright infringement.
Problem 1. Markov Chains (30 points). Consider a problem where we would like to
estimate a posterior distribution $\pi(\theta \mid X)$ whose support is the entire real line $\mathbb{R}$, using
Markov Chain Monte Carlo. For each of the following functions, state whether it is a valid
transition function for a Markov chain that converges to the posterior. You receive up to
2.5 points for correctly identifying whether the function is valid or invalid, and up to 2.5 points
for a correct explanation of why it is valid or invalid.
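One ingredient of validity is that the transition rule leaves the target distribution unchanged. The sketch below checks this numerically on a toy problem that is not part of the exam: a hypothetical five-state target `target` and a textbook Metropolis kernel with a symmetric proposal of plus or minus one state. All names are illustrative assumptions, and Python is used only to keep the sketch self-contained (the exam itself asks for R and/or JAGS). Invariance alone is not sufficient: the chain must also be able to reach every part of the support.

```python
# Toy check that a Metropolis kernel leaves a discrete target invariant.
# The 5-state target and all names here are hypothetical illustrations.

def metropolis_kernel(target):
    """Transition matrix of a Metropolis chain on states 0..n-1 with a
    symmetric +/-1 proposal (each direction proposed with probability 1/2)."""
    n = len(target)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                # Propose j with prob. 1/2, accept with min(1, target[j] / target[i]).
                P[i][j] = 0.5 * min(1.0, target[j] / target[i])
        # Rejected or out-of-range proposals keep the chain at state i.
        P[i][i] = 1.0 - sum(P[i])
    return P

def step_distribution(dist, P):
    """Push a distribution through one step of the chain (row vector times P)."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

target = [0.1, 0.2, 0.4, 0.2, 0.1]
P = metropolis_kernel(target)

# Invariance: starting from the target, one step returns the target.
after_one_step = step_distribution(target, P)
max_invariance_error = max(abs(a - b) for a, b in zip(after_one_step, target))

# Convergence: starting from a point mass, many steps approach the target.
dist = [1.0, 0.0, 0.0, 0.0, 0.0]
for _ in range(500):
    dist = step_distribution(dist, P)
max_convergence_error = max(abs(a - b) for a, b in zip(dist, target))

print(max_invariance_error, max_convergence_error)
```

Both printed errors should be close to zero for this toy kernel. A rule that fails either check on a small example is a candidate for being invalid, although a numerical check is not a substitute for a written argument.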
Part (a). 5 points.
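\[
\theta^{(t)} \sim
\begin{cases}
\theta^{(t-1)} & \text{with probability } \dfrac{\pi(\theta^{(t-1)} \mid X)}{c} \\[6pt]
\theta^{(t-1)} + 1 & \text{with probability } \dfrac{\pi(\theta^{(t-1)} + 1 \mid X)}{c} \\[6pt]
\theta^{(t-1)} - 1 & \text{with probability } \dfrac{\pi(\theta^{(t-1)} - 1 \mid X)}{c}
\end{cases}
\]
where $c = \pi(\theta^{(t-1)} \mid X) + \pi(\theta^{(t-1)} + 1 \mid X) + \pi(\theta^{(t-1)} - 1 \mid X)$.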
Part (b). 5 points.
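\[
\theta^{(t)} \sim
\begin{cases}
\theta^{(t-1)} & \text{with probability } r \\
\theta' & \text{with probability } 1 - r
\end{cases}
\]
where $\theta' \sim \theta^{(t-1)} + \mathcal{N}(0, 1)$ and
\[
r = \min\left(1, \frac{\pi(\theta^{(t-1)} \mid X)}{\pi(\theta' \mid X)}\right)
\]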
Part (c). 5 points.
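\[
\theta^{(t)} \sim
\begin{cases}
\theta^{(t-1)} & \text{with probability } r \\
\theta' & \text{with probability } 1 - r
\end{cases}
\]
where $\theta' \sim \theta^{(t-1)} + \mathcal{N}(0, 10000)$ and
\[
r = \min\left(1, \frac{\pi(\theta^{(t-1)} \mid X)}{\pi(\theta' \mid X)}\right)
\]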
Part (d). 5 points.
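\[
\theta^{(t)} \sim
\begin{cases}
\theta^{(t-1)} & \text{with probability } r \\
\theta' & \text{with probability } 1 - r
\end{cases}
\]
where $\theta' \sim \theta^{(t-1)} + \mathcal{N}(20, 1)$ and
\[
r = \min\left(1, \frac{\pi(\theta^{(t-1)} \mid X)}{\pi(\theta' \mid X)}\right)
\]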
Part (e). 5 points.
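\[
\theta^{(t)} \sim
\begin{cases}
\theta^{(t-1)} & \text{with probability } r \\
\theta' & \text{with probability } 1 - r
\end{cases}
\]
where $\theta' \sim \mathcal{N}(\bar{\theta}, 1)$, with
\[
\bar{\theta} = \frac{\sum_{i=1}^{t-1} \theta^{(i)}}{t - 1}
\]
and
\[
r = \min\left(1, \frac{\pi(\theta^{(t-1)} \mid X)}{\pi(\theta' \mid X)}\right)
\]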
Part (f). 5 points.
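\[
\theta^{(t)} \sim
\begin{cases}
\theta^{(t-1)} & \text{with probability } r \\
\theta' & \text{with probability } 1 - r \text{ if } t \text{ is even} \\
\theta'' & \text{with probability } 1 - r \text{ if } t \text{ is odd}
\end{cases}
\]
where $\theta' \sim \theta^{(t-1)} + \mathcal{N}(0, 1)$, $\theta'' \sim \theta^{(t-1)} - \mathcal{N}(0, 1)$, and
\[
r =
\begin{cases}
\min\left(1, \dfrac{\pi(\theta^{(t-1)} \mid X)}{\pi(\theta' \mid X)}\right) & \text{if } t \text{ is even} \\[8pt]
\min\left(1, \dfrac{\pi(\theta^{(t-1)} \mid X)}{\pi(\theta'' \mid X)}\right) & \text{if } t \text{ is odd}
\end{cases}
\]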
Problem 2. Monte Carlo (15 points).
Let $Z_1 \sim N(\mu_1, \sigma^2)$ and let $Z_2 \sim N(\mu_2, \sigma^2)$, where $\mu_1 \neq \mu_2$. Let $Y$ be a random variable
satisfying $Y \sim \delta Z_1 + (1 - \delta) Z_2$, where $0 < \delta < 1$. Simulate a data set of size $N \geq 100$
consisting of iid draws from $Y$. You are free to choose each of the relevant parameters,
$\mu_1, \mu_2, \delta, \sigma^2$. You may wish to do this manually in R, or you may use JAGS.
Your task is to:
a. Write your choices of the parameters clearly. (3 points).
b. Produce the code used to generate the simulations. (3 points).
c. Plot a histogram of your simulated data. (3 points).
d. Overlay a plot of the density of the variable $Y = \delta Z_1 + (1 - \delta) Z_2$ on the histogram.
(3 points).
e. Label the graph clearly, using captions or titles that mention your parameter choices.
(3 points).
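The sketch below illustrates one reading of the notation, in which $Y$ is a two-component normal mixture: with probability $\delta$ a draw comes from $Z_1$, otherwise from $Z_2$. That reading is an assumption, as are the parameter values, the sample size, and every function name. It is written in Python using only the standard library, it is not a model answer, and it produces no plot.

```python
import math
import random

def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(y, mu1, mu2, sigma, delta):
    """Density of the mixture: delta * N(mu1, sigma^2) + (1 - delta) * N(mu2, sigma^2)."""
    return delta * normal_pdf(y, mu1, sigma) + (1.0 - delta) * normal_pdf(y, mu2, sigma)

def draw_mixture(n, mu1, mu2, sigma, delta, rng):
    """Draw n iid values: pick a component, then draw from that normal."""
    draws = []
    for _ in range(n):
        mu = mu1 if rng.random() < delta else mu2
        draws.append(rng.gauss(mu, sigma))
    return draws

# Hypothetical parameter choices, for illustration only.
mu1, mu2, sigma, delta = -2.0, 3.0, 1.0, 0.3
rng = random.Random(365)
sample = draw_mixture(5000, mu1, mu2, sigma, delta, rng)

# The sample mean should be close to delta * mu1 + (1 - delta) * mu2.
sample_mean = sum(sample) / len(sample)
theoretical_mean = delta * mu1 + (1.0 - delta) * mu2

# Riemann-sum check that the mixture density integrates to about 1.
step = 0.01
grid = [-12.0 + step * k for k in range(2401)]
area = sum(mixture_pdf(y, mu1, mu2, sigma, delta) for y in grid) * step

print(sample_mean, theoretical_mean, area)
```

Under this reading, `mixture_pdf` evaluated on a grid gives the curve to overlay on a density-scaled histogram of `sample`. If $Y$ is instead read as a weighted sum of $Z_1$ and $Z_2$, the density is a single normal and the sketch does not apply.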
Problem 3. Bayesian Logistic Regression (30 points).
In this problem, you will perform Bayesian logistic regression. We consider the following
hierarchical model:
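\[
\begin{aligned}
y_i &\sim \text{Binomial}(N_i, \theta_i) \\
F_i &\sim \mathcal{N}(X_i \beta, \sigma^2), \quad \text{where } F_i = \log\left(\frac{\theta_i}{1 - \theta_i}\right) \\
\beta &\sim \mathcal{N}(0, \sigma_\beta^2 I) \\
\sigma^2 &\sim \text{Inverse-Gamma}(\nu_0/2, \nu_0 \sigma_0^2/2)
\end{aligned}
\]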
Note how this is similar to linear regression, but we add a link function step as we do in
standard logistic regression.
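To make the layering concrete, the sketch below simulates the hierarchy top-down: hyperparameters first, then $\beta$ and $\sigma^2$, then each $F_i$, and finally $y_i$ through the inverse logit. The design matrix, the hyperparameter values, and all function names are hypothetical. The sketch is in Python for illustration only and is not a JAGS implementation of the sampler.

```python
import math
import random

def inv_logit(f):
    """Numerically stable inverse of the logit link."""
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    e = math.exp(f)
    return e / (1.0 + e)

def simulate_hierarchy(X, N, sigma2_beta, nu0, sigma2_0, rng):
    """Forward-simulate one data set from the hierarchical model."""
    p = len(X[0])
    # beta ~ N(0, sigma2_beta * I): independent normal coefficients.
    beta = [rng.gauss(0.0, math.sqrt(sigma2_beta)) for _ in range(p)]
    # sigma^2 ~ Inverse-Gamma(shape, rate): reciprocal of a Gamma(shape, rate) draw.
    shape, rate = nu0 / 2.0, nu0 * sigma2_0 / 2.0
    sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)  # gammavariate takes a scale
    theta, y = [], []
    for x_i, n_i in zip(X, N):
        mean_i = sum(x * b for x, b in zip(x_i, beta))
        f_i = rng.gauss(mean_i, math.sqrt(sigma2))  # F_i ~ N(X_i beta, sigma^2)
        theta_i = inv_logit(f_i)                    # theta_i = logit^{-1}(F_i)
        y_i = sum(rng.random() < theta_i for _ in range(n_i))  # Binomial(N_i, theta_i)
        theta.append(theta_i)
        y.append(y_i)
    return beta, sigma2, theta, y

# Hypothetical centered covariates (each column sums to zero) and trial counts.
X = [[-1.0, 0.5], [0.0, -0.5], [1.0, 0.0]]
N = [10, 10, 10]
beta, sigma2, theta, y = simulate_hierarchy(
    X, N, sigma2_beta=1.0, nu0=4.0, sigma2_0=1.0, rng=random.Random(1)
)
print(beta, sigma2, theta, y)
```

Reading the function from top to bottom mirrors the order in which the conditional distributions appear in the model: each quantity is drawn given only the quantities above it.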
Part (a). Write down the full joint density,
\[
P\left((X_1, y_1), \ldots, (X_n, y_n), F_1, \ldots, F_n, \beta, \sigma^2 \mid \sigma_\beta^2, \nu_0, \sigma_0^2\right),
\]
in terms of the conditional distributions shown in the hierarchical model. You do not need
to explicitly write out the functional forms of the densities, just a simplified probability
expression. (5 points).
Part (b). We will now implement a JAGS version of the Bayesian Logistic Regression
sampler specified above to model households’ decisions to switch the well they are using in
Bangladesh, based on whether their wells were marked as unsafe or not. Per the description
in Gelman and Hill (page 87):
“Many of the wells used for drinking water in Bangladesh and other South Asian countries
are contaminated with natural arsenic, affecting an estimated 100 million people. Arsenic
is a cumulative poison, and exposure increases the risk of cancer and other diseases, with
risks estimated to be proportional to exposure. Any locality can include wells with a range
of arsenic levels. The bad news is that even if your neighbor’s well is safe, it does not mean
that yours is safe. However, the corresponding good news is that, if your well has a high
arsenic level, you can probably find a safe well nearby to get your water from - if you are
willing to walk the distance and your neighbor is willing to share. [In an area of Bangladesh,
a research team] measured all the wells and labeled them with their arsenic level as well
as a characterization as safe (below 0.5 in units of hundreds of micrograms per liter, the
Bangladesh standard for arsenic in drinking water) or unsafe (above 0.5). People with
unsafe wells were encouraged to switch to nearby private or community wells or to new wells
of their own construction. A few years later, the researchers returned to find out who had
switched wells.”
The link for the data can be found here. You can read in this data using read_table()
or read_delim(). Model well switching as the $y_i$ in the above model, and arsenic
levels and distance from the well as the $X_i$. Make sure to center all X-variables. Write a
JAGS version of the hierarchical model specified above, and run it for at least 1000 iterations.
Justify your choices of the prior parameter values. Attach all code, and show 95% credible intervals
for $\beta_1$, $\beta_2$, $\sigma^2$. (20 points).
Part (c). What kind of plot would you use to assess MCMC convergence to a posterior?
Choose one, and produce it here. Judging from your plot, comment on whether you believe
we have MCMC convergence to a posterior. (5 points).
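A numerical summary can complement whichever plot is chosen. The sketch below computes the basic Gelman-Rubin potential scale reduction factor from several chains, using synthetic draws in place of real sampler output. The function name and the synthetic chains are assumptions, and Python is used only for illustration; in practice the chains would come from the JAGS run.

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

rng = random.Random(42)

# Four synthetic chains that all sample the same N(0, 1) distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]

# Four synthetic chains stuck around different values (means 0, 3, 6, 9).
stuck = [[rng.gauss(3.0 * k, 1.0) for _ in range(1000)] for k in range(4)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(rhat_mixed, rhat_stuck)
```

Values near 1 are consistent with the chains agreeing with one another, while values well above 1 indicate that they have not mixed. A value near 1 does not by itself prove convergence.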
Problem 4. Bayesian Linear Regression: Swimming (10 points).
Download the file swim_time.RData from the course page. The data file contains a data
matrix Y on the amount of time, in seconds, it takes each of four high school swimmers to
swim 50 yards. Each swimmer has six times, taken on a biweekly basis.
For each swimmer j, (j = 1, 2, 3, 4), fit a Bayesian linear regression model which considers the
swimming time as the response variable and week as the explanatory variable. To formulate
your prior, use the information that competitive times for this age group generally range
from 22 to 24 seconds. As part of your answer, produce a diagnostic plot of your choice,
and comment on the suitability of the resulting model. Consider whether we have reached
MCMC convergence to a posterior, and whether your priors are reasonable. Briefly comment
on how you would consider revising the model, and how you would evaluate whether the
revised model is better than the current version.
Problem 5. Bayesian Linear Regression: Crime (15 points).
In R, load library(MASS) and then consider the dataset UScrime which contains crime rates
(y) and data on 15 explanatory variables for 47 U.S. states. A description of the variables
can be obtained by typing “?UScrime” in the R console.
Part (a). Fit a Bayesian linear regression model using uninformative priors. Obtain
marginal posterior means and 95% credible intervals for the coefficients. Describe the relationships
between crime and the explanatory variables. Which variables seem strongly predictive
of crime rates? (10 points).
Part (b). Repeat Part (a), using spike and slab priors. (5 points).
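For orientation, the sketch below draws coefficients from one common form of a spike and slab prior: each coefficient is exactly zero with some probability (the spike) and otherwise drawn from a wide normal (the slab). The inclusion probability, the slab standard deviation, and the function name are hypothetical, and the version used in the course may differ, for example by using a narrow continuous spike. Python is used for illustration only.

```python
import random

def draw_spike_slab(p, inclusion_prob, slab_sd, rng):
    """Draw p coefficients from a point-mass spike at 0 and a N(0, slab_sd^2) slab."""
    coefs = []
    for _ in range(p):
        included = rng.random() < inclusion_prob  # inclusion indicator
        coefs.append(rng.gauss(0.0, slab_sd) if included else 0.0)
    return coefs

rng = random.Random(2023)

# 2000 prior draws of a 15-coefficient vector (UScrime has 15 explanatory variables).
draws = [draw_spike_slab(15, 0.5, 10.0, rng) for _ in range(2000)]

# Fraction of coefficients sitting exactly on the spike; should be near 0.5 here.
zero_fraction = sum(c == 0.0 for d in draws for c in d) / (2000 * 15)
print(zero_fraction)
```

In a fitted model, the posterior frequency with which each indicator is switched on gives a direct measure of how strongly that variable is supported as a predictor.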