Bayesian Statistical Learning
Assignment 1 MAST90125: Bayesian Statistical Learning
There are places in this assignment where R code will be required. Therefore, set the random
seed so that the assignment is reproducible.
set.seed(123456) #Please change random seed to your student id number.
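As a quick illustration of why the seed matters, the sketch below resets the same seed twice and checks that the draws match. The seed value and the use of `rnorm` are only examples, not part of the assignment.

```r
# Example seed only; replace with your own student ID number.
set.seed(123456)
first_draw <- rnorm(3)

# Resetting the same seed restarts the pseudo-random sequence.
set.seed(123456)
second_draw <- rnorm(3)

# TRUE when both runs produced exactly the same numbers.
identical(first_draw, second_draw)
```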
Please save this R Markdown document and write your answers in it. Between your answers
to consecutive questions, leave sufficient space for marker comments by using the command
\newpage
Question One (5 marks)
In some cases, the data-generating model, e.g., g(θ), is a black box and the likelihood function cannot be
obtained. Assume that the data-generating model has only one parameter θ, and that we have two
observations, y1 = 33 and y2 = 54, that are i.i.d. given the model.
a) In such cases, how could we obtain the posterior of θ? Please write down the
procedure step by step. Hint: Approximate Bayesian Computation.
b) Could we estimate the posteriors Pr(θ|y1) and Pr(θ|y2) separately and then obtain the joint posterior as
Pr(θ|y1, y2) = Pr(θ|y1) Pr(θ|y2)? Please justify your answer using the definition of conditional probability.
c) Based on your result in b), please answer the question: if we want to obtain the posterior distribution
regarding parameters of interest in complex situations (many parameters and many observations), is
the Approximate Bayesian Computation method suitable given limited computing resources? Briefly
justify your answer.
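For orientation only, a rejection-style Approximate Bayesian Computation sampler can be sketched as below. Everything specific in it is an assumption made for illustration and does not come from the assignment: `simulate_data` uses a Poisson simulator as a stand-in for the black-box model g(θ), the prior is taken to be Uniform(0, 100), the distance is Euclidean on the sorted observations, and the tolerance of 5 is arbitrary.

```r
set.seed(123456)

y_obs <- c(33, 54)

# Hypothetical stand-in for the black-box generative model g(theta).
simulate_data <- function(theta, n) rpois(n, lambda = theta)

# Assumed prior for illustration: theta ~ Uniform(0, 100).
draw_prior <- function() runif(1, min = 0, max = 100)

abc_rejection <- function(y_obs, n_draws, tolerance) {
  accepted <- numeric(0)
  for (i in seq_len(n_draws)) {
    theta <- draw_prior()                          # 1. draw from the prior
    y_sim <- simulate_data(theta, length(y_obs))   # 2. simulate a data set
    # 3. compare simulated and observed data; the observations are
    #    exchangeable, so the comparison uses sorted values.
    distance <- sqrt(sum((sort(y_sim) - sort(y_obs))^2))
    # 4. keep theta only if the simulated data are close enough.
    if (distance <= tolerance) accepted <- c(accepted, theta)
  }
  accepted
}

posterior_draws <- abc_rejection(y_obs, n_draws = 20000, tolerance = 5)
length(posterior_draws)   # number of accepted draws
summary(posterior_draws)  # approximate posterior summary
```

The accepted values approximate the posterior only up to the chosen tolerance; a smaller tolerance gives a closer approximation but a lower acceptance rate, which is the cost that part c) asks about.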
Question Two (5 marks)
Medical researchers wish to investigate the performance of a diagnostic test. Prior studies suggest
the underlying probability of disease (event A) is a. To determine the effectiveness of the diagnostic test
(event B = testing positive), a case-control study was undertaken. Both cases and controls were added to
the study until d1 cases tested positive, and d2 controls tested negative.
a) Identify an appropriate distribution for the likelihood of $n_{\bar{B} \mid A}$, the number of cases testing negative,
and $n_{B \mid \bar{A}}$, the number of controls testing positive, including the parameter(s) of these probability mass
functions.
b) Identify a suitable conjugate prior for the parameters determined in a). Hint: Each of the priors will
depend on two hyper-parameters.
c) Determine the posterior distribution for the parameters identified in a).
Question Three (5 marks)
As part of an investigation into traffic flows, a study was proposed to count the number of vehicles passing
through an intersection each minute between 5 pm and 6 pm for one week. The researchers have decided
to assume the resulting counts are i.i.d. both within and between days.
a) Specify an appropriate likelihood for the situation above and calculate Jeffreys’ prior.
b) In this example, is Jeffreys’ prior improper? Justify your answer.
c) In this example, does Jeffreys’ prior satisfy the criterion
Posterior ∝ Likelihood?
Justify your answer.