MAST90045 Assignment 2
Due on 1 May 2023
Question 1 (50 pts)
Imagine you are an entomologist studying new species on two remote islands (island A
and island B) to develop a new taxonomy of related species. You hire local people to catch
insects and measure their body lengths. Each measurement is assumed to
be collected independently.
You have received 100 measurements of a new species from island A. A histogram of
the data looks as follows.
Given the histogram, you have decided to use a normal distribution $X^A \sim N(\mu_1, \sigma_1^2)$ as
a probabilistic model to describe the body length distribution. Let the data be denoted by
$\{x_n^A\}_{n=1}^{100}$, which is stored in X.A in the answer source file.
In settings like this, the standard approach to specifying the model (i.e., specifying
$\mu_1$ and $\sigma_1$ in this case) is maximum likelihood estimation (MLE). The idea of MLE is
that, given the parameterised probabilistic model, we search for a parameter that most
likely gives rise to the observed data. Formally, the best parameter ($\theta^*$) is a solution to
the following optimisation problem:

$$\theta^* \in \operatorname*{argmax}_{\theta \in \Theta} \prod_{n=1}^{N} f(x_n; \theta),$$

where $f$ is the PMF/PDF for each of $N$ independent and identically distributed (IID)
random variables $\{X_n\}_{n=1}^{N}$ parameterised by $\theta$, and $\{x_n\}_{n=1}^{N}$
is the corresponding observed data.
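As a minimal sketch of this optimisation problem, the R snippet below fits a normal model by maximum likelihood using optim with method="L-BFGS-B". It assumes simulated data (not the assignment's X.A), and the starting values and bounds are illustrative choices only. Because the logarithm is strictly increasing, minimising the negative log-likelihood gives the same maximiser as maximising the product of densities, and optim minimises by default.

```r
# Simulated stand-in data; NOT the assignment's X.A.
# The true values (mean 10, sd 2) are assumptions made for this sketch.
set.seed(1)
x <- rnorm(100, mean = 10, sd = 2)

# Negative log-likelihood of an IID N(mu, sigma^2) sample; theta = c(mu, sigma).
neg_loglik <- function(theta, x) {
  -sum(dnorm(x, mean = theta[1], sd = theta[2], log = TRUE))
}

# optim minimises, so we pass the negative log-likelihood.
# lower/upper define a box that keeps sigma strictly positive.
fit <- optim(
  par = c(8, 1),
  fn = neg_loglik,
  x = x,
  method = "L-BFGS-B",
  lower = c(0, 0.1),
  upper = c(20, 10)
)
theta_hat <- fit$par

# Closed-form normal MLE for comparison (1/N variance, not 1/(N - 1)).
theta_closed <- c(mean(x), sqrt(mean((x - mean(x))^2)))

print(round(theta_hat, 3))
print(round(theta_closed, 3))
```

For the normal model the numerical result can be checked against the closed-form estimates, which is a useful sanity check before moving to models without a closed form.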
The objective function

$$\prod_{n=1}^{N} f(x_n; \theta)$$

is a function of $\theta$ and called the likelihood function. It is the product of $N$ function values
obtained by evaluating the PMF/PDF at $\{x_n\}_{n=1}^{N}$ for $\theta$. The multiplication is justified by
the independence of $\{X_n\}_{n=1}^{N}$. Note that

$$\operatorname*{argmax}_{\theta \in \Theta} \log\left(\prod_{n=1}^{N} f(x_n; \theta)\right) = \operatorname*{argmax}_{\theta \in \Theta} \prod_{n=1}^{N} f(x_n; \theta),$$

and the left-hand side is typically easier to work with in both analytical and numerical
optimisation.
In this question, you are asked to practise using R built-in optimiser optim with
method="L-BFGS-B", a popular choice among applied researchers. Read its help page
and learn how to use it, especially how to use lower and upper parameters to restrict
the search space $\Theta$.
1-1 (3 pts)
Back to the island A. Let $\theta^A = (\mu_1, \sigma_1) \in \Theta^A$ where $\Theta^A = \mathbb{R} \times \mathbb{R}_{++}$ (i.e., $\sigma_1 > 0$).
Now, based on the modelling assumption $X^A \sim N(\mu_1, \sigma_1^2)$, formulate a maximisation
problem.
1-2 (4 pts)
Find a local maximiser $\hat{\theta}^A$ and round it to one decimal place. Playing with par, lower,
and upper based on the histogram helps you choose good values.
1-3 (3 pts)
Moving on to the island B. You have received 300 measurements of new species from
the island B. A histogram looks as follows.
While you are certain that there are two new species (species 2 and 3) whose size
distributions are distinct, local people were unable to tell the difference and recorded
only measurements of lengths. Encouraged by the above histogram, your conviction that
the data came from two distinct species leads to the following model.
The new species’ body length distribution consists of two normal distributions: it is
drawn from $N(\mu_2, \sigma_2^2)$ with probability $q$ and drawn from $N(\mu_3, \sigma_3^2)$ with probability
$1 - q$ where $q \in (0, 1)$. In other words, the PDF for this random variable $X^B$ is

$$f(x^B) = \frac{q}{\sqrt{2\pi\sigma_2^2}} \exp\left(-\frac{1}{2\sigma_2^2}(x^B - \mu_2)^2\right) + \frac{1 - q}{\sqrt{2\pi\sigma_3^2}} \exp\left(-\frac{1}{2\sigma_3^2}(x^B - \mu_3)^2\right).$$

As for the island A model, using the above PDF, your approach to finding the best
parameter $\theta^{B*}$ is MLE. Note that $\theta^B = (\mu_2, \sigma_2, \mu_3, \sigma_3, q) \in \Theta^B$ where
$\Theta^B = \mathbb{R} \times \mathbb{R}_{++} \times \mathbb{R} \times \mathbb{R}_{++} \times (0, 1)$. Formulate a maximisation problem.
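The two-component mixture PDF above can be sketched in R as follows. This is only an illustration under stated assumptions: the function names mix_pdf and mix_loglik and the parameter values in theta_demo are hypothetical, and the data are simulated rather than the island B measurements.

```r
# Two-component normal mixture PDF; theta = c(mu2, sigma2, mu3, sigma3, q).
mix_pdf <- function(x, theta) {
  q <- theta[5]
  q * dnorm(x, mean = theta[1], sd = theta[2]) +
    (1 - q) * dnorm(x, mean = theta[3], sd = theta[4])
}

# Log-likelihood of an IID sample under the mixture model.
mix_loglik <- function(theta, x) {
  sum(log(mix_pdf(x, theta)))
}

# Hypothetical parameter values, chosen only for illustration.
theta_demo <- c(5, 1, 9, 1.5, 0.4)

# Sanity check: a valid PDF integrates to 1 (Riemann sum on a wide, fine grid).
step <- 0.01
grid <- seq(-10, 30, by = step)
total <- sum(mix_pdf(grid, theta_demo)) * step

# Simulate from the mixture: pick a component, then draw from its normal.
set.seed(2)
from_2 <- runif(300) < theta_demo[5]
x_sim <- ifelse(from_2,
                rnorm(300, theta_demo[1], theta_demo[2]),
                rnorm(300, theta_demo[3], theta_demo[4]))

ll <- mix_loglik(theta_demo, x_sim)
print(total)
print(ll)
```

Note that the log of a sum of two densities does not split into a sum of simpler terms, unlike the single-normal case, so the log-likelihood here has to be evaluated through mix_pdf directly.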
1-4 (6 pts)
Try to find a local maximiser $\hat{\theta}^B$ in the above problem using optim. Depending on your
specification and creativity, you may or may not be able to solve it. If you succeed, report
$\hat{\theta}^B$ rounded to one decimal place. If you cannot, make a sensible conjecture about the
cause of the challenge. In either case, remember to explain your answer.
1-5 (3 pts)
You have got an idea about an alternative approach to the problem. Although $q$ is a
deterministic parameter, it represents the probability that a body length is drawn from species
2 (and $1 - q$ from species 3). So, why don’t we treat the species selection as another
random event? Specifically, define a new random variable $S$ such that

$$p(2) = q \text{ and } p(3) = 1 - q,$$

where $p(s)$ is the PMF of $S$. Then, we have the size distribution conditional on $S = s$ as
follows:

$$f(x^B \mid s) = \frac{1}{\sqrt{2\pi\sigma_s^2}} \exp\left(-\frac{1}{2\sigma_s^2}(x^B - \mu_s)^2\right).$$

Find an expression for $f(x^B, s)$, the joint PDF for a mix of continuous $X^B$ and discrete $S$.
Remember to be explicit about where $f(x^B, s) > 0$ and where $f(x^B, s) = 0$.
1-6 (4 pts)
Marginalise $f(x^B, s)$ over $S$ to recover the marginal PDF $f(x^B)$ and marginalise $f(x^B, s)$
over $X^B$ to recover the marginal PMF $p(s)$.
1-7 (7 pts)
At this point, you come to regret not telling the local people how to distinguish two
species so that they can also record the species for each measurement. Why? Explain in
detail how the knowledge of species for each measurement could make your MLE
easier.
Note that the independence of each observation implies that the likelihood of $\theta^B$ given
by the joint PDF for all 300 pairs of random variables $\{(X_n^B, S_n)\}_{n=1}^{300}$ would be simply the
product of each component PDF $f(x_n^B, s_n; \theta^B)$:

$$\prod_{n=1}^{300} f(x_n^B, s_n; \theta^B).$$

1-8 (5 pts)
You are tenacious and start thinking along the following lines. Now, the species selection $S$
is a random variable, and the knowledge of species for each measurement could make
the MLE easier. Although $S$ is unmeasured, we can derive its distribution as a function of
$\theta^B$. So, to eliminate the dependency on $\{s_n\}_{n=1}^{300}$, why don’t we take the expectation of
the MLE objective function with respect to $\{S_n\}_{n=1}^{300}$ and then maximise it? In particular,
to make use of the measurement data $\{x_n^B\}_{n=1}^{300}$, we should take the expectation with
respect to the PMF of $\{S_n\}_{n=1}^{300}$ conditional on $\{x_n^B\}_{n=1}^{300}$,

$$p(\mathbf{s} \mid \mathbf{x}^B; \theta^B),$$

where $\mathbf{s} = (s_1, s_2, \ldots, s_{300})$ and $\mathbf{x}^B = (x_1^B, x_2^B, \ldots, x_{300}^B)$.
Derive an expression for $p(\mathbf{s} \mid \mathbf{x}^B; \theta^B)$.
1-9 (8 pts)
Given the derived $p(\mathbf{s} \mid \mathbf{x}^B; \theta^B)$, here is the formal description of the above algorithm. In
search of $\theta^{B*}$,
0. Begin with some initial $\theta^B_{\text{old}}$.
1. With $\theta^B_{\text{old}}$, specify $p(\mathbf{s} \mid \mathbf{x}^B; \theta^B_{\text{old}})$.
2. Find a local maximiser $\theta^B_{\text{new}}$ by solving
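
$$\max_{\theta^B \in \Theta^B} \mathbb{E}_{p(\mathbf{s} \mid \mathbf{x}^B; \theta^B_{\text{old}})}\left[\log\left(\prod_{n=1}^{300} f(x_n^B, s_n; \theta^B)\right)\right],$$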