STAT2004 Assignment 3

Due: 08 October 2021 @ 4pm

Each of the first four questions is worth 10 marks, so you can get a maximum of 40 marks, for a total of 10%. Note that some questions involve interpretation and communication of results in the form of an audio recording, which you upload to Blackboard as an audio file. Question 5 is a bonus question. A complete and correct solution to this question earns you an extra 1% for this assignment.

Question 1 (Bias and MSE of an MLE) [10 marks]

For $\lambda > 0$, define the function

$$g_\lambda(y) = \begin{cases} \lambda y^{-2}, & y \ge \lambda, \\ 0, & \text{otherwise.} \end{cases}$$

(a) (1 mark) Show that $g_\lambda$ is a probability density function (pdf) for every $\lambda > 0$.

Let $X = (X_1, \dots, X_n)$ with $X_1, \dots, X_n$ iid random variables. Assume that each $X_i$ has a continuous distribution with pdf $g_\lambda$, where the parameter $\lambda > 0$ is unknown.

(b) (1 mark) Given a realization $x = (x_1, \dots, x_n)$ of $X$, write down the corresponding likelihood function (considered as a function of $\lambda$).
(c) (2 marks) Find the ML estimate $T(x)$ for the unknown parameter $\lambda$ based on the realization $x$. Also give the ML estimator $T(X)$.
(d) (1 mark) Compute $P_\lambda(T(X) > t)$ for $t \in \mathbb{R}$.
(e) (1 mark) Based on the expression you found for $P_\lambda(T(X) > t)$, compute the cumulative distribution function (cdf) $F_\lambda$ of $T(X)$.
(f) (1 mark) Based on $F_\lambda$, compute the pdf of $T(X)$.
(g) (1 mark) Compute the bias of $T(X)$, i.e., compute $E_\lambda(T(X) - \lambda)$.
(h) (2 marks) Compute the Mean Square Error (MSE) of $T(X)$.

Question 2 (Comparing estimators) [10 marks]

Let $X_1$, $X_2$ and $X_3$ be the diameters of three randomly sampled berries from a Uniform$[\theta, 2\theta]$ population, where $\theta > 0$ is a model parameter.

(a) (2 marks) Find the method-of-moments estimator of $\theta$.
(b) (4 marks) Find the MLE, $\hat{\theta}$, and find a constant $k$ such that $E_\theta(k\hat{\theta}) = \theta$.
(c) (2 marks) Compute both the method-of-moments and the ML estimates of $\theta$ based on the following three observations of berry sizes (in centimetres) of wine grapes: 1.29, 0.86, 1.34.
(d) (2 marks) [Audio Question]: Which of the two competing estimators is better? Briefly justify your answer.

Question 3 (Confidence interval for an upper bound) [10 marks]

Let $U_1, U_2, \dots, U_n$ be an iid sample from the standard uniform $U(0, 1)$ distribution, and let $U_{(n)} = \max\{U_1, U_2, \dots, U_n\}$ be the sample maximum.

(a) (2 marks) Show that the cumulative distribution function for $U_{(n)}$ is given by

$$P(U_{(n)} \le u) = \begin{cases} 0, & u < 0, \\ u^n, & 0 \le u \le 1, \\ 1, & u > 1. \end{cases}$$

(b) (1 mark) Using part (a), or otherwise, find critical values $a$ and $b$ such that $P(a \le U_{(n)} \le b) = 90\%$.

Now let $X_1, X_2, \dots, X_n \overset{\text{iid}}{\sim} U(0, \theta)$, where the upper bound $\theta \ge 0$ is a model parameter.

(c) (2 marks) Show that the MLE of $\theta$ is given by $X_{(n)} = \max\{X_1, X_2, \dots, X_n\}$.
(d) (2 marks) Argue why the transformation $X_{(n)}/\theta$ is a pivot variable.
(e) (2 marks) A sample of $n = 8$ observations is taken from this distribution, giving the following data values:

1.421 1.197 0.476 1.106 2.356 1.856 0.581 1.299

Using parts (b) and (d), or otherwise, construct and interpret a 90% confidence interval for $\theta$ based on these data.
(f) (1 mark) [Audio question]: A STAT2004 student concludes that "there is a 90% chance that the true $\theta$ is contained within the interval calculated in part (e)". Do you agree with this conclusion? Briefly explain why, or why not.

Question 4 (Hypothesis testing for a failure probability) [10 marks]

Suppose that a current industry-standard engineering model predicts that a certain type of concrete block will have a 40% chance of failure when subject to a two-tonne load. A civil engineer wants to test the accuracy of this claim by obtaining 50 independent samples of such blocks, subjecting each to a two-tonne load and then recording the total number $X$ of blocks that broke. She decides that she will reject the claim $H_0 : p = 0.4$ in favour of $H_1 : p \ne 0.4$ if $X \le 15$ or $X \ge 25$.

(a) (1 mark) What is the significance level $\alpha$ of this test?
(b) (1 mark) Graph the power of this test as a function of $p$.
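
The rejection rule stated above (reject when X ≤ 15 or X ≥ 25 out of 50 blocks) can be evaluated numerically for any failure probability p. The following is a minimal sketch, assuming X follows a Binomial(50, p) distribution because the blocks are tested independently; the function names `binomial_pmf` and `rejection_probability` are illustrative and do not come from the assignment.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Return P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def rejection_probability(p: float, n: int = 50, lower: int = 15, upper: int = 25) -> float:
    """Return P(X <= lower or X >= upper) for X ~ Binomial(n, p).

    The defaults mirror the rule in the question text; they are
    assumptions of this sketch and can be changed freely.
    """
    return sum(
        binomial_pmf(k, n, p)
        for k in range(n + 1)
        if k <= lower or k >= upper
    )


if __name__ == "__main__":
    # Tabulate the rejection probability on a coarse grid of p values.
    for i in range(11):
        p = i / 10
        print(f"p = {p:.1f}  P(reject) = {rejection_probability(p):.4f}")
```

Summing the exact binomial probabilities avoids a normal approximation, which can be inaccurate in the tails for n = 50. Plotting the tabulated values against p with any graphing tool gives the shape of the curve.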
The civil engineer proposes an alternative model for predicting the chance of failure of concrete blocks. Under the alternative model, the probability of failure for this type of concrete block is predicted to be 55%. She wants to test whether her new model is more accurate than the current industry standard.

(c) (3 marks) Construct a likelihood ratio test (LRT) of $H_0 : p = 0.4$ versus $H_1 : p = 0.55$ based on the number of failures $X$ from 50 trials.
(d) (1 mark) Show that the LRT is equivalent to the test with rejection region $\{X \ge c\}$ for some cutoff $c$.
(e) (1 mark) Changes in industry standards should not be taken lightly, so a significance level of 1% is often used in engineering instead of 5%. Find the cutoff value $c$ for carrying out this test at the 1% significance level.
(f) (1 mark) If the alternative model is actually correct, what is the power of this test?
(g) (2 marks) What sample size is needed to increase the power of this test to 90%, while keeping the level of the test at 1%?

Question 5 (Bonus Question) [4 marks]

In some applied scenarios, it may be more realistic for the measurement variability to increase as the measurement itself increases. For example, an experimenter might take more care when measuring the size of a smaller unit (say, 1 cm in length) but less care when measuring the size of a larger unit (say, 10 cm in length). Instead of using a $N(\mu, \sigma^2)$ distribution to model such measurements, one may consider the model $N(\mu, \mu^2)$, in which the variance increases with the mean.

Let $X_1, X_2, \dots, X_n \overset{\text{iid}}{\sim} N(\mu, \mu^2)$, where the mean $\mu > 0$ is an unknown parameter.

(a) (2 marks) Suggest a pivot transformation for this problem.
(b) (2 marks) Using your result from part (a), or otherwise, explain how a 90% confidence interval for $\mu$ can be constructed based on the observations $X_1, X_2, \dots, X_n$.
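
The N(µ, µ²) model in Question 5 sets the standard deviation equal to the mean, so larger units are measured with proportionally more spread. The simulation below is a minimal sketch of that behaviour, using the 1 cm and 10 cm unit sizes from the example in the question; the function name `simulate_measurements`, the sample size and the seed are illustrative assumptions, not part of the assignment.

```python
import random
import statistics


def simulate_measurements(mu: float, n: int, rng: random.Random) -> list[float]:
    """Draw n values from N(mu, mu^2), i.e. standard deviation equal to mu."""
    return [rng.gauss(mu, mu) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(2004)  # fixed seed so repeated runs agree
    for mu in (1.0, 10.0):
        sample = simulate_measurements(mu, 5000, rng)
        print(
            f"mu = {mu:5.1f}  "
            f"sample mean = {statistics.mean(sample):7.3f}  "
            f"sample sd = {statistics.stdev(sample):7.3f}"
        )
```

In repeated runs the sample standard deviation should sit close to the chosen µ, in contrast to a N(µ, σ²) model, where the spread would stay fixed as µ changes.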