MTH5120: Statistical Modelling I
You should attempt ALL questions. Marks available are shown next to the questions.
In completing this assessment, you may use books, notes, and the internet. You
may use calculators and computers, but you should show your working for any
calculations you do. You must not seek or obtain help from anyone else.
At the start of your work, please copy out and sign the following declaration:
I declare that my submission is entirely my own, and I have not sought or obtained
help from anyone else.
All work should be handwritten, and should include your student number.
You have 24 hours in which to complete and submit this assessment. When you have
finished your work:
• scan your work, convert it to a single PDF file and upload this using the upload tool
  on the QMplus page for the module;
• e-mail a copy to [email protected] with your student number and the module code in
  the subject line;
• with your e-mail, include a photograph of the first page of your work together with
  either yourself or your student ID card.
You are not expected to spend a long time working on this assessment. We expect you to
spend about 2 hours completing the assessment, plus the time taken to scan and upload
your work. Please try to upload your work well before the end of the assessment period, in
case you experience computer problems. Only one attempt is allowed – once you have
submitted your work, it is final.
IFoA exemptions
This module counts towards IFoA actuarial exemptions. For your submission to be eligible for
IFoA exemptions, you must submit within the first 3 hours of the assessment period. You
may then submit a second version later in the assessment period if you wish, which will count
only towards your degree. There are two separate upload tools on the QMplus page to enable
you to submit a second version of your work.
Examiners: L I Pettit, D S Coad
© Queen Mary University of London (2020) Continue to next page
MTH5120 (2020) Page 2
Question 1 [15 marks].
You should answer this question using a calculator. You should show all working.
(a) For the data given below, find the values of $\bar{x}$, $\bar{y}$, $S_{xx}$, $S_{yy}$ and $S_{xy}$. [5]
x    7    6    5    1    5    4    7    3      4
y   97   86   78   10   75   62  101   39  $y_9$
Note: The value of $y_9$ is equal to 43 plus the sum of the seventh and ninth digits of
your student number.
(b) Hence find the least squares estimates of $\beta_0$ and $\beta_1$ in the simple linear regression model
of y on x. [3]
(c) Construct the Analysis of Variance table and use it to test the hypothesis that $\beta_1 = 0$ against a
two-sided alternative at the 5% significance level. [7]
You are given the following values from R:
qf(0.95,1,6)= 5.99, qf(0.95,1,7)= 5.59, qf(0.95,1,8)= 5.32,
qf(0.95,1,9)= 5.12, qf(0.95,2,6)= 5.14, qf(0.95,2,7)= 4.74,
qf(0.95,2,8)= 4.46.
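As a way of checking hand calculations for parts (a) to (c), the following Python sketch works through the standard formulas for the corrected sums of squares, the least squares estimates and the ANOVA F statistic. The value y9 = 53 is purely hypothetical (it assumes the two digits sum to 10) and must be replaced by your own value. The variable names are illustrative and do not come from the paper.

```python
# Sketch for Question 1: summary statistics, least squares estimates, ANOVA.
# ASSUMPTION: y9 = 53 is hypothetical (43 plus an assumed digit sum of 10).

x = [7, 6, 5, 1, 5, 4, 7, 3, 4]
y9 = 53  # hypothetical; substitute 43 + (7th digit) + (9th digit)
y = [97, 86, 78, 10, 75, 62, 101, 39, y9]
n = len(x)

# (a) Sample means and corrected sums of squares and products.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# (b) Least squares estimates of the slope and intercept.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# (c) ANOVA decomposition: total = regression + residual.
ss_total = syy
ss_reg = sxy ** 2 / sxx
ss_res = ss_total - ss_reg
df_reg, df_res = 1, n - 2
ms_reg = ss_reg / df_reg
ms_res = ss_res / df_res
f_stat = ms_reg / ms_res

# With n = 9 the residual degrees of freedom are 7, so the relevant
# critical value among those given is qf(0.95, 1, 7) = 5.59.
F_CRIT = 5.59
reject_h0 = f_stat > F_CRIT

print("x_bar, y_bar:", x_bar, y_bar)
print("Sxx, Syy, Sxy:", sxx, syy, sxy)
print("b0, b1:", b0, b1)
print("F:", f_stat, "on", df_reg, "and", df_res, "df; reject H0:", reject_h0)
```

The centred form of each sum is used rather than the shortcut form because it is less sensitive to rounding; both give the same values by hand.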
Question 2 [34 marks]. Crickets are insects which make a characteristic chirping sound.
It has been observed that the frequency of chirps seems to be related to the temperature. A
biologist measured the number of chirps (y) per second a cricket made at various
temperatures (x) measured in degrees Fahrenheit.
(a) State the assumptions made about the errors in a simple linear regression model. [2]
(b) The biologist fitted a simple linear regression model to the data and obtained the
following plot. Comment on whether a simple linear regression model seems to fit the
data. [3]
(c) The following output was obtained from R. Two quantities in the Table of Coefficients
are recorded as A and B. Find their values. [4]
> crickets<-lm(y~x)
> summary(crickets)

Call:
lm(formula = y ~ x)

Residuals:
     Min       1Q   Median       3Q      Max
-1.62746 -0.56464  0.08213  0.76762  1.54563

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.46951    2.96747       A 0.876716
x                  B    0.03727   5.447 0.000112 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9799 on 13 degrees of freedom
Multiple R-squared:  0.6953,    Adjusted R-squared:  0.6719
F-statistic: 29.67 on 1 and 13 DF,  p-value: 0.0001119
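In a table of coefficients of this kind, each t value is the estimate divided by its standard error. The short Python sketch below applies that relationship to the numbers printed in the output; the helper name t_value is illustrative only. It also checks that, in simple linear regression, the F statistic is the square of the slope's t value.

```python
# Sketch: relationship between Estimate, Std. Error and t value
# in the Table of Coefficients. All numbers are copied from the output above.

def t_value(estimate, std_error):
    """t statistic for testing that a coefficient equals zero."""
    return estimate / std_error

# Intercept row: estimate and standard error are given, t value (A) is not.
a = t_value(0.46951, 2.96747)

# Slope row: standard error and t value are given, estimate (B) is not,
# so rearrange t = estimate / std_error.
b = 5.447 * 0.03727

# Consistency check for simple linear regression: F = t^2 for the slope.
f_from_t = 5.447 ** 2

print("A is approximately", round(a, 3))
print("B is approximately", round(b, 4))
print("t^2 for the slope:", round(f_from_t, 2), "(reported F-statistic: 29.67)")
```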
(d) Two t-tests can be carried out using this output. For each, write down the null and
alternative hypotheses, give the distribution of the test statistic if H0 is true, and state your
conclusions. [8]
(e) The biologist looked at the plot of standardised residuals versus x.
(i) Comment on this plot. [3]
(ii) The biologist obtained the following output. Explain which assumption he is
checking and what conclusion he should draw. What graph could he have looked at to
examine this assumption? [4]
> shapiro.test(stdres)
Shapiro-Wilk normality test
data: stdres
W = 0.96628, p-value = 0.7996
(iii) The biologist is interested in predicting the mean number of chirps per second
when the temperature is 85 degrees Fahrenheit. He obtains the following output.
Explain what it shows. [5]
> pred1 <- predict(crickets, newdata=data.frame(x=85), interval='confidence')
> pred1
       fit      lwr     upr
1 17.72361 17.01161 18.4356
(iv) The biologist wonders if it would be possible to predict the temperature based on
the number of chirps. Discuss how you could do that. Looking at the data,
comment on how good the prediction would be. [5]
Question 3 [19 marks]. For the general linear model $Y = X\beta + \varepsilon$, where $\varepsilon$ is a vector of
errors assumed to be uncorrelated with zero mean and constant variance $\sigma^2$, the formula for
the least squares estimator $\hat{\beta}$ is
$\hat{\beta} = (X^T X)^{-1} X^T Y$.
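The estimator formula above can be illustrated numerically. The following Python sketch evaluates it for the special case of a design matrix with two columns (an intercept and one explanatory variable), where the inverse of the 2 x 2 matrix can be written out explicitly. The data and the function name least_squares_2col are invented for illustration and do not come from the paper; a general implementation would need a proper matrix inverse or a linear solver.

```python
# Sketch: least squares estimator (X^T X)^{-1} X^T Y for a two-column
# design matrix [1, x]. ASSUMPTION: the data below are made up.

def least_squares_2col(x, y):
    """Return [b0, b1] by forming and solving the normal equations."""
    n = len(x)
    # Entries of X^T X for X with a column of ones and a column x.
    s1, sx, sxx = n, sum(x), sum(xi * xi for xi in x)
    # Entries of X^T Y.
    sy, sxy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
    # Explicit inverse of the 2 x 2 matrix [[s1, sx], [sx, sxx]].
    det = s1 * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular: x values are all equal")
    inv = [[sxx / det, -sx / det],
           [-sx / det, s1 / det]]
    # Multiply (X^T X)^{-1} by X^T Y.
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    return [b0, b1]


# Illustrative data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
beta_hat = least_squares_2col(x, y)
print(beta_hat)
```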