Hand in your solutions in person to the Statistical Science Teaching & Learning Office.
Declaration: I am aware of the UCL Statistical Science Department’s regulations on plagiarism for assessed coursework. I have read the guidelines in the student handbook and understand what constitutes plagiarism.
I hereby affirm that the work I am submitting for this in-course assessment is entirely my own.
This assessment consists of two parts. For Part A you can submit hand-written solutions. For Part B you are required to write a report: use a text editor and hand in a printed copy.
Parts A and B are both marked on a 0–100 scale, but Part A counts for 40% and Part B for 60% towards the final mark for this assessment. Marks for the constituent parts are listed in bold face. For Part A, marks are given for correct answers, but also for clarity of explanation. This assessment as a whole counts for 50% towards the final course mark.
On the cover sheet, write your name and student ID number. To allow anonymous marking, provide your student ID number (and not your name) at the top of each of the sheets that you use for Parts A and B.
$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk}$, (1)
for $i, j, k = 1, \ldots, p$. Assume $\varepsilon_{ijk} \sim N(0, \sigma^2)$ independently for all triplets $(i, j, k)$.
(i) Write down the sum-to-zero constraints for the effect parameters. How many parameters in total are specified by the model? Give your answer as a function of p. How many of those are estimated independently given data for this Latin square design? [7]
Data are available for a design with p = 5. Assume that the data are from an experiment regarding the effect of compost on the height of seedlings 30 days after sowing. The five compost treatments A, B, C, D, E are denoted by i = 1, 2, 3, 4, 5, respectively. The seedlings are sown in a large square tray which is divided into 25 small squares of equal size. Compost treatment i for row j and column k is determined by the Latin square design given by:
A | B | C | D | E |
D | E | A | B | C |
E | A | B | C | D |
B | C | D | E | A |
C | D | E | A | B |
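As a quick sanity check on the layout above, the following Python sketch verifies the defining Latin square property: each treatment occurs exactly once in every row and in every column. The names `design` and `is_latin_square` are illustrative only and are not part of the assessment materials.

```python
# Design copied from the table above: design[j][k] is the compost
# treatment assigned to row j+1 and column k+1 of the tray.
design = [
    ["A", "B", "C", "D", "E"],
    ["D", "E", "A", "B", "C"],
    ["E", "A", "B", "C", "D"],
    ["B", "C", "D", "E", "A"],
    ["C", "D", "E", "A", "B"],
]


def is_latin_square(square):
    """Return True if every symbol occurs exactly once per row and per column."""
    p = len(square)
    symbols = set(square[0])
    # A p x p Latin square uses exactly p distinct symbols.
    if len(symbols) != p:
        return False
    # Every row must have length p and contain each symbol once.
    if not all(len(row) == p and set(row) == symbols for row in square):
        return False
    # Every column must contain each symbol once.
    return all({row[k] for row in square} == symbols for k in range(p))


print(is_latin_square(design))
```

A layout that fails this check would not allow the row, column and treatment effects in model (1) to be separated in the usual way.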
In each small square, 10 seedlings are sown, and their mean height after 30 days is the response for the experiment.
The data are uploaded to Moodle in R format; see QuestionA1 Data.txt. In the data, Y is the response variable, col and row are the columns and rows in the above design, and compost is the treatment variable.
For the following questions, you can use software but do not hand in the code. Handed-in code will be ignored in the marking.
(ii) Fit model (1) and print out two graphs for model diagnostics: (a) residuals against fitted values, and (b) a quantile-quantile plot for the residuals. Hand in these printouts. Briefly discuss the graphs with reference to model assumptions. [6]
(iii) Report the ANOVA table. Assuming a significance level of 5%, is there a significant effect of the treatment? Briefly explain in words what this means for the current experiment. [5]
(iv) Derive the mathematical expression for a 95% confidence interval for the contrast $\alpha_5 - \alpha_4$. Estimate this confidence interval using the data, and explain how you obtained the relevant quantities for this estimation. Interpret the estimated interval for the current experiment, taking into account your answer to (iii). [12]