STAT 441, ASSIGNMENT 2
1. Return to Problem 2 of the previous homework, Assignment 1. You do not need the original operational data, the variables Murder and Latitude, if you kept in your R workspace the objects for the 3 models you fitted in Assignment 1. Recall those were
(i) linear: y = α + βx, (ii) quadratic: y = α + βx + γx², (iii) reciprocal: y = 1/(α + βx).
If you did not keep the objects and need to refit those models for this assignment using Murder and Latitude, then do it, but without much further ado; only make sure (double check!) that you obtain the same parameter estimates as in Assignment 1 (otherwise its grading will have to be revised, and possibly not for the better).

And now, here is your performance evaluation dataset, data for the Canadian provinces and territories: m2019 is the murder rate in 2019, and d, m, s are respectively the degrees, minutes, and seconds of the latitude of the center of the corresponding Canadian province or territory. (I'll make the dataset available, although typing it in, as I did, does not take too much time either.)
d m s m2019
AB 55 10 11 2.29
BC 54 45 16 1.78
MN 54 55 46 5.23
NB 46 37 01 1.96
NL 52 53 23 0.96
NS 45 08 45 0.63
ON 50 26 43 1.69
PE 46 23 25 1.31
QC 53 23 34 0.92
SK 54 25 05 4.68
YT 63 37 52 2.52
NT 66 22 49 4.52
NU 71 01 10 17.81

Of course, before doing anything with these data, you need to make them compatible with the US data first. (How? That is a sort of high school mathematical problem. Thus, no extensive commentary is needed on this; just do not forget to do it.) And now:

(a) Evaluate the performance of your fits of all three models on the Canadian data and comment on the results. Is the fit you proposed in Assignment 1 a winner? (There is no point in changing your mind now, it all sits submitted there. And your grade does not depend on whether you proposed a winner or not. It depends on whether the analysis is done right.)

(b) Evaluate the performance of your fits of all three models on the Canadian data, but now only on provinces, not on territories (you know which are those, don't you?). Comment again on the result; do not forget to mention whether your bet in Assignment 1 happens to be a winner in this case or not.
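The compatibility step can be sketched in R as below. This is only a minimal sketch under a stated assumption: that the US Latitude variable is recorded in decimal degrees, so that degrees, minutes, and seconds combine as d + m/60 + s/3600. The data frame name `canada` and the new column name `Latitude` are choices made for this illustration, not names given in the assignment.

```r
# Canadian evaluation data, typed in exactly as listed in the assignment.
canada <- data.frame(
  d     = c(55, 54, 54, 46, 52, 45, 50, 46, 53, 54, 63, 66, 71),
  m     = c(10, 45, 55, 37, 53,  8, 26, 23, 23, 25, 37, 22,  1),
  s     = c(11, 16, 46,  1, 23, 45, 43, 25, 34,  5, 52, 49, 10),
  m2019 = c(2.29, 1.78, 5.23, 1.96, 0.96, 0.63, 1.69,
            1.31, 0.92, 4.68, 2.52, 4.52, 17.81),
  row.names = c("AB", "BC", "MN", "NB", "NL", "NS", "ON",
                "PE", "QC", "SK", "YT", "NT", "NU")
)

# Assumption: the US Latitude is in decimal degrees.
# One degree has 60 minutes and one minute has 60 seconds.
canada$Latitude <- with(canada, d + m / 60 + s / 3600)

print(round(canada[, c("Latitude", "m2019")], 3))
```

Naming the converted column after the predictor used in the fitted models is convenient if the fits are later evaluated with `predict(..., newdata = canada)`, but whether that works directly depends on how each model was fitted in Assignment 1.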
(This assignment has 2 pages.)
2. Lecture 9 says on page 8:

Scaling of the variables entails that the rôle of the variance-covariance matrix in principal components is taken over by the correlation matrix.

(a) Explain what it means.

(b) Justify it in an "R experiment"; that is, take some suitable dataset, for instance USArrests once again, and calculate... not the principal components or directions themselves, but the matrices in your explanation.

(c) Once you achieve success in (b), give a mathematical proof why it works. (It is fine to work componentwise here.)

This is neither a pure mathematical, nor a pure computational exercise. Something of both. Show code and results for (b), give mathematics for (c), and a mathematical/verbal explanation for (a).
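One possible shape of the "R experiment" in (b) is sketched below. It is an illustration under assumptions, not the intended solution: it takes "scaling" to mean dividing each centered variable by its sample standard deviation (what `scale()` does by default), and the object names `S`, `R`, `Z`, and `D` are chosen here for convenience.

```r
# USArrests ships with base R (package datasets).
X <- as.matrix(USArrests)

S <- cov(X)   # variance-covariance matrix of the original variables
R <- cor(X)   # correlation matrix of the original variables

# Assumption: "scaling" = center each column, then divide by its sample sd.
Z <- scale(X)

# The variance-covariance matrix of the scaled data, compared with R.
same_as_cor <- all.equal(cov(Z), R, check.attributes = FALSE)

# The same relation in matrix form: D S D with D = diag(1 / sd_j).
D <- diag(1 / sqrt(diag(S)))
same_matrix_form <- all.equal(D %*% S %*% D, R, check.attributes = FALSE)

print(c(same_as_cor = isTRUE(same_as_cor),
        same_matrix_form = isTRUE(same_matrix_form)))
```

The comparison uses `all.equal()` rather than `==` because the two matrices are computed along different routes and may differ by floating-point rounding.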
3. Consider the dissimilarity matrix of the Crime data (page 4 of Lecture 11, also available as crime.R from the Datasets folder in eClass).

(a) Is the dissimilarity represented by this matrix a metric?

(b) And how about that on page 1 of Lecture 11? You may want to look only at the matrix Fairorg, if you understand how Fairmont was produced out of it. (Both are contained in fairmont.R, also in the Datasets folder.)

Use the methodology that you believe should provide you with the correct result in the most convenient way. (Slovak proverb: If it's not in the head, it's in the feet.)
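Checking the metric axioms by machine rather than by eye might look like the sketch below. The function `is_metric()` is a hypothetical helper written for this illustration (it is not part of base R or of the course files), and the two small matrices are toy examples, not the Crime or Fairorg data. It checks symmetry, a zero diagonal, strictly positive off-diagonal entries, and the triangle inequality d(i, j) ≤ d(i, k) + d(k, j) for all triples.

```r
# Hypothetical helper: checks the metric axioms on a square dissimilarity matrix.
is_metric <- function(D, tol = 1e-8) {
  D <- as.matrix(D)
  n <- nrow(D)
  symmetric    <- isTRUE(all.equal(D, t(D), check.attributes = FALSE))
  zero_diag    <- all(abs(diag(D)) <= tol)
  positive_off <- all(D[row(D) != col(D)] > 0)
  triangle <- TRUE
  for (k in seq_len(n)) {
    # via_k[i, j] = D[i, k] + D[k, j]: the length of the detour through k.
    via_k <- outer(D[, k], D[k, ], "+")
    if (any(D > via_k + tol)) {
      triangle <- FALSE
      break
    }
  }
  c(symmetric = symmetric, zero_diag = zero_diag,
    positive_off = positive_off, triangle = triangle)
}

# Toy example 1: Euclidean distances between three points on a line.
good <- as.matrix(dist(c(0, 1, 3)))

# Toy example 2: d(1,3) = 5 exceeds d(1,2) + d(2,3) = 2.
bad <- matrix(c(0, 1, 5,
                1, 0, 1,
                5, 1, 0), nrow = 3, byrow = TRUE)

print(is_metric(good))
print(is_metric(bad))
```

The same call could be applied to whatever matrix object crime.R or fairmont.R creates once it is sourced. If zero dissimilarities between distinct objects are tolerated (a pseudometric), the `positive_off` component can simply be ignored.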