Statistics
Problem 1 (25 points) Answer the following questions:

a. Consider the multiple regression model Y = Xβ + ε with E(ε) = 0 and var(ε) = σ²I. In addition, ε ∼ N(0, σ²I). Show that the F test for the general linear hypothesis (see the formula below), when used to test the overall significance of the model, is the same as MSR/MSE, where MSR = SSR/k and MSE = SSE/(n−k−1).

   F = (Cβ̂ − γ)′ [C(X′X)⁻¹C′]⁻¹ (Cβ̂ − γ) / (m·s_e²),

where m is the number of constraints (rows of C). (F test for the general linear hypothesis.)

b. Suppose X is the initial matrix in a multiple regression problem. We then add an extra predictor z, so the regression matrix is now W = (X, z). Show that the last diagonal element of (W′W)⁻¹ is equal to 1/(z*′z*), where z* is the residual vector from the regression of z on X.

c. Consider the multiple regression model Y = Xβ + ε, with E(ε) = 0, var(ε) = σ²I, and ε ∼ N(0, σ²I). Prove that if a constant c is subtracted from each yᵢ, then the F statistic for testing the hypothesis that any subvector β^(0) of β is equal to zero is unchanged. If the model does not include the intercept β₀, will the F statistic change? Explain.

d. A multiple regression problem involving the regression of y on a constant, x₁, and x₂ gave the following results:

   ŷᵢ = 4 + 0.4xᵢ₁ + 0.9xᵢ₂,   R² = 8/60,   e′e = 520,   n = 29,

   X′X = [ 29   0   0
            0  50  10
            0  10  80 ],

   (X′X)⁻¹ = [ 0.034   0.000   0.000
               0.000   0.021  −0.003
               0.000  −0.003   0.013 ].

Using α = 0.05, test the hypothesis that the two slopes sum to 1. Then test the hypothesis that β₁ = 0 by using the extra sum of squares principle, again with α = 0.05.

Problem 2 (25 points) Answer the following questions:

a. Consider the multiple regression model with no intercept, Y = Xβ + ε, with E(ε) = 0 and var(ε) = σ²I. In addition, ε ∼ N(0, σ²I). Find the information matrix and explain why (or why not) β̂ is an efficient estimator of β. Is S_e² an efficient estimator of σ²?

b. Consider the multiple regression model with no intercept, Y = Xβ + ε, with E(ε) = 0 and var(ε) = σ²I. In addition, ε ∼ N(0, σ²I). Find the variance of the t statistic when we test the hypothesis H₀: β₁ + β₂ = 1. Please include in your answer elements of (X′X)⁻¹, denoted by v_ij.

c. Let Y = (Y₁, Y₂, …, Yₙ)′ and Y ∼ N(μ, σ²I). Show that the sample mean Ȳ is independent of the vector of deviations (Y₁ − Ȳ, Y₂ − Ȳ, …, Yₙ − Ȳ)′, and therefore that Ȳ is independent of s², where s² is the sample variance of Y₁, Y₂, …, Yₙ.

d. Let Y = (Y₁, Y₂, …, Yₙ)′ and Y ∼ N(μ, σ²I). Express ∑ᵢ₌₁ⁿ (Yᵢ − Ȳ)² in the form Y′AY. By using theorem (i) of handout #35, show that (n−1)S²/σ² ∼ χ²_{n−1}, where S² is the sample variance of Y₁, Y₂, …, Yₙ.

Problem 3 (25 points) Answer the following questions:

a. Consider the multiple regression model Y = Xβ + ε subject to a set of linear constraints of the form Cβ = γ, where C is an m × (k+1) matrix. The Gauss-Markov conditions hold and also ε ∼ N(0, σ²I). Let C = (C₁, C₂) and β = (β₁′, β₂′)′ be partitioned conformably. Use distribution theory results to show that (n − k − 1 + m)·S_ec²/σ² ∼ χ²_{n−k−1+m}, where S_ec² is the unbiased estimator of σ² under the constraint.

b. Consider the multiple regression model Y = Xβ + ε subject to a set of linear constraints of the form Cβ = γ, where C is an m × (k+1) matrix. The Gauss-Markov conditions hold and also ε ∼ N(0, σ²I). Let C = (C₁, C₂) and β = (β₁′, β₂′)′ be partitioned conformably. We will use the canonical form of the model. Find var(β̂_1c), var(β̂_2c), and cov(β̂_1c, β̂_2c), and place them in a matrix to show how these are related to var(β̂_c), where β̂_c is the constrained least squares estimate obtained using the method of Lagrange multipliers.
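Parts (a) and (b) of Problem 3 concern the constrained least squares fit. The following is a small Monte Carlo sketch, with an assumed design, error variance, and pair of constraints that are not taken from the exam; it only illustrates the degrees-of-freedom claim in part (a): when Cβ = γ actually holds, e_c′e_c/σ² averages about n − k − 1 + m, consistent with the stated χ²_{n−k−1+m} distribution.

```python
# A small Monte Carlo sketch (assumed design, error variance, and constraints; nothing
# here comes from the exam). When the constraint C beta = gamma actually holds, the
# constrained residual sum of squares e_c'e_c, divided by sigma^2, should average about
# n - k - 1 + m, consistent with the chi-square result stated in part (a).
import numpy as np

rng = np.random.default_rng(1)
n, k, m, sigma2 = 30, 4, 2, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # constant plus k predictors
C = np.array([[0., 1., -1., 0., 0.],                         # hypothetical constraints:
              [0., 0.,  0., 1., 0.]])                        # beta_1 = beta_2 and beta_3 = 0
beta = np.array([1.0, 0.5, 0.5, 0.0, -1.0])                  # chosen so that C beta = gamma
gamma = C @ beta

XtX_inv = np.linalg.inv(X.T @ X)
ss = []
for _ in range(5000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    b = XtX_inv @ X.T @ y                                    # unconstrained least squares
    A = C @ XtX_inv @ C.T
    b_c = b - XtX_inv @ C.T @ np.linalg.solve(A, C @ b - gamma)   # constrained estimate
    e_c = y - X @ b_c
    ss.append(e_c @ e_c)

print(np.mean(ss) / sigma2, n - k - 1 + m)                   # both should be close to 27
```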
c. Let X₁ consist of a constant (a column of ones) plus three predictors, and let X₂ contain three more predictors. Let Y be the response variable. The Gauss-Markov conditions hold. Regress each of the three variables in X₂ on X₁ and obtain the residuals X₂*. Then regress Y on X₁ and X₂*. How do your results compare with the results of the regression of Y on X₁ and X₂? The comparison you are making is between the least squares coefficients of the two regression models. Derive the result mathematically.

d. Consider the multiple regression model Y = Xβ + ε and the partial regression problem, where X = (X₁, X₂). Is it true that β̂_2.1 can be obtained by regressing Y on X₂*? Derive the result mathematically. Note: X₂* is the matrix of residuals from regressing each column of X₂ on X₁.

Problem 4 (25 points) Consider the multiple regression model Y = Xβ + ε, with E(ε) = 0 and cov(ε) = σ²I. In addition, if we have a set of linear constraints of the form Cβ = γ, we showed that

   β̂_c = β̂ − (X′X)⁻¹C′[C(X′X)⁻¹C′]⁻¹(Cβ̂ − γ).

Answer the following questions:

a. Show that e_c′e_c = e′e + (β̂_c − β̂)′X′X(β̂_c − β̂).

b. Let e_c be the constrained residuals. Show that e_c′e_c = SST, where SST is the total sum of squares, the matrix C is k × (k+1), and the vector γ = 0. For example, if k = 5, the matrix C is as follows:

   C = [ 0 1 0 0 0 0
         0 0 1 0 0 0
         0 0 0 1 0 0
         0 0 0 0 1 0
         0 0 0 0 0 1 ].

c. Is the R² of the constrained least squares fit larger or smaller than the R² of the unconstrained least squares fit? Please explain your answer mathematically.

d. Under the hypothesis Cβ = γ, find the unbiased estimator of σ² in the case of constrained least squares.
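The following is a minimal numerical sketch, using simulated data and a hypothetical constraint rather than anything given in the exam. It computes β̂_c from the formula quoted in Problem 4, checks the residual sum of squares identity in part (a), and checks the special case in part (b), in which C selects every slope and γ = 0, so that e_c′e_c equals SST.

```python
# A minimal numerical sketch (simulated data and a hypothetical constraint; none of the
# numbers come from the exam). It computes b_c from the formula quoted in Problem 4 and
# checks the identity in part (a) and the special case in part (b).
import numpy as np

rng = np.random.default_rng(0)
n, k = 40, 5
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # constant plus k predictors
y = X @ rng.normal(size=k + 1) + rng.normal(size=n)

XtX = X.T @ X
XtX_inv = np.linalg.inv(XtX)
b = XtX_inv @ X.T @ y                                        # unconstrained least squares
e = y - X @ b

def constrained_ls(C, gamma):
    """b_c = b - (X'X)^{-1} C' [C (X'X)^{-1} C']^{-1} (C b - gamma)."""
    A = C @ XtX_inv @ C.T
    return b - XtX_inv @ C.T @ np.linalg.solve(A, C @ b - gamma)

# Part (a): any constraint will do; here a single hypothetical one, beta_1 + beta_2 = 1.
C1 = np.zeros((1, k + 1)); C1[0, 1] = C1[0, 2] = 1.0
b_c = constrained_ls(C1, np.array([1.0]))
e_c = y - X @ b_c
print(np.isclose(e_c @ e_c, e @ e + (b_c - b) @ XtX @ (b_c - b)))    # True

# Part (b): C is k x (k+1), selects every slope, and gamma = 0, so the constrained fit
# is the intercept-only model and e_c'e_c should equal SST = sum of (y_i - y_bar)^2.
C2 = np.hstack([np.zeros((k, 1)), np.eye(k)])
b_c0 = constrained_ls(C2, np.zeros(k))
e_c0 = y - X @ b_c0
print(np.isclose(e_c0 @ e_c0, np.sum((y - y.mean()) ** 2)))          # True
```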