Advanced Statistical Methods
ST22 – Advanced Statistical Methods
Final Exam Revision – additional exercise set
This is NOT a mock final exam! It is a selection of exercises for extra practice, similar to the ones
discussed in class. Solving only these exercises is not enough for revision; you must also go over all
the exercises covered in class.
Exercise 1
Okun's law describes a statistical relationship between the unemployment rate and the rate
of economic growth (measured by the increase in GDP).
A recent article questioning the relevance of Okun's law and its predictive capability reported
econometric tests confirming that the relationship is indeed strong.
To reach this conclusion, the study relied on the unemployment rate and the rate of economic growth
for the US between 1955 and 2015 (n = 60 observations).
Source: Bureau of Labor Statistics
Let X denote the rate of economic growth (increase in GDP) and Y the rate of
unemployment. Statistical results are provided below:
     Mean (m)    Corrected variance (dividing by n-1)
X    3.6         4.16
Y    -0.3        1.75
The slope of the regression line is -0.587.
The sum of squares explained by the regression is 84.77.
1. In the article, it is stated that “the graph shows the relatively strong relationship which
exists between the increase in GDP and the variation in unemployment rate since 1950
in the U.S”. Can you specify which type of statistical “relationship” is referred to?
2. How do you assess the quality of the chosen model for this sample? Calculate and
interpret, in this context, an appropriate indicator to justify your answer.
3. Determine the best estimates of the model parameters using the above
information.
4. Is this model valid for the entire population (at a 95% confidence level)?
5. Do you therefore confirm the validity of the Okun law?
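The quantities asked for in questions 2 to 4 can be derived from the summary statistics alone. The sketch below is one possible way to organise the arithmetic, not an official solution: it assumes a simple linear regression of Y on X, uses the total sum of squares (n-1) times the corrected variance of Y, and compares the F statistic with an approximate tabulated critical value (the constant F_CRIT_95 is an assumption read from a standard F table for 1 and 58 degrees of freedom, not a figure from the exercise).

```python
import math

# Summary statistics given in the exercise
n = 60
mean_x, var_x = 3.6, 4.16      # GDP growth: mean and corrected variance
mean_y, var_y = -0.3, 1.75     # unemployment: mean and corrected variance
slope = -0.587                 # slope of the regression line
ss_regression = 84.77          # sum of squares explained by the regression

# Total sum of squares of Y, recovered from the corrected variance
ss_total = (n - 1) * var_y

# Coefficient of determination and linear correlation (sign follows the slope)
r_squared = ss_regression / ss_total
r = math.copysign(math.sqrt(r_squared), slope)

# The least-squares line passes through the point of means
intercept = mean_y - slope * mean_x

# Global F test of the model: 1 and n-2 degrees of freedom
ss_residual = ss_total - ss_regression
f_stat = ss_regression / (ss_residual / (n - 2))

# Assumption: approximate 95% critical value of F(1, 58) from a standard table
F_CRIT_95 = 4.01

print(f"R^2 = {r_squared:.3f}, r = {r:.3f}")
print(f"intercept = {intercept:.4f}, slope = {slope}")
print(f"F = {f_stat:.2f}, reject H0 at 5%: {f_stat > F_CRIT_95}")
```

The same test can be run as a Student test on the slope, since in simple regression the F statistic is the square of that t statistic.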
Exercise 2
A ranking of engineering schools was published. The measured criteria are the following:
- The average result at the Baccalaureate (Moyenne au bac)
- The amount of money allocated to Research, in thousands of euros (Budget alloué recherche)
- The number of PhD students (Doctorants étudiants)
- The mandatory internship duration, in months (Durée obligatoire entreprise)
- The average salary 3 years after graduation, in thousands of euros (Salaire à 3 ans)
- The amount of the apprenticeship tax, in thousands of euros (Taxe d’apprentissage)
- The number of alumni who are working (Anciens diplômés en activité)
- The percentage of foreign students (Etudiants étrangers)
- The percentage of students holding a double diploma (Double diplômés)
- The percentage of students having obtained their first job abroad (Premier emploi étranger)
- The percentage of scholarship holders (Boursiers)
- The percentage of women (Filles)
We perform a multiple regression using statistical software (SPAD). The results are presented
below for a type I error probability of 5%.
IDENTIFICATION OF THE ADJUSTMENT COEFFICIENTS
ENDOGENOUS VARIABLE (Y) ... Salaire à 3 ans (K€)
VARIABLE 4 ... Durée entreprise (en mois)
COEFFICIENT 1 : Dur
VARIABLE 6 ... Taxe d'apprentissage
COEFFICIENT 2 : Taxe
VARIABLE 8 ... % Etudiants étrangers
COEFFICIENT 3 : % Et
VARIABLE 9 ... % Doubles diplômés
COEFFICIENT 4 : % Do
TABLE 9
ESTIMATION / COEFFICIENTS
LEAST-SQUARES FIT (WITH CONSTANT TERM)
27 INDIVIDUALS, 5 PARAMETERS (CONSTANT LAST).
IDEN   LABEL                    COEFFICIENT   STD. ERROR   STUDENT   PROBA.   V.TEST
                                                                22
CRITERION(S)
Dur  - Durée entreprise (e          -0.3353        0.187    -1.797    0.086    -1.72
Taxe - Taxe d'apprentissage          0.0027        0.001     4.126    0.000     3.51
% Et - % Etudiants étranger          0.0731        0.041     1.770    0.091     1.69
% Do - % Doubles diplômés           -0.0617        0.046    -1.337    0.195    -1.30
CONSTANT                            41.5256        1.848    22.476    0.000     8.29
TABLE 10
GLOBAL FIT TEST
SUM OF SQUARED DEVIATIONS ............. SCE = 72.0761
MULTIPLE CORRELATION COEFFICIENT ...... R = 0.7556 R2 = 0.5710
ESTIMATED RESIDUAL VARIANCE ........... S2 = 3.2762 S = 1.8100
TEST THAT THE COEFFICIENTS OF THE 4 VARIABLES ARE SIMULTANEOUSLY ZERO:
FISHER = 7.320 DEG. OF FREEDOM = 4;22
P.CRIT = 0.0006 V.TEST = 3.22
TABLE 11
1. Using Table 9, determine the response variable, the explanatory variables and the equation of
the regression model in the sample.
2. How good is the fit of the model in the sample? Justify.
3. What can you conclude about the overall validity of the population model? Justify.
4. Using Table 10, explain which variables contribute significantly to the model. Justify.
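The figures in the SPAD output are tied together by a few standard formulas, and re-deriving them is a useful check when answering the questions above. The sketch below is illustrative only: it assumes that SCE is the residual sum of squares (which is consistent with S2 = SCE / 22), that the listed PROBA. values are two-sided p-values, and it introduces a hypothetical helper, predict_salary, that simply evaluates the fitted equation.

```python
# Values copied from the SPAD output
n_obs, n_params = 27, 5
r2 = 0.5710
sce = 72.0761          # assumed to be the residual sum of squares
alpha = 0.05           # type I error probability stated in the exercise

# name: (coefficient, standard error, p-value)
coefficients = {
    "Dur":      (-0.3353, 0.187, 0.086),
    "Taxe":     (0.0027, 0.001, 0.000),
    "% Et":     (0.0731, 0.041, 0.091),
    "% Do":     (-0.0617, 0.046, 0.195),
    "Constant": (41.5256, 1.848, 0.000),
}

k = n_params - 1                 # number of explanatory variables
df_resid = n_obs - n_params      # residual degrees of freedom (22)

# Residual variance, global F statistic and adjusted R2
residual_variance = sce / df_resid
f_stat = (r2 / k) / ((1 - r2) / df_resid)
adjusted_r2 = 1 - (1 - r2) * (n_obs - 1) / df_resid

# Terms whose coefficient differs significantly from zero at level alpha
significant = [name for name, (_, _, p) in coefficients.items() if p < alpha]


def predict_salary(dur, taxe, pct_et, pct_do):
    """Hypothetical helper: salary (K euros) predicted by the fitted equation."""
    return (coefficients["Constant"][0]
            + coefficients["Dur"][0] * dur
            + coefficients["Taxe"][0] * taxe
            + coefficients["% Et"][0] * pct_et
            + coefficients["% Do"][0] * pct_do)


print(f"S2 = {residual_variance:.4f}, F = {f_stat:.3f}, adjusted R2 = {adjusted_r2:.3f}")
print("significant at 5%:", significant)
```

Reading the p-values directly is equivalent to comparing each Student statistic with the critical value for 22 degrees of freedom.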
Exercise 3
In this exercise, MM (Moyenne Mobile) stands for MA (Moving Average). In Table 3, read the
decimal separator as a point (.) instead of a comma (,).
Following the World automobile exhibition, a study on the sales of electric cars was conducted
in September 2016 (source: Automobile Propre, 2016). The number of license
plates registered for electric cars in France was recorded starting in January 2013.
These data are presented below:
1. Using the above charts, analyze the time series. Justify your answer.
2. Which adjustment model will you use? Justify your answer.
3. Using the y(t) values (ref. Table 1), showing the evolution of the number of license
plates for electric cars issued in France, we determine MA(t) (ref. Table 2) and create
the graph below (ref. Figure 3).
a. What does MA(t) represent? What period length was used? Justify.
b. Which component of the time series does MA(t) reveal?
c. Using the data given above, give the numeric expression of the calculation that
leads to MA(7)=738.
4. Using the above data, we obtain the ratio y(t)/MA(t) as well as the averages calculated
for each month:
a. What does the y(t)/MA(t) ratio represent? Which calculation must be performed
in order to obtain the value 1.3 in Table 3?
b. Give the numeric expression that leads to the value 0.5 in Table 3 and interpret this
value in the context of the study. Detail your answer.
5. Find the value of the seasonally adjusted series for August 2016. What does this value
represent?
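The chain of calculations in questions 3 to 5 (moving average, ratio y(t)/MA(t), monthly averages, seasonally adjusted value) can be sketched as follows. This is a minimal illustration under stated assumptions: a centered moving average of order 12 with half weights on the two end months, a multiplicative model, and seasonal coefficients taken as plain averages of the ratios without any rescaling. The series y below is hypothetical and is not the data of Table 1; the function names are also illustrative.

```python
def centered_moving_average(y, period=12):
    """Centered moving average for an even period.

    The window holds period + 1 values; the first and last get weight 0.5,
    so the first available value sits at position period/2 + 1 (month 7).
    """
    half = period // 2
    ma = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half : t + half + 1]
        ma[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return ma


def seasonal_coefficients(y, ma, period=12):
    """Average of the ratios y(t)/MA(t) for each position in the year."""
    ratios = [[] for _ in range(period)]
    for t, (obs, avg) in enumerate(zip(y, ma)):
        if avg is not None:
            ratios[t % period].append(obs / avg)
    return [sum(r) / len(r) for r in ratios]


# Hypothetical monthly series, January 2013 to August 2016 (44 months)
pattern = [0.9, 1.0, 1.1, 1.0, 0.9, 1.2, 1.0, 0.6, 1.1, 1.1, 1.0, 1.1]
y = [(400 + 10 * t) * pattern[t % 12] for t in range(44)]

ma = centered_moving_average(y)
coefs = seasonal_coefficients(y, ma)

# Seasonally adjusted value for August 2016 (index 43, month index 7)
august_2016 = 43
adjusted = y[august_2016] / coefs[august_2016 % 12]
print(f"MA(7) = {ma[6]:.1f}, August coefficient = {coefs[7]:.2f}, adjusted = {adjusted:.1f}")
```

With list indices starting at 0, ma[6] corresponds to MA(7) and is built from the first 13 observations.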
Exercise 4
Table 1 below shows quarterly data for landline phone call durations (expressed in millions of
minutes) between the first quarter of 2001 (T1-01) and the fourth quarter of 2005 (T4-05).
1. Describe the time series components using charts 1 & 2.
2. How did we calculate the moving average for the fourth quarter of 2004?
3. Which model, additive or multiplicative, did we choose to analyze the time series? Justify
based on the charts.
4. Detail the calculation steps for the seasonal coefficient of the third quarter and interpret its value
in context.
5. How did we calculate the seasonally adjusted series value for the first quarter of 2003?
6. Calculate the predicted durations for T1-06 and T2-06.
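One common way to produce the forecasts of question 6 is to fit a straight line to the seasonally adjusted series and then put the seasonal effect back. The sketch below illustrates that approach only; it assumes a linear trend, quarters numbered t = 1 (T1-01) to t = 20 (T4-05), and lets the caller choose between the multiplicative and additive model, since that choice is the subject of question 3. All numeric inputs are hypothetical, and the function names are illustrative.

```python
def fit_linear_trend(values):
    """Least-squares line through the points (1, values[0]), (2, values[1]), ..."""
    n = len(values)
    ts = range(1, n + 1)
    t_mean = sum(ts) / n
    v_mean = sum(values) / n
    slope = (sum((t - t_mean) * (v - v_mean) for t, v in zip(ts, values))
             / sum((t - t_mean) ** 2 for t in ts))
    intercept = v_mean - slope * t_mean
    return intercept, slope


def forecast(adjusted, seasonal, t, multiplicative=True):
    """Extrapolate the trend to period t, then reapply the seasonal coefficient."""
    intercept, slope = fit_linear_trend(adjusted)
    trend = intercept + slope * t
    coef = seasonal[(t - 1) % len(seasonal)]   # t = 1 is a first quarter
    return trend * coef if multiplicative else trend + coef


# Hypothetical inputs: 20 seasonally adjusted values and 4 quarterly coefficients
adjusted_series = [100 + 2 * t for t in range(1, 21)]
quarterly_coefs = [1.1, 0.9, 0.8, 1.2]

print("T1-06:", round(forecast(adjusted_series, quarterly_coefs, 21), 1))
print("T2-06:", round(forecast(adjusted_series, quarterly_coefs, 22), 1))
```

For an additive model, pass coefficients expressed in millions of minutes and set multiplicative=False.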