QBUS6840 Group Assignment (25 marks)
1 Background and Task
The Consumer Price Index (CPI) is a measure that examines the weighted average of prices of a
basket of consumer goods and services, such as transportation, food, and medical care. It is calculated
by taking price changes for each item in the predetermined basket of goods and averaging
them. Changes in the CPI are used to assess price changes associated with the cost of living. The
CPI is one of the most frequently used measures of inflation and deflation 1.
In this group project, your task is to develop a predictive model to forecast the CPI of a particular
sector given its historical quarterly values. The CPI data set CPI_train.csv contains the quarterly
CPI data from Jan 1990 to Dec 2019 (120 data points). This data set is based on a real CPI data set
with some noise added for de-identification purposes. The test data set CPI_test.csv (not
provided) has the same structure as the training data and contains the quarterly CPI data from
Jan 2020 to Dec 2021 (8 data points).
Your task is to develop a predictive model, using CPI_train.csv, to forecast the quarterly CPI
measures from Jan 2020 to Dec 2021. Note that this is a multi-step-ahead forecasting problem:
all 8 forecasts are made from the training data only.
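To illustrate what a multi-step-ahead forecast looks like, here is a minimal sketch of a naive drift baseline that extends the average historical change over all 8 quarters at once. It is not a recommended model for the assignment. The function name `drift_forecast` and the short `history` list are hypothetical and used only for illustration; they are not real CPI values.

```python
def drift_forecast(history, horizon):
    """Forecast `horizon` steps ahead by extending the average historical change.

    All forecasts use only the values in `history`; no forecast depends on
    any observation from the forecast period.
    """
    if len(history) < 2:
        raise ValueError("at least two observations are needed to estimate a drift")
    # Average change per period between the first and last observation
    slope = (history[-1] - history[0]) / (len(history) - 1)
    # The h-step-ahead forecast is the last observed value plus h times the drift
    return [history[-1] + h * slope for h in range(1, horizon + 1)]


# Hypothetical quarterly values, for illustration only
history = [100.0, 101.0, 102.0, 103.0]
forecasts = drift_forecast(history, horizon=8)
print(forecasts)
```

A baseline like this can serve as a reference point when judging whether a more elaborate model is worth its added complexity.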
Test error
For the measure of forecast accuracy, please use mean squared error (MSE). The MSE, computed
on the test data, is defined as follows. Let $\hat{y}_{T+h|1:T}$ be the $h$-step-ahead forecast of $y_{T+h}$, based on
the training data $y_{1:T}$, where $y_{T+h}$ is the $h$-th value in the test data CPI_test.csv. The test error is
computed as follows
$$\text{test\_error} = \frac{1}{8} \sum_{h=1}^{8} \left( \hat{y}_{T+h|1:T} - y_{T+h} \right)^2,$$
where 8 is the number of observations in the test data.
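The test error defined above can be computed with a few lines of Python. The sketch below is one possible implementation; the function name `compute_test_error` and the example numbers are hypothetical and are not taken from the CPI data.

```python
def compute_test_error(forecasts, actuals):
    """Mean squared error between the forecasts and the actual test values."""
    if len(forecasts) != len(actuals):
        raise ValueError("forecasts and actuals must have the same length")
    if len(forecasts) == 0:
        raise ValueError("at least one observation is required")
    # Squared forecast error for each horizon h = 1, ..., H
    squared_errors = [(f - a) ** 2 for f, a in zip(forecasts, actuals)]
    # Average over the H test observations (H = 8 in this assignment)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical values, for illustration only
forecasts = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 107.0]
actuals = [101.0, 101.0, 101.0, 103.0, 104.0, 105.0, 106.0, 109.0]
print(round(compute_test_error(forecasts, actuals), 4))  # 0.75
```

In this example the squared errors are 1, 0, 1, 0, 0, 0, 0 and 4, so the test error is 6/8 = 0.75.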
2 Submission Instructions
1. Each group needs to submit THREE files via the link on the Canvas site.
• A document file, named Group_xxx_document.pdf, that reports your data analysis
procedure and results. You should replace the xxx in the file name with your group ID.
• A Python file, named Group_xxx_implementation.ipynb (or .py) that implements
your data analysis procedure and produces the test error. You should replace the xxx
in the Python file name with your group ID.
• A csv file, named Group_xxx_forecast.csv, that reports the 8 forecast CPI values made
by your final predictive model. You should replace the xxx in the file name with your
group ID.
2. About your document file Group_xxx_document.pdf
• Describe your data analysis procedure in detail: how the Exploratory Data Analysis
(EDA) step is done, which models/methods are used and why, how the models are
trained, etc., with sufficient justification. The description should be detailed enough that
other data scientists, who are assumed to have a background in your field, can understand
and implement the task. Report all numerical results to
four decimal places.
• Clearly and appropriately present any relevant graphs and tables.
• The page limit is 25 pages including EVERYTHING: appendix, computer output,
graphs, tables, etc.
• You must use the cover sheet provided on Canvas.
3. The Python file should be written in Jupyter Notebook or Spyder, with the assumption
that all the necessary data files (CPI_train.csv and CPI_test.csv) are in the same folder
as the Python file. If you use deep learning models, please assume that Keras (with the
TensorFlow backend) has been installed.
• If the training of your model involves generating random numbers, the random seed
in Group_xxx_implementation.ipynb (or Group_xxx_implementation.py) must be
fixed, e.g. np.random.seed(0), so that the marker can reproduce the results
you obtained.
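As a quick check of the seed requirement, the sketch below shows that fixing the seed with np.random.seed makes NumPy's random draws repeatable. The helper name `noisy_draws` is hypothetical. This covers only NumPy's global random generator; other libraries, such as Keras or TensorFlow, keep their own random state and may need to be seeded separately.

```python
import numpy as np


def noisy_draws(seed, size=3):
    """Return `size` standard normal draws after fixing NumPy's global seed."""
    np.random.seed(seed)
    return np.random.normal(size=size)


first = noisy_draws(0)
second = noisy_draws(0)
# The same seed gives the same sequence of random numbers
print(np.array_equal(first, second))  # True
```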