ECON2040 – COMPUTATIONAL ECONOMICS
Coursework 2
• This coursework consists of three questions and is worth 35% of the overall mark for ECON2040.
• The deadline for submission is 18:00 GMT on Wednesday 13 December 2023.
• Standard University policies and procedures will be followed for late submission, extensions,
and academic integrity. (See the Module Syllabus and Programme Handbook for details.)
• Submission is via Blackboard. Your answers should be composed of three parts: a report (typeset
using Word/LaTeX and saved in PDF format) containing your analysis, summary statistics and
output (e.g. tables and figures); CSV files containing your data output; and your
Python and R scripts, containing the code that you used to obtain your results.
– You should submit your report via TurnItIn on Blackboard in a file called ECON2040CW2_ID.pdf,
where ID is your student ID number, for example ECON2040CW2_12345678.pdf. In the
Assignments folder, click on Coursework 2 – Report Submission to submit your report.
– Please make sure that your answers are typeset, that the output is well organised and
clear, and that tables and figures are appropriately labelled.
– You should not include the Python/R code used in your analysis in your report, but you
must submit separate Python and R scripts containing your code via Blackboard, called
ECON2040CW2_ID.py and ECON2040CW2_ID.R, where ID is your student ID number, for
example ECON2040CW2_12345678.R. In the Assignments folder, click on Coursework 2 –
Code Submission to submit your code.
– Please also upload your CSV files called ECON2040CW2_Sales_ID.csv and ECON2040CW2_WDI_ID.csv,
where ID is your student ID number, for example ECON2040CW2_Sales_12345678.csv. In
the Assignments folder, click on Coursework 2 – CSV Submission to upload your data.
– Your Python and R scripts should include comments throughout to explain what you are
doing and each section should be properly labelled.
• In answering the questions, please keep in mind the Grade Descriptors for Year 2 as posted on
Blackboard at the start of the module.
• It is the policy of the Department of Economics that coursework is anonymous; therefore, please
do not put your name on any part of your report.
1. The file Sales Data.zip contains 12 CSV files with information about the sales of a U.S. electronic
goods store chain over the 12 months of a year. Every line contains information about a
product sold by one of the stores in a given month. It has the following columns:
• Order ID identifies a specific sales order. There may be several lines with the same Order
ID, which means that all those goods were part of the same order.
• Product gives the name of the product sold.
• Quantity Ordered provides information about how many items of this product were sold
in this order.
• Price Each lists the price at which each item was sold.
• Order Date gives the date of the sale.
• Purchase Address provides the address of the store where the sale took place.
Some of the rows may contain missing information or processing errors, so you may need to do
some data cleaning. Use Python’s Pandas library to analyse this data in order to answer the
following questions:
(a) Merge all the data into one Pandas dataframe. How many observations are there altogether?
(b) Calculate and plot the total value of sales by month. In what month were the total sales
the highest and the lowest? What were they equal to?
(c) In what city were the total annual sales the highest? Be careful, since in the U.S., there
may be several cities with the same name in different states.
(d) The company wants to order some online advertising, and wants to know at what time of
the day most of the sales usually happen. Plot the distribution of sales by hour.
(e) Which are the items that are most likely to be sold together?
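As a rough illustration of the kind of Pandas workflow that parts (a) to (e) call for, the sketch below runs on a few made-up rows rather than the real files. Several things in it are assumptions and not part of the brief: the date format (month/day/two-digit year and time), the address layout ("street, city, ST zip"), the kinds of bad rows (a repeated header line and an all-empty line), the derived column names Sales and City, and the reading of "sales by hour" as the number of distinct orders per hour. It is one possible approach, not a model answer.

```python
import io
from collections import Counter
from itertools import combinations

import pandas as pd

# Two tiny in-memory "monthly files" standing in for the 12 CSVs.
# All rows, the date format and the address layout are invented for illustration.
JAN = """Order ID,Product,Quantity Ordered,Price Each,Order Date,Purchase Address
1001,USB-C Cable,2,10.00,01/05/19 09:15,"1 Main St, Portland, OR 97035"
1001,Phone,1,600.00,01/05/19 09:15,"1 Main St, Portland, OR 97035"
1002,Phone,1,600.00,01/20/19 19:40,"9 Elm St, Portland, ME 04101"
Order ID,Product,Quantity Ordered,Price Each,Order Date,Purchase Address
,,,,,
"""
FEB = """Order ID,Product,Quantity Ordered,Price Each,Order Date,Purchase Address
1003,USB-C Cable,1,10.00,02/02/19 19:05,"9 Elm St, Portland, ME 04101"
1003,Phone,1,600.00,02/02/19 19:05,"9 Elm St, Portland, ME 04101"
"""

# (a) Read every file as text and stack the frames.
# With real files, the StringIO objects would be replaced by the CSV paths.
frames = [pd.read_csv(io.StringIO(text), dtype=str) for text in (JAN, FEB)]
raw = pd.concat(frames, ignore_index=True)

# Cleaning: drop empty rows, coerce types, drop rows that fail to parse
# (for example a header line repeated inside the data).
df = raw.dropna(how="all").copy()
df["Quantity Ordered"] = pd.to_numeric(df["Quantity Ordered"], errors="coerce")
df["Price Each"] = pd.to_numeric(df["Price Each"], errors="coerce")
df["Order Date"] = pd.to_datetime(
    df["Order Date"], format="%m/%d/%y %H:%M", errors="coerce"
)
df = df.dropna(subset=["Quantity Ordered", "Price Each", "Order Date"]).copy()
df["Sales"] = df["Quantity Ordered"] * df["Price Each"]

# (b) Total sales value by calendar month.
monthly = df.groupby(df["Order Date"].dt.month)["Sales"].sum()

# (c) Total sales by city, keeping the state so that same-named cities stay apart.
parts = df["Purchase Address"].str.split(",", expand=True)
df["City"] = parts[1].str.strip() + ", " + parts[2].str.split().str[0]
by_city = df.groupby("City")["Sales"].sum()

# (d) Number of distinct orders per hour of the day.
orders = df.drop_duplicates("Order ID")
orders_by_hour = orders["Order Date"].dt.hour.value_counts().sort_index()

# (e) Count how often each pair of products appears in the same order.
pair_counts = Counter()
for _, products in df.groupby("Order ID")["Product"]:
    pair_counts.update(combinations(sorted(set(products)), 2))

print(len(raw), len(df))
print(monthly.idxmax(), monthly.max())
print(by_city.idxmax())
print(orders_by_hour)
print(pair_counts.most_common(1))
```

With matplotlib installed, monthly.plot(kind="bar") and orders_by_hour.plot(kind="bar") would give simple charts for parts (b) and (d). Whether "sales by hour" is better measured by order counts, order lines or sales value is a judgment call that the report should state explicitly.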