Analysing COVID-19 data
This assignment asks you to create a number of general functions to process a JSON file. You will also need
to create some tests and commit your work as you progress in a git repository.
The test file provided contains made-up data and should work without difficulty; however, real files will be
provided that add different levels of difficulty to the exercise (e.g., missing data, or a different number of
elements in some fields).
Your job consists of modifying/creating the functions so that the code works with the “simple” provided file,
while also being general enough to process other files that share the same schema but vary in data quality.
You’ll also need to write some specific tests and demonstrate your ability to use git version control and
GitHub.
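As a rough illustration of what "varying data quality" can mean in practice, the sketch below shows two defensive helpers written in plain Python. The function names, the use of None to mark a missing value, and the series names in the usage note are assumptions made for this illustration only; they are not part of the provided files or of the required interface.

```python
def total_ignoring_missing(values):
    """Sum a list of numbers, skipping entries recorded as None.

    Assumption for this sketch: a missing measurement is stored as None
    (which is what a JSON null becomes once loaded with the json module).
    """
    return sum(value for value in values if value is not None)


def find_length_mismatches(series_by_name, expected_length):
    """Return {name: actual_length} for every series of unexpected length.

    An empty dictionary means that all series have the expected length.
    """
    return {
        name: len(series)
        for name, series in series_by_name.items()
        if len(series) != expected_length
    }
```

For example, total_ignoring_missing([3, None, 5]) gives 8, and find_length_mismatches({"hospitalised": [1, 2, 3], "tested": [4, 5]}, 3) gives {"tested": 2}. Whether a mismatch should raise an error or be handled silently is a design decision for your own solution.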
The exercise will be semi-automatically marked, so it is very important that your solution adheres to the
correct file and folder name convention and structure, as defined in the rubric below. An otherwise valid
solution which doesn’t work with our marking tool will not be given credit.
First, we set out the problem we are solving. Next, we specify in detail the target for your solution. Finally,
to assist you in creating a good solution, we state the mark scheme we will use.
1 Our epidemiologist friend and his data problem
Jim is an epidemiologist at WHO. Lately he has had a lot of data to analyse from different regions
around the world. However, he never expected he would have to learn so much about computers. Thankfully,
his good friend Carmen is amazing at writing scripts with Python and is afraid of no JSON file (however
big or complex they are!). There’s an additional difficulty though: WHO computers are very locked down!
As the data they handle contains personal information, the computers have limitations on what software
can be run on them and what data can be transferred to and from them. This means that Carmen will
have to help Jim write some scripts in plain Python (i.e., we don’t have access to NumPy or Pandas!), with
matplotlib being the only exception allowed. “Not to worry”, Carmen thinks, “if I get stuck anywhere,
my friends at MPHY0021 can help me solve these problems”.
Jim has been able to generate a fake sample file that he can get out of the WHO computers and send to
Carmen, so she can generate the functions that Jim can then load from a Jupyter notebook.
Inspecting the sample file, Carmen finds that it contains geographic and demographic information about
a particular region in the world, information about some age binning, and daily evolution data for many
measurements such as the number of people that have been hospitalised, tested or deceased. It also contains
information about the weather of that day and different actions taken by the government in that region.
Based on Jim’s needs, Carmen has created the skeleton of some functions and a notebook showing how Jim is
expected to use them.
2020-2021 UCL-MPHY0021
2 Your mission!
You are required to modify the provided Python file (process_covid.py) so that Jim’s notebook
(Jim.ipynb) works as expected. You will also need to add some tests (within test_process_covid.py)
using pytest and save it all in a git repository.
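To give a feel for the shape such a test might take, here is a minimal sketch of a pytest-style test for a JSON loader that uses only the standard library. The function load_json_file is a hypothetical stand-in defined inside the example, and the sample dictionary is invented for the illustration; neither reflects the actual functions in process_covid.py nor the schema of the provided file.

```python
import json
import tempfile
from pathlib import Path


def load_json_file(filepath):
    """Hypothetical loader: read a JSON file and return the parsed content."""
    with open(filepath, encoding="utf-8") as handle:
        return json.load(handle)


def test_load_json_file_round_trip():
    """A file written from a known dictionary should load back unchanged."""
    # Invented sample content, not the schema of the real data file.
    sample = {"region": {"name": "Nowhere"}, "values": [1, None, 3]}
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "sample.json"
        path.write_text(json.dumps(sample), encoding="utf-8")
        assert load_json_file(path) == sample
```

pytest collects functions whose names start with test_ from files named test_*.py, so a function like this placed in test_process_covid.py would run when pytest is invoked in the repository. In a real test file the function under test would be imported from process_covid.py rather than defined alongside the test.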