Machine Learning Practical
MLP 2023/24
1 Introduction
The aim of this coursework is to study the classification of images of handwritten digits using neural networks.
The first part of this coursework concerns the identification and discussion of a fundamental problem in
machine learning, as shown in Figure 1. Following this preliminary discussion, you will investigate this
problem further in wider and deeper neural networks, studying how it depends on network width and depth. The second part
involves implementing different methods to combat the problem identified in Task 1 and then comparing these
methods empirically and theoretically. In the final part, you will briefly discuss one piece of work related to the methods
examined in Task 2.
The coursework will use an extended version of the MNIST database, the EMNIST Balanced dataset,
described in Section 2. Section 3 describes the additional code provided for the coursework (in branch
mlp2023-24/coursework1 of the MLP GitHub repository), and Section 4 describes how the coursework is structured into
three tasks. The main deliverable of this coursework is a report, discussed in Section 8, using a template that is
available on GitHub. Section 9 discusses the details of carrying out and submitting the coursework, and the
marking scheme is discussed in Section 10.
You will need to submit your completed report as a PDF file, together with your local version of the mlp code including
any changes you made to the provided .py files. The detailed submission instructions are given in Section 9.2;
please follow these instructions carefully.
2 EMNIST dataset
In this coursework we shall use the EMNIST (Extended MNIST) Balanced dataset [Cohen et al., 2017].
EMNIST extends the well-known MNIST dataset by including
images of handwritten letters (upper and lower case) as well as handwritten digits. Both EMNIST and MNIST
are extracted from the same underlying dataset, referred to as NIST Special Database 19. Both use the same
conversion process resulting in centred images of dimension 28×28.
There are 62 potential classes for EMNIST (10 digits, 26 lower case letters, and 26 upper case letters). However,
we shall use a reduced label set of 47 different labels. This is because (following the data conversion process)
there are 15 letters for which it is confusing to discriminate between upper-case and lower-case versions. In the
47 label set, upper- and lower-case labels are merged for the following letters:
C, I, J, K, L, M, O, P, S, U, V, W, X, Y, Z.
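The label arithmetic above (62 potential classes, 15 merged letters, 47 labels) can be checked with a short sketch. The ordering of the resulting list (digits, then upper-case letters, then the remaining lower-case letters) is an assumption made for illustration and is not taken from the provided data files.

```python
import string

# Letters whose upper- and lower-case forms share a single label.
merged = set("CIJKLMOPSUVWXYZ")

digits = list(string.digits)            # 10 classes
upper = list(string.ascii_uppercase)    # 26 classes
# Only lower-case letters that are NOT merged keep a separate label.
lower = [c for c in string.ascii_lowercase if c.upper() not in merged]

# Assumed ordering, for illustration only.
labels = digits + upper + lower

print(len(digits) + 2 * len(upper))  # potential classes before merging
print(len(labels))                   # labels after merging
print("".join(lower))                # lower-case letters with their own label
```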
The training set for Balanced EMNIST has about twice the number of examples as the MNIST training set, so
you should expect the run-time of your experiments to be about twice as long. The expected accuracy rates
are lower for EMNIST than for MNIST (as EMNIST has more classes, and more confusable examples), and
differences in accuracy between different systems should be larger. Cohen et al. [2017] present some baseline
results for EMNIST.
You do not need to directly download the EMNIST database from the nist.gov website, as it is part of the
coursework1 branch in the mlpractical Github repository, discussed in Section 3 below.
MLP 2023/24: Coursework 1 Due: 27 October 2023
3 Github branch mlp2023-24/coursework1
You should run all of the experiments for the coursework inside the (mini-)Conda environment you set
up for the labs. The code for the coursework is available on the course Github repository on a branch
mlp2023-24/coursework1. To create a local working copy of this branch in your local repository you
need to do the following.
1. Make sure all modified files on the branch you are currently on have been committed (see
notes/getting-started-in-a-lab.md if you are unsure how to do this).
2. Fetch changes to the upstream origin repository by running
git fetch origin
3. Check out a new local branch from the fetched branch using
git checkout -b coursework1 origin/mlp2023-24/coursework1
You will now have a new branch in your local repository with all the code necessary for the coursework in it.
This branch includes the following additions to your setup:
• A new EMNISTDataProvider class in the mlp.data_providers module. This class makes some
changes to the MNISTDataProvider class, linking to the EMNIST Balanced data, and setting the number
of classes to 47.
• Training, validation, and test sets for the EMNIST Balanced dataset that you will use in this coursework.
• To further improve performance and mitigate the problem identified in Task 1, you will
also need to implement a new class in the mlp.layers module:
DropoutLayer
and two weight penalty techniques in the mlp.penalties module:
L1Penalty and L2Penalty.
• A Jupyter notebook, DropoutandPenalty_tests.ipynb,
to be used for testing the implementations of the DropoutLayer, L1Penalty and L2Penalty classes.
The tests serve as a safeguard to prevent experimentation with faulty code which might lead to wrong
conclusions. Tests in general are a vital ingredient for good software development, and especially important
for building correct and efficient deep learning systems.
Please note that passing these preliminary tests does not necessarily mean your classes are absolutely
bug-free. If you get unexpected curves during model training, re-check your implementation of the classes.
• A directory called report which contains the LaTeX template and style files for your report. You should
copy all these files into the directory which will contain your report.
(a) Error curve on the training and validation set of EMNIST dataset.
(b) Accuracy curve on the training and validation set of EMNIST dataset.
Figure 1: Error and Accuracy curves for a baseline model on EMNIST Dataset.
4 Tasks
The coursework is structured into 3 tasks; the first two are supported by experiments on the EMNIST dataset.
1. Identification of a fundamental problem in machine learning as shown in Fig 1 and setting up a baseline
system on EMNIST by a valid hyper-parameter search.
2. A research investigation and analysis into whether using Dropout and/or Weight Penalty (L1Penalty and
L2Penalty) addresses the problem found in training machine learning models (Fig 1). How do these two
approaches improve/degrade the model’s performance?
3. Summarise and conclude the report, relating your conclusions to the overall literature.
5 Task 1: Problem identification
Figure 1 shows the training and validation error curves (Figure 1a) and the training and validation accuracy curves
(Figure 1b) for a model with 2 hidden layers¹ with ReLU activations, trained on the EMNIST dataset using the cross-entropy
error function. These curves can be reproduced by running the model settings defined in the Coursework1.ipynb
notebook in the GitHub repository. In this section, we first identify the problem shown by the curves in Figure 1 as
overfitting and discuss it, then briefly discuss potential solutions for overcoming it.
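As a rough illustration of how overfitting can be read off a pair of error curves, the sketch below looks for the epoch with the lowest validation error and checks whether the validation error rises afterwards while the training error keeps falling. The per-epoch values are invented for illustration only; they are not results from the coursework model, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical per-epoch errors, invented purely for illustration.
train_error = np.array([1.20, 0.80, 0.60, 0.45, 0.35, 0.28, 0.22, 0.18])
valid_error = np.array([1.25, 0.90, 0.75, 0.70, 0.72, 0.78, 0.85, 0.95])

# Epoch (counted from 1) at which the validation error is lowest.
best_epoch = int(np.argmin(valid_error)) + 1

# Generalisation gap per epoch: validation error minus training error.
gap = valid_error - train_error

# Overfitting signature: after the best epoch, the validation error has
# risen while the training error has continued to fall.
overfitting = bool(
    valid_error[-1] > valid_error[best_epoch - 1]
    and train_error[-1] < train_error[best_epoch - 1]
)

print(best_epoch, overfitting, gap[-1])
```

On real curves the same check would be applied to the statistics logged during training; a widening gap alone is suggestive, but the rise in validation error is the clearer signal.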
Varying number of hidden units. Initially you will train various 1-hidden layer networks using 32, 64
or 128 ReLU hidden units per layer on EMNIST. Note that a 1-hidden layer network contains two layers, one
mapping input units to hidden units and another mapping hidden units to output units. 2- and 3-hidden layer
networks contain 3 and 4 layers respectively. Make sure you use the Adam optimiser with the hyperparameters
provided in the template and train each network for 100 epochs. Visualise and discuss how increasing the number of
hidden units affects the validation performance and whether it worsens or mitigates the overfitting problem.
¹ All layers are hidden layers except the output one.
Varying number of layers. Here you will train various neural networks using 1, 2 or 3 hidden layers with
128 ReLU hidden units per layer on EMNIST. Make sure that you use the Adam optimiser with the hyperparameters
provided in the template and train each network for 100 epochs. Visualise and discuss how increasing the number of
layers affects the validation performance and whether it worsens or mitigates the overfitting problem.
The questions in mlp-cw1-questions.tex that you must answer, and that count towards this task, are:
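To make the layer counting in the two experiments concrete, the following sketch lists the (input, output) dimensions of each layer for a given width and depth and counts the trainable parameters. It assumes fully connected layers with a bias per output unit, 28×28 = 784 inputs and 47 outputs; the function names are hypothetical and not part of the mlp package.

```python
INPUT_DIM = 28 * 28   # flattened 28x28 image
OUTPUT_DIM = 47       # reduced EMNIST Balanced label set


def layer_shapes(num_hidden_layers, hidden_dim):
    """Return (n_in, n_out) for every layer, hidden layers plus output."""
    dims = [INPUT_DIM] + [hidden_dim] * num_hidden_layers + [OUTPUT_DIM]
    return list(zip(dims[:-1], dims[1:]))


def num_parameters(shapes):
    """Count weights and biases, assuming one bias per output unit."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


# Width experiment: 1 hidden layer with 32, 64 or 128 units.
for hidden_dim in (32, 64, 128):
    shapes = layer_shapes(1, hidden_dim)
    print("width", hidden_dim, shapes, num_parameters(shapes))

# Depth experiment: 1, 2 or 3 hidden layers with 128 units each.
for depth in (1, 2, 3):
    shapes = layer_shapes(depth, 128)
    print("depth", depth, len(shapes), "layers", num_parameters(shapes))
```

Comparing parameter counts across the configurations gives one way to relate model capacity to the size of the gap between training and validation error.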
• Question 1;
• Question 2;
• Question 5;
• Question 6;
• Question 7;
• Question Table 1;
• Question Figure 2;
• Question 8;
• Question 9;
• Question Table 2;
• Question Figure 3;
• Question 10; and
• Question 11.
(20 Marks)
6 Task 2: Mitigating the problem with Dropout and Weight Penalty
Definition and Motivation. We provide the analysis and explanation for Dropout, L1Penalty, and L2Penalty.
You will have to explain, in your own words, how one could use a combination of L1 and L2 regularisation,
discussing any potential benefits of this approach.
The question in mlp-cw1-questions.tex that you must answer, and that counts towards this part of the task, is:
• Question 12.
(10 Marks)
Implementing Dropout and Weight Penalty. Here you will implement DropoutLayer, L1Penalty and
L2Penalty and test their correctness. Here are the steps to follow:
1. Implement the DropoutLayer class in the mlp.layers module. You need to implement the
fprop and bprop methods for this class. Please note that the solution uses the original dropout formulation
(i.e. the hidden unit activations are scaled by the inclusion probability p in the final network to compensate for
the units that were missing during training). To pass the unit tests, the random samples used in your Dropout
implementation must be drawn from numpy’s uniform distribution, U(0,1).
2. Implement the L1Penalty and L2Penalty classes in the mlp.penalties
module. You need to implement the __call__ and grad methods for each class. Once defined,
instances of these classes can be passed as the weights_penalty and biases_penalty arguments of the AffineLayer class
when creating the multi-layer neural network.
3. Verify the correctness of your implementation using the supplied unit tests in
DropoutandPenalty_tests.ipynb
4. Automatically create the test output file xxxxxx_regularization_test_pack.npy by running the provided
program scripts/generate_regularization_layer_test_outputs.py, which uses your code for
the previously mentioned layers to run your fprop, bprop, __call__ and grad methods where necessary
for each layer on a unique test vector generated using your exam ID number.
To do this, go to the scripts/ folder and run
python generate_regularization_layer_test_outputs.py --exam_id Bxxxxxx
replacing Bxxxxxx with your exam number. A file called xxxxxx_regularization_test_pack.npy will
be generated under data, which you need to submit with your report.
(20 Marks)
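The ideas behind steps 1 and 2 can be sketched in standalone numpy, as below. This is a minimal illustration under stated assumptions, not the reference solution: the class names, constructor arguments and method signatures are hypothetical and are not taken from mlp.layers or mlp.penalties, and the factor of 0.5 in the L2 penalty is one common convention that the supplied unit tests may or may not share.

```python
import numpy as np


class DropoutSketch:
    """Original-formulation dropout (hypothetical standalone class)."""

    def __init__(self, incl_prob=0.5, rng=None):
        assert 0.0 < incl_prob <= 1.0
        self.incl_prob = incl_prob
        self.rng = rng if rng is not None else np.random.RandomState(0)
        self._mask = None

    def fprop(self, inputs, stochastic=True):
        if stochastic:
            # Training: keep a unit when its U(0, 1) sample is below p.
            self._mask = self.rng.uniform(size=inputs.shape) < self.incl_prob
            return inputs * self._mask
        # Final network: all units are present, so scale activations by p.
        return inputs * self.incl_prob

    def bprop(self, grads_wrt_outputs):
        # Gradients flow only through the units kept in the last fprop.
        return grads_wrt_outputs * self._mask


class L1PenaltySketch:
    """coefficient * sum(|w|) (hypothetical standalone class)."""

    def __init__(self, coefficient):
        assert coefficient > 0.0
        self.coefficient = coefficient

    def __call__(self, parameter):
        return self.coefficient * np.abs(parameter).sum()

    def grad(self, parameter):
        # Subgradient of |w| is sign(w), taken as 0 where w == 0.
        return self.coefficient * np.sign(parameter)


class L2PenaltySketch:
    """0.5 * coefficient * sum(w ** 2); the 0.5 factor is an assumption."""

    def __init__(self, coefficient):
        assert coefficient > 0.0
        self.coefficient = coefficient

    def __call__(self, parameter):
        return 0.5 * self.coefficient * (parameter ** 2).sum()

    def grad(self, parameter):
        return self.coefficient * parameter
```

In this sketch the mask sampled in fprop is stored so that bprop can reuse it, and the penalty objects return a scalar from __call__ and an array with the parameter's shape from grad. The classes in the provided modules may expect different arguments, so the skeletons in the repository take precedence over anything shown here.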