with Neural Networks
100 points
1 Introduction
Prime numbers are fantastic. In this assignment we will use Multi-Layer Perceptrons (MLPs) to detect handwritten prime digits. But before tackling such a difficult task, I suggest trying to solve an easier problem with an MLP. If you succeed, which I know you will, you can proceed to the challenging problem of detecting handwritten prime digits.
2 Regression - Toy Data
The first task is to learn the function that generated the following data, using a simple neural network.
The function that produced this data is actually y = x^2 + ε, where ε ~ N(0, σ) is random noise drawn from a normal distribution with a small variance. We are going to use an MLP with one hidden layer of 5 neurons to learn an approximation of this function from the data that we have. This assignment comes with starter code, which is incomplete; you are supposed to complete it.
2.1 Technical Details
2.1.1 Code
The code that comes with this assignment has multiple files, including:
assignment/
    toy_example_regressor.py
    layers.py
    neural_network.py
    utils.py
• toy_example_regressor.py contains most of the code related to the training procedure, including loading the data, iteratively feeding mini-batches of data to the neural network, plotting the approximated function and the data, etc. Please read this file and understand it, but you don't need to modify it.
• layers.py contains the definitions of the layers that we use in this assignment, including DenseLayer, SigmoidLayer, and L2LossLayer. Your main responsibility is to implement the forward and backward functions for these layers.
• neural_network.py contains the definition of a neural network (the NeuralNet class), which is an abstract class. This class basically takes care of running the forward pass and propagating the gradients backwards, from the loss to the first layer; a rough sketch of this flow is given after this list.
• utils.py contains some useful functions.
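For intuition, the flow handled by NeuralNet can be sketched as below. This is a minimal sketch, not the actual starter code: the constructor, the gradient wiring between layers, and the update method signature are all assumptions; only the method names (compute_activations, compute_gradients, update_weights) come from the assignment description.

    class NeuralNet:
        """Sketch: run the forward pass, then backpropagate from the loss."""

        def __init__(self, layers):
            self.layers = layers  # ordered list of layers, loss layer last

        def forward(self, x):
            # Each layer consumes the previous layer's output.
            for layer in self.layers:
                x = layer.compute_activations(x)
            return x

        def backward(self):
            # Gradients flow from the loss layer back to the first layer.
            grad = None  # assumption: the loss layer seeds the chain itself
            for layer in reversed(self.layers):
                layer._output_error_gradient = grad
                layer.compute_gradients()
                grad = layer._input_error_gradient

        def update(self, learning_rate):
            for layer in self.layers:
                layer.update_weights(learning_rate)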
2.1.2 Data
The training data for this problem, which consists of input data and labels, can
be generated by the function get_data(), which you can find in the main file,
toy_example_regressor.py.
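As a rough illustration of what get_data() produces, data matching the description above (y = x^2 + ε) could be generated as follows. This is only a sketch: the sample count, noise level, and input range are assumptions, and the real implementation is the one in toy_example_regressor.py.

    import numpy as np

    def get_data(n_samples=100, sigma=0.1):
        # Inputs drawn from an assumed range; labels follow y = x^2 + eps.
        x = np.random.uniform(-1.0, 1.0, size=(n_samples, 1))
        noise = np.random.normal(0.0, sigma, size=(n_samples, 1))
        y = x ** 2 + noise
        return x, y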
2.1.3 Network Structure
For the regression problem (i.e. the first task) we defined a new class, SimpleNet
, which is inherited from NeuralNet. SimpleNet contains two DenseLayer
s, which one of them has hidden neurons with Sigmoid activation functions.
Network definition can be found in toy_example_regressor.py.
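In other words, the structure is roughly the following (a sketch only: the constructor arguments are assumptions, and the authoritative definition is in toy_example_regressor.py):

    from layers import DenseLayer, SigmoidLayer, L2LossLayer

    layers = [
        DenseLayer(1, 5),   # 1 input -> 5 hidden neurons
        SigmoidLayer(),     # sigmoid activations on the hidden neurons
        DenseLayer(5, 1),   # 5 hidden neurons -> 1 output
        L2LossLayer(),      # squared-error loss for regression
    ]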
2.2 Your Task (total: 80 points)
2.2.1 Implementing compute_activations, compute_gradients, and update_weights
functions
Three types of layers are defined in the layers.py file: DenseLayer, SigmoidLayer, and L2LossLayer. However, the implementation of DenseLayer is incomplete. You are supposed to implement the following functions:
• DenseLayer: This is a simple dense (also called linear or fully connected) layer that has two types of parameters: weights w and biases b.
– compute_activations (15 points): The value of every output neuron is o_i = x · w_i + b_i. The numbers of input and output neurons are specified in the __init__ function.
– compute_gradients (20 points): Assume that the gradient of the loss with respect to the output neurons, self._output_error_gradient, has already been computed by the next layer. You need to compute the gradients of the loss with respect to all the parameters of this layer (i.e., b and w) and store them in self.dw and self.db so that you can use them to update the parameters later. Needless to say, the shape of dw should equal the shape of w, and the same goes for db and b. In addition, you should compute the gradient of the loss with respect to the input, which is the output of the previous layer, and store it in self._input_error_gradient. This value will be passed on to the previous layer in the network, where it will be used to compute the gradients recursively (backpropagation); see the NumPy sketch after this list.
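For reference, one way these two functions could look in NumPy is sketched below. Everything here rests on assumed shapes: x is a mini-batch of shape (batch, n_in), w has shape (n_in, n_out), and b has shape (n_out,); the initialization scheme is also an assumption. The attribute names follow the description above.

    import numpy as np

    class DenseLayer:
        def __init__(self, n_in, n_out):
            # Assumed initialization; the starter code may differ.
            self.w = 0.1 * np.random.randn(n_in, n_out)
            self.b = np.zeros(n_out)

        def compute_activations(self, x):
            self._input = x              # cached for the backward pass
            return x @ self.w + self.b   # o_i = x . w_i + b_i

        def compute_gradients(self):
            g = self._output_error_gradient      # dL/do, shape (batch, n_out)
            self.dw = self._input.T @ g          # same shape as w
            self.db = g.sum(axis=0)              # same shape as b
            # Gradient with respect to the input, handed to the previous layer.
            self._input_error_gradient = g @ self.w.T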