CS5014
Machine Learning
EXAM DURATION: 2 hours
EXAM INSTRUCTIONS
(a) Answer three questions.
(b) Each question carries 20 marks.
(c) Answer questions in the script book.
YOU MUST HAND IN THIS EXAM PAPER AT THE END OF THE EXAM.
PLEASE DO NOT TURN OVER THIS EXAM PAPER UNTIL
YOU ARE INSTRUCTED TO DO SO.
1. Support Vector Machines
(a) With the help of a diagram, explain the concept of margin in binary classifiers and
explain why it is important.
[3 marks]
(b) Explain the differences between hard-margin and soft-margin classifiers.
[3 marks]
(c) Linear SVM and logistic regression are both linear classifiers. Explain the
differences between the two classifiers in terms of:
i. Loss functions. [3 marks]
ii. Optimisation strategies. [4 marks]
(d) What is the relationship between Support Vector Machines and regularisation?
[3 marks]
(e) The function $K(x, x') = (1 + \langle x, x' \rangle)^d$ is known as the polynomial kernel.
Explain its relevance to Support Vector Machines: why is it useful, and how is
it used?
[4 marks]
[Total 20 marks]
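As an illustration of the kernel in part (e), the following minimal Python sketch assumes the standard form $(1 + \langle x, x' \rangle)^d$ with $d = 2$. The function names `poly_kernel` and `phi_degree2` and the two sample vectors are hypothetical, chosen only for this example. It checks that the kernel value equals an ordinary inner product taken in an explicit degree-2 feature space, which is the property an SVM relies on when it replaces inner products with kernel evaluations.

```python
from itertools import combinations
from math import sqrt


def poly_kernel(x, z, d=2):
    """Polynomial kernel (1 + <x, z>)^d for two equal-length sequences."""
    return (1.0 + sum(a * b for a, b in zip(x, z))) ** d


def phi_degree2(x):
    """Explicit feature map whose inner product equals poly_kernel with d=2."""
    feats = [1.0]                                  # constant term
    feats += [sqrt(2.0) * a for a in x]            # linear terms
    feats += [a * a for a in x]                    # squared terms
    feats += [sqrt(2.0) * a * b                    # pairwise cross terms
              for a, b in combinations(x, 2)]
    return feats


# Illustrative (made-up) input vectors.
x = [1.0, 2.0]
z = [3.0, -1.0]

# Kernel evaluated directly in the original 2-dimensional space.
k_implicit = poly_kernel(x, z, d=2)

# Same quantity computed as an inner product in the expanded space.
k_explicit = sum(a * b for a, b in zip(phi_degree2(x), phi_degree2(z)))
```

Both `k_implicit` and `k_explicit` evaluate to the same number, but the first never builds the expanded feature vector. The saving grows quickly with the input dimension and the degree.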
2. Basis Expansions
(a) What are basis functions and why are they useful in machine learning?
[3 marks]
(b) Using a diagram, give an example of polynomial regression.
[3 marks]
(c) You are given training data, from which you obtain the following scatter plot:
[Figure: scatter plot of the training data; not reproduced here.]
Explain the process of fitting a curve (regression model) to these data. Your
solution should give the general form of the equation which defines your
model. You do not need to calculate the values of the parameters of your
model.
[6 marks]
(d) What is a cubic spline and how does it relate to basis expansion? Illustrate your
answer using a diagram.
[5 marks]
(e) Explain how the regularisation parameter influences L1-regularised
polynomial regression.
[3 marks]
[Total 20 marks]
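To make the idea of a basis expansion concrete, here is a small Python sketch of polynomial regression using only the standard library. It is one possible approach, not a model answer. The helper names (`poly_basis`, `solve`, `fit_polynomial`) and the noise-free synthetic data are assumptions made for illustration. Each scalar input is mapped to the basis vector $[1, x, x^2, \dots, x^p]$ and the weights are found by solving the normal equations.

```python
def poly_basis(x, degree):
    """Map a scalar x to the basis vector [1, x, x^2, ..., x^degree]."""
    return [x ** j for j in range(degree + 1)]


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):                    # back substitution
        tail = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - tail) / M[i][i]
    return w


def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations (Phi^T Phi) w = Phi^T y."""
    Phi = [poly_basis(x, degree) for x in xs]       # design matrix
    k = degree + 1
    A = [[sum(row[i] * row[j] for row in Phi) for j in range(k)]
         for i in range(k)]
    b = [sum(row[i] * y for row, y in zip(Phi, ys)) for i in range(k)]
    return solve(A, b)


# Synthetic, noise-free data generated from y = 1 - 2x + 0.5x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 - 2.0 * x + 0.5 * x ** 2 for x in xs]

w = fit_polynomial(xs, ys, degree=2)  # recovers roughly [1, -2, 0.5]
```

The model stays linear in the weights `w` even though it is non-linear in `x`, which is why an ordinary least-squares solver is enough.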
3. Gradient Descent
(a) Describe what feature scaling is and why it is useful in machine learning.
[2 marks]
(b) A fellow machine learning researcher is running gradient descent for a regression
task on a single dataset. They have produced three figures of the cost function
over number of epochs from three versions (V1, V2, and V3) of a model they
have produced:
i. Which version of the model (V1, V2 or V3) should they use and why?
[2 marks]
ii. For the versions of the model that they shouldn’t use, what would you
recommend they change in their algorithm, and why?
[1 mark]
[Figure: three plots, labelled V1, V2 and V3, each showing Cost against Epochs (iterations), with axis tick labels 0, 1,000 and 10,000; the curves are not reproduced here.]
(c) What are the advantages and disadvantages of using the normal equation
instead of gradient descent for a linear regression task?
[3 marks]
(d) What is the logistic sigmoid function? Your explanation should include a sketch
of the logistic sigmoid function, the logistic sigmoid function equation and
some text to describe it. Also explain why it is sometimes used in classification
tasks.
[3 marks]
(e) Using linear logistic regression, the task is to classify 1,000,000 images
into 5 vehicle classes (cars, boats, planes, trucks, and trains).
How can multiclass classification be done? Explain the strengths and
weaknesses of the approach.
[3 marks]
(f) Describe the cost function used in gradient descent for logistic regression and
why it’s used. Include an equation of the cost function to be used in gradient
descent for logistic regression including regularisation. Describe the terms you
use and explain why they are in the equation.
[6 marks]
[Total 20 marks]
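The interaction between feature scaling, learning rate and the cost-versus-epochs curve can be explored with a short Python sketch. It is a hedged illustration only: the data values, the learning rates `0.1` and `2.5`, and the function names are all assumptions invented for this example and do not come from the exam figures. It runs batch gradient descent for one-feature linear regression on a standardised input and records the cost after every epoch.

```python
def standardise(xs):
    """Scale a feature to zero mean and unit variance."""
    mean = sum(xs) / len(xs)
    std = (sum((x - mean) ** 2 for x in xs) / len(xs)) ** 0.5
    return [(x - mean) / std for x in xs], mean, std


def cost(xs, ys, w0, w1):
    """Half mean squared error of the line w0 + w1 * x."""
    m = len(xs)
    return sum((w0 + w1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, lr, epochs):
    """Batch gradient descent; returns weights and the cost after each epoch."""
    w0 = w1 = 0.0
    m = len(xs)
    history = []
    for _ in range(epochs):
        errors = [w0 + w1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        w0 -= lr * grad0
        w1 -= lr * grad1
        history.append(cost(xs, ys, w0, w1))
    return w0, w1, history


# Made-up data with a large-valued feature, so scaling matters.
xs_raw = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
ys = [200.0, 290.0, 410.0, 500.0, 600.0]
xs, mean, std = standardise(xs_raw)

# A moderate learning rate: the recorded cost falls towards a plateau.
w0, w1, good = gradient_descent(xs, ys, lr=0.1, epochs=200)

# A learning rate that is too large: the recorded cost grows each epoch.
_, _, bad = gradient_descent(xs, ys, lr=2.5, epochs=20)
```

Plotting `good` and `bad` against the epoch number gives the two typical shapes: a curve that flattens out, and one that diverges.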
4. Neural Networks
(a) Explain how a single neuron (logistic unit) is used in an artificial neural
network. Your answer should include a sketch and explanatory text.
[3 marks]
(b) How many neurons are required in the output layer to classify inputs of
multiple types of medical blood test data into cancer or not-cancer classes?
Your answer should also explain why.
[1 mark]
(c) Describe the forward propagation in neural networks for this task of classifying
cancer or not cancer using the inputs from three medical blood data features.
The network architecture you should use has the following properties: there is
an input layer, 2 hidden layers and an output layer. You get to decide the
number of neurons per layer. Your answer should include a sketch, any equations
used and explanatory text of the process.
[5 marks]
(d) What is the purpose of backpropagation in neural networks?
[2 marks]
(e) Design a machine learning program, utilising neural networks, which will
automatically decide whether to buy, sell, or retain your stocks in oil company
EvilOil on the stock market. You are given hourly stock market data (stock
prices) from the last 3 years for all the companies (including EvilOil) that are
traded on major stock market exchanges all over the world. You are also given
data on the number of times that EvilOil is mentioned on social media sites,
again hourly over the 3-year period. You are allowed to use additional data you
can acquire if you wish. Your answer should include all the steps you take and
how you evaluate the technique. You do not need to write the program, just the
steps that the program should take (and why).
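The forward propagation asked about in part (c) can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the 3-4-3-1 layer sizes, the hand-picked weights and biases, the sample input, the 0.5 decision threshold, and the use of a logistic sigmoid in every layer. Each layer computes a weighted sum of the previous layer's activations plus a bias, then applies the sigmoid.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, activations):
    """One layer: out[j] = sigmoid(sum_i weights[j][i] * in[i] + biases[j])."""
    return [sigmoid(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)]


def forward(network, x):
    """Propagate input x through every (weights, biases) layer in turn."""
    a = list(x)
    for weights, biases in network:
        a = layer_forward(weights, biases, a)
    return a


# Hypothetical 3-4-3-1 network with illustrative, untrained parameters.
network = [
    # hidden layer 1: 4 neurons, each with 3 input weights
    ([[0.2, -0.1, 0.4], [0.5, 0.3, -0.2], [-0.3, 0.8, 0.1], [0.1, 0.1, 0.1]],
     [0.0, 0.1, -0.1, 0.0]),
    # hidden layer 2: 3 neurons, each with 4 input weights
    ([[0.3, -0.5, 0.2, 0.1], [0.7, 0.1, -0.4, 0.2], [-0.2, 0.6, 0.3, -0.1]],
     [0.05, -0.05, 0.0]),
    # output layer: 1 neuron with 3 input weights
    ([[0.6, -0.4, 0.9]], [0.1]),
]

# Three made-up, already-scaled blood test features.
x = [0.5, 1.2, -0.7]

p = forward(network, x)[0]                   # output activation in (0, 1)
label = "cancer" if p >= 0.5 else "not-cancer"
```

With trained rather than hand-picked parameters, `p` would be read as the estimated probability of the positive class.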