CS5014 Machine Learning
EXAM DURATION: 2 hours
EXAM INSTRUCTIONS
(a) Answer three questions.
(b) Each question carries 20 marks.
(c) Answer questions in the script book.
YOU MUST HAND IN THIS EXAM PAPER AT THE END OF THE EXAM.
PLEASE DO NOT TURN OVER THIS EXAM PAPER UNTIL YOU ARE INSTRUCTED TO DO SO.
1. Support Vector Machines
(a) With the help of a diagram, explain the concept of margin in binary classifiers and explain why it is important. [3 marks]
(b) Explain the differences between hard-margin and soft-margin classifiers. [3 marks]
(c) Linear SVM and logistic regression are both linear classifiers. Explain the differences between the two classifiers in terms of:
i. Loss functions. [3 marks]
ii. Optimisation strategies. [4 marks]
(d) What is the relationship between Support Vector Machines and regularisation? [3 marks]
(e) The function k(x, x′) = (1 + ⟨x, x′⟩)^d is known as the polynomial kernel. Explain its relevance to Support Vector Machines: why is it useful, and how is it used? [4 marks]
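For orientation only, the kernel in part (e) can be evaluated numerically. The sketch below assumes the common form k(x, x′) = (1 + ⟨x, x′⟩)^d with degree d = 2 and two-dimensional inputs. The function names and the sample vectors are illustrative and are not part of the exam.

```python
import math


def polynomial_kernel(x, z, degree=2):
    """Assumed form: k(x, z) = (1 + <x, z>)^degree for equal-length sequences."""
    inner = sum(a * b for a, b in zip(x, z))
    return (1.0 + inner) ** degree


def explicit_features_degree2(x):
    """Explicit feature map matching the kernel for degree 2 and 2-D inputs only."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]


# Illustrative sample points (hypothetical values).
x = [1.0, 2.0]
z = [3.0, -1.0]

# Kernel evaluated directly in the 2-D input space.
k_value = polynomial_kernel(x, z)

# The same quantity as an inner product in the 6-D expanded feature space.
phi_x = explicit_features_degree2(x)
phi_z = explicit_features_degree2(z)
phi_value = sum(a * b for a, b in zip(phi_x, phi_z))

print(k_value, phi_value)
```

The two printed values agree up to floating-point rounding, although the first is computed without ever constructing the expanded features.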
[Total 20 marks]
2. Basis Expansions
(a) What are basis functions and why are they useful in machine learning? [3 marks]
(b) Using a diagram, give an example of polynomial regression. [3 marks]
(c) You are given training data, from which you obtain the following scatter plot:
Explain the process of fitting a curve (regression model) to these data. Your solution should give the general form of the equation which defines your model. You do not need to calculate the values of the parameters of your model. [6 marks]
(d) What is a cubic spline and how does it relate to basis expansion? Illustrate your answer using a diagram. [5 marks]
(e) Explain how the regularisation parameter influences L1-regularised polynomial regression. [3 marks]
[Total 20 marks]
3. Gradient Descent
(a) Describe what feature scaling is and why it is useful in machine learning. [2 marks]
(b) A fellow machine learning researcher is running gradient descent for a regression task on a single dataset. They have produced three figures of the cost function over the number of epochs from three versions (V1, V2, and V3) of a model they have produced:
i. Which version of the model (V1, V2 or V3) should they use and why? [2 marks]
ii. For the versions of the model that they shouldn’t use, what would you recommend they change in their algorithm and why? [1 mark]
(c) What are the advantages and disadvantages of using the normal equation instead of gradient descent for a linear regression task? [3 marks]
(d) What is the logistic sigmoid function? Your explanation should include a sketch of the logistic sigmoid function, the logistic sigmoid function equation and some text to describe it. Also explain why it is sometimes used in classification tasks. [3 marks]
(e) Using linear logistic regression, the task is to classify 1,000,000 images into 5 vehicle classes (cars, boats, planes, trucks, and trains). How can multiclass classification be done? Explain the strengths and weaknesses of the approach. [3 marks]
(f) Describe the cost function used in gradient descent for logistic regression and why it’s used. Include an equation of the cost function to be used in gradient descent for logistic regression, including regularisation. Describe the terms you use and explain why they are in the equation. [6 marks]
[Total 20 marks]
4. Neural Networks
(a) Explain how a single neuron (logistic unit) is used in an artificial neural network. Your answer should include a sketch and explanatory text. [3 marks]
(b) How many neurons are required in the output layer to classify inputs of multiple types of medical blood test data into cancer or not-cancer classes? Your answer should also explain why. [1 mark]
(c) Describe forward propagation in neural networks for this task of classifying cancer or not-cancer using the inputs from three medical blood data features. The network architecture you should use has the following properties: there is an input layer, 2 hidden layers and an output layer. You get to decide the number of neurons per layer. Your answer should use a sketch, any equations used and explanatory text of the process. [5 marks]
(d) What is the purpose of backpropagation in neural networks? [2 marks]
(e) Design a machine learning program, utilising neural networks, which will automatically decide whether to buy, sell, or retain your stocks in oil company EvilOil on the stock market. You are given hourly stock market data (stock prices) from the last 3 years for all the companies (including EvilOil) that are traded on major stock market exchanges all over the world. You are also given data on the number of times that EvilOil is mentioned on social media sites, again hourly over the 3-year period. You are allowed to use additional data you can acquire if you wish. Your answer should include all the steps you take and how you evaluate your technique. You do not need to write the program, just the steps that the program should take (and why).