In this assignment, you will develop several classification models to classify noisy input images into the classes square or circle, as shown in Fig. 1.
Figure 1: Samples of noisy images labelled as square (left) and circle (right).
Your classification models will use the training and testing sets (available with this assignment), which contain many image samples labelled as square or circle. Your task is to write Python code that can be run in a Jupyter Notebook session and that trains and validates the following classification models:
1) K nearest neighbour (KNN) classifier [35 marks]. For the KNN classifier, you can only use standard Python libraries (e.g., numpy) to implement all aspects of the training and testing algorithms. You will need to implement two functions: a) one to build a K-d tree from the training set (this function takes the training samples and labels as its parameters), and b) another to test the KNN classifier and compute the classification accuracy, where the parameters are K and the test images and labels. Using matplotlib, plot a graph of the evolution of classification accuracy for the training and testing sets as a function of K, where K = 1 to 10. Clearly identify the value of K at which generalisation is best.
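The two KNN functions described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference solution: the names `build_kdtree`, `knn_search` and `knn_accuracy`, the dictionary node layout, and the choice to pass the tree explicitly to the test function are all hypothetical. It also assumes each image has already been flattened to a 1-D feature vector (for example with `images.reshape(len(images), -1)`) and that labels are the integers 0 and 1.

```python
import numpy as np

def build_kdtree(points, labels, depth=0):
    """Recursively build a K-d tree; each node is a plain dict (assumed layout)."""
    if len(points) == 0:
        return None
    axis = depth % points.shape[1]          # cycle through the feature axes
    order = np.argsort(points[:, axis])     # sort samples along the current axis
    points, labels = points[order], labels[order]
    mid = len(points) // 2                  # the median sample becomes the node
    return {
        "point": points[mid],
        "label": labels[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], labels[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], labels[mid + 1:], depth + 1),
    }

def knn_search(node, query, k, best=None):
    """Return up to k (squared distance, label) pairs, nearest first."""
    if best is None:
        best = []
    if node is None:
        return best
    dist = float(np.sum((query - node["point"]) ** 2))
    best.append((dist, node["label"]))
    best.sort(key=lambda pair: pair[0])
    del best[k:]                            # keep only the k closest so far
    diff = query[node["axis"]] - node["point"][node["axis"]]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    knn_search(near, query, k, best)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th neighbour (or fewer than k neighbours were found).
    if len(best) < k or diff ** 2 < best[-1][0]:
        knn_search(far, query, k, best)
    return best

def knn_accuracy(tree, k, test_points, test_labels):
    """Fraction of test samples whose majority-vote label is correct."""
    correct = 0
    for x, y in zip(test_points, test_labels):
        neighbours = knn_search(tree, x, k)
        votes = np.bincount([label for _, label in neighbours])
        correct += int(np.argmax(votes) == y)   # ties go to the lower label
    return correct / len(test_labels)
```

Calling `knn_accuracy` for K = 1 to 10 on both the training and testing sets would give the two curves to plot. Note that K-d tree pruning becomes much less effective when the number of features is large, as it is for raw pixel vectors, so the search may approach brute-force cost.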
For the decision tree classifier, you can only use standard Python libraries (e.g., numpy) to implement all aspects of the training and testing algorithms. You will need to implement two functions: a) one to train the decision tree using the training samples and labels plus a pre-pruning parameter indicating the minimum information content below which splitting stops, and b) another to test the decision tree and compute the classification accuracy (as with the KNN classifier, the test function takes the test images and labels among its parameters and returns the classification accuracy). Using matplotlib, plot a graph of the evolution of classification accuracy for the training and testing sets as a function of the information content, where information content = 0 to 0.5 bits. Clearly identify the value of information content at which generalisation is best.
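One possible reading of the decision tree requirements is sketched here, with several assumptions made explicit. "Information content" is taken to mean the Shannon entropy of the labels at a node, in bits, and a node is turned into a leaf when that entropy is at or below the pre-pruning parameter (called `min_ic` here). The function names, the dictionary node layout, and the integer labels 0 and 1 are hypothetical choices for illustration.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy in bits of an integer label vector (0 and 1 assumed)."""
    if len(labels) == 0:
        return 0.0
    p = np.bincount(labels, minlength=2) / len(labels)
    p = p[p > 0]                             # skip empty classes: 0*log(0) = 0
    return float(-np.sum(p * np.log2(p)))

def best_split(X, y):
    """Return (feature, threshold, gain) with the highest information gain."""
    parent = entropy(y)
    best = None
    for f in range(X.shape[1]):
        values = np.unique(X[:, f])
        # Candidate thresholds: midpoints between consecutive distinct values
        for t in (values[:-1] + values[1:]) / 2:
            left = X[:, f] <= t
            n_left = left.sum()
            child = (n_left * entropy(y[left])
                     + (len(y) - n_left) * entropy(y[~left])) / len(y)
            gain = parent - child
            if best is None or gain > best[2]:
                best = (f, t, gain)
    return best                              # None if no feature can be split

def train_tree(X, y, min_ic):
    """Grow a binary tree; stop when node entropy is at or below min_ic."""
    majority = int(np.argmax(np.bincount(y, minlength=2)))
    if entropy(y) <= min_ic:                 # pre-pruning rule (assumed reading)
        return {"leaf": majority}
    split = best_split(X, y)
    if split is None or split[2] <= 0:       # no split reduces the entropy
        return {"leaf": majority}
    f, t, _ = split
    left = X[:, f] <= t
    return {"feature": f, "threshold": t,
            "left": train_tree(X[left], y[left], min_ic),
            "right": train_tree(X[~left], y[~left], min_ic)}

def predict_one(node, x):
    """Walk from the root to a leaf and return the leaf's class."""
    while "leaf" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]

def tree_accuracy(tree, X, y):
    """Fraction of samples in X whose predicted class matches y."""
    preds = np.array([predict_one(tree, x) for x in X])
    return float(np.mean(preds == y))
```

Sweeping `min_ic` from 0 to 0.5 and recording `tree_accuracy` on both sets would produce the requested curves. The exhaustive threshold search in `best_split` is simple but slow on raw pixel features, so it is shown for clarity rather than speed.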
For the convolutional neural network, you are allowed to use Keras with the TensorFlow backend, similar to the example shown in the code provided. The CNN structure is the LeNet structure used in the lecture. Using matplotlib, plot a graph of the evolution of accuracy for the training and testing sets as a function of the number of epochs, where the maximum number of epochs is 200. Clearly identify the number of epochs at which generalisation is best.
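Each of the three models above asks for the point where generalisation is best. A small helper along these lines could pick that point from recorded accuracy curves. It assumes, as one reasonable interpretation rather than a stated rule, that "best generalisation" means the highest testing accuracy, with ties resolved in favour of the earliest parameter value. The function name `best_operating_point` is hypothetical.

```python
import numpy as np

def best_operating_point(param_values, train_acc, test_acc):
    """Return (parameter, train accuracy, test accuracy) at the best test accuracy.

    param_values: swept values, e.g. K = 1..10, information content, or epochs.
    train_acc, test_acc: accuracies recorded at each swept value, same length.
    """
    test_acc = np.asarray(test_acc, dtype=float)
    i = int(np.argmax(test_acc))   # np.argmax returns the first maximum on ties
    return param_values[i], float(train_acc[i]), float(test_acc[i])
```

The returned triple corresponds to one row of an accuracy table, and the parameter value can be marked on the matplotlib plot, for instance with a vertical line.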
Sample code that trains and tests a multi-layer perceptron classifier in a Jupyter Notebook session is provided, and the submitted code is expected to run in a Jupyter Notebook session in a similar manner. A held-out test set will be used to test the generalisation of the implemented classification models, but this held-out set will only be available after the assignment deadline. Please note that this held-out set will contain samples obtained from the same distributions used to generate the training and testing sets.
You must write the program yourself in Python, and the code must be a single file that can run in a Jupyter Notebook session (file type .ipynb). You will only get marks for the parts that you implemented yourself. If you use a library package or language function call for training or testing a KNN or a decision tree classifier, then you will be limited to 50% of the available marks (noting that this assignment is a hurdle for the course). If there is evidence that you have simply copied code from the web, you will be awarded no marks and referred for plagiarism.
You must submit, by the due date, two files:
a) The training and testing accuracies at the best generalisation operating point for each type of classifier, using a table [5 marks]:
Classifier | Training Accuracy | Testing Accuracy
K=1 NN | |
… | |
K=10 NN | |
DT (IC = 0 bits) | |
… | |
DT (IC = 0.5 bits) | |
CNN | |
b) Running times of the training and testing algorithms for each type of classifier, using a table [5 marks]:
Classifier | Training Time | Testing Time
K=1 NN | |
… | |
K=10 NN | |
DT (IC = 0 bits) | |
… | |
DT (IC = 0.5 bits) | |
CNN | |
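One simple way to collect the running times for such a table is to wrap each training or testing call in a timer. The helper below is a sketch using `time.perf_counter` from the standard library. The name `timed` is hypothetical, and wall-clock time is an assumed choice of measure, since the assignment does not specify one.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()      # monotonic, high-resolution clock
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

For example, `model, train_time = timed(train_fn, X_train, y_train)` would record the training time of a hypothetical `train_fn`, and the same wrapper can be applied to the test function.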
c) Bonus question: How can the classification accuracy of the decision tree classifier be improved? Please implement your idea (hint: dimensionality reduction) [10 marks].
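The hint mentions dimensionality reduction without naming a method. Principal component analysis (PCA) is one common option, and it is an assumption here that it suits this data. The sketch below uses only numpy. The names `fit_pca` and `apply_pca` are hypothetical, and images are assumed to be flattened to one row per sample.

```python
import numpy as np

def fit_pca(X, n_components):
    """Estimate the mean and top principal directions from training data."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def apply_pca(X, mean, components):
    """Project samples onto the principal directions found by fit_pca."""
    return (X - mean) @ components.T
```

The projection should be fitted on the training set only and then applied unchanged to the testing set, so that no information from the test images leaks into training. The decision tree would then be trained on the reduced features instead of raw pixels.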