COMP9417 Machine Learning Project
1. Introduction
Online forums and platforms where users can interact and share their opinions have proliferated since the
introduction of the internet. As the scale of these platforms grows, it has become harder to monitor and identify
abusive or toxic comments manually. Consequently, researchers have turned to machine learning to help identify
these types of conversations at scale.
An opportunity to apply machine learning in this context can be found in the Toxic Comment Classification
Challenge on Kaggle. The purpose of the challenge is to implement a set of classification models capable of
detecting different subtypes of toxicity. Previous work on this topic has been limited to identifying whether
comments are toxic; in this competition we instead classify comments into the different subtypes of toxicity.
The data was provided by Kaggle and was derived from Wikipedia's Talk pages, with crowd-evaluated labels.
In this project we aim to build a set of models to classify whether a piece of text carries one or more of the
following labels: toxic, severe toxic, obscene, threat, insult, and identity hate. Note that the scope of our problem
is binary classification on six separate labels, not a single multiclass label. As such, a piece of text can be assigned
one or more of the listed labels, and we build one model for each label. The evaluation metric for our project
is the average of the individual ROC AUCs of the predicted labels, consistent with Kaggle's evaluation metric.
A sample of the text and labels is provided in the appendix.
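For concreteness, this metric can be computed by scoring each label column independently and averaging, as in the sketch below. The LABELS list and the mean_column_auc helper are illustrative names of our own, assuming scikit-learn is available; the competition code need not look like this.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

    def mean_column_auc(y_true, y_score):
        # y_true:  (n_samples, 6) array of 0/1 ground-truth labels.
        # y_score: (n_samples, 6) array of predicted probabilities.
        # Returns the average of the six per-label ROC AUCs, matching
        # Kaggle's column-wise evaluation metric.
        aucs = [roc_auc_score(y_true[:, i], y_score[:, i]) for i in range(len(LABELS))]
        return float(np.mean(aucs))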
2. Implementation
In this section we discuss the different aspects considered in building our final set of models, including feature
extraction from text, a review of learning algorithms appropriate for this task, hyperparameter tuning, nested
cross-validation, the implementation of an algorithm selection process to automate the model build across the
six labels, and finally, evaluation of our models on test data. Implementation was done using Python 3.8.