Module title: ACS61013
Assignment Name: Coursework 1
Tools to use: The majority of the MATLAB code you need to complete the assignment is
available from various lab sessions. If you are comfortable using Python, you are free to use
it. You are also free to use Orange for various aspects of the coursework as required.
Tasks and Mark Scheme: The aim of this coursework is to design, implement and evaluate
effective machine-learning pipelines for various tasks. The specific tasks and the
corresponding marking schemes are given in the table below. It is up to you to decide how
you approach the various tasks, design a solution and write up your results. For each task,
the mark within the grade boundary will be based on the discussion in your report and the
results obtained.
Task/Assessment Description    Mark Range    Level of achievement
Task 1: Conduct and write up a domain analysis that discusses
the important weather features that affect the energy usage of
a house, as well as the weather features that could
affect the energy generated by the solar panel attached to the
house.
Discuss how the findings of your domain analysis
will support and be carried over to other parts of your work.
Mark range: 0-10%. Level of achievement: 1.
Task 2: Achieve level 1 and conduct data cleaning,
pre-processing and feature engineering.
Discuss how you used your understanding of the domain from
level 1 to support this task. This should include a
discussion of which features to drop and which
relevant features to keep. Support your explanation by
applying dimensionality reduction techniques (e.g. PCA or
Hierarchical Clustering Analysis).
Mark range: 10-20%. Level of achievement: 2.
Task 3: Achieve all the previous levels and build a
regression model (decide which hypothesis function is
best to use, e.g. polynomial or linear) or a neural network
model to predict the value of energy usage from at least two
weather features you deemed important to keep in Task 2.
Mark range: 20-30%. Level of achievement: 3.
Task 4: Achieve all the previous levels and use
learning curves to discuss how effective your regression
machine learning pipeline is at preventing overfitting
and underfitting.
Mark range: 30-45%. Level of achievement: 4.
Task 5: Achieve all the previous levels and discuss which
cross-validation technique you applied in Task 4 above and
why.
Mark range: 45-50%. Level of achievement: 5.
Task 6: Using the features you consider most important to
this challenge, apply a classification machine learning
methodology (e.g. Decision Trees or Neural Networks) to build
a model that predicts when the energy usage of a house will be
LOW, MEDIUM or HIGH.
Use the classification metrics of confusion matrix, accuracy,
precision and recall to explain your results.
Can your model explain or highlight which appliance is used
most when the energy usage is LOW, MEDIUM or HIGH?
Mark range: 50-65%. Level of achievement: 6.
Task 7: From Task 6, compare the results of the Decision
Tree methodology with those of the Neural Network methodology using
the classification metrics of confusion matrix, accuracy,
precision and recall to explain your results.
Demonstrate and explain how model complexity (for both the
Decision Tree and the Neural Network) affects the results you
obtain.
Mark range: 65-80%. Level of achievement: 7.
Task 8: Achieve all the previous levels and the following:
- Using the dataset given, compare the results of the
  machine learning algorithms above with the results of
  two other algorithms that we have not covered in class.
- Discuss the mathematical peculiarities of the
  algorithms you have chosen (strengths and
  weaknesses) and how they impact the results you
  obtain.
- Apply the appropriate metrics to compare the
  algorithms you have chosen with the ones we have
  used in class.
Mark range: 80-100%. Level of achievement: 8.
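As an illustration only, and not a prescribed solution, the classification metrics named in Tasks 6 and 7 (confusion matrix, accuracy, precision and recall) can be sketched in plain Python as shown below. The cut points used to label usage as LOW, MEDIUM or HIGH, the usage readings and the predictions are all made-up assumptions, and the function names are hypothetical. How you define the three classes on the real dataset is for you to decide and justify.

```python
from collections import Counter

LABELS = ["LOW", "MEDIUM", "HIGH"]


def bin_usage(values, low_cut, high_cut):
    """Map each usage value [kW] to LOW, MEDIUM or HIGH using two cut points."""
    labels = []
    for v in values:
        if v < low_cut:
            labels.append("LOW")
        elif v < high_cut:
            labels.append("MEDIUM")
        else:
            labels.append("HIGH")
    return labels


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Build a matrix whose rows are true classes and columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, labels=LABELS):
    """Return {label: (precision, recall)} computed from the confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum: predicted as this class
        actual = sum(matrix[i])                    # row sum: truly this class
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        metrics[label] = (precision, recall)
    return metrics


# Made-up readings and predictions (assumptions, not taken from the dataset).
usage_kw = [0.2, 0.4, 0.9, 1.1, 1.8, 2.5]
y_true = bin_usage(usage_kw, low_cut=0.5, high_cut=1.5)
y_pred = ["LOW", "MEDIUM", "MEDIUM", "MEDIUM", "HIGH", "MEDIUM"]

cm = confusion_matrix(y_true, y_pred)
print(cm)
print(accuracy(cm))
print(per_class_metrics(cm))
```

Reporting precision and recall per class, rather than accuracy alone, shows which of the three classes a model confuses. In this toy example every error is a LOW or HIGH sample predicted as MEDIUM.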
Technical Report and code
Write up your results in a technical report of no more than 15 pages. Make sure your report has
a table of contents, sections, discussion and conclusions.
You must create MATLAB (or Python) code and an Orange pipeline design for your
solution(s). Support your report with an Orange pipeline design and MATLAB code. Make
sure you provide comments in your MATLAB code as well as instructions on how to run it.
Hand in your report (.pdf) and software (Orange and MATLAB (or Python)) via Blackboard by
11pm on the 9th of December 2022. This coursework makes up 60% of your total module
mark.
Appendix
Most of the features in the dataset are self-explanatory. However, I have highlighted a few
below:
Features            Description
Time                The time is provided in Unix epoch timestamp format. See this link
Use [kW]            This is the energy usage of the house. This is similar to the house overall [kW] column.
Gen [kW]            This is the energy generated by the solar panel attached to the building. This feature is similar to the Solar [kW] column.
Weather Icon        These are weather icons used to indicate weather conditions.
PrecipIntensity     This means precipitation intensity.
PrecipProbability   This means precipitation probability.
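
Because the Time feature above is a Unix epoch timestamp, it usually has to be converted before it is useful for analysis. The following is a minimal sketch using only the Python standard library. It assumes the values are in seconds (not milliseconds) and interprets them in UTC, and you should check both assumptions against the dataset. The function names and the example value are hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert a Unix epoch timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


def time_features(seconds):
    """Derive simple calendar features to use in place of the raw epoch value."""
    dt = epoch_to_utc(seconds)
    return {"hour": dt.hour, "weekday": dt.weekday(), "month": dt.month}


# Hypothetical example value: 1451624400 corresponds to 2016-01-01 05:00:00 UTC.
example = 1451624400
print(epoch_to_utc(example).isoformat())  # 2016-01-01T05:00:00+00:00
print(time_features(example))             # weekday 4 is Friday (Monday is 0)
```

Features such as hour of day or day of week are often easier to relate to energy usage than the raw timestamp, which is one way the domain analysis can feed into feature engineering.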