Module title: ACS61013
Assignment Name: Coursework 1
Tools to use: The majority of the MATLAB code you need to complete the assignment is available from the lab sessions. If you are comfortable using Python, you are free to use it. You are also free to use Orange for various aspects of the coursework as required.

Tasks and Mark Scheme: The aim of this coursework is to design, implement and evaluate effective machine-learning pipelines for various tasks. The specific tasks and the corresponding marking schemes are listed below, each with its mark range and level of achievement. It is up to you to decide how you approach the various tasks, design a solution and write up your results. For each task, the mark within the grade boundary will be based on the discussion in your report and the results obtained.

Task 1 (Mark Range: 0-10%; Level of achievement: 1): Conduct and write a domain analysis that discusses the important weather features that affect the energy usage of a house, as well as the weather features that could affect the energy generated by the solar panel attached to the house. Discuss how the findings of your domain analysis will support, and be carried over to, other parts of your work.

Task 2 (Mark Range: 10-20%; Level of achievement: 2): Achieve level 1 as well as conduct data cleaning, pre-processing and feature engineering. Discuss how you used your understanding of the domain from level 1 to support this task. This should also include a discussion of which features to drop and which relevant features to keep. Support your explanation by applying dimension reduction techniques (e.g. PCA or Hierarchical Clustering Analysis).

Task 3 (Mark Range: 20-30%; Level of achievement: 3): Achieve all the previous levels as well as build a regression model (decide which hypothesis function is best to use, e.g. polynomial or linear) or a neural network model to predict the value of energy usage from at least two weather features you deemed important to keep in Task 2.

Task 4 (Mark Range: 30-45%; Level of achievement: 4): Achieve all the previous levels as well as use learning curves to discuss how effective your regression machine-learning pipeline is at preventing overfitting and underfitting.

Task 5 (Mark Range: 45-50%; Level of achievement: 5): Achieve all the previous levels plus discuss which cross-validation technique you applied in Task 4 above and why.

Task 6 (Mark Range: 50-65%; Level of achievement: 6): Using the features you consider most important to this challenge, apply a classification machine-learning methodology (e.g. Decision Trees or Neural Network) to build a model that predicts when the energy usage of a house will be LOW, MEDIUM or HIGH. Use the classification metrics of confusion matrix, accuracy, precision and recall to explain your results. Can your model explain or highlight which appliance is used most when the energy usage is LOW, MEDIUM or HIGH?

Task 7 (Mark Range: 65-80%; Level of achievement: 7): From Task 6, compare the results of the Decision Tree methodology with those of the Neural Network methodology, using the classification metrics of confusion matrix, accuracy, precision and recall to explain your results. Demonstrate and explain how model complexity (for both the Decision Tree and the Neural Network) affects the results you obtain.

Task 8 (Mark Range: 80-100%; Level of achievement: 8): Achieve all the previous levels and, using the dataset given, compare the results of the machine-learning algorithms above with the results of two other algorithms that we have not covered in class. Discuss the mathematical peculiarities of the algorithms you have chosen (strengths and weaknesses) and how they impact the results you obtain. Apply the appropriate metrics to compare the algorithms you have chosen with the ones we have used in class.
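
The classification metrics named in Tasks 6 and 7 (confusion matrix, accuracy, precision and recall) can be computed directly from true and predicted labels. The following is a minimal Python sketch using only the standard library. The helper names (`confusion_matrix`, `accuracy`, `precision_recall`) are hypothetical, and the example labels are made up for illustration; they are not taken from the coursework dataset.

```python
from collections import Counter

# Class order used for the rows and columns of the matrix.
LABELS = ["LOW", "MEDIUM", "HIGH"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return a matrix where rows are true classes and columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def precision_recall(matrix, labels=LABELS):
    """Return {label: (precision, recall)} computed one class against the rest."""
    result = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        predicted_as_label = sum(row[i] for row in matrix)  # column sum
        actually_label = sum(matrix[i])                     # row sum
        precision = true_positive / predicted_as_label if predicted_as_label else 0.0
        recall = true_positive / actually_label if actually_label else 0.0
        result[label] = (precision, recall)
    return result


# Made-up labels, for illustration only.
y_true = ["LOW", "LOW", "MEDIUM", "MEDIUM", "HIGH", "HIGH"]
y_pred = ["LOW", "MEDIUM", "MEDIUM", "MEDIUM", "HIGH", "LOW"]

cm = confusion_matrix(y_true, y_pred)
print(cm)
print(accuracy(cm))
print(precision_recall(cm))
```

Reporting precision and recall per class, rather than accuracy alone, shows which of the LOW, MEDIUM and HIGH classes a model confuses. This matters when comparing a Decision Tree with a Neural Network whose overall accuracy is similar.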
Technical Report and code: Write up your results in a technical report of no more than 15 pages. Make sure your report has a table of contents, sections, discussion and conclusions. You must create MATLAB (or Python) code and an Orange pipeline design for your solution(s), and use both to support your report. Make sure you provide comments in your MATLAB (or Python) code as well as instructions on how to run it. Hand in your report (.pdf) and software (Orange and MATLAB (or Python)) via Blackboard by 11pm on the 9th of December 2022. This coursework makes up 60% of your total module mark.

Appendix: Most of the features in the dataset are self-explanatory. However, I have highlighted a few below, each with its description:

Time: The time is provided in Unix epoch timestamp format. See this link
Use [kW]: This is the energy usage of the house. It is similar to the house overall [kW] column.
Gen [kW]: This is the energy generated by the solar panel attached to the building. It is similar to the Solar [kW] column.
Weather Icon: These are weather icons used to indicate weather conditions.
PrecipIntensity: This means precipitation intensity.
PrecipProbability: This means precipitation probability.
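
Because the Time feature is stored as a Unix epoch timestamp, it usually needs converting to a calendar date and time before time-based features can be derived during pre-processing. The following is a minimal Python sketch, assuming the timestamps are in seconds and should be interpreted as UTC. Neither assumption is confirmed by this brief, so check both against the dataset. The helper names and the sample value are hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_utc(timestamp):
    """Convert a Unix epoch timestamp (assumed to be in seconds) to a UTC datetime."""
    return datetime.fromtimestamp(int(timestamp), tz=timezone.utc)


def time_features(timestamp):
    """Derive simple calendar features that may help explain energy usage."""
    moment = epoch_to_utc(timestamp)
    return {
        "hour": moment.hour,          # 0-23
        "weekday": moment.weekday(),  # Monday = 0, Sunday = 6
        "month": moment.month,        # 1-12
    }


# Illustrative value only, not taken from the coursework dataset.
sample = 1451624400
print(epoch_to_utc(sample).isoformat())
print(time_features(sample))
```

If the dataset's timestamps turn out to be in local time or in milliseconds, the conversion needs adjusting before any hour-of-day or seasonal analysis.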