CMPT 291 Section OP01 Database Programming
Objectives
The goal of this lab is to compute the mean absolute error (MAE) of your rating predictions using the “leave one out” strategy explained in the lecture on evaluating recommender systems.
Demonstrating/Submitting
To get credit for the lab you must either demonstrate the lab to a TA/instructor during office hours or record a demo, host it online, and include a link along with your submitted code on Brightspace. You can use the Kaltura Capture and MediaSpace resources provided by Carleton or your own software/hosting resources to record and share the demonstration. For partners, only one demonstration or submission is required. If demonstrating during office hours, inform the instructor/TA of your partner’s name. If submitting on Brightspace, include both partners’ names in a README file.
Lab Description
For this lab, you will be provided with a larger dataset (parsed-data-trimmed.txt) that follows a similar structure to the ratings data used in previous labs. The full dataset will be used for the assignment. The only change to the structure of the data is that the value 0 is now used to indicate no rating instead of -1.
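To make the handling of the 0 values concrete, here is a minimal loading sketch in Python. The exact layout of parsed-data-trimmed.txt is an assumption here (a plain whitespace-separated matrix of scores, one row per user); adjust the parsing if the real file carries header or label lines, as in the previous labs. The names ratings, rated, and user0_avg are illustrative only. The important point is that 0 is treated as “missing”, never as a real score.

    import numpy as np

    # Assumption: parsed-data-trimmed.txt is a whitespace-separated matrix of
    # integer scores, one row per user. Adapt this if the file has header lines.
    ratings = np.loadtxt("parsed-data-trimmed.txt")

    # 0 now marks "no rating" (previous labs used -1), so keep a boolean mask of
    # the real ratings and never average or correlate over the zeros.
    rated = ratings > 0

    # Example: a user's average rating is computed over their rated items only.
    user0_avg = ratings[0, rated[0]].mean()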
The goal for this lab is to report the mean absolute error achieved by either the user-based or the item-based recommendation algorithm covered in the last few weeks. Whichever algorithm you decide to use, use a neighbourhood size of 5 for your calculations, and do not include any neighbours whose similarity is less than 0.
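As an illustration, one way the user-based variant of this step could be set up is sketched below, assuming the Pearson-correlation similarity commonly presented in lecture (substitute whatever similarity your lecture used). The function names pearson_sim and top_neighbours are hypothetical, and ratings and rated follow the loading sketch above; top_neighbours keeps at most 5 neighbours and drops any with negative similarity, as required.

    import numpy as np

    def pearson_sim(ratings, rated, u, v):
        """Pearson correlation between users u and v over their co-rated items."""
        common = rated[u] & rated[v]
        if not common.any():
            return 0.0
        # Means over each user's own rated items (one common lecture convention;
        # use means over co-rated items instead if that is what was presented).
        mu = ratings[u, rated[u]].mean()
        mv = ratings[v, rated[v]].mean()
        du = ratings[u, common] - mu
        dv = ratings[v, common] - mv
        den = np.sqrt(np.sum(du ** 2)) * np.sqrt(np.sum(dv ** 2))
        return 0.0 if den == 0 else float(np.sum(du * dv) / den)

    def top_neighbours(sims, k=5):
        """Keep at most k neighbours, and only those with similarity >= 0."""
        positive = [(s, v) for s, v in sims if s >= 0]
        positive.sort(reverse=True)
        return positive[:k]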
Now that we are working with data that is closer to what would be seen in a real-world application, we will have to deal with some unexpected outcomes in our calculations. For example, if the denominator in any of your calculations comes out to 0, you will get an output of NaN, -Infinity, or Infinity. These should not be included as real predictions. Instead, a suitable “best guess” in any of these cases is the average rating score of the user (remember to compute it while ignoring the rating currently being predicted). You may also run into cases where you want to use a neighbourhood of size X but fewer than X suitable neighbours are available. In these cases, use as many neighbours as you can (up to a maximum of X).
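A hedged sketch of how those edge cases could be handled is given below, again assuming the mean-centred user-based prediction formula from lecture; the function name predict and its parameters are illustrative only. The fallback to the user's average (computed with the held-out rating excluded) covers the zero-denominator case, and passing in fewer than 5 neighbours is handled naturally by summing over whatever is available.

    def predict(ratings, rated, u, i, neighbours):
        """Predict user u's score for item i from up to 5 positive-similarity neighbours."""
        # User u's average rating, computed with the (u, i) rating held out.
        others = rated[u].copy()
        others[i] = False
        user_avg = ratings[u, others].mean() if others.any() else 0.0

        num, den = 0.0, 0.0
        for sim, v in neighbours:          # may be fewer than 5 neighbours
            v_avg = ratings[v, rated[v]].mean()
            num += sim * (ratings[v, i] - v_avg)
            den += abs(sim)

        if den == 0.0:
            # Dividing here would give NaN or +/-Infinity, so fall back to the
            # user's average instead of recording a bogus prediction.
            return user_avg
        return user_avg + num / den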
For your demonstration, you should discuss how you have implemented the ‘leave one out’ strategy, how you have computed your predictions, and how you have computed the mean absolute error over all of your predicted ratings. It should be possible to achieve an MAE of ~0.77 using 5 neighbours and either the user-based or item-based recommendation algorithm.
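For reference, a sketch of the full leave-one-out loop, built on the hypothetical helpers above, might look like the following: each known rating is hidden in turn, predicted from the remaining data, and the absolute errors are averaged at the end.

    def leave_one_out_mae(ratings, rated, k=5):
        """Hide each known rating, predict it, and return the mean absolute error."""
        total_error, count = 0.0, 0
        n_users, n_items = ratings.shape
        for u in range(n_users):
            for i in range(n_items):
                if not rated[u, i]:
                    continue                          # only real ratings are tested
                rated[u, i] = False                   # leave this one rating out
                sims = [(pearson_sim(ratings, rated, u, v), v)
                        for v in range(n_users)
                        if v != u and rated[v, i]]    # neighbour must have rated item i
                neighbours = top_neighbours(sims, k)  # at most k, similarity >= 0
                total_error += abs(predict(ratings, rated, u, i, neighbours) - ratings[u, i])
                count += 1
                rated[u, i] = True                    # restore the held-out rating
        return total_error / count

    print(leave_one_out_mae(ratings, rated))          # expect roughly 0.77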