Exploratory data analysis (EDA)
• Check the description of the datasets, such as the data types and the number of users, items, etc.
• Visualize important aspects such as the most rated items, the most popular items, the users with the most ratings, and the frequency of ratings.
• Explore any other factors that can help you make predictions.
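As a starting point for the checks above, here is a minimal standard-library sketch of the basic counts. It assumes each training row is a (user_id, item_id, rating) tuple; the `rating` field, the in-memory sample, and the `summarize` helper are illustrative assumptions, not part of the assignment data.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample; real rows would be read from the training file.
# Assumed row layout: (user_id, item_id, rating)
ratings = [
    (1, 10, 5.0),
    (1, 11, 3.0),
    (2, 10, 4.0),
    (3, 10, 4.0),
    (3, 12, 2.0),
]

def summarize(rows):
    """Return basic statistics for a list of (user_id, item_id, rating) rows."""
    users = Counter(u for u, _, _ in rows)    # ratings given per user
    items = Counter(i for _, i, _ in rows)    # ratings received per item
    values = Counter(r for _, _, r in rows)   # how often each rating value occurs
    return {
        "n_ratings": len(rows),
        "n_users": len(users),
        "n_items": len(items),
        # Share of the user-item matrix that is actually filled in
        "density": len(rows) / (len(users) * len(items)),
        "mean_rating": mean(r for _, _, r in rows),
        "most_rated_items": items.most_common(3),
        "most_active_users": users.most_common(3),
        "rating_frequency": sorted(values.items()),
    }

summary = summarize(ratings)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The counters in `most_rated_items`, `most_active_users`, and `rating_frequency` can be passed directly to a plotting library to produce the bar charts and histograms for the report.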
Model building
You may try many models and pick the best one. I recommend presenting your final model using the following structure.
• Attempt model 1: (i) Which model do you want to use? (ii) Are there any hyperparameters, and how do you tune them? (iii) Performance on the Public Leaderboard. (iv) Any issues? (v) How can it be improved?
• Attempt model 2: (i) Which model do you want to use? (ii) Are there any hyperparameters, and how do you tune them? (iii) Performance on the Public Leaderboard. (iv) Any issues? (v) How can it be improved?
• Maybe more attempts ...
• Your final model: (i) Which model do you want to use? (ii) Are there any hyperparameters, and how do you tune them? (iii) Performance on the Public Leaderboard. (iv) Explain why you think this model is the best.
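To illustrate what one "attempt" might look like, the sketch below fits a simple bias baseline (global mean plus damped user and item offsets) and tunes its single hyperparameter on a held-out split. Everything here is an assumption chosen for illustration: the baseline model itself, the `reg` damping parameter, RMSE as the metric, and the toy train/validation rows. The actual leaderboard metric may differ.

```python
from collections import defaultdict
from math import sqrt

def fit_bias_model(train, reg):
    """Fit rating ~ mu + b_user + b_item using damped mean residuals."""
    mu = sum(r for _, _, r in train) / len(train)
    # Item bias: damped mean of (rating - mu) over the item's ratings
    item_sum, item_cnt = defaultdict(float), defaultdict(int)
    for _, i, r in train:
        item_sum[i] += r - mu
        item_cnt[i] += 1
    b_item = {i: item_sum[i] / (reg + item_cnt[i]) for i in item_sum}
    # User bias: damped mean of the residual left after the item bias
    user_sum, user_cnt = defaultdict(float), defaultdict(int)
    for u, i, r in train:
        user_sum[u] += r - mu - b_item[i]
        user_cnt[u] += 1
    b_user = {u: user_sum[u] / (reg + user_cnt[u]) for u in user_sum}
    return mu, b_user, b_item

def predict(model, user, item):
    """Predict a rating; unseen users or items fall back to a zero bias."""
    mu, b_user, b_item = model
    return mu + b_user.get(user, 0.0) + b_item.get(item, 0.0)

def rmse(model, rows):
    """Root mean squared error over (user_id, item_id, rating) rows."""
    return sqrt(sum((predict(model, u, i) - r) ** 2 for u, i, r in rows) / len(rows))

# Hypothetical toy split; a real run would hold out part of the training data.
train = [(1, 10, 5.0), (1, 11, 3.0), (2, 10, 4.0),
         (2, 12, 2.0), (3, 11, 4.0), (3, 12, 1.0)]
valid = [(1, 12, 2.0), (2, 11, 3.0), (3, 10, 4.0)]

# Grid search over the damping strength, scored on the validation rows
results = {reg: rmse(fit_bias_model(train, reg), valid) for reg in (0.0, 1.0, 5.0, 25.0)}
best_reg = min(results, key=results.get)
for reg, score in results.items():
    print(f"reg={reg}: validation RMSE={score:.4f}")
print("best reg:", best_reg)
```

Reporting the validation score for each hyperparameter value next to the Public Leaderboard score makes it easier to spot overfitting, which feeds directly into items (iv) and (v).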
Result
• Print the user_id, item_id, and pred_rating for the T-th record in test.csv, where T is the last four digits of your student ID. For example, if your student ID is 1155111111, print the 1111-th record.
• Print the top-5 preferred items, based on your predicted ratings, for the user_id in the above question.
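The two outputs above could be produced along the following lines. This sketch assumes that records are counted from 1 (excluding any header row), that predictions are held as (user_id, item_id, pred_rating) tuples in test.csv order, and that the top-5 ranking is taken over a candidate item set you choose. The helper names and the `toy_score` function are hypothetical stand-ins for your trained model.

```python
def tth_record(predictions, student_id):
    """Return the T-th prediction, T = last four digits of the student ID.

    Assumes 1-based counting, so T = 1111 maps to list index 1110.
    """
    t = int(str(student_id)[-4:])
    if t < 1:
        raise ValueError("T must be at least 1 under 1-based counting")
    return predictions[t - 1]

def top_k_items(score, user, candidate_items, k=5):
    """Rank candidate items for one user by predicted rating, highest first."""
    ranked = sorted(candidate_items, key=lambda item: score(user, item), reverse=True)
    return ranked[:k]

def toy_score(user, item):
    """Hypothetical stand-in for a trained model's predicted rating."""
    return 1.0 + (user * 3 + item * 7) % 5

# Hypothetical predictions in test-file order: (user_id, item_id, pred_rating)
predictions = [(n % 50, n % 40, toy_score(n % 50, n % 40)) for n in range(2000)]

user_id, item_id, pred_rating = tth_record(predictions, 1155111111)
print(f"user_id={user_id}, item_id={item_id}, pred_rating={pred_rating}")

# Rank all 40 toy items for that user and keep the five highest predictions
top5 = top_k_items(toy_score, user_id, range(40))
print("top-5 items:", top5)
```

Whether items the user has already rated should be excluded from `candidate_items` is not stated in the instructions, so state your choice in the report.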