Exploratory data analysis (EDA)
• Describe the datasets: the data types, the number of users, the number of items, etc.
• Visualize the important aspects, such as the most rated items, the most popular items, the users who gave the most ratings, and the frequency of ratings.
• Check for missing data and for irregular values such as nan or np.inf.
• Decide how to pre-process the irregular data.
• Identify any other factors that can help you make predictions.
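The checks in this list can be sketched with pandas as follows. This is only an illustration on a toy table: the column name rating, the helper names summarize and clean, and the choice to drop irregular rows (rather than impute them) are assumptions, not requirements of the assignment.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the training data; the "rating" column name is an assumption.
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3, 3],
    "item_id": [10, 11, 10, 12, 11, 10],
    "rating": [4.0, np.nan, 5.0, np.inf, 3.0, 2.0],
})

def summarize(df):
    """Return basic counts plus the number of missing and infinite ratings."""
    return {
        "n_users": df["user_id"].nunique(),
        "n_items": df["item_id"].nunique(),
        "n_rows": len(df),
        "n_missing": int(df["rating"].isna().sum()),
        "n_infinite": int(np.isinf(df["rating"]).sum()),
    }

def clean(df):
    """Treat +/-inf as missing, then drop rows without a usable rating."""
    out = df.replace([np.inf, -np.inf], np.nan)
    return out.dropna(subset=["rating"]).reset_index(drop=True)

print(ratings.dtypes)
summary = summarize(ratings)
cleaned = clean(ratings)

# Most rated items and most active users (counts of valid ratings).
item_counts = cleaned["item_id"].value_counts()
user_counts = cleaned["user_id"].value_counts()
# Frequency of each rating value, ready to be drawn as a bar chart.
rating_freq = cleaned["rating"].value_counts().sort_index()

print(summary)
print(item_counts.head())
print(rating_freq)
```

On the real data, the same functions would be applied to the frame loaded from the training file, and the three count tables can be passed directly to a plotting call of your choice.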
Model building
You may try several models and pick the best one. I recommend presenting your attempts and your final model using the following structure.
• Attempt model 1: (i) which model you use; (ii) its hyperparameters, if any, and how you tune them; (iii) its performance on the Public Leaderboard; (iv) any issues you found; (v) how you would improve it.
• Attempt model 2: (i) which model you use; (ii) its hyperparameters, if any, and how you tune them; (iii) its performance on the Public Leaderboard; (iv) any issues you found; (v) how you would improve it.
• Maybe more attempts ...
• Your final model: (i) which model you use; (ii) its hyperparameters, if any, and how you tune them; (iii) its performance on the Public Leaderboard; (iv) an explanation of why you think this model is the best.
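As an illustration of what one attempt might look like, here is a minimal sketch of a baseline that predicts the global mean plus shrunken user and item biases, with the shrinkage strength reg tuned on a hold-out split. The model choice, the function names (fit_baseline, predict, rmse), the grid values, the rating column name, and the synthetic data are all assumptions made for the example. The hold-out RMSE stands in for the Public Leaderboard score, whose metric is not specified here.

```python
import numpy as np
import pandas as pd

def fit_baseline(train, reg):
    """Fit the global mean plus shrunken item and user biases."""
    mu = train["rating"].mean()
    # Item bias: shrunken mean residual per item.
    resid = train["rating"] - mu
    item_grp = resid.groupby(train["item_id"])
    b_i = item_grp.sum() / (item_grp.count() + reg)
    # User bias: shrunken mean of what the item bias leaves unexplained.
    resid_u = resid - train["item_id"].map(b_i)
    user_grp = resid_u.groupby(train["user_id"])
    b_u = user_grp.sum() / (user_grp.count() + reg)
    return mu, b_u, b_i

def predict(model, df):
    mu, b_u, b_i = model
    # Users or items not seen in training fall back to a zero bias.
    return (mu
            + df["user_id"].map(b_u).fillna(0.0)
            + df["item_id"].map(b_i).fillna(0.0))

def rmse(y_true, y_pred):
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic ratings with genuine user and item effects (illustration only).
rng = np.random.default_rng(0)
n_users, n_items, n_rows = 50, 30, 1000
user_eff = rng.normal(0.0, 0.8, n_users)
item_eff = rng.normal(0.0, 0.8, n_items)
users = rng.integers(0, n_users, n_rows)
items = rng.integers(0, n_items, n_rows)
noise = rng.normal(0.0, 0.3, n_rows)
data = pd.DataFrame({
    "user_id": users,
    "item_id": items,
    "rating": 3.0 + user_eff[users] + item_eff[items] + noise,
})

# Hold out 20% of the rows for validation.
order = rng.permutation(n_rows)
train, val = data.iloc[order[:800]], data.iloc[order[800:]]

# Grid search over the shrinkage strength.
grid = [0.0, 1.0, 5.0, 20.0]
val_rmse = {reg: rmse(val["rating"], predict(fit_baseline(train, reg), val))
            for reg in grid}
best_reg = min(val_rmse, key=val_rmse.get)

# Reference point: predicting the training mean for every row.
mean_only_rmse = rmse(val["rating"], np.full(len(val), train["rating"].mean()))
print(val_rmse, best_reg, mean_only_rmse)
```

A write-up for this attempt would then report the chosen reg, the score it obtained, and its limitation (it ignores user-item interactions), which motivates the next attempt.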
Result
• Print the user_id, item_id, and pred_rating for the T-th record in test.csv, where T is the last four digits of your student ID. For example, if your student ID is 1155111111, print the 1111-th record.
• Print the top-5 preferred items, ranked by your predicted ratings, for the user_id from the previous question.
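The two outputs above could be produced along these lines. This is a sketch under stated assumptions: the toy table stands in for test.csv with a pred_rating column already attached, records are assumed to be counted from 1, the student ID 1155000003 is hypothetical, and the helper names tth_record and top_k_items are made up for the example. The ranking here covers only the items that appear with that user in the table; ranking over all items would require a prediction for every candidate item.

```python
import pandas as pd

# Toy stand-in for test.csv with predictions already attached.
test = pd.DataFrame({
    "user_id": [7, 8, 7, 9, 7, 7, 7, 7],
    "item_id": [1, 2, 3, 4, 5, 6, 7, 8],
    "pred_rating": [3.5, 4.0, 4.8, 2.1, 4.9, 1.0, 4.2, 3.9],
})

def tth_record(df, student_id):
    """Return the T-th row as a one-row frame, where T is the last four
    digits of the student ID. Assumes records are counted from 1."""
    t = int(str(student_id)[-4:])
    return df.iloc[[t - 1]][["user_id", "item_id", "pred_rating"]]

def top_k_items(df, user_id, k=5):
    """Return the k items with the highest predicted rating for one user."""
    user_rows = df[df["user_id"] == user_id]
    return user_rows.nlargest(k, "pred_rating")[["item_id", "pred_rating"]]

record = tth_record(test, 1155000003)  # hypothetical ID, so T = 3
print(record)

target_user = record["user_id"].iloc[0]
top5 = top_k_items(test, target_user, k=5)
print(top5)
```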