
ITE4005-Long Term Project Solved

1.  Predict the ratings of movies in the test data by using the given training data, which contains users' movie ratings. You may choose any prediction algorithm (e.g., content-based or collaborative filtering algorithms). For a content-based algorithm, you can refer to the MovieLens web page for content related to our training and test data (http://grouplens.org/datasets/movielens/).

(Note) You only need to predict ratings for the user-item pairs that appear in the test data.
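As a starting point before trying a full collaborative filtering method, a simple bias baseline can be sketched as follows. This is only an illustrative sketch, not the required algorithm: it predicts the global mean rating plus a per-user and a per-item offset. The function names `fit_baseline` and `predict` are hypothetical, and the 1 to 5 rating range used for clipping is an assumption about the data, not something stated in this assignment.

```python
from collections import defaultdict


def fit_baseline(ratings):
    """Fit a bias baseline from (user_id, item_id, rating) tuples.

    Returns (global_mean, user_bias, item_bias), where each bias is the
    average deviation of that user's (or item's) ratings from the global mean.
    """
    ratings = list(ratings)
    global_mean = sum(r for _, _, r in ratings) / len(ratings)

    user_dev = defaultdict(list)
    item_dev = defaultdict(list)
    for user, item, r in ratings:
        user_dev[user].append(r - global_mean)
        item_dev[item].append(r - global_mean)

    user_bias = {u: sum(d) / len(d) for u, d in user_dev.items()}
    item_bias = {i: sum(d) / len(d) for i, d in item_dev.items()}
    return global_mean, user_bias, item_bias


def predict(model, user, item, lo=1.0, hi=5.0):
    """Predict one rating; unseen users or items contribute a bias of 0.

    lo/hi assume a 1-5 rating scale (an assumption, adjust if the data differs).
    """
    global_mean, user_bias, item_bias = model
    raw = global_mean + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)
    return min(hi, max(lo, raw))
```

A user or item that never appears in the training data falls back to the global mean, which keeps the predictor defined for every pair in the test data.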

 

3. Requirements
The program must meet the following requirements:

•  Executable file name: recommender.exe

•  Execute the program with two arguments: training data name, test data name

   ▪  Example:

      -  training data name = ‘u1.base’, test data name = ‘u1.test’

•  File format for the training data

[user_id]\t[item_id]\t[rating]\t[time_stamp]\n

[user_id]\t[item_id]\t[rating]\t[time_stamp]\n

[user_id]\t[item_id]\t[rating]\t[time_stamp]\n

[user_id]\t[item_id]\t[rating]\t[time_stamp]\n

...

   ▪  Row: a record of a rating that a user has already given to an item

   ▪  Example:

Figure 1. An example of training data.

   ▪  Five training data files will be provided: ‘u1.base’, ‘u2.base’, ‘u3.base’, ‘u4.base’, and ‘u5.base’

•  File format for the test data

[user_id]\t[item_id]\t[rating]\t[time_stamp]\n

[user_id]\t[item_id]\t[rating]\t[time_stamp]\n

[user_id]\t[item_id]\t[rating]\t[time_stamp]\n

[user_id]\t[item_id]\t[rating]\t[time_stamp]\n

...

   ▪  Row: a record whose rating must be predicted by your algorithm

   ▪  Example:

Figure 2. An example of test data.

   ▪  Five test data files will be provided: ‘u1.test’, ‘u2.test’, ‘u3.test’, ‘u4.test’, and ‘u5.test’

•  Output file format

   ▪  You must write one output file for each test data file

   ▪  File format for the output of ‘u#.test’

      -  ‘u#.base_prediction.txt’

[user_id]\t[item_id]\t[rating]\n

[user_id]\t[item_id]\t[rating]\n

[user_id]\t[item_id]\t[rating]\n

[user_id]\t[item_id]\t[rating]\n

...

   ▪  ‘u#.base_prediction.txt’ must contain every user-item pair in the test data, together with the rating your algorithm predicted for that pair

   ▪  The output file name must follow the naming scheme above
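The input and output requirements above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference implementation: it assumes user and item IDs and time stamps are integers and ratings are numeric, and it writes the output file into a chosen directory using only the base name of the training file. The names `load_ratings`, `output_name`, `write_predictions`, `run`, and the `fit` hook are hypothetical.

```python
import os


def load_ratings(path):
    """Read tab-separated rows: user_id, item_id, rating, time_stamp."""
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            user, item, rating, timestamp = line.split("\t")
            # Assumption: IDs and time stamps are integers, ratings are numeric.
            rows.append((int(user), int(item), float(rating), int(timestamp)))
    return rows


def output_name(training_path):
    """Map e.g. 'u1.base' to 'u1.base_prediction.txt' (named after the training file)."""
    return os.path.basename(training_path) + "_prediction.txt"


def write_predictions(training_path, test_rows, predict, out_dir="."):
    """Write one 'user\\titem\\trating' line per user-item pair in the test data."""
    out_path = os.path.join(out_dir, output_name(training_path))
    with open(out_path, "w", encoding="utf-8") as f:
        for user, item, _rating, _timestamp in test_rows:
            f.write(f"{user}\t{item}\t{predict(user, item)}\n")
    return out_path


def run(argv, fit, out_dir="."):
    """argv = [program, training data name, test data name].

    `fit` is a hypothetical hook: it takes the training rows and returns
    a function predict(user, item) -> rating.
    """
    if len(argv) != 3:
        raise SystemExit("usage: recommender.exe <training data name> <test data name>")
    training_path, test_path = argv[1], argv[2]
    predict = fit(load_ratings(training_path))
    return write_predictions(training_path, load_ratings(test_path), predict, out_dir)
```

In a script entry point this could be called as `run(sys.argv, fit)` after `import sys`, with `fit` supplied by whichever prediction algorithm you choose.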

 

4. Evaluation measure
•  Compute the difference between each prediction file (u1~u5.base_prediction.txt) and the corresponding test data (u1~u5.test)

•  Test method

   ▪  For testing, we will use a measure called RMSE (Root Mean Square Error), defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (p_i - a_i)^2}{n}}$$

($p_i$: predicted rating for item $i$, $a_i$: original rating for item $i$, $n$: the number of ratings)

   ▪  Because RMSE measures prediction error, a larger value means that the ratings were predicted less accurately
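To check your own results before submission, the RMSE definition above can be computed with a short helper. This is a sketch only: the function name `rmse` is hypothetical, and it assumes the predicted and original ratings are given as two sequences in the same order.

```python
import math


def rmse(predicted, actual):
    """Root Mean Square Error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    # Sum of squared differences (p_i - a_i)^2 over all n ratings.
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))
```

For example, predictions of 1 and 5 against original ratings of 3 and 3 give squared errors of 4 and 4, a mean of 4, and therefore an RMSE of 2.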
