COMP3162-Project 1 Solved

A large entertainment and bar franchise (Hard Knocks) would like to determine whether now is a good time to expand and which country or region of the world is the best place to open next. Ultimately, they want insight into whether to open their next franchise, based on general public sentiment and opportunities to maximize profit. They first want to gauge public opinion on the products they sell (beverages), which would indicate current public views of these products. In addition, they have a large dataset of sales data by region and country for different types of consumer goods (household, cosmetics, etc.), food and beverages. You will be helping Hard Knocks make their decision in this project! The required tasks have been broken down into several segments, and you will be required to start this week!

1.       For each set of tweets retrieved (a-b above), retain the following features only: text, screen_name, user_id, created_at, favourite_count, retweet_count, location, followers_count, friends_count, account_lang, lang. 

 

a)       Remove all non-English tweets (you must indicate how many tweets were removed).        

b)      A tweet is considered a duplicate if its text is the same as that of another tweet. Remove all duplicate tweets (you must indicate how many tweets were removed).

c)       Write the remaining tweet data to a file (.csv). The CSV filename should have the format <keyword>_<date>_<myname>. For example, the file for tweets on “beverage” retrieved on March 07 by Anderson would be: beverage_2021Mar07_Anderson.csv

 

d)      Write code to review and show details of the tweets retrieved, including the number of tweets (after doing 2a-c), the screen_name with the most followers, the tweet with the most retweets, and the location from which the most tweets originate.
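Steps a) to d) could be sketched in Python with pandas roughly as follows. This is only one possible approach, not a prescribed solution. It assumes the retrieved tweets are already loaded into a DataFrame whose columns match the feature names listed above, and that English tweets are marked with lang equal to "en". The function names clean_tweets, build_filename and summarize are hypothetical helpers introduced for illustration.

```python
from datetime import date

import pandas as pd

# Column names taken from the feature list in the brief.
FEATURES = [
    "text", "screen_name", "user_id", "created_at", "favourite_count",
    "retweet_count", "location", "followers_count", "friends_count",
    "account_lang", "lang",
]

# Spelled out explicitly so the filename does not depend on the system locale.
MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()


def clean_tweets(raw: pd.DataFrame):
    """Keep the required features, then drop non-English and duplicate tweets.

    Returns the cleaned frame and the number of rows removed at each step.
    Assumption: lang == "en" identifies English tweets.
    """
    tweets = raw[FEATURES]
    english = tweets[tweets["lang"] == "en"]
    removed_non_english = len(tweets) - len(english)
    # A duplicate is any tweet whose text matches an earlier tweet's text.
    unique = english.drop_duplicates(subset="text", keep="first")
    removed_duplicates = len(english) - len(unique)
    return unique.reset_index(drop=True), removed_non_english, removed_duplicates


def build_filename(keyword: str, retrieved: date, name: str) -> str:
    """Build <keyword>_<date>_<myname>.csv with the date written as 2021Mar07."""
    stamp = f"{retrieved.year}{MONTHS[retrieved.month - 1]}{retrieved.day:02d}"
    return f"{keyword}_{stamp}_{name}.csv"


def summarize(tweets: pd.DataFrame) -> dict:
    """Collect the details asked for in step d). Expects a non-empty frame."""
    top_account = tweets.loc[tweets["followers_count"].idxmax(), "screen_name"]
    top_tweet = tweets.loc[tweets["retweet_count"].idxmax(), "text"]
    # Ignore missing or blank locations when looking for the most common one.
    locations = tweets["location"].dropna()
    locations = locations[locations.str.strip() != ""]
    top_location = locations.value_counts().idxmax() if not locations.empty else None
    return {
        "tweet_count": len(tweets),
        "most_followed_screen_name": top_account,
        "most_retweeted_tweet": top_tweet,
        "top_location": top_location,
    }
```

With these helpers, step c) reduces to calling to_csv on the cleaned frame with build_filename(...) as the path and index=False, and the two removal counts returned by clean_tweets can be printed to satisfy the reporting requirement in a) and b).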

 

Repeat tasks 1 & 2 above. For task 1, use the words that were not used during the March 04-09 tweet retrieval. That is, if you retrieved tweets using “beverage”, you should now use “beer”, and if you used “party”, now use “concert” for part 1b.
