DS501 - HW 3 - Solved

Manipulating and Blending Data from Twitter




Problem 1: Sampling Twitter Data with Streaming API about a certain topic
•     Select a topic that you are interested in, for example, “#WPI” or “#DataScience”

•     Use the Twitter Streaming API to sample a collection of tweets about this topic in real time. (We recommend collecting more than 50 but fewer than 500 tweets.)

•     Store the tweets you downloaded in a local CSV file.

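library(twitteR)
library(stringr)

# consumerKey, consumerSecret, accessToken and accessTokenSecret must hold your own API credentials
setup_twitter_oauth(consumerKey, consumerSecret, accessToken, accessTokenSecret)
tweets = searchTwitter('#rstats', n=50)  # fetch tweets matching the topic
tweetsDF = twListToDF(tweets)            # convert the list of tweets to a data frame
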
Report some statistics about the tweets you collected

•     The topic of interest: <INSERT YOUR TOPIC HERE>

•     The total number of tweets collected: <INSERT THE NUMBER HERE>
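
The storing and counting steps of Problem 1 can be sketched as follows. This is a minimal sketch in base R, not a definitive solution: the toy data frame stands in for the `tweetsDF` built above, its rows are made up for illustration, and the file name `tweets.csv` is a hypothetical choice.

```r
# Toy stand-in for tweetsDF (the data frame produced by twListToDF);
# the rows below are invented for illustration only.
tweetsDF <- data.frame(
  text = c("Learning #rstats today", "RT @user: #rstats is fun, really"),
  screenName = c("alice", "bob"),
  retweetCount = c(0, 3),
  stringsAsFactors = FALSE
)

# Write the tweets to a local CSV file ("tweets.csv" is a hypothetical name;
# tempdir() is used here only so the sketch does not clutter the working directory).
csvPath <- file.path(tempdir(), "tweets.csv")
write.csv(tweetsDF, csvPath, row.names = FALSE)

# Read the file back and report how many tweets were stored.
saved <- read.csv(csvPath, stringsAsFactors = FALSE)
cat("Total number of tweets collected:", nrow(saved), "\n")
```

Reading the file back is a cheap way to confirm that tweet text containing commas or quotes survived the round trip.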

Problem 2: Analyzing Tweets and Tweet Entities with Frequency Analysis
1.     Word Count:

•     Use the tweets you collected in Problem 1, and compute the frequencies of the words being used in these tweets.

# Your R code here

•     Display a table of the top 30 words (ONLY) with their counts

# Your R code here
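
One possible shape for the word count is sketched below, under stated assumptions: it uses base R only (the `stringr` package loaded earlier would work as well), the `texts` vector is a made-up stand-in for `tweetsDF$text`, and the cleaning rule (lowercase, keep only ASCII letters and digits) is a simplification that ignores URLs, stop words, and non-English characters.

```r
# Made-up stand-in for tweetsDF$text.
texts <- c("Learning #rstats today",
           "rstats is fun today",
           "Fun with data and rstats")

# Normalise: lowercase, then replace everything except a-z, 0-9 and spaces.
cleaned <- gsub("[^a-z0-9 ]", " ", tolower(texts))

# Split on runs of spaces and drop empty tokens.
words <- unlist(strsplit(cleaned, " +"))
words <- words[words != ""]

# Count occurrences and keep at most the 30 most frequent words.
wordCounts <- sort(table(words), decreasing = TRUE)
top30 <- head(wordCounts, 30)

# Present the result as a two-column table.
top30DF <- data.frame(word = names(top30), count = as.integer(top30),
                      stringsAsFactors = FALSE)
print(top30DF, row.names = FALSE)
```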

2.     Find the most popular tweets in your collection of tweets

•     Please display a table of the top 10 tweets (ONLY) that are the most popular in your collection, i.e., the tweets with the highest retweet counts.

# Your R code here
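
Ranking by retweet count could look like the sketch below. It assumes a data frame with the `text`, `screenName` and `retweetCount` columns that `twListToDF` produces for tweets; the rows here are invented so the example runs without API access.

```r
# Invented stand-in for tweetsDF.
tweetsDF <- data.frame(
  text = c("tweet a", "tweet b", "tweet c", "tweet d"),
  screenName = c("alice", "bob", "carol", "dave"),
  retweetCount = c(5, 42, 0, 17),
  stringsAsFactors = FALSE
)

# Sort rows from most to least retweeted.
ranked <- tweetsDF[order(tweetsDF$retweetCount, decreasing = TRUE), ]

# Keep at most the first 10 rows and only the columns worth displaying.
top10 <- head(ranked[, c("screenName", "text", "retweetCount")], 10)
print(top10, row.names = FALSE)
```

In a real collection, several retweets of the same original may share one count, so removing duplicate texts before ranking may give a more useful table.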

3.     Find the most popular Tweet Entities in your collection of tweets

•     Please display a table of the top 10 hashtags (ONLY) and the top 10 user mentions (ONLY) that are the most popular in your collection of tweets.

# Your R code here
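
Hashtags and user mentions can be pulled out of the tweet text with regular expressions, as in this sketch. The helper name `extractTop`, the sample `texts`, and the patterns (a `#` or `@` followed by letters, digits or underscores) are assumptions made for illustration; the patterns are a simplification and will miss hashtags containing non-ASCII characters.

```r
# Made-up stand-in for tweetsDF$text.
texts <- c("RT @alice: loving #rstats and #DataScience",
           "@bob have you tried #rstats?",
           "Thanks @alice! #RStats")

# Hypothetical helper: find every match of `pattern`, fold to lowercase
# (hashtags and screen names are case-insensitive), and return the n most
# frequent matches with their counts.
extractTop <- function(texts, pattern, n = 10) {
  matches <- unlist(regmatches(texts, gregexpr(pattern, texts)))
  head(sort(table(tolower(matches)), decreasing = TRUE), n)
}

topHashtags <- extractTop(texts, "#[A-Za-z0-9_]+")
topMentions <- extractTop(texts, "@[A-Za-z0-9_]+")

print(topHashtags)
print(topMentions)
```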

Problem 3: Getting any 20 friends and any 20 followers of a popular user on Twitter
•     Choose a Twitter user who has many followers, such as @hadleywickham.

•     Get the list of friends and followers of the Twitter user.

•     Display 20 of the followers, showing their ID numbers and screen names in a table.

•     Display 20 of the friends (if the user has more than 20 friends), showing their ID numbers and screen names in a table.

•     Compute the mutual friends within the two groups, i.e., the users who appear in both the friend list and the follower list, and display their ID numbers and screen names in a table.
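
The display and mutual-friends steps above can be sketched as a filter on user IDs. The two data frames below are invented stand-ins with `id` and `screenName` columns; with `twitteR` they could presumably be built by calling `twListToDF` on the results of the user object's `getFollowers()` and `getFriends()` methods, but that part is an assumption and is left out so the sketch runs without API access.

```r
# Invented stand-ins for the follower and friend tables.
followersDF <- data.frame(
  id = c("101", "102", "103", "104"),
  screenName = c("ann", "ben", "cat", "dan"),
  stringsAsFactors = FALSE
)
friendsDF <- data.frame(
  id = c("103", "104", "105"),
  screenName = c("cat", "dan", "eve"),
  stringsAsFactors = FALSE
)

# Show at most 20 followers and 20 friends (ID and screen name only).
print(head(followersDF[, c("id", "screenName")], 20), row.names = FALSE)
print(head(friendsDF[, c("id", "screenName")], 20), row.names = FALSE)

# Mutual friends: followers whose ID also appears in the friend list.
mutualDF <- followersDF[followersDF$id %in% friendsDF$id, c("id", "screenName")]
print(mutualDF, row.names = FALSE)
```

Matching on `id` rather than `screenName` is the safer choice, since a screen name can change while the numeric ID stays fixed.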

Problem 4 (Optional): Explore the data

Run some additional experiments with your data to gain familiarity with Twitter data and the Twitter API.
