• Analyse the activity, language use and social interactions of an on-line community using metadata and linguistic summary from a real on-line forum and submit a report of your findings.
Suggested Length • 6 – 8 A4 pages (for your report) + extra pages as appendix (for your code)
• Font size 11 or 12pt, single spacing
Submission • PDF file only. Naming convention: FirstnameSecondnameID.pdf
• Via Moodle Assignment Submission.
Late Penalties • 10% (3 marks) deduction per calendar day for up to one week.
Instructions
Submit the results of your analysis: answer the research questions and report anything else of relevance that you discover. If you choose to analyse only a subset of your data, explain why.
There are two options for compiling your report:
(1) Submit a single PDF with the R code pasted in as machine-readable text in an appendix, or
(2) Write an R Markdown document that interleaves the R code with the discussion text. Render it as an HTML file, print it to PDF, and submit the PDF.
Submit your report as a single PDF with the file name FirstnameSecondnameID.pdf on Moodle.
Software
You are expected to use R for your data analysis, graphics and tables. You are free to use any R packages you need, but please document them in your report and include them in your R code.
Questions
Activity, language use and social interactions in an on-line community. Analyse the metadata and linguistic summary from a real on-line forum and submit a report of your findings. Do the following:
(a) Analyse activity and language on the forum over time:
(b) Analyse the language used by threads:
We can think of threads as groups of participants posting on the same topic.
(c) Analyse social networks online:
(d) Overall considerations:
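As an illustration of treating threads as groups, the sketch below averages two language measures per thread and counts the posts behind each average. It is only a starting point under stated assumptions: the data frame `posts` and its values are made up, and only the column names (ThreadID, AuthorID, posemo, negemo) come from the data description.

```r
# Made-up posts; only the column names come from the data description.
posts <- data.frame(
  ThreadID = c(1, 1, 1, 2, 2, 3),
  AuthorID = c(10, 11, 10, 12, 13, 10),
  posemo   = c(2.0, 4.0, 6.0, 1.0, 3.0, 5.0),
  negemo   = c(1.0, 1.0, 4.0, 0.0, 2.0, 3.0)
)

# Mean of each language measure within each thread.
thread_means <- aggregate(cbind(posemo, negemo) ~ ThreadID,
                          data = posts, FUN = mean)

# Number of posts per thread, to judge how much weight each mean deserves.
thread_sizes <- as.data.frame(table(ThreadID = posts$ThreadID),
                              responseName = "n_posts")

thread_means
thread_sizes
```

A thread with very few posts gives an unstable mean, so it may be reasonable to restrict a comparison to threads above some minimum size and to say so in the report.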
Data
The data is contained in the file webforum.csv and consists of the metadata and linguistic analysis of posts over the years 2002 to 2011. You will each work with 20,000 posts, randomly selected from the original file. The linguistic analysis was conducted using Linguistic Inquiry and Word Count (LIWC), which assesses the prevalence of certain thoughts, feelings and motivations by calculating the proportion of key words used in communication. See http://liwc.wpengine.com/ for more information, including the language manual: http://liwc.wpengine.com/wp-content/uploads/2015/11/LIWC2015_LanguageManual.pdf
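To make "the proportion of key words" concrete, here is a toy sketch of the idea. The word list `toy_posemo` and the function `category_percent` are hypothetical and invented for this example; they are not the LIWC dictionary or the LIWC software, whose categories are far larger and handle word stems.

```r
# Toy illustration only: this word list is made up and is NOT the LIWC dictionary.
toy_posemo <- c("good", "great", "happy", "love")

category_percent <- function(text, dictionary) {
  # Lower-case the text and split on anything that is not a letter or apostrophe.
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words <- words[nzchar(words)]          # drop empty tokens
  if (length(words) == 0) return(0)      # avoid dividing by zero
  # Percentage of all words that belong to the category.
  100 * sum(words %in% dictionary) / length(words)
}

post <- "I love this forum, the people are great and the advice is good"
category_percent(post, toy_posemo)       # 3 matching words out of 13
```

Because such scores are proportions of the words in a post, very short posts (low WC) can produce extreme values, which is worth keeping in mind when comparing posts.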
Create your individual data as follows:
rm(list = ls())
set.seed(XXXXXXXX) # XXXXXXXX = your student ID
webforum <- read.csv("webforum.csv")
webforum <- webforum[sample(nrow(webforum), 20000), ] # 20000 rows
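The reason for seeding with your student ID is that the same seed always selects the same rows, so your sample can be reproduced by the marker. The sketch below shows this on a small made-up data frame, since webforum.csv is not assumed to be available here; the seed 12345678 and the column PostID are placeholders, not part of the assignment data.

```r
# Stand-in for the real file: a made-up data frame with 100 rows.
toy <- data.frame(PostID = 1:100, WC = rep(c(10, 50), 50))

set.seed(12345678)                       # placeholder, not a real student ID
sample_a <- toy[sample(nrow(toy), 20), ]

set.seed(12345678)                       # same seed again
sample_b <- toy[sample(nrow(toy), 20), ]

identical(sample_a, sample_b)            # same seed gives the same rows
nrow(sample_a)                           # 20 rows were drawn
anyDuplicated(sample_a$PostID) == 0      # sample() draws without replacement
```

Note that set.seed() must be called immediately before sample(); any other random call in between would change which rows are drawn.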
Data fields given (see the language manual for more detail and examples):
Column        Brief Descriptor
ThreadID      Unique ID for each thread
AuthorID      Unique ID for each author
Time          Time
WC            Word count of the text of the post
Analytic      Summary: Analytical thinking
Clout         Summary: Power, force, impact
Authentic     Summary: Authentic tone of voice
Tone          Summary: Emotional tone
ppron         “I, we, you” words
i             “I, me, mine” words
we            “We, us, our” words
you           “You” words
they          “They” words
posemo        Expressing positive emotions
negemo        Expressing negative emotions
anx           Indicating anxiety
anger         Indicating anger
sad           Indicating sadness
focuspast     Expressing a focus on the past
focuspresent  Expressing a focus on the present
focusfuture   Expressing a focus on the future
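Because every post carries both a ThreadID and an AuthorID, these two fields are enough to build a simple social network in which two authors are linked if they posted in the same thread. The sketch below is one possible construction in base R, not a prescribed method; the data frame `posts` is made up, and linking authors by shared threads is an assumption about what counts as an interaction.

```r
# Made-up posts; only the column names come from the data description.
posts <- data.frame(
  ThreadID = c(1, 1, 1, 2, 2, 3),
  AuthorID = c(10, 11, 10, 11, 12, 13)
)

# Thread-by-author counts, then an incidence matrix:
# 1 if the author posted in the thread at least once, 0 otherwise.
counts    <- unclass(table(posts$ThreadID, posts$AuthorID))
incidence <- (counts > 0) * 1

# Author-by-author matrix: entry [a, b] is the number of threads a and b share.
shared <- crossprod(incidence)
diag(shared) <- 0                        # ignore links from an author to itself

# Degree: how many distinct authors each author shares a thread with.
degree <- rowSums(shared > 0)
degree
```

The `shared` matrix can be treated as a weighted adjacency matrix and passed to a network package of your choice for plotting and centrality measures.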