Social Dominance and Politics
Welcome to the fourth Data Exploration Assignment. We read about Social Dominance Theory and its related psychological construct, Social Dominance Orientation (SDO). SDO is measured through a survey scale consisting of 16 items that we already explored in class. In this assignment, you will explore data on SDO and its relationship with other variables from a nationally representative survey fielded in 2018.
Note that the actionable part of each question is bolded.
Data Details:
• File Name: sdo_data.csv
• Source: These data are condensed and adapted from a survey by Data for Progress in 2018 (N = 3144).
The data are representative of 2018 voters.
Variable Name | Variable Description
sdo5 | Five-point social dominance orientation (SDO) scale: 1. Minimum SDO; ...; 5. Maximum SDO
female | Indicator for whether or not the respondent is female. Coded 1 if respondent is female, 0 otherwise.
birthyr | Respondent’s birth year
educ | Education: 1. Didn’t graduate HS; 2. HS graduate; 3. Some college; 4. 2-year college; 5. 4-year college; 6. Postgraduate degree
race | Race: 1. White; 2. Black or African-American; 3. Hispanic or Latino; 4. Asian or Asian-American; 5. Native American; 6. Mixed Race; 7. Other; 8. Middle Eastern
favor_trump | Favorability of Donald Trump: 1. very unfavorable; ...; 4. very favorable
favor_blm | Favorability of Black Lives Matter: 1. very unfavorable; ...; 4. very favorable
favor_metoo | Favorability of the Me Too movement: 1. very unfavorable; ...; 4. very favorable
american_customs | “The growing number of newcomers from other countries threatens traditional American customs and values”: 1. Strongly disagree; ...; 5. Strongly agree
race_ident | “How important is being [respondent’s race] to you?”: 1. Not at all important; ...; 4. Very important
pid3 | Three-category party identification: 1. Democrat; 2. Independent; 3. Republican
ideo5 | Five-category political ideology: 1. Very liberal; ...; 5. Very conservative
fear_of_demographic_change | Fear of demographic change in the US: 0. Least fearful; ...; 1. Most fearful
confederate_flag | Is the Confederate flag mostly a symbol of slavery and white supremacy or Southern heritage and culture? Coded either “slavery” or “heritage”
presvote16 | Vote choice in the 2016 presidential election
Question 1: REQUIRED
Before looking at data, the science of political psychology often involves building surveys. The teaching team builds the surveys you take using an online survey-building software called Qualtrics. This is often the same software that researchers use to build surveys and collect data that is eventually published in peer-reviewed journals. In this question, you’ll create your own brief survey.
THIS QUESTION IS REQUIRED FOR ALL STUDENTS. Go to harvard.qualtrics.com and log in using your HarvardKey. Click “Create new project”, then select “Survey”. You can name your survey whatever you like. Leave the other two drop-down options at their defaults and click “Create project”. Now you can input the SDO scale, which is given below. Make sure to include all 16 items in your survey. They are split into two sub-scales here, but they don’t need to be split that way in your survey. For each item, there should be seven response categories: Strongly favor, Somewhat favor, Slightly favor, Neutral, Slightly oppose, Somewhat oppose, Strongly oppose. Think about which format, available under “Question Type”, is best for these questions. How might the format of the questions affect the responses you get from the survey, or the experience respondents have while taking the survey? Also consider question ordering and how it may affect the responses. BE SURE TO UPLOAD A SCREENSHOT OF YOUR QUALTRICS SURVEY TO YOUR BLOG THIS WEEK.
Dominance Sub-Scale
1. Some groups of people must be kept in their place.
2. It’s probably a good thing that certain groups are at the top and other groups are at the bottom.
3. An ideal society requires some groups to be on top and others to be on the bottom.
4. Some groups of people are simply inferior to other groups.
5. Groups at the bottom are just as deserving as groups at the top.
6. No one group should dominate in society.
7. Groups at the bottom should not have to stay in their place.
8. Group dominance is a poor principle.
Anti-Egalitarianism Sub-Scale
1. We should not push for group equality.
2. We shouldn’t try to guarantee that every group has the same quality of life.
3. It is unjust to try to make groups equal.
4. Group equality should not be our primary goal.
5. We should work to give all groups an equal chance to succeed.
6. We should do what we can to equalize conditions for different groups.
7. No matter how much effort it takes, we ought to strive to ensure that all groups have the same chance in life.
8. Group equality should be our ideal.
ANSWER: I chose to divide the survey by the Anti-Egalitarianism and Dominance sub-scales, as they’re neatly organized, and I think asking the respondent questions with similar themes is less confusing and might produce more accurate answers. I used the multiple-choice question type with the seven listed options; again, I think a slider or scale would be more confusing, and respondents might be more tempted to be lazy and stick with the default. Finally, I reordered the questions so that they alternate between normal and reverse-coded items. Because back-to-back questions then contradict each other, a respondent has to read each item carefully to answer consistently.
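A minimal sketch of how responses to a 16-item scale with reverse-coded items could be scored in R. It assumes (this is not stated above) that items 5 to 8 of each sub-scale are the reverse-coded ones and that responses are stored as 1 = Strongly oppose through 7 = Strongly favor. The item names d1 to d8 and a1 to a8, the function score_sdo(), and the toy responses are all hypothetical.

```r
# Toy responses for two hypothetical respondents
# (assumed coding: 1 = Strongly oppose, ..., 7 = Strongly favor).
items <- c(paste0("d", 1:8), paste0("a", 1:8))
responses <- as.data.frame(matrix(
  c(rep(7, 4), rep(1, 4), rep(7, 4), rep(1, 4),   # respondent 1: highest possible SDO
    rep(2, 4), rep(6, 4), rep(2, 4), rep(6, 4)),  # respondent 2: fairly low SDO
  nrow = 2, byrow = TRUE, dimnames = list(NULL, items)
))

# Assumption: items 5-8 of each sub-scale are reverse-coded.
reverse_items <- c(paste0("d", 5:8), paste0("a", 5:8))

score_sdo <- function(df, reverse, scale_max = 7) {
  df[reverse] <- (scale_max + 1) - df[reverse]  # flip: 1 <-> 7, 2 <-> 6, ...
  rowMeans(df)                                  # average across all 16 items
}

sdo_score <- score_sdo(responses, reverse_items)
sdo_score
```

Flipping the reverse-coded items first means a higher average always indicates higher SDO, regardless of the order in which the items were shown.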
Question 2
Now let’s take a look at the data.
library(tidyverse)  # provides read_csv(), the dplyr verbs, and ggplot2
sdo <- read_csv("sdo_data.csv")
## Parsed with column specification:
## cols(
## sdo5 = col_double(),
## female = col_double(),
## age = col_double(),
## educ = col_double(),
## race = col_double(),
## favor_trump = col_double(),
## favor_blm = col_double(),
## favor_metoo = col_double(),
## american_customs = col_double(),
## race_ident = col_double(),
## pid3 = col_double(),
## ideo5 = col_double(),
## fear_of_demographic_change = col_double(),
## confederate_flag = col_character(),
## presvote16 = col_character()
## )
What is the distribution of social dominance orientation in the sample? Make a plot, and report the mean and standard deviation of SDO in the sample. Extend this problem by splitting the plot by party ID of the respondent. Comment on what you find.
sdo %>%
  ggplot() +
  geom_bar(aes(x = sdo5, y = ..prop..), fill = "purple") +
  labs(title = "SDO Distribution", y = "Proportion", x = "SDO") +
  theme(plot.title = element_text(face = "bold", hjust = 0.5, size = 16),
        plot.background = element_rect(fill = "white"),
        axis.title = element_text(face = "bold"))
[Figure: “SDO Distribution”, a bar chart with SDO (1 to 5) on the x-axis and Proportion on the y-axis.]
sdo %>%
  filter(!is.na(pid3)) %>%
  mutate(pid3 = recode(pid3,
                       "1" = "Democrat",
                       "3" = "Republican",
                       "2" = "Independent")) %>%
  ggplot() +
  geom_bar(aes(x = sdo5, y = ..prop..), fill = "purple") +
  facet_wrap(~pid3) +
  labs(title = "SDO Distribution by Party ID", y = "Proportion", x = "SDO") +
  theme(plot.title = element_text(face = "bold", hjust = 0.5, size = 16),
        plot.background = element_rect(fill = "white"),
        axis.title = element_text(face = "bold"),
        strip.background = element_rect(color = "NA", fill = "NA"),
        strip.text.x = element_text(size = 12, color = "black"))
[Figure: “SDO Distribution by Party ID”, the same bar chart split into one panel per party identification.]
mean <- mean(sdo$sdo5)
sd <- sd(sdo$sdo5)
ANSWER: The overall distribution of SDO is heavily right-skewed: most respondents sit at the low end of the scale, though more respondents seem to have an SDO of 3.00 than 2.00, and few have an SDO greater than 3.00. The mean SDO is about 2.05, and the standard deviation is about 0.92. Splitting by party ID, it seems that Republicans make up the bulk of those with an SDO of 2.00 or greater. Democrats’ SDO is heavily right-skewed (almost a majority have an SDO of 1.00), Independents are somewhat right-skewed, and Republicans’ SDO appears to be roughly normally distributed around just under 3.00.
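One way to check the direction of skew numerically, rather than by eye, is to compute the sample skewness alongside the mean and standard deviation. The sketch below uses hypothetical counts on a 1 to 5 scale, not the survey data, and the helper skewness() is defined here rather than taken from a package. A positive value means the long tail is on the right, which is what you get when scores pile up at the low end.

```r
# Sample skewness: positive = long right tail, negative = long left tail.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Hypothetical 1-5 scores concentrated at the low end (NOT the survey data).
toy_sdo <- rep(1:5, times = c(40, 25, 25, 7, 3))

toy_mean <- mean(toy_sdo)
toy_sd   <- sd(toy_sdo)
toy_skew <- skewness(toy_sdo)

c(mean = toy_mean, sd = toy_sd, skewness = toy_skew)
```

With the real data loaded, the same check would be skewness(sdo$sdo5).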
Question 3
In the reading for this week, we saw that gender is central to social dominance theory, which predicts that men tend to have higher SDO than women do. Is this true in this sample as well? Report the average SDO for men and women. Comment on what you find. Extend this question by reporting the difference in means along with the p-value. Is the difference significant at a .05 significance level?
sdo %>%
  filter(!is.na(female)) %>%
  mutate(female = recode(female,
                         "0" = "Men",
                         "1" = "Women")) %>%
  ggplot() +
  geom_bar(aes(x = sdo5, y = ..prop..), fill = "purple") +
  facet_wrap(~female) +
  labs(title = "SDO Distribution by Sex", y = "Proportion", x = "SDO") +
  theme(plot.title = element_text(face = "bold", hjust = 0.5, size = 16),
        plot.background = element_rect(fill = "white"),
        axis.title = element_text(face = "bold"),
        strip.background = element_rect(color = "NA", fill = "NA"),
        strip.text.x = element_text(size = 12, color = "black"))
[Figure: “SDO Distribution by Sex”, the same bar chart split into two panels: Men and Women.]
men_mean <- mean((sdo %>% filter(female == 0))$sdo5)
women_mean <- mean((sdo %>% filter(female == 1))$sdo5)
t.test(sdo5 ~ female, data = sdo)
##
## Welch Two Sample t-test
##
## data: sdo5 by female
## t = 7.6147, df = 2896.4, p-value = 3.554e-14
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.1867603 0.3162976
## sample estimates:
## mean in group 0 mean in group 1
## 2.189685 1.938156
p_val <- format(t.test(sdo5 ~ female, data = sdo)$p.value, scientific = FALSE)
ANSWER: The mean SDO for men is 2.1896853, and the mean SDO for women is 1.9381564, a difference of about 0.25. A two-sample t-test shows that the difference in means is statistically significant at the .05 level: the p-value is extremely low (about 3.55e-14, far below 0.05), and the 95 percent confidence interval (0.19 to 0.32) does not include zero.
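To make the output of t.test() less of a black box, the sketch below computes the difference in means, the Welch standard error, the degrees of freedom, and the p-value by hand, then compares them with the built-in result. It runs on simulated scores, not the survey data; the group means and sample sizes are arbitrary assumptions chosen for illustration.

```r
set.seed(1)
# Simulated scores (NOT the survey data): group 0 has a slightly higher mean.
toy <- data.frame(
  female = rep(c(0, 1), each = 200),
  sdo5   = c(rnorm(200, mean = 2.2, sd = 0.9), rnorm(200, mean = 1.9, sd = 0.9))
)

x <- toy$sdo5[toy$female == 0]
y <- toy$sdo5[toy$female == 1]

# Difference in means and its Welch standard error
diff_means <- mean(x) - mean(y)
vx <- var(x) / length(x)
vy <- var(y) / length(y)
t_manual <- diff_means / sqrt(vx + vy)

# Welch-Satterthwaite degrees of freedom and two-sided p-value
df_manual <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
p_manual  <- 2 * pt(-abs(t_manual), df = df_manual)

# Built-in Welch test for comparison
fit <- t.test(sdo5 ~ female, data = toy)
c(manual = t_manual, builtin = unname(fit$statistic))
```

The non-integer degrees of freedom in the output above (2896.4) come from this Welch-Satterthwaite approximation, which does not assume equal variances in the two groups.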
Question 4
What is the correlation between sdo5 and the favor_trump variable? Is the correlation statistically different from zero? You can use cor.test() for this question. Interpret what you find. If you want, extend this question by creating a scatterplot with the line of best fit to visualize the relationship. You can use geom_point() in the ggplot architecture for this.
cor.test(sdo$sdo5, sdo$favor_trump)
##
## Pearson's product-moment correlation
##
## data: sdo$sdo5 and sdo$favor_trump
## t = 41.246, df = 3111, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.5713838 0.6168261
## sample estimates:
##       cor
## 0.5945795
sdo %>%
  filter(!is.na(favor_trump)) %>%
  ggplot(aes(x = favor_trump, y = sdo5)) +
  geom_jitter(height = 0.05, width = 0.05) +
  geom_smooth(method = "lm") +
  labs(title = "Relationship Between Favoring Trump and SDO",
       x = "Favor Trump Scale", y = "SDO") +
  theme(plot.title = element_text(face = "bold", hjust = 0.5, size = 16),
        plot.background = element_rect(fill = "white"),
        axis.title = element_text(face = "bold"))
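[Figure: “Relationship Between Favoring Trump and SDO”, a jittered scatterplot with the Favor Trump Scale on the x-axis, SDO on the y-axis, and a fitted regression line.]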
ANSWER: cor.test() shows a fairly strong positive correlation between SDO and favoring Trump, about 0.59, and it is statistically different from zero (p < 2.2e-16; 95 percent confidence interval 0.57 to 0.62). The scatterplot I made, which slightly “jitters” the points since both scales are discrete, reflects this relationship; you can see that for higher levels on the “favor Trump” scale, the average SDO is higher. However, this is an imperfect relationship, as there are still many people who favor Trump but have lower SDO scores. The best takeaway is that you can be more confident that someone with a high SDO score favors Trump than that someone with a low SDO score doesn’t favor Trump.
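As a quick check on how cor.test() arrives at its numbers, the sketch below starts from the reported correlation and degrees of freedom and recomputes the t statistic and the 95 percent confidence interval. The confidence interval uses the Fisher z-transformation, which is the method documented for Pearson correlations in cor.test(). The variable names are only for illustration.

```r
# Values copied from the cor.test() output above.
r  <- 0.5945795
df <- 3111        # n - 2
n  <- df + 2      # number of complete pairs

# t statistic for H0: correlation = 0
t_stat <- r * sqrt(df / (1 - r^2))

# 95% confidence interval via the Fisher z-transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(t_stat, 3)
round(ci, 4)
```

Both results should come out very close to the reported t = 41.246 and the interval 0.5714 to 0.6168. With more than 3,000 observations, even a much weaker correlation would be statistically different from zero, so the size of r matters more here than the p-value.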
Question 5
Correlation matrices, like the one below, are useful for visualizing the pairwise relationships between several variables. They allow you to see the correlation coefficients of many relationships at once. Plot a correlation matrix of the correlation between SDO and some of the variables you think might be related to SDO and to each other. Choose at least 3 variables in addition to SDO. Before you make your plot, briefly discuss why you think the variables might all be related. The package ggcorrplot may be useful here. Discuss what you see in your plot.
library(ggcorrplot)

# Keep complete cases on the favorability items, then correlate the selected columns
corr.mat <- sdo %>%
  filter(!is.na(favor_blm)) %>%
  filter(!is.na(favor_metoo)) %>%
  filter(!is.na(favor_trump)) %>%
  select(sdo5, race, educ, favor_blm, favor_metoo, favor_trump) %>%
  cor()

ggcorrplot(corr.mat, type = "lower", lab = TRUE)
ANSWER: The results are largely predictable: favorability toward BLM and MeToo are highly positively correlated with each other, and both are highly negatively correlated with favoring Trump. The other variables display weak correlations, possibly because of how they are coded. For instance, race is stored as a number even though it is categorical, so its correlation coefficients are not meaningful. Getting a better measure of association would require recoding it, for example as a set of indicator variables.
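A minimal sketch of the recoding idea, on simulated data rather than the survey: the unordered race codes are replaced with a 0/1 indicator before computing correlations, and missing values are handled pair by pair inside cor(). The data frame toy and its values are hypothetical; only the column names mirror the assignment.

```r
set.seed(2)
# Simulated stand-in for the survey (NOT the real data).
toy <- data.frame(
  sdo5        = sample(1:5, 300, replace = TRUE),
  race        = sample(1:8, 300, replace = TRUE),
  favor_trump = sample(c(1:4, NA), 300, replace = TRUE)
)

# Replace the unordered race codes with a 0/1 indicator before correlating.
toy$white <- as.numeric(toy$race == 1)

# use = "pairwise.complete.obs" drops missing values pair by pair,
# instead of filtering out every row that has any NA up front.
corr_toy <- cor(toy[, c("sdo5", "white", "favor_trump")],
                use = "pairwise.complete.obs")
round(corr_toy, 2)
```

Because the toy columns are drawn independently, the off-diagonal correlations here are near zero; the point is the recoding and the missing-data handling, not the values.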
Question 6: Data Science Question
In this next question, we will use regression to model vote choice as a function of SDO and other variables of interest. This will help us get a fuller picture of the impact of social dominance orientation on political attitudes. We will fit the following model:
rep_vote = β₀ + β₁·sdo5 + β₂·female + β₃·white + β₄·educ + β₅·age + β₆·pid3 + β₇·ideo5 + ε
You’ll notice that the variable white doesn’t exist in our data set. When doing regression analysis, researchers often code race as a binary - for example, 1 for white and 0 for all non-white. This is done largely to make the regression results easier to interpret. Without turning race into a binary variable, the regression model would instead have several binary variables corresponding to each racial category (e.g. 1 for Black, 0 otherwise; 1 for Hispanic, 0 otherwise, etc.) which can quickly become unwieldy. Try it both ways if you are interested in seeing the difference (though you’ll need to turn the race variable to a factor).
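To see the difference between the two codings concretely, the sketch below builds both from a tiny hypothetical race column. It uses model.matrix(), the base R function that lm() relies on to expand factors into indicator columns. The six toy respondents are made up for illustration.

```r
# Hypothetical race codes for six respondents (1 = White, 2 = Black, 3 = Hispanic).
toy <- data.frame(race = c(1, 2, 3, 1, 2, 3))

# Binary coding: a single column, 1 for white and 0 for everyone else.
toy$white <- ifelse(toy$race == 1, 1, 0)

# Factor coding: R expands the factor into one 0/1 column per non-reference
# category (here race = 1 is the reference level and gets no column).
X_factor <- model.matrix(~ factor(race), data = toy)
colnames(X_factor)
```

With all eight race categories in the real data, the factor coding would add seven indicator columns to the model, which is the unwieldiness the paragraph above describes.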
We also need to adjust the vote choice variable. Currently, presvote16 codes vote choice for any party in the 2016 election (Dem, Rep, Green, Libertarian) as well as votes for others. This, too, would become unwieldy in a regression. To simplify, we will turn this into an indicator variable for whether or not the respondent voted for the Republican (Donald J. Trump), called rep_vote. To be clear, rep_vote should be 1 if the respondent voted for Trump, 0 if they voted for someone else, and NA if they did not vote.
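A minimal sketch of the three-way coding described above (1, 0, or NA). The labels in presvote16 below are hypothetical, since the actual strings used in the file are not listed here; check them with unique(sdo$presvote16) before adapting this.

```r
# Hypothetical presvote16 labels; the real file may use different strings.
presvote16 <- c("Trump", "Clinton", "Johnson", "Did not vote", NA)

# 1 = voted for Trump, 0 = voted for someone else, NA = did not vote or missing.
non_voter <- is.na(presvote16) | presvote16 == "Did not vote"
rep_vote  <- ifelse(non_voter, NA, as.numeric(presvote16 == "Trump"))
rep_vote
```

If non-voters are already stored as NA in the data, a plain ifelse(presvote16 == "Trump", 1, 0) gives the same result, because comparisons with NA return NA.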
First, create the white variable from the race variable, as well as rep_vote from presvote16. Then, fit the linear model described above. Comment on what you see. Is this in line with what we would expect based on social dominance theory? Interpret your results and comment on what you find, especially as it relates to social dominance theory. Note that you can explore other model specifications in the next question.
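library(stargazer)

# Create the two indicator variables described above
new_sdo <- sdo %>%
  mutate(white = ifelse(race == 1, 1, 0)) %>%
  mutate(rep_vote = ifelse(presvote16 == "Trump", 1, 0))

# Fit the linear model and print a text regression table
mod <- lm(rep_vote ~ sdo5 + female + white + educ + age + pid3 + ideo5, data = new_sdo)
stargazer(mod, type = "text")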
##
## ===============================================
##                         Dependent variable:
##                     ---------------------------
##                              rep_vote
## -----------------------------------------------
## sdo5                         0.094***
##                               (0.007)
##
## female                        -0.004
##                               (0.012)
##
## white                        0.044***
##                               (0.015)
##
## educ                         -0.011***
##                               (0.004)
##
## age                          0.002***
##                              (0.0004)
##
## pid3                         0.248***
##                               (0.010)
##
## ideo5                        0.109***
##                               (0.007)
##
## Constant                     -0.689***
##                               (0.034)
##
## -----------------------------------------------
## Observations                   2,884
## R2                             0.639
## Adjusted R2                    0.638
## Residual Std. Error      0.297 (df = 2876)
## F Statistic         728.435*** (df = 7; 2876)
## ===============================================
## Note:               *p<0.1; **p<0.05; ***p<0.01
ANSWER: The results are in line with what I’d expect. Unsurprisingly, pid3 and ideo5 have the largest coefficients (0.248 and 0.109), but sdo5 also has a positive and statistically significant coefficient (0.094, p < 0.01): holding the other variables constant, each one-point increase in SDO is associated with a 9.4 percentage-point higher probability of having voted for Trump. A higher Social Dominance Orientation would make one favor a stronger, more conservative, and racially charged leader like Trump, so this relationship makes sense.
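Because rep_vote is a 0/1 outcome, this is a linear probability model, and its fitted values can be read as predicted probabilities. The sketch below plugs the reported coefficients into the model equation for two hypothetical respondents. It assumes age is measured in years, which the output does not confirm; predict_lpm() is an illustrative helper, not part of any package.

```r
# Coefficients copied from the regression table above.
b <- c(intercept = -0.689, sdo5 = 0.094, female = -0.004, white = 0.044,
       educ = -0.011, age = 0.002, pid3 = 0.248, ideo5 = 0.109)

# Predicted value of rep_vote for one respondent profile.
predict_lpm <- function(sdo5, female, white, educ, age, pid3, ideo5) {
  x <- c(1, sdo5, female, white, educ, age, pid3, ideo5)
  sum(b * x)
}

# Two hypothetical respondents (assumes `age` is measured in years).
low  <- predict_lpm(sdo5 = 1, female = 1, white = 0, educ = 6, age = 20, pid3 = 1, ideo5 = 1)
high <- predict_lpm(sdo5 = 5, female = 0, white = 1, educ = 1, age = 70, pid3 = 3, ideo5 = 5)
c(low = low, high = high)

# Change in predicted probability from minimum to maximum SDO, all else fixed.
sdo_effect <- unname(b["sdo5"]) * (5 - 1)
sdo_effect
```

The two profiles produce predictions below 0 and above 1, which is a known limitation of linear probability models. A logistic regression fitted with glm(..., family = binomial) is a common alternative that keeps predictions between 0 and 1.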
Question 7
Lastly, just explore the data! This question is open-ended, but make sure you have a theoretical expectation in mind for any relationships between variables you want to explore, and include them in your answer.