Variable Names in order from left to right:
Column  Name       Description
A       Sex        What is your biological sex? (Male or Female)
B       Seat       Where is your seat located in the classroom? (Front, Middle, Back)
C       libarts    Are you a liberal arts major or not?
D       TV         On average, how many hours of TV do you typically watch in a WEEK?
E       computer   On average, how many hours do you spend on the computer in a WEEK?
F       Sleep      On average, how many hours of sleep do you get at night?
G       alcohol    How many alcoholic beverages do you consume in a week?
H       Height     What is your height, in inches?
I       momheight  What is your mom’s height, in inches?
J       dadheight  What is your dad’s height, in inches?
K       exercise   On average, how many hours of exercise do you typically get in a week?
L       GPA        What is your current GPA?
Univariate Quantitative Analysis:
1. Provide a table of descriptive statistics for the quantitative variables (including all measurements of central tendency and dispersion).
2. Create a histogram and boxplot for 3 quantitative variables of your choice.
3. For each of the 3 quantitative variables, explain whether the mean or the median best represents central tendency and whether the standard deviation or the IQR best represents dispersion. Base this decision on the histogram and boxplot. Identify the appropriate measures and state their values from the descriptive statistics table in your write-up.
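For readers who want to cross-check the spreadsheet output for items 1 and 3, the following is a minimal sketch in Python using only the standard library. The function name `describe` and the `tv_hours` values are hypothetical illustrations, not part of the class data set, and the "inclusive" quartile method is an assumption about which quartile definition is wanted.

```python
import statistics

def describe(values):
    """Return common measures of central tendency and dispersion."""
    # Quartiles by the "inclusive" method (assumed choice of definition).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }

# Hypothetical weekly TV hours with one extreme value (30).
tv_hours = [2, 4, 4, 5, 6, 7, 8, 10, 30]
summary = describe(tv_hours)

for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

In this made-up example the single large value pulls the mean above the median, which is the kind of skew that would favor reporting the median and IQR rather than the mean and standard deviation.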
Univariate Categorical Analysis:
4. Create univariate frequency tables (frequency and percent) for the 3 categorical variables.
5. Create a bar chart and pie chart for the 3 categorical variables.
6. For each variable, comment on the frequency distribution, including the mode.
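The frequency and percent table in item 4, and the mode in item 6, can be sketched as follows. This is an illustration only: `frequency_table` is a hypothetical helper and the `seats` responses are invented.

```python
from collections import Counter

def frequency_table(values):
    """Return (category, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]

# Hypothetical responses to the "Seat" question.
seats = ["Front", "Middle", "Middle", "Back",
         "Middle", "Front", "Back", "Middle"]

table = frequency_table(seats)
mode_category = table[0][0]  # the first row holds the most frequent category

for category, count, percent in table:
    print(f"{category:<8}{count:>4}{percent:>8.1f}%")
```

For a categorical variable the mode is simply the category with the largest count, so it can be read from the first row of the sorted table.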
Bivariate Analysis 2 Categorical:
7. Using the Pivot Table tool, provide a contingency table for two categorical variables whose relationship you are interested in exploring. Determine which variable you think may ‘explain’ the response of the other variable. The variable that explains should be on the rows and the variable that responds should be on the columns.
8. Report on the frequency and percent distribution of the two variables. Comment on anything you find particularly interesting about the relationship.
9. Create a grouped bar chart, a stacked bar chart, and a 100% stacked bar chart. The variable that ‘explains’ should be along the x-axis and the variable that ‘responds’ should be represented by the colors of the bars in the legend.
10. Describe the differences between these graphics and determine which graph(s) best visualize the relationship between the two variables.
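As a rough cross-check on the Pivot Table output, a contingency table with the explanatory variable on the rows can be sketched like this. The helper `contingency_table` and the (Sex, Seat) pairs are hypothetical. Using row percentages is an assumption about which percent distribution is of interest.

```python
from collections import Counter

def contingency_table(pairs):
    """Count (explanatory, response) pairs and compute row percentages."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    row_pct = {}
    for r in rows:
        row_total = sum(table[r].values())
        row_pct[r] = {c: 100 * table[r][c] / row_total for c in cols}
    return table, row_pct

# Hypothetical (Sex, Seat) responses: Sex explains, Seat responds.
pairs = [
    ("Female", "Front"), ("Female", "Front"),
    ("Female", "Middle"), ("Female", "Back"),
    ("Male", "Front"), ("Male", "Middle"),
    ("Male", "Back"), ("Male", "Back"),
]

counts, row_pct = contingency_table(pairs)
for sex in counts:
    print(sex, counts[sex], row_pct[sex])
```

Row percentages sum to 100 within each explanatory category, which is the same quantity a 100% stacked bar chart displays.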
Bivariate Analysis 1 Cat 1 Quant:
11. Perform a stratified analysis of the means of the quantitative variables “TV”, “computer”, “Sleep”, “alcohol”, “exercise”, and “GPA” by the “Seat” variable. Compare and contrast the means of the quantitative variables across the seat categories.
12. Pick one of the quantitative variables that you feel has an interesting relationship with your categorical variable. Explore the relationship further by creating side-by-side boxplots of the quantitative variable stratified by the categorical variable you chose.
13. Report on the distributions of the quantitative variable across the different categories. Compare things like the median, IQR, range, quartiles, and outliers. Comment on anything you find particularly interesting about the relationship.
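The stratified means in item 11 amount to grouping the rows by “Seat” and averaging within each group. A minimal sketch, assuming each observation is a dictionary keyed by the variable names above; `mean_by_group` is a hypothetical helper and the records are invented.

```python
from collections import defaultdict
from statistics import mean

def mean_by_group(records, group_key, value_key):
    """Return the mean of value_key within each level of group_key."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec[value_key])
    return {level: mean(vals) for level, vals in groups.items()}

# Hypothetical observations, not taken from the class data set.
records = [
    {"Seat": "Front", "Sleep": 7},
    {"Seat": "Front", "Sleep": 8},
    {"Seat": "Middle", "Sleep": 6},
    {"Seat": "Middle", "Sleep": 7},
    {"Seat": "Back", "Sleep": 6},
    {"Seat": "Back", "Sleep": 6},
]

sleep_by_seat = mean_by_group(records, "Seat", "Sleep")
print(sleep_by_seat)
```

Calling the same function with `"TV"`, `"GPA"`, and so on as `value_key` would give the remaining columns of the stratified table.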
Bivariate Analysis 2 Quant:
14. Create a scatterplot of two quantitative variables whose relationship you are interested in exploring. Determine which variable you think may ‘explain’ the response of the other variable. The variable that explains should be on the x-axis and the variable that responds should be on the y-axis. (For example, the number of square feet in a house explains the price of the house. Square feet is the explanatory variable and price is the response variable.)
15. Describe the relationship you observe between these two variables, based on shape, strength, and direction.
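The assignment asks for a visual description, but a correlation coefficient can help back up statements about strength and direction when the shape looks roughly linear. The sketch below computes Pearson's r by hand. The function name `pearson_r` and the height values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical heights in inches: mom's height explains, student's responds.
momheight = [60, 62, 64, 66, 68]
height = [62, 63, 66, 67, 70]

r = pearson_r(momheight, height)
print(f"r = {r:.3f}")
```

The sign of r gives the direction and its magnitude (between 0 and 1) gives the strength of the linear relationship. It says nothing about curved patterns, so the scatterplot should still be inspected for shape.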
Confidence Interval:
16. Construct and report 90%, 95% and 99% confidence intervals for a quantitative variable of your choice.
17. What do you notice as the confidence level increases? Interpret the 95% confidence interval in context.
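To see why the interval changes with the confidence level, here is a minimal sketch of a confidence interval for a mean. It assumes a normal (z) critical value, which is a large-sample approximation. With a small sample, a t critical value would normally be used instead, and the course may expect that. The function `z_interval` and the `sleep_hours` values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_interval(values, level):
    """Approximate confidence interval for the mean using a z critical value."""
    n = len(values)
    center = mean(values)
    # Two-sided critical value, e.g. about 1.96 for level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * stdev(values) / sqrt(n)
    return center - margin, center + margin

# Hypothetical nightly hours of sleep.
sleep_hours = [6, 7, 7, 8, 6, 5, 7, 8, 7, 9]

intervals = {level: z_interval(sleep_hours, level) for level in (0.90, 0.95, 0.99)}
for level, (low, high) in intervals.items():
    print(f"{level:.0%}: ({low:.2f}, {high:.2f})  width = {high - low:.2f}")
```

The sample mean and standard error are the same in all three intervals. Only the critical value grows with the confidence level, so the interval widens.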
Variable Creation:
18. Create a new categorical variable from one of the original quantitative variables.
19. Explain how and why you chose the cut points for your categories.
20. Perform a stratified analysis of an original categorical variable of your choice by your newly created categorical variable. Report on the frequency and percent distributions across the categories of the two variables. Comment on anything you find particularly interesting about the relationship.
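Creating a categorical variable from a quantitative one comes down to choosing cut points and mapping each value to a label. The sketch below is one way to do that. The cut points (6 and 8 hours), the labels, and the `sleep_hours` values are all hypothetical and would need to be justified for the real data.

```python
from collections import Counter

def sleep_category(hours):
    """Map hours of sleep to a label using hypothetical cut points."""
    if hours < 6:
        return "Low"
    if hours < 8:
        return "Moderate"
    return "High"

# Hypothetical nightly hours of sleep.
sleep_hours = [5, 6.5, 7, 8, 9, 5.5, 7.5, 8]

categories = [sleep_category(h) for h in sleep_hours]
category_counts = Counter(categories)
print(category_counts)
```

Each boundary value belongs to exactly one category (6 is "Moderate", 8 is "High"), which avoids double counting or dropping observations that fall on a cut point.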
Sampling:
21. Generate a random sample of 30 observations called “ucdavis_sample1”.
22. Export your new sample to a CSV and turn in your sample with the project deliverables. Use the naming convention lastname_firstname_gss_sample1 (carder_nicole_ucdavis_sample1.csv)