MIE1624 Assignment 1: EDA and Hypothesis Testing

Questions:

The objective of this assignment is to explore the survey data to understand (1) the nature of women’s representation in Data Science and Machine Learning and (2) the effects of education on income level. The following tasks should be completed:

1.       Perform exploratory data analysis on the survey dataset and summarize its main characteristics. Present 3 graphical figures that represent different trends in the data. For your exploratory data analysis, you can consider Country, Age, Education, Professional Experience, and Salary.

2.       Estimate the difference between the average salary (Q24) of men vs. women (Q2).

a.       Compute and report descriptive statistics for each group (remove missing data, if necessary).

b.       If suitable, perform a two-sample t-test with a 0.05 threshold. Explain your rationale.

c.       Bootstrap your data to compare the mean salary (Q24) of the two groups. Note that the number of instances you sample from each group should be proportional to its size. Use 1000 replications. Plot the two bootstrapped distributions (for men and women) and the distribution of the difference in means.

d.       If suitable, perform a two-sample t-test with a 0.05 threshold on the bootstrapped data. Explain your rationale.

e.       Comment on your findings.

3.       Select “highest level of formal education” (Q4) from the dataset and repeat steps a to e, this time using analysis of variance (ANOVA) instead of a t-test for hypothesis testing to compare the mean salary of three groups.
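
The statistical steps in the tasks above (bootstrapping group means, a two-sample comparison, and a one-way ANOVA) can be sketched roughly as follows. This is only an illustrative sketch using the Python standard library, not a reference solution: the salary values are synthetic placeholders rather than survey data, the helper names `bootstrap_means`, `welch_t`, and `anova_f` are hypothetical, and the choice of Welch's unequal-variance t statistic is an assumption that the brief does not prescribe. The helpers return test statistics and degrees of freedom only; p-values would normally come from a statistics library, for example `scipy.stats.ttest_ind` and `scipy.stats.f_oneway`.

```python
import random
import statistics


def bootstrap_means(values, n_reps=1000, rng=None):
    """Return n_reps means, each from a resample (with replacement) of len(values)."""
    rng = rng or random.Random(0)
    n = len(values)  # resample size equals the group's own size
    return [statistics.fmean(rng.choices(values, k=n)) for _ in range(n_reps)]


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom (unequal variances)."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def anova_f(groups):
    """One-way ANOVA F statistic with (between, within) degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand = statistics.fmean(all_values)
    means = [statistics.fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_b, df_w = len(groups) - 1, len(all_values) - len(groups)
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w


# Synthetic placeholder salaries (NOT survey data); group sizes are arbitrary.
rng = random.Random(42)
men = [rng.gauss(55000, 20000) for _ in range(400)]
women = [rng.gauss(50000, 20000) for _ in range(100)]

# Task 2c: 1000 bootstrap replications per group, then the difference in means.
boot_men = bootstrap_means(men, rng=rng)
boot_women = bootstrap_means(women, rng=rng)
boot_diff = [m - w for m, w in zip(boot_men, boot_women)]

# Percentile interval for the difference (2.5th and 97.5th percentiles).
cuts = statistics.quantiles(boot_diff, n=40)
ci_low, ci_high = cuts[0], cuts[-1]

# Task 2b: two-sample comparison on the raw groups.
t_stat, t_df = welch_t(men, women)

# Task 3: three hypothetical education groups compared with one-way ANOVA.
edu_groups = [
    [rng.gauss(45000, 18000) for _ in range(150)],
    [rng.gauss(52000, 18000) for _ in range(200)],
    [rng.gauss(60000, 18000) for _ in range(80)],
]
f_stat, df_between, df_within = anova_f(edu_groups)

print(f"Welch t = {t_stat:.2f} (df ~ {t_df:.1f})")
print(f"95% bootstrap interval for mean difference: [{ci_low:.0f}, {ci_high:.0f}]")
print(f"ANOVA F = {f_stat:.2f} (df = {df_between}, {df_within})")
```

With real data, the lists `men`, `women`, and `edu_groups` would be built from the Q24 salaries after filtering on Q2 and Q4 and dropping missing values. The lists `boot_men`, `boot_women`, and `boot_diff` are what the histograms requested in step c would be drawn from.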
