CSDE 502 Assignment 8 Solved

CSDE 502 Assignment 8
dcoomes
library(captioner)
library(tidyverse)
library(magrittr)
 
figure_nums <- captioner(prefix = "Figure")
table_nums <- captioner(prefix = "Table")
Explanation: This assignment is intended to give you more practice in manipulating variables.

Instructions:

Make sure your Rmd file has no local file system dependencies (i.e., anyone should be able to recreate the output HTML using only the Rmd source file).
Make a copy of this Rmd file and add answers below each question.
Change the YAML header above to identify yourself and include contact information.
For any tables or figures, include captions and cross-references and any other document automation methods as necessary.
Make sure your output HTML file looks appealing to the reader.
Upload the final Rmd to your github repository.
Download assn_08_id.txt and include the URL to your Rmd file on github.com.
Create a zip file from your copy of assn_08_id.txt and upload the zip file to the Canvas site for Assignment 8. The zip file should contain only the text file. Do not include any additional files in the zip file--everything should be able to run from the file you uploaded to github.com. Use zip format and not 7z or any other compression/archive format.

Imagine a new variable: multirace, using the following value definitions:

1 = one race, White
2 = one race, not White
3 = two races, includes White
4 = two races, both non-White
5 = three or more races, includes White
6 = three or more races, all non-White
9 = any race missing (White, Black/African American, American Indian, Asian, other)
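The value definitions above can be read as a small decision rule. Below is a minimal base R sketch, assuming each race indicator is coded 0/1 and that any other value (or NA) counts as missing. The function name `code_multirace` is hypothetical, and a case with no race checked, which the scheme does not define, is left as NA here.

```r
# Hypothetical helper: assign a multirace code from five 0/1 race indicators.
code_multirace <- function(white, black, AI, asian, raceother) {
  races <- cbind(white, black, AI, asian, raceother)

  # Assumption: a value is valid only if it is exactly 0 or 1.
  valid <- races == 0 | races == 1
  any_missing <- rowSums(is.na(valid) | !valid) > 0
  ok <- !any_missing

  # Number of races checked; set to 0 for invalid rows so comparisons are never NA.
  n <- ifelse(ok, rowSums(races), 0)
  w <- ok & white == 1

  out <- rep(NA_integer_, nrow(races))
  out[ok & n == 1 &  w] <- 1L  # one race, White
  out[ok & n == 1 & !w] <- 2L  # one race, not White
  out[ok & n == 2 &  w] <- 3L  # two races, includes White
  out[ok & n == 2 & !w] <- 4L  # two races, both non-White
  out[ok & n >= 3 &  w] <- 5L  # three or more races, includes White
  out[ok & n >= 3 & !w] <- 6L  # three or more races, all non-White
  out[any_missing]      <- 9L  # any race missing
  out
}

# The ten hypothetical cases used in this assignment.
tbl <- data.frame(
  white     = c(1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 6L),
  black     = c(0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L),
  AI        = c(0L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 8L),
  asian     = c(0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L),
  raceother = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 6L)
)
tbl$multirace <- with(tbl, code_multirace(white, black, AI, asian, raceother))
tbl$multirace
```

Applied to the ten hypothetical cases, this rule yields the codes 1, 2, 3, 4, 5, 4, 6, 3, 5, 9.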
1.1 
Fill in the codes for the hypothetical cases below (Table 1).

Table 1: A hypothetical data set

white  black  AI  asian  raceother  multirace
1      0      0   0      0          1
0      1      0   0      0          2
1      0      0   1      0          3
0      1      1   0      0          4
1      1      0   1      0          5
0      1      0   0      1          4
0      1      1   0      1          6
1      0      1   0      0          3
1      1      1   0      1          5
6      1      8   1      6          9
1.2 
Using this data frame (code below), report how many cases checked more than one race. Use R code to make this calculation and use inline expressions.

dat <- 
structure(
    list(
        white = c(1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 6L),
        black = c(0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L), 
        AI = c(0L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 8L), 
        asian = c(0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L), 
        raceother = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 6L), 
        multirace = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)
    ), 
    class = "data.frame", 
    row.names = c(NA, -10L)
)
 
# Sum the five race indicators for each case and store the count in multirace
dat <- dat %>%
  mutate(multirace = rowSums(across(white:raceother)))
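There are 8 cases that checked more than one race in this data frame. This count includes the tenth case, whose row sum exceeds 1 only because it carries the values 6 and 8 rather than 0 or 1.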
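As a cross-check on that count, here is a base R sketch that computes the number of multi-race cases two ways: from the raw row sums, and after excluding any case with a value other than 0 or 1. Treating non-0/1 values as missing is an assumption drawn from the definition of code 9. In an Rmd file, either result could be reported with an inline expression that references `n_multi_raw` or `n_multi_valid`.

```r
dat <- data.frame(
  white     = c(1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 6L),
  black     = c(0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L),
  AI        = c(0L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 8L),
  asian     = c(0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L),
  raceother = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 6L)
)

races <- as.matrix(dat[, c("white", "black", "AI", "asian", "raceother")])

# Raw approach: sum the indicators and count rows with a sum above 1.
numrace <- rowSums(races)
n_multi_raw <- sum(numrace > 1)

# Stricter approach (assumption): ignore rows holding any value other than 0 or 1.
valid_case <- rowSums(races != 0 & races != 1) == 0
n_multi_valid <- sum(numrace > 1 & valid_case)

n_multi_raw    # 8
n_multi_valid  # 7
```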

 

1.3 
Write R code to create the multirace variable, using the data set AHwave1_v3.rds. Hint: You may want to create another variable, numrace, that counts the number of races. Use download.file() and Sys.getenv("TEMP") to download the file to your system's TEMP directory to avoid local file system dependencies.
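One way to handle the download step is sketched below in base R. The helper names `temp_destination` and `read_remote_rds` are hypothetical, and the URL in the usage comment is a placeholder, since the source does not give the file's location. Because the TEMP environment variable is typically only set on Windows, the sketch falls back to `tempdir()` when it is empty.

```r
# Hypothetical helper: build a destination path inside the temporary directory.
temp_destination <- function(url) {
  temp_dir <- Sys.getenv("TEMP")
  if (!nzchar(temp_dir)) {
    temp_dir <- tempdir()  # fallback when TEMP is not defined
  }
  file.path(temp_dir, basename(url))
}

# Hypothetical helper: download the file once (binary mode), then read it.
read_remote_rds <- function(url) {
  dest <- temp_destination(url)
  if (!file.exists(dest)) {
    download.file(url = url, destfile = dest, mode = "wb")
  }
  readRDS(dest)
}

# Usage (not run; the URL below is a placeholder, not the real file location):
# dat <- read_remote_rds("https://example.org/path/to/AHwave1_v3.rds")
```

Once the data are loaded, numrace and multirace can be derived from the five race indicators by following the value definitions given at the start of this question.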

1.4 
Label the multirace variable as well as its values using attribute labels. Include the code here.
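A minimal sketch of attribute labelling in base R, using the ten hypothetical codes from Table 1 rather than the Add Health data. The attribute names `label` and `labels` follow a common convention for labelled data and are an assumption here, as is the wording of the variable label.

```r
# Codes for the ten hypothetical cases in Table 1.
multirace <- c(1L, 2L, 3L, 4L, 5L, 4L, 6L, 3L, 5L, 9L)

# Variable label (wording is illustrative).
attr(multirace, "label") <- "Number and type of races reported"

# Value labels: a named vector mapping label text to code.
attr(multirace, "labels") <- c(
  "one race, White"                       = 1L,
  "one race, not White"                   = 2L,
  "two races, includes White"             = 3L,
  "two races, both non-White"             = 4L,
  "three or more races, includes White"   = 5L,
  "three or more races, all non-White"    = 6L,
  "any race missing"                      = 9L
)

attributes(multirace)
```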

1.5 
Include below a contingency table of the multirace variable. Make sure that the values are labelled so the table is readable, and also include any missing values.
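A readable contingency table can be produced by converting the codes to a factor whose levels carry the value labels. The sketch below again uses the ten hypothetical codes from Table 1 as stand-in data; `useNA = "ifany"` adds a row for NA values when any are present.

```r
multirace <- c(1L, 2L, 3L, 4L, 5L, 4L, 6L, 3L, 5L, 9L)

value_labels <- c(
  "one race, White"                       = 1L,
  "one race, not White"                   = 2L,
  "two races, includes White"             = 3L,
  "two races, both non-White"             = 4L,
  "three or more races, includes White"   = 5L,
  "three or more races, all non-White"    = 6L,
  "any race missing"                      = 9L
)

# Map each code to its label; codes not listed in value_labels would become NA.
multirace_f <- factor(multirace, levels = value_labels, labels = names(value_labels))

# Counts per category, including NA if any occur.
tab <- table(multirace_f, useNA = "ifany")
as.data.frame(tab, responseName = "n")
```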


Review part B of each of the answers (i.e., H1KQ1B .. H1KQ10B) to the Knowledge Quiz (Section 19 of the Add Health questionnaire, documented in INH19PUB.PDF). The 10 questions each ask: “How confident are you that your answer is correct?”

2.1 
Write R code that creates a single summary variable named kqconfidence, with a larger number representing the respondent being more confident across all questions (scale of 0 to 3 for each individual question; kqconfidence will be the sum for each subject across the 10 questions). Note that any observations with value 7 (i.e., age less than 15) should be removed from the data frame, and values 6, 8, and 9 should be coded as NA (i.e., missing) for the purposes of scoring confidence. Document your code so that the reader knows how you scored the scale and how you handled missing values. Make sure to label the new variable.
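The scoring rules in this question can be sketched in base R as follows. Several things are assumptions to check against INH19PUB.PDF: that the raw responses run from 1 (most confident) to 4 (least confident), so that `4 - value` gives the 0 to 3 scale; the lowercase column names; and the choice to leave kqconfidence as NA when any item is missing. The function name `score_kqconfidence` and the toy data are hypothetical.

```r
# Hypothetical helper: build kqconfidence from the part B items.
score_kqconfidence <- function(dat, items) {
  x <- as.matrix(dat[, items, drop = FALSE])

  # Remove respondents with any value 7 (age less than 15).
  keep <- rowSums(x == 7, na.rm = TRUE) == 0
  dat <- dat[keep, , drop = FALSE]
  x <- x[keep, , drop = FALSE]

  # Values 6, 8, and 9 are treated as missing.
  x[x %in% c(6, 8, 9)] <- NA

  # Assumption: raw codes 1..4 run from most to least confident, so reverse to 3..0.
  x <- 4 - x

  # Sum across items; a respondent with any missing item gets NA (design choice).
  dat$kqconfidence <- rowSums(x)
  attr(dat$kqconfidence, "label") <- "Knowledge Quiz confidence (sum of items, higher = more confident)"
  dat
}

# Toy data with three items instead of ten, for illustration only.
toy <- data.frame(
  h1kq1b = c(1, 4, 7, 2),
  h1kq2b = c(2, 8, 7, 3),
  h1kq3b = c(1, 1, 7, 4)
)
scored <- score_kqconfidence(toy, c("h1kq1b", "h1kq2b", "h1kq3b"))
scored$kqconfidence
```

In the toy data the third respondent is dropped (all 7s), and the second is scored NA because one item holds the value 8.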

2.2 
Create and include below a contingency table from kqconfidence with raw counts, percentages, and cumulative percentages. Put code to do this in your .R file.
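A table with raw counts, percentages, and cumulative percentages can be assembled in base R as sketched below. The function name `freq_table` is hypothetical and the input vector is toy data, not actual kqconfidence scores.

```r
# Hypothetical helper: frequency table with percentages and cumulative percentages.
freq_table <- function(x) {
  counts <- table(x, useNA = "ifany")
  pct <- 100 * as.vector(counts) / sum(counts)
  data.frame(
    value   = names(counts),
    n       = as.vector(counts),
    pct     = round(pct, 1),
    cum_pct = round(cumsum(pct), 1)
  )
}

# Toy scores for illustration, including one missing value.
ft <- freq_table(c(3, 3, 8, NA))
ft
```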

2.3 
[BONUS] For each subject there were between zero and 10 “missing” answers across the 10 component questions. We would like to know what this distribution is. Include below a table that shows the count of subjects for each unique value of the count of missing questions, and include code in your .R file.
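A base R sketch of the missing-answer distribution, assuming that "missing" means the values 6, 8, 9, or NA and that respondents with value 7 have already been removed. The function name `count_missing` and the three-item toy data are hypothetical.

```r
# Hypothetical helper: number of missing answers per respondent.
count_missing <- function(items, missing_codes = c(6, 8, 9)) {
  x <- as.matrix(items)
  # %in% drops the matrix shape, so restore it before taking row sums.
  is_missing <- is.na(x) | matrix(x %in% missing_codes, nrow = nrow(x))
  rowSums(is_missing)
}

# Toy data with three items instead of ten.
toy <- data.frame(
  a = c(1, 6, 8),
  b = c(2, 3, 9),
  c = c(4, 4, 9)
)
n_missing <- count_missing(toy)

# Count of respondents for each possible number of missing answers (0 to number of items).
missing_tab <- table(factor(n_missing, levels = 0:ncol(toy)))
missing_tab
```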

2.4 
For each possible value of the Knowledge Quiz Part A score (from Assignment 3), what is the mean kqconfidence level? (Include results below.)
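A grouped mean of this kind can be computed with `tapply()`. In the sketch below the column name `kqscore` for the Part A score is hypothetical (the Assignment 3 variable name is not given here), the data are toy values, and dropping missing kqconfidence values with `na.rm = TRUE` is an assumption.

```r
# Toy data: Part A score and confidence score for four respondents.
toy <- data.frame(
  kqscore      = c(0, 0, 1, 1),
  kqconfidence = c(10, 20, NA, 30)
)

# Mean confidence for each Part A score, ignoring missing confidence values.
mean_conf <- tapply(toy$kqconfidence, toy$kqscore, mean, na.rm = TRUE)

data.frame(
  kqscore           = names(mean_conf),
  mean_kqconfidence = as.vector(mean_conf)
)
```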

2.5 
[BONUS] For each respondent, create two different confidence scores: a confidence score for the items answered correctly and a confidence score for the items answered incorrectly. How many respondents are more confident when answering incorrectly?
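One possible approach is sketched below, assuming two matrices with one row per respondent and one column per item: `conf`, holding confidence scores from 0 to 3, and `correct`, a logical matrix marking correctly answered items. Both objects and the function name `mean_conf_where` are hypothetical, and the data are toy values. The sketch compares mean confidence rather than summed confidence, a design choice made so that the two scores stay comparable when the numbers of correct and incorrect items differ.

```r
# Hypothetical helper: mean confidence over the items selected by a logical mask.
mean_conf_where <- function(conf, mask) {
  conf[!mask] <- NA
  rowMeans(conf, na.rm = TRUE)  # NaN when no item is selected for a respondent
}

# Toy data: three respondents, three items.
conf <- matrix(c(3, 1, 2,
                 0, 3, 3,
                 2, 2, 2), nrow = 3, byrow = TRUE)
correct <- matrix(c(TRUE, FALSE, TRUE,
                    TRUE, FALSE, FALSE,
                    TRUE, TRUE,  TRUE), nrow = 3, byrow = TRUE)

conf_correct   <- mean_conf_where(conf, correct)
conf_incorrect <- mean_conf_where(conf, !correct)

# Respondents with no incorrect (or no correct) items are excluded from the count.
more_when_wrong <- sum(conf_incorrect > conf_correct, na.rm = TRUE)
more_when_wrong
```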
