CS6313 Mini Project 6 Solved

 1. Consider the prostate cancer dataset available on eLearning as prostate cancer.csv. It consists of data on 97 men with advanced prostate cancer. A description of the variables is given in Figure 1. We would like to understand how PSA level is related to the other predictors in the dataset. Note that vesinv is a qualitative variable. You can treat gleason as a quantitative variable.

Build a “reasonably good” linear model for these data by taking PSA level as the response variable. Carefully justify all the choices you make in building the model. Be sure to verify the model assumptions. In case a transformation of response is necessary, try the natural log transformation. Use the final model to predict the PSA level for a patient whose quantitative predictors are at the sample means of the variables and qualitative predictors are at the most frequent category.
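The workflow described above (log-transform the response, fit a linear model, then predict at the sample means and the most frequent category) could be sketched roughly as follows. This is only a starting point under stated assumptions: it fits the full model with every predictor and does no variable selection or assumption checking, both of which the problem still requires. The function names `design_matrix`, `fit_log_psa`, and `predict_typical_patient` are hypothetical, and the inputs are assumed to be arrays already extracted from the dataset.

```python
import numpy as np

# Assumed column order for the quantitative predictors (names taken from Figure 1).
QUANT = ["cancervol", "weight", "age", "benpros", "capspen", "gleason"]


def design_matrix(quant, vesinv):
    """Stack an intercept, the quantitative predictors, and a 0/1 vesinv indicator."""
    quant = np.asarray(quant, dtype=float)
    vesinv = np.asarray(vesinv, dtype=float).reshape(-1, 1)
    return np.hstack([np.ones((quant.shape[0], 1)), quant, vesinv])


def fit_log_psa(quant, vesinv, psa):
    """Ordinary least squares fit of log(psa) on all predictors.

    Returns the coefficient vector and the residuals on the log scale;
    the residuals are what one would inspect when checking model assumptions.
    """
    X = design_matrix(quant, vesinv)
    y = np.log(np.asarray(psa, dtype=float))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    return beta, residuals


def predict_typical_patient(beta, quant, vesinv):
    """Predict PSA at the sample means of the quantitative predictors
    and the most frequent vesinv category, back-transformed with exp."""
    quant = np.asarray(quant, dtype=float)
    vesinv = np.asarray(vesinv).astype(int)
    modal = np.bincount(vesinv).argmax()  # most frequent category (0 or 1)
    x0 = np.concatenate([[1.0], quant.mean(axis=0), [float(modal)]])
    return float(np.exp(x0 @ beta))
```

Exponentiating the fitted log value gives a prediction on the original PSA scale; under a log-normal error assumption this corresponds to the median rather than the mean, so the interpretation should be stated with the answer.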

| Header | Name | Description |
|---|---|---|
| subject | ID | 1 to 97 |
| psa | PSA level | Serum prostate-specific antigen level (mg/ml) |
| cancervol | Cancer volume | Estimate of prostate cancer volume (cc) |
| weight | Weight | Prostate weight (gm) |
| age | Age | Age of patient (years) |
| benpros | Benign prostatic hyperplasia | Amount of benign prostatic hyperplasia (cm²) |
| vesinv | Seminal vesicle invasion | Presence (1) or absence (0) of seminal vesicle invasion |
| capspen | Capsular penetration | Degree of capsular penetration (cm) |
| gleason | Gleason score | Pathologically determined grade of disease (6, 7 or 8) |
Figure 1: List of variables in the prostate cancer data
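As a small illustration of how the variable list might translate into a loading step, the sketch below parses the CSV with the standard library, converting the quantitative columns to floats and keeping vesinv as a 0/1 integer so it can be treated as qualitative. It assumes the file's column headers match the names in Figure 1 and that vesinv is stored as the literal text 0 or 1; the helper name `read_prostate_rows` is hypothetical.

```python
import csv

# Columns treated as quantitative (gleason included, as the problem allows).
QUANTITATIVE = ["psa", "cancervol", "weight", "age", "benpros", "capspen", "gleason"]


def read_prostate_rows(lines):
    """Parse CSV lines into dicts: floats for quantitative columns,
    an int 0/1 for the qualitative vesinv column. The subject ID is dropped."""
    rows = []
    for record in csv.DictReader(lines):
        row = {name: float(record[name]) for name in QUANTITATIVE}
        row["vesinv"] = int(record["vesinv"])  # 0 = absent, 1 = present
        rows.append(row)
    return rows
```

A typical call would pass an open file handle, for example `with open(path, newline="") as f: rows = read_prostate_rows(f)`, where `path` points to wherever the dataset was saved.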
