CSC487: Data Mining Midterm - Solved


1. Suppose you have these data points: 29, 75, 13, 20, 168, 163, 140, 52, 4, 37, 
36, 123, 120, 31, 111. 
Let us first load in the data... 

a. If you draw a histogram with a bin size of 25, how many bars will you have 
in your chart? Please justify your answer.
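
One way to check the bar count is to bucket the values into bins of width 25 and count the non-empty ones. The sketch below assumes the bins are anchored at 0, i.e. [0, 25), [25, 50), and so on; the question does not specify the starting edge, and a different anchor can change the count.

```python
from collections import Counter

data = [29, 75, 13, 20, 168, 163, 140, 52, 4, 37, 36, 123, 120, 31, 111]
BIN_SIZE = 25

# Assumption: bins are anchored at 0, so a value v falls in bin v // BIN_SIZE.
bin_counts = Counter(value // BIN_SIZE for value in data)

# Each non-empty bin corresponds to one visible bar in the histogram.
num_bars = len(bin_counts)

for index in sorted(bin_counts):
    low = index * BIN_SIZE
    print(f"[{low}, {low + BIN_SIZE}): {bin_counts[index]}")
print("bars:", num_bars)
```

Under this assumption the values span the bins from [0, 25) up to [150, 175), and every one of those bins contains at least one point.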
 
b. What’s the value at the 60th percentile?
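
The percentile depends on which convention is used, and the question does not name one. The sketch below shows two common conventions, nearest rank and linear interpolation between order statistics; the function names are illustrative only.

```python
import math

data = [29, 75, 13, 20, 168, 163, 140, 52, 4, 37, 36, 123, 120, 31, 111]
ordered = sorted(data)


def percentile_nearest_rank(values, p):
    """Smallest value with at least p percent of the (sorted) data at or below it."""
    rank = math.ceil(p * len(values) / 100)
    return values[max(rank, 1) - 1]


def percentile_linear(values, p):
    """Linear interpolation between the two closest ranks of the sorted data."""
    position = p * (len(values) - 1) / 100
    lower = math.floor(position)
    upper = min(lower + 1, len(values) - 1)
    fraction = position - lower
    return values[lower] + fraction * (values[upper] - values[lower])


nearest = percentile_nearest_rank(ordered, 60)
interpolated = percentile_linear(ordered, 60)
print(nearest, round(interpolated, 1))
```

With these 15 points the nearest-rank rule picks the 9th sorted value, 75, while linear interpolation lands between 75 and 111 at 89.4.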
 
c. Use z-score normalization to transform the value 36. 
def z_score_scaler(val): 

We can see that z-score normalization transforms the value 36 to −0.679.
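
A minimal sketch of how this value can be reproduced, assuming the sample standard deviation (dividing by n − 1) is used; the `values` parameter is an addition for illustration and is not part of the original function signature. With the population standard deviation the result would be about −0.703 instead.

```python
import statistics

data = [29, 75, 13, 20, 168, 163, 140, 52, 4, 37, 36, 123, 120, 31, 111]


def z_score_scaler(val, values=data):
    """Return (val - mean) / std for the given data set."""
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(values)
    return (val - mean) / std


z_36 = z_score_scaler(36)
print(round(z_36, 3))
```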
d. Use min-max normalization to transform the value 13 onto the range [1, 10].
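
Min-max normalization maps a value v to (v − min) / (max − min) · (new_max − new_min) + new_min. The sketch below applies that formula to the data from this question; the function name `min_max_scaler` is hypothetical.

```python
data = [29, 75, 13, 20, 168, 163, 140, 52, 4, 37, 36, 123, 120, 31, 111]


def min_max_scaler(val, values, new_min=1.0, new_max=10.0):
    """Rescale val from [min(values), max(values)] onto [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    scaled = (val - old_min) / (old_max - old_min)
    return scaled * (new_max - new_min) + new_min


scaled_13 = min_max_scaler(13, data)
print(round(scaled_13, 3))
```

Here the minimum is 4 and the maximum is 168, so 13 maps to (9 / 164) · 9 + 1, roughly 1.494.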

2. Compute the distance between objects 3 and 4 in the table below.

We can see that the distance between objects 3 and 4 is 0.25.
3. Use chi-square for the data below to find out whether there’s a relation 
between playing basketball and eating cereal. Based on your result, describe the 
relation.
all expectations 

A χ² table shows that a test statistic of 1.231 with 1 degree of freedom is not 
statistically significant (the critical value at the 0.05 level is 3.841). Thus we fail to 
reject the null hypothesis that playing basketball and eating cereal are not related. 
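
The table lookup can also be checked numerically. For 1 degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(x / 2)), so only the standard library is needed. The sketch below takes the statistic 1.231 as given and assumes a 0.05 significance level; the helper name `chi2_sf_1dof` is hypothetical.

```python
import math


def chi2_sf_1dof(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


statistic = 1.231      # test statistic reported above
CRITICAL_05 = 3.841    # chi-square critical value for 1 dof at the 0.05 level

p_value = chi2_sf_1dof(statistic)
significant = statistic > CRITICAL_05

print(f"p-value: {p_value:.3f}, significant at 0.05: {significant}")
```

The p-value comes out well above 0.05, which agrees with the conclusion drawn from the table.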
4. Using the data table below, calculate the information gain for gender and age.

We can now see that the information gain for age when using gender as labels is 0.262.
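
For reference, information gain is the entropy of the labels minus the weighted entropy of the labels within each attribute value. The sketch below implements that definition; the rows in it are hypothetical toy data chosen only to exercise the functions, not the exam's table, so it does not reproduce the 0.262 figure.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(attribute_values, labels):
    """Entropy of labels minus the weighted entropy after splitting on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy rows (NOT the exam data): age perfectly predicts gender here.
toy_age = ["young", "young", "old", "old"]
toy_gender = ["F", "F", "M", "M"]
toy_gain = information_gain(toy_age, toy_gender)
print(toy_gain)
```

In the toy rows each age group contains a single gender, so the split removes all uncertainty and the gain equals the full label entropy of 1 bit.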