Starting from:

$25

BUAN6356- Homework 1 Solved

The “Utilities” dataset includes information on 22 public utility companies in the US.  The variable definitions are provided below.   

 

Fixed_charge = fixed-charge covering ratio (income/debt)

RoR = rate of return on capital

Cost = cost per kilowatt capacity in place

Load_factor = annual load factor

Demand_growth = peak kilowatthour demand growth from 1974 to 1975

Sales = sales (kilowatthour use per year)

Nuclear = percent nuclear

Fuel_Cost = total fuel costs (cents per kilowatthour)

 

For Questions 1-4 below, do not scale the data.  

 

1.      Compute the minimum, maximum, mean, median, and standard deviation for each of the numeric variables using data.table package.  Which variable(s) has the largest variability?  Explain your answer.

 

2.     Create boxplots for each of the numeric variables.  Are there any extreme values for any of the variables?  Which ones?  Explain your answer.  

 

3.     Create a heatmap for the numeric variables.  Discuss any interesting trend you see in this chart.

 

4.     Run principal component analysis using unscaled numeric variables in the dataset.  How do you interpret the results from this model?

 

5.     Next, run principal component model after scaling the numeric variables.  Did the results/interpretations change?  How so?  Explain your answers.

More products