Assignment-1 Linear and Logistic Regression, ML in Practice
Instructions
• Your submission should be a single zip file named 2020xxx_HW1.zip (where 2020xxx is your roll number). Include all files (code and the report with theory questions), arranged with proper names.
• Include a single .pdf report that explains your code and contains the results, relevant graphs, visualizations, and solutions to the theory questions.
• The submission should follow this structure:
2020xxx_HW1
|− code_rollno.py/.ipynb
|− report_rollno.pdf
|− (All other files for submission)
• Anything not in the report will not be graded.
• Your code should be neat and well-commented.
• You have to do either Section B or Section C.
• Section A is mandatory.
1. (10 points) Section A (Theoretical)
(e) (1 mark) The parameters to be estimated in the simple linear regression model Y = α + βx + ϵ, ϵ ~ N(0, σ), are:
(a) α, β, σ
(b) α, β, ϵ
(c) a, b, s
(d) ϵ, 0, σ
(f) (1 mark) In a study of the relationship between X = mean daily temperature for the month and Y = monthly charges on the electric bill, the following data was gathered: X = [20, 30, 50, 60, 80, 90], Y = [125, 110, 95, 90, 110, 130]. Which of the following seems the most likely model?
(a) Y = α + βx + ϵ, β < 0
(b) Y = α + βx + ϵ, β > 0
(c) Y = α + β₁x + β₂x² + ϵ, β₂ < 0
(d) Y = α + β₁x + β₂x² + ϵ, β₂ > 0

2. (15 points) Section B (Scratch Implementation)
Logistic Regression
Dataset: Diabetes Healthcare Dataset

OR

3. (15 points) Section C (Algorithm implementation using packages)
Implementation of linear regression using libraries:
• Split the dataset into 80:20 (train:test).
Dataset: CO2 Emissions Dataset
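
For reference, the 80:20 split asked for in Section C could be sketched as follows. This is only an illustration and not a required solution. It assumes scikit-learn and NumPy are the chosen libraries, and it uses synthetic data in place of the CO2 Emissions Dataset, because the dataset's file name and columns are not given here. The feature count, the coefficients, and the random seeds are all made up for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption, NOT the CO2 Emissions Dataset):
# 200 samples, 2 features, a known linear relation plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(0.0, 0.5, size=200)

# 80:20 train:test split; random_state fixes the shuffle so runs are repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit ordinary least squares on the training portion only.
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on the held-out 20%.
y_pred = model.predict(X_test)
test_mse = mean_squared_error(y_test, y_pred)
test_r2 = r2_score(y_test, y_pred)

print(f"train size: {len(X_train)}, test size: {len(X_test)}")
print(f"test MSE: {test_mse:.3f}, test R^2: {test_r2:.3f}")
```

With the real dataset, X and y would instead come from the chosen feature columns and the target column, and the metrics belong in the report alongside the relevant plots.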