Objectives
Solve a regression problem with deep neural networks (DNN).
Understand basic DNN training tips.
○ e.g. hyper-parameter tuning, feature selection, regularization, ...
Get familiar with PyTorch.
Task Description
COVID-19 Cases Prediction
Source: Delphi group @ CMU
○ A daily survey since April 2020 via Facebook.
Do not attempt to find any related data! Using additional data is prohibited; if you do, your final grade will be multiplied by 0.9.
Given survey results from the past 3 days in a specific U.S. state, predict the percentage of new tested positive cases on the 3rd day.
Day 1: survey, positive cases
Day 2: survey, positive cases
Day 3: survey, positive cases (to be predicted)
Surveys are conducted via Facebook (every day & every state).
Survey: symptoms, COVID-19 testing, social distancing, mental health, demographics, economic effects, ...
All population in a certain state of the U.S. → some samples → survey → estimation for all population in that state (data we are using)
● States (40, encoded to one-hot vectors)
○ e.g. AL, AK, AZ, ...
● COVID-like illness (4)
○ e.g. cli, ili (influenza-like illness), ...
● Behavior Indicators (8)
○ e.g. wearing_mask, travel_outside_state, ...
● Mental Health Indicators (5)
○ e.g. anxious, depressed, ...
● Tested Positive Cases (1)
○ tested_positive (this is what we want to predict)
Indicator values are given as percentages.
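To make the input size concrete, the group sizes listed above can be added up. This is only a sketch: it assumes the 18 per-day values repeat for each of the 3 days and that the day-3 tested_positive value is held out as the target. The variable names are hypothetical and not taken from the dataset.

```python
# Group sizes as listed on the slide; the per-day repetition is an assumption.
NUM_STATES = 40
PER_DAY_GROUPS = {
    "covid_like_illness": 4,
    "behavior_indicators": 8,
    "mental_health_indicators": 5,
    "tested_positive": 1,
}
NUM_DAYS = 3

# Values recorded for a single day.
per_day = sum(PER_DAY_GROUPS.values())

# One-hot state block plus the per-day block repeated for every day.
total_columns = NUM_STATES + NUM_DAYS * per_day

# The day-3 tested_positive value is the target, so it is not an input.
num_inputs = total_columns - 1

print(per_day, total_columns, num_inputs)
```

Under these assumptions the network's first layer would take `num_inputs` features per sample.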
Data -- One-hot Vector
One-hot vectors:
Vectors in which exactly one element equals one and all others are zero.
Usually used to encode discrete values.
AL (Alabama)
AK (Alaska)
AZ (Arizona)
AR (Arkansas)
...
WI (Wisconsin)
If state code = AZ (Arizona), only the AZ element is one.
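A minimal sketch of one-hot encoding for a state code. The list below is a hypothetical, truncated stand-in with only four entries; the homework data uses 40 states, and the function name `one_hot` is made up for illustration.

```python
# Hypothetical, truncated list of state codes (the real data has 40).
STATE_CODES = ["AL", "AK", "AZ", "AR"]


def one_hot(code, codes=STATE_CODES):
    """Return a list with a 1 at the position of `code` and 0 elsewhere."""
    if code not in codes:
        raise ValueError(f"unknown state code: {code}")
    return [1 if c == code else 0 for c in codes]


print(one_hot("AZ"))  # [0, 0, 1, 0]
```

Each state gets its own position, so the encoding does not impose any ordering between states.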
Data -- Training
covid.train.csv (2700 samples)
1 row = 1 sample
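Since each row is one sample, a training file can be split row by row into input features and a target. The sketch below uses a tiny made-up CSV with toy values rather than covid.train.csv, and it assumes the last column holds the value to predict; the column names and the helper `load_rows` are hypothetical.

```python
import csv
import io

# Made-up miniature file with toy values; the real file has many more columns.
SAMPLE_CSV = """AL,AK,cli,tested_positive
1,0,0.5,10.0
0,1,0.25,20.0
"""


def load_rows(text):
    """Parse CSV text into a header, feature rows, and target values."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    features, targets = [], []
    for row in reader:
        values = [float(v) for v in row]
        features.append(values[:-1])  # every column except the last
        targets.append(values[-1])    # assumed: the last column is the target
    return header, features, targets


header, X, y = load_rows(SAMPLE_CSV)
print(len(X), len(X[0]), y)
```

For a real file, the same function could be fed the contents read from disk instead of the inline string.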
Data -- Testing
covid.test.csv (893 samples)
1 row = 1 sample
Evaluation Metric
Root Mean Squared Error (RMSE)
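RMSE is the square root of the mean of the squared differences between predictions and true values. A minimal sketch in plain Python follows; the function name `rmse` and the example numbers are illustrative only.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        raise ValueError("at least one sample is required")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Errors are 1 and 7, so the mean squared error is (1 + 49) / 2 = 25.
print(rmse([11.0, 20.0], [10.0, 13.0]))  # 5.0
```

Because the errors are squared before averaging, a few large mistakes raise the score more than many small ones.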
input features
(testing data)
Kaggle
Link: https://www.kaggle.com/c/ml2021spring-hw1
Displayed name: <student ID>_<anything>
○ e.g. b06901020_puipui
○ If you are auditing, don't put a student ID in your displayed name.
Your .zip file should include only
○ Code: either .py or .ipynb