COSC426-526 Assignment 01 - Test your Python skills


This set of questions will help you gauge your Python programming skills. Keep in mind that this is a simple test; we will work on more complex programming problems during the semester.

Dataset Information

For this assignment, you will work with COVID-19 sequence processing data from Kaggle. The dataset describes how different countries processed COVID-19 sequences over time. It comes as a comma-separated values (CSV) file and includes the following six columns:
1. location: the country for which the information is provided
3. variant: the COVID-19 variant for the data entry
6. perc_sequences: the percentage of the available number of sequences that were processed (Note: this value is out of 100)

Each row (or data entry) in the dataset represents the processing of one variant by one country on one day.
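
As a rough illustration of the row structure described above, the sketch below reads CSV rows with Python's standard csv module. The sample rows and the load_rows helper are hypothetical and are not part of the assignment; only the three column names documented above (location, variant, perc_sequences) are assumed, and the real file contains additional columns.

```python
import csv
import io

# Hypothetical sample rows for illustration; the values are made up.
SAMPLE_CSV = """location,variant,perc_sequences
United States,Alpha,12.5
United States,Delta,80.0
India,Delta,95.0
"""


def load_rows(handle):
    """Read CSV rows into dicts, converting perc_sequences to a float."""
    rows = []
    for row in csv.DictReader(handle):
        # perc_sequences is a percentage out of 100, stored as text in the CSV.
        row["perc_sequences"] = float(row["perc_sequences"])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), rows[0]["location"], rows[0]["variant"])
```

To read the provided file instead of the in-memory sample, pass a file object opened with newline="" (as the csv module documentation recommends) to load_rows.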

A copy of this dataset is provided to you. However, if you want to, you can also find the dataset here: https://www.kaggle.com/yamqwe/omicron-covid19-variant-daily-cases?select=covidvariants.csv.

Problem 1

The three main variants of COVID-19 that we’ve experienced in the United States are:
1. Alpha
2. Delta
3. Omicron

However, there are many other variants recognized by the WHO.

For this problem, determine which other variants are included in the dataset. Additionally, sort the variant names alphanumerically.

Note: the variant column contains two “catch-all” categories called “non_who” and “others.” Do NOT include these categories in the list.
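
One possible way to approach the filtering and sorting described in Problem 1 is with Python sets, sketched below. The other_variants helper and the sample values are hypothetical, and the sketch assumes that "other variants" means every variant except Alpha, Delta, Omicron, and the two catch-all categories. Note that sorted() orders strings by character code, so uppercase names sort before lowercase ones.

```python
# The three main variants named in Problem 1.
MAIN_VARIANTS = {"Alpha", "Delta", "Omicron"}
# The two catch-all categories that must be excluded.
CATCH_ALL = {"non_who", "others"}


def other_variants(variant_values):
    """Return the sorted unique variants, minus the main and catch-all ones."""
    unique = set(variant_values)
    return sorted(unique - MAIN_VARIANTS - CATCH_ALL)


# Hypothetical values standing in for the variant column of the dataset.
sample = ["Delta", "Beta", "others", "Alpha", "Gamma", "non_who", "Beta", "Omicron"]
print(other_variants(sample))  # ['Beta', 'Gamma']
```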

Problem 2

Determine which variant of COVID-19 has the most sequences processed across the entire dataset.

Problem 3

Determine which country did the best at processing sequences across all variants (including the “catch-all” categories). The output should be the name of a single country.

Problem 4

Problem 4 has two parts.

Part A

Determine which country did the best at processing sequences across the Alpha, Delta, and Omicron variants only. The output should be the name of a single country.

Part B

Determine the ranking of the United States at processing sequences across the Alpha, Delta, and Omicron variants only.

Note: the best country overall should have a ranking of 1, but indexing in Python starts at 0.
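
The off-by-one point in the note above can be illustrated with a short sketch. The rank_of helper, the country names, and the scores are hypothetical; how each country's score is computed from the dataset is left to your solution. The sketch only shows converting a 0-based list position into a 1-based ranking.

```python
def rank_of(country, scores):
    """Return the 1-based rank of a country, where a higher score ranks better."""
    # Order the country names from the highest score to the lowest.
    ordered = sorted(scores, key=scores.get, reverse=True)
    # list.index() is 0-based, so add 1 to make the best country rank 1.
    return ordered.index(country) + 1


# Hypothetical per-country totals, for illustration only.
scores = {"Country A": 120, "Country B": 300, "United States": 250}
print(rank_of("United States", scores))  # 2
```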

Problem 5


Problem 6

Determine the percentage of processed sequences for the Alpha, Delta, and Omicron variants only in the United States.

Implementation Requirements

There are only two simple requirements for your implementation:
1. All code should be written in Python 3. We’ll run your code with a Python 3 interpreter, so Python 2 code will almost certainly fail.
2. All code should be either a single Python script (.py file) or Jupyter Notebook (.ipynb file).

Upload your solution to Canvas before 11:59 PM ET today.
