STAT240 Homework 3 Solution

Preliminaries
This file should be in STAT240/homework/hw03 on your local computer.
Problem 1
The columns mass and radius are relative to Earth. A mass or radius of 2.0 refers to a planet twice the mass or radius of Earth. A
mass or radius of 0.5 refers to a planet half the mass or radius of Earth.
Starting from planets ,

Problem 2
Starting from planets:
Keep only planets discovered by the method “Radial Velocity”.
Keep only planets whose mass and radius are both known (i.e., neither is missing).
Group and summarize that data, such that:
You get one row per year.
Create the columns n_discovered and minimum_mass, which contain, respectively, the number of planets discovered in that year and the smallest mass among planets discovered in that year.
Save this dataframe with the name mass_by_year. Then, on a separate line, let mass_by_year print as output so we can see the first ten rows in your knitted file. (There is no need to use print to show the whole thing; we only need ten rows.)
# Write your code here!
Your first two rows should look like this if you did it correctly. (Column order is arbitrary, doesn’t matter if n_discovered and minimum_mass are switched around, just match the values.)
   year n_discovered minimum_mass
  <dbl>        <int>        <dbl>
1  1999            1         232.
2  2001            1        1392.
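The steps for this problem can be sketched as a single pipeline. This is a minimal illustration, not the official solution: it assumes the dplyr package is available, and planets_toy is a hypothetical stand-in with invented values, since the real planets data is not reproduced here. The column names method, year, mass, and radius are taken from the problem text.

```r
library(dplyr)

# Hypothetical toy data standing in for `planets`; all values are invented.
planets_toy <- data.frame(
  planet = c("p1", "p2", "p3", "p4", "p5", "p6"),
  method = c("Radial Velocity", "Radial Velocity", "Transit",
             "Radial Velocity", "Radial Velocity", "Radial Velocity"),
  year   = c(1999, 2001, 2001, 2001, 2001, 2001),
  mass   = c(232, 1392, 50, NA, 800, 900),
  radius = c(13, 14, 5, 12, NA, 11),
  stringsAsFactors = FALSE
)

mass_by_year <- planets_toy %>%
  # Keep only Radial Velocity discoveries.
  filter(method == "Radial Velocity") %>%
  # Keep only rows where both mass and radius are known.
  filter(!is.na(mass), !is.na(radius)) %>%
  # One row per year, with a count and the smallest mass.
  group_by(year) %>%
  summarize(n_discovered = n(), minimum_mass = min(mass))

mass_by_year
```

Filtering out missing values before summarizing means min(mass) never sees an NA, so no na.rm argument is needed in this sketch.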
Problem 3

Problem 4
Starting from planets, the original dataframe from the top of the file,
Print out the planet name, mass, radius, and density of the top five most dense planets.
To do so, you will have to calculate the density of each planet first, and then find the top five by density.
The density of a planet is its mass divided by its volume.

Note: This question requires you to understand the request and figure out which commands to chain together. Previous questions have spelled out the step-by-step process; it is intentionally left out of this question and some future ones.
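One possible way to chain the commands is sketched below. It assumes planets are treated as spheres, so volume is (4/3)πr³; because mass and radius are both relative to Earth, the constant cancels and density relative to Earth is mass / radius^3. That modelling choice is an assumption of this sketch, as are the dplyr package and the invented values in the hypothetical planets_toy data frame.

```r
library(dplyr)

# Hypothetical toy data standing in for `planets`; all values are invented.
planets_toy <- data.frame(
  planet = c("p1", "p2", "p3", "p4", "p5", "p6", "p7"),
  mass   = c(1, 12, 2, 0.5, 10, 3, NA),
  radius = c(1, 2, 1, 1, 1, 0.5, 2),
  stringsAsFactors = FALSE
)

top_density <- planets_toy %>%
  # Density relative to Earth, assuming spherical planets.
  mutate(density = mass / radius^3) %>%
  select(planet, mass, radius, density) %>%
  # Sort from most to least dense; arrange() places NA values last.
  arrange(desc(density)) %>%
  # Keep the five densest planets.
  head(5)

top_density
```

Sorting and then taking the first five rows keeps the logic explicit. If several planets tie at the cutoff, this sketch keeps only the first five rows after sorting.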
Problem 5
Which star or stars have the most planets orbiting them in this dataset? How many planets are orbiting that star or stars?
To answer this question, start from planets, then create and print a dataframe with columns star and n, with n representing how many planets are orbiting that star.
# Write your code here!
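A minimal sketch of the counting step follows, assuming dplyr is available. planets_toy and its values are hypothetical; only the column name star and the output column n come from the problem text. The final filter is an optional extra that keeps every star tied for the largest count.

```r
library(dplyr)

# Hypothetical toy data standing in for `planets`; all values are invented.
planets_toy <- data.frame(
  planet = c("A b", "A c", "A d", "B b", "C b", "C c", "C d"),
  star   = c("A", "A", "A", "B", "C", "C", "C"),
  stringsAsFactors = FALSE
)

# One row per star, with n = number of planets orbiting it, largest first.
star_counts <- planets_toy %>%
  count(star) %>%
  arrange(desc(n))

star_counts

# Keep all stars that share the maximum count, so ties are not lost.
most_planets <- star_counts %>%
  filter(n == max(n))

most_planets
```

Filtering on n == max(n) rather than taking the first row matters because the question allows for "star or stars": more than one star may share the top count.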
Problem 6
Problems 6 - 8 take you through a relatively complex analysis -> visualization process, which mimics what you might provide to a client asking the question:
“How has the most popular method of planet discovery changed over time?”


Problem 7
Starting from methods_within_year from problem 6 above,
Grouping by year, add a column called yearTotal. yearTotal should indicate how many planets were discovered within that year across all methods.
Here’s the first four rows to check your work against:
   year method              n yearTotal
  <dbl> <chr>           <int>     <int>
1  2000 Radial Velocity    16        16
2  2001 Radial Velocity    12        12
3  2002 Radial Velocity    28        29
4  2002 Transit             1        29
Now, add another column called methodProportion, which gives the proportion (a value between 0 and 1, not a percentage) of that year's discoveries that were made by that method.
Here’s what methodProportion should look like, again, column order doesn’t matter:
   year method              n yearTotal methodProportion
  <dbl> <chr>           <int>     <int>            <dbl>
1  2000 Radial Velocity    16        16                1
2  2001 Radial Velocity    12        12                1
3  2002 Radial Velocity    28        29            0.966
4  2002 Transit             1        29           0.0345
Save this dataframe with the name methods_within_year_proportions, and then let methods_within_year_proportions be printed as output so the first ten rows are visible in your .html file.
# Write your code here!
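The two column additions can be sketched with a grouped mutate, assuming dplyr is available. Because Problem 6's code is not shown here, this sketch rebuilds only the first four rows of methods_within_year from the check values printed above. The trailing ungroup() is a choice made for this sketch, not something the problem requires.

```r
library(dplyr)

# First four rows of methods_within_year, copied from the check values above.
methods_within_year <- data.frame(
  year   = c(2000, 2001, 2002, 2002),
  method = c("Radial Velocity", "Radial Velocity", "Radial Velocity", "Transit"),
  n      = c(16L, 12L, 28L, 1L),
  stringsAsFactors = FALSE
)

methods_within_year_proportions <- methods_within_year %>%
  group_by(year) %>%
  mutate(
    # Total discoveries in the year, summed across all methods.
    yearTotal = sum(n),
    # Share of that year's discoveries made by this method.
    methodProportion = n / yearTotal
  ) %>%
  ungroup()

methods_within_year_proportions
```

Using mutate() instead of summarize() after group_by() keeps one row per year and method while attaching the per-year total to each row, which is what allows the proportion to be computed row by row.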
Problem 8

Now, let’s answer the client’s question based on the graph. “How has the most popular method of planet discovery changed over time?”
Replace this text with your response.
