CSE6242-CX4242 Homework 2 Solution

Homework 2 : Tableau, D3 Graphs and Visualization
Prepared by our 26+ wonderful TAs of CSE6242A,Q,OAN,O01,O3/CX4242A for our 1200+ students
Submission Instructions and Important Notes:
bottom right of this document).
● Submit a single zipped file, called “HW2-GT username.zip”, containing all the deliverables including source code/scripts, data files, and readme. Example: “HW2-jdoe3.zip” if GT account username is “jdoe3”. Only .zip is allowed (no other format will be accepted). Your GT username is the one with letters and numbers.
validation works, use hashmap instead of array) and review any relevant materials online. However, each student must write up and submit his or her own answers.
be subject to the institute’s Academic Integrity procedures (e.g., reported to and directly handled by the Office of Student Integrity (OSI)). Consequences can be severe, e.g., academic probation or dismissal, grade penalties, a 0 grade for assignments concerned, and prohibition from withdrawing from the class.
● At the end of this assignment, we have specified a folder structure you must use to organize your files in a single zipped file. 5 points will be deducted for not following this strictly.
task, unless your script is absolutely dependent on it to get the final result (which it ideally should not be).
you strictly follow our requirements.
● Wherever you are asked to write down an explanation for the task you perform, stay within the word
● Every homework assignment deliverable and every project deliverable comes with a 48-hour "grace period". Any deliverable submitted after the grace period will get zero credit. We recommend that you plan to finish by the beginning of the grace period in order to leave yourself time for any unexpected issues which might arise.
● We will not consider late submission of any missing parts of a homework assignment or project
Grading
The maximum possible score for this homework is 100 points.
Students in the CX4242 undergraduate section can choose to complete any 85 points worth of work to receive the full 15% of the final course grade. For example, if a CX4242 student scores 100 pts, that student will receive (100 / 85) * 15 = 17.65 points towards the final course grade. To receive the full 15% score, students in the CSE6242 sections will need to complete all 100 points.
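The grading arithmetic above can be checked with a short sketch. The function name `courseContribution` and its parameters are illustrative only and are not part of the assignment.

```javascript
// Contribution of a homework score to the final course grade.
// `pointsForFullCredit` is 85 for CX4242 and 100 for CSE6242 (from the text above);
// `weightPercent` is the homework's 15% share of the course grade.
function courseContribution(score, pointsForFullCredit, weightPercent = 15) {
  return (score / pointsForFullCredit) * weightPercent;
}

// A CX4242 student who scores 100 points.
const cx = courseContribution(100, 85);
// A CSE6242 student who scores 100 points.
const cse = courseContribution(100, 100);

console.log(cx.toFixed(2));  // "17.65"
console.log(cse.toFixed(2)); // "15.00"
```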
===== Important Prerequisites =====
Download the HW2 Skeleton that contains files you will use in this homework.
We highly recommend that you use the latest Firefox browser to complete this homework. We will grade your work using Firefox 68.0 (or newer).
For this homework, you will work with version 5 of D3, provided to you in the lib folder. You must NOT use any D3 libraries (d3*.js) other than the ones provided.
All d3*.js files in the lib folder must be referenced using relative paths, e.g., “../lib/<filename>”, in your html files. For example, if the file “Q2/graph.html” uses d3, its header should contain:
<script type="text/javascript" src="../lib/d3.v5.min.js"></script>
It is incorrect to use an absolute path such as:
<script type="text/javascript" src="http://d3js.org/d3.v5.min.js"></script>
You can and are encouraged to decouple the style, functionality and markup in the code for each question. That is, you can use separate files for css, javascript and html.
==========
Q1 [10 points] Designing a good table. Visualizing data with Tableau.
Imagine you are a data scientist working with data that documents population distribution according to ethnic group, age and gender across years.
For each year, and for each ethnic group (treat “Other Ethnic Groups” as an ethnic group), your table should clearly communicate:
● The total number of males (across all ages)
● The total number of females (across all ages)
● The total population (across all ages)
● The percentage of people that are 65 years and over, rounded to 2 decimal places (you will need to calculate this percentage)
● Save the table as table.png.
Our main goal here is for you to try out Tableau, a popular information visualization tool. Thus, we keep this part more open-ended, so you can practice making design decisions. We will accept most designs from you all. We show one possible design in the figure below, based on the tutorial from Tableau, and you are not limited to the techniques presented there.
Please follow the instructions below:
● Your design should visualize the values of the categories Total Malays, Total Indians, Total Chinese, Other Ethnic Groups (Total) for each year.
● Your design should utilize a stacked bar chart to show the count for each of the aforementioned columns.
● Your design should have clearly labeled axes and a clear chart title. Include a legend for your chart.
● Save the chart as barchart.png.
Tableau has provided us with student licenses for Tableau Desktop, available for Mac and Windows. Go to tableau activation and select “Tableau Desktop”. After the installation, you will be asked to provide an activation key, which you can find on the Canvas page for this assignment. This key is for your use in this course only. Do not share the key with anyone.
If you do not have access to a Mac or Windows machine, please use the 14-day trial version of Tableau Online:
1. Visit https://www.tableau.com/trial/tableau-online
2. Enter your information (name, email, GT details, etc)
3. You will then receive an email to access your Tableau Online site
4. Go to your Site and create a workbook

Figure 1: Example of a stacked bar chart
Q1 Deliverables:
The directory structure should be as follows:
Q1/
    table.png
    barchart.png
    age-distribution.csv
    population.csv
● table.png - An image/screenshot of the table in Q1.a (png format only).
● barchart.png - An image of the chart in Q1.b (png format only; Tableau workbooks will not be graded!). The image should be clear and of high quality.
● age-distribution.csv and population.csv - the datasets.
Q2 [15 points] Force-directed graph layout
You will experiment with many aspects of D3 for graph visualization. To help you get started, we have provided the graph.html file (in the Q2 folder).
Note: You are welcome to split graph.html into graph.html, graph.css, and graph.js. Please also make certain that any paths in your code are relative paths. Nonfunctioning code will result in a five point deduction.
a. [3 points] Adding node labels: Modify graph.html to show a node label (the node name, i.e., the source) on the top right of each node. If a node is dragged, its label must move with it.
b. [3 points] Styling edges: Style the edges based on the “value” field in the links array. Assign the following styles:
If the value of the edge is equal to 0, the edge should be black, thin, and dashed.
If the value of the edge is equal to 1, the edge should be green, thick, and solid.
c. [3 points] Scaling nodes:
Note: Regardless of which scale you decide to use, you should avoid extreme node sizes (e.g., nodes that are mere points, barely visible, or of huge sizes). Failure to do so will result in a poor quality visualization.
https://stackoverflow.com/questions/43906686/d3-node-radius-depends-on-number-of-links-weightproperty
2. [1.5 points] The degree of each node should be represented by varying colors. Pick a meaningful color scheme (hint: color gradients). The number of color gradations is up to you, but it must be visually evident that the nodes with higher degree are colored a darker/deeper color and the nodes with lesser degree are colored lighter.
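For part c, the degree of each node can be derived from the links array before any styling is applied. The following is a minimal sketch in plain JavaScript, assuming each link has the shape {source, target, value} with node names as strings (after d3.forceLink initializes the simulation, source and target become node objects, so the lookup would need adjusting). The helper names, the radius bounds, and the number of color steps are assumptions chosen for illustration.

```javascript
// Hypothetical sample in the assumed shape of the links array.
const links = [
  { source: "A", target: "B", value: 0 },
  { source: "A", target: "C", value: 1 },
  { source: "A", target: "D", value: 1 },
  { source: "B", target: "C", value: 0 },
];

// Count how many edges touch each node (its degree).
function computeDegrees(links) {
  const degree = new Map();
  for (const { source, target } of links) {
    degree.set(source, (degree.get(source) || 0) + 1);
    degree.set(target, (degree.get(target) || 0) + 1);
  }
  return degree;
}

// Map a degree to a radius in [minR, maxR] along a square-root curve,
// so that no node shrinks to a point or grows to a huge size.
function radiusFor(deg, maxDeg, minR = 4, maxR = 16) {
  const t = maxDeg > 0 ? Math.sqrt(deg / maxDeg) : 0;
  return minR + t * (maxR - minR);
}

// Map a degree to one of `steps` color gradations (0 = lightest).
function colorStepFor(deg, maxDeg, steps = 5) {
  if (maxDeg <= 0) return 0;
  return Math.min(steps - 1, Math.floor((deg / maxDeg) * steps));
}

const degree = computeDegrees(links);
const maxDeg = Math.max(...degree.values());
for (const [name, deg] of degree) {
  console.log(name, deg, radiusFor(deg, maxDeg).toFixed(1), colorStepFor(deg, maxDeg));
}
```

The color step index could then be used to pick a shade from a single-hue gradient, with higher indices mapped to darker colors.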
d. [6 points] Pinning nodes (fixing node positions):
1. [2 points] Modify the code so that when you double click a node, it pins the node’s position such that it will not be modified by the graph layout algorithm (note: pinned nodes can still be dragged around by the user but they will remain at their positions otherwise). Node pinning is an effective interaction technique to help users spatially organize nodes during graph exploration.
2. [2 points] Mark pinned nodes to visually distinguish them from unpinned nodes, e.g., pinned nodes are shown in a different color, border thickness or visually annotated with an “asterisk” (*), etc.
3. [2 points] Double clicking a pinned node should unpin (unfreeze) its position and unmark it.
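The pin/unpin behavior in part d can be sketched as a small toggle. d3-force treats a node whose fx and fy properties are set as fixed at that position, and releases it when they are set to null. The `pinned` flag and the function name `togglePin` are hypothetical, introduced here only to drive the visual marking.

```javascript
// Toggle a node between pinned and unpinned.
// fx/fy are the d3-force properties for a fixed position;
// `pinned` is a hypothetical flag used to style the node.
function togglePin(node) {
  if (node.pinned) {
    node.fx = null;
    node.fy = null;
    node.pinned = false;
  } else {
    node.fx = node.x;
    node.fy = node.y;
    node.pinned = true;
  }
  return node;
}

const node = { id: "A", x: 120, y: 80 };
togglePin(node); // pinned at its current position
console.log(node.fx, node.fy, node.pinned);
togglePin(node); // released again
console.log(node.fx, node.fy, node.pinned);
```

In the page, a function like this would typically be called from a double-click handler. Drag handlers commonly clear fx and fy when a drag ends, so that handler would need to leave pinned nodes fixed (updating fx and fy to the drop position instead).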

Figure 2a. Example Visualization
Q2 Deliverables:
The directory structure should be as follows:
Q2/
graph.(html / js / css)
● graph.(html / js / css) - the html file created, and the js / css files if not included in graph.html
Q3 [15 points] Line Charts
Use the dataset[2] provided in the file earthquakes.csv (in the Q3 folder) to create line charts.
Refer to the tutorial for line chart here.
Note: You will create four plots in this question, which should be placed one after the other on a single HTML page, similar to the example image below (Figure 3). Note that your design need NOT be identical to the example.
a. [5 points] Creating line chart. Create a line chart that visualizes the number of earthquakes worldwide from 2000 to 2015 (inclusively), for the four magnitude ranges: ['5_5.9', '6_6.9', '7_7.9', '8.0+']. Use the color scheme provided below for the magnitude ranges. Add a legend at the top right corner of the chart showing the magnitude-color mapping.
● Chart title: Worldwide Earthquake stats 2000-2015
● Horizontal axis label: Year
○ Use scaleTime like you did in HW1Q3
● Vertical axis label: Num of Earthquakes
○ Use linear scale for this part a
● Color scheme: {'5_5.9': '#FFC300', '6_6.9': '#FF5733', '7_7.9': '#C70039', '8.0+': '#900C3F'}
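One way to prepare the data for part a is to turn each magnitude column into its own series, paired with its color. This is a sketch only: the column name `year` is an assumption about earthquakes.csv, the sample counts are made up, and `toSeries` is an illustrative helper name.

```javascript
// Color scheme from the assignment text.
const COLORS = {
  "5_5.9": "#FFC300",
  "6_6.9": "#FF5733",
  "7_7.9": "#C70039",
  "8.0+": "#900C3F",
};

// Made-up rows for illustration; d3.csv yields every field as a string.
const rows = [
  { year: "2000", "5_5.9": "1300", "6_6.9": "140", "7_7.9": "14", "8.0+": "1" },
  { year: "2001", "5_5.9": "1200", "6_6.9": "120", "7_7.9": "15", "8.0+": "1" },
];

// Build one series per magnitude range: [{key, color, points: [{year, count}]}].
function toSeries(rows) {
  return Object.keys(COLORS).map((key) => ({
    key,
    color: COLORS[key],
    points: rows.map((row) => ({
      year: new Date(+row.year, 0, 1), // Date objects suit a time scale
      count: +row[key],                // convert the string to a number
    })),
  }));
}

const series = toSeries(rows);
console.log(series.map((s) => `${s.key}: ${s.points.length} points`));
```

Each series can then be drawn as one line, and the same key/color pairs can feed the legend.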
b. [4 points] Adding symbols and scaling symbol sizes. Create a line chart for this part (append to the HTML page) whose design is a variant of what you have created in part a. Start with your chart from part a. Then modify the code to visualize each data point in the chart as a solid circle, whose size is proportional to “Estimated Deaths”. Use a good scaling coefficient (your choice) to make the chart legible, visually attractive and meaningful. Keep the legend.
● Chart title: Worldwide Earthquake stats 2000-2015 with symbols
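For the symbol sizes in part b, one possible reading of "size" is circle area, in which case the radius grows with the square root of “Estimated Deaths”. The sketch below takes that reading as an assumption; `symbolRadius` and the 12-pixel maximum are illustrative choices, not requirements.

```javascript
// Radius of a data-point circle whose AREA is proportional to `deaths`.
// `maxDeaths` is the largest value in the data; `maxRadius` is a chosen cap.
function symbolRadius(deaths, maxDeaths, maxRadius = 12) {
  if (maxDeaths <= 0) return 0;
  return maxRadius * Math.sqrt(deaths / maxDeaths);
}

// Hypothetical values, only to show the shape of the mapping.
console.log(symbolRadius(0, 100));
console.log(symbolRadius(25, 100));
console.log(symbolRadius(100, 100));
```

If very small circles become hard to see, adding a small minimum radius is a reasonable adjustment, at the cost of strict proportionality.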
■ First chart
○ Chart title: Worldwide Earthquake stats 2000-2015 square root scale
○ This chart uses the square root scale for its vertical axis (only)
○ Other features should be the same as part b.
■ Second chart
○ Chart title: Worldwide Earthquake stats 2000-2015 log scale
○ This chart uses the log scale for its vertical axis (only)
○ Other features should be the same as part b.
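To see how the three vertical scales differ, the sketch below maps the same values through a linear, a square-root, and a log transform onto a pixel range. It is plain JavaScript that mimics the idea behind such scales rather than the D3 implementation; `makeScale`, the domain, and the pixel range are assumptions for illustration. A log scale is undefined at 0, so any zero counts in the data would need special handling.

```javascript
// Map a value from [d0, d1] onto [r0, r1] after applying `transform`.
function makeScale(transform, [d0, d1], [r0, r1]) {
  const t0 = transform(d0);
  const t1 = transform(d1);
  return (v) => r0 + ((transform(v) - t0) / (t1 - t0)) * (r1 - r0);
}

const domain = [1, 10000]; // hypothetical count range; starts at 1 because of the log
const range = [400, 0];    // SVG y grows downward, so the largest value maps to 0

const linear = makeScale((v) => v, domain, range);
const sqrt = makeScale(Math.sqrt, domain, range);
const log = makeScale(Math.log10, domain, range);

for (const v of [1, 100, 10000]) {
  console.log(v, linear(v).toFixed(1), sqrt(v).toFixed(1), log(v).toFixed(1));
}
```

The middle value sits close to the bottom of the chart on the linear scale, a little higher on the square-root scale, and halfway up on the log scale, which is the kind of difference explanation.txt asks you to discuss.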

Figure 3a: Example line chart

Figure 3b: Example line chart with symbols

Figure 3c-1: Example line chart using square root scale

Figure 3c-2: Example line chart using log scale
Q3 Deliverables:
The directory structure should be organized as follows:
Q3/
    earthquakes.csv
    linecharts.(html / js / css)
    linecharts.pdf
    explanation.txt
● earthquakes.csv - the dataset.
● linecharts.(html / js / css) - the html file created, and the js / css files if not included in linecharts.html
● linecharts.pdf - a PDF document showing the screenshots of the four line charts created above (one for Q3.a, one for Q3.b and two for Q3.c). You should print the HTML page as a PDF file, and each PDF page shows one plot (hint: use CSS page break). Clearly title the plots as instructed (see examples in Figure 3).
● explanation.txt - the text file explaining your observations for Q3.c.
Q4 [15 points] Heatmap and Select Box
Example: 2D Histogram, Select Options
Use the dataset provided in earthquakes.csv (in the Q4 folder) that describes the earthquake counts for different states from 2010 to 2015 in the US. Visualize the data using D3 heatmaps.
a. [3 points] Create a file named heatmap.html. Within this file, create a heatmap of the earthquakes for different states from year 2010 to 2015 (inclusively). Place the state name on the heatmap's horizontal axis and the year on its vertical axis.
b. [1 point] A heatmap’s color scheme is a very important design element that has a direct impact on the heatmap’s effectiveness. Colorize the earthquake counts for each state, using a meaningful 9-gradation color gradient of your choice.
d. [6 points] Create a drop down select box with D3 based on the total counts (from 2010 to 2015) of earthquakes of a state. The selections are “0 to 9”, “10 to 99”, “100 to 499”, and “500 or above”. When the user selects a different range in this select box, the heatmap and the legend should both be updated with values corresponding to the selected range. Note the differences in the horizontal axes and legends for “0 to 9” and “500 or above” in Figure 4a and Figure 4b below. While the 9 color gradations in the legend remain the same, the threshold values are different. The default category when the page loads should be “0 to 9”.
e. [2 points] Implement a mouseover effect. When the mouse cursor is on a heatmap cell, the value of that cell will be displayed between the chart title and the heatmap.
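The select box in part d depends on each state's total count across all years. The following sketch shows one way to compute those totals and pick the states for a selected range. The row layout ({state, year, count}) and the state names are made up for illustration, since the actual layout of earthquakes.csv may differ; `totalsByState` and `statesInRange` are hypothetical helper names.

```javascript
// The four select-box options, as [min, max] with an inclusive max.
const RANGES = {
  "0 to 9": [0, 9],
  "10 to 99": [10, 99],
  "100 to 499": [100, 499],
  "500 or above": [500, Infinity],
};

// Made-up rows in an assumed long format.
const rows = [
  { state: "StateA", year: 2010, count: 2 },
  { state: "StateA", year: 2011, count: 3 },
  { state: "StateB", year: 2010, count: 40 },
  { state: "StateB", year: 2011, count: 20 },
  { state: "StateC", year: 2010, count: 300 },
  { state: "StateC", year: 2011, count: 400 },
];

// Sum the counts of every state across all years.
function totalsByState(rows) {
  const totals = new Map();
  for (const { state, count } of rows) {
    totals.set(state, (totals.get(state) || 0) + count);
  }
  return totals;
}

// Names of the states whose total falls inside the selected range.
function statesInRange(rows, label) {
  const [min, max] = RANGES[label];
  return [...totalsByState(rows)]
    .filter(([, total]) => total >= min && total <= max)
    .map(([state]) => state);
}

console.log(statesInRange(rows, "0 to 9"));
console.log(statesInRange(rows, "500 or above"));
```

On a change event from the select box, the returned names would become the new domain of the horizontal axis, and the heatmap cells and legend would be redrawn from the matching rows.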
Note:
1. The earthquake statistics are from USGS, with some modifications.
3. The threshold values should not be hardcoded. They do not necessarily have to match the ones provided in the screenshots below.
The screenshots provided below serve as an example only. You are not expected to produce an exact copy of the screenshots. Please feel free to experiment with fonts, placement, color, etc. as long as the output looks reasonable for a heatmap and meets the functional requirements mentioned above.
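Because the threshold values must not be hardcoded, they have to be derived from whichever counts are currently displayed. One option, shown here as an assumption rather than the required method, is to use quantiles: 9 color gradations need 8 thresholds. The helper names are illustrative; D3's d3.scaleQuantile offers comparable behavior.

```javascript
// q-quantile (0 <= q <= 1) of an ascending-sorted array,
// using linear interpolation between neighboring values.
function quantileSorted(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Derive `steps - 1` thresholds that split the counts into `steps` color bins.
function thresholdsFor(counts, steps = 9) {
  const sorted = [...counts].sort((a, b) => a - b);
  const out = [];
  for (let i = 1; i < steps; i++) {
    out.push(quantileSorted(sorted, i / steps));
  }
  return out;
}

// Index of the color bin (0 .. thresholds.length) that a value falls into.
function binIndex(value, thresholds) {
  let i = 0;
  while (i < thresholds.length && value >= thresholds[i]) i++;
  return i;
}

// Made-up cell counts for the currently selected range.
const counts = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17];
const thresholds = thresholdsFor(counts);
console.log(thresholds.map((t) => t.toFixed(2)));
console.log(binIndex(0, thresholds), binIndex(17, thresholds));
```

Recomputing the thresholds whenever the selected range changes keeps the same 9 colors while letting the legend values follow the data.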

Figure 4a: Counts of earthquakes in the states that have 0-9 earthquakes in total from 2010 to 2015. When the mouse is placed on the grid (Tennessee, 2012), the value of 9 will show up.

Figure 4b: Counts of earthquakes in the states that have 500 or above earthquakes in total from 2010 to 2015. When the mouse is placed on the grid (California, 2014), the value of 191 will show up.
Q4 Deliverables:
The directory structure should look like:
Q4/
    heatmap.(html / js / css)
    earthquakes.csv
● heatmap.(html / js / css) - the html file created, and the js / css files if not included in heatmap.html
● earthquakes.csv - the dataset
Q5 [20 points] Interactive Visualization
Use the dataset state-year-earthquakes.csv provided in the Q5 folder to create an interactive line chart and sub-chart.
This dataset[3] contains the earthquake counts by U.S. state and region, in the years 2010 to 2015 (inclusively). In the data sample below, each row under the header represents a state, its region, year, and count of earthquakes.
state, region, year, count
Hawaii,West,2010,17
Hawaii,West,2011,34
a. [3 points] Create a line chart.
Summarize the data by displaying the count of earthquakes by region for each year. You will need to sum the count of earthquakes by year for all states in their respective regions. Then, display one line for each of the 4 regions in the dataset.
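The summing step can be sketched in plain JavaScript as follows. The two Hawaii rows come from the data sample above; the other rows are made up for illustration, and `sumByRegionYear` is a hypothetical helper name. The sketch assumes year and count have already been converted to numbers.

```javascript
const rows = [
  { state: "Hawaii", region: "West", year: 2010, count: 17 },
  { state: "Hawaii", region: "West", year: 2011, count: 34 },
  // The rows below are made up for illustration.
  { state: "Alaska", region: "West", year: 2010, count: 5 },
  { state: "Texas", region: "South", year: 2010, count: 2 },
];

// Sum counts per region and year.
// Returns [{region, values: [{year, total}]}] with values sorted by year.
function sumByRegionYear(rows) {
  const byRegion = new Map();
  for (const { region, year, count } of rows) {
    if (!byRegion.has(region)) byRegion.set(region, new Map());
    const byYear = byRegion.get(region);
    byYear.set(year, (byYear.get(year) || 0) + count);
  }
  return [...byRegion].map(([region, byYear]) => ({
    region,
    values: [...byYear]
      .map(([year, total]) => ({ year, total }))
      .sort((a, b) => a.year - b.year),
  }));
}

const regions = sumByRegionYear(rows);
console.log(JSON.stringify(regions));
```

Each entry in the result corresponds to one line in the chart, and the largest total across all entries can set the upper end of the vertical axis without hard-coding it.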
Axes: All axes should automatically adjust based on the data. Do not hard-code any values.
- The vertical axis will represent the total count of earthquakes for a region. Display these values using a linear scale.
- The horizontal axis will represent the years. Display these values using a time scale.
b. [3 points] Line styling, legend, and title.
Lines: Each line should use a different color of your choosing to differentiate between regions. Display a dot shape over each data point in the line chart (i.e., a line should have one dot displayed for each year).
Legend: Display a legend on the right-hand portion of the chart that maps the line color to the name of the region.
Title: Display the title “US Earthquakes by Region 2010-2015” at the top of the plot.
The line chart should be similar in appearance to the chart provided in Figure 5b.

Figure 5b. Line chart representing count of earthquakes by year for each region
Interactivity and sub-chart. In the next few parts of this question, you will create event handlers to detect mouseover and mouseout events over each dot shape that you added in Q5.b, so that when hovering over a dot, a horizontal bar chart representing the earthquake count for each state in a region will be shown below the line chart (for the year of that dot). For example, hovering over the dot for the West region in 2011 will display the bar chart for all states in the Western region and their individual earthquake counts in 2011. See Figure 5c for an example.

Figure 5c. Bar chart representing count of earthquakes for the Western region in 2011
c. [5 points] Create a Bar chart
Use a horizontal design for the bar chart, with one bar per state in the selected region. Each bar represents the count of earthquakes for one state in the selected year.
Axes: All axes should automatically adjust based on the data. Do not hard-code any values.
- The vertical axis represents states in a region. The state names should be sorted in ascending order on the vertical axis, where the state with the lowest number of earthquakes is at the bottom and the state with the highest number of earthquakes is at the top.
Note: If a region has multiple states with an equivalent count of earthquakes, then order those state names in ascending alphabetical order. e.g., Alabama, Delaware, and Florida have 0 earthquakes in 2013. They will be ordered as:
...
Florida
Delaware
Alabama
- The horizontal axis represents the count of earthquakes for the selected year. Display these values using a linear scale.
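The ordering rule for the vertical axis can be expressed as a two-key sort: by count ascending, then by state name ascending. The sketch below returns the states in bottom-to-top order. The three zero-count states come from the note above; the fourth row and the helper name `sortBottomToTop` are made up for illustration.

```javascript
// Sort states for the vertical axis, listed from bottom to top:
// lowest count first, ties broken by ascending state name.
function sortBottomToTop(states) {
  return [...states].sort(
    (a, b) => a.count - b.count || a.state.localeCompare(b.state)
  );
}

const states2013 = [
  { state: "Florida", count: 0 },
  { state: "Alabama", count: 0 },
  { state: "Delaware", count: 0 },
  { state: "StateX", count: 5 }, // made-up row with a nonzero count
];

const bottomToTop = sortBottomToTop(states2013).map((d) => d.state);
console.log(bottomToTop);
// Reading the axis from top to bottom is the reverse of this order.
console.log([...bottomToTop].reverse());
```

With a D3 band scale whose range runs from the chart height down to 0, the first name in the domain is placed at the bottom, so the bottom-to-top order can be used directly as the domain.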
d. [3 points] Bar styling and title
- Bars: All bars should have the same color and a fixed bar width.
- Title: Display a title with the format “<Region>ern Region Earthquakes <Year>” at the top of the plot, where <Region> and <Year> are the variables set by hovering over a dot in the line chart. E.g., if displaying earthquakes for the South in 2012, the title would read: “Southern Region Earthquakes 2012”
e. [3 points] Mouseover Event Handling
- The barchart and its title should only be displayed during mouseover events for a dot in the line chart.
- The dot in the line chart should change to a larger size during mouseover to emphasize that it is the selected point.
f. [3 points] Mouseout Event Handling

- The barchart and its title should be hidden from view on mouseout.
- The size of the dot in the line chart should be reset to its original size.
The graph should exhibit interactivity similar to the .gif in Figure 5f.

Figure 5f. Line chart and bar chart demonstrating interactivity
Q5 Deliverables:
The directory structure should be as follows:
Q5/
    interactive.(html/js/css)
    state-year-earthquakes.csv
● interactive.(html/js/css) - The html, javascript, css to render the visualization in Q5.
● state-year-earthquakes.csv - The dataset used to show the information of each state.
Q6 [20 points] Choropleth Map of State Data
Example of choropleth map: Unemployment rates
Use the datasets[4] provided in the files state-earthquakes.csv and states-10m.json (in the Q6 folder) and visualize them as a choropleth map.
● Each record in state-earthquakes.csv represents a state and is of the form
<State,Region,2010,2011,2012,2013,2014,2015,Total Earthquakes>, where
○ State: the name of the state. e.g., Alabama.
○ Region: the region which the state belongs to. e.g., South.
○ 2010,…,2015: the number of earthquakes in that state in 2010, …, 2015, respectively.
○ Total Earthquakes: the total number of earthquakes in that state during 2010-2015 (the numbers of earthquakes in the state-earthquakes.csv file have been slightly modified from the original values and do not represent the official figures).
● The states-10m.json file is a TopoJSON topology containing two geometry collections: states, and nation.
a. [15 points] Create a choropleth map using the provided datasets, use Figure 6 below as reference.
1. [10 points] The color of each state should correspond to the log of total earthquakes in that state (Total Earthquakes field in state-earthquakes.csv). That is, on a log scale, darker colors correspond to higher total earthquakes in that state and lighter colors correspond to lower total earthquakes. Use gradients of only one particular hue. Use promises (part of the d3.v5.min.js file present in the lib directory; there is no need to download or install anything) to easily load data from multiple files into a function. Use topojson (present in the lib folder) to draw the choropleth map.
2. [5 points] Add a vertical legend showing how colors map to the total number of earthquakes. (In the example shown in Figure 6, there are 7 color gradations, but you must use exactly 9 in your submission.)
b. [5 points] Add a tooltip using the d3-tip.min library (in the lib folder). On hovering over a state, the tooltip should show the following information on each line: (1) state name, (2) region, and (3) total earthquakes. The tooltip should appear when the mouse hovers over the state. On mouseout, the tooltip should disappear. Use Figure 6 below as reference. We recommend that you position the tooltip some distance away from the mouse cursor, which will prevent the tooltip from “flickering” as you move the mouse around quickly (the tooltip disappears when your mouse leaves a state and enters the tooltip’s bounding box). Please ensure that the tooltip is fully visible (i.e., not clipped, especially near the page edges).
Note: You must create the tooltip by only using d3-tip.min.js in the lib folder.
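For the coloring in part a.1, the mapping from a state's total to one of 9 gradations on a log scale can be sketched as below. Math.log10(0) is -Infinity, so this sketch shifts every total by 1 before taking the log; that shift, the even spacing of the bins in log space, and the name `logBin` are all assumptions, not requirements of the assignment.

```javascript
// Map a total to one of `steps` bins that are evenly spaced in log10 space.
// Bin 0 is the lightest color and bin `steps - 1` the darkest.
function logBin(total, maxTotal, steps = 9) {
  if (maxTotal <= 0) return 0;
  const t = Math.log10(total + 1) / Math.log10(maxTotal + 1);
  return Math.min(steps - 1, Math.floor(t * steps));
}

// Made-up totals, only to show how the bins spread out.
const maxTotal = 5000;
for (const total of [0, 3, 40, 600, 5000]) {
  console.log(total, logBin(total, maxTotal));
}
```

The bin index would then select one of 9 shades of a single hue, and the same bin boundaries can be reused to label the vertical legend.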

Figure 6. Reference example for Choropleth Maps
Q6 Deliverables:
The directory structure should be organized as follows:
Q6/
    choropleth.(html/js/css)
    state-earthquakes.csv
    states-10m.json
● choropleth.(html /js /css)- The html/js/css file to render the visualization.
● state-earthquakes.csv - The dataset used to show the information of each state.
● states-10m.json - Dataset needed to draw the map.
Q7 [5 points] Pros and Cons of Visualization Tools
This question has two parts. The first part is optional and WILL NOT be graded and the second part is required and WILL be graded.
a. [OPTIONAL - NO points] Line chart using R. Use R to create a line chart that looks the same as the 4th line chart in Q3, i.e., the line chart in Q3c with log scale y-axis.
1. Ease to develop for developers [40 words]
2. Ease to maintain the visualization for developers (e.g., difficulty of the maintenance of the product as the requirements change, the data changes, the hosting platform changes, etc.) [40 words]
3. Usability of visualization developed for end users [40 words]
4. Scalability of visualization to “large” datasets [40 words]
5. System requirements to run the visualization (e.g., browsers, OS, software licensing) for end users [40 words]
Your answer will depend on what you have learned from working through the questions in this assignment, and your personal experience.
Note: Your claims should be well justified, supported with compelling reasons. Simply stating that a tool is better (or worse) than D3 without justifications will receive a low (or no) score.
We recommend formatting your answers as bullet lists for better readability. For example:
1. Ease to develop
R: …
Tableau: …
D3: …
2. Ease to maintain the visualization
R: …
Tableau: …
D3: …
...
Text (e.g., “Ease to develop”, “D3:” above) used mainly for organizing your answers does not count towards the word limit.
Q7 Deliverables:
The directory structure should be as follows:
Q7/
    linechart.jpg (optional)
    analysis.txt
● linechart.jpg - the line chart you created using R (note: this is optional and will not be graded).
● analysis.txt - comparison of R and D3.
Important: folder structure of the zip file that you submit
You are submitting a single zip file HW2-GTUsername.zip (e.g., HW2-jdoe3.zip, where “jdoe3” is your GT username), which must unzip to the following directory structure (i.e., a folder “HW2-jdoe3”, containing folders “Q1”, “Q2”, etc.). The files to be included in each question’s folder have been clearly specified at the end of each question’s problem description above.
HW2-GTUsername/
    lib/
        d3.v5.min.js
        d3-tip.min.js
        d3-scale-chromatic.v1.min.js
        topojson.v2.min.js
        d3-dsv.min.js
        d3-fetch.min.js
    Q1/
        table.png
        barchart.png
        age-distribution.csv
        population.csv
    Q2/
        graph.(html / js / css)
    Q3/
        linecharts.(html / js / css)
        linecharts.pdf
        earthquakes.csv
        explanation.txt
    Q4/
        heatmap.(html / js / css)
        earthquakes.csv
    Q5/
        interactive.(html / js / css)
        state-year-earthquakes.csv
    Q6/
        choropleth.(html / js / css)
        state-earthquakes.csv
        states-10m.json
    Q7/
        linechart.jpg (optional)
        analysis.txt
Version 2

[1] Source: here
[2] Source: USGS https://earthquake.usgs.gov/earthquakes/browse/stats.php
[3] Source: USGS https://earthquake.usgs.gov/earthquakes/browse/stats.php
[4] Source: USGS https://earthquake.usgs.gov/earthquakes/browse/stats.php
