You are asked to do Project-1 (NYC Yellow Taxi Trip Data) using Apache Pig. As in Project-1, we are interested in two attributes of this dataset: trip_distance (the attribute at position 5) and total_amount (the last attribute). First, round the trip distance to an integer (e.g., 2.34 miles becomes 2 miles). Then, for each distinct rounded trip distance that is less than 200 miles, calculate the average total amount paid. Print the results to the output using dump.

In your Pig script, the path of the trip dataset is available as '$T'. That is, you can use LOAD '$T' USING ... to load this dataset.

Steps:
1. Read the input as CSV without a schema.
2. If the first column of a row is VendorID (that is, the row is the header line), skip that row.
3. Otherwise, extract trip_distance and total_amount, round them, and convert them to the appropriate types.
4. Group the rows by rounded trip distance and calculate the average total_amount in each group.
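To check what the Pig script should print, the steps above can be sketched in plain Python. This is only a reference sketch of the computation, not the Pig solution itself. It assumes that "position 5" is 1-based (index 4), that rounding means nearest integer with halves rounded up, and that distances are non-negative. The function name `average_amount_by_distance` and the sample rows are invented for illustration; real rows have more columns.

```python
import csv
import math
from collections import defaultdict

# Made-up sample rows: trip_distance at index 4, total_amount last.
SAMPLE = """VendorID,pickup,dropoff,passenger_count,trip_distance,total_amount
1,t1,t2,1,2.34,10.0
2,t1,t2,1,1.6,14.0
1,t1,t2,2,0.4,5.0
2,t1,t2,1,250.0,900.0
"""

def average_amount_by_distance(lines, max_distance=200):
    """Map each rounded trip distance below max_distance to its average total_amount."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.reader(lines):
        # Skip blank lines and the header row (first column is VendorID).
        if not row or row[0] == "VendorID":
            continue
        # Assumption: position 5 is 1-based, so trip_distance is row[4].
        # Assumption: round to nearest integer, halves up, non-negative input.
        distance = math.floor(float(row[4]) + 0.5)
        amount = float(row[-1])  # total_amount is the last attribute
        if distance < max_distance:
            sums[distance] += amount
            counts[distance] += 1
    return {d: sums[d] / counts[d] for d in sorted(sums)}

result = average_amount_by_distance(SAMPLE.splitlines())
for distance, average in result.items():
    print(f"{distance}\t{average}")
```

On the invented sample, the 250-mile trip is filtered out, the 2.34-mile and 1.6-mile trips both fall into the 2-mile group, and the 0.4-mile trip forms the 0-mile group. A Pig script that follows the same steps should produce the same groups and averages for the same input.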