Big Data-Project 1 Solved

Querying using Hive on Yellow Taxi data

 

Problem statement:

This case study is a real-world example of using Hive on top of Hadoop for exploratory data analysis. It uses a predefined dataset (2018_Yellow_Taxi_Trip_Data.csv) with more than 15 columns and more than 100,000 records. The dataset has the following attributes:

vendor_id string,
pickup_datetime string,
dropoff_datetime string,
passenger_count int,
trip_distance DECIMAL(9,6),
pickup_longitude DECIMAL(9,6),
pickup_latitude DECIMAL(9,6),
rate_code int,
store_and_fwd_flag string,
dropoff_longitude DECIMAL(9,6),
dropoff_latitude DECIMAL(9,6),
payment_type string,
fare_amount DECIMAL(9,6),
extra DECIMAL(9,6),
mta_tax DECIMAL(9,6),
tip_amount DECIMAL(9,6),
tolls_amount DECIMAL(9,6),
total_amount DECIMAL(9,6),
trip_time_in_secs int
 

Perform taxi trip analysis by solving the questions below:

What is the total number of trips (equal to the number of rows)?
What is the total revenue generated by all the trips? The fare is stored in the column total_amount.
What fraction of the total is paid for tolls? The toll is stored in tolls_amount.
What fraction of it is driver tips? The tip is stored in tip_amount.
What is the average trip amount?
What is the average distance of the trips? Distance is stored in the column trip_distance.
How many different payment types are used?
For each payment type, display the following details:
Average fare generated
Average tip
Average tax – tax is stored in column mta_tax
On average, which hour of the day generates the highest revenue?
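
The revenue, per-payment-type and hour-of-day questions above could be approached with queries along the following lines. This is only a sketch: the table name `yellow_taxi` is an assumption, and the hour query assumes `pickup_datetime` is stored as `yyyy-MM-dd HH:mm:ss`, which the source does not confirm.

```sql
-- Assumed table name: yellow_taxi (not given in the source).

-- Total revenue, tip fraction, average trip amount and average distance.
SELECT sum(total_amount)                   AS total_revenue,
       sum(tip_amount) / sum(total_amount) AS tip_fraction,
       avg(total_amount)                   AS avg_trip_amount,
       avg(trip_distance)                  AS avg_distance
FROM yellow_taxi;

-- Average fare, tip and tax for each payment type.
SELECT payment_type,
       avg(fare_amount) AS avg_fare,
       avg(tip_amount)  AS avg_tip,
       avg(mta_tax)     AS avg_tax
FROM yellow_taxi
GROUP BY payment_type;

-- Hour of the day with the highest average revenue per trip.
-- hour() expects a 'yyyy-MM-dd HH:mm:ss' string (assumed format).
SELECT hour(pickup_datetime) AS pickup_hour,
       avg(total_amount)     AS avg_revenue
FROM yellow_taxi
GROUP BY hour(pickup_datetime)
ORDER BY avg_revenue DESC
LIMIT 1;
```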
 

 

Q1) Creating the table:
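
A possible table definition, built from the attribute list in the problem statement. The table name `yellow_taxi`, the HDFS path and the header-skipping property are assumptions made for illustration. Note that DECIMAL(9,6) leaves only three digits before the decimal point, so an amount of 1000 or more would not fit; the column types are kept as given.

```sql
-- Assumed table name: yellow_taxi.
CREATE TABLE IF NOT EXISTS yellow_taxi (
  vendor_id          string,
  pickup_datetime    string,
  dropoff_datetime   string,
  passenger_count    int,
  trip_distance      DECIMAL(9,6),
  pickup_longitude   DECIMAL(9,6),
  pickup_latitude    DECIMAL(9,6),
  rate_code          int,
  store_and_fwd_flag string,
  dropoff_longitude  DECIMAL(9,6),
  dropoff_latitude   DECIMAL(9,6),
  payment_type       string,
  fare_amount        DECIMAL(9,6),
  extra              DECIMAL(9,6),
  mta_tax            DECIMAL(9,6),
  tip_amount         DECIMAL(9,6),
  tolls_amount       DECIMAL(9,6),
  total_amount       DECIMAL(9,6),
  trip_time_in_secs  int
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
-- Assumes the CSV has one header line that should not be loaded as data.
TBLPROPERTIES ('skip.header.line.count'='1');

-- Hypothetical HDFS location of the uploaded file.
LOAD DATA INPATH '/user/hive/input/2018_Yellow_Taxi_Trip_Data.csv'
INTO TABLE yellow_taxi;
```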

 
 

Q2) Finding the number of records:
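
A row count answers this directly. The sketch below assumes the data sits in a table named `yellow_taxi`, a name chosen here for illustration.

```sql
-- Each row is one trip, so the row count is the trip count.
SELECT count(*) AS total_records
FROM yellow_taxi;
```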

 

Q3) Information about the table:
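
Hive can report a table's columns, storage format and location. A minimal example, assuming the table is called `yellow_taxi`:

```sql
-- Column list only.
DESCRIBE yellow_taxi;

-- Columns plus storage details such as location, SerDe and table properties.
DESCRIBE FORMATTED yellow_taxi;
```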

 

 

Q4) Listing the distinct vendors:
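
One way to list the vendors, with `yellow_taxi` as an assumed table name:

```sql
-- One row per vendor present in the data.
SELECT DISTINCT vendor_id
FROM yellow_taxi;
```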

 

 

Q5) Maximum passengers carried by a vendor:
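
The question can be read as the largest passenger count on a single trip for each vendor. Under that reading, and assuming a table named `yellow_taxi`, a sketch is:

```sql
-- Largest passenger_count seen on any one trip, per vendor.
SELECT vendor_id,
       max(passenger_count) AS max_passengers
FROM yellow_taxi
GROUP BY vendor_id;
```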

 

 

Q6) Number of trips made by each vendor:
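
Grouping by vendor and counting rows gives the number of trips per vendor. The table name `yellow_taxi` is assumed:

```sql
-- Number of trips (rows) recorded for each vendor.
SELECT vendor_id,
       count(*) AS trips
FROM yellow_taxi
GROUP BY vendor_id;
```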

 

 

Q7) Maximum tip ever given to a vendor:
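
A single aggregate over the whole table is enough here. Sketch, with `yellow_taxi` as an assumed table name:

```sql
-- Largest single tip across all trips.
SELECT max(tip_amount) AS max_tip
FROM yellow_taxi;
```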

 

 

Q8) Which vendor received the maximum tip, and how much:
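
To return the vendor together with the amount, one option is to take each vendor's largest tip and keep the top row. The table name `yellow_taxi` is an assumption; if two vendors tie, only one of them is returned.

```sql
-- Largest tip per vendor, then keep the vendor with the highest one.
SELECT vendor_id,
       max(tip_amount) AS max_tip
FROM yellow_taxi
GROUP BY vendor_id
ORDER BY max_tip DESC
LIMIT 1;
```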

 

 

Q9) Tolls amount as a fraction of the total amount:
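
The fraction is the sum of tolls divided by the sum of total amounts, which differs from averaging per-trip ratios. A sketch, assuming a table named `yellow_taxi`:

```sql
-- Share of overall revenue that went to tolls.
SELECT sum(tolls_amount) / sum(total_amount) AS tolls_fraction
FROM yellow_taxi;
```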

Q10) Average mta_tax paid:
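
A sketch of the average tax per trip, with the table name `yellow_taxi` assumed:

```sql
-- avg() ignores NULL values, so rows without a tax value are not counted.
SELECT avg(mta_tax) AS avg_mta_tax
FROM yellow_taxi;
```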

 

Q11) Average trip time:

Q12) How many different payment types are used:
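
Average trip time and the number of payment types can each be computed with one aggregate. Both queries below assume a table named `yellow_taxi`.

```sql
-- Average trip duration in seconds.
SELECT avg(trip_time_in_secs) AS avg_trip_time_secs
FROM yellow_taxi;

-- Number of distinct payment types in use.
SELECT count(DISTINCT payment_type) AS payment_type_count
FROM yellow_taxi;
```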

 

Q13) Average passenger count: 
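
A possible query, assuming the table is named `yellow_taxi`:

```sql
-- Mean number of passengers per trip.
SELECT avg(passenger_count) AS avg_passengers
FROM yellow_taxi;
```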

 

Q14) Maximum time spent by a passenger on a trip:
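
Taking the longest trip duration as the answer, a sketch with the assumed table name `yellow_taxi` is:

```sql
-- Longest trip duration in seconds.
SELECT max(trip_time_in_secs) AS max_trip_time_secs
FROM yellow_taxi;
```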

 

Q15) Distinct rate codes:
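
The rate codes present in the data can be listed as follows, again assuming a table named `yellow_taxi`:

```sql
-- One row per rate code, sorted for readability.
SELECT DISTINCT rate_code
FROM yellow_taxi
ORDER BY rate_code;
```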
