← Back to Datasets

U.S. flights data - Sample big dataset

Explore a large sample dataset of all U.S. flights in 2018. The dataset includes one row for every U.S. flight for a total of 7.2 million rows. Columns include departing and arriving airports, carrier, scheduled departure, minutes delayed, taxi time, and cancelled. Explore for free in a Row Zero spreadsheet. Open U.S. Flights Dataset



Dataset Summary

View Spreadsheet

This spreadsheet includes 7.2 million rows of U.S. flight data from 2018. There is one row for each U.S. flight. U.S. flights dataset Columns include Date, Carrier Code, Carrier, Flight Number, Origin, Destination, Scheduled Departure, Departure Time, Delay, Taxi Out, Wheels Off, Wheels On, Taxi In, Cancelled.

The workbook contains 3 sheets:

  • Dataset (all raw data)
  • Example analysis
  • U.S. Carrier Lookup, which maps U.S. airlines to their carrier code
  • Airport Code Lookup, which maps U.S. airports to their IATA codes

Highlights from the Dataset:

Busiest U.S. airports by domestic flight volume:

  1. ATL - Atlanta Hartsfield International Airport - 390,046 flights
  2. ORD - Chicago, O'Hare International Airport - 332,953 flights
  3. DFW - Dallas/Fort Worth International Airport - 279,298 flights
  4. DEN - Denver International Airport - 235,989 flights
  5. CLT - Charlotte/Douglas International Airport - 233,317 flights

Days with the most flights

Sunday after Thanksgiving was the day with the most flights. The next busiest travel days were Fridays in the summer - the last 3 Fridays in July and 1st Friday in August were 2 through 5 in most flights.

Days with the least flights

Thanksgiving day was the least busiest travel day followed by Saturdays in mid-January to mid-February.

Days with the most flight delays

Holiday travel stood out with the most flight delays. The worst day to travel was 12/28 with 42% of flights delayed 10+ mins and 12/21 and 12/27 were close behind with 39% of flights delayed 10+ min.

Largest U.S. Airlines by Flights:

  1. Southwest - 1,353,552 flights
  2. Delta - 949,283 flights
  3. American - 916,818 flights
  4. SkyWest - 774,137 flights
  5. United - 621,565

Airport with the most direct flights:

ORD Chicago, O'Hare with 174 direct flight destinations

Example Analysis

View the Spreadsheet
We can quickly create a few pivot tables and charts to analyze this dataset.

First, we'll create a pivot table to look at total flights, delays, and cancellations by airport. We can look at delays in a few ways - mean, median, 90th percentile (gives us the 10% worst delays) using the pivot table calculations. We can also use XLOOKUP to join with another pivot table that filters to flights delayed 10+ mins. Next, we can add calculated columns onto our pivot table to see the percent of flights delayed and cancelled at U.S. airports. pivot table of flights

We can insert a chart to better visualize the data. U.S. flight delays by airport

Next we can look at cancels and delays by departure time to see that flights are more likely to be delayed and cancelled later in the day, whereas morning flights have the fewest delays and cancellations. flight delays by departure time

We can also look at U.S. flight delays by airline. Here we see that the median flight delay is typically less than zero, meaning the typical flight is on time. However, we can use a pivot table to calculate percentile values, to show a more interesting story. Looking at the 90th percentile delay for each airline (p90 of Delay) shows the 10% worst flight delays for each airline. Here you can see that it's not uncommon to have flight delays over 30 mins. U.S. flight delays by airline

We can use a unique Row Zero feature to analyze the busiest days to travel and worst days to travel. Row Zero pivot tables support date/time aggregation for rows, so you can summarize pivot table data by day, week, month, quarter as well as hour, minute, second. Below we aggregate all of the flights by day and sort to see the days with the most flights and days with the most flight delays. pivot table group by date

Finally, we can use another unique Row Zero feature, pivot table count unique, to quickly see the number of direct flight destinations out of each airport: number of direct flights by airport in U.S.

View the Spreadsheet