Tagged: GTFS

To get data for “a day in the life of Metrorail,” I needed a way to filter the GTFS (Google Transit Feed Specification) data down to a single day.

The sheer size of WMATA’s GTFS data can make it hard to work with. The largest file is stop_times.txt, at 1,451,295 lines: one for each scheduled stop on each trip, plus the header. My project to filter only Metrorail data reduced that to 80,186 lines (see Filtering Metrorail Data from WMATA GTFS). The size could be reduced further by keeping the trips for only a single day.

Each trip in trips.txt is linked to its calendar information through the service_id field. The GTFS specification recommends keeping the regular service schedule in calendar.txt, but for some reason WMATA uses the alternate method of listing every service date in calendar_dates.txt, a file normally used only for exceptions. This file’s data looks like this: » Continue Reading…
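
The single-day filter described above can be sketched in a few lines of Python. This is only a rough illustration, not the code from the full post: the function names are hypothetical, the sample rows are invented for the example and are not WMATA data, and it assumes every service date appears in calendar_dates.txt with an exception_type of 1 ("service added"), as the GTFS specification defines for feeds that skip calendar.txt.

```python
import csv
import io


def service_ids_on(calendar_dates_file, date):
    """Return the service_ids that calendar_dates.txt marks as running on `date` (YYYYMMDD)."""
    return {
        row["service_id"]
        for row in csv.DictReader(calendar_dates_file)
        # exception_type 1 means "service added for this date" in GTFS
        if row["date"] == date and row["exception_type"] == "1"
    }


def trip_ids_for_services(trips_file, service_ids):
    """Return the trip_ids in trips.txt whose service_id is in `service_ids`."""
    return {
        row["trip_id"]
        for row in csv.DictReader(trips_file)
        if row["service_id"] in service_ids
    }


def filter_stop_times(stop_times_file, trip_ids):
    """Return the stop_times.txt rows that belong to the given trips."""
    return [row for row in csv.DictReader(stop_times_file) if row["trip_id"] in trip_ids]


# Invented sample rows, for illustration only (not taken from the WMATA feed).
CALENDAR_DATES = """service_id,date,exception_type
1,20090601,1
2,20090606,1
1,20090602,1
"""
TRIPS = """route_id,service_id,trip_id
RED,1,T1
RED,2,T2
BLUE,1,T3
"""
STOP_TIMES = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,05:00:00,05:00:00,S1,1
T1,05:03:00,05:03:00,S2,2
T2,07:00:00,07:00:00,S1,1
T3,05:10:00,05:10:00,S9,1
"""

services = service_ids_on(io.StringIO(CALENDAR_DATES), "20090601")
trips = trip_ids_for_services(io.StringIO(TRIPS), services)
day_stop_times = filter_stop_times(io.StringIO(STOP_TIMES), trips)
```

With the real feed, the `io.StringIO` objects would be replaced by the opened calendar_dates.txt, trips.txt, and stop_times.txt files. Because stop_times.txt is read row by row, the full file never has to fit in memory at once.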

WMATA’s Google Transit Feed Specification (GTFS) data includes Metrobus, Metrorail, and the DC Circulator. If you want to analyze only Metrorail data, the task is daunting, because the sheer volume of bus data overwhelms the files. So, I wanted to create smaller versions of the relevant files just for the rail system.

WMATA released its GTFS transit data in March 2009. The data is available on WMATA’s Developer Resources page. Once you sign the license agreement, you have immediate access to download a 16 MB zip file, which contains nine text files in the “comma-separated values” (CSV) format.

My goal was to get stop information (time and place) for each trip on Metrorail. The time information is kept in the GTFS file stop_times.txt, while location information is kept in stops.txt. They are linked by the stop_id field. But neither file has data that tells you whether a stop is for a bus or a subway train. That information is kept in routes.txt, which doesn’t directly connect to stops.txt or stop_times.txt. Instead, trips.txt ties each route to its trips (by route_id), and stop_times.txt ties each trip to its stops (by trip_id). Phew! » Continue Reading…
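
The chain of links described above (routes.txt to trips.txt to stop_times.txt to stops.txt) can be sketched as a series of lookups in Python. Treat this as a rough sketch under assumptions rather than the method from the full post: the function name and the sample rows are invented for illustration, and it assumes the feed marks Metrorail routes with route_type 1, the value the GTFS specification assigns to subway and metro service.

```python
import csv
import io

SUBWAY = "1"  # GTFS route_type for subway/metro; assumed to cover Metrorail


def rail_stop_events(routes_file, trips_file, stop_times_file, stops_file):
    """Return time-and-place records for every stop on every subway trip."""
    # routes.txt: which routes are rail?
    rail_routes = {
        r["route_id"] for r in csv.DictReader(routes_file) if r["route_type"] == SUBWAY
    }
    # trips.txt: which trips run on those routes?
    rail_trips = {
        t["trip_id"] for t in csv.DictReader(trips_file) if t["route_id"] in rail_routes
    }
    # stops.txt: look up each stop's name and location by stop_id
    stops = {s["stop_id"]: s for s in csv.DictReader(stops_file)}

    events = []
    for st in csv.DictReader(stop_times_file):
        if st["trip_id"] in rail_trips:
            stop = stops[st["stop_id"]]
            events.append({
                "trip_id": st["trip_id"],
                "arrival_time": st["arrival_time"],
                "stop_name": stop["stop_name"],
                "stop_lat": stop["stop_lat"],
                "stop_lon": stop["stop_lon"],
            })
    return events


# Invented sample rows, for illustration only (not taken from the WMATA feed).
ROUTES = """route_id,route_short_name,route_type
R1,Rail Line,1
B1,Bus Line,3
"""
TRIPS = """route_id,service_id,trip_id
R1,1,T1
B1,1,T2
"""
STOP_TIMES = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,05:00:00,05:00:00,S1,1
T1,05:03:00,05:03:00,S2,2
T2,05:01:00,05:01:00,S3,1
"""
STOPS = """stop_id,stop_name,stop_lat,stop_lon
S1,Station A,38.90,-77.03
S2,Station B,38.91,-77.04
S3,Bus Stop C,38.92,-77.05
"""

events = rail_stop_events(
    io.StringIO(ROUTES), io.StringIO(TRIPS), io.StringIO(STOP_TIMES), io.StringIO(STOPS)
)
```

Only the small files (routes.txt, trips.txt, stops.txt) are held in memory as sets or dictionaries; the large stop_times.txt is streamed one row at a time, which matters when most of its rows are bus data.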