How It Works

Data sources

Two sources, stitched together.

Socrata (historical)

MTA Open Data has every service alert since April 2020, more than 8,400 for the G alone. The catch is that the MTA publishes monthly, so the dataset typically runs 4-8 weeks behind.

GTFS-RT (live)

A server polls the MTA's real-time feed every two minutes and persists alerts to disk. Your browser also pushes any alerts it sees directly to the server. Once Socrata catches up on a given period, its records take over from the persisted alerts.

The merge

Everything on this site combines three layers drawn from those two sources: Socrata for history, persisted GTFS-RT alerts for the recent gap, and the live GTFS-RT feed for right now. Days past the last known record show as "no data" instead of being assumed good.
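A minimal sketch of how such a merge could work, assuming each layer has already been reduced to a mapping from calendar date to a list of alerts. The function name `merge_layers`, the `NO_DATA` marker, and the rule that Socrata wins up to its latest date are illustrative assumptions, not the site's actual code.

```python
from datetime import date, timedelta

NO_DATA = "no data"  # assumed marker for days with no record in any layer


def merge_layers(socrata, persisted, live, start, end):
    """Merge three {date: [alerts]} mappings into one day-by-day view.

    Assumption: Socrata is authoritative up to its latest date, persisted
    and live alerts fill the gap after that, and anything later is unknown.
    """
    socrata_end = max(socrata) if socrata else None
    known_end = max(
        (d for layer in (socrata, persisted, live) for d in layer),
        default=None,
    )
    merged = {}
    day = start
    while day <= end:
        if socrata_end is not None and day <= socrata_end:
            # Inside Socrata's coverage: a missing date means no alerts.
            merged[day] = socrata.get(day, [])
        elif known_end is not None and day <= known_end:
            # Recent gap: combine persisted and live alerts.
            merged[day] = persisted.get(day, []) + live.get(day, [])
        else:
            # Past the last known record: do not assume a good day.
            merged[day] = NO_DATA
        day += timedelta(days=1)
    return merged
```

The key design choice is the last branch: a day with no record anywhere is marked unknown rather than counted as an empty, alert-free day.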

Reading the numbers

Day breakdown

The percentages on the G's own card — Full Service, Delays, Planned, Suspended — are factual: what share of days fell into each bucket. A day takes the worst single status that occurred.
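The bucketing can be sketched as follows. The ordering from best to worst in `SEVERITY_ORDER` is an assumption (the page only says a day takes its worst status), and the function names are hypothetical.

```python
from collections import Counter

# Assumed ordering from best to worst; the page does not spell it out.
SEVERITY_ORDER = ["Full Service", "Planned", "Delays", "Suspended"]


def day_status(statuses):
    """Return the worst status seen on one day (Full Service if none)."""
    if not statuses:
        return "Full Service"
    return max(statuses, key=SEVERITY_ORDER.index)


def day_breakdown(days):
    """Map each bucket to its share of days, as a percentage.

    days: {day: [statuses seen that day]}, covering only days with data.
    """
    total = len(days)
    if total == 0:
        return {}
    counts = Counter(day_status(s) for s in days.values())
    return {bucket: 100 * counts[bucket] / total for bucket in SEVERITY_ORDER}
```

Because each day lands in exactly one bucket, the four percentages always sum to 100.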

Reliability score

The percentage in the Service Report (G vs 7 vs L) is different: a severity-weighted average where each status contributes a specific weight. The calculation is described under "The reliability score" at the bottom of this page. The two numbers measure different things and won't always agree.

Streak

Consecutive days without a suspension or delay. Planned work counts as running. A suspension or delay resets it.
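As a rough sketch, the streak is a count of trailing days whose worst status is neither a suspension nor a delay. The status labels reuse the card's bucket names and the function name is hypothetical.

```python
def current_streak(day_statuses):
    """Count trailing days with no suspension or delay.

    day_statuses: each day's worst status, oldest first.
    Planned work does not break the streak.
    """
    streak = 0
    for status in reversed(day_statuses):
        if status in ("Suspended", "Delays"):
            break  # a suspension or delay resets the streak
        streak += 1
    return streak
```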

Alert counts

One day can have multiple alerts (morning delay, evening planned work), so alert counts are higher than day counts.

The reliability score

The Service Report compares the G against the 7 and L using a single percentage per line: the average daily reliability over the chosen window.

How it's calculated

Pull every alert for the line from MTA Open Data over the window. For each day, find the worst single status that occurred and convert it to a severity weight. Day reliability = 1 − severity. The score is the average daily reliability across all days with data.

A clean day is 1.00. A fully suspended day is 0. A day with both delays and part-suspended takes the higher severity.
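The calculation above can be sketched like this. The weights in `SEVERITY` are placeholders chosen only so the example runs; the page does not list the real values, and the status identifiers other than `delays` and `part-suspended` are assumed names.

```python
# Hypothetical severity weights; the real values are not given on this page.
SEVERITY = {
    "full-service": 0.0,
    "delays": 0.3,
    "part-suspended": 0.6,
    "suspended": 1.0,
}


def day_reliability(statuses):
    """Return 1 minus the highest severity among the day's statuses."""
    worst = max((SEVERITY[s] for s in statuses), default=0.0)
    return 1.0 - worst


def reliability_score(days):
    """Average day reliability over days with data, or None if there are none.

    days: one list of statuses per day that has data.
    """
    if not days:
        return None
    return sum(day_reliability(s) for s in days) / len(days)
```

A day with both `delays` and `part-suspended` is scored from `part-suspended` alone, since only the highest severity counts.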

Notes

For combined statuses like stops-skipped | delays, the score takes the worst component.

Planned and unplanned weigh the same: the train either ran or it didn't.

Full closures get a severity weight above 1.00 because a full-day shutdown costs riders more than a single day's worth of normal service can make up for, especially on a line like the G with no parallel alternative.

Days after the latest data drop are excluded so the gap doesn't pad the score with implicit good days.

Source is MTA Open Data only, so every line uses the same dataset for a fair fight.
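Two of the notes above lend themselves to a short sketch: splitting a combined status on `|` to take its worst component, and dropping days after the latest data drop. The helper names are hypothetical, and the weights passed in are made up for illustration.

```python
from datetime import date


def worst_component(status, weights):
    """Split a combined status such as 'stops-skipped | delays' on '|'
    and return the highest severity weight among its parts."""
    parts = [p.strip() for p in status.split("|")]
    return max(weights[p] for p in parts)


def days_with_data(day_severity, last_data_day):
    """Keep only days on or before the latest data drop, so the gap
    is never scored as a run of good days."""
    return {d: s for d, s in day_severity.items() if d <= last_data_day}
```

For example, with made-up weights `{"stops-skipped": 0.2, "delays": 0.3}`, the combined status `stops-skipped | delays` would be scored with the `delays` weight.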