Public transit ETAs are often fantastically wrong. Is there a more reliable way to predict the bus?

Transit’s in-app prediction engine now tells you if your ride is running late or early — with up to 80% more accuracy than official agency data.

October 28, 2021

No matter which app you use to look up the times for public transit — at one point you’ve headed to the stop, happy as a two-tailed corgi, with plenty of time to spare before the next departure. Yet mid-walk, you’ve caught sight of something strange, out there in the distance: your bus? sailing off into the horizon?? without you on board???

“That’s just perfect,” you say.
“I’ve been lied to! Lied to again!”

The sense of betrayal — the rottweiler fury — that results from a bad transit ETA is something we’ve all experienced. But it’s not your transit agency’s fault.

Even though transit agencies are the ones supplying predictions to apps like Google Maps and Transit, “predicting” ETAs isn’t a brainless game of point and shoot  —  extrapolating from where your bus is, and adding x-minutes. Think about it. Could you foresee the blocked intersection? The hockey game that went into overtime, holding up the buses? The hailstorm? The stroller fiesta that required the driver to use the ramp at ten different stops? Neither could your transit agency.

Until now, public transit apps (and signs) all had one source for their departure times: the transit agency.

Public transit has a certain degree of baked-in uncertainty. Buses fight with traffic. Trains have to wait for other trains. 

And when an app says your bus is running “seven minutes behind schedule” — it’s actually understating the complexity of the situation. Can your bus make up that seven-minute differential? Or will that “seven minutes” turn into a ten-minute delay?

Your transit agency usually has reliable data about where its vehicles are right now. But anything could happen between that moment, and its arrival at your stop. When an agency’s prediction isn’t quite right? Users blame whatever app they’re using — often, it’s Transit.

So we scoured the earth for better prediction sources. We were happy to pay any price for better data if it helped our riders! But eventually we realized there was only one, very ugly, way out of our prediction predicament.

We were gonna have to do it ourselves.

Better predictions =
more data + machine power

Back in 2019 we made our first foray into the wild west of bus predictions. Instead of being a conduit for agency predictions, we were going to predict the ETAs ourselves. Our engineers looked at the weakness of existing ETA prediction techniques, and sought to create a better model that could adapt to changing road conditions.

We used the STM (the agency in Montreal, our hometown) as the guinea pig for our new prediction model.

First, we cross-checked the accuracy of every historical ETA prediction that we had on file. Every stop, for every transit line. We asked questions like:

  1. At 8:30am, what was the predicted ETA for bus x to arrive at stop y, and when did that bus actually arrive at the stop?
  2. ☝️… same thing, but at 8:31am?
  3. ☝️… same thing, but at 8:32am?
  4.  Times infinity.
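The cross-check above boils down to lining up each displayed ETA with what really happened. Here is a minimal sketch of that bookkeeping. The `Snapshot` record, the `hm` helper, and all the numbers are made up for illustration; they are not our actual data format.

```python
from dataclasses import dataclass


def hm(hour: int, minute: int) -> int:
    """Clock time as seconds since midnight."""
    return hour * 3600 + minute * 60


@dataclass
class Snapshot:
    shown_at: int           # when the ETA was displayed
    predicted_arrival: int  # arrival time the ETA implied
    actual_arrival: int     # when the bus really reached the stop


def signed_error(s: Snapshot) -> int:
    """Seconds of error: negative means the bus beat the prediction."""
    return s.actual_arrival - s.predicted_arrival


# Hypothetical numbers for one bus at one stop, re-checked every minute.
snapshots = [
    Snapshot(hm(8, 30), hm(8, 40), hm(8, 42)),
    Snapshot(hm(8, 31), hm(8, 41), hm(8, 42)),
    Snapshot(hm(8, 32), hm(8, 43), hm(8, 42)),
]
errors = [signed_error(s) for s in snapshots]
print(errors)
```

Keeping the sign of the error matters for everything that follows, since an early bus and a late bus are not equally painful.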

To assess what a “good” prediction was, we first set up some benchmarks: if you looked up a transit departure far into the future, it was ok if our ETA prediction was off by a few minutes give-or-take.

But we were much stricter with ETA predictions as the bus got closer to your stop — because you can’t react to a last-minute change to your bus’s ETA if your bus is already speeding past you! 😵

We penalize “early” arrivals more heavily, since you’ll miss your bus. A “late” arrival just means you wait a little longer.

If a bus was “three minutes away” (according to us) but it arrived a minute sooner than what we showed you, you could have missed it 🚌💨🏃‍♀️💨 — so we put it in the “bad prediction” bucket. That way our model could learn from its mistakes.

But if the bus arrived less than a minute later than we predicted, we considered it an acceptable prediction. Because you still made the bus!
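Those rules can be sketched as a tiny pass/fail check. The one-minute late tolerance comes from the text above; the 30-second early tolerance, the 10-minute cutoff for "far into the future", and the 3-minute far-out tolerance are assumptions picked for illustration, not the real thresholds.

```python
NEAR_LATE_TOLERANCE_S = 60    # from the text: under a minute late is fine
NEAR_EARLY_TOLERANCE_S = 30   # assumption: early buses get less slack
FAR_HORIZON_S = 10 * 60       # assumption: what counts as "far into the future"
FAR_TOLERANCE_S = 3 * 60      # assumption: "a few minutes give-or-take"


def is_acceptable(horizon_s: float, error_s: float) -> bool:
    """horizon_s: how far away the bus was predicted to be.
    error_s: actual minus predicted arrival (negative = bus came early)."""
    if horizon_s > FAR_HORIZON_S:
        return abs(error_s) <= FAR_TOLERANCE_S
    if error_s < 0:
        return -error_s < NEAR_EARLY_TOLERANCE_S
    return error_s < NEAR_LATE_TOLERANCE_S


# "Three minutes away", but the bus showed up a minute sooner: bad bucket.
print(is_acceptable(180, -60))
# Same horizon, bus was 45 seconds late: you still made it.
print(is_acceptable(180, 45))
```

The asymmetry is the whole point: the same 60 seconds of error lands in a different bucket depending on its sign.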

We also penalized ETA predictions that were jittery. If your vehicle was predicted to be 5 minutes away, then 3, then 8, then 3, then 10, we tried to smooth out those jitters.
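One way to put a number on "jittery" is to compare each ETA update against a calm countdown. This is a sketch under assumptions: the `jitter` function is hypothetical, and it assumes the ETA is re-read once a minute, so a well-behaved countdown drops by one minute per reading.

```python
def jitter(etas_min, minutes_between_readings=1.0):
    """Mean absolute deviation from a steady countdown, in minutes."""
    deviations = [
        abs((later - earlier) + minutes_between_readings)
        for earlier, later in zip(etas_min, etas_min[1:])
    ]
    return sum(deviations) / len(deviations) if deviations else 0.0


print(jitter([5, 3, 8, 3, 10]))  # the jumpy sequence from the text
print(jitter([5, 4, 3, 2, 1]))   # a calm countdown
```

A score like this can be added to the error being minimized, so a model that bounces around pays for it even when its final guess is right.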

With these benchmarks in place, we took all that historical transit data and started processing it. 

We started out with a really rickety formula (a few variables, not much precision) and asked the computer to minimize the difference between “the predicted time the bus shows up” and “the actual time it showed up”.
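For a feel of what "asking the computer to minimize the difference" means, here is a one-variable least-squares fit in plain Python. It is a stand-in, not our formula: the single input (scheduled travel time) and the sample numbers are assumptions chosen to keep the example small.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Hypothetical data: scheduled vs. observed travel time, in minutes.
scheduled = [1, 2, 3, 4]
observed = [3, 5, 7, 9]
slope, intercept = fit_line(scheduled, observed)
print(slope, intercept)
```

A real model would have many more inputs and an error measure that treats early and late arrivals differently, but the loop is the same: guess, measure the miss, adjust.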

Revision after revision, our prediction formula was getting more and more complicated — and accurate. We found in most cases our prediction model was better than transit agency models (because ours uses real-time road conditions, not just real-time bus locations) at predicting ETAs for downstream stops.

Machines vs. humans 🤖🧠

It was time to export our fancy prediction model from Montreal, to all our other cities. Except for one thing. Traffic patterns are different in every city — and even within a single city they’re constantly in flux.

While we technically could export Montreal’s prediction model to other cities, it would stop working if there was the slightest change to the transit network. Semi-permanent road detour? New speed limit on a street used by buses? Our model stopped working.

For Montreal, we’d succeeded in building a high-performing prediction model  —  but it was a local solution, and required manual intervention to keep prediction quality high. It wasn’t scalable!

To bring that excellence to other cities, we had to invent Autotraining™.

Next up: scaling from 1 to 25+

How could we build a system that could supervise itself: a model that could identify when it was underperforming, and adapt to new cities and street conditions?

When long-term road conditions change, they reduce our model’s accuracy. So we set up bumper rails to assess the performance of our model in each city. If we catch performance slipping? We spin up a bunch of virtual machines to retrain that city’s model, to better fit its recent ETAs. It’s a model that can adapt to different local conditions, no babysitting required.
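The bumper rails can be pictured as a simple tripwire on recent error. In this sketch, `needs_retraining`, the baseline error, and the 25% slack are all hypothetical; the real checks and thresholds are not spelled out here.

```python
from statistics import mean


def needs_retraining(recent_abs_errors_s, baseline_mae_s, slack=1.25):
    """True when recent mean absolute error drifts past the baseline plus slack."""
    if not recent_abs_errors_s:
        return False  # no fresh data, nothing to judge
    return mean(recent_abs_errors_s) > baseline_mae_s * slack


# Hypothetical city: the model used to miss by about 40 seconds on average.
print(needs_retraining([50, 70, 90], baseline_mae_s=40))
print(needs_retraining([40, 45], baseline_mae_s=40))
```

When the check trips, that city's model gets refit on its recent ETAs; otherwise it is left alone.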

We also introduced a “recency” override to address short-term changes in road conditions (like snowstorms or gridlock). When we have transit data from the last few hours, which is often but not always, our model uses that data in lieu of historical predictions.

We keep track of bus travel times on a spreadsheet that looks like a Mario Kart leaderboard.

Our “recency” model gathers all the relevant stop-to-stop travel times and spits out a prediction using a weighted average — where the recorded travel times from more recent trips carry more weight.
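Here is a minimal sketch of a recency-weighted average with a fallback to the historical value. The exponential decay, the 30-minute half-life, and both function names are assumptions for illustration; the text only says that more recent trips carry more weight.

```python
def recency_weighted_travel_time(observations, half_life_min=30.0):
    """observations: list of (age_minutes, travel_seconds) pairs.
    Returns None when there is no recent data to average."""
    if not observations:
        return None
    weights = [0.5 ** (age / half_life_min) for age, _ in observations]
    total = sum(w * t for w, (_, t) in zip(weights, observations))
    return total / sum(weights)


def travel_time(observations, historical_seconds):
    """Prefer the recency estimate; fall back to the historical one."""
    recent = recency_weighted_travel_time(observations)
    return historical_seconds if recent is None else recent


# A trip just now took 100 s; one half an hour ago took 200 s.
print(travel_time([(0, 100), (30, 200)], historical_seconds=180))
# No recent trips on this segment: use the historical number.
print(travel_time([], historical_seconds=180))
```

With these made-up numbers the estimate lands closer to the fresh 100-second trip than a plain average would.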

By marrying our historical predictions with recency-minded ones, we had a way of predicting ETAs that were consistently more accurate than those supplied to us from transit agencies — and which appear in Google Maps et al. (Some exceptions apply: when our brilliant friends Swiftly are hired by an agency to handle departure times, we don’t go through the trouble of setting up our model — we let Swiftly do their thing.)

But what about those nettlesome outliers: the bus lines whose ETAs were better predicted by the transit agency than by our own model? 🧐

These outliers gave us night tremors: we wanted a perfect model, one that gave our users better predictions for EVERYTHING. Not just 95% of transit lines. We realized, though, that debugging the last 5% of transit predictions would take a long while. So we made the choice: rather than dogmatically showing our ETA predictions for every line, we’d just show the best predictions we had. Sometimes they come from us, sometimes from your agency. And if we ever find a better prediction source? We’ll use that instead! Because as much as we love clever engineering solutions — ensuring riders have the best data is all that really matters.
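Picking "the best predictions we had" can be as plain as comparing recent error per source, line by line. This is a sketch under assumptions: the source names, the error samples, and the use of mean absolute error as the yardstick are all hypothetical.

```python
from statistics import mean


def best_source(abs_errors_by_source):
    """Name of the source with the lowest mean absolute error on a line."""
    return min(
        abs_errors_by_source,
        key=lambda name: mean(abs_errors_by_source[name]),
    )


# Hypothetical error samples (seconds) for two lines.
typical_line = {"transit": [30, 40], "agency": [60, 50]}
outlier_line = {"transit": [90, 80], "agency": [40, 50]}
print(best_source(typical_line))
print(best_source(outlier_line))
```

Because the function does not care how many sources it is handed, a new prediction source just becomes another key in the dictionary.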

Putting lipstick on the machine 💋

We’d reached a point where our model didn’t require hourly supervision and manual overrides to maintain accuracy.

It was time to launch! But also, it was time to inform our users what we had done.

To give our riders more trust when they ride, we now show a 🔮 on any trip where our prediction adds more than 20% accuracy, compared to their transit agency’s prediction (these are the predictions that show up in Google Maps, etc.)
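The 🔮 rule can be sketched as a threshold on relative improvement. How "accuracy" is measured is not spelled out here, so this example assumes it means a lower mean absolute error than the agency's; the function name and the numbers are hypothetical.

```python
CRYSTAL_BALL_THRESHOLD = 0.20  # from the text: more than 20% better


def shows_crystal_ball(our_mae_s, agency_mae_s):
    """True when our error is more than 20% lower than the agency's."""
    if agency_mae_s <= 0:
        return False  # agency is already perfect, nothing to improve on
    improvement = (agency_mae_s - our_mae_s) / agency_mae_s
    return improvement > CRYSTAL_BALL_THRESHOLD


print(shows_crystal_ball(our_mae_s=60, agency_mae_s=100))
print(shows_crystal_ball(our_mae_s=90, agency_mae_s=100))
```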

Our model is optimized for system-wide performance. Very rarely, our prediction model will underperform the agency’s own predictions on a particular line. Usually it’s one of three things: buses idled for an overly long time at the first stop, OR a bus spent an overly long time picking up passengers at a stop midway through its journey, OR we weren’t getting good real-time bus locations from the transit agency.

Now with our historical prediction engine  —  augmented by recency data —  we can more reliably guess when your vehicle will show up. To get any more precise, we’d need to operate the actual bus/train hardware that sends out GPS signals!

Of course, in an ideal world with better transit service, being able to boast about “50% more accurate predictions” would be a moot point — because buses and trains would depart so frequently that missing one would cost you a minute or two at most.

Who cares if your 8:31 bus came two minutes early, if another one came at 8:33?

Sadly in most places, that’s not the case. Many of our riders live in cities where a missed bus or train can lengthen their commute by an hour or more, cost them a hundred dollars in taxi fare, or worse. When riders kept telling us “the ETAs in your app are lies, damn lies!” — rather than pass the buck, blaming our data source, we saw an opportunity to do something golden.

No more lies. No more damn lies. Just pure, sweet, statistics. 😎

Try our new, machine-improved™ departure times. Now live in 25+ cities.
Download Transit on iOS and Android.