by WAYNE WATANABE and MARK HALLENBECK*
There has been considerable concern expressed about the declining reliability of bus arrival information provided by OneBusAway (OBA) over the past seven months. As the managers on both ends of the data stream, we’d like to provide a little more insight into why those errors are happening and what is being done.
Last summer, OBA and the other tracking applications appeared to be working reasonably well. But users probably weren’t aware of how much work it took behind the scenes to make that happen. Metro and University of Washington researchers led by Dr. Dan Dailey had 15 years of data debugging invested in the stream going out to developers like Brian Ferris, who created OBA. On the receiving end, Brian was making adjustments to the data and code to make OBA present the most accurate predicted-arrival information possible.
The data and its presentation in the tracking programs have never been 100 percent accurate, but the errors coming from Metro’s legacy system using Automatic Vehicle Location (AVL) were well understood, allowing for better correction and filtering by both Metro and the app developers.
Then, last October, several things happened:
- Metro had a huge service change that created large, multiple data updates. After the service change took place, some corrections were needed. It was hard for app developers to keep up with all the updates.
- More Metro buses were converting to the new GPS-based On Board Systems (OBS), which required Metro and OBA to handle data streams from two different vehicle-location systems simultaneously.
- The OBS system was providing more data, but it had to work in tandem with the old systems and that created more places where inconsistencies and data errors could happen.
- The legacy system had been repeatedly fine-tuned over the years. The OBS data stream has not yet had that same long-term rigorous analysis.
- Accurate bus tracking depends on both scheduled data that defines where and when the bus should be at a location, and “real-time” data that reports where that bus actually is right now. The only one that Metro controls is the scheduled data. With the new system, Metro had to manipulate the scheduled data, in ways we did not expect, to create a format that OBS could accept. That led to more opportunities for data errors, and the need for more corrections and updates. Now, we’re going back over those manipulations to see if they are contributing to some of the problems we’ve been seeing over the past seven months.
- Over at OBA, there was a transition in support and operations as Brian left the UW, and a partnership between the transit agencies and the university was created to keep OBA running.
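To make the scheduled-versus-real-time point concrete, here is a minimal sketch of how a tracking app might combine the two. It is an illustration only, not OBA’s or Metro’s actual code: the class names, the `predict_arrival` function, and the five-minute staleness cutoff are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass
class ScheduledStop:
    """Scheduled data: when the bus should be at a stop (hypothetical shape)."""
    stop_id: str
    scheduled_arrival: datetime


@dataclass
class VehicleReport:
    """Real-time data: how far off schedule the bus was when it last reported."""
    reported_at: datetime
    deviation_seconds: int  # positive = running late, negative = running early


def predict_arrival(
    stop: ScheduledStop,
    report: Optional[VehicleReport],
    now: datetime,
    max_report_age: timedelta = timedelta(minutes=5),  # assumed cutoff
) -> Tuple[datetime, bool]:
    """Return (predicted arrival, whether real-time data was used)."""
    # With no report, or a stale one, fall back to the schedule alone.
    if report is None or now - report.reported_at > max_report_age:
        return stop.scheduled_arrival, False
    # Otherwise shift the scheduled time by the reported deviation.
    shifted = stop.scheduled_arrival + timedelta(seconds=report.deviation_seconds)
    return shifted, True
```

Note that an error in either input produces a wrong prediction: a bad scheduled time shifts every arrival for that stop, and a bad deviation report shifts every stop on that trip.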
The problems are not only showing up on OBA; Metro’s Tracker app is also affected. We believe there are three key places where the errors are popping up: the new bus hardware; the data manipulation in the middle; and how the applications handle the output from the two bus-location data feeds.
Metro and OBA staff have been talking and meeting a lot over the past several months. We have identified and fixed a variety of errors in the process already, but we haven’t finished identifying all of the errors. As a result, we’ve prioritized the creation of an error-trapping process that lets us take the errors we (and others) observe and trace them backwards through all of the current data manipulation steps to find and fix the cause. It’s tedious, painstaking work, but necessary while we sort out the complex data path.
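As a rough illustration of what tracing an error backwards can look like, the sketch below assumes a copy of each record is saved after every manipulation step. The function `trace_back`, the stage names, and the validity check are all hypothetical; the real process involves far more steps and checks than this.

```python
from typing import Any, Callable, List, Optional, Tuple


def trace_back(
    snapshots: List[Tuple[str, Any]],
    is_valid: Callable[[Any], bool],
) -> Optional[str]:
    """Find the stage where a bad record first went wrong.

    `snapshots` holds (stage_name, record) pairs in pipeline order.
    Starting from the final output, walk back toward the input until a
    valid snapshot is found; the stage just after it introduced the error.
    Returns None if the final output is already valid.
    """
    culprit = None
    for stage_name, record in reversed(snapshots):
        if is_valid(record):
            break
        culprit = stage_name
    return culprit


def stop_times_in_order(record: dict) -> bool:
    """Example check: stop times along a trip must never go backwards."""
    times = record["stop_times"]
    return all(a <= b for a, b in zip(times, times[1:]))


# Hypothetical snapshots of one trip at three made-up stages.
snapshots = [
    ("schedule_export", {"stop_times": [0, 60, 120]}),
    ("obs_reformat", {"stop_times": [0, 120, 60]}),
    ("feed_merge", {"stop_times": [0, 120, 60]}),
]
print(trace_back(snapshots, stop_times_in_order))  # obs_reformat
```

In this made-up case the merged feed looks wrong to the rider, but the walk backwards shows the record was already bad one step earlier.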
Metro is also starting a data remodeling project. The same data model has been used for 30 years for everything from the creation of paper timetables to the current automated systems. Over the years, new products like Trip Planner, different vendors, and changing systems have been additions to Metro’s “data house.” It’s time to rebuild that house.
Take a step back and look at all the factors. It’s not the same, but think about what happens when you try to digitize your dad’s (or grandpa’s) Led Zeppelin album to play on your iPhone. When you convert a classic vinyl LP into a digital format, it’s no good without noise correction. It can be a tricky balance that you can’t correct until you actually listen to all the songs. Leave in all the clicks and pops, and the static is louder than the music. If you eliminate too much, you weaken the integrity of the analog signal. And, if you are too brutal with the noise reduction, you can lose the signal entirely.
Right now, OBA is catching all the clicks and pops in Metro’s data streams and reporting them as information about buses on the road. With more data streaming in from Metro’s new OBS, there is “noise” we never expected – and now we are trying to track it, filter it, and improve the accuracy of tracking information for public use. We assure you that everyone involved is very serious about finding solutions to fix the problems.
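The clicks-and-pops trade-off can be shown with a small example. The sketch below uses a rolling median, which is simply a common noise filter chosen for illustration and not necessarily what Metro or OBA uses; the readings are invented numbers standing in for how many seconds late a bus reports itself to be.

```python
from statistics import median
from typing import List


def rolling_median(values: List[float], window: int) -> List[float]:
    """Replace each reading with the median of the readings around it.

    `window` is how many neighboring readings to consider (odd number).
    Near the ends of the list the window is simply cut short.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


# A single bad reading (900) among otherwise steady ones: a "click".
glitch = [60, 62, 900, 64, 66]
# A genuine delay that lasts for two readings: part of the "music".
real_delay = [0, 0, 0, 300, 300, 0, 0, 0, 0]

print(rolling_median(glitch, 1))      # no filtering: the 900 is passed along
print(rolling_median(glitch, 3))      # light filtering: the 900 is gone
print(rolling_median(real_delay, 3))  # light filtering: the delay survives
print(rolling_median(real_delay, 5))  # heavy filtering: the delay is erased
```

The same filter setting that removes more noise also starts to remove real delays, which is why the right setting can only be found by checking the filtered output against what buses actually did.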
Problems delivering real-time transit info are not just happening here in King County. There’s a lot of research and reporting about how difficult deployment of these new bus systems is, and what havoc they can create with tracking apps. If you want to read old articles about this, a good place to start is Washington, DC’s NextBus implementation. The agencies that seem to have fewer problems are those that spend a lot of money with one vendor on a single system. Metro is integrating as much and as fast as it can, but it’s working with multiple systems and vendors, and with dwindling dollars for its entire budget.
While Metro has assigned resources to make fixing the tracker apps a top priority, not everything can be done at once. We have already started the work and are seeing some successes. For example, on April 11 Metro replaced its old General Transit Feed Specification (GTFS) feed and saw some problems resolve that same day. UW staff will be looking at ways that OBA might be modified to prevent the propagation of data errors.
It won’t be an overnight fix. We know there are multiple challenges, and the fixes will be ongoing, as they always have been. Each service change, every reroute, and all the traffic disruptions mean there is no rest for the programmers on either end of the data feed.
*Mr. Watanabe is the IT Service Delivery Manager for the King County Dept. of Transportation. Dr. Hallenbeck is Director of the Washington State Transportation Center.