Comments (6)
I just looked into this a bit. Adding some notes that could assist with future work:
On the specific place where CSV path output fetches IDs: Currently PathResult.summarizeIterations() calls RouteSequence.detailsWithGtfsIds(), which in turn calls TransitLayer.routeString() and TransitLayer.stopString(). The routeString is from a RouteInfo, which does not include the feed ID. On the other hand, the stopString is taken from stopIdForIndex, which does include the feed ID (see TransitLayer.loadFromGtfs around L209), but that feed ID is being removed in the method.
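To illustrate what "removing the feed ID" from a scoped stop ID can look like, here is a minimal sketch. It assumes scoped IDs take the form feedId:stopId with a colon separator, which is an assumption and is not confirmed in this thread; the class and method names are hypothetical and are not R5 code.

```java
// Sketch only: assumes scoped IDs look like "feedId:entityId" (an assumption).
final class ScopedStopIdSketch {

    /** Returns the part before the first colon, or "" if the ID is unscoped. */
    public static String feedIdOf(String scopedId) {
        int sep = scopedId.indexOf(':');
        return sep < 0 ? "" : scopedId.substring(0, sep);
    }

    /** Returns the part after the first colon, or the whole string if unscoped. */
    public static String entityIdOf(String scopedId) {
        int sep = scopedId.indexOf(':');
        return sep < 0 ? scopedId : scopedId.substring(sep + 1);
    }

    public static void main(String[] args) {
        String scoped = "feedA:stop_1"; // made-up example value
        System.out.println(feedIdOf(scoped));   // feedA
        System.out.println(entityIdOf(scoped)); // stop_1
    }
}
```

Splitting on the first separator only means a stop ID that itself contains a colon is left intact. Keeping the feed ID would then just be a matter of returning both halves instead of discarding the first.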
More generally on which IDs are available within R5: The core internal data model for public transit is rooted at TransitLayer, which retains gtfs-lib Stop objects in the stopForIndex field. These objects have a feed ID field. On the other hand, in TransitLayer.routes we are storing RouteInfo objects (which do not have a feed ID) rather than gtfs-lib Route objects (which do have the feed ID). Around TransitLayer L292 where we construct the TripPattern, the Route is converted to a RouteInfo and discarded. However, the feed ID is included in the route ID in each newly constructed TripPattern (all trips grouped together under a TripPattern come from the same Route).
Considering all that, it would be relatively straightforward to include the feed ID in CSV output, though the question arises of whether this should always happen or it should be configurable, as it changes the existing format and adds some repetitive noise to the output from the perspective of users with only a single feed. The ID is most readily available on the stops (not the routes), but references to stops are always from routes in the same GTFS feed so we need only one feed ID per routeId / boardStopId / alightStopId triple. The structure of the CSV rows as parallel pipe-separated arrays means it should be possible to just add another pipe-separated feedIds array in a supplemental column.
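A rough sketch of the supplemental-column idea, to make the row shape concrete. Everything here is hypothetical: the Leg holder, the toCells method, and the column order are illustration only and do not reflect the actual R5 CSV writer.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: none of these names exist in R5.
final class FeedIdColumnSketch {

    /** Hypothetical holder for one transit leg of a path. */
    static final class Leg {
        final String feedId, routeId, boardStopId, alightStopId;

        Leg(String feedId, String routeId, String boardStopId, String alightStopId) {
            this.feedId = feedId;
            this.routeId = routeId;
            this.boardStopId = boardStopId;
            this.alightStopId = alightStopId;
        }
    }

    /** Builds four parallel pipe-separated cells: routes, board stops, alight stops, feed IDs. */
    public static String[] toCells(List<Leg> legs) {
        List<String> routes = new ArrayList<>();
        List<String> boards = new ArrayList<>();
        List<String> alights = new ArrayList<>();
        List<String> feeds = new ArrayList<>();
        for (Leg leg : legs) {
            routes.add(leg.routeId);
            boards.add(leg.boardStopId);
            alights.add(leg.alightStopId);
            feeds.add(leg.feedId); // one feed ID per route / board / alight triple
        }
        return new String[] {
            String.join("|", routes),
            String.join("|", boards),
            String.join("|", alights),
            String.join("|", feeds)
        };
    }

    public static void main(String[] args) {
        List<Leg> legs = List.of(
            new Leg("feedA", "r10", "s1", "s2"),
            new Leg("feedB", "r7", "s9", "s4"));
        System.out.println(String.join(",", toCells(legs)));
        // r10|r7,s1|s9,s2|s4,feedA|feedB
    }
}
```

Because the new cell is appended as its own column, the existing three columns keep their current content, and whether to emit the extra column could be left configurable.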
As for the nature of the feed IDs and how they are set: It is possible for feeds to specify their own identifier in the feed_id column of feed_info.txt, but Conveyal overrides this ID with a random unique identifier (BundleController.java:198). Conveyal often needs to handle several different versions of the same feed from different times or with modifications applied, and it's useful in many places for the feed IDs within the application to match the unique upload IDs rather than a feed-specified feed ID that could collide across several uploaded feeds.
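A small sketch of why a per-upload random ID avoids collisions that a feed-declared ID would cause. This is not the code in BundleController; it uses java.util.UUID as a stand-in, and the class name, method name, and map are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Sketch only: a stand-in for the idea, not the BundleController logic.
final class UploadIdSketch {

    /** Registers an upload under a fresh random ID, ignoring the feed's self-declared feed_id. */
    public static String register(Map<String, String> declaredIdByUploadId, String declaredFeedId) {
        String uploadId = UUID.randomUUID().toString();
        declaredIdByUploadId.put(uploadId, declaredFeedId);
        return uploadId;
    }

    public static void main(String[] args) {
        Map<String, String> feeds = new HashMap<>();
        // Two versions of the same feed, both declaring feed_id "metro" (made-up value).
        String first = register(feeds, "metro");
        String second = register(feeds, "metro");
        System.out.println(feeds.size());         // both uploads are retained
        System.out.println(first.equals(second)); // the upload IDs differ
    }
}
```

Had the map been keyed by the declared feed_id instead, the second upload would have overwritten the first.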
This is related to issue #909, where the system also depends on entities having unique IDs that won't collide across feeds. It can't reliably fall back on user-specified IDs there either, so other fields (the route short and long names) are under consideration.
So, in the CSV output would it be most useful to see:
- The random unique GTFS file upload ID used in Conveyal's internal database
- The ID of the agency operating the route
- The name of the agency operating the route
- The feed_publisher_name, feed_publisher_url, or feed_id from feed_info
- Some other user-specified string injected into the feed file or specified at upload to Conveyal
The first three should be more straightforward to retrieve and add as a new column to the CSV. The last two would be a bit more tricky as we'd need to retain them in the Bundle or TransportNetwork or TransitLayer, but they seem possible in principle.
from r5.
Thanks @abyrd!
From your list, I think "Some other user-specified string injected into the feed file or specified at upload to Conveyal" would be ideal for our needs.
That said, we could totally work with "The random unique GTFS file upload ID used in Conveyal's internal database", especially if that would be a lot faster to get up and running.
Hey @abyrd @ansoncfit
Just checking in to see if you have had any more time to think about this. Having this capability would be a huge help for us.
Happy to jump on a call sometime if that would help.
Thanks!
Hi @edasmalchi,
Apologies for the delayed reply, I'm still catching up on things after TRB preparations/travel.
We should have a prototype ready for you to try next week, based on the Conveyal-generated UUID. For initial testing, you can grab these ids by using DevTools to inspect network requests to the https://analysis.conveyal.com/api/db/bundles endpoint.
More soon!
> references to stops are always from routes in the same GTFS feed so we need only one feed ID per routeId / boardStopId / alightStopId triple
One wrinkle: with a reroute modification, a route from feed x can actually reference stops from feed y.