Proposal: Add GTFS-NewShapes as experimental #272

ericouyang · 2021-05-10T18:28:16Z

Background

Detours are a major part of a dynamic urban environment. Deviations from the original route path can occur in a wide range of situations, including planned events and traffic incidents.

Today, GTFS-RT already lets a producer mark a stop as SKIPPED when a detoured route will no longer visit it. Additionally, an Alert can be created with effect=DETOUR.

While these can communicate some of the impact of detours, neither approach communicates the new shape of the route, which passenger applications need in order to display the path and assist in rider navigation.

Proposal

This proposal builds on a recent addition (#221) that supports specifying trip-level properties in real time, and adds shape_id as a supported field. This can reference an existing shape in the static GTFS from shapes.txt or a new Shape specified in real time via an encoded polyline.

Both the gtfs-realtime.proto file and documentation have been updated.

This pull request is a subset of the GTFS-ServiceChanges v3.1 spec:
https://bit.ly/gtfs-service-changes-v3_1

This proposal builds on prior art from @lionel-nj and @barbeau (see MobilityData#47)

gcamp · 2021-05-10T18:34:32Z

I think it would be worthwhile to not duplicate exactly what is in GTFS for the NewShape paradigm. Sending all that information that way is far from compact for a real-time system. I would suggest sending an encoded polyline, but I'm open to any alternatives in the same vein.

skinkie · 2021-05-10T18:40:15Z

I agree with @gcamp. What is the reason you want to build a topological shape?

barbeau · 2021-05-10T20:46:23Z

Encoded polylines are definitely a more compact way to represent lines, but they only encode lat/lon (to my knowledge). This means we'd still need a way to represent shape_dist_traveled and shape_pt_sequence if we want to retain all the information from static GTFS.

We could consider dropping shape_pt_sequence as this information should be implicit in the encoded polyline.

shape_dist_traveled is optional and could be a separate array (repeated) of float values, with a requirement that, if provided, its length matches the number of points in the line. Although that may be prone to error.
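The length requirement suggested here is cheap to check on the consumer side. The following is a minimal sketch, assuming the polyline has already been decoded into a list of points. `validate_dist_traveled` is a hypothetical helper, not part of any GTFS-realtime library, and the non-decreasing check is an assumption borrowed from how shape_dist_traveled behaves in static GTFS.

```python
def validate_dist_traveled(points, shape_dist_traveled):
    """Return a list of problems; an empty list means the arrays are consistent.

    points: decoded (lat, lon) pairs of the shape.
    shape_dist_traveled: optional parallel list of cumulative distances.
    """
    problems = []
    if not shape_dist_traveled:
        # The field is optional, so an empty array is not an error.
        return problems
    if len(shape_dist_traveled) != len(points):
        problems.append(
            f"expected {len(points)} distances, got {len(shape_dist_traveled)}"
        )
    # Assumption: distances are cumulative, so they should never decrease.
    for i in range(1, len(shape_dist_traveled)):
        if shape_dist_traveled[i] < shape_dist_traveled[i - 1]:
            problems.append(f"distance decreases at index {i}")
    return problems
```

A consumer could choose to ignore shape_dist_traveled entirely whenever this returns a non-empty list, rather than rejecting the whole shape.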

skinkie · 2021-05-10T21:34:42Z

> Encoded polylines are definitely a more compact way to represent lines, but they only encode lat/lon (to my knowledge). This means we'd still need a way to represent shape_dist_traveled and shape_pt_sequence if we want to retain all the information from static GTFS.

But this information should be exchanged "once" to introduce a new shape. Given the current data exchange, it will be exchanged every time. Has any thought been given to how to detect which data is actually new and should be added to the system, as opposed to updating everything that comes in with each fetch?

> shape_dist_traveled is optional and could be a separate array (repeated) of float values, with a requirement that, if provided, its length matches the number of points in the line. Although that may be prone to error.

I could argue that representing this information as a column store will result in a more compact representation. Hence the repeated approach. Within the format the number of elements is known, and it will have the exact same effect if one of the values in the row based format is 'forgotten'.

ericouyang · 2021-06-11T20:51:38Z

Thanks all for the feedback on this!

@gcamp - I'd love to get some more context in terms of how the payload size impacts Transit App. Does the GTFS-realtime pb feed get transmitted as-is to the end clients, hence needing to be sensitive from a mobile data consumption perspective? If so, to @skinkie's point, does your system today repeat exchange of largely static information that doesn't change very frequently, like GTFS-realtime Alerts, which probably would have similar update characteristics?

As suggested, we ran an experiment on our side to see the impact of encoding on file size, and found the encoded polylines to be much smaller:

| Agency | Unencoded NewShapes size (original proposal) | Encoded NewShapes size (alternate proposal) | Example size of TripUpdates, for scale |
| --- | --- | --- | --- |
| Agency 1 (~150 routes) | 108KB | 24KB | ~1.5MB |
| Agency 2 (~120 routes) | 84KB | 20KB | ~850KB |
| Agency 3 (~110 routes) | 200KB | 36KB | ~250KB |
| Agency 4 (~70 routes) | 60KB | 12KB | ~200KB |

Our methodology here was to randomly generate a new shape, where each point was 100-200m away from the previous point, for 25% of all routes, as an approximation of a more extreme situation where a lot of routes are on detour.

Given that the sizes are much smaller, I'm open to updating this proposal accordingly. I'd love to hear more from other folks on this, particularly other producers & consumers on the tradeoff here between filesize, consistency with GTFS Static, and interpretability.

For reference, here's a snippet of what the .proto would look like for this alternate representation:

message Shape {
  // Identifier of the shape. Must be different than any shape_id defined in the (CSV) GTFS.
  // NOTE: This field is still experimental, and subject to change. It may be formally adopted in the future.
  required string shape_id = 1;

  // Encoded polyline representation of the shape. This polyline must contain at least two points.
  // NOTE: This field is still experimental, and subject to change. It may be formally adopted in the future.
  required string encoded_points = 2;

  // Optional list of actual cumulative distances traveled along the shape to each point.
  // See definition of shapes.shape_dist_traveled in (CSV) GTFS.
  // NOTE: This field is still experimental, and subject to change. It may be formally adopted in the future.
  repeated float shape_dist_traveled = 3;

  // The extensions namespace allows 3rd-party developers to extend the
  // GTFS Realtime Specification in order to add and evaluate new features and
  // modifications to the spec.
  extensions 1000 to 1999;

  // The following extension IDs are reserved for private use by any organization.
  extensions 9000 to 9999;
}
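For readers unfamiliar with the encoding, here is a minimal sketch of how a producer might fill encoded_points. It assumes the field uses Google's Encoded Polyline Algorithm Format at its usual precision of five decimal places. `encode_polyline` and `_encode_value` are illustrative names, not part of the proposal or of any official binding.

```python
def _encode_value(delta: int) -> str:
    """Encode one signed integer delta as a run of printable ASCII characters."""
    # Shift left by one bit; invert negative values so the sign ends up in bit 0.
    value = ~(delta << 1) if delta < 0 else (delta << 1)
    chunks = []
    # Emit 5-bit groups, least significant first; 0x20 marks "more groups follow".
    while value >= 0x20:
        chunks.append(chr((0x20 | (value & 0x1F)) + 63))
        value >>= 5
    chunks.append(chr(value + 63))
    return "".join(chunks)


def encode_polyline(points, precision=5):
    """Encode a sequence of (lat, lon) pairs in degrees as an encoded polyline."""
    factor = 10 ** precision
    encoded = []
    prev_lat = prev_lon = 0
    for lat, lon in points:
        # Coordinates are scaled to integers, then stored as deltas from the
        # previous point. Note: Python's round() rounds exact halves to even.
        lat_i = round(lat * factor)
        lon_i = round(lon * factor)
        encoded.append(_encode_value(lat_i - prev_lat))
        encoded.append(_encode_value(lon_i - prev_lon))
        prev_lat, prev_lon = lat_i, lon_i
    return "".join(encoded)
```

For example, `encode_polyline([(38.5, -120.2), (40.7, -120.95), (43.252, -126.453)])` yields the string `_p~iF~ps|U_ulLnnqC_mqNvxq`@`, which is the worked example from Google's documentation of the format.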

gcamp · 2021-07-12T15:08:59Z

To me encoded polyline is a 👍. The only downside is that encoded polyline is not lossless and some precision is lost during encoding. Not to the level that will be noticed by a user, but it might make some automated tests harder to design.
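To make the testing concern concrete: assuming the usual precision of five decimal places, each coordinate is rounded to the nearest 0.00001 degree, so a decoded point can differ from the original by up to 0.000005 degrees. A test harness can compare with a tolerance instead of exact equality. The sketch below models only the rounding step; `quantize` and `points_match` are hypothetical helper names.

```python
def quantize(coord, precision=5):
    """Model the rounding an encode/decode round trip applies to one coordinate."""
    factor = 10 ** precision
    return round(coord * factor) / factor


def points_match(original, decoded, precision=5):
    """Compare two point lists, allowing for the rounding lost in encoding."""
    # Half of the last kept decimal place, plus a small floating-point margin.
    tolerance = 0.5 * 10 ** -precision + 1e-9
    if len(original) != len(decoded):
        return False
    return all(
        abs(lat_a - lat_b) <= tolerance and abs(lon_a - lon_b) <= tolerance
        for (lat_a, lon_a), (lat_b, lon_b) in zip(original, decoded)
    )
```

With this, a round-trip test asserts `points_match(original, decoded)` rather than `original == decoded`, which would fail for any coordinate carrying more than five decimal places.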

Consistency with GTFS is a non-issue for me as this is a trivial conversion.

botanize · 2021-07-28T17:26:33Z

It looks like the current version of this pull-request defines a message Shape that includes an encoded polyline in the proto file, but the reference.md file documents a ShapePoint message that includes the individual coordinate pairs.

botanize · 2021-07-28T17:49:38Z

Changing or setting shape_id in TripUpdates works fine for last-minute changes, but doesn't work well for near-term changes. By limiting the application of NewShapes to TripUpdates, it's impossible to apply NewShapes to trips beginning the next service day through the next GTFS-static update, up to a week away. This gap exists because TripUpdates only apply to the current service day and most consumers will only consume GTFS-static weekly, or require multiple days to ingest the feed.

The ability to apply near-term service changes is critical. We often know a day or two ahead of a detour, and want to provide the best information to customers as soon as possible. For example, we know there will be a detour affecting many of our routes tomorrow and we'd like to show trip plans for tomorrow with correct detour routing using a new shape. But even with this proposal the best we can do is show an incorrect routing for tomorrow's trip and add a service alert. We don't like to rely on service alerts because people either don't read them, are overwhelmed by the number of alerts, or don't understand the impact of them the way they do a visualization of a detour (speaking for myself here).

To meet our needs for near-term detour communication, we could add another message to this proposal which, like Alerts, uses time ranges and selectors to apply the new shape defined in Shapes to one or more current or future trips. Trips that are currently active in TripUpdates could use the proposed mechanism (the TripProperties.shape_id field), or both, with the contents of TripUpdates taking precedence over the Alerts-style selector message.

ericouyang · 2021-08-17T00:05:22Z

Great catch, @botanize! I've updated the reference file now to reflect the updated proposal.

That's a great point about near-to-mid-term changes. I can't find the conversation now, but I believe there was some previous alignment in the community around GTFS-ServiceChanges representing things for up to the next 7 days. Anything that's longer should instead look to be reflected in Static GTFS.

I think the way in which one would do this is by creating a TripUpdate entity where TripDescriptor.start_date would be for a future start date. I think doing something like this rather than adding a new message reduces potential ambiguity in terms of how to represent it.

gcamp · 2021-08-17T20:12:59Z

I would add a link to the encoded polyline doc so it's clear what is being returned.

ericouyang · 2021-08-17T22:15:18Z

I've called for a vote for adoption of this experimental feature on the Google Group (https://groups.google.com/g/gtfs-realtime/c/YWY9IoMQF7g?pli=1).

Please vote with a +1 (in favor) or -1 (against) before Wednesday, Aug 25th at 23:59:59 UTC. Thanks!

ericouyang · 2021-08-17T22:19:12Z

Thanks for the catch, @gcamp. It was already in the .proto file but wasn't in the reference file. I've updated that so they're both consistent, and also updated the description of the PR with a link to that resource.

juanborre

Reading the whole conversation, it is unclear to me why the dist_traveled field was dropped.

Although that information is not useful if all we want to do is to draw a line on a map, it is definitely useful for tools like a trip planner or a predictive model in order to calculate travel times.

Do we rely on the consumer calculating straight line distances based on the GPS coordinates of the polyline?

@ericouyang Could you provide more details about that decision? 🙏

gtfs-realtime/proto/gtfs-realtime.proto

ericouyang · 2021-08-19T13:44:54Z

> Reading the whole conversation, it is unclear to me why the dist_traveled field was dropped.

> Although that information is not useful if all we want to do is to draw a line on a map, it is definitely useful for tools like a trip planner or a predictive model in order to calculate travel times.

> Do we rely on the consumer calculating straight line distances based on the GPS coordinates of the polyline?

> @ericouyang Could you provide more details about that decision? 🙏

@juanborre - Good question! I removed it in the spirit of following the guiding principles to avoid speculative features and on the presumption that it's largely redundant information from the polyline itself. As a producer, if we were to populate this field, we would end up calculating the straight line distances anyways to pass along, which I would guess that trip planners already need to do today since shape_dist_traveled is optional in shapes.txt.
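As an illustration of the straight-line calculation mentioned here, the sketch below derives cumulative distances from decoded polyline points using the haversine formula. It is one plausible approach under stated assumptions, not something the proposal prescribes: the function names are hypothetical, the Earth is treated as a sphere with an assumed mean radius, and the output is in meters.

```python
import math

# Assumption: spherical Earth with the mean radius in meters.
EARTH_RADIUS_M = 6371008.8


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def cumulative_distances(points):
    """Return the running distance along the shape at each point, starting at 0."""
    total = 0.0
    distances = []
    for i, point in enumerate(points):
        if i > 0:
            total += haversine_m(points[i - 1], point)
        distances.append(total)
    return distances
```

Because the result has one value per point, it lines up with what a shape_dist_traveled array would have carried, at the cost of each consumer recomputing it.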

colemccarren · 2021-08-23T14:40:12Z

+1 RTA Maryland 👏

lauramatson · 2021-08-24T13:57:03Z

> I think the way in which one would do this is by creating a TripUpdate entity where TripDescriptor.start_date would be for a future start date. I think doing something like this rather than adding a new message reduces potential ambiguity in terms of how to represent it.

I think this would work for us as a producer, but I want to make sure it wouldn't cause any issues for consumers. During summer construction & event season, the same trip may show up 5 times because there would be a different shape / combination of detours each weekday. Is it safe to assume consumers will be able to process the same trip id showing up multiple times as long as the start_date clarifies each unique trip instance?
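One way a consumer might handle the same trip_id appearing several times is to key trip instances by the pair (trip_id, start_date). The sketch below is an assumption about consumer behavior, not something this thread confirms: it uses plain dictionaries as stand-ins for decoded TripUpdate messages, and `index_trip_updates` and `shape_for` are hypothetical names.

```python
def index_trip_updates(trip_updates):
    """Index TripUpdate-like dicts by (trip_id, start_date).

    Each dict is a stand-in for a decoded TripUpdate, with a "trip" entry
    (TripDescriptor fields) and an optional "trip_properties" entry.
    """
    by_instance = {}
    for update in trip_updates:
        trip = update["trip"]
        by_instance[(trip["trip_id"], trip["start_date"])] = update
    return by_instance


def shape_for(by_instance, trip_id, service_date, default_shape_id):
    """Pick the shape for one trip instance, falling back to the static shape."""
    update = by_instance.get((trip_id, service_date))
    if update is None:
        return default_shape_id
    return update.get("trip_properties", {}).get("shape_id", default_shape_id)
```

Under this scheme, five weekday detours for one trip become five separate entries, and a service date without a matching update keeps the shape from static GTFS.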

gcamp · 2021-08-25T14:23:55Z

+1 Transit

paulswartz · 2021-08-25T15:04:05Z

+1 @mbta

ericouyang · 2021-08-26T04:59:34Z

Adoption of this proposal as experimental has been accepted, with 3 votes in favor and 0 votes opposed. Thanks so much to everyone for your input into this and excited to see this improving rider experiences!

scmcca · 2021-08-26T15:02:24Z

I noticed the voting period was extended by 1 day and that 2/3 votes happened outside of the original period that you announced on the Google Changes mailing list (ending on August 24th at 23:59:59 UTC). While article 7.4 of the Specification Amendment Process states:

> If the advocate continues the work on proposal then a new vote can be called for at any point in time.

I think the quiet extension of a voting period invalidates those last 2/3 votes.

Normally the practice is that another 7 day block must follow between content changes and vote recalls (albeit this is not strictly codified in the SAP, it should be!), especially if they are made during a voting period (article 6.2 allows only editorial changes to be made).

Let me know if I'm missing some context!

ericouyang · 2021-08-27T04:26:06Z

Thanks for flagging that, @scmcca! I had intended for the voting period to be slightly longer than the minimum requirement, but accidentally wrote "Wednesday, Aug 24th at 23:59:59 UTC" instead of "Wednesday, Aug 25th at 23:59:59 UTC" in the original announcement. Apologies for creating confusion there.

As there were only editorial changes made during this time period, it is our understanding that all votes are valid in support of this proposal.

Due to the incorrect original date, we'll keep this PR open until the end of the week, and then the intent is for the change to be merged as an experimental addition.

alesk1978 · 2023-03-13T17:33:16Z

Just want to make sure I understand correctly: the ShapePoint message that was in the proposal document is not used anymore, and we only use the encoded polyline, correct?

google-cla bot added the cla: yes label May 10, 2021

Add Shapes to GTFS-RT

8d8fdfc

ericouyang force-pushed the gtfs-servicechanges-v3.1-newshapes branch from e12fb3a to 8d8fdfc on May 10, 2021 18:30

Change to using encoded polylines

9f17895

ericouyang added 2 commits August 16, 2021 19:16

Remove dist_traveled

eec0f89

Update reference

dd1decc

Update reference.md

513192f

juanborre reviewed Aug 18, 2021

gtfs-realtime/proto/gtfs-realtime.proto (outdated, resolved)

Clarify TripProperties.shape_id

603d4bf

scmcca added the "Status: Voting" label (pull requests where the advocate has called for a vote as described in changes.md) on Aug 21, 2021

scmcca removed the "Status: Voting" label (pull requests where the advocate has called for a vote as described in changes.md) on Aug 25, 2021

barbeau reviewed Aug 25, 2021

gtfs-realtime/proto/gtfs-realtime.proto (outdated, resolved)

barbeau reviewed Aug 25, 2021

gtfs-realtime/spec/en/reference.md (outdated, resolved)

barbeau reviewed Aug 25, 2021

gtfs-realtime/proto/gtfs-realtime.proto (resolved)

Address PR comments

3206e36

barbeau merged commit 99fdfba into google:master Aug 30, 2021

scmcca added the "proposal" and "GTFS Realtime" (issues and pull requests that focus on GTFS Realtime) labels on May 20, 2022

mads14 mentioned this pull request Feb 27, 2023

Draft proposal : GTFS-TripModifications #369

Closed

This was referenced Sep 25, 2023

GTFS Trip-Modifications TransitApp/transit#9

Closed

GTFS Trip-Modifications #403

Merged
