tides-generator

Tools for generating synthetic TIDES-compliant data files from GTFS inputs.

Vehicle Locations Generator

The generate_vehicle_locations.py script creates synthetic vehicle location data that follows the TIDES vehicle_locations schema.

Prerequisites

Python 3.9 or higher
Poetry (Python package manager)
GTFS data files (routes.txt, trips.txt, stops.txt, stop_times.txt)

Installation

# Install Poetry if you haven't already
curl -sSL https://install.python-poetry.org | python3 -

# Clone and install project
git clone https://github.com/your-username/tides-generator.git
cd tides-generator
poetry install

This creates a virtual environment and installs the project's dependencies, including:

pandas: Data manipulation library
frictionless: Data validation toolkit

Usage

poetry run generate-vehicle-locations <gtfs_path> --routes <route_ids> --start-date <start_date> --end-date <end_date> [--output <output_file>]

Arguments:

gtfs_path: Path to directory containing GTFS files
--routes: Comma-separated list of route IDs to generate data for
--start-date: Start date in YYYY-MM-DD format
--end-date: End date in YYYY-MM-DD format
--output: Output file path (default: vehicle_locations.csv)
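
As a rough illustration of how these arguments fit together, the sketch below parses them with Python's standard argparse module. It is not the script's actual implementation: the build_parser function is hypothetical, and treating --routes, --start-date, and --end-date as required is an assumption based on the usage line above.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(prog="generate-vehicle-locations")
    parser.add_argument("gtfs_path", help="Path to directory containing GTFS files")
    parser.add_argument(
        "--routes",
        required=True,  # assumption: the usage line shows it without brackets
        type=lambda s: [r.strip() for r in s.split(",") if r.strip()],
        help="Comma-separated list of route IDs",
    )
    # date.fromisoformat accepts the documented YYYY-MM-DD format
    parser.add_argument("--start-date", required=True, type=date.fromisoformat)
    parser.add_argument("--end-date", required=True, type=date.fromisoformat)
    parser.add_argument("--output", default="vehicle_locations.csv")
    return parser


args = build_parser().parse_args(
    ["septa-gtfs-bus", "--routes", "1,2,3",
     "--start-date", "2024-01-01", "--end-date", "2024-01-31"]
)
print(args.routes, args.start_date, args.end_date, args.output)
```

Parsing dates at the argument boundary means a malformed date is rejected with a usage error before any GTFS files are read.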

Example:

# Generate one day of data for route 1
poetry run generate-vehicle-locations septa-gtfs-bus --routes 1 --start-date 2024-01-01 --end-date 2024-01-01

# Generate a month of data for multiple routes
poetry run generate-vehicle-locations septa-gtfs-bus --routes 1,2,3 --start-date 2024-01-01 --end-date 2024-01-31 --output january_locations.csv
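
The first example passes the same start and end date to get one day of data, which suggests the date range is inclusive on both ends. Under that assumption, the service dates could be enumerated as in this sketch (service_dates is a hypothetical helper, not a function from the package):

```python
from datetime import date, timedelta


def service_dates(start: date, end: date) -> list[date]:
    """Return every date from start to end, assuming both ends are included."""
    if end < start:
        raise ValueError("end date must not precede start date")
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


one_day = service_dates(date(2024, 1, 1), date(2024, 1, 1))
january = service_dates(date(2024, 1, 1), date(2024, 1, 31))
print(len(one_day), len(january))
```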

Output Format

The script generates a CSV file following the TIDES vehicle_locations schema with these fields:

location_ping_id: Unique identifier for each location event
service_date: Date of service
event_timestamp: Timestamp of the location event
trip_id_performed: ID of the performed trip
trip_id_scheduled: ID of the scheduled trip
trip_stop_sequence: Order of stops within the trip
scheduled_stop_sequence: Scheduled order of stops
vehicle_id: Vehicle identifier
device_id: Device identifier
pattern_id: Route pattern identifier
stop_id: Current or next stop ID
current_status: Vehicle status (Stopped at/In transit to)
latitude: Vehicle latitude
longitude: Vehicle longitude
gps_quality: GPS signal quality
heading: Vehicle heading in degrees
speed: Vehicle speed in meters per second
odometer: Vehicle odometer reading
schedule_deviation: Schedule adherence in seconds
headway_deviation: Headway adherence in seconds
trip_type: Type of trip (In service/Deadhead/etc.)
schedule_relationship: Stop status (Scheduled/Skipped/etc.)

Vehicle positions are interpolated between stops, and the generated records include status changes, speed variations, and schedule deviations along each trip.
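
To make the idea of interpolating between stops concrete, here is a minimal sketch that places pings on a straight line between two stops at a fixed time interval. It is only an assumption about how such interpolation might work, not the generator's actual algorithm: interpolate_pings, the 30-second interval, and the coordinates are hypothetical, and linear interpolation in latitude and longitude is only a reasonable approximation over short distances.

```python
from datetime import datetime, timedelta


def interpolate_pings(lat_a, lon_a, time_a, lat_b, lon_b, time_b, interval_s=30):
    """Linearly interpolate positions from stop A to stop B, one ping per interval."""
    total_s = int((time_b - time_a).total_seconds())
    if total_s <= 0:
        raise ValueError("time_b must be later than time_a")
    pings = []
    for elapsed in range(0, total_s + 1, interval_s):
        frac = elapsed / total_s  # 0.0 at stop A, 1.0 at stop B
        at_stop = elapsed == 0 or elapsed == total_s
        pings.append({
            "event_timestamp": time_a + timedelta(seconds=elapsed),
            "latitude": lat_a + frac * (lat_b - lat_a),
            "longitude": lon_a + frac * (lon_b - lon_a),
            # Status values taken from the current_status field description
            "current_status": "Stopped at" if at_stop else "In transit to",
        })
    return pings


# Hypothetical stops two minutes apart
pings = interpolate_pings(
    39.95, -75.16, datetime(2024, 1, 1, 8, 0, 0),
    39.97, -75.18, datetime(2024, 1, 1, 8, 2, 0),
)
for ping in pings:
    print(ping["event_timestamp"], ping["latitude"], ping["longitude"], ping["current_status"])
```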

Validation

The validate_vehicle_locations.py script validates generated files against the TIDES schema using the frictionless data toolkit.

Prerequisites

The validator is included in the main package installation.

Usage

poetry run validate-vehicle-locations <vehicle_locations.csv>

Example:

# Validate a generated file
poetry run validate-vehicle-locations test_vehicle_locations.csv

The validator will check that:

All required fields are present
Field values match their specified types
Values are within allowed ranges
Enumerated fields contain valid values
Foreign key references are valid

If validation fails, the script will output detailed error messages indicating which rows and fields have issues.
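
The validator itself relies on frictionless and the TIDES schema. Purely to illustrate the kinds of checks listed above, the sketch below uses only the standard library. It is not the validator's implementation: check_rows is a hypothetical function, the set of required fields is an illustrative subset, and the allowed status values are just the two named in this README (the real schema may permit others).

```python
import csv
import io

# Hypothetical subset of fields, chosen for illustration only
REQUIRED_FIELDS = {"location_ping_id", "latitude", "longitude", "current_status"}
# Values named in this README; the TIDES schema may allow more
ALLOWED_STATUS = {"Stopped at", "In transit to"}


def check_rows(csv_text: str) -> list[str]:
    """Return human-readable error messages for a vehicle locations CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_FIELDS - set(reader.fieldnames or [])
    if missing:
        return [f"missing fields: {sorted(missing)}"]
    errors = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        try:
            lat, lon = float(row["latitude"]), float(row["longitude"])
        except (TypeError, ValueError):
            errors.append(f"row {row_number}: latitude/longitude is not a number")
            continue
        if not -90 <= lat <= 90:
            errors.append(f"row {row_number}: latitude {lat} out of range")
        if not -180 <= lon <= 180:
            errors.append(f"row {row_number}: longitude {lon} out of range")
        if row["current_status"] not in ALLOWED_STATUS:
            errors.append(f"row {row_number}: invalid current_status {row['current_status']!r}")
    return errors


sample = (
    "location_ping_id,latitude,longitude,current_status\n"
    "p1,39.95,-75.16,Stopped at\n"
    "p2,95.0,-75.16,Parked\n"
)
errors = check_rows(sample)
for message in errors:
    print(message)
```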
