Ether Data Documentation

Comprehensive documentation for the Ether Data spatio-temporal data workspace

TomTom Traffic Pipeline

This package fetches traffic data from TomTom’s Intermediate Traffic API and loads it into BigQuery using Cloud Run Jobs.

Quick Start

Local Development

# Run the pipeline locally (from workspace root)
./proj/tomtom_intermediate_traffic/run-local.sh

# This will:
# - Navigate to workspace root automatically
# - Use uv to manage dependencies and virtual environment
# - Run the pipeline job with proper workspace setup

Prerequisites for local development:

Cloud Deployment

# Deploy to Google Cloud Run Jobs
./deployment/deploy.sh your-project-id us-central1

# Set up Cloud Scheduler
./deployment/setup-scheduler.sh your-project-id us-central1

Required Secrets

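The pipeline requires these secrets in Google Cloud Secret Manager:

  - tomtom_intermediate_api_key: TomTom API key (string value)
  - tomtom_client_certificate: client certificate (PEM format)
  - tomtom_client_key: client key (PEM format)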

Create them using:

# API key (string value); printf avoids storing a trailing newline in the secret
printf '%s' 'your-api-key-string' | gcloud secrets create tomtom_intermediate_api_key --data-file=-

# Certificate files (PEM format)
gcloud secrets create tomtom_client_certificate --data-file=client.pem
gcloud secrets create tomtom_client_key --data-file=client-key.pem
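
Once the secrets exist, the job has to read them at runtime. The following is a minimal sketch of how that might look in Python, assuming the google-cloud-secret-manager client library is installed and Application Default Credentials are configured. The helper names secret_version_path and read_secret are hypothetical and are not taken from this package.

```python
def secret_version_path(project_id: str, secret_id: str, version: str = "latest") -> str:
    """Build the Secret Manager resource name for one secret version."""
    return f"projects/{project_id}/secrets/{secret_id}/versions/{version}"


def read_secret(project_id: str, secret_id: str) -> bytes:
    """Fetch the raw payload of the latest version of a secret.

    Assumption: google-cloud-secret-manager is installed. It is imported
    lazily so that secret_version_path works without the library.
    """
    from google.cloud import secretmanager

    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(
        request={"name": secret_version_path(project_id, secret_id)}
    )
    return response.payload.data
```

For example, read_secret("your-project-id", "tomtom_intermediate_api_key").decode("utf-8") would return the API key as a string, while the two PEM secrets can stay as bytes.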

Environment Variables

Note: For local development, you can set these in a .env file in the workspace root.
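
To illustrate what loading a .env file involves, here is a small standard-library sketch. It is an assumption about the mechanism, not a description of this package: the project may well rely on a library such as python-dotenv instead, and the function names parse_dotenv and load_dotenv_file are hypothetical.

```python
import os
from pathlib import Path


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace, then drop quote characters around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_dotenv_file(path: Path) -> None:
    """Copy entries into os.environ without overriding variables already set."""
    if not path.is_file():
        return
    for key, value in parse_dotenv(path.read_text(encoding="utf-8")).items():
        os.environ.setdefault(key, value)
```

Using setdefault means values exported in the shell take precedence over the file, which keeps local overrides simple.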

Architecture

Data Pipeline

  1. Checks BigQuery for the last update timestamp
  2. Fetches data from the TomTom API using conditional requests
  3. Converts the protobuf response to structured data with WKT geometry
  4. Loads the data into BigQuery via a staging table
  5. Atomically updates the main table (idempotent)
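
The steps above can be sketched with a few standard-library helpers. This is an illustration under stated assumptions, not the package's actual code: it assumes the API honours the If-Modified-Since header (it may use ETags instead), that geometries are line strings, and that the atomic update is a single BigQuery MERGE from staging into the main table. All function, table, and column names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Optional


def conditional_headers(last_update: Optional[datetime]) -> dict[str, str]:
    """Step 2: build headers for an HTTP conditional request.

    Assumption: the API honours If-Modified-Since. Expects a
    timezone-aware datetime, as read from BigQuery in step 1.
    """
    if last_update is None:
        return {}  # first run: fetch unconditionally
    stamp = format_datetime(last_update.astimezone(timezone.utc), usegmt=True)
    return {"If-Modified-Since": stamp}


def to_wkt_linestring(points: list[tuple[float, float]]) -> str:
    """Step 3: encode (longitude, latitude) pairs as a WKT LINESTRING."""
    if len(points) < 2:
        raise ValueError("a LINESTRING needs at least two points")
    return "LINESTRING(" + ", ".join(f"{lon} {lat}" for lon, lat in points) + ")"


def build_merge_sql(main_table: str, staging_table: str, key: str, columns: list[str]) -> str:
    """Steps 4-5: one MERGE statement that upserts staging rows into the main table.

    A single MERGE runs atomically in BigQuery, and rerunning it with the
    same staging rows leaves the main table unchanged.
    """
    updates = ", ".join(f"{c} = S.{c}" for c in columns)
    all_cols = ", ".join([key, *columns])
    values = ", ".join(f"S.{c}" for c in [key, *columns])
    return (
        f"MERGE `{main_table}` AS T\n"
        f"USING `{staging_table}` AS S\n"
        f"ON T.{key} = S.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT ({all_cols}) VALUES ({values})"
    )
```

With conditional requests, a response of 304 Not Modified would let the job exit early without touching BigQuery. The MERGE is idempotent only if the key column is unique within the staging table.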