Ether Data Documentation

Comprehensive documentation for the Ether Data spatio-temporal data workspace


Census Data API

API for querying US Census demographic data using natural language and flexible geography inputs.

Overview

This API exposes US Census demographic data through natural language queries and flexible geographic inputs. It converts each geography input to H3 indexes and uses AI-generated SQL to query census data stored in BigQuery.

Key Features:

Project Structure

proj/apis/census/
├── src/census/
│   ├── api/                     # FastAPI endpoints
│   │   ├── census.py            # Census query endpoint
│   │   └── health.py            # Health check endpoint
│   ├── models/                  # Pydantic data models
│   │   └── census.py            # Request/response models
│   ├── services/                # Business logic
│   │   └── census_service.py    # Census data query logic
│   ├── main.py                  # FastAPI application
│   └── server.py                # Server startup script
├── scripts/                     # SQL scripts and utilities
│   └── census_us_h3_aggregation.sql  # H3 aggregation SQL script
├── pyproject.toml               # Project configuration
└── README.md                    # Project documentation

API Endpoints

POST /query

Summary: Query Census Demographic Data

Query US Census demographic data for specific geographic areas using natural language. The API automatically converts geography inputs to H3 indexes and uses AI to generate and execute SQL queries against census data stored in BigQuery.

Geography Input: Supports multiple input formats including ZIP codes, DMA codes, cities, counties, coordinates, WKT polygons, and H3 indices. See the Geography Input Documentation for complete details on all supported types and examples.

Request Body (CensusQueryRequest):

{
  "geography": {
    "kind": "zip",
    "code": "94595"
  },
  "query": "What is the total population?",
  "include_h3_indexes": false
}

Geography Examples:

// ZIP Code
{"kind": "zip", "code": "94595"}

// City
{"kind": "city", "name": "Essex", "state": "MA"}

// County
{"kind": "county", "name": "Alameda", "state": "CA"}

// Point with radius
{"kind": "point", "lat": 37.7749, "lon": -122.4194, "radius": 5000}

// H3 Index
{"kind": "h3", "h3": "87283082bffffff"}
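
Each geography example above pairs a `kind` with a different set of fields. The following is a minimal client-side sketch that checks for those fields before building a request body. `REQUIRED_FIELDS` and `build_query_request` are hypothetical names, and the field sets are inferred only from the examples in this section; the server's own validation rules, and the other supported kinds such as DMA codes and WKT polygons, may differ.

```python
import json

# Required fields per geography kind, inferred from the examples in this section.
# This is an assumption for illustration, not the service's own schema.
REQUIRED_FIELDS = {
    "zip": {"code"},
    "city": {"name", "state"},
    "county": {"name", "state"},
    "point": {"lat", "lon", "radius"},
    "h3": {"h3"},
}


def build_query_request(geography, query, include_h3_indexes=False):
    """Return a request body dict for POST /query, or raise ValueError."""
    kind = geography.get("kind")
    if kind not in REQUIRED_FIELDS:
        raise ValueError(f"unsupported geography kind in this sketch: {kind!r}")
    missing = REQUIRED_FIELDS[kind] - set(geography)
    if missing:
        raise ValueError(f"geography kind {kind!r} is missing: {sorted(missing)}")
    if not query.strip():
        raise ValueError("query must not be empty")
    return {
        "geography": dict(geography),
        "query": query,
        "include_h3_indexes": include_h3_indexes,
    }


body = build_query_request({"kind": "zip", "code": "94595"}, "What is the total population?")
print(json.dumps(body, indent=2))
```

Checking on the client catches a missing `state` or `code` before a request is sent, but the service remains the authority on what it accepts.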

Response (CensusQueryResponse):

{
  "results": [
    {"geoid": "87283082bffffff", "total_pop": 1500}
  ],
  "total_results": 1,
  "h3_indexes": ["87283082bffffff"]
}
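
The response example above returns rows keyed by `geoid`, with one column per requested metric. As a hedged sketch of client-side handling, the helper below sums one numeric column across all rows and skips rows that lack it. `sum_column` is a hypothetical helper, and it assumes column names such as `total_pop` match the example; summing is only meaningful for additive counts, not for medians or rates.

```python
def sum_column(response, column):
    """Sum a numeric column over response["results"], skipping rows without it."""
    total = 0
    for row in response.get("results", []):
        value = row.get(column)
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            total += value
    return total


# Sample payload copied from the response example above.
sample = {
    "results": [{"geoid": "87283082bffffff", "total_pop": 1500}],
    "total_results": 1,
    "h3_indexes": ["87283082bffffff"],
}
print(sum_column(sample, "total_pop"))  # prints 1500
```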

GET /health

Health check endpoint for service monitoring.
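
A monitoring script can poll this endpoint. The sketch below uses only the Python standard library and treats any HTTP 200 as healthy; the shape of the response body is not documented here, so it is deliberately ignored. `health_url` and `is_healthy` are hypothetical helper names.

```python
import urllib.request


def health_url(base_url):
    """Join a base URL and the /health path without doubling slashes."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url, timeout=2.0):
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, DNS failures, and
        # urllib.error.HTTPError for non-2xx statuses (all OSError subclasses).
        return False
```

For the standalone service this would be called as `is_healthy("http://localhost:8020")`.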

Natural Language Query Examples

The API supports a wide variety of natural language queries about demographic data:

Population Queries:

Age Demographics:

Income and Employment:

Housing:

Education:

Running the API

Standalone Operation

# Run the census API locally
cd proj/apis/census
uv run python -m census.server

The API will be available at http://localhost:8020

Via Gateway Integration

The Census API is integrated into the main API gateway and available at /v1/census/query when running through the gateway.

# Run via gateway
cd proj/apis/gateway
./run-local.sh

Access via gateway at http://localhost:8000/v1/census/query
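
Because the query path differs between the two modes (`/query` standalone, `/v1/census/query` through the gateway), a client may want to select the URL in one place. A small sketch, assuming the local ports shown in this section; `query_url` is a hypothetical helper.

```python
STANDALONE_BASE = "http://localhost:8020"  # from "Standalone Operation"
GATEWAY_BASE = "http://localhost:8000"     # from "Via Gateway Integration"


def query_url(via_gateway=False):
    """Return the census query URL for the chosen local deployment mode."""
    if via_gateway:
        return GATEWAY_BASE + "/v1/census/query"
    return STANDALONE_BASE + "/query"


print(query_url())
print(query_url(via_gateway=True))
```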

Environment Variables

Configure the service using environment variables:

# Google Cloud Configuration
GOOGLE_CLOUD_PROJECT=your-project-id    # Required for BigQuery access

# API Configuration
CENSUS_PORT=8020                         # Port for standalone operation
ENVIRONMENT=development                  # Environment (development/production)

# BigQuery Settings
MAX_BYTES_BILLED=1000000000             # Max bytes billed for BigQuery queries (1 GB)
MAX_RESULTS_RETURN=10000                # Max results to return
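
One way these variables might be read at startup is sketched below. This is not the service's actual configuration code: `CensusSettings` and `load_settings` are hypothetical names, and using the values listed above as defaults is an assumption. Only `GOOGLE_CLOUD_PROJECT` is treated as mandatory, since it is the one marked as required.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CensusSettings:
    google_cloud_project: str
    census_port: int
    environment: str
    max_bytes_billed: int
    max_results_return: int


def load_settings(env=None):
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    project = env.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise RuntimeError("GOOGLE_CLOUD_PROJECT is required for BigQuery access")
    # Defaults mirror the example values above (an assumption, not confirmed).
    return CensusSettings(
        google_cloud_project=project,
        census_port=int(env.get("CENSUS_PORT", "8020")),
        environment=env.get("ENVIRONMENT", "development"),
        max_bytes_billed=int(env.get("MAX_BYTES_BILLED", "1000000000")),
        max_results_return=int(env.get("MAX_RESULTS_RETURN", "10000")),
    )


settings = load_settings({"GOOGLE_CLOUD_PROJECT": "your-project-id", "CENSUS_PORT": "8021"})
print(settings)
```

Passing the environment as a mapping keeps the loader easy to test without touching the real process environment.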

Data Sources

The API accesses US Census data from several BigQuery datasets:

Example Usage

Query Population by ZIP Code

curl -X POST "http://localhost:8020/query" \
  -H "Content-Type: application/json" \
  -d '{
    "geography": {"kind": "zip", "code": "94107"},
    "query": "What is the total population?"
  }'

Query Demographics by City

curl -X POST "http://localhost:8020/query" \
  -H "Content-Type: application/json" \
  -d '{
    "geography": {"kind": "city", "name": "Essex", "state": "MA"},
    "query": "What is the median household income?"
  }'

Query with H3 Index Output

curl -X POST "http://localhost:8020/query" \
  -H "Content-Type: application/json" \
  -d '{
    "geography": {"kind": "county", "name": "Alameda", "state": "CA"},
    "query": "Population density",
    "include_h3_indexes": true
  }'

H3 Spatial Indexing

The Census API leverages Uber’s H3 hexagonal spatial indexing system for efficient geographic operations:

API Documentation

Interactive documentation is available when the service is running:

Standalone Mode:

Via Gateway:

Performance Considerations

Error Handling

The API provides detailed error responses for common issues:
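
On the client side, an error response can be condensed into a single message. The sketch below assumes the service keeps FastAPI's default error body, a JSON object with a `detail` field that is either a string or, for validation errors, a list of items with `loc` and `msg`. If the service customizes its error format, the fallback branch returns the raw body instead. `describe_error` is a hypothetical helper.

```python
import json


def describe_error(status_code, body_text):
    """Turn an error response into a one-line message."""
    try:
        detail = json.loads(body_text).get("detail")
    except (ValueError, AttributeError):
        # Body is not JSON, or is JSON but not an object.
        detail = None
    if isinstance(detail, list):
        # Validation errors: a list of {"loc": [...], "msg": "..."} items.
        parts = []
        for item in detail:
            if isinstance(item, dict):
                loc = ".".join(str(p) for p in item.get("loc", []))
                parts.append(f"{loc}: {item.get('msg', '')}")
            else:
                parts.append(str(item))
        detail = "; ".join(parts)
    if not detail:
        detail = body_text.strip() or "no error body"
    return f"HTTP {status_code}: {detail}"


# Made-up body for illustration only; real messages come from the service.
print(describe_error(422, '{"detail": [{"loc": ["body", "query"], "msg": "field required"}]}'))
```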

Integration Examples

Python Client

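import requests

# Send a natural language query for a ZIP code to the standalone service
response = requests.post(
    'http://localhost:8020/query',
    json={
        'geography': {'kind': 'zip', 'code': '94107'},
        'query': 'What is the total population?',
    },
    timeout=30,
)
response.raise_for_status()  # Fail early on 4xx/5xx responses

data = response.json()
# The 'total_pop' column follows the response example; guard against empty results
if data['results']:
    print(f"Population: {data['results'][0]['total_pop']}")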

JavaScript/Node.js

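// Top-level await requires an ES module (or wrap this in an async function)
const response = await fetch('http://localhost:8020/query', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({
    geography: {kind: 'city', name: 'Essex', state: 'MA'},
    query: 'median household income'
  })
});

// fetch does not reject on HTTP errors, so check the status explicitly
if (!response.ok) {
  throw new Error(`Census query failed with status ${response.status}`);
}

const data = await response.json();
console.log('Results:', data.results);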