CANOE Transportation Sector — Data Processing Documentation

Comprehensive documentation of the data pipeline that converts upstream data sources into a Temoa-ready SQLite database for the Canadian Open Energy (CANOE) model's transportation sector.


Table of Contents

  1. Overview
  2. Pipeline Architecture
  3. Upstream Data Fetching
  4. Spreadsheet Database Layer
  5. The Compiler Logic
  6. Known Assumptions and Limitations

1. Overview

Purpose

The CANOE transportation sector differs fundamentally from the other end-use sectors. Rather than applying pure-Python transformations directly to raw API data, the transportation pipeline uses an intermediate, human-readable Spreadsheet Database (Excel). A Python Compiler script then sanitizes, temporally aggregates, and translates that spreadsheet into a strict Temoa-compatible SQLite database.

What the Tool Produces

Scope

The model operates across:


2. Pipeline Architecture

The transportation pipeline executes as a three-stage process:

1. get_nrcan_data.py    → Fetches raw base statistics and injects them into an Excel Template
2. CANOE_TRN_v4.xlsx    → *Human Editable Layer* containing projected bounds, tech trees, and costs
3. compile_transport.py → The Compiler:
   a. instantiate_database → Opens SQLite schema structure
   b. compile_techs/comms  → Loads rigid string parameters
   c. compile_demand       → Loads generic sectoral energy bounds
   d. compile_excap        → Loads historical fleet stock (Aggregated Quinquennially)
   e. compile_efficiency/costs → Connects financial & thermodynamic bounds to fleet logic
   f. compile_dsd/cft      → Attaches external RAMP-mobility hourly logic to EV charging nodes
   g. cleanup              → Actively prunes "dead" or unused technologies from the database to save solver memory
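The stage sequence above can be sketched as a simple orchestrator. This is a hypothetical skeleton: the stage names mirror the documentation, but the real function signatures and schema in compile_transport.py will differ.

```python
import sqlite3

# Hypothetical skeleton of compile_transport.py's stage sequence.
# Stage names follow the documentation above; real signatures differ.

def instantiate_database(path):
    """Open the SQLite file and ensure the (toy) Temoa schema exists."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE IF NOT EXISTS technologies (tech TEXT PRIMARY KEY)")
    return con

def compile_techs(con):
    """Load rigid string parameters (technology names, sector tags)."""
    con.execute("INSERT OR IGNORE INTO technologies VALUES ('T_EV_CAR')")

def cleanup(con):
    """Prune 'dead' technologies nothing references (placeholder here)."""
    return None

def compile_transport(db_path):
    con = instantiate_database(db_path)
    for stage in (compile_techs, cleanup):
        stage(con)  # each stage mutates the same database in order
    con.commit()
    return con
```

The key design point is that every stage writes into the same SQLite connection, so later stages (efficiency, costs, cleanup) can see and prune what earlier stages loaded.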

3. Upstream Data Fetching

Executed via get_nrcan_data.py:
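One recurring failure mode of this step (see the update checklist in the data-sources catalog) is NRCan URLs silently switching away from .xls workbooks. The helper below is a hypothetical guard, not code from get_nrcan_data.py:

```python
from urllib.parse import urlparse

def is_valid_nrcan_url(url: str) -> bool:
    """Return True only if the URL path points at an Excel workbook.

    Hypothetical sanity check: the NRCan CEUD endpoints have historically
    served .xls files, and the downstream Excel-injection step cannot
    consume .csv payloads.
    """
    path = urlparse(url).path.lower()
    return path.endswith((".xls", ".xlsx"))

# Illustrative URL, not a real endpoint:
# is_valid_nrcan_url("https://example.gc.ca/ceud/tran_on.xls")  -> True
# is_valid_nrcan_url("https://example.gc.ca/ceud/tran_on.csv")  -> False
```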


4. Spreadsheet Database Layer

The spreadsheet_database directory houses the .xlsx files acting as the true modeling nexus for transportation. Unlike purely algorithmic sectors, predicting the capital cost decay of hydrogen fuel-cell heavy-duty trucks necessitates human assumptions that are better governed in a spreadsheet.
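Because this layer is hand-edited, the Compiler must defensively sanitize every row before loading it. A minimal sketch of such a check, using a hypothetical row layout (tech, vintage, capital_cost) that is not taken from the actual workbook:

```python
from dataclasses import dataclass

@dataclass
class TechRow:
    tech: str
    vintage: int
    capital_cost: float  # units (e.g. CAD per vehicle) are an assumption here

def sanitize_row(raw: dict) -> TechRow:
    """Coerce and validate one hand-edited spreadsheet row.

    Column names are illustrative; the real CANOE_TRN_v4.xlsx tabs differ.
    """
    tech = str(raw["tech"]).strip()
    if not tech:
        raise ValueError("empty technology name")
    vintage = int(raw["vintage"])          # spreadsheets often hand back strings
    cost = float(raw["capital_cost"])
    if cost < 0:
        raise ValueError(f"negative capital cost for {tech}")
    return TechRow(tech, vintage, cost)
```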

Key Tabs:


5. The Compiler Logic

Executed via compile_transport.py:

Quinquennial Aggregation
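A minimal stdlib sketch of the 5-year aggregation applied to historical fleet stock in compile_excap. The real compiler operates on the spreadsheet tables, and anchoring each block at the nearest lower multiple of 5 is an assumption; the actual code may align blocks to the model's period boundaries instead.

```python
from collections import defaultdict

def quinquennial_totals(annual_stock: dict) -> dict:
    """Sum annual fleet-stock additions into 5-year vintage blocks.

    Block anchor = year - year % 5 (an assumption, not confirmed from
    compile_transport.py); e.g. 2016 and 2017 fall into the 2015 block.
    """
    blocks = defaultdict(float)
    for year, stock in annual_stock.items():
        blocks[year - year % 5] += stock
    return dict(blocks)

# quinquennial_totals({2016: 10.0, 2017: 12.0, 2021: 5.0})
# groups 2016/2017 into the 2015 block and 2021 into the 2020 block.
```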

EV Charging Profiles (DSDs and CFTs)
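compile_dsd/cft attaches hourly RAMP-mobility charging shapes to EV charging nodes. Temoa-style demand distributions are fractional, so a raw load profile must be normalized to sum to 1 before insertion. A sketch, assuming the RAMP series arrives as average kW per hour (an assumption about the input units):

```python
def to_distribution(hourly_kw):
    """Convert an hourly charging-load profile into demand fractions.

    The output is dimensionless and sums to 1; the input is assumed
    here to be an hourly average-kW series from RAMP-mobility.
    """
    total = sum(hourly_kw)
    if total <= 0:
        raise ValueError("profile has no load")
    return [kw / total for kw in hourly_kw]

profile = to_distribution([0.0, 2.0, 6.0, 2.0])   # toy 4-hour profile
assert abs(sum(profile) - 1.0) < 1e-12
```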

Automatic Cleanup & Garbage Collection
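The cleanup pass deletes technologies that no parameter table references, which keeps the solver's memory footprint down. A hypothetical sqlite3 sketch of that pruning; table and column names only loosely follow Temoa conventions and are assumptions:

```python
import sqlite3

def cleanup(con: sqlite3.Connection) -> list:
    """Delete technologies with no Efficiency rows; return what was removed.

    Schema is illustrative; the real Temoa database differs in detail.
    Returning the deleted names mirrors the cleanup() logging the
    compiler performs, so a reviewer can audit every pruned technology.
    """
    dead = [r[0] for r in con.execute(
        "SELECT tech FROM technologies "
        "WHERE tech NOT IN (SELECT DISTINCT tech FROM Efficiency)")]
    con.executemany("DELETE FROM technologies WHERE tech = ?",
                    [(t,) for t in dead])
    return dead
```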

Data Quality Indexing (DQI)


6. Known Assumptions and Limitations

  1. Quinquennial Distortion: Aggregating historical fleet data into 5-year blocks artificially smooths historical fleet adoption. It assumes fleet-turnover dynamics can ignore granular year-over-year deviations.
  2. RAMP-Mobility Fixation: Electric-vehicle grid-charging profiles rely strictly on static representations from RAMP-mobility models. This inherently models "dumb" charging patterns based on typical commuter behavior, rather than price-responsive, arbitrage-driven charging.
  3. Time Zone Lock: RAMP profiles currently hard-code tz_convert('America/Toronto'). Applying this to models operating in BC or AB can cause evening charging peaks to misalign with local grid solar profiles.
  4. Decoupled Future Projections: Because future bounds live entirely within the Excel spreadsheets, CANOE updates do not automatically project transport dynamics. If the CER alters its GDP assumptions, canoe-transportation does not respond on its own; a human must manually readjust the Excel macro-scalars.

CANOE Transportation Sector — Data Sources Catalog

1. Data Source Summary

  Data Type              | Primary Source          | Granularity         | Update Frequency
  ---------------------- | ----------------------- | ------------------- | ---------------------
  Historical Baselines   | NRCan CEUD              | Province / Annual   | Annual
  EV Charging Maps       | RAMP-mobility           | Hourly / Seasonal   | Static / Independent
  Financial Trajectories | AEO, GREET, Internal    | Technology-Level    | Manual
  Usage Patterns         | NHTS / Regional Studies | Demographic / Daily | Manual

2. Natural Resources Canada (NRCan) CEUD

3. RAMP-mobility Profiles

4. Analytical Forecasts (AEO, GREET, NHTS)

5. Update Procedures (Checklist)

During regular annual or biannual CANOE updates:

  1. NRCan Fetching: Run get_nrcan_data.py. Verify that the NRCan URLs still route to the .xls workbooks rather than any newer .csv releases.
  2. Review Spreadsheets: Open the generated CANOE_TRN_[REGION]...xlsx files. Validate that the newly injected Background Data correctly cascaded through the formulas linking to ExCap and Demand.
  3. Literature Review: Manually scrutinize EV cost trajectories and ICCT/GREET emissions estimates. Update the spreadsheet manually if reality has outpaced the older projections.
  4. Compile: Execute compile_transport.py.
  5. Review Logs: The compiler emits cleanup() log entries noting exactly which technologies were deleted. Verify that no critical emerging technology was accidentally removed due to a rounding-threshold error.
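Step 5 can be partially automated with a helper that scans cleanup() output for deletions of technologies on a watchlist. The log format assumed below ("cleanup: deleted <tech>") is hypothetical, not the compiler's actual output:

```python
def flag_critical_deletions(log_lines, watchlist):
    """Return watched technologies that cleanup() reported as deleted.

    Assumes log lines shaped like 'cleanup: deleted <tech>'; adapt the
    parsing to compile_transport.py's real log format before use.
    """
    deleted = {line.rsplit(" ", 1)[-1]
               for line in log_lines if "cleanup: deleted" in line}
    return sorted(deleted & set(watchlist))

logs = ["cleanup: deleted T_OLD_DIESEL", "cleanup: deleted T_H2_BUS"]
flag_critical_deletions(logs, ["T_H2_BUS"])   # -> ['T_H2_BUS']
```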