CANOE Fuel Sector — Data Processing Documentation
Comprehensive documentation of the data pipeline that converts upstream data sources into a Temoa-ready SQLite database for the Canadian Open Energy (CANOE) model's fuel delivery/price sector.
Table of Contents
- Overview
- Pipeline Architecture
- Configuration and Setup
- External Data Fetching (EIA AEO)
- Technology and Commodity Mapping
- Cost Variables and Proxying
- Efficiency and Emissions
- Known Assumptions and Limitations
1. Overview
Purpose
The CANOE fuel sector tool does not model physical extraction or refining. Instead, it builds a "dummy" fuel delivery layer that wraps around the demand sectors. This layer provides unlimited-supply energy source nodes whose use is governed only by time-varying energy prices, emission factors, and currency/inflation adjustments.
What the Tool Produces
- A Temoa-schema SQLite database containing the "import" technologies that connect raw fuel pools (F_ng, F_dsl) to sector-specific end-use commodities (C_ng, I_dsl, T_gsl).
- Fully specified fuel pathways with variable costs (CostVariable), efficiencies, emission factors, and lifetime definitions.
Scope
The model operates across:
- Regions: Canadian provinces.
- Commodities: Encompasses fundamental end-use fuels (Natural Gas, Diesel, Heavy Fuel Oil, Propane, Gasoline, Biomass, Uranium, Ethanol, Renewable Diesel, SPK, and Hydrogen).
- Planning horizon: Configurable periods defined within the core parameters (default spans 2025 onwards).
2. Pipeline Architecture
The processing logic utilizes an ETL pipeline overseen by aggregator.py:
1. aggregator.py → Main orchestrator
a. setup_runtime → Initialize database, build runtime frames, and define currency/inflation factors
b. eia_api → Fetch EIA AEO price trajectories (or load from local pickle cache)
c. techcom → Format the baseline physical flows (mapping inputs -> outputs)
d. efficiency → Assign lossless transport/delivery efficiencies (Eff = 1.0)
e. costvariable → The core logic block calculating dynamic and static proxy fuel prices
f. emissionactivity → Loads emission factors for combustion flows from static CSVs
g. post_processing → Embed datasets and general sources taxonomy
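The ordering above can be sketched as a list of steps that pass a shared state along. This is only an illustration of the orchestration pattern: the step bodies are hypothetical stand-ins, and the real aggregator.py may wire its modules together differently.

```python
from typing import Callable, Dict, List, Optional


def make_step(name: str) -> Callable[[Dict], Dict]:
    """Build a hypothetical stand-in for one pipeline module."""
    def step(state: Dict) -> Dict:
        # A real step would read and write database tables here.
        state.setdefault("completed", []).append(name)
        return state
    return step


# Step names follow the list above; the bodies are placeholders.
PIPELINE: List[Callable[[Dict], Dict]] = [
    make_step(name)
    for name in (
        "setup_runtime",
        "eia_api",
        "techcom",
        "efficiency",
        "costvariable",
        "emissionactivity",
        "post_processing",
    )
]


def run_pipeline(steps: List[Callable[[Dict], Dict]],
                 state: Optional[Dict] = None) -> Dict:
    """Run each step in order, threading the shared state through."""
    state = {} if state is None else state
    for step in steps:
        state = step(state)
    return state


result = run_pipeline(PIPELINE)
print(result["completed"])
```

Keeping the steps in a plain list makes the execution order explicit and easy to reorder or test in isolation.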
3. Configuration and Setup
General Settings
Defined in input/params.yaml:
- schema_version: Set to version 3.1.
- periods: List of modeled future years (e.g., [2025, ...]).
- eia_year: The vintage of the EIA AEO forecast to query (e.g., 2025 vs 2023).
- b_price & u_price: Statically defined $CAD prices for insulated domestic resources such as biomass and uranium.
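As a rough sketch of how these settings could be checked once the YAML file has been parsed into a dictionary, the snippet below validates the five keys named above. The FuelParams class, the parse_params helper, and all example values are hypothetical and are not taken from the real params.yaml.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FuelParams:
    """Hypothetical container for the settings described above."""
    schema_version: str
    periods: List[int]
    eia_year: int
    b_price: float
    u_price: float


REQUIRED_KEYS = {"schema_version", "periods", "eia_year", "b_price", "u_price"}


def parse_params(raw: dict) -> FuelParams:
    """Validate a parsed params dictionary and normalise its types."""
    missing = REQUIRED_KEYS - raw.keys()
    if missing:
        raise KeyError(f"missing params: {sorted(missing)}")
    periods = sorted(int(p) for p in raw["periods"])
    if not periods:
        raise ValueError("periods must list at least one model year")
    return FuelParams(
        schema_version=str(raw["schema_version"]),
        periods=periods,
        eia_year=int(raw["eia_year"]),
        b_price=float(raw["b_price"]),
        u_price=float(raw["u_price"]),
    )


# Placeholder values for illustration only.
example = {
    "schema_version": 3.1,
    "periods": [2030, 2025],
    "eia_year": 2025,
    "b_price": 5.0,
    "u_price": 1.0,
}
params = parse_params(example)
print(params)
```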
Inflation and Currency
Handled inside setup.py (via the inflation_constants() function):
- Deflation factors (e.g., deflation_2022, deflation_2025).
- Energy-unit conversion (e.g., mmbtuconvertor = 1.055, consistent with 1 MMBtu ≈ 1.055 GJ).
- Forex alignment (currencyadjustment = 1.22).
4. External Data Fetching (EIA AEO)
CANOE uses the U.S. EIA Annual Energy Outlook (AEO) as the bedrock assumption for future North American wholesale energy price evolution.
- Mechanism (eia_api.py): Uses the EIA open API to pull Table 3: Energy Prices by Sector and Source.
- It queries the AEO vintage set by the eia_year configuration, using the Reference scenario (ref2025).
- Outputs are filtered downstream inside setup.py, keeping only series reported in MMBtu-based units and discarding unneeded averages so that each remaining series represents a single physical flow.
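A minimal sketch of the fetch-or-cache behaviour and the unit filter, using plain records instead of a DataFrame. The fetch callable, the record fields (series, unit, value), and the sample rows are all hypothetical; they do not reproduce the EIA API or real AEO series names.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

Row = Dict[str, object]


def load_or_fetch(cache_path: Path, fetch: Callable[[], List[Row]]) -> List[Row]:
    """Return cached rows if the pickle exists, otherwise fetch and cache."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    rows = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("wb") as fh:
        pickle.dump(rows, fh)
    return rows


def keep_mmbtu_series(rows: List[Row]) -> List[Row]:
    """Keep MMBtu-denominated series and drop averaged ones."""
    return [
        r for r in rows
        if "mmbtu" in str(r["unit"]).lower()
        and "average" not in str(r["series"]).lower()
    ]


calls = []


def fake_fetch() -> List[Row]:
    # Stand-in for the real API call; the rows are illustrative only.
    calls.append(1)
    return [
        {"series": "Industrial: Natural Gas", "unit": "2024 $/MMBtu", "value": 4.0},
        {"series": "Average Price: Natural Gas", "unit": "2024 $/MMBtu", "value": 5.0},
        {"series": "Residential: Electricity", "unit": "2024 cents/kWh", "value": 14.0},
    ]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cache" / "dataframes.pkl"
    first = load_or_fetch(path, fake_fetch)   # fetches and writes the cache
    second = load_or_fetch(path, fake_fetch)  # served from the pickle

kept = keep_mmbtu_series(second)
print(len(calls), [r["series"] for r in kept])
```

Deleting the pickle file is what forces a fresh fetch on the next run, which is why the update checklist asks for the cache to be cleared.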
5. Technology and Commodity Mapping
The fuel sector creates bridging nodes. For every physical vector, an input commodity (e.g., generic electricity E_elc_dem) passes through a dummy technology (E_COMMERCIAL_ELC) and becomes available precisely to the specific end-use node (C_elc).
- fuel_list.csv: Contains the taxonomy that defines how generic commodities are formatted into sector-specific technologies and commodities.
- Lifetime: Transfer technologies are assigned an arbitrary LifetimeTech of 5 years. The short lifetime keeps capacity from being locked in across periods, so the optimizer treats it as fully flexible.
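The mapping can be illustrated with a small sketch that expands a fuel list into per-region flow and lifetime rows. The CSV column names, the second row (F_NG_COM is a made-up technology name), and the region codes are assumptions for illustration; the real fuel_list.csv layout may differ.

```python
import csv
import io
from typing import List, Tuple

LIFETIME_YEARS = 5  # arbitrary lifetime stated in the text

# Hypothetical layout; only the first row's names come from the text.
FUEL_LIST_CSV = """input_comm,tech,output_comm
E_elc_dem,E_COMMERCIAL_ELC,C_elc
F_ng,F_NG_COM,C_ng
"""


def build_rows(csv_text: str, regions: List[str]) -> Tuple[list, list]:
    """Expand each fuel link into per-region flow and lifetime rows."""
    links = list(csv.DictReader(io.StringIO(csv_text)))
    flows = [
        (region, link["input_comm"], link["tech"], link["output_comm"])
        for region in regions
        for link in links
    ]
    lifetimes = [
        (region, link["tech"], LIFETIME_YEARS)
        for region in regions
        for link in links
    ]
    return flows, lifetimes


flows, lifetimes = build_rows(FUEL_LIST_CSV, ["ON", "AB"])
for row in flows:
    print(row)
```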
6. Cost Variables and Proxying
The CostVariable component is complex because it maps disparate U.S. price series onto the full Canadian sectoral resolution.
Variables are resolved inside costvariable.py using _calc_value():
- Explicit Static Config: Biomass and Uranium prices are taken directly from config values (b_price, u_price).
- Explicit Code Config: Non-fossil drop-in fuels (Ethanol, Renewable Diesel, SPK) use hard-coded price floors configured in inflation_constants().
- EIA Baseline Flows: Core demands (e.g., Industrial Natural Gas) pull their values directly from the loaded AEO Table 3 DataFrame.
- Proxy Logic: Where AEO provides no exact match, the tool redirects to a proxy series:
  - LNG/CNG: Proxies to Transportation Natural Gas pricing using a discount factor (0.89).
  - LPG: Residential LPG maps to Residential Propane; generic LPG maps to Transport Propane.
  - Agriculture: Ag NG, Diesel, and Gasoline directly mirror Transportation or Industrial sector pricing for the equivalent fuels.
  - Marine Diesel (MDO): Set to 90% (0.9 *) of Transportation Diesel.
  - Electricity-Side Fuels: Residual oils, coal, and coke purchased by the electricity sector proxy to generic industrial fuel cost profiles.
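The proxy rules above can be sketched as a lookup table that redirects a missing series to a base series with a scaling factor. The key format, the resolve_price helper, and the base prices are hypothetical; only the 0.89 and 0.9 factors and the warning text come from this document, and the real _calc_value() may be organised differently.

```python
import logging
from typing import Dict, Optional, Tuple

Key = Tuple[str, str]  # (sector, fuel), a hypothetical key format

# Proxy target and scaling factor for series with no exact AEO match.
PROXIES: Dict[Key, Tuple[Key, float]] = {
    ("transport", "lng"): (("transport", "natural_gas"), 0.89),
    ("transport", "cng"): (("transport", "natural_gas"), 0.89),
    ("residential", "lpg"): (("residential", "propane"), 1.0),
    ("marine", "mdo"): (("transport", "diesel"), 0.9),
}


def resolve_price(key: Key, base_prices: Dict[Key, float]) -> Optional[float]:
    """Return the direct price if present, else the scaled proxy price."""
    if key in base_prices:
        return base_prices[key]
    if key in PROXIES:
        target, factor = PROXIES[key]
        if target in base_prices:
            return factor * base_prices[target]
    logging.warning("No base price for %s", key)
    return None


# Placeholder prices, not AEO values.
base = {
    ("transport", "diesel"): 20.0,
    ("transport", "natural_gas"): 10.0,
}
mdo_price = resolve_price(("marine", "mdo"), base)
lng_price = resolve_price(("transport", "lng"), base)
missing = resolve_price(("residential", "lpg"), base)
print(mdo_price, lng_price, missing)
```

Returning None with a logged warning, rather than raising, lets a run finish so that all unmatched series can be reviewed together.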
All outputs pass through the MMBtu -> PJ conversion and are converted from $USD to $CAD(2020).
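A worked sketch of the unit arithmetic, assuming the constants listed under Inflation and Currency are applied multiplicatively. The function name, the order of operations, and the deflator argument are assumptions; the real deflator values live in inflation_constants() and are not reproduced here.

```python
MMBTU_TO_GJ = 1.055  # mmbtuconvertor from the text (1 MMBtu ~ 1.055 GJ)
USD_TO_CAD = 1.22    # currencyadjustment from the text


def usd_per_mmbtu_to_mcad_per_pj(price_usd_per_mmbtu: float,
                                 deflator: float = 1.0) -> float:
    """Convert a price in USD/MMBtu to million CAD per PJ.

    The deflator is a hypothetical multiplier that rebases the price
    to the target dollar year; 1.0 leaves the dollar year unchanged.
    """
    usd_per_gj = price_usd_per_mmbtu / MMBTU_TO_GJ
    # 1 PJ = 1e6 GJ, so a price in $/GJ is numerically equal to M$/PJ.
    usd_m_per_pj = usd_per_gj
    return usd_m_per_pj * USD_TO_CAD * deflator


converted = usd_per_mmbtu_to_mcad_per_pj(1.055)
print(round(converted, 4))
```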
7. Efficiency and Emissions
Efficiencies
Calculated in efficiency.py:
- Fixed at 1.0: The mechanism assumes that delivering energy across the system boundary to an end-use node is lossless. Actual pipeline leaks, resistance losses, or spillage must be handled elsewhere (e.g., via provincial grid line efficiency variables).
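As an illustration, the snippet below writes unit-efficiency rows into an in-memory SQLite table. The table definition is a simplified assumption rather than the actual Temoa schema, and the region codes and vintages are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical table layout for illustration only.
conn.execute(
    """
    CREATE TABLE Efficiency (
        region TEXT,
        input_comm TEXT,
        tech TEXT,
        vintage INTEGER,
        output_comm TEXT,
        efficiency REAL
    )
    """
)

links = [("E_elc_dem", "E_COMMERCIAL_ELC", "C_elc")]
regions = ["ON", "AB"]   # placeholder region codes
vintages = [2025, 2030]  # placeholder model periods

# Every delivery technology gets an efficiency of exactly 1.0.
rows = [
    (region, inp, tech, vintage, out, 1.0)
    for region in regions
    for (inp, tech, out) in links
    for vintage in vintages
]
conn.executemany("INSERT INTO Efficiency VALUES (?, ?, ?, ?, ?, ?)", rows)
conn.commit()

count, min_eff, max_eff = conn.execute(
    "SELECT COUNT(*), MIN(efficiency), MAX(efficiency) FROM Efficiency"
).fetchone()
print(count, min_eff, max_eff)
```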
Emissions
Calculated in emissionactivity.py:
- Sources: Connects local CSVs (upstream_emissions_fuels.csv and direct_comb_emission.csv) to map specific emis_comm targets against combustion flows.
- Ties the dummy technologies directly to their activity, so that each 1 PJ delivered generates exactly N kilotons of emissions inside the optimization.
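The linear relationship between delivered energy and emissions can be sketched as below. The factor table keys, the emission commodity name, and the numeric factor are placeholders for illustration, not values from the CSVs.

```python
from typing import Dict, Tuple

# (output commodity, emission commodity) -> kt emitted per PJ delivered.
# Placeholder factor, not taken from the real input files.
EMISSION_FACTORS: Dict[Tuple[str, str], float] = {
    ("C_ng", "co2"): 50.0,
}


def emissions_kt(activity_pj: Dict[str, float],
                 factors: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Sum emissions per emission commodity as activity times factor."""
    totals: Dict[str, float] = {}
    for (commodity, emis_comm), factor in factors.items():
        pj = activity_pj.get(commodity, 0.0)
        totals[emis_comm] = totals.get(emis_comm, 0.0) + pj * factor
    return totals


totals = emissions_kt({"C_ng": 2.0}, EMISSION_FACTORS)
print(totals)
```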
8. Known Assumptions and Limitations
- Proxying Economics via AEO: Applying U.S. energy price trajectories to the entire Canadian energy system ignores major geographic realities. Domestic pipeline constraints, provincial royalties (such as Alberta's), and varying carbon tax architectures are smoothed over by inheriting AEO wholesale trends.
- Hardcoded Biofuel Economics: Ethanol, SPK, and Renewable Diesel prices are hardcoded. They do not scale with shifting global agricultural futures, which could skew biofuel adoption pathways beyond the near term.
- Marine & Transport Mappings: Anchoring prices (like MDO -> 0.9 * T_dsl or Agriculture Gas -> T_gsl) involves rigid fractional assumptions. Changes in marine bunker markets that are independent of general diesel markets are invisible to the model.
- Inflation Anchoring: Currency conversion factors and deflators are locked at 2022/2025 baselines inside setup.py. Rapid economic realignments after 2025 are not captured.
- Efficiency Staticism (1.0): Fuel imports carry no transmission/pipeline losses within this subsystem.
CANOE Fuel Sector — Data Sources Catalog
1. Data Source Summary
| Data Type | Primary Source | Granularity | Update Frequency |
|---|---|---|---|
| Fossil Fuel Prices | U.S. EIA AEO (Table 3) | USA Aggregate / Annual | Annual |
| Emission Factors | Direct Internal CSVs | Fuel / Output Type | Static/Periodic |
| Biomass/Uranium Prices | params.yaml config | Domestic Aggregate | Periodic |
| Alternative Fuel Specs | Python Constants | Domestic Aggregate | Periodic |
2. U.S. EIA Annual Energy Outlook (AEO)
- Usage: Provides Table 3 dynamic trajectories anchoring general wholesale and retail fuel prices over the multi-decade horizon.
- Update Procedure: Relies on the EIA open API, so a valid API key must be configured. Adjust the eia_year configuration parameter inside params.yaml to tell eia_api.py which AEO release vintage to query.
3. Internal CSVs
- Usage: upstream_emissions_fuels.csv, direct_comb_emission.csv, and fuel_list.csv manually define the emission factors and fuel taxonomy for fossil fuel combustion in Canada.
- Update Procedure: Must be manually updated inside the /input/ directory. Updates are infrequent unless EPA/ECCC fundamentally recategorize emission equivalences.
4. Internal Constants (Setup/Config)
- Usage: Controls the hardcoded values for biofuel prices, Forex ratios, and deflation baselines.
- Update Procedure: During periodic CANOE audits, users must review inflation_constants() inside setup.py and update the values of currencyadjustment, deflation, eth_price, rdsl_price, and spk_price using current monetary and spot-market data.
5. Update Procedures (Checklist)
During regular annual or biannual CANOE updates:
- EIA API Key & Caching: Confirm that an active EIA developer API key is configured. Clear cache/dataframes.pkl to force CANOE to query the EIA API and ingest the newest data.
- Review Params.yaml: Update eia_year to reflect newly published AEO projections. Determine whether b_price or u_price need spot-market corrections.
- Review Setup.py Constants: Review inflation_constants() inside setup.py. The inflation, Forex ratio, and alternative fuel price (Eth/SPK/RDsl) values need manual review against current Canadian fiscal policy.
- Execution Test: Generate the fuel sector database. Search the logging output for warnings of the form "No base price for X" to ensure AEO has not renamed query strings in a way that breaks downstream price proxy matching.