CANOE CEF Sector — Data Processing Documentation

Comprehensive documentation of the data pipeline that converts upstream data sources into a Temoa-ready SQLite database for the Canadian Open Energy (CANOE) model's CEF (Canada's Energy Future) generalized sector.


Table of Contents

  1. Overview
  2. Pipeline Architecture
  3. Configuration and Setup
  4. External Data Handling
  5. Process and Commodity Generation
  6. Demands, Efficiencies, and Inputs
  7. Time-Slicing (DSDs)
  8. Known Assumptions and Limitations

1. Overview

Purpose

The CANOE CEF sector aggregation tool provides a top-down, generalized representation of Canadian energy demand, taken directly from the Canada's Energy Future (CEF) scenarios published by the Canada Energy Regulator (CER). Instead of building detailed, bottom-up operational models (such as residential buildings or industrial kilns), this module constrains the optimizer to reproduce the aggregate energy demands and fuel mixes published by the CER.

What the Tool Produces

Scope

The model operates across:


2. Pipeline Architecture

Most of the processing logic lives in a single orchestrator module (all_sectors.py) rather than in separate scripts per domain. The steps run in this order:

1. __main__.py          → Execution trigger
2. setup.py             → Instantiates SQLite database, parses `params.yaml`, and loads mapping CSVs
3. all_sectors.py       
   a. build_sectors     → Parses the core CEF CSV, maps domains, drops insignificant flows, and generates topology
   b. build_tester      → Establishes generic time-period bounds matching the data years
   c. build_dsd         → (Optional) Loads electricity demand-specific distributions mapping summer/winter/day/night profiles
   d. build_metadata    → Finalizes dataset labels and reference sources
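
The call order above can be sketched as follows. This is a minimal illustration, not the repository's code: the Config class, the run_pipeline function, and the single-argument signatures of the build_* functions are assumptions made for the example, and each step is a stub that only records that it ran.

```python
from dataclasses import dataclass, field


@dataclass
class Config:
    """Hypothetical stand-in for whatever setup.py produces."""
    use_dsd: bool = False
    steps_run: list = field(default_factory=list)


def build_sectors(config: Config) -> None:
    config.steps_run.append("build_sectors")   # parse CEF CSV, build topology


def build_tester(config: Config) -> None:
    config.steps_run.append("build_tester")    # time-period bounds


def build_dsd(config: Config) -> None:
    config.steps_run.append("build_dsd")       # electricity DSD profiles


def build_metadata(config: Config) -> None:
    config.steps_run.append("build_metadata")  # dataset labels and sources


def run_pipeline(config: Config) -> list:
    """Call the build steps in the documented order; build_dsd is optional."""
    build_sectors(config)
    build_tester(config)
    if config.use_dsd:
        build_dsd(config)
    build_metadata(config)
    return config.steps_run


print(run_pipeline(Config(use_dsd=True)))
```

With use_dsd left at False, the build_dsd step is skipped and the other three steps run unchanged.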

3. Configuration and Setup

General Settings

Defined in input_files/params.yaml:

Domain Mapping

Configured through static CSVs in input_files/:


4. External Data Handling


5. Process and Commodity Generation

Inside all_sectors.py -> build_sectors():

  1. Pivoting to Proportions: Calculates the total energy use of every generated technology node per year, then computes each fuel's proportional share of that total.
  2. Filtering Noise: If a fuel commodity's share of a technology's total falls below prop_thresh, it is dropped entirely, so the solver does not have to carry very small coefficients that can cause numerical instability.
  3. Database Injection: Generates Technology and SectorLabel entries, plus two kinds of Commodity entries: one for Demands and one for physical Fuels.
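
Steps 1 and 2 can be illustrated with a short pure-Python sketch. It is not the code in build_sectors(): the fuel_shares function, the tuple layout of the rows, and the sample numbers are all assumptions made for the example. The source does not say whether the remaining shares are renormalized after filtering, so the sketch leaves them unchanged.

```python
from collections import defaultdict


def fuel_shares(rows, prop_thresh):
    """Compute each fuel's share of a technology's yearly total.

    rows: list of (tech, year, fuel, energy) tuples, assumed to hold
          one row per (tech, year, fuel) combination.
    Returns {(tech, year): {fuel: share}}, omitting shares below prop_thresh.
    """
    # Step 1: total energy per technology and year.
    totals = defaultdict(float)
    for tech, year, fuel, energy in rows:
        totals[(tech, year)] += energy

    # Step 2: proportional share per fuel, dropping the small ones.
    shares = defaultdict(dict)
    for tech, year, fuel, energy in rows:
        total = totals[(tech, year)]
        if total <= 0:
            continue  # nothing to apportion
        share = energy / total
        if share >= prop_thresh:
            shares[(tech, year)][fuel] = share
    return dict(shares)


# Illustrative data only; technology and fuel names are made up.
rows = [
    ("RES", 2030, "electricity", 60.0),
    ("RES", 2030, "natural_gas", 39.5),
    ("RES", 2030, "propane", 0.5),
]
result = fuel_shares(rows, prop_thresh=0.01)
print(result)
```

In this sample, propane makes up 0.5% of the total and falls below the 1% threshold, so only electricity and natural gas remain.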

6. Demands, Efficiencies, and Inputs


7. Time-Slicing (DSDs)

If use_dsd is enabled, the tool runs build_dsd().


8. Known Assumptions and Limitations

  1. Rigid Optimistic Modeling: Because LimitTechInputSplitAnnual is enforced throughout, the resulting Temoa model cannot optimize fuel switching based on its own cost curves. It only reproduces and costs out the scenario already calculated by the Canada Energy Regulator.
  2. Data Staleness: Because the module reads a manually downloaded local flat file (end-use-demand-2023.csv), it does not pick up new CER releases automatically. The repository maintainer must download the new file and overwrite the old one.
  3. Static Efficiency (1.0): Efficiencies are fixed at 1.0, which ignores technology turnover. The tool assumes that the trajectories published by the CER already account for efficiency changes over time.
  4. No Disaggregation: The model accepts broad macro-sectors (e.g., 'Industrial') as given and omits the sub-sector detail needed to study specific industrial clusters or housing envelopes.
  5. DSDs are Electricity Only: Time-slicing is applied only to electricity. Gas ramping and peak heating loads are not represented; they are averaged into annual totals.
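
The combined effect of limitations 1 and 3 can be shown with a small arithmetic sketch. It assumes the input-split limits bind as exact shares, which this documentation implies but does not state. The fuel_inputs function and the numbers are hypothetical.

```python
def fuel_inputs(demand, shares, efficiency=1.0):
    """Fuel consumed per commodity when input shares are fixed.

    With efficiency 1.0, total input equals demand, so each fuel's
    consumption is simply its share times the demand.
    """
    total_input = demand / efficiency
    return {fuel: share * total_input for fuel, share in shares.items()}


# Illustrative values only: a demand of 200 energy units split 60/40.
inputs = fuel_inputs(demand=200.0, shares={"electricity": 0.6, "natural_gas": 0.4})
print(inputs)
```

Under these assumptions the optimizer has no freedom over the fuel mix: once the demand and shares are given, every fuel quantity is determined, and only its cost remains to be evaluated.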

CANOE CEF Sector — Data Sources Catalog

1. Data Source Summary

Data Type                Primary Source              Granularity            Update Frequency
-----------------------  --------------------------  ---------------------  --------------------
End-Use Trajectories     CER Canada's Energy Future  Prov / Macro-Sector    Periodic (1-2 Years)
Electricity Time Slices  Internal Reference CSVs     Hourly/Seasonal Proxy  Static
Mappings                 Internal CSVs               Sector / Fuel Nodes    Static

2. CER Canada's Energy Future (CEF)

3. Internal DSD and Mapping Definitions

4. Update Procedures (Checklist)

During regular annual or biennial CANOE updates:

  1. Flat File Download: Manually download the updated CER End-Use data file and place it in the /input_files/ directory.
  2. Script Update: Update the read call on line 23 of all_sectors.py (e.g., pd.read_csv(config.input_files + 'end-use-demand-[YEAR].csv')) so that it points to the newly downloaded file.
  3. Parameter Audit: Open params.yaml. Verify that the scenario setting exactly matches a scenario name in the newly downloaded CER file (e.g., check that 'Global Net-zero' has not been renamed to 'GNZ2050'). Ensure model_periods matches the intended study horizon.
  4. Mapping Audit: If execution fails with a KeyError during mapping, check whether the CER renamed a fuel or sector label. If so, update sectors.csv or commodities.csv accordingly.
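
Steps 3 and 4 could be checked before running the pipeline with a small pre-flight script along these lines. This is a sketch, not part of the repository: the audit_cef_file function is hypothetical, and the column names Scenario, Sector, and Variable are assumptions about the CER file layout that should be checked against the actual download. The sample data is invented for the example.

```python
import csv
import io


def audit_cef_file(cef_file, scenario, known_sectors, known_fuels):
    """Return a list of problems found; an empty list means the audit passed.

    cef_file: open text file (or file-like object) holding the CER CSV.
    Column names below are assumed, not confirmed by this documentation.
    """
    scenarios, sectors, fuels = set(), set(), set()
    for row in csv.DictReader(cef_file):
        scenarios.add(row["Scenario"])
        sectors.add(row["Sector"])
        fuels.add(row["Variable"])

    problems = []
    if scenario not in scenarios:
        problems.append(
            f"scenario {scenario!r} not found; file has {sorted(scenarios)}"
        )
    for label in sorted(sectors - set(known_sectors)):
        problems.append(f"unmapped sector label: {label!r}")
    for label in sorted(fuels - set(known_fuels)):
        problems.append(f"unmapped fuel label: {label!r}")
    return problems


# Invented sample: the scenario was renamed and a new fuel label appeared.
sample = io.StringIO(
    "Scenario,Sector,Variable,Year,Value\n"
    "GNZ2050,Industrial,Electricity,2030,10\n"
    "GNZ2050,Industrial,Hydrogen,2030,1\n"
)
problems = audit_cef_file(
    sample,
    scenario="Global Net-zero",
    known_sectors={"Industrial"},
    known_fuels={"Electricity"},
)
for problem in problems:
    print(problem)
```

In practice the known_sectors and known_fuels sets would be read from sectors.csv and commodities.csv. Running a check like this first reports renamed labels as a readable list instead of a KeyError partway through the build.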