CANOE Commercial Sector — Data Processing Documentation

Comprehensive documentation of the data pipeline that converts upstream data sources into a Temoa-ready SQLite database for the Canadian Open Energy (CANOE) model's commercial buildings sector.


Table of Contents

  1. Overview
  2. Pipeline Architecture
  3. Configuration and Setup
  4. Pre-Processing
  5. Existing Capacity and Demand
  6. New Capacity
  7. Hourly Demand Profiles (ComStock)
  8. Emissions
  9. Post-Processing
  10. Known Assumptions and Limitations

1. Overview

Purpose

The CANOE commercial sector aggregation tool automatically constructs a Temoa-compatible SQLite database representing the Canadian commercial buildings sector. It dynamically downloads and processes data from multiple sources—including Natural Resources Canada (NRCan), U.S. EIA Annual Energy Outlook (AEO), NREL ComStock, and Statistics Canada—to build a comprehensive set of end-use demands, existing technology stocks, and new technology options.

What the Tool Produces

Scope

The model covers:


2. Pipeline Architecture

The aggregation runs in a fixed sequence, orchestrated by commercial_sector.py and all_subsectors.py:

1. setup.py             → Load configuration, download macro/demographic data, build API bridges
2. instantiate_database → Create or wipe SQLite database using the Temoa schema
3. all_subsectors.py    → Orchestrator for all subsector components
   a. pre_process()     → Write time periods, regions, temporal structures, and commodities
   b. [Per Region Loop]
      i.   comstock_dsd.py      → Download and map hourly demand profiles from NREL ComStock
      ii.  existing_capacity.py → Calculate annual demand, allocate existing stock, and write DSDs
      iii. new_capacity.py      → Write techno-economic parameters for new technology options
   c. aggregate_emissions() → Compute direct emissions for energy combustion
   d. post_process()    → Clean up vintages, attach references, and validate data IDs
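
The ordering above can be sketched as a small driver. This is only an illustration of the call sequence: the run_pipeline function and the steps mapping are hypothetical names, and the real entry points in commercial_sector.py and all_subsectors.py may take different arguments.

```python
from typing import Callable, Dict, List


def run_pipeline(regions: List[str], steps: Dict[str, Callable[..., None]]) -> List[str]:
    """Run the documented stages in order and return a log of the calls made.

    `steps` maps a stage name to a callable (hypothetical interface).
    """
    log: List[str] = []

    def call(name: str, *args: str) -> None:
        steps[name](*args)
        log.append(name if not args else f"{name}({args[0]})")

    # Stages 1 and 2: configuration and database creation.
    call("setup")
    call("instantiate_database")

    # Stage 3a: structural tables shared by all regions.
    call("pre_process")

    # Stage 3b: per-region loop, in the documented order.
    for region in regions:
        call("comstock_dsd", region)
        call("existing_capacity", region)
        call("new_capacity", region)

    # Stages 3c and 3d: run once after all regions are written.
    call("aggregate_emissions")
    call("post_process")
    return log
```

Passing the stages in as callables keeps the ordering testable without touching the network or the database.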

3. Configuration and Setup

3.1 Configuration File (params.yaml)

The primary configuration file dictates macro variables, API links, and processing rules:

Parameter                     Default                  Description
----------------------------  -----------------------  -----------
period_step                   5                        Years between model periods
model_periods                 [2025, 2030, ..., 2050]  Planning horizon periods
base_year                     2022                     Default year for pulling non-timeseries data (NRCan context)
weather_year                  2018                     Year matching the ComStock hourly profiles
other_electrification_factor  0.62                     Target fraction of non-electric fuel shifted to electricity for "other" end-uses by 2050
dsd_tolerance                 0.02                     Minimum threshold for hourly demand as fraction of mean
sec_tolerance                 0.05                     Minimum threshold for secondary energy consumption
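
As a rough illustration, the defaults listed above could be held in a small Python structure once params.yaml has been parsed. The Params class, its first_period and last_period fields, and the validate method are hypothetical; the actual tool may store and check these values differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Params:
    """Defaults from the parameter table (container itself is hypothetical)."""
    period_step: int = 5
    first_period: int = 2025   # assumed: first entry of model_periods
    last_period: int = 2050    # assumed: last entry of model_periods
    base_year: int = 2022
    weather_year: int = 2018
    other_electrification_factor: float = 0.62
    dsd_tolerance: float = 0.02
    sec_tolerance: float = 0.05

    @property
    def model_periods(self) -> List[int]:
        # Expand [2025, 2030, ..., 2050] from the step size.
        return list(range(self.first_period, self.last_period + 1, self.period_step))

    def validate(self) -> None:
        # Basic sanity checks; raise early on obviously bad configuration.
        if self.period_step <= 0:
            raise ValueError("period_step must be positive")
        if not 0.0 <= self.other_electrification_factor <= 1.0:
            raise ValueError("other_electrification_factor must be in [0, 1]")
        if self.dsd_tolerance < 0 or self.sec_tolerance < 0:
            raise ValueError("tolerances must be non-negative")
```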

3.2 Technology and Definitions CSVs

3.3 Data Caching

All remote data (AEO spreadsheets, StatCan zips, ComStock profiles, macro indicators) is downloaded to a local data_cache/ directory so that subsequent aggregation runs can reuse it. Setting force_download: true bypasses the cache.
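
A minimal sketch of this cache-or-download behaviour, assuming one cached file per remote resource. The cached_download and fetch_url helpers are hypothetical names and are not taken from the tool's code.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_url(url: str) -> bytes:
    """Download a URL and return the raw bytes."""
    with urlopen(url) as response:
        return response.read()


def cached_download(
    url: str,
    filename: str,
    cache_dir: Path = Path("data_cache"),
    force_download: bool = False,
    fetch: Callable[[str], bytes] = fetch_url,
) -> Path:
    """Return the local path of a remote file, downloading only when needed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename
    # Hit the network only if the file is missing or the cache is bypassed.
    if force_download or not target.exists():
        target.write_bytes(fetch(url))
    return target
```

The fetch argument exists so the caching logic can be exercised with a stub instead of a real network call.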


4. Pre-Processing

The pre_process() function builds the structural foundation of the model:


5. Existing Capacity and Demand

existing_capacity.py is the core analytical engine of the pipeline.

5.1 Defining the Existing Stock Equation

The model defines the existing commercial building stock mathematically:

DEM (Output Demand) = SEC (Input Fuel) × EFF (Efficiency)

CAP (Capacity) = DEM / (ACF × C2A)
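
The two relations can be written directly as functions. In this sketch ACF is read as an annual capacity factor and C2A as a capacity-to-activity conversion, which is the usual Temoa meaning but is an assumption here; the function names are illustrative.

```python
def output_demand(sec: float, eff: float) -> float:
    """DEM = SEC x EFF: input fuel times efficiency gives output demand."""
    return sec * eff


def existing_capacity(dem: float, acf: float, c2a: float) -> float:
    """CAP = DEM / (ACF x C2A)."""
    if acf <= 0 or c2a <= 0:
        raise ValueError("ACF and C2A must be positive")
    return dem / (acf * c2a)


# Illustrative numbers only, in arbitrary consistent units.
dem = output_demand(sec=100.0, eff=0.8)
cap = existing_capacity(dem, acf=0.5, c2a=1.0)
```

With these illustrative inputs, 100 units of fuel at 80% efficiency serve 80 units of demand, which at a 50% capacity factor requires 160 units of capacity.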

5.2 Base Year Secondary Energy Consumption (SEC)

Base year (e.g., 2022) fuel consumption for space heating, space cooling, and other end-uses is downloaded from the NRCan Comprehensive Energy Use Database (CEUD).

5.3 Proxied Efficiencies and Market Shares

Since NRCan does not provide detailed installed-base efficiencies, the tool proxies them from the U.S. EIA Annual Energy Outlook (AEO) Commercial Demand Module (CDM):

5.4 "Other" End-Uses

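All residual energy use other than space heating and space cooling is lumped into a generalized "other" end-use. The fuel shares of this "other" demand evolve over the planning horizon: a growing fraction of non-electric fuel use is shifted to electricity, reaching other_electrification_factor (default 0.62) by 2050.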
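
One way to picture the evolving fuel shares is a linear ramp toward the 2050 target set by other_electrification_factor. The sketch below assumes the ramp starts at zero in the base year and applies the same shift to every non-electric fuel; both assumptions, and the function names, are illustrative rather than taken from the tool.

```python
from typing import Dict


def electrification_fraction(
    period: int, base_year: int = 2022, target_year: int = 2050, target: float = 0.62
) -> float:
    """Linearly interpolate the shifted fraction from 0 (base year) to target."""
    if target_year <= base_year:
        raise ValueError("target_year must be after base_year")
    progress = (period - base_year) / (target_year - base_year)
    progress = min(max(progress, 0.0), 1.0)  # clamp outside the ramp
    return target * progress


def shift_fuel_shares(
    shares: Dict[str, float], fraction: float, electricity: str = "electricity"
) -> Dict[str, float]:
    """Move `fraction` of every non-electric share onto electricity."""
    shifted: Dict[str, float] = {}
    moved = 0.0
    for fuel, share in shares.items():
        if fuel == electricity:
            continue
        moved += share * fraction
        shifted[fuel] = share * (1.0 - fraction)
    shifted[electricity] = shares.get(electricity, 0.0) + moved
    return shifted
```

Because the shift only moves share between fuels, the shares still sum to the same total after each period.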

5.5 Scaling Future Demand

Base year service demands are scaled to each future model period defined in params.yaml using GDP growth projections derived from the Canada Energy Regulator (CER) Canada's Energy Future reports.
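
A simple reading of this scaling is that demand grows in direct proportion to GDP relative to the base year. That proportionality is an assumption of this sketch, and scale_demand is a hypothetical helper; the tool may apply a different relationship.

```python
from typing import Dict, Iterable


def scale_demand(
    base_demand: float,
    gdp_by_year: Dict[int, float],
    base_year: int,
    periods: Iterable[int],
) -> Dict[int, float]:
    """Scale a base-year demand by the GDP ratio of each period to the base year."""
    base_gdp = gdp_by_year[base_year]
    if base_gdp <= 0:
        raise ValueError("base-year GDP must be positive")
    return {p: base_demand * gdp_by_year[p] / base_gdp for p in periods}


# Placeholder GDP index values for illustration, not CER projections.
example = scale_demand(50.0, {2022: 100.0, 2025: 110.0, 2030: 120.0}, 2022, [2025, 2030])
```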


6. New Capacity

For prospective technology adoption, new_capacity.py ingests data entirely from the EIA AEO ktekx.xlsx technology menus.


7. Hourly Demand Profiles (ComStock)

To capture the intra-annual temporal dynamics necessary for capacity expansion modeling, hourly demand shapes are synthesized via comstock_dsd.py:
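
As an illustration of how an hourly profile might become a demand distribution, the sketch below zeroes hours that fall below dsd_tolerance times the mean and then normalizes the remainder to sum to 1. This interpretation of dsd_tolerance is an assumption, and to_dsd is a hypothetical name; comstock_dsd.py may treat small values differently.

```python
from typing import List, Sequence


def to_dsd(hourly: Sequence[float], tolerance: float = 0.02) -> List[float]:
    """Convert hourly demand into fractions of annual demand.

    Hours below `tolerance * mean` are set to zero (assumed behaviour),
    then the profile is renormalized so the fractions sum to 1.
    """
    if not hourly:
        raise ValueError("hourly profile is empty")
    mean = sum(hourly) / len(hourly)
    if mean <= 0:
        raise ValueError("hourly profile has no positive demand")
    floor = tolerance * mean
    clipped = [h if h >= floor else 0.0 for h in hourly]
    total = sum(clipped)
    return [h / total for h in clipped]
```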


8. Emissions

If include_emissions is set to true in params.yaml:
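
A minimal sketch of a flag-gated emissions calculation, assuming direct emissions are fuel consumption multiplied by a per-fuel emission factor. The aggregate_emissions signature shown here is hypothetical, and no real emission factors are included.

```python
from typing import Dict


def aggregate_emissions(
    fuel_use: Dict[str, float],
    factors: Dict[str, float],
    include_emissions: bool = True,
) -> Dict[str, float]:
    """Return emissions per fuel as consumption x factor, or {} when disabled."""
    if not include_emissions:
        return {}
    missing = set(fuel_use) - set(factors)
    if missing:
        # Fail loudly rather than silently skipping a fuel.
        raise KeyError(f"no emission factor for: {sorted(missing)}")
    return {fuel: use * factors[fuel] for fuel, use in fuel_use.items()}
```

Raising on a missing factor is a design choice of the sketch: a silent zero would understate total emissions.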


9. Post-Processing

The final step handles bookkeeping:


10. Known Assumptions and Limitations

  1. US Proxies for Canadian Building Stock: The commercial pipeline relies heavily on the EIA AEO for specific HVAC efficiencies, technology costs, and installed base distributions. Furthermore, NREL ComStock is used for hourly usage profiles. While efforts are made to align similar climates (e.g., US Census Divisions matching Canadian Provinces), systemic differences between US and Canadian commercial building codes, occupancy norms, and construction materials are not explicitly captured.
  2. Atlantic Aggregation Proxying: Extracting individual province data from NRCan’s aggregated Atlantic region requires using a different Statistics Canada dataset with different survey scopes. This disaggregation is an approximation.
  3. Fixed Peak-to-Mean Demand Profiles: The hourly demand profile shapes generated from the base weather year (e.g., 2018) are assumed to remain constant across all future model periods.
  4. Simplification of "Other" Demands: Distinct end-uses such as water heating, lighting, and refrigeration are aggressively aggregated into a single "other" commodity, smoothing out individual usage profiles. The electrification transition for these is deterministic (linear interpolation) rather than endogenously optimized.

CANOE Commercial Sector — Data Sources Catalog

1. Data Source Summary

Data Type              Primary Source              Granularity            Update Frequency
---------------------  --------------------------  ---------------------  ----------------
Existing Demand/Fuel   NRCan CEUD                  Regional / Annual      Annual
Future Demand Scaling  CER Canada's Energy Future  National/Regional GDP  Annual/Biannual
Tech Shares & Costs    U.S. EIA AEO                US Census Div.         Annual
Hourly Demand Shapes   NREL ComStock               US State / Hourly      Periodic updates
Emissions Factors      US EPA GHG Hub              Fuel Type              Annual
Population Config      Statistics Canada           Provincial / Annual    Annual

2. NRCan Comprehensive Energy Use Database (CEUD)

3. U.S. EIA Annual Energy Outlook (AEO) Commercial Demand Module

4. NREL ComStock

5. Statistics Canada

6. CER Canada's Energy Future

7. US EPA GHG Emission Factors Hub

8. Renewables Ninja (Weather Mapping)

9. Update Procedures (Checklist)

As defined in annual_update_checklist.txt:

  1. Latest AEO NEMS data: Download and replace ktekx.xlsx in input_files. Update row/column indices in code and names in new_technologies.csv.
  2. Update datasets: Modify data years and reference strings in params.yaml.
  3. Clear Cache: Empty the data_cache/ directory to force the pipeline to pull freshly aligned data.
  4. Execute & Test: Run commercial_sector.py and resolve any CSV layout or API structural changes from upstream data providers. Ensure the output aligns with the new Temoa SQLite schema.
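
Step 3 of the checklist can be scripted. The sketch below empties a cache directory while leaving the directory itself in place; clear_cache is a hypothetical helper and is not part of the tool.

```python
import shutil
from pathlib import Path


def clear_cache(cache_dir: Path = Path("data_cache")) -> int:
    """Delete everything inside cache_dir and return the number of entries removed."""
    if not cache_dir.exists():
        return 0
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)  # remove cached sub-directories recursively
        else:
            entry.unlink()
        removed += 1
    return removed
```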