bloc.non_regression#

Non-regression snapshot utilities for Bloc model simulations.

This module provides tools to write and compare deterministic CSV snapshots derived from simulation scenario_results dicts, enabling model-level non-regression testing across code changes.

The snapshot schema mirrors the oo-refactor Calculation Note data model. Once that branch merges, only the extraction adapter needs to change; the snapshot format, the comparison rules, and the committed baselines stay as they are.

Public API#

CSV dialect (all files)#

  • UTF-8, no BOM, LF line endings, trailing newline.

  • csv.QUOTE_MINIMAL, comma delimiter, decimal point, no thousands separator.

  • Float format: repr(float(v)) for exact round-trip.

  • nan, inf, -inf serialized as lowercase literals.

  • All rows sorted by the set of key columns before writing.
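The dialect rules above can be sketched with the standard csv module. This is a minimal illustration, not the module's actual writer: the helper names format_cell and rows_to_csv_text are hypothetical, and sorting the key columns as strings is an assumption.

```python
import csv
import io


def format_cell(value):
    """Format one cell: floats via repr() for exact round-trip, others via str()."""
    if isinstance(value, float):
        # repr() already yields the lowercase literals 'nan', 'inf' and '-inf'.
        return repr(value)
    return str(value)


def rows_to_csv_text(header, rows, key_columns):
    """Return CSV text with LF line endings and rows sorted by the key columns."""
    key_idx = [header.index(name) for name in key_columns]
    # Assumption: key columns are compared as strings.
    ordered = sorted(rows, key=lambda row: tuple(str(row[i]) for i in key_idx))
    buf = io.StringIO(newline="")
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(header)
    for row in ordered:
        writer.writerow([format_cell(v) for v in row])
    return buf.getvalue()
```

When writing to disk, opening the file with encoding="utf-8" and newline="" keeps the output BOM-free and prevents the platform from translating the LF terminators.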

Schema version#

snapshot_schema_version is recorded in metadata.csv. The comparator refuses to compare files with different schema versions and raises a clear error. Bump SNAPSHOT_SCHEMA_VERSION whenever a non-trivial schema change is made.

Attributes#

Classes#

ToleranceRule

Per-key numeric tolerance override loaded from tolerances.yaml.

NodeRule

Per-model node-name mapping loaded from tolerances.yaml.

SnapshotConfig

Configuration for snapshot writing and comparison.

FileMismatch

Mismatch details for a single CSV file.

ComparisonReport

Report produced by compare_scenario_snapshots().

Functions#

write_scenario_snapshots(scenario_id, scenario_data, ...)

Write the five canonical CSV snapshot files for one scenario.

compare_scenario_snapshots(scenario_id, baseline_dir, ...)

Compare produced snapshot CSV files against committed baselines.

load_tolerance_config(path)

Load a SnapshotConfig from a YAML tolerances file.

non_regression_out_dir(script_dir, scenario_id)

Return the canonical non-regression output path for one scenario.

find_produced_snapshots(script_dir)

Return [(scenario_id, path), ...] for all produced snapshot dirs.

baseline_dir_for(repo_root, model_id, scenario_id)

Return the committed baseline directory for one scenario.

snapshot_from_ctsim(ctsim, script_path[, scenario_id, cfg])

Write non-regression snapshots directly from a ctwrap.Simulation object.

model_id_from_script(script_path, repo_root)

Derive the model_id from a model script path.

Module Contents#

bloc.non_regression.SNAPSHOT_SCHEMA_VERSION = 1#
bloc.non_regression.OPTIONAL_BASELINE_MODEL_IDS: frozenset[str]#
bloc.non_regression.DEFAULT_COMPOSITION_FLOOR = 1e-12#
class bloc.non_regression.ToleranceRule#

Per-key numeric tolerance override loaded from tolerances.yaml.

model: str = '*'#
scenario: str = '*'#
key: str = '*'#
rtol: float = 0.0001#
atol: float = 1e-10#
class bloc.non_regression.NodeRule#

Per-model node-name mapping loaded from tolerances.yaml.

Lets a renamed reactor network compare against an unchanged baseline without rewriting the committed CSV files.

model, scenario

fnmatch patterns selecting which scenarios the rule applies to.

aliases#

Map of baseline_node -> canonical_node. Node names are matched after the "[N] " display-number prefix is stripped. The same map is applied to both baseline and actual rows, so any entry whose key is the current node name is a harmless identity. Several baseline names may map to one canonical node (e.g. a node split that was later merged).

ignored#

Node names (after prefix stripping) present in the baseline but not yet implemented in the current model. Rows for these nodes are skipped on both sides instead of being reported as missing.

model: str = '*'#
scenario: str = '*'#
aliases: dict[str, str]#
ignored: frozenset[str]#
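The aliasing and ignore behaviour described above might look roughly like the sketch below. The function name canonical_node, the exact prefix regex, and the choice to test ignored before applying aliases are assumptions made for illustration; the node names in the usage are invented.

```python
import re

# Assumed shape of the display-number prefix, e.g. "[3] Reformer".
_PREFIX = re.compile(r"^\[\d+\]\s*")


def canonical_node(name, aliases, ignored):
    """Return the canonical node name, or None when the node is ignored."""
    stripped = _PREFIX.sub("", name)
    if stripped in ignored:
        # Rows for ignored nodes are skipped on both sides.
        return None
    # Names without an alias entry map to themselves.
    return aliases.get(stripped, stripped)
```

Applying the same function to baseline and actual rows means a renamed node and its old name end up under one key before rows are compared.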
class bloc.non_regression.SnapshotConfig#

Configuration for snapshot writing and comparison.

Parameters:
  • composition_floor – Drop species with |value| < composition_floor from compositions.csv.

  • default_rtol – Default relative tolerance for numeric comparisons.

  • default_atol – Default absolute tolerance for numeric comparisons.

  • overrides – List of ToleranceRule objects; first match wins.

  • node_rules – List of NodeRule objects supplying node-name aliases and ignored nodes.

  • max_diff_rows – Maximum number of differing rows shown per file in the report.

composition_floor: float = 1e-12#
default_rtol: float = 0.0001#
default_atol: float = 1e-10#
overrides: list[ToleranceRule] = []#
node_rules: list[NodeRule] = []#
max_diff_rows: int = 50#
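The "first match wins" rule for overrides can be illustrated with a small stand-in. The Rule dataclass below only mirrors the fields of ToleranceRule, and resolve_tolerance is a hypothetical helper. Matching all three patterns with fnmatch is an assumption here: the documentation states it explicitly only for NodeRule.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """Stand-in with the same fields and defaults as ToleranceRule."""
    model: str = "*"
    scenario: str = "*"
    key: str = "*"
    rtol: float = 1e-4
    atol: float = 1e-10


def resolve_tolerance(overrides, model_id, scenario_id, key,
                      default_rtol=1e-4, default_atol=1e-10):
    """Return (rtol, atol) from the first matching rule, else the defaults."""
    for rule in overrides:
        if (fnmatchcase(model_id, rule.model)
                and fnmatchcase(scenario_id, rule.scenario)
                and fnmatchcase(key, rule.key)):
            return rule.rtol, rule.atol
    return default_rtol, default_atol
```

Because the first match wins, more specific rules need to be listed before broader ones. fnmatchcase is used in the sketch so that matching is case-sensitive on every platform.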
class bloc.non_regression.FileMismatch#

Mismatch details for a single CSV file.

filename: str#
changed: list[dict[str, Any]] = []#
missing: list[dict[str, Any]] = []#
extra: list[dict[str, Any]] = []#
property ok: bool#

Return True when the file has no mismatches.

format_report(max_rows=50)#

Return a human-readable diff-style report for this file.

class bloc.non_regression.ComparisonReport#

Report produced by compare_scenario_snapshots().

scenario_id: str#
model_id: str#
baseline_dir: pathlib.Path#
actual_dir: pathlib.Path#
file_mismatches: list[FileMismatch] = []#
schema_error: str = ''#
property ok: bool#

Return True if the scenario matches the baseline fully.

format_report(max_rows=50)#

Return the full human-readable comparison report.

format_summary(top_n=5)#

Return a compact digest with the top-N mismatches by relative error.

Surfaces the rows with the largest numeric relative error (drawn from the changed lists of all files) and up to top_n missing-in-actual entries. Type-change rows (no rel_err field) appear after numeric rows. Rows in extra are omitted from the summary because they are less immediately actionable than missing or changed ones.
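The ordering of changed rows in the summary can be sketched as follows. Only the rel_err field comes from the description above. The helper name top_changed, the other dict keys, and applying top_n to the combined list are assumptions.

```python
def top_changed(changed_rows, top_n=5):
    """Order changed rows: largest rel_err first, type-change rows last."""
    numeric = [row for row in changed_rows if "rel_err" in row]
    type_changes = [row for row in changed_rows if "rel_err" not in row]
    numeric.sort(key=lambda row: row["rel_err"], reverse=True)
    # Assumption: the top_n cut is applied to the combined list.
    return (numeric + type_changes)[:top_n]
```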

bloc.non_regression.write_scenario_snapshots(scenario_id, scenario_data, out_dir, model_id, cfg=None)#

Write the five canonical CSV snapshot files for one scenario.

Parameters:
  • scenario_id – Scenario identifier used as the first column in every CSV file.

  • scenario_data – Dict with keys outputs, node_data, physical_node_names, energy_flows, metadata. Missing keys default to empty.

  • out_dir – Directory where the five CSV files are written.

  • model_id – Relative model identifier (e.g. "SPRING_A3"); stored in metadata.

  • cfg – Optional SnapshotConfig; defaults are used when None.

bloc.non_regression.compare_scenario_snapshots(scenario_id, baseline_dir, actual_dir, model_id, cfg=None)#

Compare produced snapshot CSV files against committed baselines.

Parameters:
  • scenario_id – Scenario identifier (used for tolerance resolution).

  • baseline_dir – Directory containing committed baseline CSV files.

  • actual_dir – Directory containing freshly produced CSV files.

  • model_id – Relative model identifier (e.g. "SPRING_A3").

  • cfg – Optional SnapshotConfig.

Returns:

A report containing per-file mismatch details.

Return type:

ComparisonReport
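How rtol and atol combine is not spelled out on this page. One common convention, used for example by numpy.isclose, is |actual - baseline| <= atol + rtol * |baseline|. The sketch below assumes that formula. It also assumes that two nan values match and that infinities match only when identical. values_match is a hypothetical helper, not part of the module.

```python
import math


def values_match(baseline, actual, rtol=1e-4, atol=1e-10):
    """Compare two floats under an assumed numpy.isclose-style tolerance."""
    if math.isnan(baseline) or math.isnan(actual):
        # Assumption: nan matches only nan.
        return math.isnan(baseline) and math.isnan(actual)
    if math.isinf(baseline) or math.isinf(actual):
        # Assumption: infinities must be identical, including sign.
        return baseline == actual
    return abs(actual - baseline) <= atol + rtol * abs(baseline)
```

With the default rtol of 1e-4, a baseline of 1000.0 tolerates a drift of roughly 0.1, while atol only matters for values close to zero.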

bloc.non_regression.load_tolerance_config(path)#

Load a SnapshotConfig from a YAML tolerances file.

The YAML file has the following structure:

defaults:
  rtol: 1.0e-4
  atol: 1.0e-10

overrides:
  - model: "SPRING_A3"
    scenario: "*"
    key: "X_*"
    rtol: 1.0e-4

Parameters:

path – Path to tolerances.yaml.

Returns:

A configuration populated from the file; defaults are applied when keys are absent.

Return type:

SnapshotConfig
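The default handling can be sketched starting from an already-parsed mapping, so that no YAML library is needed for the illustration. Rule and Config are simplified stand-ins for ToleranceRule and SnapshotConfig, config_from_mapping is hypothetical, and letting an override without rtol or atol fall back to the file-level defaults is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Simplified stand-in for ToleranceRule."""
    model: str = "*"
    scenario: str = "*"
    key: str = "*"
    rtol: float = 1e-4
    atol: float = 1e-10


@dataclass
class Config:
    """Simplified stand-in for SnapshotConfig (tolerance fields only)."""
    default_rtol: float = 1e-4
    default_atol: float = 1e-10
    overrides: list = field(default_factory=list)


def config_from_mapping(data):
    """Build a Config from a parsed tolerances mapping, filling in defaults."""
    data = data or {}
    defaults = data.get("defaults") or {}
    cfg = Config(
        default_rtol=float(defaults.get("rtol", 1e-4)),
        default_atol=float(defaults.get("atol", 1e-10)),
    )
    for entry in data.get("overrides") or []:
        cfg.overrides.append(Rule(
            model=entry.get("model", "*"),
            scenario=entry.get("scenario", "*"),
            key=entry.get("key", "*"),
            # Assumption: missing tolerances inherit the file-level defaults.
            rtol=float(entry.get("rtol", cfg.default_rtol)),
            atol=float(entry.get("atol", cfg.default_atol)),
        ))
    return cfg
```

In practice the mapping would come from a YAML loader reading tolerances.yaml. An empty or missing file section simply yields the defaults.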

bloc.non_regression.non_regression_out_dir(script_dir, scenario_id)#

Return the canonical non-regression output path for one scenario.

Written to:

<script_dir>/Results/_non_regression/<scenario_id>/

This location is git-ignored; the parent runner discovers it after each model subprocess finishes.

Parameters:
  • script_dir – Directory of the model run script.

  • scenario_id – Scenario identifier used as the leaf directory name.

bloc.non_regression.find_produced_snapshots(script_dir)#

Return [(scenario_id, path), ...] for all produced snapshot dirs.

Parameters:

script_dir – Directory of the model run script.
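The output layout and its discovery can be sketched with pathlib. The directory structure is the one documented above. The function names, the sorted return order, and the empty list for a missing directory are assumptions.

```python
from pathlib import Path


def out_dir_sketch(script_dir, scenario_id):
    """Build <script_dir>/Results/_non_regression/<scenario_id>."""
    return Path(script_dir) / "Results" / "_non_regression" / scenario_id


def find_snapshots_sketch(script_dir):
    """Return [(scenario_id, path), ...] for each produced snapshot directory."""
    root = Path(script_dir) / "Results" / "_non_regression"
    if not root.is_dir():
        # Assumption: nothing produced yet means an empty result.
        return []
    # Assumption: results are ordered by scenario_id for determinism.
    return sorted((p.name, p) for p in root.iterdir() if p.is_dir())
```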

bloc.non_regression.baseline_dir_for(repo_root, model_id, scenario_id)#

Return the committed baseline directory for one scenario.

Parameters:
  • repo_root – Root of the repository.

  • model_id – Relative model identifier (e.g. "SPRING_A3").

  • scenario_id – Scenario identifier.

bloc.non_regression.snapshot_from_ctsim(ctsim, script_path, scenario_id='BASE', cfg=None)#

Write non-regression snapshots directly from a ctwrap.Simulation object.

This adapter is intended for model scripts that run a single scenario without going through bloc.calc_note.generate_calculation_note(). It extracts outputs from ctsim.data (res_dic if available, plus reactor state if sim is available) and writes the canonical CSV files.

Parameters:
  • ctsim – A ctwrap.Simulation object whose .run(...) has completed.

  • script_path – Path to the calling model script (used to derive model_id and the output directory).

  • scenario_id – Identifier for this scenario; defaults to "BASE".

  • cfg – Optional SnapshotConfig; defaults are used when None.

Notes

The function is best-effort: fields that cannot be extracted are silently omitted. Prefer bloc.calc_note.generate_calculation_note(), which hooks snapshots automatically and provides richer node_data.

bloc.non_regression.model_id_from_script(script_path, repo_root)#

Derive the model_id from a model script path.

Returns the relative directory under models/, e.g. "SPRING_A3" for models/SPRING_A3/run_concept.py.

Parameters:
  • script_path – Absolute path to the model script.

  • repo_root – Root of the repository.
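A minimal sketch of the derivation, assuming the model_id is the script's parent directory expressed relative to <repo_root>/models. model_id_sketch is a hypothetical name, and PurePosixPath is used only to keep the example platform-independent. Support for nested model directories is an assumption.

```python
from pathlib import PurePosixPath


def model_id_sketch(script_path, repo_root):
    """Return the script's directory relative to <repo_root>/models."""
    models_root = PurePosixPath(repo_root) / "models"
    script_dir = PurePosixPath(script_path).parent
    # relative_to() raises ValueError when the script is not under models/.
    return script_dir.relative_to(models_root).as_posix()
```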