Fix fp-stability log preservation and add local/inline reporting #1537

Merged: sbryngelson merged 5 commits into MFlowCode:master from sbryngelson:fix-fp-stability-logs on Jun 4, 2026

@sbryngelson sbryngelson commented Jun 4, 2026

Description

mfc: Starting...
mfc: OK > (venv) Entered the Python 3.12.5 virtual environment (>= 3.9).

      .=++*:          -+*+=.        | sbryngelson3@atl1-1-03-020-32-0.pace.gatech.edu [Linux]
     :+   -*-        ==   =* .      | -------------------------------------------------------
   :*+      ==      ++    .+-       | 
  :*##-.....:*+   .#%+++=--+=:::.   | --jobs 1
  -=-++-======#=--**+++==+*++=::-:. | --no-mpi --no-gpu --debug --no-reldebug --no-gcov --no-unified --no-single --no-mixed --no-fastmath
 .:++=----------====+*= ==..:%..... | --targets pre_process, simulation, and post_process
  .:-=++++===--==+=-+=   +.  :=     | 
  +#=::::::::=%=. -+:    =+   *:    | ------------------------------------------------------------------
 .*=-=*=..    :=+*+:      -...--    | $ ./mfc.sh (build, run, test, clean, new, validate, params) --help


 MFC Floating-Point Stability Suite
   verrou:      /storage/home/hcoda1/6/sbryngelson3/.local/verrou/bin/valgrind
   simulation:  /storage/project/r-sbryngelson3-0/sbryngelson3/MFC-verrou/build/install/b73eac3112/bin/simulation
   pre_process: /storage/project/r-sbryngelson3-0/sbryngelson3/MFC-verrou/build/install/543b9105dd/bin/pre_process
   samples:     5
   features:    float-proxy, vprec-sweep, cancellation, float-max
   logs:        /storage/project/r-sbryngelson3-0/sbryngelson3/MFC-verrou/fp-stability-logs

 sod_standard: 1-D standard Sod, p_L/p_R=10, ideal gas (well-conditioned baseline)
     pass floor: >= 24 significant bits retained
     running pre_process...
     reference run (rounding=nearest)...
     random-rounding runs (N=5)...
     PASS  46.8 bits retained (floor 24)  max_dev=1.998e-14
     float proxy: dev=1.240e-06  (single-precision sensitivity)
     VPREC precision sweep:
       52 bits (double): dev=0.000e+00
       23 bits (single): dev=1.240e-06  FAIL
       16 bits (~half): dev=1.880e-04  FAIL
       10 bits (ultra-low): dev=3.769e-02  FAIL
     cancellation detection...
     cancellation: 23 site(s), worst loses >= 14 of ~16 digits
       >= 14 digits lost  m_rhs.fpp:921
       >= 14 digits lost  m_rhs.fpp:1176
       >= 14 digits lost  m_riemann_solvers.fpp:3071 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3100 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3131 (fypp-expanded)
       ...and 18 more; see fp-stability-logs/summary.md
     18 inside fypp expansions - line maps to multiple instances
     float-max overflow check...
     float-max: no overflows

 sod_strong: 1-D Sod, p_L/p_R=100,000, ideal gas
     ill-conditioning: HLLC xi factor: (s_L - vel_L)/(s_L - s_S) cancels near sonic contact
     pass floor: >= 24 significant bits retained
     running pre_process...
     reference run (rounding=nearest)...
     random-rounding runs (N=5)...
     PASS  47.0 bits retained (floor 24)  max_dev=1.728e-11
     float proxy: dev=1.088e-03  (single-precision sensitivity)
     VPREC precision sweep:
       52 bits (double): dev=0.000e+00
       23 bits (single): dev=1.088e-03  FAIL
       16 bits (~half): dev=3.734e-01  FAIL
       10 bits (ultra-low): dev=3.993e+01  FAIL
     cancellation detection...
     cancellation: 27 site(s), worst loses >= 14 of ~16 digits
       >= 14 digits lost  m_rhs.fpp:921
       >= 14 digits lost  m_rhs.fpp:1176
       >= 14 digits lost  m_riemann_solvers.fpp:3023 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3071 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3099 (fypp-expanded)
       ...and 22 more; see fp-stability-logs/summary.md
     23 inside fypp expansions - line maps to multiple instances
     float-max overflow check...
     float-max: no overflows

 water_stiffened: 1-D water shock, stiffened EOS (pi_inf=4046)
     ill-conditioning: Pressure recovery: p=(E-pi_inf)/gamma loses ~4 digits (pi_inf/p_right~40,000)
     pass floor: >= 24 significant bits retained
     running pre_process...
     reference run (rounding=nearest)...
     random-rounding runs (N=5)...
     PASS  34.9 bits retained (floor 24)  max_dev=3.122e-09
     float proxy: dev=1.068e-02  (single-precision sensitivity)
     VPREC precision sweep:
       52 bits (double): dev=0.000e+00
       23 bits (single): dev=1.068e-02  FAIL
       16 bits (~half): dev=1.109e+00  FAIL
       10 bits (ultra-low): dev=5.305e+01  FAIL
     cancellation detection...
     cancellation: 29 site(s), worst loses >= 14 of ~16 digits
       >= 14 digits lost  m_rhs.fpp:921
       >= 14 digits lost  m_rhs.fpp:1176
       >= 14 digits lost  m_riemann_solvers.fpp:3023 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3071 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3100 (fypp-expanded)
       ...and 24 more; see fp-stability-logs/summary.md
     24 inside fypp expansions - line maps to multiple instances
     float-max overflow check...
     float-max: no overflows

 air_water_interface: 1-D air/water isobaric contact (two-fluid, pi_inf=4046)
     ill-conditioning: Mixed-cell pressure recovery: E-alpha_w*gamma_w*pi_inf cancels when alpha_w<<1
     pass floor: >= 24 significant bits retained
     running pre_process...
     reference run (rounding=nearest)...
     random-rounding runs (N=5)...
     PASS  47.5 bits retained (floor 24)  max_dev=2.046e-11
     float proxy: dev=1.855e-04  (single-precision sensitivity)
     VPREC precision sweep:
       52 bits (double): dev=0.000e+00
       23 bits (single): dev=1.855e-04
       16 bits (~half): dev=4.437e-02  FAIL
       10 bits (ultra-low): dev=2.602e+00  FAIL
     cancellation detection...
     cancellation: 18 site(s), worst loses >= 14 of ~16 digits
       >= 14 digits lost  m_rhs.fpp:921
       >= 14 digits lost  m_rhs.fpp:1176
       >= 14 digits lost  m_riemann_solvers.fpp:3023 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3071 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3099 (fypp-expanded)
       ...and 13 more; see fp-stability-logs/summary.md
     13 inside fypp expansions - line maps to multiple instances
     float-max overflow check...
     float-max: no overflows

 bubble_rp: 1-D bubbly water, pressure step 2:1 driving Rayleigh-Plesset oscillations (nb=1, Keller-Miksis)
     ill-conditioning: RP ODE: (p_bub - p_ext) cancels near bubble equilibrium
     pass floor: >= 24 significant bits retained
     running pre_process...
     reference run (rounding=nearest)...
     random-rounding runs (N=5)...
     PASS  30.3 bits retained (floor 24)  max_dev=1.563e-09
     float proxy: dev=1.150e-02  (single-precision sensitivity)
     VPREC precision sweep:
       52 bits (double): dev=0.000e+00
       23 bits (single): dev=1.150e-02  FAIL
       16 bits (~half): dev=6.564e-01  FAIL
       10 bits (ultra-low): dev=1.450e+01  FAIL
     cancellation detection...
     cancellation: 35 site(s), worst loses >= 14 of ~16 digits
       >= 14 digits lost  m_bubbles.fpp:190
       >= 14 digits lost  m_bubbles_EE.fpp:108
       >= 14 digits lost  m_rhs.fpp:921
       >= 14 digits lost  m_rhs.fpp:1176
       >= 14 digits lost  m_riemann_solvers.fpp:2606 (fypp-expanded)
       ...and 30 more; see fp-stability-logs/summary.md
     27 inside fypp expansions - line maps to multiple instances
     float-max overflow check...
     float-max: no overflows

 low_mach: 1-D water shock with low_Mach=1 HLLC correction active
     ill-conditioning: low_Mach correction: velocity perturbation ~u/c cancels severely at M~0
     pass floor: >= 24 significant bits retained
     running pre_process...
     reference run (rounding=nearest)...
     random-rounding runs (N=5)...
     PASS  29.9 bits retained (floor 24)  max_dev=9.954e-08
     float proxy: dev=4.257e-02  (single-precision sensitivity)
     VPREC precision sweep:
       52 bits (double): dev=0.000e+00
       23 bits (single): dev=4.257e-02  FAIL
       16 bits (~half): dev=2.410e+00  FAIL
       10 bits (ultra-low): dev=5.653e+01  FAIL
     cancellation detection...
     cancellation: 29 site(s), worst loses >= 14 of ~16 digits
       >= 14 digits lost  m_rhs.fpp:921
       >= 14 digits lost  m_rhs.fpp:1176
       >= 14 digits lost  m_riemann_solvers.fpp:3023 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3071 (fypp-expanded)
       >= 14 digits lost  m_riemann_solvers.fpp:3100 (fypp-expanded)
       ...and 24 more; see fp-stability-logs/summary.md
     24 inside fypp expansions - line maps to multiple instances
     float-max overflow check...
     float-max: no overflows

 Results (92s):  6 passed  0 failed
   PASS  sod_standard
   PASS  sod_strong
   PASS  water_stiffened
   PASS  air_water_interface
   PASS  bubble_rp
   PASS  low_mach
   report: /storage/project/r-sbryngelson3-0/sbryngelson3/MFC-verrou/fp-stability-logs/summary.md

mfc: (venv) Exiting the Python virtual environment. 

The suite created fp-stability-logs/ and printed its path, but nothing ever wrote into it: all logs went to a tempdir deleted in _run_case's finally block. The directory was empty after every run, both locally and in CI, where the artifact upload of fp-stability-logs/ has been shipping nothing. The markdown report was also emitted only to GITHUB_STEP_SUMMARY, so local runs got no report at all.

This PR fixes both and improves error-source visibility:

  • Preserve logs: each case's text artifacts (*.log, *.out, *.inp, cancel_gen.txt) are copied into fp-stability-logs/<case>/ (mirroring the run-dir layout) before the work dir is removed, including on failure paths. Bulky .dat field data is skipped.
  • Local report: the markdown report is always written to fp-stability-logs/summary.md (path printed at the end); still appended to GITHUB_STEP_SUMMARY when set.
  • Console lines: the top-5 cancellation sites (ranked by digits lost, fypp-expansion ambiguity marked) and float-max overflow sites print file:line directly to the console instead of just counts.
  • Source snippets: summary.md embeds a line-numbered excerpt (marked line ± 3 lines of context) under each cancellation/float-max site as a nested f90 code block, so the offending code is readable without opening the file.
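As a rough illustration of the "Local report" bullet above, the following is a minimal sketch of writing the report to a local file and additionally appending it to the CI step summary when the environment variable is present. The function name `write_report` and its signature are hypothetical and are not taken from the PR's code; only the `summary.md` file name and the `GITHUB_STEP_SUMMARY` variable come from the description.

```python
import os
from pathlib import Path


def write_report(markdown: str, log_dir: Path) -> Path:
    """Write the report locally and, when in CI, append it to the step summary.

    Hypothetical helper for illustration; not the PR's actual function.
    """
    # Create the log directory if it is missing so the write cannot fail on it.
    log_dir.mkdir(parents=True, exist_ok=True)
    report = log_dir / "summary.md"
    report.write_text(markdown, encoding="utf-8")

    # GITHUB_STEP_SUMMARY is set by GitHub Actions and is absent in local runs.
    step_summary = os.environ.get("GITHUB_STEP_SUMMARY")
    if step_summary:
        with open(step_summary, "a", encoding="utf-8") as fh:
            fh.write(markdown)
    return report
```

Writing the local file unconditionally and treating the CI summary as an optional extra means a local run and a CI run produce the same `summary.md`.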

Type of change

  • Bug fix
  • New feature

Testing

  • 4 new unit tests in toolchain/mfc/test_fp_stability.py (log preservation keeps text artifacts / skips field data; local summary.md written without CI env; snippet marks the right line with correct context and degrades to empty for unresolvable sources). All 20 pass.
  • End-to-end verified with real Verrou runs on a coarsened 1D Shu-Osher case: fp-stability-logs/ now contains pre.log, per-run verrou.log/sim.out, .inp files, and summary.md with ranked cancellation sites and source snippets.
  • ./mfc.sh precheck passes (via pre-commit hook on each commit).

Checklist

  • I added or updated tests for new behavior
  • I updated documentation if user-facing behavior changed (module docstring reflects the new log/report layout)

fp-stability-logs/ was created but never written to: all logs went to a
tempdir deleted in _run_case's finally block, so the directory was empty
after every run, locally and in CI (the artifact upload shipped nothing).
The markdown report was also only emitted to GITHUB_STEP_SUMMARY, so
local runs got no report at all.

- preserve each case's text artifacts (*.log, *.out, *.inp,
  cancel_gen.txt) into fp-stability-logs/<case>/ before the work dir is
  removed, mirroring the run-dir layout; bulky .dat field data is skipped
- always write the report to fp-stability-logs/summary.md and print its
  path; still appended to GITHUB_STEP_SUMMARY when set
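
The log-preservation step described in this commit message could look roughly like the sketch below. It assumes a scratch `work_dir` and a destination such as fp-stability-logs/<case>/; the function name `preserve_text_artifacts` and the `TEXT_PATTERNS` constant are illustrative and not the PR's real identifiers. The file patterns are the ones listed above.

```python
import shutil
from fnmatch import fnmatch
from pathlib import Path

# Patterns listed in the commit message; anything else (e.g. *.dat) is skipped.
TEXT_PATTERNS = ("*.log", "*.out", "*.inp", "cancel_gen.txt")


def preserve_text_artifacts(work_dir: Path, dest_dir: Path) -> int:
    """Copy matching files from work_dir into dest_dir, keeping relative paths.

    Hypothetical helper for illustration; returns the number of files copied.
    """
    copied = 0
    for src in work_dir.rglob("*"):
        if not src.is_file():
            continue
        if not any(fnmatch(src.name, pat) for pat in TEXT_PATTERNS):
            continue  # bulky field data and other files are left behind
        dst = dest_dir / src.relative_to(work_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied += 1
    return copied
```

Using `relative_to` keeps the run-directory layout intact, so a per-run `sim.out` stays under its own subdirectory in the preserved copy.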
… console

The file:line sites were already collected and reported in summary.md and
GitHub annotations, but the console only printed counts. Print the top 5
cancellation sites (ranked by digits lost, fypp-expansion ambiguity marked)
and float-max overflow sites inline, eliding the rest to summary.md.
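
A minimal sketch of the ranking and elision described above, assuming each site carries a file, a line, a digits-lost estimate, and a flag for fypp expansion. The `Site` tuple and `format_top_sites` are hypothetical names; the output strings imitate the console lines shown in the PR description.

```python
from typing import List, NamedTuple


class Site(NamedTuple):
    """Illustrative record for one cancellation site (not the PR's data type)."""
    file: str
    line: int
    digits_lost: int
    fypp_expanded: bool


def format_top_sites(sites: List[Site], limit: int = 5) -> List[str]:
    """Return console lines for the worst sites, eliding the rest."""
    # Sort by digits lost (descending), then file and line for a stable order.
    ranked = sorted(sites, key=lambda s: (-s.digits_lost, s.file, s.line))
    lines = []
    for s in ranked[:limit]:
        suffix = " (fypp-expanded)" if s.fypp_expanded else ""
        lines.append(f">= {s.digits_lost} digits lost  {s.file}:{s.line}{suffix}")
    remaining = len(ranked) - limit
    if remaining > 0:
        lines.append(f"...and {remaining} more; see fp-stability-logs/summary.md")
    return lines
```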
For each cancellation and float-max site shown in the report, embed a
line-numbered excerpt of the source (marked line plus 3 lines of context)
as a nested fenced code block, so the offending code is readable without
opening the file. Unresolvable sources degrade to no snippet.
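
The excerpt logic described above might be sketched as follows. This is not the PR's `_source_snippet()` implementation, whose signature is not shown here; `source_snippet`, its parameters, and the `>` marker format are assumptions made for illustration.

```python
from pathlib import Path


def source_snippet(path: Path, line: int, context: int = 3) -> str:
    """Return a line-numbered excerpt around `line` (1-based), or '' if unavailable.

    Hypothetical helper for illustration only.
    """
    try:
        text = path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return ""  # unresolvable source: degrade to no snippet
    rows = text.splitlines()
    if not 1 <= line <= len(rows):
        return ""
    first = max(1, line - context)
    last = min(len(rows), line + context)
    width = len(str(last))  # pad line numbers to a common width
    out = []
    for n in range(first, last + 1):
        marker = ">" if n == line else " "
        out.append(f"{marker} {n:>{width}} | {rows[n - 1]}")
    return "\n".join(out)
```

Clamping `first` and `last` to the file bounds keeps the excerpt valid when the marked line is near the top or bottom of the file.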
Copilot AI review requested due to automatic review settings June 4, 2026 04:22

Copilot AI left a comment

Pull request overview

This PR fixes the FP-stability suite’s log/report handling so artifacts aren’t lost when the per-case scratch directory is removed, and enhances reporting to make error sources easier to locate (console file:line output + summary.md snippets).

Changes:

  • Preserve per-case text artifacts by copying them into fp-stability-logs/<case>/ before removing the scratch work_dir.
  • Always write a local markdown report to fp-stability-logs/summary.md (and still append to GITHUB_STEP_SUMMARY in CI).
  • Add inline source excerpts for cancellation / float-max sites and add unit tests covering the new behaviors.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| toolchain/mfc/fp_stability.py | Adds log preservation and improves console reporting for top-ranked cancellation/overflow sites. |
| toolchain/mfc/fp_stability_report.py | Writes summary.md locally and embeds source snippets under reported sites. |
| toolchain/mfc/fp_stability_metrics.py | Adds _source_snippet() helper to produce line-numbered excerpts with context. |
| toolchain/mfc/test_fp_stability.py | Adds unit tests for snippet extraction, local report writing, and log preservation behavior. |

The cells/cell-steps feasibility guard protects against accidentally
launching hours of Verrou runs, but offered no escape hatch for users
who knowingly want a larger grid or more time steps. --force runs the
case anyway, printing the expected cost multiple and a reminder to trim
passes (-N 1, --no-*). Guard errors now mention the flag.
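
One way such a guard with an override could be structured is sketched below. The budget value, the `GuardError` class, and `check_feasibility` are all hypothetical; the suite's real thresholds and messages are not shown in this PR text.

```python
class GuardError(Exception):
    """Raised when a case is too large to run without --force (hypothetical name)."""


def check_feasibility(cells: int, steps: int, force: bool = False,
                      max_cell_steps: int = 100_000) -> str:
    """Return a console message, or raise GuardError for oversized cases.

    max_cell_steps is an illustrative budget, not the suite's real threshold.
    """
    cell_steps = cells * steps
    if cell_steps <= max_cell_steps:
        return "ok"
    multiple = cell_steps / max_cell_steps
    if not force:
        # The error mentions the flag so users can find the escape hatch.
        raise GuardError(
            f"{cell_steps} cell-steps exceeds the budget of {max_cell_steps} "
            f"({multiple:.1f}x); pass --force to run anyway"
        )
    return (f"--force: running at ~{multiple:.1f}x the expected cost; "
            "consider trimming passes (-N 1, --no-*)")
```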
…back)

- _preserve_logs runs in _run_case's finally block; an OSError there
  (disk full, dest collision) would replace the case's real outcome and,
  being uncatchable by the suite's MFCException handler, abort the whole
  run. Make it best-effort with a printed warning, matching the adjacent
  ignore_errors rmtree.
- _emit_github_summary creates log_dir if missing instead of raising
  FileNotFoundError at the very end of a run.
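
The best-effort pattern from the first bullet can be sketched as below: a failure while preserving logs inside `finally` is caught and reported, so it cannot replace the case's real result or exception. `run_case`, `body`, and `preserve` are illustrative stand-ins, not the PR's `_run_case` or `_preserve_logs`.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_case(body: Callable[[Path], str], preserve: Callable[[Path], None]) -> str:
    """Run `body` in a scratch directory; preserve logs without masking the outcome."""
    work_dir = Path(tempfile.mkdtemp(prefix="fp-case-"))
    try:
        return body(work_dir)
    finally:
        try:
            preserve(work_dir)
        except OSError as exc:
            # Best effort: warn and continue so the case's real result survives.
            print(f"warning: could not preserve logs: {exc}")
        # Cleanup is also best effort, matching the ignore_errors rmtree.
        shutil.rmtree(work_dir, ignore_errors=True)
```

Without the inner `try`, an `OSError` raised in `finally` would propagate in place of whatever `body` returned or raised.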
@sbryngelson sbryngelson merged commit 68dbb6b into MFlowCode:master Jun 4, 2026
84 checks passed
@sbryngelson sbryngelson deleted the fix-fp-stability-logs branch June 4, 2026 06:31

codecov Bot commented Jun 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.80%. Comparing base (2d0d2fd) to head (451ab76).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1537      +/-   ##
==========================================
+ Coverage   60.63%   60.80%   +0.16%     
==========================================
  Files          73       73              
  Lines       20219    20199      -20     
  Branches     2937     2932       -5     
==========================================
+ Hits        12259    12281      +22     
+ Misses       5972     5932      -40     
+ Partials     1988     1986       -2     
