
Commit 878fddb

Spencer Bryngelson and claude committed

Remove CCE VLA guard from m_chemistry.fpp; slim CI to Frontier (CCE) only
1,500+ stress-test rounds on CCE 19.0.0 showed zero ICEs with plain dimension(num_species) local arrays in m_chemistry.fpp. Remove all #:if USING_CCE fixed-size array guards, the CCE_MAX_SPECIES Fypp constant, the @:PROHIBIT runtime checks, and the matching Python-side species-count validation in input.py.

Simplify the compound AMD guard from (not MFC_CASE_OPTIMIZATION and USING_AMD) or USING_CCE to not MFC_CASE_OPTIMIZATION and USING_AMD with a literal dimension(10); the AMD workaround is preserved.

The -Oipa0 per-file CMake flags for m_bubbles_EL and m_phase_change are kept: 20/20 positive-control rounds and GitHub CI history confirm those ICEs still reproduce without the flags.

Temporarily remove non-CCE jobs from CI (GitHub runners, Phoenix, Frontier-AMD) to focus test bandwidth on the CCE fix branch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
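The guard simplification above can be sanity-checked with a small truth table. This is an illustrative sketch, not project code: the Fypp symbols MFC_CASE_OPTIMIZATION, USING_AMD, and USING_CCE are modeled here as plain Python booleans.

```python
from itertools import product

# Fypp-time conditions from m_chemistry.fpp, modeled as booleans.
# old: fixed-size arrays for (AMD without case optimization) OR any CCE build
# new: fixed-size arrays only for AMD without case optimization
def old_guard(case_opt, amd, cce):
    return (not case_opt and amd) or cce

def new_guard(case_opt, amd, cce):
    return not case_opt and amd

# The two guards differ only when USING_CCE is set -- exactly the arm
# this commit removes. Every AMD case behaves the same before and after.
diffs = [(c, a, cce)
         for c, a, cce in product([False, True], repeat=3)
         if old_guard(c, a, cce) != new_guard(c, a, cce)]
assert all(cce for _, _, cce in diffs)
```

All disagreeing rows have USING_CCE true, confirming the AMD workaround is untouched by the rewrite.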
1 parent 4e6482d commit 878fddb

5 files changed: 18 additions, 220 deletions

.github/workflows/bench.yml

Lines changed: 1 addition & 24 deletions
@@ -37,22 +37,7 @@ jobs:
       fail-fast: false
       matrix:
         include:
-          - cluster: phoenix
-            name: Georgia Tech | Phoenix (NVHPC)
-            group: phoenix
-            labels: gt
-            flag: p
-            device: gpu
-            interface: acc
-            build_script: ""
-          - cluster: phoenix
-            name: Georgia Tech | Phoenix (NVHPC)
-            group: phoenix
-            labels: gt
-            flag: p
-            device: gpu
-            interface: omp
-            build_script: ""
+          # Frontier (ORNL) — CCE only
           - cluster: frontier
             name: Oak Ridge | Frontier (CCE)
             group: phoenix
@@ -69,14 +54,6 @@ jobs:
             device: gpu
             interface: omp
             build_script: "bash .github/workflows/frontier/build.sh gpu omp bench"
-          - cluster: frontier_amd
-            name: Oak Ridge | Frontier (AMD)
-            group: phoenix
-            labels: frontier
-            flag: famd
-            device: gpu
-            interface: omp
-            build_script: "bash .github/workflows/frontier_amd/build.sh gpu omp bench"
     runs-on:
       group: ${{ matrix.group }}
       labels: ${{ matrix.labels }}

.github/workflows/test.yml

Lines changed: 2 additions & 143 deletions
@@ -68,99 +68,6 @@ jobs:
         with:
           filters: ".github/file-filter.yml"
 
-  github:
-    name: Github
-    if: needs.file-changes.outputs.checkall == 'true'
-    needs: [lint-gate, file-changes]
-    strategy:
-      matrix:
-        os: ['ubuntu', 'macos']
-        mpi: ['mpi']
-        precision: ['']
-        debug: ['debug', 'no-debug']
-        intel: [true, false]
-        exclude:
-          - os: macos
-            intel: true
-
-        include:
-          - os: ubuntu
-            mpi: no-mpi
-            precision: single
-            debug: no-debug
-            intel: false
-
-      fail-fast: false
-    continue-on-error: true
-    runs-on: ${{ matrix.os }}-latest
-
-    steps:
-      - name: Clone
-        uses: actions/checkout@v4
-
-      - name: Setup MacOS
-        if: matrix.os == 'macos'
-        run: |
-          brew update
-          brew upgrade
-          brew install coreutils python fftw hdf5 gcc@15 boost open-mpi lapack
-          echo "FC=gfortran-15" >> $GITHUB_ENV
-          echo "BOOST_INCLUDE=/opt/homebrew/include/" >> $GITHUB_ENV
-
-      - name: Setup Ubuntu
-        if: matrix.os == 'ubuntu' && matrix.intel == false
-        run: |
-          sudo apt update -y
-          sudo apt install -y cmake gcc g++ python3 python3-dev hdf5-tools \
-            libfftw3-dev libhdf5-dev openmpi-bin libopenmpi-dev \
-            libblas-dev liblapack-dev
-
-      - name: Setup Ubuntu (Intel)
-        if: matrix.os == 'ubuntu' && matrix.intel == true
-        run: |
-          wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
-          sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
-          sudo add-apt-repository "deb https://apt.repos.intel.com/oneapi all main"
-          sudo apt-get update
-          sudo apt-get install -y intel-oneapi-compiler-fortran intel-oneapi-mpi intel-oneapi-mpi-devel
-          # Export only new/changed env vars from setvars.sh.
-          # `printenv >> $GITHUB_ENV` dumps all vars including shell internals
-          # with special characters that corrupt GITHUB_ENV parsing.
-          printenv | sort > /tmp/env_before
-          source /opt/intel/oneapi/setvars.sh
-          printenv | sort > /tmp/env_after
-          diff /tmp/env_before /tmp/env_after | grep '^>' | sed 's/^> //' >> $GITHUB_ENV
-
-      - name: Get system info for cache key
-        id: sys-info
-        run: |
-          {
-            uname -m
-            cat /proc/cpuinfo 2>/dev/null | grep 'model name' | head -1 || sysctl -n machdep.cpu.brand_string 2>/dev/null || true
-            if command -v ifx &>/dev/null; then ifx --version 2>/dev/null | head -1; else ${FC:-gfortran} --version 2>/dev/null | head -1 || true; fi
-            ${CC:-gcc} --version 2>/dev/null | head -1 || true
-          } | (sha256sum 2>/dev/null || shasum -a 256) | cut -c1-16 > /tmp/sys-hash
-          echo "sys-hash=$(cat /tmp/sys-hash)" >> "$GITHUB_OUTPUT"
-
-      - name: Restore Build Cache
-        uses: actions/cache@v4
-        with:
-          path: build
-          key: mfc-build-${{ matrix.os }}-${{ matrix.mpi }}-${{ matrix.debug }}-${{ matrix.precision }}-${{ matrix.intel }}-${{ steps.sys-info.outputs.sys-hash }}-${{ hashFiles('CMakeLists.txt', 'toolchain/dependencies/**', 'toolchain/cmake/**', 'src/**/*.fpp', 'src/**/*.f90') }}
-
-      - name: Build
-        run: |
-          /bin/bash mfc.sh test -v --dry-run -j "$(nproc)" --${{ matrix.debug }} --${{ matrix.mpi }} $PRECISION $TEST_ALL
-        env:
-          TEST_ALL: ${{ matrix.mpi == 'mpi' && '--test-all' || '' }}
-          PRECISION: ${{ matrix.precision != '' && format('--{0}', matrix.precision) || '' }}
-
-      - name: Test
-        run: bash .github/scripts/run-tests-with-retry.sh -v --max-attempts 3 -j "$(nproc)" $TEST_ALL $TEST_PCT
-        env:
-          TEST_ALL: ${{ matrix.mpi == 'mpi' && '--test-all' || '' }}
-          TEST_PCT: ${{ matrix.debug == 'debug' && '-% 20' || '' }}
-
   self:
     name: "${{ matrix.cluster_name }} (${{ matrix.device }}${{ matrix.interface != 'none' && format('-{0}', matrix.interface) || '' }}${{ matrix.shard != '' && format(' [{0}]', matrix.shard) || '' }})"
     if: github.repository == 'MFlowCode/MFC' && needs.file-changes.outputs.checkall == 'true' && github.event.pull_request.draft != true
@@ -170,23 +77,7 @@ jobs:
     strategy:
       matrix:
         include:
-          # Phoenix (GT) — build+test combined in SLURM job
-          - runner: 'gt'
-            cluster: 'phoenix'
-            cluster_name: 'Georgia Tech | Phoenix'
-            device: 'gpu'
-            interface: 'acc'
-          - runner: 'gt'
-            cluster: 'phoenix'
-            cluster_name: 'Georgia Tech | Phoenix'
-            device: 'gpu'
-            interface: 'omp'
-          - runner: 'gt'
-            cluster: 'phoenix'
-            cluster_name: 'Georgia Tech | Phoenix'
-            device: 'cpu'
-            interface: 'none'
-          # Frontier (ORNL) — build on login node, GPU tests sharded for batch partition
+          # Frontier (ORNL) — CCE only
           - runner: 'frontier'
             cluster: 'frontier'
             cluster_name: 'Oak Ridge | Frontier'
@@ -216,24 +107,6 @@ jobs:
             cluster_name: 'Oak Ridge | Frontier'
             device: 'cpu'
             interface: 'none'
-          # Frontier AMD — build on login node, GPU tests sharded for batch partition
-          - runner: 'frontier'
-            cluster: 'frontier_amd'
-            cluster_name: 'Oak Ridge | Frontier (AMD)'
-            device: 'gpu'
-            interface: 'omp'
-            shard: '1/2'
-          - runner: 'frontier'
-            cluster: 'frontier_amd'
-            cluster_name: 'Oak Ridge | Frontier (AMD)'
-            device: 'gpu'
-            interface: 'omp'
-            shard: '2/2'
-          - runner: 'frontier'
-            cluster: 'frontier_amd'
-            cluster_name: 'Oak Ridge | Frontier (AMD)'
-            device: 'cpu'
-            interface: 'none'
     runs-on:
       group: phoenix
       labels: ${{ matrix.runner }}
@@ -289,16 +162,7 @@ jobs:
     strategy:
       matrix:
         include:
-          - runner: 'gt'
-            cluster: 'phoenix'
-            cluster_name: 'Georgia Tech | Phoenix'
-            device: 'gpu'
-            interface: 'acc'
-          - runner: 'gt'
-            cluster: 'phoenix'
-            cluster_name: 'Georgia Tech | Phoenix'
-            device: 'gpu'
-            interface: 'omp'
+          # Frontier (ORNL) — CCE only
           - runner: 'frontier'
             cluster: 'frontier'
             cluster_name: 'Oak Ridge | Frontier'
@@ -309,11 +173,6 @@ jobs:
             cluster_name: 'Oak Ridge | Frontier'
             device: 'gpu'
             interface: 'omp'
-          - runner: 'frontier'
-            cluster: 'frontier_amd'
-            cluster_name: 'Oak Ridge | Frontier (AMD)'
-            device: 'gpu'
-            interface: 'omp'
     runs-on:
       group: phoenix
       labels: ${{ matrix.runner }}

.gitignore

Lines changed: 5 additions & 1 deletion
@@ -105,4 +105,8 @@ benchmarks/*.png
 *.avi
 
 **isolation_rules/
-**.supercode/
+**.supercode/
+# CCE stress-test log directories (local testing artifacts)
+cce_*/
+cce_*.log
+run_cce_*.sh

src/common/m_chemistry.fpp

Lines changed: 10 additions & 38 deletions
@@ -6,10 +6,6 @@
 #:include 'macros.fpp'
 #:include 'case.fpp'
 
-#! CCE 19.0.0 workaround: fixed-size array limit for local species arrays under _CRAYFTN.
-#! Must match the Python-side check in toolchain/mfc/run/input.py. See PR #1286.
-#:set CCE_MAX_SPECIES = 10
-
 !> @brief Multi-species chemistry interface for thermodynamic properties, reaction rates, and transport coefficients
 module m_chemistry
 
@@ -67,15 +63,7 @@ contains
 
         integer :: x, y, z, eqn
         real(wp) :: energy, T_in
-        #:if USING_CCE
-            real(wp), dimension(${CCE_MAX_SPECIES}$) :: Ys
-        #:else
-            real(wp), dimension(num_species) :: Ys
-        #:endif
-
-        #:if USING_CCE
-            @:PROHIBIT(num_species > ${CCE_MAX_SPECIES}$, "CCE 19.0.0 workaround: num_species must be <= ${CCE_MAX_SPECIES}$ (fixed-size arrays in m_chemistry.fpp)")
-        #:endif
+        real(wp), dimension(num_species) :: Ys
 
         do z = bounds(3)%beg, bounds(3)%end
             do y = bounds(2)%beg, bounds(2)%end
@@ -113,17 +101,9 @@ contains
         type(int_bounds_info), dimension(1:3), intent(in) :: bounds
 
         integer :: x, y, z, i
-        #:if USING_CCE
-            real(wp), dimension(${CCE_MAX_SPECIES}$) :: Ys
-        #:else
-            real(wp), dimension(num_species) :: Ys
-        #:endif
+        real(wp), dimension(num_species) :: Ys
         real(wp) :: mix_mol_weight
 
-        #:if USING_CCE
-            @:PROHIBIT(num_species > ${CCE_MAX_SPECIES}$, "CCE 19.0.0 workaround: num_species must be <= ${CCE_MAX_SPECIES}$ (fixed-size arrays in m_chemistry.fpp)")
-        #:endif
-
         do z = bounds(3)%beg, bounds(3)%end
             do y = bounds(2)%beg, bounds(2)%end
                 do x = bounds(1)%beg, bounds(1)%end
@@ -151,18 +131,14 @@ contains
         integer :: eqn
         real(wp) :: T
         real(wp) :: rho, omega_m
-        #:if (not MFC_CASE_OPTIMIZATION and USING_AMD) or USING_CCE
-            real(wp), dimension(${CCE_MAX_SPECIES}$) :: Ys
-            real(wp), dimension(${CCE_MAX_SPECIES}$) :: omega
+        #:if not MFC_CASE_OPTIMIZATION and USING_AMD
+            real(wp), dimension(10) :: Ys
+            real(wp), dimension(10) :: omega
         #:else
             real(wp), dimension(num_species) :: Ys
             real(wp), dimension(num_species) :: omega
         #:endif
 
-        #:if USING_CCE
-            @:PROHIBIT(num_species > ${CCE_MAX_SPECIES}$, "CCE 19.0.0 workaround: num_species must be <= ${CCE_MAX_SPECIES}$ (fixed-size arrays in m_chemistry.fpp)")
-        #:endif
-
         $:GPU_PARALLEL_LOOP(collapse=3, private='[Ys, omega, eqn, T, rho, omega_m]', copyin='[bounds]')
         do z = bounds(3)%beg, bounds(3)%end
             do y = bounds(2)%beg, bounds(2)%end
@@ -204,11 +180,11 @@ contains
         type(int_bounds_info), intent(in) :: irx, iry, irz
 
         integer, intent(in) :: idir
-        #:if (not MFC_CASE_OPTIMIZATION and USING_AMD) or USING_CCE
-            real(wp), dimension(${CCE_MAX_SPECIES}$) :: Xs_L, Xs_R, Xs_cell, Ys_L, Ys_R, Ys_cell
-            real(wp), dimension(${CCE_MAX_SPECIES}$) :: mass_diffusivities_mixavg1, mass_diffusivities_mixavg2
-            real(wp), dimension(${CCE_MAX_SPECIES}$) :: mass_diffusivities_mixavg_Cell, dXk_dxi, h_l, h_r, h_k
-            real(wp), dimension(${CCE_MAX_SPECIES}$) :: Mass_Diffu_Flux, dYk_dxi
+        #:if not MFC_CASE_OPTIMIZATION and USING_AMD
+            real(wp), dimension(10) :: Xs_L, Xs_R, Xs_cell, Ys_L, Ys_R, Ys_cell
+            real(wp), dimension(10) :: mass_diffusivities_mixavg1, mass_diffusivities_mixavg2
+            real(wp), dimension(10) :: mass_diffusivities_mixavg_Cell, dXk_dxi, h_l, h_r, h_k
+            real(wp), dimension(10) :: Mass_Diffu_Flux, dYk_dxi
         #:else
             real(wp), dimension(num_species) :: Xs_L, Xs_R, Xs_cell, Ys_L, Ys_R, Ys_cell
             real(wp), dimension(num_species) :: mass_diffusivities_mixavg1, mass_diffusivities_mixavg2
@@ -226,10 +202,6 @@ contains
         integer :: x, y, z, i, n, eqn
         integer, dimension(3) :: offsets
 
-        #:if USING_CCE
-            @:PROHIBIT(num_species > ${CCE_MAX_SPECIES}$, "CCE 19.0.0 workaround: num_species must be <= ${CCE_MAX_SPECIES}$ (fixed-size arrays in m_chemistry.fpp)")
-        #:endif
-
         isc1 = irx; isc2 = iry; isc3 = irz
 
         $:GPU_UPDATE(device='[isc1,isc2,isc3]')

toolchain/mfc/run/input.py

Lines changed: 0 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -92,20 +92,6 @@ def generate_fpp(self, target) -> None:
9292
# Write the generated Fortran code to the m_thermochem.f90 file with the chosen precision
9393
sol = self.get_cantera_solution()
9494

95-
# CCE 19.0.0 workaround: m_chemistry.fpp uses dimension(CCE_MAX_SPECIES) for local
96-
# species arrays on Cray builds to avoid an InstCombine ICE. Must match the Fypp
97-
# constant CCE_MAX_SPECIES in src/common/m_chemistry.fpp.
98-
CCE_MAX_SPECIES = 10
99-
if sol.n_species > CCE_MAX_SPECIES:
100-
msg = (f"Cantera mechanism has {sol.n_species} species > {CCE_MAX_SPECIES}. "
101-
f"Cray Fortran (CCE) builds use a hardcoded dimension({CCE_MAX_SPECIES}) "
102-
"workaround in m_chemistry.fpp and will abort at runtime on CCE. See PR #1286.")
103-
if directive_str is not None:
104-
# GPU builds: hard error — the Fortran PROHIBIT will abort anyway,
105-
# so fail early at input generation rather than at the first chemistry call.
106-
raise common.MFCException(msg)
107-
cons.print(f"[bold yellow]Warning:[/bold yellow] {msg}")
108-
10995
thermochem_code = pyro.FortranCodeGenerator().generate(
11096
"m_thermochem",
11197
sol,
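For reference, the removed input.py block followed a fail-fast pattern: hard error on directive (GPU) builds, non-fatal warning otherwise. A minimal self-contained sketch of that branching; the function name and return convention are illustrative, not MFC's API, though the exception name and 10-species cap come from the diff above.

```python
CCE_MAX_SPECIES = 10  # cap from the removed workaround

class MFCException(Exception):
    """Stand-in for common.MFCException."""

def check_species_count(n_species, gpu_build):
    """Fail early on GPU builds; warn (return a message) on CPU builds."""
    if n_species <= CCE_MAX_SPECIES:
        return None  # within the cap, nothing to report
    msg = f"Cantera mechanism has {n_species} species > {CCE_MAX_SPECIES}."
    if gpu_build:
        # GPU builds: error at input generation rather than aborting
        # at the first chemistry call.
        raise MFCException(msg)
    return f"Warning: {msg}"

assert check_species_count(9, gpu_build=True) is None
assert check_species_count(53, gpu_build=False).startswith("Warning")
```

With the guards gone, no such cap applies; mechanisms of any species count compile with dimension(num_species) arrays on CCE.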
