
IB Memory Reduction #1307

Merged
sbryngelson merged 7 commits into MFlowCode:master from danieljvickers:ib-memory-reduction
Mar 16, 2026

Conversation

@danieljvickers (Member) commented Mar 13, 2026

Description

Before scaling up, I went looking for any memory we have been loose with and cleaned it up.

This removes the inner point array, which was just a waste of memory. It also lets us reduce the number of ghost points kept in memory. I also added a guard that prevents arbitrarily large memory allocation in non-moving cases.
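The new allocation path can be sketched roughly as follows (pieced together from the diff hunks quoted in the reviews; names follow m_ibm.fpp, and exact code may differ):

```fortran
if (moving_immersed_boundary_flag) then
    ! Moving IBs: ghost points are recounted as the boundary moves, so
    ! size the buffer from the global count with 2x headroom, capped at
    ! the local domain cell count.
    call s_mpi_allreduce_integer_sum(num_gps, max_num_gps)
    max_num_gps = min(max_num_gps*2, (m + 1)*(n + 1)*(p + 1))
else
    ! Static IBs: the count never changes, so allocate exactly.
    max_num_gps = num_gps
end if
@:ALLOCATE(ghost_points(1:max_num_gps))
```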

Type of change

  • Bug fix

Testing

How did you test your changes?

I reran all mibm examples to ensure we did not run out of memory.

GPU changes (expand if you modified src/simulation/)
  • GPU results match CPU results
  • Tested on NVIDIA GPU or AMD GPU

AI code reviews

Reviews are not triggered automatically. To request a review, comment on the PR:

  • @coderabbitai review — incremental review (new changes only)
  • @coderabbitai full review — full review from scratch
  • /review — Qodo review
  • /improve — Qodo code suggestions

@github-actions

Claude Code Review

Head SHA: 23a49931ad4070320a7013bc4def2a2134ef46d8
Files changed: 1 — src/simulation/m_ibm.fpp
+20 / -62


Summary

  • Removes the inner_points array and num_inner_gps counter, reducing GPU/CPU memory usage for IBM cases.
  • Removes the inner-points momentum-correction loop from s_ibm_correct_state.
  • Changes ghost_points allocation from a generous over-allocated buffer to an exact-count allocation.
  • Adds a moving_immersed_boundary_flag guard around the MPI allreduce + cap, skipping it for static IB cases.
  • Updates s_find_num_ghost_points and s_find_ghost_points signatures to drop the inner-points output/argument.

Findings

1. Correctness — removed inner-points momentum-correction loop (high concern)
src/simulation/m_ibm.fpp, s_ibm_correct_state (around old line 401–418):
The loop that zeroed momentum components (q_cons_vf(q)%sf(j,k,l) = 0._wp for q = momxb, momxe) for all cells classified as inner points (i.e., cells fully inside an immersed boundary) has been removed entirely. The PR description does not explain why this correction is no longer needed. If any fluid solver path writes nonzero momentum into those cells, the immersed boundary no longer resets them, which could affect mass/momentum conservation near solid boundaries.

Please confirm: is this loop genuinely redundant (e.g., because inner points are never touched by the Riemann solver), or does removing it change physics results? The testing note says "reran all mibm examples to ensure we did not run out of memory" — it would be valuable to also confirm that pressure/velocity fields inside IBs are unchanged.


2. Dead code — count_i declared but now unused in s_find_ghost_points
src/simulation/m_ibm.fpp, s_find_ghost_points:

The only code that used count_i was the removed inner-points else branch. count_i is now declared but never referenced. Strict compilers (Intel ifx, Cray ftn with -warn unused) may emit a warning; some configurations treat this as an error. Remove the declaration.


3. Exact allocation without buffer for non-moving case (low-risk but worth noting)
src/simulation/m_ibm.fpp, s_ibm_setup (around new line 129):

For moving_immersed_boundary_flag = .false., num_gps is now the raw local GPU-counted value with no allreduce and no safety factor. The old code allocated (max_num_gps + max_num_inner_gps) * 2, which was admittedly very generous. The tightened allocation is safe if and only if the two GPU parallel loops in s_find_num_ghost_points and s_find_ghost_points see an identical grid state. If ib_markers is updated between the two calls (e.g., by s_find_ghost_points itself writing ib_markers%sf(i,j,k) = patch_id), a second call path through s_update_ib_markers calling s_find_num_ghost_points + s_find_ghost_points would re-count on already-modified markers. Confirm this is safe in the moving-boundary update path.


Minor

  • The PR description checkboxes ("Bug fix", "GPU results match CPU results", "Tested on NVIDIA GPU or AMD GPU") are all unchecked. For a simulation-only GPU module change, please fill in the GPU testing checklist.

@github-actions

Claude Code Review

Head SHA: 4150b12

Files changed: 1

  • src/simulation/m_ibm.fpp (+20 / -62)

Summary

  • Removes the inner_points array and num_inner_gps counter, saving memory for IB simulations.
  • Adds a guard: MPI allreduce and 2× buffer allocation only happen when moving_immersed_boundary_flag is true; non-moving IBs allocate exactly num_gps elements.
  • Removes the inner-point momentum-zeroing loop in s_ibm_correct_state.
  • Simplifies s_find_num_ghost_points and s_find_ghost_points to operate only on ghost points.

Findings

1. Medium — ib_markers%sf not decoded for inner IB cells (potential downstream correctness issue)

File: src/simulation/m_ibm.fpp, s_find_ghost_points (removed else branch around line 630)

Previously, for inner IB cells the code decoded the encoded patch ID and wrote back the plain patch_id to ib_markers%sf(i,j,k). This step is now removed. Any code that later reads ib_markers%sf for a deeply-interior IB cell will now see the encoded value (with periodicity bits set) rather than the clean patch ID.

If interior cells are never read after this point (e.g., they are completely masked), this is harmless. But it warrants confirming that no downstream consumer of ib_markers%sf can reach interior cells and misinterpret the encoded value.


2. Medium — Removal of inner-point momentum zeroing (physics correctness)

File: src/simulation/m_ibm.fpp, s_ibm_correct_state (removed ~lines 398–419)

The deleted loop set q_cons_vf(q)%sf(j,k,l) = 0 for all momentum components of every inner IB cell. Removing this means deeply-interior cells will accumulate whatever state the solver writes there. For a stationary IB this is generally benign (they never influence the exterior solution), but for moving IBs where a cell transitions from interior to ghost/fluid, stale momentum in those cells could contaminate the solution momentarily.

Since you tested mibm examples this may be validated, but consider adding a note in a comment or PR body about why removing this is safe for both static and moving IB cases.


3. Low — Non-moving IB allocation uses exact count with no buffer

File: src/simulation/m_ibm.fpp, s_ibm_setup (~line 130)

For non-moving IBs, num_gps is the local rank's exact count with no additional buffer. s_find_ghost_points then fills up to exactly that count, so there is no out-of-bounds risk for the static case. This is intentional and safe, just worth a brief comment explaining the design decision (static = exact, moving = 2× global sum).


4. Low — Verify s_finalize_ibm_module was updated

The diff does not show s_finalize_ibm_module being touched. Confirm that any prior @:DEALLOCATE(inner_points(...)) in that routine was also removed. If it was not present, this means inner_points was a pre-existing memory leak now resolved by elimination.


Improvement Opportunity

The comment at the top of the new moving-IB block could clarify why a 2× factor is used; this would help future maintainers understand the intent without reading git history.
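For instance, the comment might read something like this (the wording is a guess at the reviewer's intent, not actual source):

```fortran
! Moving IBs: the boundary can sweep new cells between recounts, so
! allocate 2x the current global ghost-point count (capped at the local
! domain size) as headroom instead of resizing every step.
```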


Overall this is a clean, well-scoped reduction. The main item to double-check is the ib_markers%sf encoding for interior cells and whether any downstream reader can hit those cells.

@github-actions

Claude Code Review

Head SHA: 48a588009f9aaadae65ce8f321241e787a290ed4

Files changed: 1

  • src/simulation/m_ibm.fpp (+22 / -62)

Summary

  • Removes inner_points array and num_inner_gps counter, reducing GPU/CPU memory for IB simulations.
  • Adds a moving_immersed_boundary_flag guard: static IBs allocate exactly num_gps locally; moving IBs allreduce-sum and apply a 2× cap.
  • Removes inner-point momentum-zeroing loop from s_ibm_correct_state.
  • Simplifies s_find_num_ghost_points and s_find_ghost_points signatures to drop inner-point arguments.

Findings

1. Medium — ib_markers%sf encoding not cleaned for interior cells

File: src/simulation/m_ibm.fpp, s_find_ghost_points (removed else branch ~line 630)

The removed else block decoded the encoded patch ID and wrote the clean patch_id back to ib_markers%sf(i,j,k) for deeply-interior cells. That writeback is now gone. Any downstream code reading ib_markers%sf for an interior cell will see the encoded value (with periodicity bits) instead of the plain patch ID.

If interior cells are fully masked in all downstream readers this is benign, but please confirm no consumer of ib_markers%sf can reach interior cells and misinterpret the encoded value.


2. Medium — Removal of inner-point momentum zeroing (physics change)

File: src/simulation/m_ibm.fpp, s_ibm_correct_state (removed ~lines 401–418)

The deleted loop set q_cons_vf(q)%sf(j,k,l) = 0 for momentum of every fully-interior IB cell. For a static IB these cells are inert, but for a moving IB a cell that transitions from interior to ghost/fluid could carry stale nonzero momentum until the next correction cycle.

Testing "reran all mibm examples" covers memory correctness; it would also be valuable to confirm pressure/velocity inside IBs are unchanged between the old and new code, particularly for moving-boundary cases.


3. Low — Possible unused count_i variable in s_find_ghost_points

File: src/simulation/m_ibm.fpp, s_find_ghost_points

The diff removes all uses of count_i (it was only used in the removed inner-points else branch), but the declaration line (integer :: count, count_i, local_idx) is not shown being removed. If count_i is still declared, Intel ifx and Cray ftn may emit an unused-variable warning — or a build error under strict flags. Please verify the declaration is removed.


4. Low — Verify s_finalize_ibm_module was updated

The diff does not include any change to the finalization subroutine. Please confirm that @:DEALLOCATE(inner_points(...)) (or equivalent GPU deallocation) is also removed there. If no deallocation existed before, this was a pre-existing memory/GPU-data leak now resolved by eliminating the array.


Minor

  • PR description checkboxes ("Bug fix", "GPU results match CPU results", "Tested on NVIDIA GPU") are all unchecked. For a src/simulation/ change touching GPU memory layout, filling in the GPU testing checklist would strengthen confidence.
  • A short comment explaining why the 2× factor is used for the moving-IB allocation (headroom for boundary motion between setup and first timestep update) would help future readers.

@github-actions

Claude Code Review

Head SHA: 8e0afea76d36a88d17ffd7f61b1933c38e64b50b

Files changed: 1

  • src/simulation/m_ibm.fpp (+22 / -62)

Summary

  • Removes the inner_points array and associated num_inner_gps counter, reducing GPU memory usage for immersed boundary (IBM) simulations.
  • Adds a guard so that the expensive MPI allreduce + 2× buffer allocation only occurs for moving-IB cases; static IB cases now allocate exactly num_gps entries.
  • Removes the inner-point momentum-zeroing pass in s_ibm_correct_state.
  • Updates s_find_num_ghost_points and s_find_ghost_points signatures to remove inner-point arguments.

Findings

1. Unused variable count_i — likely compiler warning / error (m_ibm.fpp ~L583)

count_i is still declared in s_find_ghost_points but all code that used it was removed:

integer :: count, count_i, local_idx   ! count_i now unused

Strict compilers (nvfortran, Intel ifx) may warn or error on unused variables. Recommend removing count_i from the declaration.

2. Physics correctness — inner-point momentum zeroing removed (m_ibm.fpp ~L400)

The deleted block in s_ibm_correct_state was zeroing momentum inside solid bodies:

! Removed:
do q = momxb, momxe
    q_cons_vf(q)%sf(j, k, l) = 0._wp
end do

If inner cells are ever accessed by the stencil (e.g., wide WENO reconstruction near sharp IB edges), non-zero spurious velocities inside the solid can corrupt results. The PR description says all mibm examples were re-run successfully, which is good. However, it is worth confirming whether these inner cells are truly read-isolated from the fluid stencil or whether the ghost-point correction already ensures they remain inert.

3. Potential out-of-bounds on moving-IB re-entry (m_ibm.fpp ~L932)

In s_update_mib, for each time step of a moving IB simulation:

call s_find_num_ghost_points(num_gps)          ! recomputes num_gps
call s_find_ghost_points(ghost_points)         ! uses dimension(num_gps)

The array ghost_points was allocated once in s_ibm_setup as 1:max_num_gps (= allreduce-sum × 2). If the IB moves such that a rank's local num_gps exceeds its share of max_num_gps (which was sized on the global sum, not per-rank max), writes in s_find_ghost_points could overflow the allocation. The previous code had the same structure with the same risk, but it is worth documenting the assumption that num_gps × 2 (global sum, used per-rank) is always a safe upper bound per rank.

4. @:DEALLOCATE for inner_points — confirm removal

The PR removes @:ALLOCATE(inner_points(...)) and $:GPU_ENTER_DATA(copyin='[...,inner_points]'). Per project rules, every @:ALLOCATE must have a matching @:DEALLOCATE. Please confirm that any corresponding @:DEALLOCATE(inner_points(...)) in the finalization subroutine was also removed (it is not visible in the diff, so it may have been outside the diffed context or already absent).


Minor

  • PR type-of-change checkboxes were not filled in.
  • For the non-moving path, a brief inline comment explaining why the allreduce is skipped (! static IB: no inter-rank redistribution needed) would improve readability.

@github-actions

Claude Code Review

Head SHA: 58ac6bc68229fcb3b537f2d2efa8329ae88b1dec
Files changed: 1 — src/simulation/m_ibm.fpp
+31 / -73


Summary

  • Removes the inner_points array and num_inner_gps counter, saving two full ghost-point-sized GPU allocations.
  • Conditionalizes MPI allreduce and buffer sizing on moving_immersed_boundary_flag so static IBM cases no longer over-allocate.
  • Removes the inner-point momentum-zeroing loop in s_ibm_correct_state.
  • Removes the moving_ibm > 0 guard in the velocity-assignment loop, making velocity always set on IB cells.
  • Signature changes propagated consistently to s_find_num_ghost_points and s_find_ghost_points.

Findings

[HIGH — Physics behavior change] Inner-point momentum zeroing removed without explicit justification
src/simulation/m_ibm.fpp (around old line ~401, removed block)

The removed block:

if (num_inner_gps > 0) then
    do q = momxb, momxe
        q_cons_vf(q)%sf(j, k, l) = 0._wp
    end do
end if

zeroed momentum for interior IB points (cells fully inside an immersed boundary). These points still exist in the domain; they're just no longer tracked separately. Without the zeroing step, these cells will carry whatever conservative state the solver computes, which may be non-physical. The PR description says the inner-point array is "a waste of memory" but doesn't explain how the physics is preserved after removal. Please clarify: are interior-IB cells' states now handled elsewhere, or is zeroing them no longer necessary for correctness?


[MEDIUM — Behavioral change] moving_ibm > 0 guard removed from velocity assignment
src/simulation/m_ibm.fpp (around new line ~199)

Old code only set patch_ib(patch_id)%vel(i) on momentum when the patch was marked as moving. New code always sets it. This is safe only if vel is guaranteed zero for static IB patches. If vel can be non-zero for a patch where moving_ibm == 0 (e.g., a user specifies initial velocity but moving_ibm is off), the result changes silently. Worth confirming this invariant is enforced in case validation or default initialization.


[MEDIUM — Allocation sizing] Moving IBM buffer may be undersized in some MPI configurations
src/simulation/m_ibm.fpp (around new line ~128)

call s_mpi_allreduce_integer_sum(num_gps, max_num_gps)  ! sum across all ranks
max_num_gps = min(max_num_gps*2, (m + 1)*(n + 1)*(p + 1))
@:ALLOCATE(ghost_points(1:max_num_gps))

s_mpi_allreduce_integer_sum returns the global sum across all MPI ranks. With many ranks, max_num_gps before the *2 cap could already be a multiple of the per-rank domain size, and the min(..., (m+1)*(n+1)*(p+1)) cap is a per-rank domain count. On a fine decomposition this cap could be hit, effectively halving the available slots relative to the *2 intent. The old logic was min(allreduce_sum/2, domain/2) (with two separate arrays). The new logic feels equivalent in spirit but the relationship between the global sum and the per-rank domain cap may not preserve the same safety margin. Please verify this allocation is sufficient when tested with ≥2 MPI ranks and a moving IB near a rank boundary.


[LOW — Exact allocation for static case] s_ibm_recompute_ghost_points path for static IBM
src/simulation/m_ibm.fpp (around new line ~930)

For static IBM, ghost_points is allocated to exactly num_gps (no buffer). If s_ibm_recompute_ghost_points can ever be invoked in the static case (e.g., through a code path that doesn't check moving_immersed_boundary_flag before calling), there is no risk of out-of-bounds since the count doesn't change. But it would be safer to add an assertion or early return guarding that routine for non-moving cases, both for correctness documentation and to avoid surprise from future call-site changes.


Checklist items noted

The PR template GPU checklist boxes (GPU results match CPU and Tested on NVIDIA/AMD GPU) are unchecked. Given this touches s_find_ghost_points which uses GPU_PARALLEL_LOOP with atomic operations, GPU validation is important before merge.


Overall: the memory reduction intent is sound and the implementation is clean. The primary concern is the silent physics change from dropping the inner-point momentum zeroing — please either explain why it's safe or restore it. The allocation sizing for multi-rank moving IBM also warrants a closer look.

@danieljvickers marked this pull request as ready for review on March 15, 2026 at 19:38
coderabbitai bot commented Mar 15, 2026

📝 Walkthrough

Walkthrough

This change simplifies the IBM (Immersed Boundary Method) module by removing support for tracking inner points and inner GPU-related declarations. The implementation now relies exclusively on ghost points for tracking boundary interactions and GPU data management. Allocations, GPU data transfers, and data processing paths have been consolidated to use ghost points only. Two subroutine signatures were updated: s_find_num_ghost_points no longer returns a count of inner GPs, and s_find_ghost_points no longer accepts inner points as input. Conditional logic handling inner GP state has been removed throughout the module, resulting in a streamlined, ghost-point-centric design.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Description check: ❓ Inconclusive
Explanation: PR description is mostly complete with motivation, type of change, and testing details, but lacks an issue reference and GPU testing confirmation despite modifying src/simulation/.
Resolution: Add the issue number (Fixes #...) and confirm GPU test results (CPU/GPU match) since src/simulation/ files were modified, as specified in the GPU changes checklist.
✅ Passed checks (2 passed)
Title check: ✅ Passed. The title 'IB Memory Reduction' directly and accurately summarizes the main change of removing the inner point array and reducing ghost points to free memory.
Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the check.





@coderabbitai bot left a comment


Actionable comments posted: 2


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c32d7245-f497-4577-84f9-8476db657ca5

📥 Commits

Reviewing files that changed from the base of the PR and between 93e3d09 and 58ac6bc.

📒 Files selected for processing (1)
  • src/simulation/m_ibm.fpp

Comment on lines +75 to 76
$:GPU_ENTER_DATA(copyin='[num_gps]')



⚠️ Potential issue | 🟡 Minor

Missing GPU_EXIT_DATA for num_gps.

num_gps is entered to GPU data region here but there's no corresponding GPU_EXIT_DATA in s_finalize_ibm_module to release the device memory.

Comment on lines +129 to +132
  @:ALLOCATE(ghost_points(1:max_num_gps))

- $:GPU_ENTER_DATA(copyin='[ghost_points,inner_points]')
- call s_find_ghost_points(ghost_points, inner_points)
+ $:GPU_ENTER_DATA(copyin='[ghost_points]')
+ call s_find_ghost_points(ghost_points)

⚠️ Potential issue | 🟠 Major

Missing @:DEALLOCATE and GPU_EXIT_DATA for ghost_points.

ghost_points is allocated here with @:ALLOCATE and entered to GPU with GPU_ENTER_DATA, but s_finalize_ibm_module does not include matching cleanup calls. This will cause a memory leak on both host and device.

🔧 Proposed fix in s_finalize_ibm_module
 impure subroutine s_finalize_ibm_module()

+    $:GPU_EXIT_DATA(delete='[ghost_points, num_gps]')
+    @:DEALLOCATE(ghost_points)
     @:DEALLOCATE(ib_markers%sf)
     if (allocated(airfoil_grid_u)) then
         @:DEALLOCATE(airfoil_grid_u)
         @:DEALLOCATE(airfoil_grid_l)
     end if

 end subroutine s_finalize_ibm_module

As per coding guidelines: "Every @:ALLOCATE(...) macro call must have a matching @:DEALLOCATE(...) in Fortran files" and "Lifecycle of allocations must be paired with deallocations; ensure all data entering GPU memory has corresponding exit/cleanup."

@github-actions

Claude Code Review

Head SHA: 58ac6bc
Files changed: 1 — `src/simulation/m_ibm.fpp`
Net diff: +31 / -73

Summary

  • Removes `inner_points` array and `num_inner_gps` counter, reducing memory for IBM
  • Adds a guard to skip MPI allreduce and avoid over-allocating for non-moving IB cases
  • Removes the inner-points momentum-zeroing loop in `s_ibm_correct_state`
  • Removes the `moving_ibm > 0` guard on the momentum-enforcement loop, applying it unconditionally to all IB cells

Findings

1. Encoded ib_markers values for inner cells may cause out-of-bounds access (potential correctness bug)

Location: `s_ibm_correct_state`, ~line 199

`s_apply_ib_patches` stores encoded patch IDs in `ib_markers%sf` (encoding includes periodicity bits). Previously, `s_find_ghost_points` called `s_decode_patch_periodicity` for both ghost cells and inner cells, writing the decoded `patch_id` back to `ib_markers%sf(i,j,k)`. That inner-cell decoding path has been removed in this PR.

Now, inner IB cells (cells inside the IB that are not adjacent to a fluid cell) retain their encoded values in `ib_markers%sf`. The momentum-enforcement loop reads:

patch_id = ib_markers%sf(j, k, l)   ! may be an encoded value for inner cells
if (patch_id /= 0) then
    ...
    q_cons_vf(momxb + i - 1)%sf(j, k, l) = patch_ib(patch_id)%vel(i)*rho  ! out-of-bounds if encoded > num_patches_ib

For non-periodic IBMs the encoded value equals the plain patch ID — no problem. For periodic IBM simulations where inner cells exist, `encoded_patch_id > num_ibs` and `patch_ib(patch_id)` is an out-of-bounds array access.

The old code was protected because (a) inner cells were decoded and (b) the `if (patch_ib(patch_id)%moving_ibm > 0)` guard was present. Both protections are now gone. Please confirm that all tested mibm examples are non-periodic, or add a decode step for inner cells, or verify that `s_apply_ib_patches` never stores encoded values for non-ghost cells.
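If a decode step is restored, it could take roughly this shape (purely illustrative; the actual signature of s_decode_patch_periodicity and the encoding threshold are not visible in this diff):

```fortran
! Hypothetical guard: normalize the marker before indexing patch_ib so
! an encoded inner-cell value cannot index out of bounds.
patch_id = ib_markers%sf(j, k, l)
if (patch_id > num_ibs) call s_decode_patch_periodicity(patch_id)  ! assumed in-place decode
if (patch_id /= 0) then
    ! ... existing momentum enforcement via patch_ib(patch_id) ...
end if
```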


2. Missing @:DEALLOCATE(ghost_points) in finalization (convention violation)

Location: `s_finalize_ibm_module`, line ~1083

`s_finalize_ibm_module` deallocates `ib_markers%sf` but not `ghost_points`. Per project convention (CLAUDE.md): "Every `@:ALLOCATE(...)` MUST have a matching `@:DEALLOCATE(...)`." The `inner_points` allocation was also missing this before the PR (which this PR fixes by removing it). But `ghost_points` remains without a matching `@:DEALLOCATE`.

This should be:

subroutine s_finalize_ibm_module()
    @:DEALLOCATE(ib_markers%sf)
+   if (allocated(ghost_points)) then
+       @:DEALLOCATE(ghost_points)
+   end if
    ...
end subroutine

3. Momentum enforcement now applies to stationary IBMs (behavioral change)

Location: `s_ibm_correct_state`, lines ~195–215

Removing the `if (patch_ib(patch_id)%moving_ibm > 0)` guard means the momentum-enforcement block now runs for all IB cells regardless of whether the IB is moving. For stationary patches with `vel = [0,0,0]`, this explicitly sets momentum to zero on every timestep — previously a no-op implicit. This is safe if and only if stationary `patch_ib` entries always have `vel = 0`.

If a use case sets `vel /= 0` for a "stationary" patch (e.g., a fixed-velocity inflow IB), this change alters the behavior. Please confirm the intended semantics or add a comment explaining the new invariant.


4. Allocation factor for moving IBMs (minor)

Location: `s_ibm_setup`, lines ~120–130

call s_mpi_allreduce_integer_sum(num_gps, max_num_gps)
max_num_gps = min(max_num_gps*2, (m + 1)*(n + 1)*(p + 1))

`s_mpi_allreduce_integer_sum` gives the sum across all MPI ranks. So each rank allocates a buffer for 2× the global total, capped at the local domain size. This is very conservative but safe. A comment explaining why the sum (not max) is used, and why the factor of 2 is sufficient, would help future readers.


Overall

The memory savings goal is well-motivated and the non-moving optimization is clean. Items 1 and 2 need attention before merge; items 3–4 are lower priority clarifications.

@codecov

codecov bot commented Mar 15, 2026

Codecov Report

❌ Patch coverage is 64.28571% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.36%. Comparing base (93e3d09) to head (58ac6bc).
⚠️ Report is 5 commits behind head on master.

Files with missing lines: src/simulation/m_ibm.fpp (patch 64.28%, 5 lines missing ⚠️)
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1307      +/-   ##
==========================================
- Coverage   45.36%   45.36%   -0.01%     
==========================================
  Files          70       70              
  Lines       20499    20488      -11     
  Branches     1953     1951       -2     
==========================================
- Hits         9300     9294       -6     
+ Misses      10074    10069       -5     
  Partials     1125     1125              


@sbryngelson merged commit d8c479f into MFlowCode:master on Mar 16, 2026
91 of 94 checks passed
