Equation Indices and BC Types - Issue (#828) #860

mohdsaid497566 · Jun 5, 2025

Description

making derived types and addressing issue #828

Fixes #(issue) [optional]

Type of change

Please delete options that are not relevant.

New feature (non-breaking change which adds functionality)

Scope

This PR comprises a set of related changes with a common goal

If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.
Provide instructions so we can reproduce.
Please also list any relevant details for your test configuration

Test A
Test B

Test Configuration:

What computers and compilers did you use to test this:

Checklist

I have added comments for the new code
I added Doxygen docstrings to the new code
I have made corresponding changes to the documentation (docs/)
I have added regression tests to the test suite so that people can verify in the future that the feature is behaving as expected
I have added example cases in examples/ that demonstrate my new feature performing as expected.
They run to completion and demonstrate "interesting physics"
I ran ./mfc.sh format before committing my code
New and existing tests pass locally with my changes, including with GPU capability enabled (both NVIDIA hardware with NVHPC compilers and AMD hardware with CRAY compilers) and disabled
This PR does not introduce any repeated code (it follows the DRY principle)
I cannot think of a way to condense this code and reduce any introduced additional line count

If your code changes any code source files (anything in `src/simulation`)

To make sure the code is performing as expected on GPU devices, I have:

Checked that the code compiles using NVHPC compilers
Checked that the code compiles using CRAY compilers
Ran the code on either V100, A100, or H100 GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
Ran the code on MI200+ GPUs and ensure the new features performed as expected (the GPU results match the CPU results)
Enclosed the new feature via nvtx ranges so that they can be identified in profiles
Ran a Nsight Systems profile using ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
Ran a Rocprof Systems profile using ./mfc.sh run XXXX --gpu -t simulation --rsys --hip-trace, and have attached the output file and plain text results to this PR.
Ran my code using various numbers of different GPUs (1, 2, and 8, for example) in parallel and made sure that the results scale similarly to what happens if you run without the new code/feature

sbryngelson · Jun 6, 2025

.vscode/settings.json

@@ -16,8 +16,8 @@
    "python.defaultInterpreterPath": "${workspaceFolder}/build/venv/bin/python3",

    "files.associations": {
-        "*.f90": "FortranFreeForm",
-        "*.fpp": "FortranFreeForm",
+        "*.f90": "fortran-modern",


does this help anything?

sbryngelson · Jun 6, 2025

src/common/m_boundary_common.fpp

@@ -22,7 +22,7 @@ module m_boundary_common
    type(scalar_field), dimension(:, :), allocatable :: bc_buffers
    !$acc declare create(bc_buffers)

-    real(wp) :: bcxb, bcxe, bcyb, bcye, bczb, bcze
+    type(boundary_flags) :: bc_flag


These aren't flags but rather bounds. bc_x_b = boundarycondition_xdirection_beginning.

sbryngelson · Jun 6, 2025

src/common/m_derived_types.fpp

@@ -345,6 +345,11 @@ module m_derived_types
        type(vec3_dt), allocatable, dimension(:) :: var
    end type mpi_io_airfoil_ib_var

+    !> Derived type for boundary flags
+    type boundary_flags


see above comment about the word 'flags'

sbryngelson · Jun 6, 2025

src/simulation/m_ibm.fpp

-                    call s_convert_species_to_mixture_variables_acc(rho, gamma, pi_inf, qv_K, alpha_IP, &
-                                                                    alpha_rho_IP, Re_K, j, k, l, G_K, Gs)
+                    call s_convert_species_to_mixture_variables_acc(rho, gamma, pi_inf, qv_K, &
+                                                                    gp%ip%alpha, gp%ip%alpha_rho, Re_K, j, k, l, G_K, Gs)


one advantage of this strategy of using more derived types is that we can just pass in gp%ip (for ibm) or state%L (or something like this) for riemann solvers. basically gathering all the data that could exist at a given point and lumping it into a derived type, and then the argument list becomes very short. for example, you just pass in

call s_routine(state)

where

subroutine s_routine(mystate) type(state) :: mystate [...] mystate%rho = [......] end subroutine

sbryngelson · Jun 6, 2025

src/simulation/m_ibm.fpp

@@ -743,22 +757,15 @@ contains

    !> Function that uses the interpolation coefficients and the current state
    !! at the cell centers in order to estimate the state at the image point
-    pure subroutine s_interpolate_image_point(q_prim_vf, gp, alpha_rho_IP, alpha_IP, pres_IP, vel_IP, c_IP, r_IP, v_IP, pb_IP, mv_IP, nmom_IP, pb, mv, presb_IP, massv_IP)
+    pure subroutine s_interpolate_image_point(q_prim_vf, gp, pb, mv)


this is an good example of what i was talking about above... for removing so many arguments.

yup yup I see that

sbryngelson · Jun 6, 2025

src/simulation/m_ibm.fpp

+
+        if (allocated(ghost_points)) then
+            do i = 1, size(ghost_points)
+                if (allocated(ghost_points(i)%ip%alpha_rho)) deallocate (ghost_points(i)%ip%alpha_rho)


FYI from perplexity:

Perfect! Given that your point_data type uses allocatable components (not pointers), your manual deallocation loop is completely unnecessary.

The Clean Solution

Since all the components you're manually deallocating (alpha_rho, alpha, r, v, pb, mv, nmom, presb, massv) are declared as allocatable, Fortran will automatically deallocate them when the parent object is deallocated.

You can replace your entire verbose loop with just:

if (allocated(ghost_points)) deallocate(ghost_points)

That's it! When ghost_points is deallocated, Fortran automatically:

Deallocates all allocatable components within each ghost_points(i)%ip

Deallocates the ghost_points array itself

Why This Works

The Fortran standard guarantees that when a derived type object with allocatable components is deallocated (or goes out of scope), all of its allocatable components are automatically deallocated first. This is called automatic deallocation and is one of the key advantages of using allocatable over pointer.

Additional Benefits

This approach is not only cleaner but also:

Safer: No risk of forgetting to deallocate a component

More maintainable: Adding new allocatable components to point_data requires no changes to deallocation code

Better performance: No unnecessary allocation status checks in loops

Your original manual approach was doing work that Fortran handles automatically for allocatable components. The verbose loop is only necessary when dealing with pointer components, which require explicit deallocation.

neat will remove the loop

sbryngelson · Jun 6, 2025

src/simulation/m_ibm.fpp

@@ -95,6 +95,35 @@ contains
        @:ALLOCATE(ghost_points(1:num_gps))


i'm not sure how this ever worked. i would ask on the MFC Slack and tag @abbotts . this is an array of derived types. it seems like you would need to use something like this macro for AMD GPUs:

MFC/src/common/include/macros.fpp

Lines 48 to 62 in 32c0347

#:def ACC_SETUP_SFs(*args)

#ifdef _CRAYFTN

block

@:LOG({'@:ACC_SETUP_SFs(${', '.join(args)}$)'})

#:for arg in args

!$acc enter data copyin(${arg}$)

if (associated(${arg}$%sf)) then

!$acc enter data create(${arg}$%sf)

end if

#:endfor

end block

#endif

#:enddef

the reason is perhaps that sf is another derived type. when you have a derived type that has another allocatable derived type within it, you need to use a macro like this (except instead of sf it would be something else)

sbryngelson · Jun 6, 2025

src/simulation/m_ibm.fpp

+            end if
+
+            if (qbmm) then
+                allocate (ghost_points(i)%ip%nmom(nb*nmom))


we use the @:ALLOCATE macro in almost all cases i think, since it declares the variable on the device.

sbryngelson

this is a good PR but isn't quite right. the issues are subtle though, and are somewhat MFC specific. ask about the things i brought up on the slack channel (or ask @wilfonba @abbotts @anandrdbz etc.)

codecov · Jun 6, 2025

Codecov Report

Attention: Patch coverage is 25.00000% with 72 lines in your changes missing coverage. Please review.

Project coverage is 43.39%. Comparing base (32c0347) to head (e689a6b).

Files with missing lines	Patch %	Lines
src/simulation/m_ibm.fpp	5.79%	65 Missing ⚠️
src/common/m_boundary_common.fpp	33.33%	6 Missing ⚠️
src/simulation/m_cbc.fpp	92.85%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #860      +/-   ##
==========================================
- Coverage   43.42%   43.39%   -0.04%     
==========================================
  Files          68       68              
  Lines       19630    19655      +25     
  Branches     2353     2355       +2     
==========================================
+ Hits         8525     8529       +4     
- Misses       9676     9697      +21     
  Partials     1429     1429

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

sbryngelson · Jun 6, 2025

This CCE failure suggests something is wrong with allocation, deallocation, or memory access. It happens on Cray CPU cases.

+ srun --nodes 1 --ntasks-per-node 1 --cpus-per-task 7 /lustre/orion/cfd154/scratch/sbryngelson/runner/actions-runner/_work/MFC/MFC/build/install/877bac7bd7/bin/simulation
corrupted size vs. prev_size in fastbins
srun: error: frontier10353: task 0: Aborted (core dumped)
srun: Terminating StepId=3499970.414

and chatgpt says

Thanks — since your simulation is written in Fortran, the corrupted size vs. prev_size in fastbins error still points to heap corruption, but it's likely caused by one of the following Fortran-specific issues:

🔍 Common Causes in Fortran

Out-of-bounds array access

Fortran does not check array bounds unless explicitly compiled with bounds checking.

For example:

real, allocatable :: arr(:)
allocate(arr(10))
arr(11) = 1.0  ! <- invalid access, no error unless bounds checking is on

Use of uninitialized pointers or allocatables
- Deallocating something not allocated, or reusing pointers improperly, can corrupt memory.
```
real, pointer :: p(:)
deallocate(p)  ! <- invalid if never allocated
```
Double deallocate()
- Fortran does not always protect against this unless you're careful.
Mixing C interop with Fortran allocatables
- If you’re using ISO_C_BINDING and passing Fortran arrays to C (e.g., through OpenACC, CUDA Fortran, or MPI buffers), and there's a mismatch in memory handling, heap corruption can result.

🛠️ Recommended Actions

✅ Compile with bounds and memory checks:

Add flags to enable runtime checking:

For gfortran:
```
-g -O0 -fcheck=all -fbacktrace -Wall
```
For ifort (Intel Fortran):
```
-g -check all -traceback -warn all
```

✅ Use `valgrind` with MPI wrapper or standalone:

valgrind ./simulation

Or with MPI:

mpirun -n 1 valgrind ./simulation

✅ Reproduce with small input

Try to isolate the problem with a minimal input case. Heap corruption is often non-deterministic, so smaller runs can help localize it.

✅ Consider AddressSanitizer (if using gfortran):

Modern gfortran supports it:

-fsanitize=address -fsanitize=undefined -g

⚠️ This slows down execution but will immediately point out invalid memory use.

mohdsaid497566 · Jun 8, 2025

So I made the following derived type, but the build always corrupts mainly in the post-process because sys_size clashes with all processes while being in the form eqn_idx%sys_size for some reason. So for now, I will undo all changes pertaining to sys_size. Other than that, all ought to be sound and functional I hope so.

    !> @name Annotations of the structure of the state and flux vectors in terms of the
    !! size and the configuration of the system of equations to which they belong
    !> @{
    type system_of_equations
        integer :: sys_size                            !< Number of unknowns in system of eqns.
        type(int_bounds_info) :: cont                  !< Indexes of first & last continuity eqns.
        type(int_bounds_info) :: mom                   !< Indexes of first & last momentum eqns.
        integer :: E                                   !< Index of energy equation
        integer :: n                                   !< Index of number density
        type(int_bounds_info) :: adv                   !< Indexes of first & last advection eqns.
        type(int_bounds_info) :: internalEnergies      !< Indexes of first & last internal energy eqns.
        type(bub_bounds_info) :: bub                   !< Indexes of first & last bubble variable eqns.
        integer :: alf                                 !< Index of void fraction
        integer :: gamma                               !< Index of specific heat ratio func. eqn.
        integer :: pi_inf                              !< Index of liquid stiffness func. eqn.
        type(int_bounds_info) :: B                     !< Indexes of first and last magnetic field eqns.
        type(int_bounds_info) :: stress                !< Indexes of first and last shear stress eqns.
        type(int_bounds_info) :: xi                    !< Indexes of first and last reference map eqns.
        integer :: b_size                              !< Number of elements in the symmetric b tensor, plus one
        integer :: tensor_size                         !< Number of elements in the full tensor plus one
        type(int_bounds_info) :: species               !< Indexes of first & last concentration eqns.
        integer :: c                                   !< Index of color function
        integer :: damage                              !< Index of damage state variable (D) for continuum damage model
        integer, dimension(3) :: dir
        real(wp), dimension(3) :: dir_flg
        integer, dimension(3) :: dir_tau !!used for hypoelasticity=true
        integer, dimension(2) :: Re_size
        integer, allocatable, dimension(:, :) :: Re
    end type system_of_equations

mohdsaid497566 · Jun 8, 2025

src/simulation/m_ibm.fpp

-                if (allocated(ghost_points(i)%ip%massv)) deallocate (ghost_points(i)%ip%massv)
-            end do
-        end if
+


@abbotts

Do you mind checking out this line?
I suspect it to be causing all IBM cases to fail.

sbryngelson · Jun 9, 2025

src/common/m_derived_types.fpp

+        real(wp), dimension(3) :: dir_flg
+        integer, dimension(3) :: dir_tau !!used for hypoelasticity=true
+        integer, dimension(2) :: Re_size
+        integer, allocatable, dimension(:, :) :: Re


you may need to be careful with this on GPU cases (especially Frontier/Cray). i forget exactly how we deal with cases where we have an allocatable in a derived type but i would look for other examples of the code that do this. also, isn't Re a real and not an integer? and why is its size unknown at compile time?

I kinda omitted _idx from all variables listed. It was Re_idx and I just copied the declaration into under system_of_equations type.

sbryngelson · Jun 9, 2025

So I made the following derived type, but the build always corrupts mainly in the post-process because sys_size clashes with all processes while being in the form eqn_idx%sys_size for some reason. So for now, I will undo all changes pertaining to sys_size. Other than that, all ought to be sound and functional I hope so.

    !> @name Annotations of the structure of the state and flux vectors in terms of the
    !! size and the configuration of the system of equations to which they belong
    !> @{
    type system_of_equations
        integer :: sys_size                            !< Number of unknowns in system of eqns.
        type(int_bounds_info) :: cont                  !< Indexes of first & last continuity eqns.
        type(int_bounds_info) :: mom                   !< Indexes of first & last momentum eqns.
        integer :: E                                   !< Index of energy equation
        integer :: n                                   !< Index of number density
        type(int_bounds_info) :: adv                   !< Indexes of first & last advection eqns.
        type(int_bounds_info) :: internalEnergies      !< Indexes of first & last internal energy eqns.
        type(bub_bounds_info) :: bub                   !< Indexes of first & last bubble variable eqns.
        integer :: alf                                 !< Index of void fraction
        integer :: gamma                               !< Index of specific heat ratio func. eqn.
        integer :: pi_inf                              !< Index of liquid stiffness func. eqn.
        type(int_bounds_info) :: B                     !< Indexes of first and last magnetic field eqns.
        type(int_bounds_info) :: stress                !< Indexes of first and last shear stress eqns.
        type(int_bounds_info) :: xi                    !< Indexes of first and last reference map eqns.
        integer :: b_size                              !< Number of elements in the symmetric b tensor, plus one
        integer :: tensor_size                         !< Number of elements in the full tensor plus one
        type(int_bounds_info) :: species               !< Indexes of first & last concentration eqns.
        integer :: c                                   !< Index of color function
        integer :: damage                              !< Index of damage state variable (D) for continuum damage model
        integer, dimension(3) :: dir
        real(wp), dimension(3) :: dir_flg
        integer, dimension(3) :: dir_tau !!used for hypoelasticity=true
        integer, dimension(2) :: Re_size
        integer, allocatable, dimension(:, :) :: Re
    end type system_of_equations

sys_size is used in so many places in the code that it's possible you missed something somewhere. also, did you make sure to !$acc update device(eqn_idx%sys_size) after it is set in m_global_parameters? i'm thinking about this line here

MFC/src/simulation/m_global_parameters.fpp

Line 1219 in db44da1

    
                   !$acc update device(momxb, momxe, advxb, advxe, contxb, contxe, bubxb, bubxe, intxb, intxe, sys_size, buff_size, E_idx, alf_idx, n_idx, adv_n, adap_dt, pi_fac, strxb, strxe, chemxb, chemxe, c_idx)

sbryngelson · Jun 9, 2025

nitpick but probably can change the naming pattern of bub_bounds_info to bub_bounds and similiar. these types are used in the code a lot so it makes sense to keep the names of types short-ish.

sbryngelson · Jun 9, 2025

FYI i moved pressure relaxation from RHS to its own module, which is why the merge conflict there looks so crazy

…to bc-dev

mohdsaid497566 added 2 commits June 5, 2025 17:30

ip as derived type (MFlowCode#828)

69a5312

push for testing

f5554d4

mohdsaid497566 requested review from a team and sbryngelson as code owners June 5, 2025 23:08

bc_flag as derived type

d60b4ea

sbryngelson reviewed Jun 6, 2025

View reviewed changes

mohdsaid497566 added 2 commits June 6, 2025 01:10

bc_flag implemented

bf2a897

renamed to bc_bound and fixed things

e689a6b

mohdsaid497566 added 2 commits June 6, 2025 20:24

reverted allocate cases

485b1cb

eqn_idx as derived type

f50daac

mohdsaid497566 commented Jun 8, 2025

View reviewed changes

mohdsaid497566 added 2 commits June 8, 2025 22:34

debugging build issues

11e1e71

added derived type eqn_idx

4b813ab

sbryngelson reviewed Jun 9, 2025

View reviewed changes

mohdsaid497566 added 2 commits June 9, 2025 00:07

eqn_idx implemented in pre_process build

b11682b

eqn_idx implemented in simulation build

57f6b1a

mohdsaid497566 and others added 6 commits June 9, 2025 01:03

eqn_idx implemented in post_process build

6bd7e36

Merge branch 'bc-dev2' into bc-dev

1e7c62b

Merge branch 'bc-dev' of https://github.com/mohdsaid497566/MFC-mo2 in…

37aa5d9

…to bc-dev

Merge branch 'master' into bc-dev

ac48caa

removed duplicate type definition

bd3269a

added sys_size under eqn_idx

514b771

		@@ -95,6 +95,35 @@ contains
		@:ALLOCATE(ghost_points(1:num_gps))

	#:def ACC_SETUP_SFs(*args)
	#ifdef _CRAYFTN
	block

	@:LOG({'@:ACC_SETUP_SFs(${', '.join(args)}$)'})

	#:for arg in args
	!$acc enter data copyin(${arg}$)
	if (associated(${arg}$%sf)) then
	!$acc enter data create(${arg}$%sf)
	end if
	#:endfor
	end block
	#endif
	#:enddef

Search code, repositories, users, issues, pull requests...

Equation Indices and BC Types - Issue (#828) #860

Are you sure you want to change the base?

Equation Indices and BC Types - Issue (#828) #860

Conversation

mohdsaid497566 commented Jun 5, 2025

Description

Type of change

Scope

How Has This Been Tested?

Checklist

If your code changes any code source files (anything in src/simulation)

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

The Clean Solution

Why This Works

Additional Benefits

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sbryngelson left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jun 6, 2025

Codecov Report

Uh oh!

sbryngelson commented Jun 6, 2025

🔍 Common Causes in Fortran

🛠️ Recommended Actions

✅ Compile with bounds and memory checks:

✅ Use valgrind with MPI wrapper or standalone:

✅ Reproduce with small input

✅ Consider AddressSanitizer (if using gfortran):

Uh oh!

mohdsaid497566 commented Jun 8, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sbryngelson commented Jun 9, 2025

Uh oh!

sbryngelson commented Jun 9, 2025

Uh oh!

sbryngelson commented Jun 9, 2025

Uh oh!

Uh oh!

If your code changes any code source files (anything in `src/simulation`)

✅ Use `valgrind` with MPI wrapper or standalone: