-
Notifications
You must be signed in to change notification settings - Fork 383
Fix framework field deallocation errors for XL compiler #578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix framework field deallocation errors for XL compiler #578
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amametjanov thanks for your work on this. I merged this in locally, and it passes the ocean regression suite for both intel and gnu, opt and debug. I did not test the longer simulations that originally had a memory leak, but I would assume a nulify removes the memory the same way a deallocate does.
|
@mgduda do you have any comments on this one? |
|
Let me take another look in the next couple of days. I had been working independently to address Issue #481 back in April. It looks like what I had done is a bit different from the approach taken in this PR, and I want to convince myself that the approaches are equivalent before approving (or to determine why what I was doing was wrong). |
|
@mgduda if we could merge this in the next few days, that would be helpful. |
mgduda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the changes in this PR introduce a new memory leak. To demonstrate this, define a new field in a core's Registry.xml file:
<var name="foo" type="real" dimensions="vertexDegree maxEdges nEdges"/>Then use code like the following to allocate a duplicate of the foo field before calling mpas_deallocate_field:
type (field3dReal), pointer :: foo, foo_copy
call mpas_pool_get_field(mesh, 'foo', foo)
call mpas_duplicate_field(foo, foo_copy)
call mpas_deallocate_field(foo_copy)|
@amametjanov @mark-petersen I have some changes in progress that should fix the issue with the XL compilers while avoiding the memory leak I mentioned in my review. The next few days are somewhat busy, but I could probably finish the changes by 18 June. I'm not sure whether it would be easier to push additional changes to this PR, or to just create a new PR. I'm open to suggestions. |
|
@mgduda thanks for looking into this. We are hoping to merge the develop branch to all our cores on Monday. If you have time to update this before then, we will stick with this PR. Otherwise, we can merge this one and then make a correction on a new PR. |
|
@amametjanov Do you have any objection to my pushing of additional commits to your |
|
Not at all, please go ahead. |
|
@amametjanov @mark-petersen I've taken a slightly different approach to fixing the deallocation errors when using the XL compilers. I need to clean up the commit messages and do more testing, but if either of you have a few minutes, would you be able to give the framework/fix_xl_deallocate_errors branch from my fork a try on Summit? If that branch works for you, I can try to formulate it as an addition to the changes in this PR. Since we've spent so much time on this issue, I do think having a detailed description of the changes in the commit history will be helpful to anyone looking through the code in future. |
|
@mark-petersen @amametjanov Since you mentioned merging the |
|
I ran @mgduda please go ahead with the updates. Thanks. |
Prior to this commit, MPAS models would fail with the error message "mpas_field_routines.F", line 1981: 1525-109 Error encountered while attempting to deallocate a data object. The program will stop. when using the XL compilers. The issue appears to be that the XL compiler does not allow for deallocation of dummy arguments that have the 'target' attribute. To address these runtime deallocation errors, the mpas_deallocate_field_target routine never deallocates the head pointer for the field; accordingly, the keep_head_ptr optional argument has been removed. ** Note ** : The changes in this commit introduce memory leaks, since the mpas_deallocate_field routine calls mpas_deallocate_field_target, and only nullifies the head pointer of the field after these calls.
This commit removes the unused function write_field_pointers from the registry code-generation tool.
This commit removes the unused function generate_struct_links from the registry code-generation tool.
This commit addresses an error in deallocating fields in the mpas_pool_destroy_pool routine for fields with only one time level: "mpas_pool_routines.F", line 259: 1525-109 Error encountered while attempting to deallocate a data object. The program will stop. Fields with just one time level are allocated as an array of fields with size one, and the first (and only) element of this array is added to a pool. During deallocation of the fields, the above error is a result of the attempt to deallocate an element of an array (rather than the entire array). To address this problem, the registry-generated code to build pools allocates fields with a single time level using a scalar field pointer, rather than as an array of fields of size one.
Previously, the head pointer of a field was just nullified in mpas_deallocate_field after the call to mpas_deallocate_field_target. Now, the head pointers for all blocks of a field are deallocated. Also, the mpas_pool_destroy_pool routine can now call mpas_deallocate_field rather than calling mpas_deallocate_field_target followed by a deallocation of the field head pointer.
This commit first removes the source of a potential deallocation error: "mpas_pool_routines.F", line 358: 1525-109 Error encountered while attempting to deallocate a data object. The program will stop. when using the XL compilers; this error is a result of the fact that the mpas_pool_add_subpool routine declares the subpool dummy argument with the target attribute, and deallocation can only happen for variables that are allocatable or have the pointer attribute. With the change to the mpas_pool_add_subpool routine, all "stat" optional arguments can be removed from calls to deallocate in the mpas_pool_destroy_pool routine.
c02d215 to
a9d2aea
Compare
|
@amametjanov It got a bit messy trying to rebase my changes onto c02d215, and doing so also invalidated some of the line numbers that I referenced in error messages in my commit messages, so the easiest approach seem to be to simply push my branch to |
|
That's okay. I keep running into this error with multiple repo clones: sooner=better. |
|
Never mind - I refreshed my page and it shows approved now. Thanks! |
#3737) Update MPAS framework This PR brings in a new mpas-source submodule with updates to the mpas framework itself, plus changes to the cores supporting the framework update. Some changes are made to the atmosphere core, even though it is not used by E3SM, in order to maintain consistency with the framework. This update includes the following individual branches and PRs, many of which are additions to the makefiles for summit, or optional libraries: * az/azamat/mpas-cmake/mpas-tool-dir (MPAS-Dev/MPAS-Model#629) * init_atmosphere/kd_tree_ties (MPAS-Dev/MPAS-Model#630) * mark-petersen/framework/add_FillValue (MPAS-Dev/MPAS-Model#616) * mark-petersen/framework/add_lapack_option (MPAS-Dev/MPAS-Model#613) * amametjanov/framework/nullify-field-pointers (MPAS-Dev/MPAS-Model#578) * framework/makefile_e3sm (MPAS-Dev/MPAS-Model#603) * xylar/framework/make_shell_bash (MPAS-Dev/MPAS-Model#594) * registry/missing_value (MPAS-Dev/MPAS-Model#562) * pwolfram/updates_make_intel_stack (MPAS-Dev/MPAS-Model#592) * azamat/framework/pgi-cpr-omp-workaround (MPAS-Dev/MPAS-Model#449) * az/framework/e3sm-cmake-qnosmp-typo (MPAS-Dev/MPAS-Model#579) * xylar/compass/add_prerequisite_tag * atmosphere/fix_timekeeping_imports (MPAS-Dev/MPAS-Model#582) * xylar/fix_docs_ci (MPAS-Dev/MPAS-Model#575) * xylar/add_compass_docs (MPAS-Dev/MPAS-Model#472) * philipwjones/framework/summitmake (MPAS-Dev/MPAS-Model#536) * atmosphere/atm_core_cleanup (MPAS-Dev/MPAS-Model#548) * init_atmosphere/parse_geoindex (MPAS-Dev/MPAS-Model#459) * xylar/compass/add_copy_file (MPAS-Dev/MPAS-Model#467) * init_atmosphere/kd_tree (MPAS-Dev/MPAS-Model#438) * init_atmosphere/mpas_stack (MPAS-Dev/MPAS-Model#426) * operators/add_mpas_in_cell (MPAS-Dev/MPAS-Model#400) Fixes #3396 Fixes #3236 [BFB]
Update MPAS framework This PR brings in a new mpas-source submodule with updates to the mpas framework itself, plus changes to the cores supporting the framework update. Some changes are made to the atmosphere core, even though it is not used by E3SM, in order to maintain consistency with the framework. This update includes the following individual branches and PRs, many of which are additions to the makefiles for summit, or optional libraries: * az/azamat/mpas-cmake/mpas-tool-dir (MPAS-Dev/MPAS-Model#629) * init_atmosphere/kd_tree_ties (MPAS-Dev/MPAS-Model#630) * mark-petersen/framework/add_FillValue (MPAS-Dev/MPAS-Model#616) * mark-petersen/framework/add_lapack_option (MPAS-Dev/MPAS-Model#613) * amametjanov/framework/nullify-field-pointers (MPAS-Dev/MPAS-Model#578) * framework/makefile_e3sm (MPAS-Dev/MPAS-Model#603) * xylar/framework/make_shell_bash (MPAS-Dev/MPAS-Model#594) * registry/missing_value (MPAS-Dev/MPAS-Model#562) * pwolfram/updates_make_intel_stack (MPAS-Dev/MPAS-Model#592) * azamat/framework/pgi-cpr-omp-workaround (MPAS-Dev/MPAS-Model#449) * az/framework/e3sm-cmake-qnosmp-typo (MPAS-Dev/MPAS-Model#579) * xylar/compass/add_prerequisite_tag * atmosphere/fix_timekeeping_imports (MPAS-Dev/MPAS-Model#582) * xylar/fix_docs_ci (MPAS-Dev/MPAS-Model#575) * xylar/add_compass_docs (MPAS-Dev/MPAS-Model#472) * philipwjones/framework/summitmake (MPAS-Dev/MPAS-Model#536) * atmosphere/atm_core_cleanup (MPAS-Dev/MPAS-Model#548) * init_atmosphere/parse_geoindex (MPAS-Dev/MPAS-Model#459) * xylar/compass/add_copy_file (MPAS-Dev/MPAS-Model#467) * init_atmosphere/kd_tree (MPAS-Dev/MPAS-Model#438) * init_atmosphere/mpas_stack (MPAS-Dev/MPAS-Model#426) * operators/add_mpas_in_cell (MPAS-Dev/MPAS-Model#400) Fixes #3396 Fixes #3236 [BFB]
I found exactly the same problem as we are having here: MPAS-Dev#367 (review). And I said it was later fixed here: MPAS-Dev#367 (comment). The problem was introduced since then, in this PR: MPAS-Dev#578
This merge addresses several sources of pool and field deallocation errors
that arise when MPAS is built with the XL 16 compilers.
Additional discussion is at #367 (review).
Fixes #481
[bit-for-bit]