Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Conversation

georglauterbach
Copy link
Member

@georglauterbach georglauterbach commented Jan 18, 2025

Description

The DMS v14 release introduced improved support for container restarts, but had some regressions that this PR resolves.

Fixes #4108
Fixes #4150

ref #4302

Type of change

  • Improvement (non-breaking change that does improve existing functionality)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md or the documentation under docs/)
  • If necessary, I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have added information about changes made in this PR to CHANGELOG.md

@georglauterbach georglauterbach added area/scripts kind/improvement Improve an existing feature, configuration file or the documentation labels Jan 18, 2025
@georglauterbach georglauterbach added this to the v15.0.0 milestone Jan 18, 2025
@georglauterbach georglauterbach self-assigned this Jan 18, 2025
target/scripts/start-mailserver.sh Outdated Show resolved Hide resolved
Copy link
Member

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TLDR is a checklist of suggested changes at the end of one of the review comments.

Context for those change suggestions is covered by the rest of the feedback 👍

target/scripts/start-mailserver.sh Outdated Show resolved Hide resolved
target/scripts/start-mailserver.sh Outdated Show resolved Hide resolved
target/scripts/start-mailserver.sh Outdated Show resolved Hide resolved
@georglauterbach
Copy link
Member Author

Alrighty - tests are currently running locally, but I think I finished the analysis and implementation for this one. Thank you for the feedback @casperklein and @polarathene - as always.

I have split everything into digestible commits, please review commit by commit.

@polarathene In your #4323 (comment), I have ticked what I implemented, and I explained what I had to do differently with comments in my commits.

@georglauterbach georglauterbach changed the title chore: run check even when the container is restarted scripts: restructure container restart behavior Jan 19, 2025
Copy link
Member

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of change requests from review feedback. Not everything needs to be addressed, but the env export needs to be resolved either as described below or by reverting that change (it doesn't look like ENV can be adjusted for restarts, that results in a fresh container AFAIK, thus may not be a bug to restrict it to _setup())

  • __environment_variables_export is a regression now. To prevent that you'll need to:
    • Migrate _check_log_level() from check-stack.sh into variables-stack.sh
    • Migrate the feature flag co-located ldap, oauth, saslauthd variable-stack.sh method calls from _register_functions (start-mailserver.sh) to a new method in variable-stack.sh with equivalent feature flag ENV checks.
    • Have the two methods (log-level check and feature-flag) called prior to the export method.
      • NOTE: This would make the _early_variables_setup name a bit redundant, but I'm fine leaving that as-is for now in this PR so the diff is easier to grok / follow 😅
  • _setup_adjust_state_permissions and _setup_directory_and_file_permissions are both suitable to be managed during _check, but I won't enforce this change request for the PR.
    • NOTE: _setup_adjust_state_permissions is partially dependent upon _setup_save_states on a fresh container start should the volume not already provide the service dir for a feature. Running _setup_save_states in _check is also valid AFAIK, but the split is of mail_state.sh into two methods is still good to have.
  • _setup_adjust_state_permissions() should inverse the debug log condition, as suggested. I don't see much value in debug log for notifying the user of skipping the logic, as per review feedback.

From prior feedback:

  • Optionally consider dropping _early_supervisor_setup() from start-mailserver.sh and moving that method from setup-stack.sh to daemons-stack.sh where _start_daemons() can call it before the for loop.

target/scripts/startup/setup.d/mail_state.sh Outdated Show resolved Hide resolved
target/scripts/startup/setup.d/mail_state.sh Show resolved Hide resolved
target/scripts/startup/variables-stack.sh Show resolved Hide resolved
target/scripts/start-mailserver.sh Outdated Show resolved Hide resolved
_register_setup_function '_setup_save_states'
_register_setup_function '_setup_apply_fixes_after_configuration'
_register_setup_function '_environment_variables_export'
_register_setup_function '_setup_adjust_state_permissions'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this, but you could also have _setup_save_states() internally call this follow-up method directly at the end of it's function since that is the intention.


EDIT: Per other review feedback, if _setup_save_states() isn't actually a required dependency, better to call separately as you're doing here.

I think it's a dependency in the sense of the operations being called on paths that are expected to exist when the feature flags are enabled, in which case the first time enabled _setup_save_states() should have run first.

I just tested with compose and adjusting an ENV value (hard-coded or via the external variable interpolation) of environment or adding a new ENV cleared the internal state, same with env_file. So it shouldn't be possible to restart with ENV adjusted, it'd go through the fresh container setup.

Per other review feedback, it does seem like _setup_save_states() could still run during a restart, thus keeping the ownership/permissions fix called internally at the end of _setup_save_states() seems viable, along with moving it into _check().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure what to make out of this now. Are you satisfied with the way it is or should I change something?

# Ensure DMS only adjusts config files for a new container.
# Container restarts should skip as they retain the modified config.
if [[ -f /CONTAINER_START ]]; then
_log 'info' 'Container was restarted. Skipping most setup routines.'
# We cannot skip all setup routines because some need to run _after_
# the initial setup (and hence, they cannot be moved to the check stack).
_setup_directory_and_file_permissions
_setup_adjust_state_permissions
else
_setup
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_setup_adjust_state_permissions AFAIK only has a dependency upon the feature flags ENV being set (and now a global var), thus it's possible to group under _check / check-stack.sh.

_setup_directory_and_file_permissions likewise was suggested to be migrated to check-stack.sh, it's dependent upon ENV with logic that should be fine before _setup. It's mostly a collection of misc checks/fixes, there's _chown_var_mail_if_necessary(), but AFAIK we shouldn't need that post _setup during first container start as Dovecot + setup CLI should create /var/mail folders with correct permissions, this has always been a restart related fix I think.


Thus moving these to _check should resolve the concern. For mail_state.sh a fresh container would however output a bit of a misleading debug log for _setup_adjust_state_permissions(), since it ran before _setup_save_states().. so technically that could be run after this conditional block, but that has the caveat that user-patches.sh should also be run after that (but we don't want to run that on a restart.. 🫤)

In my prior review the concern was if _setup_save_states() could potentially do something undesirable on a restart, but we were running that for many releases prior to the introduction of better container restart support via /CONTAINER_START. So presumably it could join other preparation methods in _check to keep everything simple.

The ownership/permission fix that follows is mostly for upgrades or other scenarios which may have fumbled the ownership/permissions AFAIK, so I don't think it actually has a hard dependency to run after _setup_save_states()? If so we could probably point that out via comment above the function name in mail_state.sh so maintainers don't have to think about it 😝


Is the debug log for _setup_adjust_state_permissions() the right approach to output when the directory is missing during a restart? (I'm not sure how common that event would be for a container restart)

For a fresh container start, it seems irrelevant to log when the prior _setup_save_states() emitted a very similar log for that condition. Perhaps it should be a silent return that only emits a debug log when it's actually met the condition to modify the state dir? (_setup_save_states() is always logging for a fresh container, so I think that's sufficient)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tried putting _setup_directory_and_file_permissions into _check_..., but this resulted in failing tests with /var/mail/... having wrong permissions. Due to a lack of time, I did not investigate further but assumed something during setup touched the directories. Hence, it needs to run after all other setup functions. Consequently, I was unable to move it into the check stack.

Please give it a try if you think I made a mistake elsewhere (entirely possible, really), but I think we cannot move it yet without more significant adjustments, which I'm reluctant towards considering we want to release a new major.


Concerning the log if _setup_adjust_state_permissions, I'll change that.

Comment on lines 87 to 90
if [[ ! -d ${DMS_STATE_DIR:?DMS_STATE_DIR is not set} ]]; then
_log 'debug' "'${DMS_STATE_DIR}' is not present - not adjusting state permissions"
return 0
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per related review feedback, flipping the debug log here makes more sense to only emit when changes will occur, the lack of the directory should only be relevant to a fresh container startup, in which case the prior dependency upon _setup_save_states() has us covered in equivalent skipping log.

Suggested change
if [[ ! -d ${DMS_STATE_DIR:?DMS_STATE_DIR is not set} ]]; then
_log 'debug' "'${DMS_STATE_DIR}' is not present - not adjusting state permissions"
return 0
fi
if [[ ! -d ${DMS_STATE_DIR:?DMS_STATE_DIR is not set} ]]; then
return 0
fi
_log 'debug' "Ensuring correct ownership + permissions for DMS state dir: '${DMS_STATE_DIR}'"

EDIT: I'm not sure if there's an actual dependency to _setup_save_states()? I haven't verified if that method correctly syncs over ownership + permissions. I believe that's handled by the mv command, which should preserve everything from the initial internal copy (probably fails if a restart can toggle a feature off and then back on again due to the rm logic of the other conditional though) (containers cannot be started with internal state persisted if ENV is changed from what I've observed).

Still, I'm not sure how useful a debug log when the state dir is missing is helpful. It'd only seem relevant to a container restart and for troubleshooting we generally advise ensuring a fresh container is run which would provide this same context via _setup_save_states().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the log as per e549b73

Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Thanks to @polarathene for his feedback in
#4323 (review)

Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
@georglauterbach
Copy link
Member Author

georglauterbach commented Jan 22, 2025

Summary of change requests from review feedback. Not everything needs to be addressed, but the env export needs to be resolved either as described below or by reverting that change (it doesn't look like ENV can be adjusted for restarts, that results in a fresh container AFAIK, thus may not be a bug to restrict it to _setup())

  • __environment_variables_export is a regression now. To prevent that you'll need to:
    • Migrate _check_log_level() from check-stack.sh into variables-stack.sh
    • Migrate the feature flag co-located ldap, oauth, saslauthd variable-stack.sh method calls from _register_functions (start-mailserver.sh) to a new method in variable-stack.sh with equivalent feature flag ENV checks.
    • Have the two methods (log-level check and feature-flag) called prior to the export method.

Done

  • _setup_adjust_state_permissions and _setup_directory_and_file_permissions are both suitable to be managed during _check, but I won't enforce this change request for the PR.
    • NOTE: _setup_adjust_state_permissions is partially dependent upon _setup_save_states on a fresh container start should the volume not already provide the service dir for a feature. Running _setup_save_states in _check is also valid AFAIK, but the split is of mail_state.sh into two methods is still good to have.

I raised my concerns and problems with this in #4323 (comment).

  • _setup_adjust_state_permissions() should inverse the debug log condition, as suggested. I don't see much value in debug log for notifying the user of skipping the logic, as per review feedback.

Done (I think)

From prior feedback:

  • Optionally consider dropping _early_supervisor_setup() from start-mailserver.sh and moving that method from setup-stack.sh to daemons-stack.sh where _start_daemons() can call it before the for loop.

I do not think this would be a viable approach, as the script calls exit, which I'd like to call as early as possible, and not after the setup stack has already completed.


New commits begin here: 70eedac

dependabot bot and others added 3 commits February 2, 2025 11:51
Bumps [anchore/scan-action](https://github.com/anchore/scan-action) from 6.0.0 to 6.1.0.
- [Release notes](https://github.com/anchore/scan-action/releases)
- [Changelog](https://github.com/anchore/scan-action/blob/main/RELEASE.md)
- [Commits](anchore/scan-action@v6.0.0...v6.1.0)

---
updated-dependencies:
- dependency-name: anchore/scan-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 6.12.0 to 6.13.0.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](docker/build-push-action@v6.12.0...v6.13.0)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>
Signed-off-by: georglauterbach <44545919+georglauterbach@users.noreply.github.com>
@georglauterbach
Copy link
Member Author

I applied the changes and appeased the linting gods.

@casperklein
Copy link
Member

Some update from my side: This went from a small change into something bigger + a lot of discussion which I haven't fully read yet. To make a proper review, I'll have to take some time. At the moment, that's a bit challenging for me. I'd like to review asap, but unfortunately I can't give an ETA right now.
If this can wait some more, I am happy to review. Otherwise I try to review afterwards. I'll leave it to you both, when this gets merged 👍

@polarathene
Copy link
Member

polarathene commented Feb 2, 2025

If this can wait some more, I am happy to review. Otherwise I try to review afterwards. I'll leave it to you both, when this gets merged

I'm in no rush, I'll be working on the podman PR in the meantime 👍

PR Overview

This went from a small change into something bigger + a lot of discussion which I haven't fully read yet.

If it helps here is a quick overview for the diff.

  • Primary change is in start-mailserver.sh to adjust the restart logic, it's mostly unified now so that it only calls _setup on fresh containers.
  • Additional changes are more house-keeping related, some supportive to the primary change. A bulk of the diffs aren't as large as they seem, just the shuffling of lines and indentation adjustments.

Primary change is adjusting the restart check for /CONTAINER_START to this:

_early_supervisor_setup
_early_variables_setup

_log 'info' "Welcome to docker-mailserver ${DMS_RELEASE}"

_register_functions
_check

# Ensure DMS only adjusts config files for a new container.
# Container restarts should skip as they retain the modified config.
if [[ -f /CONTAINER_START ]]; then
  _log 'info' 'Container was restarted. Skipping most setup routines.'
  # We cannot skip all setup routines because some need to run _after_
  # the initial setup (and hence, they cannot be moved to the check stack).
  _setup_directory_and_file_permissions
  _setup_adjust_state_permissions
else
  _setup
fi
  • Effectively now a restart runs everything except _setup.
  • There are two _setup methods that still need to be run for restarts, but were proposed to be migrated to _check functions.
    • _setup_directory_and_file_permissions() is a bunch of misc cleanup operations. One of those is to remove PID files (primary regression we're fixing by ensuring this method is run), and also ensures /var/mail ownership is corrected if necessary.
    • _setup_adjust_state_permissions() is a new method in mail_state.sh, which was split into two separate methods. This is the 2nd method for mail_state.sh which just manages ownership/permission fixes, while the earlier portion only runs on fresh containers to create/copy and symlink to the DMS state volume (if any ENV is changed to toggle a feature, Docker will force recreate the container, however it looks like it's probably safe to still run this during restarts too, we effectively were prior to v14).
  • @georglauterbach did attempt to migrate those methods to _check(), but ran into test failures with /var/mail ownership. I haven't looked into reproducing that yet. That would simplify the conditional block, removing the two separate calls for restart there and also their repeated use in _setup. I'm not too fond of the two call sites and would prefer to simplify, but not blocking the PR on that.

Additional changes:

  • STATEDIR in mail_state.sh was renamed to DMS_STATE_DIR, along with an adjust to the conditional and related log and that is repeated with the 2nd method.
    • Part of this change removed the indent of the method as it previously was all wrapped in the conditional (that's now inverted to exit early).
    • It also introduced __declare_readonly(), which personally I don't feel is necessary and adds some complexity, but I can understand why @georglauterbach wanted to introduce it (this is more of a DX issue with using bash though). It could be avoided and DMS_STATE_DIR could just be declared at the top of mail_state.sh.
  • While the diff is large/noisy for mail_state.sh actual changes is minimal and it's effectively white-space. I did add some minor housekeeping changes but dropping a variable name DESTDIR that was only used for mkdir -p, preferring to inline that and add a comment instead. Likewise for a chcon call, I had the comment append a reference link in favor over relying on git blame for context.
  • As part of the earlier restart logic changes, _setup() was making calls to setup ENV vars via variables-stack.sh but most of these were not registered as part of _setup(), just co-located with the equivalent feature. These have instead been extracted out and managed via variables-stack.sh at an earlier stage as part of _early_variables_setup().
  • _check_log_level() (check-stack.sh) was migrated to variables-stack.sh as __environment_variables_log_level(), which is a house-keeping thing rather than addressing anything in particular, the logic was ENV related so it belonged in variables-stack.sh instead.

polarathene
polarathene previously approved these changes Feb 2, 2025
@georglauterbach
Copy link
Member Author

georglauterbach commented Feb 2, 2025

Indeed, this has become bigger than I anticipated as well. I'd still like to merge ASAP TBH, because I'd really like publish v15.0.0.

Here is my proposal: I'd merge this and prepare the v15 release. We'll pick 1.5 weeks for the freeze because of this change. Please let me know whether this is okay for you both.


@polarathene thanks for the overview :) I added the __declare_readonly to ensure we do not run into problems with readonly variables being defined more than once (e.g. if this script is ever used while the environment was already loaded). It's really a Bash thingy, I guess.. readonly needs special precautions.

@georglauterbach
Copy link
Member Author

I have resolved the merge conflicts with master and enabled autp-merge now.

@georglauterbach georglauterbach enabled auto-merge (squash) February 8, 2025 17:28
Copy link
Member

@polarathene polarathene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll apply these and shift the changelog breaking items to more appropriate sections.

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@polarathene polarathene changed the title scripts: restructure container restart behavior fix: Better support container restarts Feb 8, 2025
@georglauterbach georglauterbach merged commit 59a379a into master Feb 8, 2025
6 of 7 checks passed
@georglauterbach georglauterbach deleted the gl/check-on-restart branch February 8, 2025 21:23
@casperklein
Copy link
Member

casperklein commented Feb 15, 2025

@georglauterbach Good structured commits 👍
@polarathene Thank you for the summary, that helped me to understand things (faster). 👍

The only nitpick I have is the "DMS_STATE_DIR" change:

  • Why making it global when not needed?
  • DMS_STATE_DIR is set explicitly, why is this check needed? {DMS_STATE_DIR:?DMS_STATE_DIR is not set}. This should never be unset?
  • do we really need __declare_readonly?

I just think, that this makes things unnecessary complicated. Like @polarathene said, DMS_STATE_DIR could just be declared at the top of mail_state.sh should be enough.

Edit: Side effect of __declare_readonly I run into while reviewing is, I couldn't easily search for the variable definition of DMS_STATE_DIR. I usually search for "VAR=" to find the location of a variable definition. Sure, not a big thing, I just wanted to mention it.

@polarathene
Copy link
Member

Why making it global when not needed?

I felt the same way but I didn't want to nitpick heavily on it.

I understand the intention to minimize caveats from using Bash. It's one of the main friction points we have with maintaining the project when variables / methods are being pulled into scope from other scripts, something we shouldn't have to feel like we're walking on glass with.

If we do explore migrating some scripts away from bash at some point, that's one of the main benefits. So far NuShell seems like a good compromise for that (a rust-based shell that provides optional type annotation, structured data, module imports, etc). I understand wanting to avoid Python and concerns with languages that'd involve compilation vs an interpreter for scripts, Nu might be a good candidate for such. We've had this discussion quite a bit so I won't drag it out again here 😅

I do agree that I'm not fond of the added complexity / abstraction to add preventative measures prematurely though.

@georglauterbach
Copy link
Member Author

I will remove this part again after seeing your feedback :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/scripts kind/improvement Improve an existing feature, configuration file or the documentation

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

bug: DMS v14 regression - Container restart support may fail due to unclean exit bug: Container restarts skip ownership fixes

3 participants

Morty Proxy This is a proxified and sanitized view of the page, visit original site.