Better support regular container restarts #3929

casperklein · Mar 9, 2024

Description

Described here.

Not covered by this PR:

Documentation
Additional test(s)

Fixes #3886

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Improvement (non-breaking change that does improve existing functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist

My code follows the style guidelines of this project
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation (README.md or the documentation under docs/)
If necessary, I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have added information about changes made in this PR to CHANGELOG.md

target/scripts/start-mailserver.sh

Co-authored-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>

polarathene

Earlier suggestion for simplifying the conditional branch. Suggestion is slightly awkward due to web UI functionality 😅

target/scripts/start-mailserver.sh

casperklein · Mar 12, 2024

Earlier suggestion for simplifying the conditional branch. Suggestion is slightly awkward due to web UI functionality 😅

Thanks, but I already got what you proposed 👍 I just had no time yesterday answering you.

The reason I did it my way is:

Depending on the log level defined in SUPERVISOR_LOGLEVEL, _early_supervisor_setup modifies the supervisor configuration and runs a reload afterwards. When supervisord is reloaded, all services (only mailserver at this point) are restarted. That means start-mailserver.sh is run a second time. That is not necessary, especially on a container restart. That's why _early_supervisor_setup should only be called run once (on first container start).

After your comment, I read the source again and noticed, that for that situation there is already a safe-guard not to try to modify the configuration (and reloading supervisor), when not needed. But I think the intention for that in the first place was not in regard of a container restart.

Even with that safe-guard in place, there is no need to call that function on container restarts.

Regarding _early_variables_setup: We could put that above the if [[ ! -f /CONTAINER_START ]]; then condition. In the case of SUPERVISOR_LOGLEVEL=info for example, supervisord is reloaded later on the first container start, which then runs start-mailserver.sh again. _early_variables_setup would then be called again, which makes the first invocation useless.

I hope you understand what I tried to say. Summery: I did it that way, with a view to effectiveness (avoiding double functions calls on some circumstances) --> saving some CPU cycles 😆

polarathene · Mar 12, 2024

Depending on the log level defined in SUPERVISOR_LOGLEVEL, _early_supervisor_setup modifies the supervisor configuration and runs a reload afterwards.

Ideally it shouldn't be a problem for container restarts since ENV doesn't change.

When supervisord is reloaded, all services (only mailserver at this point) are restarted.
That means start-mailserver.sh is run a second time. That is not necessary, especially on a container restart.
That's why _early_supervisor_setup should only be called run once (on first container start).

This is the actual problem, and is due to:

docker-mailserver/Dockerfile

Lines 316 to 317 in 2133b51

    
           ENTRYPOINT ["/usr/bin/dumb-init", "--"] 
        
           CMD ["supervisord", "-c", "/etc/supervisor/supervisord.conf"]

docker-mailserver/target/supervisor/conf.d/dms-services.conf

Lines 7 to 16 in 2133b51

    
           [program:mailserver] 
        
           startsecs=0 
        
           stopwaitsecs=55 
        
           autostart=true 
        
           autorestart=false 
        
           stdout_logfile=/dev/stdout 
        
           stdout_logfile_maxbytes=0 
        
           stderr_logfile=/dev/stderr 
        
           stderr_logfile_maxbytes=0 
        
           command=/usr/local/bin/start-mailserver.sh

docker-mailserver/target/scripts/start-mailserver.sh

Lines 190 to 191 in 2133b51

    
           touch /var/log/mail/mail.log 
        
           tail -Fn 0 /var/log/mail/mail.log

What is the technical reason for needing to run the startup script as a separate service? If we didn't then this early supervisord support wouldn't be necessary and the script would not need to restart it and exit itself to re-run?

docker-mailserver/target/scripts/startup/setup-stack.sh

Lines 47 to 48 in 2133b51

    
           supervisorctl reload 
        
           exit

I hope you understand what I tried to say.

Yes thanks, I hadn't considered it.

Although it should be fine for our default CMD to be start-mailserver.sh, so long as it starts supervisor via exec at the end it'd be PID 1? (although that'd probably require refactoring of daemon-stack.sh to instead set autostart=true instead of supervisorctl start ..., and may complicate the log output a bit 🤷‍♂️)

EDIT: My bad dumb-init would be PID 1 wouldn't it, but the process group it'd manage would be tied to whatever CMD is set to, and a script can use exec to transition seamlessly to the current CMD AFAIK 🤔

So while this would be preferable:

_early_supervisor_setup
_early_variables_setup
_register_functions
_log 'info' "Welcome to docker-mailserver ${DMS_RELEASE}"
  
# Ensure DMS only adjusts config files for a new container.
# Container restarts should skip as they retain the modified config.
if [[ ! -f /CONTAINER_START ]]; then
  _check
  _setup
  _run_user_patches
  # Only displayed on fresh containers (better for bug reports)
  [[ ${LOG_LEVEL} =~ (debug|trace) ]] && print-environment
else
  _log 'info' 'Container was restarted. Skipping setup routines..'
fi

_early_supervisor_setup is presently a problem, only because we rely on it for starting start-mailserver.sh and managing the logs that the script outputs afterwards via exec tail ...? (although technically due to an existing safe-guard, the concern isn't actually valid)

CHANGELOG.md

georglauterbach

Two small nitpicks. Otherwise, LGMT 👍🏼 Why doesn't this PR ship the corresponding documentation updates?

target/scripts/start-mailserver.sh

Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>

georglauterbach · Mar 13, 2024

Oops, my suggestion contained an additional space - praise be linters 😝

casperklein · Mar 13, 2024

Although it should be fine for our default CMD to be start-mailserver.sh, so long as it starts supervisor via exec at the end it'd be PID 1?

EDIT: My bad dumb-init would be PID 1 wouldn't it, but the process group it'd manage would be tied to whatever CMD is set to, and a script can use exec to transition seamlessly to the current CMD AFAIK

Just in short: There may be other ways to do things as you noted - absolutely. And I don't say, that these are bad or anything. Those ways may have new issues/problems. The current solution is approved and solid for a long time.
Personally I like that everything, including start-mailserver.sh, is handled as a service and supervisor is directly called on container start. With this PR I wanted to introduce just some small changes, to allow a (better than before) container restart. I didn't want to revolutionize the whole startup process 😉

# Only displayed on fresh containers (better for bug reports)
[[ ${LOG_LEVEL} =~ (debug|trace) ]] && print-environment

I don't care about this one and can change that, but I am wondering what your reasoning for that is. If someone enables debugging, why should we omit debug information on container restarts? There are solutions/logging drivers where the log size is limited and you can't scroll back to the very beginning and only get the last X lines.

polarathene · Mar 13, 2024

# Only displayed on fresh containers (better for bug reports)
[[ ${LOG_LEVEL} =~ (debug|trace) ]] && print-environment

I don't care about this one and can change that, but I am wondering what your reasoning for that is.
If someone enables debugging, why should we omit debug information on container restarts?

If they enable the ENV it'll do a fresh start AFAIK. Unless there's other surprising cases like how compose will keep a VOLUME (implicit anonymous volume in Dockerfile) even with --force-recreate or a new tag upgrade (it seems to key it to the service name and VOLUME instruction regardless of image).

Sure there might be a legitimate need for it with a container restart, I just figured we'd prefer bug reports for fresh containers only and this would be more evident in their log output without us having to do extra back and forth?

If there is a legit bug that is reproduced only with a container restart, they could manually get the equivalent output since we write it to a file anyway?

There are solutions/logging drivers where the log size is limited and you can't scroll back to the very beginning and only get the last X lines.

In that sense, this output is even less helpful? Personally I'd rather we had this limited to trace if we need early output for a startup failure, otherwise we could just instruct the user to run a setup debug ... command to get the same output separate from the main log output?

Often when ENV matters, the user doesn't provide such log output, and we're just given a mailserver.env reference and have to ask for that, which is often a copy/paste dump with plenty of default values for us to wade through, even at times when I ask for them to only share the non-default values they changed 😩

georglauterbach · Mar 14, 2024

LGTM 👍🏼 Still, why isn't this PR adjusting our docs?

casperklein · Mar 15, 2024

If they enable the ENV it'll do a fresh start AFAIK

? If you change ENVs and want to take them place, you'll have to recreate the container. "start/stop" does not take changed variables into account.

.. I just figured we'd prefer bug reports for fresh containers only and this would be more evident in their log output without us having to do extra back and forth?

I haven't thought about this at all 👍 For me, enabling debugging is not mutual exclusive to opening an issue. Enabling any feature, I would not expect, that some functions are only run on first container start. That's why I find it in this case inconsistent, when printing the ENVs, would only take place on the first container start.

Often when ENV matters, the user doesn't provide such log output

That's unfortunately nothing we can fix scripting wise 😆

LGTM 👍🏼 Still, why isn't this PR adjusting our docs?

I haven't checked it, but I guess that there are many places that need to be adjusted. I simply have no time for that right now. We can leave this PR open if you or @polarathene want to do the docu stuff.

On the other hand, starting with a fresh container is still a good advice. That "wrong" information does not hurt in any way. We could wait some time to see, how well this PR ages and adjust the docs, when we are 100% confident, that container restarts are fully supported and there are no more issues with that.

polarathene · Mar 15, 2024

Enabling any feature, I would not expect, that some functions are only run on first container start.

But that's exactly what this PR does, conditionally run startup logic based on if this is a new container instance or not?

That's why I find it in this case inconsistent, when printing the ENVs, would only take place on the first container start.

This is related to troubleshooting right? That's the main purpose of that output, yet by skipping the output from running initial container setup, you're lacking any helpful insights that come from that which can emit warnings/errors.

It's better to encourage a fresh container instance for this and by ensuring the ENV print method is also only run for that it will be communicated much better to us with bug reports, and for the user who's more likely to see the skipped log message this PR adds when there is less noise being output?

That's unfortunately nothing we can fix scripting wise 😆

Disagree, we know what are default ENV values are, we could absolutely output only non-defaults, either at startup via log or a setup debug command.

? If you change ENVs and want to take them place, you'll have to recreate the container. "start/stop" does not take changed variables into account.

That was kind of my point. With docker start you're resuming a container and the ENV will carry over as you can't adjust them with that command.

With Docker Compose, IIRC adjusting ENV via compose.yaml would get detected and recreate the container, at least with the common docker compose up command?

So if a user explicitly changes ENV for a container, they'll need a fresh container instance regardless.

Not sure what value we're providing to the user by keeping ENV printing at container startup for restarts/resumes?

I haven't checked it, but I guess that there are many places that need to be adjusted. I simply have no time for that right now. We can leave this PR open if you or @polarathene want to do the docu stuff.

I've not chimed in on the docs as I don't know what docs are being referred to that need updating. There's two pages (usage and debugging) with the docker compose up --force-recreate advice.

Does this PR completely address the internal state issue?

We could wait some time to see, how well this PR ages and adjust the docs, when we are 100% confident, that container restarts are fully supported and there are no more issues with that.

👍

I don't see issue with leaving doc as-is for now either. Rather have that there for now and we can remove it later if it doesn't provide value. I think it's still good advice for the debugging page.

casperklein · Mar 16, 2024

Let's stop the discussion at this point about printing the ENVs on startup if that's fine for you. I see you have valid points - me too. I was about to reply to your points and noticed it was getting quiet lengthy again. Surely you would reply again to my answers. I think it's not worth to invest that much time for 1 LOC. If you or Georg demand a change, I can do it or you can change it directly 👍

Does this PR completely address the internal state issue?

I hope so, but I wouldn't bet any money on this 😆

georglauterbach · Mar 16, 2024

Agreed, let's not make a lengthy discussion out of it 🚀 I am fine with not updating the docs for now, but only if it does not become the default behavior. We also need to update the docs when we are confident that all is well with this feature.

casperklein added 3 commits March 9, 2024 23:26

run check and setup functions only on first container start

f8f4fae

no need to keep bash process running

43b6078

remove _check_improper_restart function

7755bca

casperklein self-assigned this Mar 9, 2024

casperklein added this to the v14.0.0 milestone Mar 9, 2024

casperklein added area/scripts kind/improvement Improve an existing feature, configuration file or the documentation labels Mar 9, 2024

casperklein marked this pull request as ready for review March 10, 2024 10:06

polarathene requested changes Mar 10, 2024

View reviewed changes

target/scripts/start-mailserver.sh Outdated Show resolved Hide resolved

target/scripts/start-mailserver.sh Show resolved Hide resolved

target/scripts/start-mailserver.sh Show resolved Hide resolved

review feedback: comments enhanced

203da5c

Co-authored-by: Brennan Kinney <5098581+polarathene@users.noreply.github.com>

polarathene requested changes Mar 11, 2024

View reviewed changes

casperklein added 2 commits March 12, 2024 18:58

Merge branch 'master' into container-restart

1d70aed

sed --> sedfile

2727c67

changelog

0274599

polarathene requested changes Mar 13, 2024

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

Update CHANGELOG.md

1e7d1fd

polarathene previously approved these changes Mar 13, 2024

View reviewed changes

polarathene requested a review from georglauterbach March 13, 2024 00:14

georglauterbach requested changes Mar 13, 2024

View reviewed changes

target/scripts/start-mailserver.sh Show resolved Hide resolved

target/scripts/start-mailserver.sh Outdated Show resolved Hide resolved

review feedback: style adjusted

9c5c337

Co-authored-by: Georg Lauterbach <44545919+georglauterbach@users.noreply.github.com>

casperklein dismissed polarathene’s stale review via 9c5c337 March 13, 2024 12:02

fix typo

e482627

polarathene approved these changes Mar 13, 2024

View reviewed changes

Merge branch 'master' into container-restart

9dca90a

georglauterbach approved these changes Mar 16, 2024

View reviewed changes

casperklein changed the title ~~Support container restart~~ Better support regular container restarts Mar 17, 2024

casperklein merged commit 066773e into docker-mailserver:master Mar 17, 2024

casperklein deleted the container-restart branch March 17, 2024 15:31

polarathene mentioned this pull request Jul 10, 2024

Amavis socket permissions error #1146

Closed

casperklein mentioned this pull request Jul 21, 2024

docs: Remove restart: always from compose.yaml #4124

Closed

3 tasks

This was referenced Aug 2, 2024

fix: Ensure main log file is tailed from the start #4146

Merged

bug: DMS v14 regression - Container restart support may fail due to unclean exit #4150

Closed

Search code, repositories, users, issues, pull requests...

Uh oh!

Better support regular container restarts #3929

Better support regular container restarts #3929

Uh oh!

Conversation

casperklein commented Mar 9, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Checklist

Uh oh!

Uh oh!

Uh oh!

Uh oh!

polarathene left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

casperklein commented Mar 12, 2024

Uh oh!

polarathene commented Mar 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

georglauterbach left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

georglauterbach commented Mar 13, 2024

Uh oh!

casperklein commented Mar 13, 2024

Uh oh!

polarathene commented Mar 13, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

georglauterbach commented Mar 14, 2024

Uh oh!

casperklein commented Mar 15, 2024

Uh oh!

polarathene commented Mar 15, 2024

Uh oh!

casperklein commented Mar 16, 2024

Uh oh!

georglauterbach commented Mar 16, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

casperklein commented Mar 9, 2024 •

edited

Loading

polarathene commented Mar 12, 2024 •

edited

Loading

georglauterbach left a comment •

edited

Loading

polarathene commented Mar 13, 2024 •

edited

Loading