bpo-36977: Make SharedMemoryManager release its resources if its parent process dies #13451
Conversation
Hello. IMO you should test the effect of your changes; I don't think you are testing it, are you? Are you making these changes just for the sake of the test? If you want to assert something (in this example, that the child process is still alive), you should express it in the code, not only in the test, so that anyone reading knows the process is alive.
Force-pushed from 56c0933 to c8add0d.
Force-pushed from c8add0d to 87c889f.
deadline = start_time + 60
t = 0.1
while time.monotonic() < deadline:
    time.sleep(t)
I don't feel comfortable having tests that rely on sleep; they are almost assured to fail on some of the buildbots (check #10700 for example). Could you restructure these tests to use synchronization primitives?
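As a rough illustration of that suggestion, here is a minimal sketch of a sleep-free test shape built on an Event (the 60-second bound and the child/started names are illustrative, not from this PR):

import multiprocessing as mp

def child(started):
    # Signal the parent explicitly instead of relying on timing.
    started.set()

if __name__ == "__main__":
    started = mp.Event()
    p = mp.Process(target=child, args=(started,))
    p.start()
    # wait() returns True as soon as the event is set, or False on
    # timeout, so the happy path is fast while slow buildbots still
    # get a generous bound.
    assert started.wait(timeout=60)
    p.join()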
I understand your worries. I am having trouble thinking of a test that does not require a timeout, though (otherwise we risk hanging forever waiting for the synchronization event to occur). Inherently, this tests a feature that relies on connection.wait, which can block for a non-deterministic amount of time.
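(For context, a minimal sketch of what that wait looks like, assuming a Python version where process.parent_process() is available:)

from multiprocessing import connection, process

# wait() blocks until the parent's sentinel becomes ready, i.e. until
# the parent terminates; how long that takes is not deterministic.
parent = process.parent_process()
if parent is not None:
    connection.wait([parent.sentinel])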
Note that using this pattern:
while time.monotonic() < deadline:
    time.sleep(t)
    if condition():
        break
else:
    raise AssertionError
is already more robust than a simple
time.sleep(dt)
self.assertTrue(condition())
It was discussed here: https://bugs.python.org/issue36867, and @vstinner seemed OK with its usage. I may be missing something though.
multiprocessing remains the module with the highest number of random failures due to sleeps/race conditions. I will dismiss my review in case there is no other way, but I'm still uneasy about adding more sleep() calls :(
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase: I have made the requested changes; please review again.
Maybe @pitrou wants to have a look at this.
Some comments below. It would also be nice if @applio could take a look and say whether it looks fine in principle.
Lib/multiprocessing/managers.py (Outdated)
@@ -159,6 +159,10 @@ def __init__(self, registry, address, authkey, serializer):
        self.id_to_local_proxy_obj = {}
        self.mutex = threading.Lock()

    def _track_parent(self):
        process.parent_process().join()
What if process.parent_process() returns None? For example, if the Server was launched in an independent Python instance?
I did not know this was a supported use case, but we can add a guard.
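A minimal sketch of such a guard in _track_parent (the stop_event shutdown call is an assumption about how the server loop would be told to exit, not code from this PR):

    def _track_parent(self):
        # parent_process() returns None when this server was not started
        # by another Python process (e.g. an independent instance), in
        # which case there is no parent to watch.
        parent = process.parent_process()
        if parent is None:
            return
        parent.join()          # blocks until the parent terminates
        self.stop_event.set()  # assumed shutdown hook; name illustrative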
However, adding a guard will simply discard the case where parent_process() is None, which is not the best solution. Yet I feel that tracking the usage of manager-allocated resources by processes other than the manager's direct parent is tricky, especially for SharedMemoryManager (see the more detailed explanation in #13247 (comment)).
Do we want to make this feature more robust by trying to make the manager wait on any process that tries to connect to it?
I'm afraid I'm not following you here. We're simply trying to fix the case where the manager was launched by its parent process, no? (that's what the issue title says)
The case where the manager is autonomous looks pretty much unfixable on its own.
"We're simply trying to fix the case where the manager was launched by its parent process"

OK, let's stick to that in this PR.

"The case where the manager is autonomous looks pretty much unfixable on its own."

I agree, but there are in-betweens: if a manager is created by a process (its parent) and then connected to by other processes, should the manager watch only its parent? Technically, if other processes are also connected to it, it probably should not shut down until ALL processes connected to it (not only its parent) have terminated.
Hmm... So, conversely, if the manager loses all its connections and was created by a parent process, then it should die? Perhaps that is the right heuristic?
In other words: SharedMemoryServer.handle_request can easily keep track of the number of connected clients. At the end, it decrements that number and, if it falls to 0 and there is a parent process, the server dies?
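A sketch of that counting heuristic (the ClientTracker helper and its method names are hypothetical, not part of the PR):

import threading
from multiprocessing import process

class ClientTracker:
    # Count live client connections; when the last one goes away while
    # a parent process exists, ask the server to shut down.
    def __init__(self, stop_event):
        self._lock = threading.Lock()
        self._count = 0
        self._stop_event = stop_event

    def connected(self):
        with self._lock:
            self._count += 1

    def disconnected(self):
        with self._lock:
            self._count -= 1
            last = (self._count == 0)
        # An autonomous server (no parent) keeps running; only a server
        # launched by a parent auto-stops with its last client.
        if last and process.parent_process() is not None:
            self._stop_event.set()

handle_request would call connected() when a client arrives and disconnected() in a finally block once it is done serving it.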
Yes, I think this is already a more robust solution. However, unlike with SyncManager, manipulating or modifying the memory segments delivered by SharedMemoryManager does not necessarily trigger communication with the manager's server process, because SharedMemoryManager delivers SharedMemory objects rather than proxies. Thus, keeping track of the number of connected clients does not equate to keeping track of the processes manipulating those objects. But maybe this approach is still good enough for now.
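To illustrate the point: any process that knows a segment's name can attach and modify it without ever talking to the manager (plain shared_memory API, no manager involved):

from multiprocessing import shared_memory

seg = shared_memory.SharedMemory(create=True, size=16)
seg.buf[:5] = b"hello"

# A second handle attaches purely by name; no manager traffic happens.
other = shared_memory.SharedMemory(name=seg.name)
print(bytes(other.buf[:5]))  # b'hello'

other.close()
seg.close()
seg.unlink()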
Also, the recent os.memfd_create mentioned in this Python issue looks interesting, but I need to look into it more.
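For reference, a minimal sketch of why os.memfd_create (Linux-only, added in Python 3.8) is interesting here: the segment's lifetime follows ordinary file-descriptor semantics, so it cannot outlive its users:

import os

fd = os.memfd_create("demo")  # anonymous, memory-backed fd
os.write(fd, b"hello")
os.lseek(fd, 0, os.SEEK_SET)
print(os.read(fd, 5))  # b'hello'
os.close(fd)  # last descriptor closed -> kernel reclaims the memory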
shm_name = p.stdout.readline().decode().strip()
p.terminate()
start_time = time.monotonic()
deadline = start_time + 60
10 seconds sounds enough, no?
In case of overloaded CI workers, one can see weird things on rare occasions. 99.9% of the time it will probably take less than 10 ms (although I have not checked); this limit is just a timeout, so I think it's fine to keep a large value to make false-positive failures on CI servers extremely unlikely.
These are seconds here, not milliseconds.
But no strong feelings either way.
https://bugs.python.org/issue36977