bpo-36977: Make SharedMemoryManager release its resources if its parent process dies #13451

Open · pierreglaser wants to merge 5 commits into main from shared-memory-manager-shutdown

Conversation

@pierreglaser (Contributor) commented May 20, 2019

@pierreglaser (Contributor, Author) commented:

@ogrisel

@eamanu (Contributor) left a comment:

Hello. IMO you should add a test for your changes; I don't think they are being tested yet, are they?

Are you making these changes only for the test? If you want to say something about (in this example) the child process still being alive, I think you need to say it in the code (not only in the test), so that anyone reading it knows the process is alive.

Lib/test/_test_multiprocessing.py (resolved review thread)
@pierreglaser force-pushed the shared-memory-manager-shutdown branch from 56c0933 to c8add0d on May 20, 2019 20:50
@pierreglaser force-pushed the shared-memory-manager-shutdown branch from c8add0d to 87c889f on May 20, 2019 20:52
deadline = start_time + 60
t = 0.1
while time.monotonic() < deadline:
    time.sleep(t)
Member commented:

I don't feel comfortable having tests that rely on sleep; they are almost assured to fail on some of the buildbots (check #10700 for example). Could you restructure these tests to use synchronization primitives?
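
(For illustration only, not code from this PR: a minimal sketch of the kind of event-based synchronization being suggested, where the child process signals the parent through a multiprocessing.Event instead of the parent polling with time.sleep(). The names _child and started are made up.)

import multiprocessing as mp

def _child(started):
    # Signal the parent instead of letting it poll with time.sleep().
    started.set()

if __name__ == "__main__":
    started = mp.Event()
    p = mp.Process(target=_child, args=(started,))
    p.start()
    # Block until the child signals; the timeout is only a safety net.
    assert started.wait(timeout=60)
    p.join()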

@pierreglaser (Author) replied:

I understand your worries. I am having trouble thinking of a test that does not require a timeout though (else, we risk hanging forever waiting for the synchronization event to occur). Inherently, this tests a feature that relies on connection.wait, which can block for a non-deterministic amount of time.

Note that using this pattern:

while time.monotonic() < deadline:
    time.sleep(t)
    if condition():
        break
else:
    raise AssertionError

is already more robust than a simple

time.sleep(dt)
self.assertTrue(condition())

It was discussed here: https://bugs.python.org/issue36867, and @vstinner seemed OK with its usage. I may be missing something though.

Member replied:

multiprocessing remains the module with the highest number of random failures due to sleeps/race conditions. I will dismiss my review in case there is no other way, but I am still uneasy about adding more sleep() :(

@bedevere-bot commented:

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Lib/test/_test_multiprocessing.py (resolved review thread)
@pierreglaser (Author) commented:

Maybe @pitrou wants to have a look at this.

@pitrou (Member) left a comment:

Some comments below. It would also be nice if @applio could take a look and say if it looks fine on the principle.

@@ -159,6 +159,10 @@ def __init__(self, registry, address, authkey, serializer):
        self.id_to_local_proxy_obj = {}
        self.mutex = threading.Lock()

    def _track_parent(self):
        process.parent_process().join()
Member commented:

What if process.parent_process() returns None? For example if the Server was launched in an independent Python instance?

@pierreglaser (Author) replied:

Did not know this was a supported use-case. But we can add a guard.
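
(For illustration only, not the PR's final code: a minimal sketch of such a guard inside the Server class; the actual shutdown step is only indicated by a comment.)

def _track_parent(self):
    parent = process.parent_process()
    if parent is not None:
        # Only wait on a parent when there is one; a server launched from
        # an independent Python instance has no parent process to track.
        parent.join()
        # The parent has terminated at this point: initiate the server
        # shutdown / resource release here.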

@pierreglaser (Author) added:

However, adding a guard will simply discard the case where parent_process() is None, which is not the best solution. Yet, I feel that tracking the usage of manager-allocated resources by other processes than the manager's direct parent is tricky, especially for SharedMemoryManager (see more detailed explanation here #13247 (comment)).

Do we want to make this feature more robust by trying to make the manager wait on any process that tries to connect to it?

Member replied:

I'm afraid I'm not following you here. We're simply trying to fix the case where the manager was launched by its parent process, no? (that's what the issue title says)

The case where the manager is autonomous looks pretty much unfixable on its own.

@pierreglaser (Author) replied:

> We're simply trying to fix the case where the manager was launched by its parent process

OK, let's stick to this in this PR.

> The case where the manager is autonomous looks pretty much unfixable on its own.

I agree, but there are in-betweens: if a manager is created by a process (its parent) and then connected to by other processes, should the manager just watch its parent? Technically, if other processes are also connected to it, it probably should not shut down unless ALL the processes connected to it (and not only its parent) have terminated.

@pitrou (Member) commented May 30, 2019:

Hmm... So, conversely, if the manager loses all its connections and was created by a parent process, then it should die?

Perhaps that should be the right heuristic?

In other words: SharedMemoryServer.handle_request can easily keep track of the number of connected clients. At the end, it decrements that number and, if it falls to 0 and there is a parent process, then die?
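
(A rough sketch of that heuristic, for illustration only: the _client_count attribute and the shutdown step are assumptions, not code from this PR.)

# Inside SharedMemoryServer (Lib/multiprocessing/managers.py):
def handle_request(self, conn):
    # Hypothetical bookkeeping around the existing request handling.
    with self.mutex:
        self._client_count = getattr(self, '_client_count', 0) + 1
    try:
        super().handle_request(conn)
    finally:
        with self.mutex:
            self._client_count -= 1
            no_clients_left = (self._client_count == 0)
        if no_clients_left and process.parent_process() is not None:
            # Created by a parent process and all clients are gone:
            # initiate shutdown here.
            pass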

@pierreglaser (Author) replied:

Yes, I think this is already a more robust solution. However, as opposed to SyncManager, manipulating or modifying SharedMemoryManager-delivered memory segments does not necessarily trigger communication with the manager server process. This is because SharedMemoryManager delivers SharedMemory objects, not proxies. Thus, keeping track of the number of connected clients does not equate to keeping track of the processes manipulating those objects. But maybe this approach is still good enough for now.
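
(A small illustration of this point, not code from the PR: once a segment has been created through the manager, another process can attach to it by name via multiprocessing.shared_memory without contacting the manager server again.)

from multiprocessing import shared_memory
from multiprocessing.managers import SharedMemoryManager

with SharedMemoryManager() as smm:
    shm = smm.SharedMemory(size=128)  # created via the manager process
    # Another process could run the next three lines: attaching by name
    # involves no communication with the manager server.
    other = shared_memory.SharedMemory(name=shm.name)
    other.buf[:5] = b"hello"
    other.close()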

Also, the recent os.memfd_create mentioned in this python issue looks interesting, but I need to look more into it.

shm_name = p.stdout.readline().decode().strip()
p.terminate()
start_time = time.monotonic()
deadline = start_time + 60
Member commented:

10 seconds sounds enough, no?

@ogrisel (Contributor) commented May 27, 2019:

In case of overloaded CI workers one can see weird things on rare occasions.

99.9% of the time it will probably take less than 10ms (although I have not checked); this limit is just a timeout, so I think it's fine to keep a large value to make it extremely unlikely to get false-positive failures on CI servers.

@pitrou (Member) commented May 27, 2019:

These are seconds here not milliseconds.

Member added:

But, no strong feelings either.

Lib/test/_test_multiprocessing.py (outdated, resolved review thread)
Lib/multiprocessing/managers.py (outdated, resolved review thread)
Lib/multiprocessing/managers.py (outdated, resolved review thread)