[Messenger] Remove indices in messenger table on MySQL to prevent deadlocks while removing messages when running multiple consumers #42345
Conversation
IIUC, this issue only exists on MySQL. If this is the case, the change should only be done for MySQL and not for all database engines.
Force-pushed from 790721a to ee0c97b
@fabpot, you are right. To be sure, I tested with PostgreSQL and I could not reproduce the issue there. So I changed this to only apply to MySQL. I also added a unit test for these changes.
Force-pushed from 364079e to f7cc0b1
… removing messages when running multiple consumers

SELECT ... FOR UPDATE locks rows, but it also locks the relevant indices. Since locking rows and indices is not one atomic operation, this can cause deadlocks when running multiple workers. Removing the indices on `queue_name` and `available_at` resolves this problem.
Force-pushed from f7cc0b1 to 8c3c0a3
Thank you @jeroennoten.
Thank you @jeroennoten, I just saw this exact error in production for the very first time this weekend. Very nice to see it has already been resolved!
@bobvandevijver, you're welcome! Don't forget to run …
We're using migrations to define the table, but thanks for the tip!
@jeroennoten removing the indexes fixes the lock issue but caused us significant performance issues. I was searching the issues and it seems to me that multiple consumers are not working very well for others either, whether they use the Redis or the Doctrine transport.
I'm proposing to revert this PR and replace it with another approach in #45888.
…ks using soft-delete (nicolas-grekas)

This PR was merged into the 4.4 branch.

Discussion
----------

[Messenger] Add mysql indexes back and work around deadlocks using soft-delete

| Q             | A
| ------------- | ---
| Branch?       | 4.4
| Bug fix?      | yes
| New feature?  | no
| Deprecations? | no
| Tickets       | Fix #42868
| License       | MIT
| Doc PR        | -

#42345 removed some indexes because they create too many deadlocks when running many concurrent consumers. Yet, as reported in the linked issue, those indexes are useful when processing large queues (typically the failed messages queue). #45007 is proposing to add an option to force the indexes back, but I don't like it because it requires ppl to learn about the issue.

I looked for a more seamless solution and here is my proposal. Instead of possibly triggering the deadlock during `ack()`/`reject()`, I propose to use a soft-delete there, and do the real delete in `get()`. This makes ack/reject safe because they don't alter any indexes anymore (the delete was), and this moves deadlock exceptions to the same function that creates the locks. This allows the retry mechanism in `DoctrineReceiver` to recover from at most 3 consecutive deadlock exceptions. There can be more, and in this case, the consumer will stop. But this should be much less likely.

(yes, I did create a reproducer to work on this issue ;) )

Commits
-------

12271a4 [Messenger] Add mysql indexes back and work around deadlocks using soft-delete
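A minimal sketch of that soft-delete idea, assuming the default `messenger_messages` schema and a Doctrine DBAL connection. The function names and the sentinel value are illustrative only, not the code that was actually merged:

```php
<?php
// Illustration of the soft-delete approach: ack()/reject() only mark the row,
// so they no longer remove any secondary-index entries, and the real DELETE is
// deferred to get(), where row locks are taken anyway and where a deadlock can
// be retried by the receiver.

use Doctrine\DBAL\Connection;

function ack(Connection $conn, string $id): void
{
    // Soft delete: flag the row with a sentinel delivered_at far in the future.
    $conn->executeStatement(
        "UPDATE messenger_messages SET delivered_at = '9999-12-31 23:59:59' WHERE id = ?",
        [$id]
    );
}

function purgeBeforeGet(Connection $conn): void
{
    // Called at the start of get(): physically remove rows that were soft-deleted
    // earlier, before fetching the next message with SELECT ... FOR UPDATE.
    $conn->executeStatement(
        "DELETE FROM messenger_messages WHERE delivered_at = '9999-12-31 23:59:59'"
    );
}
```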
SELECT ... FOR UPDATE locks rows, but it also locks the relevant indices. Since locking rows and indices is not one atomic operation, this can cause deadlocks when running multiple workers. Removing the indices on `queue_name` and `available_at` resolves this problem.
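To make the mechanism concrete, each consumer effectively fetches a message with a locking read and later deletes it. This is a simplified sketch of that pattern (assumed SQL, not the transport's exact queries):

```php
<?php
// Simplified view of what consumers do per message. When several workers
// interleave these statements, row locks and secondary-index locks can be
// acquired in different orders, which is what leads to the reported deadlocks.

use Doctrine\DBAL\Connection;

function fetchAndDelete(Connection $conn): void
{
    $conn->beginTransaction();

    // Locks the matching row AND the entries of the queue_name/available_at
    // indexes that were used to find it.
    $row = $conn->fetchAssociative(
        'SELECT id, body FROM messenger_messages
         WHERE queue_name = ? AND delivered_at IS NULL AND available_at <= NOW()
         ORDER BY available_at ASC LIMIT 1 FOR UPDATE',
        ['default']
    );

    if ($row !== false) {
        // ... handle the message, then remove it. The DELETE must also update the
        // secondary indexes, and that is not atomic with the row lock taken above.
        $conn->executeStatement('DELETE FROM messenger_messages WHERE id = ?', [$row['id']]);
    }

    $conn->commit();
}
```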
Using the Doctrine transport with multiple consumers occasionally results in MySQL deadlocks while removing a message from the messages database table.

This can be reproduced consistently by setting up a default `async` queue with the Doctrine transport and creating an empty `TestMessage` and `TestMessageHandler`. Create a command that dispatches 10000 of these messages in a for loop and start 4 message consumers. After a while, several consumers report a deadlock.

A similar problem with Laravel's queue worker (and a solution) is reported here: laravel/framework#31660
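A rough sketch of such a dispatch command, assuming a recent Symfony/PHP version; the command and class names are made up for the example, only `TestMessage` comes from the description above:

```php
<?php
// Dispatches 10000 empty TestMessage objects to the async (Doctrine) transport.
// Reproduce the deadlock by running this once and then starting several
// `bin/console messenger:consume async` processes in parallel.

namespace App\Command;

use App\Message\TestMessage;
use Symfony\Component\Console\Attribute\AsCommand;
use Symfony\Component\Console\Command\Command;
use Symfony\Component\Console\Input\InputInterface;
use Symfony\Component\Console\Output\OutputInterface;
use Symfony\Component\Messenger\MessageBusInterface;

#[AsCommand(name: 'app:dispatch-test-messages')]
class DispatchTestMessagesCommand extends Command
{
    public function __construct(private MessageBusInterface $bus)
    {
        parent::__construct();
    }

    protected function execute(InputInterface $input, OutputInterface $output): int
    {
        for ($i = 0; $i < 10000; ++$i) {
            $this->bus->dispatch(new TestMessage());
        }

        return Command::SUCCESS;
    }
}
```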
The solution is to remove the indices on the `queue_name` and `available_at` columns. After removing these indices, I could not reproduce the issue anymore. Also, I did not notice any performance degradation.
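For an existing installation (for example when the table is managed through migrations, as mentioned above), the change boils down to dropping those two indexes. Here is a sketch of such a Doctrine migration; the index names below are placeholders, so look up the real auto-generated names with `SHOW INDEX FROM messenger_messages` first:

```php
<?php
// Drops the queue_name and available_at indexes on MySQL and recreates them on
// rollback. The index names are placeholders; replace them with the actual
// names from your schema.

declare(strict_types=1);

namespace DoctrineMigrations;

use Doctrine\DBAL\Schema\Schema;
use Doctrine\Migrations\AbstractMigration;

final class VersionDropMessengerIndexes extends AbstractMigration
{
    public function up(Schema $schema): void
    {
        $this->addSql('DROP INDEX idx_messenger_queue_name ON messenger_messages');
        $this->addSql('DROP INDEX idx_messenger_available_at ON messenger_messages');
    }

    public function down(Schema $schema): void
    {
        $this->addSql('CREATE INDEX idx_messenger_queue_name ON messenger_messages (queue_name)');
        $this->addSql('CREATE INDEX idx_messenger_available_at ON messenger_messages (available_at)');
    }
}
```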