Timer consistency across backends #29062
Conversation
With this PR all backends produce 7 events for this script:
I added tests and they all pass locally for me, but I can't seem to get things working on the CI runners, so I'm going to put some thoughts/notes down here. I'm wondering if there are issues with trying to assert against timing on shared resources. Thinking it was just trying to run things too fast on limited resources, I increased the timer interval to 500 ms, but that didn't seem to help. Currently I am asserting against the expected number of calls; is there possibly a delay involved in the first timer spin-up, so that I should instead be asserting against the final timer interval somehow? If anyone has ideas for better tests here, let me know.
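One way to make such tests more tolerant of loaded CI machines (a hypothetical sketch, not code from this PR) is to assert only lower bounds on the firing times rather than exact call counts against tight deadlines, since a timer can be late under contention but should never fire before its deadline:

```python
import threading
import time

events = []
start = time.perf_counter()

def tick():
    events.append(time.perf_counter() - start)

def schedule(n):
    # Simulate a 50 ms repeating timer (a stand-in for a backend timer).
    if n > 0:
        threading.Timer(0.05, lambda: (tick(), schedule(n - 1))).start()

schedule(4)
time.sleep(0.5)

# Assert lower bounds only: loaded CI machines add jitter, but a timer
# never legitimately fires *before* its deadline.
assert len(events) == 4
for i, t in enumerate(events):
    assert t >= (i + 1) * 0.05 * 0.8   # 20% slack below the nominal deadline
```

Upper-bound assertions (e.g. "fired within 10 ms of the deadline") are the ones that tend to flake under parallel pytest runs.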
Force-pushed from 8596fb6 to 7f40ba5
Since this is only failing on 3.13 (on macOS at least), one possibly suspicious change is in
But I thought you had used
I can reproduce the failure on AppVeyor locally (Expected 8, got 5); given #28647 (comment), I think that means it is re-evaluating the timeout after the slow callback and not attempting to create a consistent time.
That is actually a different failure than the one we are seeing here of course 😂 https://github.com/matplotlib/matplotlib/actions/runs/11688137387/job/32547874573?pr=29062#step:15:200
These tests seem flaky on CI systems, because I had a job right before this push where all the tests passed. I think this gets to the comment you left about system resources and scheduling: sometimes Qt is the backend failing in CI, other times it is GTK, and we are running pytest with as many processors as possible.

Running locally with this script:

```python
import time

import matplotlib.pyplot as plt

fig = plt.figure()
last_time = orig_time = time.perf_counter()
x = 0

def on_timer():
    global x
    global last_time
    t = time.perf_counter()
    print(f"{x:02d}: {t - last_time:.3f} ({t - orig_time:.3f})")
    last_time = t
    if x == 0:
        time.sleep(0.5)
    else:
        time.sleep(0.1)
    x += 1

timer = fig.canvas.new_timer(interval=150)
timer.add_callback(on_timer)
timer.start()
fig.canvas.start_event_loop(3)
```

GTK has an ~8 ms drift associated with it, whereas all the other backends hit within ~1 ms.
It looks like GTK has no plans to change this: https://gitlab.gnome.org/GNOME/glib/-/issues/503
Force-pushed from 7f40ba5 to 8bc2e95
The implementation of start_event_loop previously just counted the number of sleeps that occurred, but this could lead to longer event-loop times if flush_events() added time into the loop. We want the exit condition to depend on the end time so we don't run the loop longer than necessary.
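The difference can be sketched as follows (a simplified illustration with assumed names, not the actual canvas code): the loop checks a wall-clock deadline each iteration, so time spent in flush_events() no longer extends the total run time.

```python
import time

def start_event_loop(timeout, flush_events, interval=0.01):
    """Run for `timeout` seconds of wall-clock time.

    Counting sleep iterations instead would overshoot whenever
    flush_events() (or a callback it triggers) takes longer than
    `interval`; a deadline makes the total duration independent of
    callback cost.
    """
    deadline = time.perf_counter() + timeout
    while time.perf_counter() < deadline:
        flush_events()
        # Never sleep past the deadline.
        time.sleep(min(interval, max(0.0, deadline - time.perf_counter())))

# Example: a flush that takes 30 ms still yields a ~0.2 s loop overall,
# where a "count N sleeps" loop would have stretched to ~0.8 s.
start = time.perf_counter()
start_event_loop(0.2, lambda: time.sleep(0.03))
elapsed = time.perf_counter() - start
assert 0.2 <= elapsed < 0.35
```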
If we set interval in a loop, this could cause the timing to become dependent on the callback processing time.
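The guard can be sketched like this (illustrative only; the counter attribute is my stand-in for the backend-specific restart call): assigning the same interval repeatedly becomes a no-op instead of resetting the countdown each time.

```python
class Timer:
    """Minimal sketch of a timer whose interval setter only restarts on change."""

    def __init__(self, interval=1000):
        self._interval = interval
        self.restarts = 0  # stands in for the backend-specific timer restart

    def _timer_set_interval(self):
        self.restarts += 1

    @property
    def interval(self):
        return self._interval

    @interval.setter
    def interval(self, interval):
        interval = max(int(interval), 1)
        # Only restart the underlying timer when the value actually changes;
        # re-assigning the same interval in a loop must not reset the countdown.
        if interval != self._interval:
            self._interval = interval
            self._timer_set_interval()

t = Timer(interval=150)
for _ in range(100):
    t.interval = 150   # same value: no restart, timing unaffected
t.interval = 300       # one real change, one restart
assert t.restarts == 1
```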
The Tk timer would reset itself after the callback had processed in order to add a new interval. This meant that if a long callback was added, we would get drift in the timer. We need to manually track the original firing time and compute intervals based on that.
Make sure that the interactive timers don't drift when long callbacks are associated with them.
Previously, the timers were dependent on the length of time it took for the timer callback to execute. This dispatches the callback to the task queue to avoid synchronously waiting on long-running callback tasks.
This adds more robust interactive timer tests to assert against some of the discrepancies that were found in testing:
- The run loop shouldn't depend on callback time.
- Slow callbacks shouldn't cause a timer to drift over time; it should keep firing at the requested cadence.
- When start() is called again, it should invalidate the previous timer associated with that Timer object.
Force-pushed from 8bc2e95 to 54e07ed
PR summary
Currently, all of the backend timers have subtly different behavior relating to long-running callbacks, meaning we make no guarantees about when a timer will fire and it is the wild west of what will happen. This PR has quite a few changes to bring the backends into consistency, and adds tests for these cases. The main updates are described below:
default backend main loop
This is a really bad error IMO. We currently run the loop for x number of sleep() calls, but if the callback takes longer than the sleep duration, the run loop runs longer than the user requested. I have updated this to depend on the total duration within the loop.
interval updates
When setting the interval or singleshot attribute of a timer, we would always call the underlying start() again, meaning this would trigger a timer "reset" in a sense, even if the interval was exactly the same as before (i.e. setting it constantly in a loop). I have updated this to only "reset"/"restart" the timer if the value has changed.

Tk
We need to keep track of the expected firing time ourselves since there is no repeating timer. We track the originally requested firing time and the callback trigger time to handle the case of a callback taking longer than the repeat interval, where we want the timer to fire immediately rather than waiting for the next interval.
Added singleshot and interval handler updates.
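The rescheduling idea can be sketched as follows (a simplified, synchronous illustration with assumed names, not the Tk code itself): each deadline is computed from the original start time, and an overdue tick fires immediately instead of re-arming a full interval after the slow callback returns.

```python
import time

def run_drift_free(callback, interval, n_ticks):
    """Fire `callback` every `interval` seconds without accumulating drift."""
    start = time.perf_counter()
    for tick in range(1, n_ticks + 1):
        # Deadlines come from the original start time, not from when the
        # previous callback finished, so slow callbacks cannot add drift.
        next_fire = start + tick * interval
        delay = next_fire - time.perf_counter()
        if delay > 0:
            time.sleep(delay)   # on schedule: wait for the deadline
        # else: past due after a slow callback -> fire immediately
        callback()

ticks = []
t0 = time.perf_counter()

def slow():
    ticks.append(time.perf_counter() - t0)
    time.sleep(0.15)            # deliberately longer than the interval

run_drift_free(slow, 0.1, 3)
total = time.perf_counter() - t0

assert len(ticks) == 3
# A timer that re-arms a full interval after each callback would need about
# 3 * (0.1 + 0.15) = 0.75 s; the absolute schedule finishes sooner because
# overdue ticks fire immediately.
assert total < 0.7
```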
wx
Added singleshot update handler.
macos
Added an asynchronous dispatch to the main queue. This prevents timer drift with synchronous slow callbacks.
Added singleshot and interval update handling.
Removed previously started repeating timer
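The async-dispatch idea can be sketched in Python terms (names and structure are mine, not the actual Objective-C code): the timer's fire handler only enqueues the user callback and returns immediately, so a slow callback cannot delay the next scheduled fire.

```python
import queue
import time

tasks = queue.Queue()
fire_times = []

def on_fire():
    fire_times.append(time.perf_counter())
    tasks.put(slow_callback)    # asynchronous dispatch: returns at once

def slow_callback():
    time.sleep(0.2)             # much longer than the 50 ms interval

# Simulate three timer fires at a 50 ms cadence.
for _ in range(3):
    time.sleep(0.05)
    on_fire()

# The fires stayed on cadence even though each callback takes 0.2 s;
# calling slow_callback() synchronously inside on_fire() would have
# pushed each gap to ~0.25 s.
gaps = [b - a for a, b in zip(fire_times, fire_times[1:])]
assert all(g < 0.15 for g in gaps)

# Drain the queue; in AppKit this work runs on the main run loop.
while not tasks.empty():
    tasks.get()()
```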
testing
Added a test for a slow callback to the interactive timers and refactored the tests to try to process in parallel and reduce some of the time spent in the tests.
closes #28647
closes #29029
closes #29076
PR checklist