performance optimization of vindex for large indexes #12074
base: main
Conversation
Unit Test Results
See the test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.
9 files ±0, 9 suites ±0, 3h 14m 2s ⏱️ -31s
For more details on these failures, see this check.
Results for commit 196b85e. ± Comparison against base commit 37835c4.
Hi, thanks for digging so deeply into this. I had trouble understanding the description, so here's what I have understood so far. In your example problem, it would certainly be good to come up with a better algorithm for this case; I spent a bunch of time trying to do it but didn't succeed (at least not cleanly). However, I don't understand what you've changed. Can you please add some comments to the PR explaining the logic? FYI, I developed #11625 looking at the benchmark in #10237. It would be good not to regress here.
In particular, writing code like this:
Not sure if he has already left for the weekend, but @GwenaelCaer has created a documentation website with a lot of details on the optimization: https://odatis-public.gitlab-pages.ifremer.fr/fast_vindex/index.html Does that answer your questions? Edit: in particular, this might be interesting: https://odatis-public.gitlab-pages.ifremer.fr/fast_vindex/development/algorithm_step_by_step.html
ah thanks, that does help! Conceptually, the trick is to completely avoid broadcasting, which I agree with. However, I'm not sure this implementation works well for the pointwise case (which is the case I heavily optimized for). Note that there is no advantage to be gained by not broadcasting in that case. My conclusion earlier was that we'd need two algorithms to handle the two cases; in fact I remember thinking it's probably better to ask the user to use
If I remember correctly, the proposed algorithm doesn't try to handle point-wise indexing; it is specialized in extracting regions (conceptually similar to what "vectorized multi-slices" would do). That is why I suggested adding the proposed algorithm as an optimization for that case, and keeping the existing algorithm for the other cases (not sure if the code already does that).
Not sure
Hmmm, I don't understand: the indices going in are broadcastable against each other. I'll need to spend some time thinking about it. This feels almost like a different indexing type.
In that case, the first step may be to clearly write out the condition that triggers the optimization, and then we could explore how to share code.
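To make the discussion concrete, here is one way such a trigger condition *might* look; this is a purely shape-based, hypothetical sketch (the function name and the exact rule are my own, not code from the PR). It accepts indexers shaped like `(n, k1, 1, 1)`, `(n, 1, k2, 1)`, `(n, 1, 1, k3)` — each indexer varying along its own "block" axis — and rejects plain 1-D pointwise indexers.

```python
import numpy as np

def is_outer_per_point(*indexers):
    """Hypothetical check (not the PR's actual code): do the indexers
    encode, for each point along axis 0, an outer (orthogonal) block?

    Accepts shapes like (n, k1, 1, 1), (n, 1, k2, 1), (n, 1, 1, k3):
    every indexer varies along at most one axis other than axis 0,
    and no two indexers share that axis.
    """
    shapes = [np.asarray(ix).shape for ix in indexers]
    ndim = max(len(s) for s in shapes)
    # All indexers must share the full dimensionality and have a
    # leading "points" axis plus at least one candidate block axis.
    if any(len(s) != ndim for s in shapes) or ndim < 2:
        return False
    used = set()
    for s in shapes:
        big = [ax for ax in range(1, ndim) if s[ax] > 1]
        if len(big) > 1:
            return False  # varies along more than one block axis
        if big:
            if big[0] in used:
                return False  # two indexers share a block axis
            used.add(big[0])
    return True

# Hybrid pattern: each indexer varies along its own block axis.
hybrid_ok = is_outer_per_point(
    np.zeros((5, 2, 1, 1), int),
    np.zeros((5, 1, 3, 1), int),
    np.zeros((5, 1, 1, 4), int),
)
# Pure pointwise pattern: 1-D indexers, no block axes.
pointwise_ok = is_outer_per_point(np.zeros(5, int), np.zeros(5, int))
```

A real predicate inside Dask would also need to check dtype, bounds, and broadcast compatibility; this only captures the shape pattern discussed below.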
It appears I might have confused myself. After clearing my head a bit, I think you're right in that the arrays could be broadcast against each other (to get three
Fundamentally, the problem can be decomposed into
Agreed, that makes sense to me.
Yes, I agree here. The challenge is figuring out when to do it, which is why maybe
I believe it's a hybrid between vectorized and orthogonal indexing: if we have the three arrays of shape
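To illustrate the distinction being discussed (my own NumPy sketch, not code from the PR): pointwise (vectorized) indexing picks one element per index tuple, orthogonal (outer) indexing takes the cross product of the index arrays, and the mini-cube use case is effectively one outer-indexed block *per point*, which pointwise indexing can only reproduce by broadcasting out every element of every block.

```python
import numpy as np

data = np.arange(5 * 6 * 7).reshape(5, 6, 7)

# Pointwise (vectorized) indexing: one element per index triple.
i = np.array([0, 1])
j = np.array([2, 3])
k = np.array([4, 5])
pointwise = data[i, j, k]        # shape (2,)

# Orthogonal (outer) indexing: the cross product of the index
# arrays, i.e. a single block/region.
block = data[np.ix_(i, j, k)]    # shape (2, 2, 2)

# The hybrid case: for each of n points, extract a small block,
# giving shape (n, k1, k2, k3). Broadcasting pointwise indices
# reproduces it, but materializes n*k1*k2*k3 index triples.
k1, k2, k3 = 2, 2, 2
ii = i[:, None, None, None] + np.arange(k1)[None, :, None, None]
jj = j[:, None, None, None] + np.arange(k2)[None, None, :, None]
kk = k[:, None, None, None] + np.arange(k3)[None, None, None, :]
hybrid = data[ii, jj, kk]        # shape (2, 2, 2, 2)
```

Note the index arrays `ii`, `jj`, `kk` have shapes `(2, 2, 1, 1)`, `(2, 1, 2, 1)`, `(2, 1, 1, 2)`: broadcastable against each other, yet each varies along only its own axis, which is the pattern described in the comment above.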
Thank you @keewis and @dcherian for the very interesting discussion. My algorithm is designed for a specific case with a hybrid of vectorized and orthogonal indexing, and it is therefore not suitable for point-wise indexing. It would therefore be useful to identify a condition that indicates when it can actually be applied.
Hello,
In a use case with xarray, I needed to colocalize in-situ data with a Zarr datacube and, for each colocalization, extract a mini-cube around the observation point. This selection uses Dask's `vindex` function, and I quickly ran into performance limitations.

After some research, I discovered that `vindex` uses point-wise indexing, which can become very slow when the index is large. I therefore tried to optimize `vindex` for my use case. The idea behind this optimization is to split the indices by chunk, index each chunk individually using these split indices, and then reconstruct the final result from the resulting pieces.

The current version does not yet pass all tests, but it works for my use case. I'm new to Dask internals, so if anyone wants to help or has advice on improving it, I'd greatly appreciate it.

I'm sharing an example to test the function's performance.
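To make the split-and-reassemble idea concrete, here is a minimal 1-D sketch using plain NumPy arrays standing in for the chunks of a Dask array (the function name and structure are hypothetical, not the PR's actual implementation): group the requested indices by the chunk they fall into, index each chunk once with chunk-local indices, then scatter the pieces back into the output in the original order.

```python
import numpy as np

def chunked_vindex_1d(chunks, indices):
    """Hypothetical sketch of chunk-wise fancy indexing.

    ``chunks`` is a list of 1-D arrays playing the role of a chunked
    Dask array; ``indices`` are global positions into their
    concatenation.
    """
    indices = np.asarray(indices)
    # Global start offset of each chunk (plus the total length).
    offsets = np.cumsum([0] + [len(c) for c in chunks])
    result = np.empty(indices.shape, dtype=chunks[0].dtype)
    # Assign each requested index to the chunk containing it.
    which = np.searchsorted(offsets, indices, side="right") - 1
    for c, (chunk, start) in enumerate(zip(chunks, offsets)):
        mask = which == c                  # indices landing in this chunk
        if mask.any():
            local = indices[mask] - start  # chunk-local positions
            result[mask] = chunk[local]    # one indexing op per chunk
    return result

# Two "chunks" whose concatenation is [0, 1, ..., 9], so each value
# equals its global index, making the result easy to verify.
chunks = [np.arange(0, 4), np.arange(4, 10)]
out = chunked_vindex_1d(chunks, [7, 1, 9, 2])  # → [7, 1, 9, 2]
```

The real N-dimensional, lazy version has to deal with index arrays spanning several dimensions and with stitching the per-chunk results back into a task graph, but the core idea — one bulk indexing operation per touched chunk instead of one task per point — is the same.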
Thanks @keewis for your help and advice.