BLD: use smaller scipy-openblas builds #27147
@charris This would be nice to backport since it shrinks the wheel sizes.
Thanks Matti.
Done for 2.1 because it will have an rc, but skipped for 2.0.
Has anyone compared these benchmarks against the Zen kernels on AMD chips? The original post only tested Intel archs because it's a Mac-focused repo, but it's entirely possible that there will be a significant performance difference.
We would need someone to rerun the benchmark scripts with an AMD processor that has AVX512 features.
Only AVX2 over here, unfortunately.
Looks like M7a instances should do the trick. Edit: I'm just going to do it.
Marginally worse than SKYLAKEX despite, according to OpenBLAS docs, being HASWELL with zen2/3 optimizations (i.e. no AVX512). Curious what this looks like on my local zen 3 machine.
resounding meh
I am not sure what I am seeing. What are the two results?
First one is an AWS m7a-medium. One zen 4 core. Second is my personal machine, which is zen 3. Can't really see a reason to include the ZEN kernel based on either of those.
And that is using an openblas from before the shrink?
Hm. If I followed the instructions from the script repository exactly, it would have pulled down latest scipy-openblas, wouldn't it?
Builds on #27140 to use the same OpenBLAS build but with fewer kernels. Based on the analysis in MacPython/openblas-libs#144, there are now 5 kernels, named by CPU core label: `PRESCOTT NEHALEM SANDYBRIDGE HASWELL SKYLAKEX`. Needs a release note about the possible performance implications, and will also add a note about the Windows changes in #27140.
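
For anyone rerunning the comparison, one way to pin a specific kernel is OpenBLAS's `OPENBLAS_CORETYPE` environment variable, which dynamic-arch builds read at load time. The sketch below is only an illustration, not part of this PR: `env_for_kernel` is a hypothetical helper, and it assumes the benchmark is launched as a subprocess so the variable is set before OpenBLAS is loaded.

```python
import os

# The five kernel labels listed in this PR description.
KERNELS = ("PRESCOTT", "NEHALEM", "SANDYBRIDGE", "HASWELL", "SKYLAKEX")


def env_for_kernel(kernel, base_env=None):
    """Return a copy of the environment that requests `kernel` from OpenBLAS.

    Hypothetical helper: it only prepares environment variables and does
    not verify that the loaded library actually honours the request.
    """
    name = kernel.upper()
    if name not in KERNELS:
        raise ValueError(f"{kernel!r} is not one of {', '.join(KERNELS)}")
    # Copy so the caller's environment is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["OPENBLAS_CORETYPE"] = name   # ask OpenBLAS to use this kernel
    env["OPENBLAS_VERBOSE"] = "2"     # have OpenBLAS report the core it picked
    return env
```

A run could then look like `subprocess.run([sys.executable, "bench.py"], env=env_for_kernel("HASWELL"))`, where `bench.py` stands in for whichever benchmark script is being used. The helper deliberately rejects labels outside the five above, such as `ZEN`, since those kernels are not in the smaller build.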