BLD: use smaller scipy-openblas builds #27147


Merged: 2 commits into numpy:main on Aug 9, 2024


mattip
Member

@mattip mattip commented Aug 8, 2024

Builds on #27140 to use the same OpenBLAS build but with fewer kernels. Based on the analysis in MacPython/openblas-libs#144, there are now 5 kernels, named after CPU core labels: PRESCOTT, NEHALEM, SANDYBRIDGE, HASWELL, and SKYLAKEX. Needs a release note about the possible performance implications, and will also add a note about the Windows changes in #27140.
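Editor's sketch: one way to gauge the performance implications mentioned above is to force a specific kernel and time a workload. This assumes an OpenBLAS built with runtime kernel selection (DYNAMIC_ARCH), for which OpenBLAS documents the `OPENBLAS_CORETYPE` environment variable. The helper name `time_with_coretype` and the workload string are hypothetical, and this is not the benchmark script used in MacPython/openblas-libs#144.

```python
import os
import subprocess
import sys
import time

# The five kernel labels named in the PR description.
KERNELS = ["PRESCOTT", "NEHALEM", "SANDYBRIDGE", "HASWELL", "SKYLAKEX"]


def time_with_coretype(core, code, repeats=3):
    """Run `code` in a fresh interpreter with OPENBLAS_CORETYPE set to `core`.

    Returns the best wall-clock time in seconds over `repeats` runs.
    The timing includes interpreter startup and imports, so it is only
    meaningful for workloads that dominate that overhead.
    """
    env = dict(os.environ, OPENBLAS_CORETYPE=core)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the child process fails,
        # e.g. when a forced kernel uses instructions the CPU lacks.
        subprocess.run([sys.executable, "-c", code], env=env, check=True)
        best = min(best, time.perf_counter() - start)
    return best
```

A possible use is to loop over `KERNELS` and call `time_with_coretype(core, "import numpy as np; a = np.ones((2000, 2000)); a @ a")` for each. Forcing a kernel the CPU cannot execute (for example SKYLAKEX without AVX512) is likely to crash the child process, so such entries should be skipped.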

github-actions bot added the "36 - Build" label (Build related PR) on Aug 8, 2024
@mattip
Member Author

mattip commented Aug 9, 2024

@charris This would be nice to backport since it shrinks the wheel sizes

@charris charris added the 09 - Backport-Candidate PRs tagged should be backported label Aug 9, 2024
@charris charris merged commit 807cd74 into numpy:main Aug 9, 2024
66 checks passed
@charris
Member

charris commented Aug 9, 2024

Thanks Matti.

@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Aug 9, 2024
@charris
Member

charris commented Aug 9, 2024

> This would be nice to backport

Done for 2.1 because it will have an rc, but skipped for 2.0.

@theAeon

theAeon commented Sep 8, 2024

Has anyone compared these benchmarks against the Zen kernels on AMD chips? The original post only tested Intel archs because it's a Mac-focused repo, but it's entirely possible that there will be a not-insignificant performance difference.

@mattip
Member Author

mattip commented Sep 8, 2024

We would need someone to rerun the benchmark scripts with an AMD processor that has AVX512 features.
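Editor's sketch: on Linux, whether a machine qualifies for the rerun requested above can be checked by looking for the `avx512f` flag in `/proc/cpuinfo`. The function names here are hypothetical, and the approach assumes a Linux-style cpuinfo layout. Other platforms need a different source of CPU feature flags.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx512(cpuinfo_text):
    """True if the AVX-512 Foundation flag is present."""
    return "avx512f" in cpu_flags(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read cpuinfo if it exists; return an empty string elsewhere."""
    p = Path(path)
    return p.read_text() if p.exists() else ""
```

Calling `has_avx512(read_cpuinfo())` returns `False` both when the flag is absent and when `/proc/cpuinfo` does not exist, so a negative result on a non-Linux system is not conclusive.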

@theAeon

theAeon commented Sep 8, 2024

Only AVX2 over here, unfortunately.

@theAeon

theAeon commented Sep 9, 2024

Looks like M7a instances should do the trick.

Edit: I'm just going to do it.

@theAeon

theAeon commented Sep 9, 2024

ZEN is marginally worse than SKYLAKEX despite, according to the OpenBLAS docs, being HASWELL with Zen 2/3 optimizations (i.e. no AVX512). Curious what this looks like on my local Zen 3 machine.

| arch | mean | spread | perf_ratios |
|---|---|---|---|
| SAPPHIRERAPIDS | 2.57453 | 0.01505 | 1 |
| CORE2 | 2.58216 | 0.01535 | 1.00296 |
| COOPERLAKE | 2.58281 | 0.01735 | 1.00321 |
| SKYLAKEX | 2.58809 | 0.0184 | 1.00527 |
| ZEN | 2.58923 | 0.01005 | 1.00571 |
| PRESCOTT | 2.59641 | 0.0072 | 1.0085 |
| PENRYN | 2.59959 | 0.0206 | 1.00973 |
| HASWELL | 2.60049 | 0.0165 | 1.01008 |
| KATMAI | 2.60156 | 0.0104 | 1.0105 |
| ATOM | 2.6024 | 0.02285 | 1.01082 |
| COPPERMINE | 2.60262 | 0.0155 | 1.01091 |
| NORTHWOOD | 2.60514 | 0.01525 | 1.01189 |
| DUNNINGTON | 2.60977 | 0.00905 | 1.01369 |
| SANDYBRIDGE | 2.61479 | 0.00485 | 1.01564 |
| NEHALEM | 2.6164 | 0.01415 | 1.01626 |
| BANIAS | 2.61833 | 0.00895 | 1.01701 |
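Editor's sketch: the `perf_ratios` column in the table above appears to be each kernel's mean divided by the smallest mean, so the fastest kernel gets a ratio of 1. That reading is inferred from the numbers, not confirmed by the benchmark scripts, and the function below is a hypothetical reconstruction.

```python
def perf_ratios(means):
    """Map each arch to its mean divided by the best (smallest) mean.

    `means` is a dict of arch label -> mean value. The result is ordered
    from fastest to slowest and rounded to five decimals.
    """
    best = min(means.values())
    ordered = sorted(means.items(), key=lambda item: item[1])
    return {arch: round(mean / best, 5) for arch, mean in ordered}


# A few rows taken from the table above.
sample = {"SAPPHIRERAPIDS": 2.57453, "CORE2": 2.58216, "ZEN": 2.58923}
ratios = perf_ratios(sample)
```

For these three rows the computed ratios match the table (1, 1.00296, and 1.00571), which supports the mean-over-best interpretation.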

@theAeon

theAeon commented Sep 9, 2024

| arch | mean | spread | perf_ratios |
|---|---|---|---|
| HASWELL | 0.0699409 | 0.000441 | 1 |
| ZEN | 0.0714307 | 0.003309 | 1.0213 |
| SANDYBRIDGE | 0.0949565 | 0.0010665 | 1.35767 |
| CORE2 | 0.167046 | 0.003455 | 2.38839 |
| PENRYN | 0.175879 | 0.00074 | 2.51468 |
| DUNNINGTON | 0.18015 | 0.00605 | 2.57574 |
| NEHALEM | 0.195515 | 0.002045 | 2.79543 |
| COPPERMINE | 0.251743 | 0.00092 | 3.59937 |
| BANIAS | 0.253619 | 0.00164 | 3.62619 |
| PRESCOTT | 0.253751 | 0.00249 | 3.62807 |
| KATMAI | 0.256017 | 0.003495 | 3.66047 |
| NORTHWOOD | 0.25638 | 0.00475 | 3.66567 |
| ATOM | 0.317082 | 0.00437 | 4.53357 |
resounding meh

@mattip
Member Author

mattip commented Sep 9, 2024

I am not sure what I am seeing. What are the two results?

@theAeon

theAeon commented Sep 9, 2024

The first one is an AWS m7a.medium: one Zen 4 core.

The second is my personal machine, which is Zen 3.

Can't really see a reason to include the ZEN kernel based on either of those.

@mattip
Member Author

mattip commented Sep 9, 2024

And that is using an openblas from before the shrink?

@theAeon

theAeon commented Sep 9, 2024

Hm. If I followed the instructions from the script repository exactly, it would have pulled down the latest scipy-openblas, wouldn't it?
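Editor's sketch: the question of which scipy-openblas build a benchmark environment actually pulled down can be answered by querying the installed distribution metadata. The distribution names `scipy-openblas64` and `scipy-openblas32` are the ones published on PyPI. Which of them the benchmark scripts install is an assumption here, and the function name is hypothetical.

```python
from importlib import metadata

# Candidate distribution names; which one is present depends on the setup.
CANDIDATES = ("scipy-openblas64", "scipy-openblas32")


def installed_openblas_wheels(names=CANDIDATES):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found
```

Recording the output of `installed_openblas_wheels()` next to each benchmark table would make it clear whether a result predates or postdates the kernel reduction.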

@mattip mattip deleted the scipy-openblas-0.3.27.44.5 branch May 5, 2025 11:21
Labels
36 - Build Build related PR
3 participants