ENH: Add LSX optimization for LoongArch #25215
base: main
Conversation
@loongson-zn thank you for this PR. At first sight, the test/benchmark results look good and the code is clean. My first question is about testing, since we don't have access to this hardware. For other architectures without regular hosted CI, we run the tests through QEMU (see https://github.com/numpy/numpy/blob/main/.github/workflows/linux_qemu.yml). Is there QEMU support for LoongArch, and if so, can you add a CI job in that
LoongArch is still being promoted within the Debian community, and Ubuntu does not yet have a suitable toolchain (e.g. one installable with apt install gcc-xx), so CI is not applicable for now. I am verifying with QEMU but have run into some trouble. Please give me some time to debug.
@rgommers I have an x86_64 cross-compile toolchain and have enabled native execution via binfmt, but I encountered issues related to Python. Perhaps you could give me some help.
I can give you a hand. What kind of issues are you facing? Maybe it would be better to open a separate pull request that adds a CI test for LoongArch.
Thanks @seiko2plus, I will try my best to provide a CI testing environment within a week and open a new issue.
Add LSX optimization for LoongArch.
First, the maximum performance improvement is 42.3x (measured with 'python3.9 runtest.py --bench').
Second, I have completed functional testing on V1.25.2, V1.26.2, and the main branch:
V1.25.2 (python3.9 and gcc 8.3):
'python3.9 runtest.py -v -m full': 35023 passed, 443 skipped, 31 xfailed, 5 xpassed
'spin test -v': 33801 passed, 395 skipped, 1303 deselected, 30 xfailed, 4 xpassed
V1.26.2 (python3.9 and gcc 8.3):
'python3.9 runtest.py -v -m full': 35142 passed, 357 skipped, 30 xfailed, 4 xpassed
'spin test -v': 34020 passed, 176 skipped, 1303 deselected, 30 xfailed, 4 xpassed
main branch (python3.9 and gcc 10.3):
'spin test -v': 39359 passed, 177 skipped, 1303 deselected, 34 xfailed, 6 xpassed in 450.49s
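For reviewers who want to compare the runs at a glance, the summary lines above can be tabulated with a short script. This is only a sketch: the `RUNS` labels and the helper names `parse_summary` and `total_reported` are hypothetical, and the counts are copied verbatim from the results reported in this description.

```python
import re

# Summary lines copied from the test results reported in this PR description.
# The dictionary keys are illustrative labels, not official run names.
RUNS = {
    "V1.25.2 runtest.py": "35023 passed, 443 skipped, 31 xfailed, 5 xpassed",
    "V1.25.2 spin test": "33801 passed, 395 skipped, 1303 deselected, 30 xfailed, 4 xpassed",
    "V1.26.2 runtest.py": "35142 passed, 357 skipped, 30 xfailed, 4 xpassed",
    "V1.26.2 spin test": "34020 passed, 176 skipped, 1303 deselected, 30 xfailed, 4 xpassed",
    "main spin test": "39359 passed, 177 skipped, 1303 deselected, 34 xfailed, 6 xpassed in 450.49s",
}


def parse_summary(line):
    """Return {outcome: count} for a pytest-style summary line.

    Only "<integer> <word>" pairs are matched, so the trailing
    "in 450.49s" timing is ignored.
    """
    return {name: int(count) for count, name in re.findall(r"(\d+) ([a-z]+)", line)}


def total_reported(counts):
    """Sum every outcome count that appears in the summary line."""
    return sum(counts.values())


for label, line in RUNS.items():
    counts = parse_summary(line)
    # A missing "failed" or "error" key means the summary reported none.
    failures = counts.get("failed", 0) + counts.get("error", 0)
    print(f"{label}: passed={counts['passed']} "
          f"total={total_reported(counts)} failures={failures}")
```

None of the five summaries contains a "failed" or "error" entry, which is why the script reports zero failures for each run.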