igzip/riscv64: Add adler32_rvv optimization for VLEN=128 #374

leiwen2025 · Nov 20, 2025

This PR adds an optimized adler32_rvv implementation for VLEN=128.

The optimization has been verified on the SG2044 platform, where adler32_warm bandwidth rises from 3015.15 MB/s to 7541.43 MB/s (about 2.5x):

SG2044:
        new: adler32_warm: runtime =    3062471 usecs, bandwidth 23095 MB in 3.0625 sec = 7541.43 MB/s
        old: adler32_warm: runtime =    3062465 usecs, bandwidth 9233 MB in 3.0625 sec = 3015.15 MB/s

pablodelara · Nov 24, 2025

@sunyuechi can you review this? Thanks!

sunyuechi · Dec 3, 2025

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    addi    sp, sp, -32
+    sd      ra, 24(sp)
+    sd      s1, 16(sp)
+    sd      s2, 8(sp)


You can use unused registers to reduce stack operations (at least a7 and t5 are available).

sunyuechi · Dec 3, 2025

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    slli    s1, a0, 48
+    srli    s1, s1, 48              // s1: A = adler32 & 0xffff
+    srliw   s2, a0, 16              // s2: B = adler32 >> 16
+    add     s3, a1, a2              // s3 = end
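
For readers following the register comments above, the split into A and B can be sketched in scalar form. This is only an illustrative reference, not the PR's code; the function name `adler32_scalar` is hypothetical, and 65521 is the standard Adler-32 modulus.

```python
import zlib

MOD_ADLER = 65521  # largest prime below 2**16, the standard Adler-32 modulus


def adler32_scalar(data: bytes, adler: int = 1) -> int:
    """Byte-at-a-time Adler-32, continuing from a previous checksum."""
    a = adler & 0xFFFF           # A = adler32 & 0xffff (running byte sum)
    b = (adler >> 16) & 0xFFFF   # B = adler32 >> 16 (running sum of A values)
    for byte in data:
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a         # recombine as B:A


# Cross-check against the zlib module from the Python standard library.
assert adler32_scalar(b"Wikipedia") == zlib.adler32(b"Wikipedia")
```

Passing a previous checksum as `adler` continues the computation, which is why the assembly unpacks A and B from a0 before entering the loop.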


sunyuechi · Dec 5, 2025

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    la      a7, factors
+    vle8.v  v0, (a7)
+    vmv.v.i v4, 0
+    vmv.v.i v8, 0


v4 hasn’t been modified, so you can just use v4.
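
As background for the `factors` load above, here is a sketch of the usual block formulation of Adler-32, where B is updated with a weighted sum of the bytes in each block. The contents of `factors` are not shown in this excerpt, so the descending weights 16..1 and the 16-byte block size are assumptions, and `adler32_blocked` is a hypothetical name.

```python
import zlib

MOD_ADLER = 65521
BLOCK = 16  # assumption: 16 bytes, i.e. one 128-bit vector register of e8 elements

# Assumed weight table (not taken from the PR): 16, 15, ..., 1
FACTORS = list(range(BLOCK, 0, -1))


def adler32_blocked(data: bytes, adler: int = 1) -> int:
    """Adler-32 computed one block at a time, with a scalar tail."""
    a = adler & 0xFFFF
    b = (adler >> 16) & 0xFFFF
    full = len(data) - len(data) % BLOCK
    for off in range(0, full, BLOCK):
        chunk = data[off:off + BLOCK]
        # Update B first: it needs the value of A from before this block.
        weighted = sum(w * x for w, x in zip(FACTORS, chunk))
        b = (b + BLOCK * a + weighted) % MOD_ADLER
        a = (a + sum(chunk)) % MOD_ADLER
    for byte in data[full:]:  # remaining bytes, one at a time
        a = (a + byte) % MOD_ADLER
        b = (b + a) % MOD_ADLER
    return (b << 16) | a
```

The first byte of a block contributes to every later B update within that block, so it gets the largest weight; the last byte gets weight 1. A vector version would keep the two sums in accumulator registers and reduce them less often than this sketch does.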

leiwen2025 · Dec 6, 2025

Done, thanks for the review!

sunyuechi · Dec 5, 2025

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    mv      t2, t1
+1:
+    mv      a3, t5
+    mv      a4, t6


The current sequence is:

t5, t6 -> a3, a4
update a3, a4
a3, a4 -> t5, t6

The copies don't seem to be needed here. Is it fine to just update t5 and t6 directly?

leiwen2025 · Dec 6, 2025

Done, thanks for the review!

sunyuechi · Dec 13, 2025

igzip/riscv64/igzip_isal_adler32_rvv128.S

+    mul     a3, t6, t3
+    srli    a3, a3, 47
+    mul     a4, a3, t4
+    sub     t6, t6, a4


You can directly copy the same 8 lines from the logic above. Not changing the temporary registers makes it a bit clearer. Then combine these commits into one, and it should be ready to merge.
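
The mul / srli 47 / mul / sub pattern quoted above looks like a reciprocal-multiply reduction modulo 65521. The values held in t3 and t4 are not visible in this excerpt, so the sketch below assumes t3 = 0x80078071 (that is, ceil(2^47 / 65521)) and t4 = 65521, and assumes the input fits in 32 bits; `mod_adler` is a hypothetical name.

```python
MOD_ADLER = 65521   # assumed content of t4
SHIFT = 47          # matches the srli amount in the quoted snippet
MAGIC = 0x80078071  # assumed content of t3: ceil(2**47 / 65521)


def mod_adler(x: int) -> int:
    """Reduce a 32-bit value modulo 65521 without a divide instruction."""
    if not 0 <= x < 2**32:
        raise ValueError("sketch is only valid for 32-bit inputs")
    q = (x * MAGIC) >> SHIFT   # mul + srli: q equals x // 65521 in this range
    return x - q * MOD_ADLER   # mul + sub: the remainder


# The product x * MAGIC stays below 2**64, so one 64-bit mul is enough.
assert mod_adler(2**32 - 1) == (2**32 - 1) % MOD_ADLER
```

Under these assumptions the quotient is exact for every 32-bit input because MAGIC * 65521 exceeds 2^47 by only 31073, which is below 2^15.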

leiwen2025 · Dec 15, 2025

Done, thanks for the review!

Signed-off-by: WenLei <lei.wen2@zte.com.cn>

pablodelara · Dec 15, 2025

@sunyuechi is this OK to merge now?

sunyuechi · Dec 15, 2025

@pablodelara ok

leiwen2025 · Dec 18, 2025

Could this PR be merged, or is there anything else I need to change? Thanks!

leiwen2025 force-pushed the rv64-igzip-adler32rvv128 branch from 787635b to 7ed6de8 Compare December 6, 2025 11:24

igzip/riscv64: Add adler32_rvv optimization for VLEN=128

efb9e33

Signed-off-by: WenLei <lei.wen2@zte.com.cn>

leiwen2025 force-pushed the rv64-igzip-adler32rvv128 branch from 7ed6de8 to efb9e33 Compare December 15, 2025 03:19
