[GISel] Funnel shift combiner port from SelectionDAG ISel to GlobalISel #135132

axelcool1234 · Apr 10, 2025

The funnel shift combiner rule from 4a3708c is currently missing from GlobalISel. The following is a port of that combiner to GlobalISel.

If anyone has a better name for the tablegen functions (they're currently called funnel_shift_or_shift_to_funnel_shift_X), let me know so I can change it. I'm not very imaginative with my naming!

axelcool1234 · Apr 10, 2025

@mshockwave I finished that combiner port! Let me know what you think :)

llvmbot · Apr 10, 2025

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-globalisel

Author: Axel Sorenson (axelcool1234)

Changes

The funnel shift combiner rule from 4a3708c is currently missing from GlobalISel. The following is a port of that combiner to GlobalISel.

If anyone has a better name for the tablegen functions (they're currently called funnel_shift_or_shift_to_funnel_shift_X), let me know so I can change it. I'm not very imaginative with my naming!

Full diff: https://github.com/llvm/llvm-project/pull/135132.diff

2 Files Affected:

(modified) llvm/include/llvm/Target/GlobalISel/Combine.td (+21-1)
(modified) llvm/test/CodeGen/RISCV/GlobalISel/shift.ll (+52)

diff --git a/llvm/include/llvm/Target/GlobalISel/Combine.td b/llvm/include/llvm/Target/GlobalISel/Combine.td
index 5309d5952f087..615617d4ca01b 100644
--- a/llvm/include/llvm/Target/GlobalISel/Combine.td
+++ b/llvm/include/llvm/Target/GlobalISel/Combine.td
@@ -1033,6 +1033,24 @@ def funnel_shift_overshift: GICombineRule<
   (apply [{ Helper.applyFunnelShiftConstantModulo(*${root}); }])
 >;
 
+// Transform: fshl x, ?, y | shl x, y -> fshl x, ?, y
+def funnel_shift_or_shift_to_funnel_shift_left: GICombineRule<
+  (defs root:$root), 
+  (match (G_FSHL $out1, $x, $_, $y),
+         (G_SHL $out2, $x, $y),
+         (G_OR $root, $out1, $out2)),
+  (apply (G_FSHL $root, $x, $_, $y))
+>;
+
+// Transform: fshr ?, x, y | srl x, y -> fshr ?, x, y
+def funnel_shift_or_shift_to_funnel_shift_right: GICombineRule<
+  (defs root:$root), 
+  (match (G_FSHR $out1, $_, $x, $y),
+         (G_LSHR $out2, $x, $y),
+         (G_OR $root, $out1, $out2)),
+  (apply (G_FSHR $root, $_, $x, $y))
+>;
+
 def rotate_out_of_range : GICombineRule<
   (defs root:$root),
   (match (wip_match_opcode G_ROTR, G_ROTL):$root,
@@ -1105,7 +1123,9 @@ def funnel_shift_combines : GICombineGroup<[funnel_shift_from_or_shift,
                                             funnel_shift_to_rotate,
                                             funnel_shift_right_zero,
                                             funnel_shift_left_zero,
-                                            funnel_shift_overshift]>;
+                                            funnel_shift_overshift,
+                                            funnel_shift_or_shift_to_funnel_shift_left,
+                                            funnel_shift_or_shift_to_funnel_shift_right]>;
 
 def bitfield_extract_from_sext_inreg : GICombineRule<
   (defs root:$root, build_fn_matchinfo:$info),
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/shift.ll b/llvm/test/CodeGen/RISCV/GlobalISel/shift.ll
index 75e318a58fd45..b52924cac0031 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/shift.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/shift.ll
@@ -105,3 +105,55 @@ define i16 @test_shl_i48_2(i48 %x, i48 %y) {
   %trunc = trunc i48 %shl to i16
   ret i16 %trunc
 }
+
+define i16 @test_fshl_i32(i32 %x, i32 %_, i32 %y) {
+; RV32-LABEL: test_fshl_i32:
+; RV32:       # %bb.0:
+; RV32-NEXT:    not a3, a2
+; RV32-NEXT:    sll a0, a0, a2
+; RV32-NEXT:    srli a1, a1, 1
+; RV32-NEXT:    srl a1, a1, a3
+; RV32-NEXT:    or a0, a0, a1
+; RV32-NEXT:    ret
+;
+; RV64-LABEL: test_fshl_i32:
+; RV64:       # %bb.0:
+; RV64-NEXT:    not a3, a2
+; RV64-NEXT:    sllw a0, a0, a2
+; RV64-NEXT:    srliw a1, a1, 1
+; RV64-NEXT:    srlw a1, a1, a3
+; RV64-NEXT:    or a0, a0, a1
+; RV64-NEXT:    ret
+
+    %fshl = call i32 @llvm.fshl.i32(i32 %x, i32 %_, i32 %y)
+    %shl = shl i32 %x, %y
+    %or = or i32 %fshl, %shl
+    %trunc = trunc i32 %or to i16
+    ret i16 %trunc
+}
+
+define i16 @test_fshr_i32(i32 %_, i32 %x, i32 %y) {
+; RV32-LABEL: test_fshr_i32:
+; RV32:       # %bb.0:
+; RV32-NEXT:    not a3, a2
+; RV32-NEXT:    slli a0, a0, 1
+; RV32-NEXT:    sll a0, a0, a3
+; RV32-NEXT:    srl a1, a1, a2
+; RV32-NEXT:    or a0, a0, a1
+; RV32-NEXT:    ret
+;
+; RV64-LABEL: test_fshr_i32:
+; RV64:       # %bb.0:
+; RV64-NEXT:    not a3, a2
+; RV64-NEXT:    slli a0, a0, 1
+; RV64-NEXT:    sllw a0, a0, a3
+; RV64-NEXT:    srlw a1, a1, a2
+; RV64-NEXT:    or a0, a0, a1
+; RV64-NEXT:    ret
+
+    %fshr = call i32 @llvm.fshr.i32(i32 %_, i32 %x, i32 %y)
+    %lshr = lshr i32 %x, %y
+    %or = or i32 %fshr, %lshr
+    %trunc = trunc i32 %or to i16
+    ret i16 %trunc
+}

davemgreen · Apr 10, 2025

Can you regenerate llvm/test/CodeGen/AArch64/funnel-shift.ll too? It looks like it is failing in the precommit tests and needs an update.

axelcool1234 · Apr 10, 2025

I updated the AArch64 test using update_llc_test_checks.py (it removed the redundant left shift and it being ORed with the left funnel shift).

davemgreen

Thanks. The combine looks good - nice and clean - other than the points mentioned below.

We would not usually need to add combines that the midend can reliably perform unless they come up during lowering. I am not sure if that is expected in this case or not, from the legalization of higher sized types or from vectors.

llvm/include/llvm/Target/GlobalISel/Combine.td

llvm/test/CodeGen/AArch64/funnel-shift.ll

llvm/include/llvm/Target/GlobalISel/Combine.td

llvm/test/CodeGen/RISCV/GlobalISel/shift.ll

arsenm · Apr 11, 2025

llvm/test/CodeGen/RISCV/GlobalISel/shift.ll

+    %or = or i32 %fshr, %lshr
+    %trunc = trunc i32 %or to i16
+    ret i16 %trunc
+}


Vector version of tests? Also negative multi-use tests. Ideally would share the existing DAG combine tests here instead of creating new copies

Ideally would share the existing DAG combine tests here instead of creating new copies

The issue I ran into has to do with the i48 type leading to a lot of code being generated to emulate @llvm.fshl.i48 since it isn't natively supported. The resulting expected test output would be really large. Should I still go for it?

Also negative multi-use tests.

Is this referring to the fact that I'm truncating or something else? I apologize, I'm still getting a hang to contributing here (and thanks for reviewing my patch!)

Vector version of tests?

Is there a common vector size that many contributors use for vector versions of tests? I'll make sure to implement that :)

The issue I ran into has to do with the i48 type leading to a lot of code being generated to emulate @llvm.fshl.i48 since it isn't natively supported. The resulting expected test output would be really large. Should I still go for it?

That's going to be true in either case. As long as it actually compiles, it's not a problem and gives a point of reference for whenever the code improves for illegal types.

Also negative multi-use tests.
Is this referring to the fact that I'm truncating or something else? I apologize, I'm still getting a hang to contributing here (and thanks for reviewing my patch!)

I mean where the intermediate instructions have unrelated uses and need to be emitted anyway

Is there a common vector size that many contributors use for vector versions of tests? I'll make sure to implement that :)

No, whatever is relevant for the target (you also wouldn't need to think too hard about that if reusing the original test, which I hope covers some vectors)

you also wouldn't need to think too hard about that if reusing the original test, which I hope covers some vectors

There aren't any vector tests here, but I'll gladly add some myself!

That's going to be true in either case. As long as it actually compiles, it's not a problem and gives a point of reference for whenever the code improves for illegal types.

Before I go forward and introduce i48 funnel shift calls to the original tests (in order to also test for multi-use i.e. if it correctly ensures the intermediate instructions are still emitted), I want to further clarify what I meant. The following is a test that uses an i32 fshl and another test that uses an i48 fshl. The difference in generated code (via update_llc_test_checks) is quite large; is this something I should even be concerned about? Once I introduce these i48 funnel shift calls to the original tests their emitted code checks will also explode in size. One way we could avoid this (if that's desirable of course) would be to change the original tests to use i32 types instead of i48. What do you think?

define i32 @test_fshl_i32(i32 %x, i32 %y) { ; RV32-LABEL: test_fshl_i32: ; RV32: # %bb.0: ; RV32-NEXT: not a2, a1 ; RV32-NEXT: sll a0, a0, a1 ; RV32-NEXT: li a1, 8 ; RV32-NEXT: srl a1, a1, a2 ; RV32-NEXT: or a0, a0, a1 ; RV32-NEXT: ret ; ; RV64-LABEL: test_fshl_i32: ; RV64: # %bb.0: ; RV64-NEXT: not a2, a1 ; RV64-NEXT: sllw a0, a0, a1 ; RV64-NEXT: li a1, 8 ; RV64-NEXT: srlw a1, a1, a2 ; RV64-NEXT: or a0, a0, a1 ; RV64-NEXT: ret %fshl = call i32 @llvm.fshl.i32(i32 %x, i32 16, i32 %y) ret i32 %fshl } define i48 @test_fshl_i48(i48 %x, i48 %y) { ; RV32-LABEL: test_fshl_i48: ; RV32: # %bb.0: ; RV32-NEXT: addi sp, sp, -16 ; RV32-NEXT: .cfi_def_cfa_offset 16 ; RV32-NEXT: sw ra, 12(sp) # 4-byte Folded Spill ; RV32-NEXT: sw s0, 8(sp) # 4-byte Folded Spill ; RV32-NEXT: sw s1, 4(sp) # 4-byte Folded Spill ; RV32-NEXT: sw s2, 0(sp) # 4-byte Folded Spill ; RV32-NEXT: .cfi_offset ra, -4 ; RV32-NEXT: .cfi_offset s0, -8 ; RV32-NEXT: .cfi_offset s1, -12 ; RV32-NEXT: .cfi_offset s2, -16 ; RV32-NEXT: mv s1, a0 ; RV32-NEXT: mv s0, a1 ; RV32-NEXT: mv a0, a2 ; RV32-NEXT: li s2, 47 ; RV32-NEXT: slli a1, a3, 16 ; RV32-NEXT: srli a1, a1, 16 ; RV32-NEXT: li a2, 48 ; RV32-NEXT: li a3, 0 ; RV32-NEXT: call __umoddi3 ; RV32-NEXT: li a2, 32 ; RV32-NEXT: bltu a0, a2, .LBB2_2 ; RV32-NEXT: # %bb.1: ; RV32-NEXT: li a1, 0 ; RV32-NEXT: sll a4, s1, a0 ; RV32-NEXT: sub a3, s2, a0 ; RV32-NEXT: bnez a0, .LBB2_3 ; RV32-NEXT: j .LBB2_4 ; RV32-NEXT: .LBB2_2: ; RV32-NEXT: sll a1, s1, a0 ; RV32-NEXT: neg a3, a0 ; RV32-NEXT: srl a3, s1, a3 ; RV32-NEXT: sll a4, s0, a0 ; RV32-NEXT: or a4, a3, a4 ; RV32-NEXT: sub a3, s2, a0 ; RV32-NEXT: beqz a0, .LBB2_4 ; RV32-NEXT: .LBB2_3: ; RV32-NEXT: mv s0, a4 ; RV32-NEXT: .LBB2_4: ; RV32-NEXT: li a0, 8 ; RV32-NEXT: bltu a3, a2, .LBB2_6 ; RV32-NEXT: # %bb.5: ; RV32-NEXT: li a2, 0 ; RV32-NEXT: bnez a3, .LBB2_7 ; RV32-NEXT: j .LBB2_8 ; RV32-NEXT: .LBB2_6: ; RV32-NEXT: srl a2, a0, a3 ; RV32-NEXT: beqz a3, .LBB2_8 ; RV32-NEXT: .LBB2_7: ; RV32-NEXT: mv a0, a2 ; RV32-NEXT: .LBB2_8: ; RV32-NEXT: or a0, a1, a0 ; RV32-NEXT: mv a1, s0 ; RV32-NEXT: lw ra, 12(sp) # 4-byte Folded Reload ; RV32-NEXT: lw s0, 8(sp) # 4-byte Folded Reload ; RV32-NEXT: lw s1, 4(sp) # 4-byte Folded Reload ; RV32-NEXT: lw s2, 0(sp) # 4-byte Folded Reload ; RV32-NEXT: .cfi_restore ra ; RV32-NEXT: .cfi_restore s0 ; RV32-NEXT: .cfi_restore s1 ; RV32-NEXT: .cfi_restore s2 ; RV32-NEXT: addi sp, sp, 16 ; RV32-NEXT: .cfi_def_cfa_offset 0 ; RV32-NEXT: ret ; ; RV64-LABEL: test_fshl_i48: ; RV64: # %bb.0: ; RV64-NEXT: addi sp, sp, -32 ; RV64-NEXT: .cfi_def_cfa_offset 32 ; RV64-NEXT: sd ra, 24(sp) # 8-byte Folded Spill ; RV64-NEXT: sd s0, 16(sp) # 8-byte Folded Spill ; RV64-NEXT: sd s1, 8(sp) # 8-byte Folded Spill ; RV64-NEXT: .cfi_offset ra, -8 ; RV64-NEXT: .cfi_offset s0, -16 ; RV64-NEXT: .cfi_offset s1, -24 ; RV64-NEXT: mv s0, a0 ; RV64-NEXT: li s1, 47 ; RV64-NEXT: slli a0, a1, 16 ; RV64-NEXT: srli a0, a0, 16 ; RV64-NEXT: li a1, 48 ; RV64-NEXT: call __umoddi3 ; RV64-NEXT: subw s1, s1, a0 ; RV64-NEXT: sll a0, s0, a0 ; RV64-NEXT: li a1, 8 ; RV64-NEXT: srl a1, a1, s1 ; RV64-NEXT: or a0, a0, a1 ; RV64-NEXT: ld ra, 24(sp) # 8-byte Folded Reload ; RV64-NEXT: ld s0, 16(sp) # 8-byte Folded Reload ; RV64-NEXT: ld s1, 8(sp) # 8-byte Folded Reload ; RV64-NEXT: .cfi_restore ra ; RV64-NEXT: .cfi_restore s0 ; RV64-NEXT: .cfi_restore s1 ; RV64-NEXT: addi sp, sp, 32 ; RV64-NEXT: .cfi_def_cfa_offset 0 ; RV64-NEXT: ret %fshl = call i48 @llvm.fshl.i48(i48 %x, i48 16, i48 %y) ret i48 %fshl }

Personally I don't think you have to test i48, which is an illegal type for RISC-V. Because the majority of the instructions you saw in i48's case are generated by the legalizer, which is unrelated to your patch here. So for the purpose of testing your pattern, it'll be more clear and succinct to only include legal types (i.e. i32 and i64)

Also negative multi-use tests.

I think this related to a check that is potentially missing from your pattern now: ideally, the pattern would replace three instructions

(G_FSHR $out1, $z, $x, $y), (G_LSHR $out2, $x, $y), (G_OR $root, $out1, $out2)

into just one:

(G_FSHR $root, $z, $x, $y)

But if $out2 or $out1 have more than one user, than you cannot remove the intermediate G_LSHR or G_FSHR, respectively. In which case you have to emit the following sequence to ensure the correctness

(G_FSHR $out1, $z, $x, $y), (G_LSHR $out2, $x, $y), (G_FSHR $root, $z, $x, $y)

But as you might already notice, this "new" sequence of 3 instructions is nowhere better than the original sequence, which also have 3 instructions (in most architectures, G_FSHR will not be faster than G_OR). So in the case of $out2 or $out1 having more than one user -- it makes more sense to just not do any optimization. I don't think your current pattern would conditionally trigger based on the number of users of intermediate values. You can checkout how other patterns handle it (hint: you don't necessarily need to check the number of $out1's users -- if $out2 only has a single use, you could just replace all uses of $root with $out1). Guarding an optimization pattern with checks on whether the intermediate values have one use is a pretty common thing among many optimizations. If you want to learn more, you can search hasOneUse or m_OneUse in DAGCombiner to see some examples.

Finally, a negative test is a test that makes sure your optimization does not trigger under certain conditions. In your case, it would be an input sequence with G_FSHR or G_LSHR that have more than one users, and the CHECK lines should pledge that they won't be optimized.

I apologize for my two week delay, I got severely sick in that time - I'm alright now though :)

I have changed the combiner to check for hasOneUse on $out2 uses and I now replace uses of $root with $out1 if $out2 only has one use. Additionally, I have changed the original tests test_lshr_i48 and test_shl_i48 to test_lshr_i32 and test_shl_i32. They serve as negative multi-use tests for the combiner now. I have yet to write vector tests - that's the next thing on my list.

I have left two of the original tests untouched and in their i48 form because they both include FIXME messages for the code that's generated.

mshockwave

Please refrain from force push unless it's necessary: https://www.llvm.org/docs/GitHub.html#rebasing-pull-requests-and-force-pushes.
Please append new commits instead, as they're all gonna be squash into one at the end.

mshockwave · Apr 20, 2025

llvm/test/CodeGen/RISCV/GlobalISel/shift.ll

@@ -105,3 +105,53 @@ define i16 @test_shl_i48_2(i48 %x, i48 %y) {
  %trunc = trunc i48 %shl to i16
  ret i16 %trunc
 }
+
+define i16 @test_fshl_i32(i32 %x, i32 %_, i32 %y) {


could you create two commits for this PR: the first one contains only the new test functions you added (e.g. test_fshl_i32 here) but with CHECK lines showing the unoptimized assembly code. And the second commit being your changes, in which the CHECK lines are updated with optimized code.
With such, it's easier to see the changes made by your patch by comparing the differences of the CHECK lines between the two commits. The first commit is also known as "pre-commit test"

Please refrain from force push unless it's necessary: https://www.llvm.org/docs/GitHub.html#rebasing-pull-requests-and-force-pushes. Please append new commits instead, as they're all gonna be squash into one at the end.

Woops! I apologize, I can't believe I forgot this.

could you create two commits for this PR: the first one contains only the new test functions you added (e.g. test_fshl_i32 here) but with CHECK lines showing the unoptimized assembly code. And the second commit being your changes, in which the CHECK lines are updated with optimized code. With such, it's easier to see the changes made by your patch by comparing the differences of the CHECK lines between the two commits. The first commit is also known as "pre-commit test"

I have pushed two commits following this style, although I'm probably going to have to do it again once I introduce vector tests, provide better names for the combiner and tests, and (hopefully) figure out what's causing the AArch64 fshr test to not trigger the combiner (I have a comment in another review thread in this PR noting that it's likely due to the OR instruction not matching if its operands are flipped which is strange since they should commute).

(not this patch) next time, you could make the first commit the pre-commit test, followed by your actual patch, such that you don't have to comment & un-comment like you did here.

llvm/include/llvm/Target/GlobalISel/Combine.td

jayfoad · May 6, 2025

I don't see why the hasOneUse checks are needed, and the DAG version that you copied does not have them.

axelcool1234 · May 6, 2025

I don't see why the hasOneUse checks are needed, and the DAG version that you copied does not have them.

I was told by @arsenm that it was a common combiner bug to lack the check. However, I do wonder:
@mshockwave You gave such a good explanation here #135132 (comment) about what's going on but I'm wondering if it's slightly incorrect now that we're applying a GIReplaceReg: In the combiner's current form, $root (which is the OR instruction) is replaced with the original FSHR instruction. If there are multiple uses of the LSHR instruction, would a hasOneUse check for that actually change anything? When LSHR has only one use, the replacement of the OR instruction leads to it having zero uses and not being emitted. If the LSHR has several uses, I suspect it'll be emitted as usual whether or not we have this hasOneUse check here since the combiner isn't removing it anymore (since it's simply replacing the OR instruction with the original FSHR instruction), right? Is my assessment of this correct?

I think my confusion stems from what happens when we do (apply (G_FSHR $root, $z, $x, $y)) vs (apply (GIReplaceReg $root, $out1)). This is how I believe it works:

The first apply I mentioned removes all of the instructions that are matched with the resulting instruction in the apply
The second apply I mentioned replaces only the specified instruction, not everything that's matched

If that's correct, then I think @jayfoad might be right: if the LSHR instruction lacks other users, it'll never be emitted anyways (and it will be emitted if it does have users). Thus, we don't need the hasOneUse check.

jayfoad · May 6, 2025

I think my confusion stems from what happens when we do (apply (G_FSHR $root, $z, $x, $y)) vs (apply (GIReplaceReg $root, $out1)). This is how I believe it works:

The first apply I mentioned removes all of the instructions that are matched with the resulting instruction in the apply

The second apply I mentioned replaces only the specified instruction, not everything that's matched

The combine doesn't really remove the original instructions, it just replaces all uses of the original root instruction. Then it's up to the normal dead code elimination to remove the original instructions if they are unused. The point of a hasOneUse check is to make sure that one of the sub-instructions in the pattern will be unused, because it doesn't have any other uses outside of the pattern we matched.

Typically for a cmobine that creates a new instruction, you would use hasOneUse checks, because you don't want to create a new instruction unless you can remove one or more of the existing instructions. But this combine does not create any new instructions, so I don't think there's any particular need for tha hasOneUse check.

… the redundant OR instruction!)

axelcool1234 · May 10, 2025

Now that the hasOneUse checks were determined to not be required, the original lshr and shl tests, which were converted into negative multi-use tests for funnel shifts, are no longer negative multi-use tests (they now trigger despite multiple uses of the intermediate instructions). Because of this, perhaps it'd be best to change these tests back to what they were before? I still have two new tests I've introduced that test fshl and fshr.

Additionally, this file lacks vector versions of any of the shift tests. I'll look into introducing them.

…rder of operands)

llvm/include/llvm/Target/GlobalISel/Combine.td

axelcool1234 · May 20, 2025

Additionally, this file lacks vector versions of any of the shift tests. I'll look into introducing them.

According to the docs, RISC-V supports vectory types: https://llvm.org/docs/RISCV/RISCVVectorExtension.html. However, every time I try to write a trivial function test with a vector I get an error stating that RISC-V is unable to "lower" the arguments. I think the reason I can't use them has to do with the test running a version of RISC-V without the vector extension enabled. What should I change the RUN commands to in order to test fshl and fshr with vectors?

mshockwave · May 20, 2025

Additionally, this file lacks vector versions of any of the shift tests. I'll look into introducing them.

According to the docs, RISC-V supports vectory types: https://llvm.org/docs/RISCV/RISCVVectorExtension.html. However, every time I try to write a trivial function test with a vector I get an error stating that RISC-V is unable to "lower" the arguments. I think the reason I can't use them has to do with the test running a version of RISC-V without the vector extension enabled. What should I change the RUN commands to in order to test fshl and fshr with vectors?

To enable vectors, you can llc -mattr=+v ...

axelcool1234 · May 24, 2025

@mshockwave So I've been exposed to scalable vectors for the first time - I've been exploring how they work in RISC-V (its very interesting). Please tell me if I'm mistaken, but it seems GlobalISel only supports scalable vectors (and not fixed vectors) for arguments to functions. Despite this, it seems most scalable vector instructions are not supported in GlobalISel. All of my fiddling around with LLVM IR is getting me LLVM ERROR: unable to translate instruction: call and LLVM ERROR: unable to lower arguments: errors. Investigation online leads me to believe that my observations are correct: LLVM lacks a lot of scalable vector instruction support. This talk from 2023 also says that GlobalISel lacks scalable vector support.

If this is all true, how should I construct these vector tests? Since I can't seem to pass fixed length vectors as arguments, I'm forced to write them in the function body (but this leads to the compiler constant folding it all down since it statically knows the content of said fixed vectors). Unless there's a better way, perhaps I should look into loading the vector from a pointer or something?

Admittedly, I feel I haven't done enough digging or deeper reading into this (beyond the language reference and this very useful article, combined with your blog post on this). If you have any resources I should really sink my teeth into, I will do so. Thanks again!

llvmbot added the llvm:globalisel label Apr 10, 2025

davemgreen requested review from aemerson, topperc, arsenm and davemgreen April 10, 2025 06:47

davemgreen requested a review from mshockwave April 10, 2025 06:47

axelcool1234 force-pushed the fsh-Combiner-Port branch from e8b1457 to 6b8c8bd Compare April 10, 2025 23:41

llvmbot added the backend:AArch64 label Apr 10, 2025

davemgreen reviewed Apr 11, 2025

View reviewed changes

llvm/include/llvm/Target/GlobalISel/Combine.td Outdated Show resolved Hide resolved

llvm/test/CodeGen/AArch64/funnel-shift.ll Show resolved Hide resolved

arsenm reviewed Apr 11, 2025

View reviewed changes

llvm/include/llvm/Target/GlobalISel/Combine.td Outdated Show resolved Hide resolved

arsenm reviewed Apr 11, 2025

View reviewed changes

axelcool1234 force-pushed the fsh-Combiner-Port branch from 6b8c8bd to 17c74e0 Compare April 12, 2025 07:03

funnel shift combiner port from SelectionDAG ISel to GlobalISel

733135e

axelcool1234 force-pushed the fsh-Combiner-Port branch from 17c74e0 to 733135e Compare April 12, 2025 07:19

mshockwave reviewed Apr 20, 2025

View reviewed changes

axelcool1234 added 2 commits May 3, 2025 05:30

pre-commit test

2fc366c

With combiner on (optimized code)

ddf3bd1

axelcool1234 requested a review from mshockwave May 3, 2025 12:50

cdevadas changed the title ~~[GISel] funnel shift combiner port from SelectionDAG ISel to GlobalISel~~ [GISel] Funnel shift combiner port from SelectionDAG ISel to GlobalISel May 5, 2025

arsenm reviewed May 5, 2025

View reviewed changes

llvm/include/llvm/Target/GlobalISel/Combine.td Outdated Show resolved Hide resolved

llvm/include/llvm/Target/GlobalISel/Combine.td Outdated Show resolved Hide resolved

Removed hasOneUse checks (note that the generated test code now lacks…

59ab794

… the redundant OR instruction!)

Ensured G_OR is commutative in combines (now triggers no matter the o…

3e6e0ac

…rder of operands)

axelcool1234 commented May 10, 2025

View reviewed changes

llvm/include/llvm/Target/GlobalISel/Combine.td Outdated Show resolved Hide resolved

axelcool1234 added 2 commits May 20, 2025 01:19

Added FIXME comments for TableGen's lack of G_OR commutativity handling

96b04fe

Reversed changes to original shift.ll tests

c929993

Search code, repositories, users, issues, pull requests...

[GISel] Funnel shift combiner port from SelectionDAG ISel to GlobalISel #135132

Are you sure you want to change the base?

[GISel] Funnel shift combiner port from SelectionDAG ISel to GlobalISel #135132

Uh oh!

Conversation

axelcool1234 commented Apr 10, 2025

Uh oh!

axelcool1234 commented Apr 10, 2025

Uh oh!

llvmbot commented Apr 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davemgreen commented Apr 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

axelcool1234 commented Apr 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davemgreen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

arsenm Apr 11, 2025

Choose a reason for hiding this comment

Uh oh!

axelcool1234 Apr 13, 2025

Choose a reason for hiding this comment

Uh oh!

arsenm Apr 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

axelcool1234 Apr 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mshockwave Apr 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

axelcool1234 May 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mshockwave left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mshockwave Apr 20, 2025

Choose a reason for hiding this comment

Uh oh!

axelcool1234 May 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mshockwave May 6, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jayfoad commented May 6, 2025

Uh oh!

axelcool1234 commented May 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jayfoad commented May 6, 2025

Uh oh!

axelcool1234 commented May 10, 2025

Uh oh!

Uh oh!

axelcool1234 commented May 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mshockwave commented May 20, 2025

Uh oh!

axelcool1234 commented May 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

llvmbot commented Apr 10, 2025 •

edited

Loading

davemgreen commented Apr 10, 2025 •

edited

Loading

axelcool1234 commented Apr 10, 2025 •

edited

Loading

arsenm Apr 13, 2025 •

edited

Loading

axelcool1234 Apr 16, 2025 •

edited

Loading

mshockwave Apr 20, 2025 •

edited

Loading

axelcool1234 May 3, 2025 •

edited

Loading

mshockwave left a comment •

edited

Loading

axelcool1234 May 3, 2025 •

edited

Loading

axelcool1234 commented May 6, 2025 •

edited

Loading

axelcool1234 commented May 20, 2025 •

edited

Loading

axelcool1234 commented May 24, 2025 •

edited

Loading