Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[AMDGPU] Simplify getIns64. NFCI. #139981

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 15, 2025

Conversation

rampitec
Copy link
Collaborator

This big switch is unmaintainable and buggy. In particular it unconditionally
adds clamp if there is omod to VOP3.

Copy link
Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@rampitec rampitec requested review from arsenm and shiltian May 14, 2025 23:57
@rampitec rampitec force-pushed the users/rampitec/05-14-_amdgpu_simplify_getins64._nfci branch from 6c7c212 to 5ba584e Compare May 15, 2025 00:00
@rampitec rampitec marked this pull request as ready for review May 15, 2025 00:01
@llvmbot
Copy link
Member

llvmbot commented May 15, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

This big switch is unmaintainable and buggy. In particular it unconditionally
adds clamp if there is omod to VOP3.


Full diff: https://github.com/llvm/llvm-project/pull/139981.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+17-75)
  • (modified) llvm/lib/Target/AMDGPU/VOP2Instructions.td (+2-2)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.td b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
index 79667e5ff9285..203992e3853d0 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
@@ -1972,81 +1972,23 @@ class getIns64 <RegisterOperand Src0RC, RegisterOperand Src1RC,
                 RegisterOperand Src2RC, int NumSrcArgs,
                 bit HasClamp, bit HasModifiers, bit HasSrc2Mods, bit HasOMod,
                 Operand Src0Mod, Operand Src1Mod, Operand Src2Mod> {
-
-  dag ret =
-    !if (!eq(NumSrcArgs, 0),
-      // VOP1 without input operands (V_NOP, V_CLREXCP)
-      (ins),
-      /* else */
-    !if (!eq(NumSrcArgs, 1),
-      !if (HasModifiers,
-        // VOP1 with modifiers
-        !if(HasOMod,
-          (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-               Clamp0:$clamp, omod0:$omod),
-          !if (HasClamp,
-            (ins Src0Mod:$src0_modifiers, Src0RC:$src0, Clamp0:$clamp),
-            (ins Src0Mod:$src0_modifiers, Src0RC:$src0)))
-      /* else */,
-        // VOP1 without modifiers
-        !if(HasOMod,
-          (ins Src0RC:$src0, Clamp0:$clamp, omod0:$omod),
-          !if (HasClamp,
-            (ins Src0RC:$src0, Clamp0:$clamp),
-            (ins Src0RC:$src0)))
-      /* endif */ ),
-    !if (!eq(NumSrcArgs, 2),
-      !if (HasModifiers,
-        // VOP 2 with modifiers
-        !if(HasOMod,
-          (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-               Src1Mod:$src1_modifiers, Src1RC:$src1,
-               Clamp0:$clamp, omod0:$omod),
-          !con((ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-                    Src1Mod:$src1_modifiers, Src1RC:$src1),
-                !if(HasClamp, (ins Clamp0:$clamp), (ins))))
-      /* else */,
-        // VOP2 without modifiers
-        !if (HasClamp,
-          (ins Src0RC:$src0, Src1RC:$src1, Clamp0:$clamp),
-          (ins Src0RC:$src0, Src1RC:$src1))
-
-      /* endif */ )
-    /* NumSrcArgs == 3 */,
-      !if (HasModifiers,
-        !if (HasSrc2Mods,
-          // VOP3 with modifiers
-          !if (HasOMod,
-            (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-                 Src1Mod:$src1_modifiers, Src1RC:$src1,
-                 Src2Mod:$src2_modifiers, Src2RC:$src2,
-                 Clamp0:$clamp, omod0:$omod),
-            !if (HasClamp,
-              (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-                   Src1Mod:$src1_modifiers, Src1RC:$src1,
-                   Src2Mod:$src2_modifiers, Src2RC:$src2,
-                   Clamp0:$clamp),
-              (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-                   Src1Mod:$src1_modifiers, Src1RC:$src1,
-                   Src2Mod:$src2_modifiers, Src2RC:$src2))),
-          // VOP3 with modifiers except src2
-          !if (HasOMod,
-            (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-                 Src1Mod:$src1_modifiers, Src1RC:$src1,
-                 Src2RC:$src2, Clamp0:$clamp, omod0:$omod),
-            !if (HasClamp,
-              (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-                   Src1Mod:$src1_modifiers, Src1RC:$src1,
-                   Src2RC:$src2, Clamp0:$clamp),
-              (ins Src0Mod:$src0_modifiers, Src0RC:$src0,
-                   Src1Mod:$src1_modifiers, Src1RC:$src1,
-                   Src2RC:$src2))))
-      /* else */,
-        // VOP3 without modifiers
-        !if (HasClamp,
-          (ins Src0RC:$src0, Src1RC:$src1, Src2RC:$src2, Clamp0:$clamp),
-          (ins Src0RC:$src0, Src1RC:$src1, Src2RC:$src2))
-      /* endif */ ))));
+  dag src0 = !if (HasModifiers,
+                  (ins Src0Mod:$src0_modifiers, Src0RC:$src0),
+                  (ins Src0RC:$src0));
+  dag src1 = !if(!ge(NumSrcArgs, 2),
+                 !if (HasModifiers,
+                      (ins Src1Mod:$src1_modifiers, Src1RC:$src1),
+                      (ins Src1RC:$src1)),
+                 (ins));
+  dag src2 = !if(!ge(NumSrcArgs, 3),
+                 !if (HasSrc2Mods,
+                      (ins Src2Mod:$src2_modifiers, Src2RC:$src2),
+                      (ins Src2RC:$src2)),
+                 (ins));
+  dag clamp = !if (HasClamp, (ins Clamp0:$clamp), (ins));
+  dag omod = !if(HasOMod, (ins omod0:$omod), (ins));
+
+  dag ret = !con(src0, src1, src2, clamp, omod);
 }
 
 class getInsVOP3Base<RegisterOperand Src0RC, RegisterOperand Src1RC,
diff --git a/llvm/lib/Target/AMDGPU/VOP2Instructions.td b/llvm/lib/Target/AMDGPU/VOP2Instructions.td
index 30cef69aa29c4..0c7e20fc1ebf3 100644
--- a/llvm/lib/Target/AMDGPU/VOP2Instructions.td
+++ b/llvm/lib/Target/AMDGPU/VOP2Instructions.td
@@ -439,7 +439,7 @@ class VOP_MAC <ValueType vt0, ValueType vt1=vt0> : VOPProfile <[vt0, vt1, vt1, v
   // Src2 must accept the same operand types as vdst, namely VGPRs only
   let Src2RC64 = getVOP3VRegForVT<Src2VT, IsTrue16, !not(IsRealTrue16)>.ret;
   let Ins64 = getIns64<Src0RC64, Src1RC64, Src2RC64, 3,
-                       0, HasModifiers, HasModifiers, HasOMod,
+                       HasClamp, HasModifiers, HasModifiers, HasOMod,
                        Src0Mod, Src1Mod, Src2Mod>.ret;
   let InsDPP = (ins Src0ModDPP:$src0_modifiers, Src0DPP:$src0,
                     Src1ModDPP:$src1_modifiers, Src1DPP:$src1,
@@ -448,7 +448,7 @@ class VOP_MAC <ValueType vt0, ValueType vt1=vt0> : VOPProfile <[vt0, vt1, vt1, v
                     DppBankMask:$bank_mask, DppBoundCtrl:$bound_ctrl);
   let InsDPP16 = !con(InsDPP, (ins Dpp16FI:$fi));
   let InsVOP3Base = getInsVOP3Base<Src0VOP3DPP, Src1VOP3DPP, RegisterOperand<VGPR_32>, 3,
-                       0, HasModifiers, HasModifiers, HasOMod,
+                       HasClamp, HasModifiers, HasModifiers, HasOMod,
                        Src0ModVOP3DPP, Src1ModVOP3DPP, Src2Mod, HasOpSel>.ret;
   // We need a dummy src2 tied to dst to track the use of that register for s_delay_alu
   let InsVOPDX = (ins Src0RC32:$src0X, Src1RC32:$vsrc1X, VGPRSrc_32:$src2X);

@rampitec
Copy link
Collaborator Author

Interesting... Failure only happens in a debug build and it is not an assert...

@rampitec rampitec force-pushed the users/rampitec/05-14-_amdgpu_simplify_getins64._nfci branch from 5ba584e to e64135b Compare May 15, 2025 08:10
@rampitec
Copy link
Collaborator Author

Interesting... Failure only happens in a debug build and it is not an assert...

OK, it did not factor v_nop w/o any operands. Fixed.

This big switch is unmaintainable and buggy. In particular it unconditionally
adds clamp if there is omod to VOP3.
@rampitec rampitec force-pushed the users/rampitec/05-14-_amdgpu_simplify_getins64._nfci branch from e64135b to 17bac1c Compare May 15, 2025 08:18
@rampitec rampitec merged commit 849ecbc into main May 15, 2025
11 checks passed
@rampitec rampitec deleted the users/rampitec/05-14-_amdgpu_simplify_getins64._nfci branch May 15, 2025 09:59
Copy link
Contributor

@jayfoad jayfoad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@rampitec
Copy link
Collaborator Author

Nice!

I mean... going down the rabbit hole of t16, you will inevitable come to this place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.