[RISCV] Add MC layer support for XSfmm*. #133031

Open
wants to merge 10 commits into
base: main

Conversation

topperc
Collaborator

@topperc topperc commented Mar 26, 2025

This adds assembler/disassembler support for XSfmmbase 0.6 and related SiFive matrix multiplication extensions based on the spec here https://www.sifive.com/document-file/xsfmm-matrix-extensions-specification

Functionality-wise, this is the same as the Zvma extension proposal that SiFive shared with the Attached Matrix Extension Task Group. The extension names and instruction mnemonics have been changed to use vendor prefixes.

Note this is a non-conforming extension as the opcodes used here are in the standard opcode space in OP-V or OP-VE.

Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
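
The version 0.6 named in the description is what the preprocessor tests in this patch check for: each `__riscv_<ext>` macro is expected to be 6000. That matches the RISC-V C API convention of encoding a version as major * 1,000,000 + minor * 1,000. A minimal sketch of that arithmetic follows; `archTestMacroValue` and `HasXSfmmBase` are hypothetical names used for illustration and are not part of clang.

```cpp
#include <cassert>

// Hypothetical helper (not a clang API): the version encoding used by the
// __riscv_<ext> architecture test macros, major * 1,000,000 + minor * 1,000.
constexpr unsigned archTestMacroValue(unsigned Major, unsigned Minor) {
  return Major * 1000000u + Minor * 1000u;
}

static_assert(archTestMacroValue(0, 6) == 6000, "XSfmm* version 0.6");
static_assert(archTestMacroValue(1, 0) == 1000000, "a 1.0 extension");

// How user code might guard on the extension. The macro name is taken from
// the tests in this patch; HasXSfmmBase is a hypothetical constant.
#if defined(__riscv_xsfmmbase) && __riscv_xsfmmbase >= 6000
constexpr bool HasXSfmmBase = true;
#else
constexpr bool HasXSfmmBase = false;
#endif
```
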
@llvmbot llvmbot added the clang, backend:RISC-V, clang:driver, and mc labels Mar 26, 2025
@topperc
Collaborator Author

topperc commented Mar 26, 2025

CC @sequencer @FantasqueX

@llvmbot
Member

llvmbot commented Mar 26, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-mc

Author: Craig Topper (topperc)

Patch is 59.21 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133031.diff

21 Files Affected:

  • (modified) clang/test/Driver/print-supported-extensions-riscv.c (+12)
  • (added) clang/test/Preprocessor/riscv-target-features-sifive.c (+95)
  • (modified) llvm/docs/RISCVUsage.rst (+3)
  • (modified) llvm/docs/ReleaseNotes.md (+2)
  • (modified) llvm/include/llvm/TargetParser/RISCVTargetParser.h (+28)
  • (modified) llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp (+89)
  • (modified) llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp (+34)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp (+14)
  • (modified) llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.h (+2)
  • (modified) llvm/lib/Target/RISCV/RISCVFeatures.td (+80)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+1)
  • (added) llvm/lib/Target/RISCV/RISCVInstrInfoXSfmm.td (+285)
  • (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.td (+13)
  • (modified) llvm/lib/Target/RISCV/RISCVSubtarget.h (+1)
  • (modified) llvm/lib/TargetParser/RISCVTargetParser.cpp (+9)
  • (added) llvm/test/CodeGen/RISCV/attributes-sifive.ll (+58)
  • (modified) llvm/test/CodeGen/RISCV/features-info.ll (+12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-zve64f.mir (+2-2)
  • (added) llvm/test/MC/RISCV/rvv/xsfmm-invalid.s (+36)
  • (added) llvm/test/MC/RISCV/rvv/xsfmm.s (+224)
  • (modified) llvm/unittests/TargetParser/RISCVISAInfoTest.cpp (+12)
diff --git a/clang/test/Driver/print-supported-extensions-riscv.c b/clang/test/Driver/print-supported-extensions-riscv.c
index 7e201b1149ec3..93f9e75b8413d 100644
--- a/clang/test/Driver/print-supported-extensions-riscv.c
+++ b/clang/test/Driver/print-supported-extensions-riscv.c
@@ -164,6 +164,18 @@
 // CHECK-NEXT:     xmipscmove           1.0       'XMIPSCMove' (MIPS conditional move instruction(s) (ccmov))
 // CHECK-NEXT:     xmipslsp             1.0       'XMIPSLSP' (MIPS optimization for hardware load-store bonding)
 // CHECK-NEXT:     xsfcease             1.0       'XSfcease' (SiFive sf.cease Instruction)
+// CHECK-NEXT:     xsfmm128t            0.6       'XSfmm128t' (TE=128 configuration)
+// CHECK-NEXT:     xsfmm16t             0.6       'XSfmm16t' (TE=16 configuration)
+// CHECK-NEXT:     xsfmm32a             0.6       'XSfmm32a' (TEW=32-bit accumulation, operands - int: 8b; float: fp16, bf16, fp32)
+// CHECK-NEXT:     xsfmm32a16f          0.6       'XSfmm32a16f' (TEW=32-bit accumulation, operands - float: 16b, widen=2 (IEEE, BF))
+// CHECK-NEXT:     xsfmm32a32f          0.6       'XSfmm32a32f' (TEW=32-bit accumulation, operands - float: 32b)
+// CHECK-NEXT:     xsfmm32a4i           0.6       'XSfmm32a4i' (TEW=32-bit accumulation, operands - int: 4b (packed))
+// CHECK-NEXT:     xsfmm32a8f           0.6       'XSfmm32a8f' (TEW=32-bit accumulation, operands - float: fp8)
+// CHECK-NEXT:     xsfmm32a8i           0.6       'XSfmm32a8i' (TEW=32-bit accumulation, operands - int: 8b)
+// CHECK-NEXT:     xsfmm32t             0.6       'XSfmm32t' (TE=32 configuration)
+// CHECK-NEXT:     xsfmm64a64f          0.6       'XSfmm64a64f' (TEW=64-bit accumulation, operands - float: fp64)
+// CHECK-NEXT:     xsfmm64t             0.6       'XSfmm64t' (TE=64 configuration)
+// CHECK-NEXT:     xsfmmbase            0.6       'XSfmmbase' (All non arithmetic instructions for all TEWs and sf.vtzero)
 // CHECK-NEXT:     xsfvcp               1.0       'XSfvcp' (SiFive Custom Vector Coprocessor Interface Instructions)
 // CHECK-NEXT:     xsfvfnrclipxfqf      1.0       'XSfvfnrclipxfqf' (SiFive FP32-to-int8 Ranged Clip Instructions)
 // CHECK-NEXT:     xsfvfwmaccqqq        1.0       'XSfvfwmaccqqq' (SiFive Matrix Multiply Accumulate Instruction and 4-by-4))
diff --git a/clang/test/Preprocessor/riscv-target-features-sifive.c b/clang/test/Preprocessor/riscv-target-features-sifive.c
new file mode 100644
index 0000000000000..a57db60a1b326
--- /dev/null
+++ b/clang/test/Preprocessor/riscv-target-features-sifive.c
@@ -0,0 +1,95 @@
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm128t -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM128T %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm128t -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM128T %s
+// CHECK-XSFMM128T: __riscv_xsfmm128t  6000{{$}}
+//
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm16t -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM16T %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm16t -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM16T %s
+// CHECK-XSFMM16T: __riscv_xsfmm16t  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm32a -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32A %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm32a -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32A %s
+// CHECK-XSFMM32A: __riscv_xsfmm32a  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm32a4i -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32A4I %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm32a4i -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32A4I %s
+// CHECK-XSFMM32A4I: __riscv_xsfmm32a4i  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm32a8i -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32a8I %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm32a8i -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32a8I %s
+// CHECK-XSFMM32a8I: __riscv_xsfmm32a8i  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm32a8f -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32A8F %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm32a8f -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32A8F %s
+// CHECK-XSFMM32A8F: __riscv_xsfmm32a8f  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm32a16f -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32a16F %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm32a16f -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32a16F %s
+// CHECK-XSFMM32a16F: __riscv_xsfmm32a16f  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm32a32f -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32a32F %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm32a32f -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32a32F %s
+// CHECK-XSFMM32a32F: __riscv_xsfmm32a32f  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm32t -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32T %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm32t -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM32T %s
+// CHECK-XSFMM32T: __riscv_xsfmm32t  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm64a64f -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM64a64f %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm64a64f -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM64a64f %s
+// CHECK-XSFMM64a64f: __riscv_xsfmm64a64f  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmm64t -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM64T %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmm64t -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMM64T %s
+// CHECK-XSFMM64T: __riscv_xsfmm64t  6000{{$}}
+
+// RUN: %clang --target=riscv32 \
+// RUN: -march=rv32i_zve32x_xsfmmbase -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMMBASE %s
+// RUN: %clang --target=riscv64 \
+// RUN: -march=rv64i_zve32x_xsfmmbase -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-XSFMMBASE %s
+// CHECK-XSFMMBASE: __riscv_xsfmmbase  6000{{$}}
diff --git a/llvm/docs/RISCVUsage.rst b/llvm/docs/RISCVUsage.rst
index 8735b274a805f..1100b1a8fbe3c 100644
--- a/llvm/docs/RISCVUsage.rst
+++ b/llvm/docs/RISCVUsage.rst
@@ -389,6 +389,9 @@ The current vendor extensions supported are:
 ``XVentanaCondOps``
   LLVM implements `version 1.0.0 of the VTx-family custom instructions specification <https://github.com/ventanamicro/ventana-custom-extensions/releases/download/v1.0.0/ventana-custom-extensions-v1.0.0.pdf>`__ by Ventana Micro Systems.  All instructions are prefixed with `vt.` as described in the specification, and the riscv-toolchain-convention document linked above.  These instructions are only available for riscv64 at this time.
 
+``Xsfmm*``
+  LLVM implements `version 0.6 of the Xsfmm Family of Attached Matrix Extensions Specification <https://www.sifive.com/document-file/xsfmm-matrix-extensions-specification>`__ by SiFive.  All instructions are prefixed with `sf.` as described in the specification.
+
 ``XSfvcp``
   LLVM implements `version 1.1.0 of the SiFive Vector Coprocessor Interface (VCIX) Software Specification <https://sifive.cdn.prismic.io/sifive/Zn3m1R5LeNNTwnLS_vcix-spec-software-v1p1.pdf>`__ by SiFive.  All instructions are prefixed with `sf.vc.` as described in the specification, and the riscv-toolchain-convention document linked above.
 
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 445599fb9b770..b278e99d1adf3 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -152,6 +152,8 @@ Changes to the RISC-V Backend
   handlers.
 * When the experimental extension `Xqcili` is enabled, `qc.e.li` and `qc.li` may
   now be used to materialize immediates.
+* Adds experimental assembler support for the SiFive Xsfmm* Attached Matrix
+  Extensions.
 
 Changes to the WebAssembly Backend
 ----------------------------------
diff --git a/llvm/include/llvm/TargetParser/RISCVTargetParser.h b/llvm/include/llvm/TargetParser/RISCVTargetParser.h
index 6e231d32e7897..b4b6096f860bf 100644
--- a/llvm/include/llvm/TargetParser/RISCVTargetParser.h
+++ b/llvm/include/llvm/TargetParser/RISCVTargetParser.h
@@ -97,6 +97,8 @@ inline static bool isValidLMUL(unsigned LMUL, bool Fractional) {
 unsigned encodeVTYPE(VLMUL VLMUL, unsigned SEW, bool TailAgnostic,
                      bool MaskAgnostic);
 
+unsigned encodeXSfmmVType(unsigned SEW, unsigned Widen, bool AltFmt);
+
 inline static VLMUL getVLMUL(unsigned VType) {
   unsigned VLMul = VType & 0x7;
   return static_cast<VLMUL>(VLMul);
@@ -126,10 +128,36 @@ inline static unsigned getSEW(unsigned VType) {
   return decodeVSEW(VSEW);
 }
 
+inline static unsigned decodeTWiden(unsigned TWiden) {
+  assert((TWiden == 1 || TWiden == 2 || TWiden == 3) &&
+         "Unexpected TWiden value");
+  return 1 << (TWiden - 1);
+}
+
+inline static bool hasXSfmmWiden(unsigned VType) {
+  unsigned TWiden = (VType >> 9) & 0x3;
+  return TWiden != 0;
+}
+
+inline static unsigned getXSfmmWiden(unsigned VType) {
+  unsigned TWiden = (VType >> 9) & 0x3;
+  assert(TWiden != 0 && "Invalid widen value");
+  return 1 << (TWiden - 1);
+}
+
+inline static bool getXSfmmAltFmt(unsigned VType) { return (VType >> 8) & 1; }
+
+static inline bool isValidXSfmmVType(unsigned VTypeI) {
+  return (VTypeI & ~0x738) == 0 && RISCVVType::hasXSfmmWiden(VTypeI) &&
+         RISCVVType::getSEW(VTypeI) * RISCVVType::getXSfmmWiden(VTypeI) <= 64;
+}
+
 inline static bool isTailAgnostic(unsigned VType) { return VType & 0x40; }
 
 inline static bool isMaskAgnostic(unsigned VType) { return VType & 0x80; }
 
+inline static bool isAltFmt(unsigned VType) { return VType & 0x100; }
+
 void printVType(unsigned VType, raw_ostream &OS);
 
 unsigned getSEWLMULRatio(unsigned SEW, VLMUL VLMul);
diff --git a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
index 05997cf78c6b1..abe734b1dab20 100644
--- a/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+++ b/llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
@@ -77,6 +77,12 @@ class RISCVAsmParser : public MCTargetAsmParser {
     VTypeState_Done,
   };
 
+  enum WWEEState {
+    WWEEState_Widen,
+    WWEEState_SEW,
+    WWEEState_Done,
+  };
+
   SmallVector<FeatureBitset, 4> FeatureBitStack;
 
   SmallVector<ParserOptionsSet, 4> ParserOptionsStack;
@@ -125,6 +131,9 @@ class RISCVAsmParser : public MCTargetAsmParser {
                        bool &MaskAgnostic);
   bool generateVTypeError(SMLoc ErrorLoc);
 
+  bool parseXSfmmVTypeToken(const AsmToken &Tok, WWEEState &State, unsigned &WW,
+                            unsigned &EE, bool &AltFmt);
+  bool generateXSfmmVTypeError(SMLoc ErrorLoc);
   // Helper to actually emit an instruction to the MCStreamer. Also, when
   // possible, compression of the instruction is performed.
   void emitToStreamer(MCStreamer &S, const MCInst &Inst);
@@ -217,6 +226,7 @@ class RISCVAsmParser : public MCTargetAsmParser {
   ParseStatus parseFenceArg(OperandVector &Operands);
   ParseStatus parseReglist(OperandVector &Operands);
   ParseStatus parseRegReg(OperandVector &Operands);
+  ParseStatus parseXSfmmVType(OperandVector &Operands);
   ParseStatus parseRetval(OperandVector &Operands);
   ParseStatus parseZcmpStackAdj(OperandVector &Operands,
                                 bool ExpectNegative = false);
@@ -622,6 +632,10 @@ struct RISCVOperand final : public MCParsedAsmOperand {
     return Kind == KindTy::VType;
   }
 
+  bool isXSfmmVType() const {
+    return Kind == KindTy::VType && RISCVVType::isValidXSfmmVType(VType.Val);
+  }
+
   /// Return true if the operand is a valid for the fence instruction e.g.
   /// ('iorw').
   bool isFenceArg() const { return Kind == KindTy::Fence; }
@@ -2489,6 +2503,81 @@ bool RISCVAsmParser::generateVTypeError(SMLoc ErrorLoc) {
       "e[8|16|32|64],m[1|2|4|8|f2|f4|f8],[ta|tu],[ma|mu]");
 }
 
+bool RISCVAsmParser::parseXSfmmVTypeToken(const AsmToken &Tok, WWEEState &State,
+                                          unsigned &WW, unsigned &EE,
+                                          bool &AltFmt) {
+  if (getLexer().isNot(AsmToken::Identifier))
+    return true;
+
+  StringRef Identifier = getTok().getIdentifier();
+
+  switch (State) {
+  case WWEEState_SEW:
+    if (!Identifier.consume_front("e"))
+      break;
+    if (Identifier.getAsInteger(10, EE)) {
+      if (Identifier != "16alt")
+        break;
+
+      AltFmt = true;
+      EE = 16;
+    }
+    if (!RISCVVType::isValidSEW(EE))
+      break;
+    State = WWEEState_Widen;
+    return false;
+  case WWEEState_Widen:
+    if (!Identifier.consume_front("w"))
+      break;
+    if (Identifier.getAsInteger(10, WW))
+      break;
+    if (WW != 1 && WW != 2 && WW != 4)
+      break;
+    State = WWEEState_Done;
+    return false;
+  case WWEEState_Done:
+    // Extra token?
+    break;
+  }
+
+  return true;
+}
+
+ParseStatus RISCVAsmParser::parseXSfmmVType(OperandVector &Operands) {
+  SMLoc S = getLoc();
+
+  unsigned Widen = 0;
+  unsigned SEW = 0;
+  bool AltFmt = false;
+
+  WWEEState State = WWEEState_SEW;
+
+  if (parseXSfmmVTypeToken(getTok(), State, Widen, SEW, AltFmt))
+    return generateXSfmmVTypeError(S);
+
+  getLexer().Lex();
+
+  if (!parseOptionalToken(AsmToken::Comma))
+    return generateXSfmmVTypeError(S);
+
+  if (parseXSfmmVTypeToken(getTok(), State, Widen, SEW, AltFmt))
+    return generateXSfmmVTypeError(S);
+
+  getLexer().Lex();
+
+  if (getLexer().is(AsmToken::EndOfStatement) && State == WWEEState_Done) {
+    Operands.push_back(RISCVOperand::createVType(
+        RISCVVType::encodeXSfmmVType(SEW, Widen, AltFmt), S));
+    return ParseStatus::Success;
+  }
+
+  return generateXSfmmVTypeError(S);
+}
+
+bool RISCVAsmParser::generateXSfmmVTypeError(SMLoc ErrorLoc) {
+  return Error(ErrorLoc, "operand must be e[8|16|16alt|32|64],w[1|2|4]");
+}
+
 ParseStatus RISCVAsmParser::parseMaskReg(OperandVector &Operands) {
   if (getLexer().isNot(AsmToken::Identifier))
     return ParseStatus::NoMatch;
diff --git a/llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp b/llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
index 93cbf662bfa32..2c2ea82b5e892 100644
--- a/llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
+++ b/llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
@@ -323,6 +323,39 @@ static DecodeStatus DecodeVMV0RegisterClass(MCInst &Inst, uint32_t RegNo,
   return MCDisassembler::Success;
 }
 
+static DecodeStatus DecodeTRRegisterClass(MCInst &Inst, uint32_t RegNo,
+                                          uint64_t Address,
+                                          const MCDisassembler *Decoder) {
+  if (RegNo > 15)
+    return MCDisassembler::Fail;
+
+  MCRegister Reg = RISCV::T0 + RegNo;
+  Inst.addOperand(MCOperand::createReg(Reg));
+  return MCDisassembler::Success;
+}
+
+static DecodeStatus DecodeTRM2RegisterClass(MCInst &Inst, uint32_t RegNo,
+                                            uint64_t Address,
+                                            const MCDisassembler *Decoder) {
+  if (RegNo > 15 || RegNo % 2)
+    return MCDisassembler::Fail;
+
+  MCRegister Reg = RISCV::T0 + RegNo;
+  Inst.addOperand(MCOperand::createReg(Reg));
+  return MCDisassembler::Success;
+}
+
+static DecodeStatus DecodeTRM4RegisterClass(MCInst &Inst, uint32_t RegNo,
+                                            uint64_t Address,
+                                            const MCDisassembler *Decoder) {
+  if (RegNo > 15 || RegNo % 4)
+    return MCDisassembler::Fail;
+
+  MCRegister Reg = RISCV::T0 + RegNo;
+  Inst.addOperand(MCOperand::createReg(Reg));
+  return MCDisassembler::Success;
+}
+
 static DecodeStatus decodeVMaskReg(MCInst &Inst, uint32_t RegNo,
                                    uint64_t Address,
                                    const MCDisassembler *Decoder) {
@@ -707,6 +740,7 @@ static constexpr DecoderListEntry DecoderList32[]{
      "XVentanaCondOps"},
     {DecoderTableXTHead32, XTHeadGroup, "T-Head extensions"},
     {DecoderTableXSfvector32, XSfVectorGroup, "SiFive vector extensions"},
+    {DecoderTableXSfmm32, {RISCV::FeatureVendorXSfmmbase}, "SiFive XSfmm"},
     {DecoderTableXSfsystem32, XSfSystemGroup, "SiFive system extensions"},
     {DecoderTableXSfcease32, {RISCV::FeatureVendorXSfcease}, "SiFive sf.cease"},
     {DecoderTableXmipslsp32, {RISCV::FeatureVendorXMIPSLSP}, "MIPS mips.lsp"},
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
index a4a40862a67c6..f7aebac205ce7 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.cpp
@@ -219,6 +219,20 @@ void RISCVInstPrinter::printVTypeI(const MCInst *MI, unsigned OpNo,
   RISCVVType::printVType(Imm, O);
 }
 
+void RISCVInstPrinter::printXSfmmVType(const MCInst *MI, unsigned OpNo,
+                                       const MCSubtargetInfo &STI,
+                                       raw_ostream &O) {
+  unsigned Imm = MI->getOperand(OpNo).getImm();
+  assert(RISCVVType::isValidXSfmmVType(Imm));
+  unsigned SEW = RISCVVType::getSEW(Imm);
+  O << "e" << SEW;
+  bool AltFmt = RISCVVType::getXSfmmAltFmt(Imm);
+  if (AltFmt)
+    O << "alt";
+  unsigned Widen = RISCVVType::getXSfmmWiden(Imm);
+  O << ", w" << Widen;
+}
+
 // Print a Zcmp RList. If we are printing architectural register names rather
 // than ABI register names, we need to print "{x1, x8-x9, x18-x27}" for all
 // registers. Otherwise, we print "{ra, s0-s11}".
diff --git a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.h b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.h
index 6d4928ee64ec9..e4846c427beb7 100644
--- a/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.h
+++ b/llvm/lib/Target/RISCV/MCTargetDesc/RISCVInstPrinter.h
@@ -48,6 +48,8 @@ class RISCVInstPrinter : public MCInstPrinter {
                             const MCSubtargetInfo &STI, raw_ostream &O);
   void printVTypeI(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
                    raw_ostream &O);
+  void printXSfmmVType(const MCInst *MI, unsigned OpNo,
+                       const MCSubtargetInfo &STI, raw_ostream &O);
   void printVMaskReg(const MCInst *MI, unsigned OpNo,
                      const MCSubtargetInfo &STI, raw_ostream &O);
   void printRlist(const MCInst *MI, unsigned OpNo, const MCSubtargetInfo &STI,
diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 5ed3ed917aa4c..2b42524e4fa2c 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1176,6 +1176,86 @@ def HasVendorXSfvcp : Predicate<"Subtarget->hasVendorXSfvcp()">,
                       AssemblerPredicate<(all_of FeatureVendorXSfvcp),
                           "'XSfvcp' (SiFive Custom Vector Coprocessor Interface Instructions)">;
 
+def FeatureVendorXSfmmbase
+    : RISCVExtension<0, 6,
+                     "All non arithmetic instructions for all TEWs and sf.vtzero",
+                     [FeatureStdExtZve32x]>;
+def HasVendorXSfmmbase : Predicate<"Subtarget->hasVendorXSfmmbase()">,
+                         AssemblerPredicate<(all_of FeatureVendorXSfmmbase),
+                             "'XSfmmbase' (All non arithmetic instructions for all TEWs and sf.vtzero)">;
+
+def FeatureVendorXSfmm32a8f
+    : RISCVExtension<0, 6,
+                     "TEW=32-bit accumulation, operands - float: fp8",
+                     [FeatureVendorXSfmmbase, FeatureStdExtZve32f]>;
+def HasVendorXSfmm32a8f : Predicate<"Subtarget->hasVendorXSfmm32a8f()">,
+                          AssemblerPredicate<(all_of FeatureVendorXSf...
[truncated]

@sequencer

Thanks for the tests on the rv32i!

Contributor

@wangpc-pp wangpc-pp left a comment

I have followed the progress of IME/AME for a long time, and there are several candidates in parallel. I think the support of these extensions should be marked as early access, and we can review these patches but won't merge them until they are ratified just like Zvzip/Zvabd/Zibimm/...

llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-zve64f.mir (outdated, resolved)
DwarfRegNum<[!add(Index, 3072)]>;

let RegInfos = XLenRI in {
def TR : RISCVRegisterClass<[untyped], 32, (add (sequence "T%u", 0, 15))>;
Contributor

I have a PoC that uses a Target Extension Type to support matrices. Has SiFive tried this mechanism in CodeGen?

Collaborator Author

We have done some experiments with that, but our current thinking is to use a constant "tile id" for intrinsics instead of allocating tiles in the compiler. This is what AArch64 SME does.

This ISA doesn't support load, store, or copy of whole tiles. These must be emulated with loops over vector registers.

I think using Target Extension Type still needs an underlying type to calculate size. The tiles here are scalable in two dimensions which TypeSize can't represent.

Contributor

> The tiles here are scalable in two dimensions which TypeSize can't represent.

Yeah, we encountered the same problem, and this is why XuanTie's AME is designed so that the two dimensions are related, IIRC.

@sequencer

> but won't merge them until they are ratified just like Zvzip/Zvabd/Zibimm

I think this patch is supposed to be the vendor instruction set of the SiFive Xsfmm* Attached Matrix Extensions, if I understand this PR correctly.

AME is a separate issue: there is no ratified AME yet, only the SiFive AME proposal and the XuanTie AME proposal.

@wangpc-pp
Contributor

> > but won't merge them until they are ratified just like Zvzip/Zvabd/Zibimm
>
> I think this patch is supposed to be the vendor instruction set of the SiFive Xsfmm* Attached Matrix Extensions, if I understand this PR correctly.
>
> AME is a separate issue: there is no ratified AME yet, only the SiFive AME proposal and the XuanTie AME proposal.

SiFive's AME proposal is located in the OP-V/OP-VE opcode space, so I think it was not designed as a vendor extension. IIUC, this patch is just for early evaluation.

@sequencer

> SiFive's AME proposal is located in the OP-V/OP-VE opcode space

Yes, this is a good point.
I do think we should also remove other vendor instruction sets which abused the standard op fields.

@topperc
Collaborator Author

topperc commented Mar 28, 2025

> > SiFive's AME proposal is located in the OP-V/OP-VE opcode space
>
> Yes, this is a good point. I do think we should also remove other vendor instruction sets which abused the standard op fields.

The RISC-V specification does not take a hard stance on non-conforming extensions. It should not be considered "abusing". Whether a non-conforming extension is allowed is up to individual platform requirements. We have discussed in the past taking patches for the T-Head 0.7 vector extension. I think we were willing to take it if there was a promise of continued maintenance.

@topperc
Collaborator Author

topperc commented Mar 28, 2025

> > > but won't merge them until they are ratified just like Zvzip/Zvabd/Zibimm
> >
> > I think this patch is supposed to be the vendor instruction set of the SiFive Xsfmm* Attached Matrix Extensions, if I understand this PR correctly.
> > AME is a separate issue: there is no ratified AME yet, only the SiFive AME proposal and the XuanTie AME proposal.
>
> SiFive's AME proposal is located in the OP-V/OP-VE opcode space, so I think it was not designed as a vendor extension. IIUC, this patch is just for early evaluation.

SiFive is committed to maintaining this implementation and we would very much like to see it in tree to enable easier sharing and avoid continual rebasing. Expect to see patches for intrinsics in the near future.

@preames
Collaborator

preames commented Mar 28, 2025

We have discussed whether to accept non-conforming vendor extensions in the past. Our consensus was clearly documented in RISCVUsage.rst in the statement "In particular, we expect to eventually accept both custom extensions and non-conforming extensions."

This is a non-conforming vendor extension, and that needs to be clearly described, but it is not blocking for whether we accept the change.

Collaborator

@preames preames left a comment

First pass of technical comments. I need to take a much more careful look at e.g. encodings, but will do that on the next round.
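
As a reading aid for the encodings, the getters added to `RISCVTargetParser.h` in this patch imply the following field layout for the XSfmm vtype immediate. This is a minimal sketch under that assumption; the body of `encodeXSfmmVType` is not visible in the truncated diff, so `sketchEncode`, `sketchSEW`, `sketchIsValid`, and `log2u` are hypothetical stand-ins rather than the patch's code.

```cpp
#include <cassert>

// Field layout inferred from getSEW, getXSfmmAltFmt, getXSfmmWiden and the
// 0x738 mask in isValidXSfmmVType:
//   bits [5:3]  vsew   (SEW = 8 << vsew)
//   bit  [8]    altfmt
//   bits [10:9] twiden (0 is invalid, otherwise widen = 1 << (twiden - 1))
constexpr unsigned log2u(unsigned V) { return V <= 1 ? 0 : 1 + log2u(V >> 1); }

// Hypothetical encoder, the inverse of the getters above.
constexpr unsigned sketchEncode(unsigned SEW, unsigned Widen, bool AltFmt) {
  unsigned VSEW = log2u(SEW) - 3;      // e8 -> 0, e16 -> 1, e32 -> 2, e64 -> 3
  unsigned TWiden = log2u(Widen) + 1;  // w1 -> 1, w2 -> 2, w4 -> 3
  return (VSEW << 3) | (static_cast<unsigned>(AltFmt) << 8) | (TWiden << 9);
}

constexpr unsigned sketchSEW(unsigned VType) {
  return 8u << ((VType >> 3) & 0x7);
}

// Mirrors the three conditions in isValidXSfmmVType: no stray bits, a
// non-zero twiden, and SEW * widen no larger than 64.
constexpr bool sketchIsValid(unsigned VType) {
  unsigned TWiden = (VType >> 9) & 0x3;
  if ((VType & ~0x738u) != 0 || TWiden == 0)
    return false;
  return sketchSEW(VType) * (1u << (TWiden - 1)) <= 64;
}

static_assert(sketchEncode(16, 2, true) == 0x508, "e16alt, w2");
static_assert(sketchIsValid(sketchEncode(16, 2, true)), "16 * 2 <= 64");
static_assert(!sketchIsValid(sketchEncode(32, 4, false)), "32 * 4 > 64");
```

Under this reading, `e32, w4` parses token by token but is rejected as an operand, because the widened element would be 128 bits.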

switch (State) {
case WWEEState_SEW:
if (!Identifier.consume_front("e"))
break;
Collaborator

Minor, but use return instead of break for readability.

unsigned SEW = 0;
bool AltFmt = false;

WWEEState State = WWEEState_SEW;
Collaborator

It's not really clear to me that the state mechanism is worthwhile over just inlining the two calls and specializing the switch. The state machine is exceedingly simple, and the extra helper may actually just confuse things.
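
One way to picture the suggestion is a parser with no state enum at all: match the `e` token, then the `w` token, and return early on any mismatch. The sketch below is standalone and works on a `std::string_view` rather than the MC lexer, so it is only an illustration of the control flow; `XSfmmVTypeFields` and `parseXSfmmVTypeSketch` are hypothetical names, and the accepted grammar is the one in the patch's error message, `e[8|16|16alt|32|64],w[1|2|4]`.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical result type for the sketch.
struct XSfmmVTypeFields {
  unsigned SEW;
  unsigned Widen;
  bool AltFmt;
};

// Parse "e<SEW>[alt],w<Widen>" in one pass; returns nullopt on any error.
inline std::optional<XSfmmVTypeFields>
parseXSfmmVTypeSketch(std::string_view S) {
  auto Comma = S.find(',');
  if (Comma == std::string_view::npos)
    return std::nullopt;
  std::string_view E = S.substr(0, Comma);
  std::string_view W = S.substr(Comma + 1);
  while (!W.empty() && W.front() == ' ')  // allow "e16, w2"
    W.remove_prefix(1);

  XSfmmVTypeFields F{0, 0, false};
  if (E == "e8")
    F.SEW = 8;
  else if (E == "e16")
    F.SEW = 16;
  else if (E == "e16alt") {
    F.SEW = 16;
    F.AltFmt = true;
  } else if (E == "e32")
    F.SEW = 32;
  else if (E == "e64")
    F.SEW = 64;
  else
    return std::nullopt;

  if (W == "w1")
    F.Widen = 1;
  else if (W == "w2")
    F.Widen = 2;
  else if (W == "w4")
    F.Widen = 4;
  else
    return std::nullopt;
  return F;
}
```

The order of the two tokens is fixed by construction, which is the property the `WWEEState` enum enforces in the patch.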

llvm/lib/Target/RISCV/RISCVInstrInfoXSfmm.td (outdated, resolved)
def : InstAlias<"sf.vsettnt $rd, $rs1, $vtypei",
(VSETVLI GPR:$rd, GPR:$rs1, XSfmmVTypeOp:$vtypei)>;

let DecoderNamespace = "XSfmm" in {
Collaborator

Can this be grouped into one of the existing decode tables? Or does it need to be separate due to the non-conforming nature?

llvm/lib/Target/RISCV/RISCVInstrInfoXSfmm.td (outdated, resolved)
@@ -34,7 +34,7 @@ body: |
 renamable $v11 = PseudoVMV_S_X undef renamable $v11, %1, 8, 5 /* e32 */
 renamable $v8 = PseudoVLE64_V_M1 undef renamable $v8, %2, 1, 6 /* e64 */, 2 /* tu, ma */ :: (load unknown-size, align 8)
 renamable $v9 = PseudoVLE32_V_M1 undef renamable $v9, %3, 8, 5 /* e32 */, 2 /* tu, ma */ :: (load unknown-size, align 4)
-INLINEASM &"# use $0 $1 $2 $3", 1 /* sideeffect attdialect */, 3997705 /* reguse:VR */, killed renamable $v10, 3997705 /* reguse:VR */, killed renamable $v11, 3997705 /* reguse:VR */, killed renamable $v8, 3997705 /* reguse:VR */, killed renamable $v9
+INLINEASM &"# use $0 $1 $2 $3", 1 /* sideeffect attdialect */, 4194313 /* reguse:VR */, killed renamable $v10, 4194313 /* reguse:VR */, killed renamable $v11, 4194313 /* reguse:VR */, killed renamable $v8, 4194313 /* reguse:VR */, killed renamable $v9
Collaborator

Can you explain this change?

Collaborator Author

The upper 16 bits of these operands hold LLVM's internal register class number. Adding the tile register classes bumped that number. The test was also updated when the GPRNoX31 register class was added in 536fe74.
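To make the explanation concrete, the two flag words from the test diff can be split into halves. This is only an arithmetic sketch: `upper16` and `lower16` are hypothetical helper names, and the exact field layout inside each half (for example, whether the stored class number carries an offset) should be checked against LLVM's `InlineAsm.h` rather than taken from here.

```cpp
#include <cassert>
#include <cstdint>

// Upper half: per the explanation above, this carries LLVM's internal
// register class numbering for the operand.
constexpr std::uint32_t upper16(std::uint32_t Flag) { return Flag >> 16; }

// Lower half: the remaining flag bits (operand kind and count, by assumption).
constexpr std::uint32_t lower16(std::uint32_t Flag) { return Flag & 0xFFFFu; }

// Old and new flag words for "reguse:VR" taken from the test diff.
constexpr std::uint32_t OldFlag = 3997705u; // 0x003D0009
constexpr std::uint32_t NewFlag = 4194313u; // 0x00400009

static_assert(upper16(OldFlag) == 61, "old upper half");
static_assert(upper16(NewFlag) == 64, "new upper half");
static_assert(lower16(OldFlag) == lower16(NewFlag),
              "only the upper half differs between the two flag words");
```

The lower halves are identical (both 9), so the only change in the test is the upper-half value moving from 61 to 64, consistent with new register classes shifting the numbering.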

Comment on lines +1262 to +1265
def FeatureVendorXSfmm128t
: RISCVExtension<0, 6,
"TE=128 configuration",
[FeatureVendorXSfmmbase, FeatureStdExtZvl512b], "XSfmmTE", "128">;

Based on the specification,

TE is constrained to be a power of 2, VLEN/4 >= TE >= 4. The upper bound
is set by the requirement that a tile row or column must fit within a single vector register group (VLEN*8 bits)

Since the maximum VLEN is 64K, TE can scale up to 64K/4 = 16K. Is it possible for the compiler to support a larger TE?

Collaborator Author

Are you asking me to add FeatureVendorXSfmm16384t?

Not a TE that big; that would not be reasonable. However, VLEN=4K with TE=1K is a possible option in our case, so I am asking whether the maximum TE could be raised, e.g. to 1K?

Collaborator Author

SiFive does not implement a TE that large so I'm a little hesitant to add an extension in our vendor namespace for it.

Thanks, that's reasonable. We will maintain a patch based on this PR and wait for the ratification of AME or VME.
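The TE bound quoted earlier in this thread (TE a power of 2, with VLEN/4 >= TE >= 4) can be checked with a few lines of arithmetic. The sketch below is illustrative only; `isValidTE` and `maxTE` are hypothetical helpers, not functions from the PR or from LLVM.

```cpp
#include <cassert>
#include <cstdint>

constexpr bool isPowerOf2(std::uint64_t X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// Upper bound from the quoted spec text: TE <= VLEN / 4.
constexpr std::uint64_t maxTE(std::uint64_t VLEN) { return VLEN / 4; }

// TE must be a power of 2 with 4 <= TE <= VLEN / 4.
constexpr bool isValidTE(std::uint64_t TE, std::uint64_t VLEN) {
  return isPowerOf2(TE) && TE >= 4 && TE <= maxTE(VLEN);
}

// XSfmm128t implies Zvl512b in the patch: 512 / 4 = 128, so TE=128 just fits.
static_assert(isValidTE(128, 512), "TE=128 is valid at VLEN=512");
static_assert(!isValidTE(256, 512), "TE=256 needs VLEN >= 1024");

// The cases discussed in the thread.
static_assert(maxTE(4096) == 1024, "VLEN=4K allows TE up to 1K");
static_assert(maxTE(65536) == 16384, "VLEN=64K allows TE up to 16K");
```

This also shows why the `FeatureVendorXSfmm128t` definition lists `FeatureStdExtZvl512b` as an implied feature: 512 is the smallest VLEN for which TE=128 satisfies the bound.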

@Avimitin
Contributor

I encountered a clang compiler crash (segmentation fault) when attempting to compile the utf8_count.c benchmark from the rvv-bench project (https://github.com/camel-cdr/rvv-bench/blob/main/bench/utf8_count.c) using a clang built from this branch. The crash occurs during the 'RISC-V DAG->DAG Pattern Instruction Selection' pass.

Steps to Reproduce:

  1. Obtain the utf8_count.c file from the rvv-bench project (or use the minimal C code provided below).

  2. Compile the C code with the following command:

    riscv32-none-elf-clang -mabi=ilp32f -march=rv32imafc_xsfmm128t_zve32f_zvl2048b -mno-relax -static -nostartfiles -mcmodel=medany -fvisibility=hidden -fno-PIC -g -O3 -fno-rtti -fno-exceptions -fno-threadsafe-statics -c utf8_count.c -o utf8_count.o

Error Log:

Compiler Error Output
$ riscv32-none-elf-clang -mabi=ilp32f -march=rv32imafc_xsfmm128t_zve32f_zvl2048b -mno-relax -static -nostartfiles -mcmodel=medany -fvisibility=hidden -fno-PIC -g -O3 -fno-rtti -fno-exceptions -fno-threadsafe-statics -c test.c -o test.o
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /nix/store/4b17xsrdcyd2bnm91pdclfw2wz3g0ikg-clang-21.0.0-unstable-2025-03-23/bin/clang @/tmp/nix-shell.VojIba/cc-params.249Z4g
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'utf8_count.c'.
4.      Running pass 'RISC-V DAG->DAG Pattern Instruction Selection' on function '@utf8_count_SWAR_popc_bithack_autovec'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  libLLVM.so.21.0git      0x00007fffea78aa0c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 60
1  libLLVM.so.21.0git      0x00007fffea788774 llvm::sys::CleanupOnSignal(unsigned long) + 148
2  libLLVM.so.21.0git      0x00007fffea668a88
3  libc.so.6               0x00007fffe8e40f30
4  libLLVM.so.21.0git      0x00007fffea9f4454 llvm::ScalableVectorType::get(llvm::Type*, unsigned int) + 20
5  libLLVM.so.21.0git      0x00007fffeb0d2c50 llvm::EVT::getExtendedVectorVT(llvm::LLVMContext&, llvm::EVT, llvm::ElementCount) + 48
6  libLLVM.so.21.0git      0x00007fffeb35a250
7  libLLVM.so.21.0git      0x00007fffeb35c008
8  libLLVM.so.21.0git      0x00007fffeb2fd970
9  libLLVM.so.21.0git      0x00007fffeb2fe3a9 llvm::SelectionDAG::LegalizeTypes() + 1321
10 libLLVM.so.21.0git      0x00007fffeb47af5f llvm::SelectionDAGISel::CodeGenAndEmitDAG() + 271
11 libLLVM.so.21.0git      0x00007fffeb47dc5a llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) + 5274
12 libLLVM.so.21.0git      0x00007fffeb47fb99 llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) + 217
13 libLLVM.so.21.0git      0x00007fffeb46e8e9 llvm::SelectionDAGISelLegacy::runOnMachineFunction(llvm::MachineFunction&) + 409
14 libLLVM.so.21.0git      0x00007fffead5e853
15 libLLVM.so.21.0git      0x00007fffea972d89 llvm::FPPassManager::runOnFunction(llvm::Function&) + 1705
16 libLLVM.so.21.0git      0x00007fffea972f3c llvm::FPPassManager::runOnModule(llvm::Module&) + 44
17 libLLVM.so.21.0git      0x00007fffea971f69 llvm::legacy::PassManagerImpl::run(llvm::Module&) + 1081
18 libclang-cpp.so.21.0git 0x00007ffff5cc47f4 clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) + 2772
19 libclang-cpp.so.21.0git 0x00007ffff60764bc clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) + 1692
20 libclang-cpp.so.21.0git 0x00007ffff4616eec clang::ParseAST(clang::Sema&, bool, bool) + 1212
21 libclang-cpp.so.21.0git 0x00007ffff6a79928 clang::FrontendAction::Execute() + 40
22 libclang-cpp.so.21.0git 0x00007ffff69eec13 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 1475
23 libclang-cpp.so.21.0git 0x00007ffff6b1b11b clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 539
24 clang                   0x000000000041670a cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 7306
25 clang                   0x000000000040e8f3
26 libclang-cpp.so.21.0git 0x00007ffff65d9a99
27 libLLVM.so.21.0git      0x00007fffea668e93 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) + 35
28 libclang-cpp.so.21.0git 0x00007ffff65da545
29 libclang-cpp.so.21.0git 0x00007ffff659966c clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const + 172
30 libclang-cpp.so.21.0git 0x00007ffff659a652 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const + 146
31 libclang-cpp.so.21.0git 0x00007ffff65b017c clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) + 364
32 clang                   0x00000000004132f4 clang_main(int, char**, llvm::ToolContext const&) + 8308
33 clang                   0x000000000040e2b4 main + 100
34 libc.so.6               0x00007fffe8e2a1fe
35 libc.so.6               0x00007fffe8e2a2b9 __libc_start_main + 137
36 clang                   0x000000000040e315 _start + 37
clang: error: clang frontend command failed with exit code 139 (use -v to see invocation)
clang version 21.0.0git
Target: riscv32-unknown-none-elf
Thread model: posix
InstalledDir: /nix/store/4b17xsrdcyd2bnm91pdclfw2wz3g0ikg-clang-21.0.0-unstable-2025-03-23/bin
clang: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/nix-shell.VojIba/utf8_count-cefc8d.c
clang: note: diagnostic msg: /tmp/nix-shell.VojIba/utf8_count-cefc8d.sh
clang: note: diagnostic msg:

********************

Minimal Reproducible C Code:

The crash appears to be triggered by the utf8_count_SWAR_popc_bithack_autovec function.

Reproduce C code (test.c)
#include <limits.h>
#include <stdint.h>
#include <stddef.h>
#include <float.h>

typedef uint32_t ux;

static inline int upopcnt(ux x)
{

 x -= (x >> 1) & (-(ux)1/3);

 x = (x & (-(ux)1/15*3)) + ((x >> 2) & (-(ux)1/15*3));

 x = (x + (x >> 4)) & (-(ux)1/255*15);
 ({__asm volatile("" : "+r"(x) : "r"(x) : "memory");});


 x += (x >> 8);
 x += (x >> 16);
                     ;
 return x & 127;
}

size_t utf8_count_SWAR_popc_bithack_autovec(char const *str, size_t len) {
  ux const __attribute__((__may_alias__)) * u;
  size_t count = 0, tail = 0;
  uint8_t const *u8 = (uint8_t const *)str;
  if (len < sizeof *u) {
    tail = len;
    goto skip;
  }
  tail = sizeof *u - (uintptr_t)str % sizeof *u;
  len -= tail;
  while (tail--)
    count += (*u8++ & 0xC0) != 0x80, (void)0;
  u = (ux const *)u8;
  tail = len % sizeof *u;
  for (len /= sizeof *u; len--; ++u) {
    ux b1 = ~*u & (ux)0x80808080;
    ux b2 = *u & (ux)0x40404040;
    count += upopcnt((b1 >> 1) | b2);
    (void)0;
  }
  u8 = (uint8_t const *)u;
skip:
  while (tail--)
    count += (*u8++ & 0xC0) != 0x80, (void)0;
  return count;
}
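For readers unfamiliar with the bit tricks in the reproducer, the following standalone C++ sketch spells out the mask constants and what the function is meant to compute. It is an explanatory aid, not part of the bug report: `swarPopcount` and `countUtf8CodePoints` are hypothetical names, and the sketch omits the empty `asm volatile` barrier that the reproducer places inside `upopcnt`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// The division expressions in upopcnt are a compact way to write the usual
// SWAR masks for a 32-bit word.
constexpr std::uint32_t AllOnes = ~static_cast<std::uint32_t>(0);
static_assert(AllOnes / 3 == 0x55555555u, "pairs mask");
static_assert(AllOnes / 15 * 3 == 0x33333333u, "nibble-pair mask");
static_assert(AllOnes / 255 * 15 == 0x0F0F0F0Fu, "low-nibble mask");

// Same steps as the reproducer's upopcnt, without the asm barrier.
inline unsigned swarPopcount(std::uint32_t X) {
  X -= (X >> 1) & 0x55555555u;                      // 2-bit partial sums
  X = (X & 0x33333333u) + ((X >> 2) & 0x33333333u); // 4-bit partial sums
  X = (X + (X >> 4)) & 0x0F0F0F0Fu;                 // per-byte sums
  X += X >> 8;                                      // fold bytes together
  X += X >> 16;
  return X & 127u;
}

// Byte-at-a-time reference: a UTF-8 code point starts at every byte that is
// not a continuation byte (10xxxxxx), which is the test used in the
// reproducer's head and tail loops.
inline std::size_t countUtf8CodePoints(const std::string &S) {
  std::size_t Count = 0;
  for (char C : S)
    Count += (static_cast<unsigned char>(C) & 0xC0) != 0x80;
  return Count;
}
```

The word-at-a-time loop in the reproducer computes the same count four bytes at a time: `(b1 >> 1) | b2` sets one bit per non-continuation byte, and the popcount sums those bits.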

Compiler Flags Used:

-mabi=ilp32f -march=rv32imafc_xsfmm128t_zve32f_zvl2048b -mno-relax -static -nostartfiles -mcmodel=medany -fvisibility=hidden -fno-PIC -g -O3 -fno-rtti -fno-exceptions -fno-threadsafe-statics

@topperc
Collaborator Author

topperc commented May 16, 2025

@Avimitin can you please provide the 2 files indicated in the crash report

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang: note: diagnostic msg: /tmp/nix-shell.VojIba/utf8_count-cefc8d.c
clang: note: diagnostic msg: /tmp/nix-shell.VojIba/utf8_count-cefc8d.sh
clang: note: diagnostic msg:

@wangpc-pp
Contributor

Is the issue really related to this PR? Why would an MC change cause CodeGen errors? Is it because of an uncommon extension combination?

@Avimitin
Contributor

can you please provide the 2 files indicated in the crash report

Yes, here are the files (sorry, GitHub doesn't support uploading code files, so I had to change them to a .txt suffix):
test-ef7560.sh.txt
test-ef7560.c.txt

@Avimitin
Contributor

Avimitin commented May 16, 2025

Is the issue really related to this PR? Why would an MC change cause CodeGen errors? Is it because of an uncommon extension combination?

I've played around with the -march flag a bit more, and it doesn't seem to be the main culprit here. For example, even using -march=rv32gc_zve32f_xsfmm128t still leads to the same crash.

However, I've found that the optimization level seems to be the key factor. The crash happens when I use -O3 or -O2. But if I switch to -O1, -Og, or just don't specify an optimization level (which defaults to -O0), the code compiles successfully without any errors.

@topperc
Collaborator Author

topperc commented May 16, 2025

Is the issue really related to this PR? Why would an MC change cause CodeGen errors? Is it because of an uncommon extension combination?

I've played around with the -march flag a bit more, and it doesn't seem to be the main culprit here. For example, even using -march=rv32gc_zve32f_xsfmm128t still leads to the same crash.

However, I've found that the optimization level seems to be the key factor. The crash happens when I use -O3 or -O2. But if I switch to -O1, -Og, or just don't specify an optimization level (which defaults to -O0), the code compiles successfully without any errors.

I was able to reproduce the failure using the commit from when this PR was created. The later merge seems to have hidden it. The failure still reproduces on trunk with the same IR file though.

The problem is in combineToVCPOP when Zve32* is enabled but not Zve64* or V*. I have a patch I'll post in a few minutes.

Avimitin added a commit to chipsalliance/t1 that referenced this pull request May 17, 2025
Avimitin added a commit to chipsalliance/t1 that referenced this pull request May 18, 2025
Labels
backend:RISC-V clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category mc Machine (object) code
7 participants