[MLIR][XeGPU] Add unroll patterns and blocking pass for XeGPU (1/N) #137010

chencha3 · Apr 23, 2025

Similar to vector ops, XeGPU ops need to be unrolled into smaller shapes such that they can be dispatched into a hardware instruction. This PR marks the initial phase of a series dedicated to incorporating unroll patterns for XeGPU operations. In this installment, we introduce patterns for the following operations:

createNd
updateNd
prefetchNd
loadNd
storeNd
dpas
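
To make "unrolled into smaller shapes" concrete, here is a minimal standalone sketch of the underlying index arithmetic: splitting a large shape into equally sized blocks and listing where each block starts. The helper name `tileOffsets` and the example shapes are illustrative assumptions and are not taken from this PR; the actual patterns work on XeGPU ops and types, not on plain vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of MLIR or this PR): enumerate the starting
// offsets of every tile of shape `tile` inside `shape`, in row-major order.
// Assumes equal rank and that each dimension divides evenly.
std::vector<std::vector<int64_t>>
tileOffsets(const std::vector<int64_t> &shape,
            const std::vector<int64_t> &tile) {
  assert(shape.size() == tile.size() && "rank mismatch");
  int64_t numTiles = 1;
  for (std::size_t d = 0; d < shape.size(); ++d) {
    assert(tile[d] > 0 && shape[d] % tile[d] == 0 && "uneven tiling");
    numTiles *= shape[d] / tile[d];
  }
  std::vector<std::vector<int64_t>> result;
  for (int64_t linear = 0; linear < numTiles; ++linear) {
    std::vector<int64_t> offsets(shape.size(), 0);
    int64_t rest = linear;
    // Delinearize from the innermost (last) dimension outwards.
    for (std::size_t d = shape.size(); d-- > 0;) {
      int64_t count = shape[d] / tile[d];
      offsets[d] = (rest % count) * tile[d];
      rest /= count;
    }
    result.push_back(offsets);
  }
  return result;
}
```

For example, `tileOffsets({24, 32}, {8, 16})` yields six offsets, starting with `{0, 0}`, `{0, 16}`, `{8, 0}`; an unrolled op would be emitted once per offset with the smaller shape.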

github-actions · Apr 23, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Garra1980 · Apr 30, 2025

This PR marks the initial phase of a series dedicated to incorporating unroll patterns for XeGPU operations.

Can you please add some justification/explanation regarding those unroll patterns?

chencha3 · May 7, 2025

Added a few minor comments. On the monkey-level this looks good to me. You might want to capitalize the first words in paragraphs and enums/bullet lists within comments.

Thanks @fschlimb, I made the changes according to your feedback. I hope I addressed all of your concerns.

charithaintc

LGTM

adam-smnk

It'll be interesting to later see if we could generalize and reuse vector unrolling to achieve the same. For now, I think it's a good addition to xegpu infrastructure and we'll see in practice how it holds up.

I take it depends on #138701?

mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp

mlir/test/lib/Dialect/XeGPU/CMakeLists.txt

chencha3 · May 8, 2025

It'll be interesting to later see if we could generalize and reuse vector unrolling to achieve the same. For now, I think it's a good addition to xegpu infrastructure and we'll see in practice how it holds up.

I take it depends on #138701?

Yeah, these patterns are meant to be companions to the vector unrolling patterns. They share the same idea: one set is for XeGPU ops only, and the other is for vector ops. A pass is supposed to use both of them.

Garra1980 · May 8, 2025

mlir/lib/Dialect/XeGPU/Transforms/XeGPUUnroll.cpp

+  }
+
+private:
+  const char *const packAttrName = "__xetile_blocking_pack__";


xetile->xegpu I guess, here and in tests

good catch. fixed it

Garra1980 · May 8, 2025

mlir/include/mlir/Dialect/XeGPU/Transforms/Transforms.h

+/// provide a way to customize the native shape of the operation.
+struct UnrollOptions {
+  using FilterConstraintFnType = std::function<LogicalResult(Operation *op)>;
+  /// Callback function that indicates whether vector unrolling should be


nit: let's place this comment above "using" to have uniform look :)

chencha3 · May 9, 2025

It'll be interesting to later see if we could generalize and reuse vector unrolling to achieve the same. For now, I think it's a good addition to xegpu infrastructure and we'll see in practice how it holds up.

I take it depends on #138701?

#138701 has been merged, and this PR is rebased too.

chencha3 · May 9, 2025

Hi @fschlimb and @adam-smnk, do you have more suggestions?

fschlimb · May 9, 2025

Hi @fschlimb and @adam-smnk, do you have more suggestions?

no, LGTM!

chencha3 · May 12, 2025

Hi @adam-smnk, I am going to merge this first. If you have more suggestions, feel free to let me know. Thanks for your help!

mgorny · May 17, 2025

I'm seeing a regression on 32-bit platforms from this change:

FAIL: MLIR :: Dialect/XeGPU/xegpu-unroll-patterns.mlir (1488 of 2894)
******************** TEST 'MLIR :: Dialect/XeGPU/xegpu-unroll-patterns.mlir' FAILED ********************
Exit Code: 2

Command Output (stdout):
--
# RUN: at line 1
/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/mlir-opt --test-xegpu-unrolling-patterns -split-input-file /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir | /usr/lib/llvm/21/bin/FileCheck /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
# executed command: /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/mlir-opt --test-xegpu-unrolling-patterns -split-input-file /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
# .---command stderr------------
# | mlir-opt: /usr/lib/llvm/21/include/llvm/ADT/SmallVector.h:291: T& llvm::SmallVectorTemplateCommon<T, <template-parameter-1-2> >::operator[](size_type) [with T = long long int; <template-parameter-1-2> = void; reference = long long int&; size_type = unsigned int]: Assertion `idx < size()' failed.
# | PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
# | Stack dump:
# | 0.  Program arguments: /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/mlir-opt --test-xegpu-unrolling-patterns -split-input-file /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
# |  #0 0xffffffffe5e86ef0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm/21/lib/libLLVM.so.21.0gitaaaae996+0xc45ef0)
# |  #1 0xffffffffe5e8747f (/usr/lib/llvm/21/lib/libLLVM.so.21.0gitaaaae996+0xc4647f)
# |  #2 0xffffffffe5e84095 llvm::sys::RunSignalHandlers() (/usr/lib/llvm/21/lib/libLLVM.so.21.0gitaaaae996+0xc43095)
# |  #3 0xffffffffe5e8422b (/usr/lib/llvm/21/lib/libLLVM.so.21.0gitaaaae996+0xc4322b)
# |  #4 0xfffffffff7f8d5a0 (linux-gate.so.1+0x5a0)
# |  #5 0xfffffffff7f8d579 (linux-gate.so.1+0x579)
# |  #6 0xffffffffe4de7d07 (/usr/lib/libc.so.6+0x93d07)
# |  #7 0xffffffffe4d8c581 raise (/usr/lib/libc.so.6+0x38581)
# |  #8 0xffffffffe4d732d8 abort (/usr/lib/libc.so.6+0x1f2d8)
# |  #9 0xffffffffe4d731de (/usr/lib/libc.so.6+0x1f1de)
# | #10 0xffffffffe4d8450b (/usr/lib/libc.so.6+0x3050b)
# | #11 0xfffffffff58af00d mlir::computeSuffixProduct(llvm::ArrayRef<long long>) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x4c3200d)
# | #12 0xfffffffff58b176c mlir::detail::TileOffsetRangeImpl::TileOffsetRangeImpl(llvm::ArrayRef<long long>, llvm::ArrayRef<long long>, llvm::ArrayRef<long long>) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x4c3476c)
# | #13 0xfffffffff2db29fc mlir::StaticTileOffsetRange::StaticTileOffsetRange(llvm::ArrayRef<long long>, llvm::ArrayRef<long long>) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x21359fc)
# | #14 0xfffffffff5b7c52a (anonymous namespace)::UnrollPattern<mlir::xegpu::StoreNdOp>::pack(mlir::Value, mlir::TypeRange, llvm::ArrayRef<long long>, mlir::Location, mlir::PatternRewriter&) const XeGPUUnroll.cpp:0:0
# | #15 0xfffffffff5b7ef4e (anonymous namespace)::UnrollStoreNdOp::matchAndRewrite(mlir::xegpu::StoreNdOp, mlir::PatternRewriter&) const XeGPUUnroll.cpp:0:0
# | #16 0xfffffffff5b7b2ee mlir::detail::OpOrInterfaceRewritePatternBase<mlir::xegpu::StoreNdOp>::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&) const (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x4efe2ee)
# | #17 0xfffffffff5da80ba mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>)::'lambda'()::operator()() const PatternApplicator.cpp:0:0
# | #18 0xfffffffff5da8f4e mlir::PatternApplicator::matchAndRewrite(mlir::Operation*, mlir::PatternRewriter&, llvm::function_ref<bool (mlir::Pattern const&)>, llvm::function_ref<void (mlir::Pattern const&)>, llvm::function_ref<llvm::LogicalResult (mlir::Pattern const&)>) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x512bf4e)
# | #19 0xfffffffff61ade63 (anonymous namespace)::GreedyPatternRewriteDriver::processWorklist() GreedyPatternRewriteDriver.cpp:0:0
# | #20 0xfffffffff61b029e mlir::applyPatternsGreedily(mlir::Region&, mlir::FrozenRewritePatternSet const&, mlir::GreedyRewriteConfig, bool*) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x553329e)
# | #21 0x5700059d (anonymous namespace)::TestXeGPUUnrollingPatterns::runOnOperation() TestXeGPUTransforms.cpp:0:0
# | #22 0xfffffffff5d6d371 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x50f0371)
# | #23 0xfffffffff5d6d8b4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x50f08b4)
# | #24 0xfffffffff5d6dd8d mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::'lambda'(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&)::operator()(mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool)::OpPMInfo&) const Pass.cpp:0:0
# | #25 0xfffffffff5d6c3a4 mlir::detail::OpToOpPassAdaptor::runOnOperationAsyncImpl(bool) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x50ef3a4)
# | #26 0xfffffffff5d6d094 mlir::detail::OpToOpPassAdaptor::run(mlir::Pass*, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x50f0094)
# | #27 0xfffffffff5d6d8b4 mlir::detail::OpToOpPassAdaptor::runPipeline(mlir::OpPassManager&, mlir::Operation*, mlir::AnalysisManager, bool, unsigned int, mlir::PassInstrumentor*, mlir::PassInstrumentation::PipelineParentInfo const*) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x50f08b4)
# | #28 0xfffffffff5d6ed93 mlir::PassManager::run(mlir::Operation*) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x50f1d93)
# | #29 0xfffffffff613a8e6 performActions(llvm::raw_ostream&, std::shared_ptr<llvm::SourceMgr> const&, mlir::MLIRContext*, mlir::MlirOptMainConfig const&) MlirOptMain.cpp:0:0
# | #30 0xfffffffff613b1d0 llvm::LogicalResult llvm::function_ref<llvm::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>::callback_fn<mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, mlir::DialectRegistry&, mlir::MlirOptMainConfig const&)::'lambda'(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>(int, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&) MlirOptMain.cpp:0:0
# | #31 0xfffffffff5df3ceb mlir::splitAndProcessBuffer(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<llvm::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>, llvm::raw_ostream&, llvm::StringRef, llvm::StringRef)::'lambda'(llvm::StringRef)::operator()(llvm::StringRef) const ToolUtilities.cpp:0:0
# | #32 0xfffffffff5df4425 mlir::splitAndProcessBuffer(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::function_ref<llvm::LogicalResult (std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, llvm::raw_ostream&)>, llvm::raw_ostream&, llvm::StringRef, llvm::StringRef) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x5177425)
# | #33 0xfffffffff6132e39 mlir::MlirOptMain(llvm::raw_ostream&, std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer>>, mlir::DialectRegistry&, mlir::MlirOptMainConfig const&) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x54b5e39)
# | #34 0xfffffffff613b516 mlir::MlirOptMain(int, char**, llvm::StringRef, llvm::StringRef, mlir::DialectRegistry&) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x54be516)
# | #35 0xfffffffff613ba51 mlir::MlirOptMain(int, char**, llvm::StringRef, mlir::DialectRegistry&) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x54bea51)
# | #36 0x568ff7ac main (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/mlir-opt+0x3237ac)
# | #37 0xffffffffe4d74f83 (/usr/lib/libc.so.6+0x20f83)
# | #38 0xffffffffe4d75048 __libc_start_main (/usr/lib/libc.so.6+0x21048)
# | #39 0x568fff97 _start (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/mlir-opt+0x323f97)
# `-----------------------------
# error: command failed with exit status: -6
# executed command: /usr/lib/llvm/21/bin/FileCheck /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  /usr/lib/llvm/21/bin/FileCheck /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
# `-----------------------------
# error: command failed with exit status: 2

--

********************
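
The thread does not identify a root cause at this point. The trace does show the `idx < size()` assertion firing below `mlir::computeSuffixProduct` with `size_type = unsigned int`. As a purely illustrative sketch, and an assumption rather than a diagnosis, the snippet below shows one generic way an index computation can behave differently depending on the width of the unsigned size type, alongside a suffix product written with signed arithmetic only. The names `lastButOneIndex` and `suffixProduct` are hypothetical and are not MLIR code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a "size() - 2" loop bound, parameterized on the
// width of the unsigned size type. With a 32-bit SizeT the subtraction wraps
// to 4294967295 and stays positive when widened to int64_t; with a 64-bit
// SizeT it wraps to 2^64 - 1, which converts to -1.
template <typename SizeT>
int64_t lastButOneIndex(SizeT size) {
  return static_cast<int64_t>(size - 2);
}

// Suffix product computed with a signed loop index, so rank-0 and rank-1
// inputs never produce an out-of-range index.
// For sizes [a, b, c] the result is [b*c, c, 1].
std::vector<int64_t> suffixProduct(const std::vector<int64_t> &sizes) {
  for (int64_t s : sizes)
    assert(s >= 0 && "sizes must be non-negative");
  std::vector<int64_t> strides(sizes.size(), 1);
  for (int64_t r = static_cast<int64_t>(sizes.size()) - 2; r >= 0; --r)
    strides[r] = strides[r + 1] * sizes[r + 1];
  return strides;
}
```

With a rank-1 input, `lastButOneIndex<uint32_t>(1)` is a large positive number while `lastButOneIndex<uint64_t>(1)` is `-1`, so a loop guarded by `r >= 0` would only be skipped in the 64-bit case. Whether anything like this is involved in the failure above would need to be confirmed by reproducing it.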

Jianhui-Li · May 17, 2025

mlir/lib/Dialect/XeGPU/Transforms/XeGPUUnroll.cpp

+      auto shape = vecTy.getShape();
+      SmallVector<Value> results;
+      for (SmallVector<int64_t> offsets :
+           StaticTileOffsetRange(shape, blockSize)) {


This PR does not seem to handle the xegpu.order attribute at this stage. load_nd on PVC supports loading with transpose, which takes order as [0,1].
StaticTileOffsetRange() can take an extra loopOrder parameter, which could be used to support this feature.

This PR does not seem to handle the xegpu.order attribute at this stage. load_nd on PVC supports loading with transpose, which takes order as [0,1]. StaticTileOffsetRange() can take an extra loopOrder parameter, which could be used to support this feature.

I feel this belongs to the handling of the transpose op, not the handling of the xegpu.order attribute.
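
To illustrate what a loop-order parameter changes, here is a minimal standalone sketch that enumerates the same tile offsets in different traversal orders. It is not MLIR's `StaticTileOffsetRange`. The helper name `tileOffsetsInOrder` is hypothetical, and the convention that `loopOrder` lists dimensions from the outermost (slowest) to the innermost (fastest) loop is an assumption made for this example. It takes no position on whether this belongs with the order attribute or with a transpose op.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: enumerate tile offsets of `tile` within `shape`.
// `loopOrder` names the dimensions from outermost to innermost loop, so the
// last entry is the dimension that varies fastest. Assumes even tiling.
std::vector<std::vector<int64_t>>
tileOffsetsInOrder(const std::vector<int64_t> &shape,
                   const std::vector<int64_t> &tile,
                   const std::vector<std::size_t> &loopOrder) {
  assert(shape.size() == tile.size() && shape.size() == loopOrder.size());
  int64_t numTiles = 1;
  for (std::size_t d = 0; d < shape.size(); ++d) {
    assert(tile[d] > 0 && shape[d] % tile[d] == 0 && "uneven tiling");
    numTiles *= shape[d] / tile[d];
  }
  std::vector<std::vector<int64_t>> result;
  for (int64_t linear = 0; linear < numTiles; ++linear) {
    std::vector<int64_t> offsets(shape.size(), 0);
    int64_t rest = linear;
    // Peel off the fastest-varying loop first: walk loopOrder backwards.
    for (std::size_t i = loopOrder.size(); i-- > 0;) {
      std::size_t d = loopOrder[i];
      int64_t count = shape[d] / tile[d];
      offsets[d] = (rest % count) * tile[d];
      rest /= count;
    }
    result.push_back(offsets);
  }
  return result;
}
```

For a 16x32 shape with 8x16 tiles, `loopOrder = {0, 1}` visits `{0, 0}`, `{0, 16}`, `{8, 0}`, `{8, 16}`, while `loopOrder = {1, 0}` visits `{0, 0}`, `{8, 0}`, `{0, 16}`, `{8, 16}`. The set of tiles is identical and only the visiting order differs.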

chencha3 · May 19, 2025

I'm seeing a regression on 32-bit platforms from this change:

Hi @mgorny, could you share your build instructions so I can reproduce it on my local machine?

chencha3 · May 19, 2025

I'm seeing a regression on 32-bit platforms from this change:

# | #35 0xfffffffff613ba51 mlir::MlirOptMain(int, char**, llvm::StringRef, mlir::DialectRegistry&) (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/../lib/libMLIR.so.21.0gitaaaae996+0x54bea51)
# | #36 0x568ff7ac main (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/mlir-opt+0x3237ac)
# | #37 0xffffffffe4d74f83 (/usr/lib/libc.so.6+0x20f83)
# | #38 0xffffffffe4d75048 __libc_start_main (/usr/lib/libc.so.6+0x21048)
# | #39 0x568fff97 _start (/var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir_build-abi_x86_32.x86/bin/mlir-opt+0x323f97)
# `-----------------------------
# error: command failed with exit status: -6
# executed command: /usr/lib/llvm/21/bin/FileCheck /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
# .---command stderr------------
# | FileCheck error: '<stdin>' is empty.
# | FileCheck command line:  /usr/lib/llvm/21/bin/FileCheck /var/tmp/portage/llvm-core/mlir-21.0.0.9999/work/mlir/test/Dialect/XeGPU/xegpu-unroll-patterns.mlir
# `-----------------------------
# error: command failed with exit status: 2

--

********************

Hi @mgorny, it should be fixed here: #140567

mgorny · May 19, 2025

A relatively simple reproducer would be:

CC=i686-pc-linux-gnu-gcc CXX=i686-pc-linux-gnu-g++ cmake ../llvm -G Ninja -DLLVM_CCACHE_BUILD=ON -DLLVM_ENABLE_PROJECTS='llvm;mlir' -DCMAKE_BUILD_TYPE=MinSizeRel
ninja
ninja check-mlir-dialect-xegpu

For us, i686-pc-linux-gnu-gcc is a trivial wrapper:

#!/bin/sh
exec x86_64-pc-linux-gnu-gcc -m32 -mfpmath=sse "${@}"

Though I suppose an i686-pc-linux-gnu-clang symlink would work too.
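
For anyone reproducing this, the wrapper above can be generated for both compilers with a short script. This is a minimal sketch and not part of the original report: `WRAPPER_DIR` is a hypothetical variable, and the `g++` wrapper is an assumption that mirrors the `gcc` wrapper shown above (the thread only shows the `gcc` one). The script only writes the wrapper files; it does not invoke any compiler.

```shell
#!/bin/sh
# Sketch: write i686-pc-linux-gnu-{gcc,g++} wrappers into a scratch directory.
# WRAPPER_DIR is a hypothetical name; defaults to a fresh temporary directory.
set -eu
WRAPPER_DIR="${WRAPPER_DIR:-$(mktemp -d)}"

for tool in gcc g++; do
  wrapper="$WRAPPER_DIR/i686-pc-linux-gnu-$tool"
  # Unquoted heredoc delimiter: $tool expands now, \$@ is kept for run time.
  # Assumption: the g++ wrapper takes the same flags as the gcc wrapper.
  cat > "$wrapper" <<EOF
#!/bin/sh
exec x86_64-pc-linux-gnu-$tool -m32 -mfpmath=sse "\$@"
EOF
  chmod +x "$wrapper"
done

echo "wrappers written to $WRAPPER_DIR"
```

With `$WRAPPER_DIR` placed on `PATH`, the `CC=i686-pc-linux-gnu-gcc CXX=i686-pc-linux-gnu-g++` settings in the cmake command above should resolve to these wrappers, provided a multilib-capable x86_64 toolchain is installed.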

chencha3 · May 19, 2025

Thanks, it should be fixed by #140567; please take a look.

chencha3 added 6 commits April 17, 2025 17:54

init

7d332da

Merge branch 'main' into xegpu_unroll_patterns

d4549ad

Merge branch 'main' into xegpu_unroll_patterns

cdd5059

add patterns for createNdOp and StoreNdOp

47f9b3d

refine nativeShapeFn

932747e

refine verifier for TensorDescType

f843d98

chencha3 added 5 commits April 23, 2025 18:29

add loadNd pattern

c6bdd3c

add test pass

1d4dc72

format code

545f937

add unit test

008dbc7

clean up

d077cb0

chencha3 mentioned this pull request Apr 24, 2025

[mlir][xegpu] SIMT distribution patterns for XeGPU CreateNdTdesc, LoadNd, StoreNd and Dpas Ops. #135271

Merged

chencha3 added 8 commits April 28, 2025 18:53

stage

0193a04

Merge branch 'main' into xegpu_unroll_patterns

7f8b00a

add dpas pattern and unit test

456465e

refactor

906d699

fix format

c63a496

fix format

e2ed1ac

refine

35b35f0

refine

6fef430

chencha3 marked this pull request as ready for review April 30, 2025 16:16

chencha3 changed the title ~~[MLIR][XeGPU] Add unroll pass for XeGPU~~ [MLIR][XeGPU] Add unroll patterns for XeGPU (1/N) Apr 30, 2025

chencha3 added 3 commits April 30, 2025 17:53

cleanup and add patterns for rest nd ops

9d24920

fix format

1a92661

cleanup

0126eb9

chencha3 requested review from adam-smnk, charithaintc and fschlimb May 5, 2025 16:04

clean up

b55f43b

charithaintc approved these changes May 8, 2025

adam-smnk reviewed May 8, 2025

mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp (outdated)

mlir/test/lib/Dialect/XeGPU/CMakeLists.txt (outdated)

move getUnrolledTypes out

383bd1d

chencha3 added 5 commits May 8, 2025 18:26

addressed comments

4fc35cf

address comments

536a610

fix format

39ca440

Merge branch 'main' into xegpu_unroll_patterns

09cec0b

sync

1d3d12c

Garra1980 reviewed May 8, 2025

chencha3 added 2 commits May 8, 2025 21:36

address comments

96cb62b

Merge branch 'main' into xegpu_unroll_patterns

163204a

update cmake

1caac76

chencha3 merged commit db42345 into main May 12, 2025
11 checks passed

chencha3 deleted the users/chencha3/xegpu/xegpu_unroll_patterns branch May 12, 2025 14:16

chencha3 changed the title ~~[MLIR][XeGPU] Add unroll patterns for XeGPU (1/N)~~ [MLIR][XeGPU] Add unroll patterns and blocking pass for XeGPU (1/N) May 16, 2025

Jianhui-Li reviewed May 17, 2025
