Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[TTI] Provide a cost for memset_pattern which matches the libcall #139978

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
Loading
from

Conversation

preames
Copy link
Collaborator

@preames preames commented May 14, 2025

The motivation is that differences in unrolling were noticed when trying to switch from the libcall to the intrinsic. There are likely also differences not yet noticed in other cost based decisions - such as inlining, and possibly vectorization.

Neither cost is a good, well considered, cost but for the moment, let's have them be equal to simplify migration. We can come back and refine this once we have it being exercised by default.

preames added 2 commits May 14, 2025 16:08
The motivation is that differences in unrolling were noticed when
trying to swich from the libcall to the intrinsic.  Neither cost
is a good, well considered cost, but for the moment, let's have
them be equal to simplify migration.
@llvmbot
Copy link
Member

llvmbot commented May 14, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-analysis

Author: Philip Reames (preames)

Changes

The motivation is that differences in unrolling were noticed when trying to switch from the libcall to the intrinsic. There are likely also differences not yet noticed in other cost based decisions - such as inlining, and possibly vectorization.

Neither cost is a good, well considered, cost but for the moment, let's have them be equal to simplify migration. We can come back and refine this once we have it being exercised by default.


Full diff: https://github.com/llvm/llvm-project/pull/139978.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+3)
  • (added) llvm/test/Analysis/CostModel/X86/memset-pattern.ll (+40)
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index ff8778168686d..449e9e8f85561 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -2408,6 +2408,9 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
                                           CmpInst::ICMP_ULT, CostKind);
       return Cost;
     }
+    case Intrinsic::experimental_memset_pattern:
+      // This cost is set to match the cost of the memset_pattern16 libcall
+      return TTI::TCC_Basic * 4;
     case Intrinsic::abs:
       ISD = ISD::ABS;
       break;
diff --git a/llvm/test/Analysis/CostModel/X86/memset-pattern.ll b/llvm/test/Analysis/CostModel/X86/memset-pattern.ll
new file mode 100644
index 0000000000000..aa0c6efdf34fa
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/X86/memset-pattern.ll
@@ -0,0 +1,40 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -mtriple=x86_64-apple-darwin10.0.0 -passes="print<cost-model>" 2>&1 -disable-output | FileCheck %s
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
+
+target triple = "x86_64-apple-darwin10.0.0"
+
+@.memset_pattern = private unnamed_addr constant [4 x i32] [i32 2, i32 2, i32 2, i32 2], align 16
+
+define void @via_libcall(ptr %p) nounwind ssp {
+; CHECK-LABEL: 'via_libcall'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @memset_pattern4(ptr %p, ptr @.memset_pattern, i64 200)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @memset_pattern8(ptr %p, ptr @.memset_pattern, i64 200)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @memset_pattern16(ptr %p, ptr @.memset_pattern, i64 200)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  call void @memset_pattern4(ptr %p, ptr @.memset_pattern, i64 200)
+  call void @memset_pattern8(ptr %p, ptr @.memset_pattern, i64 200)
+  call void @memset_pattern16(ptr %p, ptr @.memset_pattern, i64 200)
+  ret void
+}
+
+declare void @memset_pattern4(ptr, ptr, i64)
+declare void @memset_pattern8(ptr, ptr, i64)
+declare void @memset_pattern16(ptr, ptr, i64)
+
+define void @via_intrinsic(ptr %p) {
+; CHECK-LABEL: 'via_intrinsic'
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.memset.pattern.p0.i16.i64(ptr align 4 %p, i16 2, i64 100, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.memset.pattern.p0.i32.i64(ptr align 4 %p, i32 2, i64 50, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.memset.pattern.p0.i64.i64(ptr align 4 %p, i64 2, i64 25, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.memset.pattern.p0.i128.i64(ptr align 4 %p, i128 2, i64 12, i1 false)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+  call void @llvm.experimental.memset.pattern(ptr align 4 %p, i16 2, i64 100, i1 false)
+  call void @llvm.experimental.memset.pattern(ptr align 4 %p, i32 2, i64 50, i1 false)
+  call void @llvm.experimental.memset.pattern(ptr align 4 %p, i64 2, i64 25, i1 false)
+  call void @llvm.experimental.memset.pattern(ptr align 4 %p, i128 2, i64 12, i1 false)
+  ret void
+}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.