Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[llvm] Introduce callee_type metadata #87573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 31 commits into
base: users/Prabhuk/sprmain.clangcallgraphsection-add-type-id-metadata-to-indirect-call-and-targets
Choose a base branch
Loading
from

Conversation

Prabhuk
Copy link
Contributor

@Prabhuk Prabhuk commented Apr 3, 2024

Introduce callee_type metadata which will be attached to the indirect
call instructions.

The callee_type metadata will be used to generate .callgraph section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Created using spr 1.3.6-beta.1
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. llvm:SelectionDAG SelectionDAGISel as well llvm:ir labels Apr 3, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 3, 2024

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-llvm-ir

Author: Prabhuk (Prabhuk)

Changes

Create and add generalized type identifier metadata to indirect calls,
and to functions that may be target to indirect calls.

Type identifiers will be used by the back-end to construct the call
graph section to precisely represent the possible targets for indirect calls.
The type information is deliberately pulled from the front-end to have extra
precision since some type information is lost at IR, and to ensure
consistent type identifiers between object files compiled at different
times (as C/C++ standards require language-level types to match).

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html


Patch is 29.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87573.diff

16 Files Affected:

  • (modified) clang/lib/CodeGen/CGCall.cpp (+22)
  • (modified) clang/lib/CodeGen/CGClass.cpp (+8-1)
  • (modified) clang/lib/CodeGen/CGExpr.cpp (+5)
  • (modified) clang/lib/CodeGen/CGExprCXX.cpp (+29-5)
  • (modified) clang/lib/CodeGen/CGObjCMac.cpp (+8)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+29-5)
  • (modified) clang/lib/CodeGen/CodeGenModule.h (+4)
  • (added) clang/test/CodeGen/call-graph-section-1.cpp (+110)
  • (added) clang/test/CodeGen/call-graph-section-2.cpp (+95)
  • (added) clang/test/CodeGen/call-graph-section-3.cpp (+52)
  • (added) clang/test/CodeGen/call-graph-section.c (+85)
  • (modified) llvm/include/llvm/IR/LLVMContext.h (+1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+6-4)
  • (modified) llvm/lib/IR/LLVMContext.cpp (+5)
  • (modified) llvm/lib/IR/Verifier.cpp (+5-2)
  • (modified) llvm/test/Verifier/operand-bundles.ll (+13)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index f12765b826935b..51133b0a9d47cb 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5681,6 +5681,28 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   AllocAlignAttrEmitter AllocAlignAttrEmitter(*this, TargetDecl, CallArgs);
   Attrs = AllocAlignAttrEmitter.TryEmitAsCallSiteAttribute(Attrs);
 
+  if (CGM.getCodeGenOpts().CallGraphSection) {
+    // FIXME: create operand bundle only for indirect calls, not for all
+
+    assert((TargetDecl && TargetDecl->getFunctionType() ||
+            Callee.getAbstractInfo().getCalleeFunctionProtoType()) &&
+           "cannot find callsite type");
+
+    QualType CST;
+    if (TargetDecl && TargetDecl->getFunctionType())
+      CST = QualType(TargetDecl->getFunctionType(), 0);
+    else if (const auto *FPT =
+                 Callee.getAbstractInfo().getCalleeFunctionProtoType())
+      CST = QualType(FPT, 0);
+
+    if (!CST.isNull()) {
+      auto *TypeIdMD = CGM.CreateMetadataIdentifierGeneralized(CST);
+      auto *TypeIdMDVal =
+          llvm::MetadataAsValue::get(getLLVMContext(), TypeIdMD);
+      BundleList.emplace_back("type", TypeIdMDVal);
+    }
+  }
+
   // Emit the actual call/invoke instruction.
   llvm::CallBase *CI;
   if (!InvokeDest) {
diff --git a/clang/lib/CodeGen/CGClass.cpp b/clang/lib/CodeGen/CGClass.cpp
index 8c1c8ee455d2e6..2c347bd4359265 100644
--- a/clang/lib/CodeGen/CGClass.cpp
+++ b/clang/lib/CodeGen/CGClass.cpp
@@ -2247,7 +2247,14 @@ void CodeGenFunction::EmitCXXConstructorCall(const CXXConstructorDecl *D,
   const CGFunctionInfo &Info = CGM.getTypes().arrangeCXXConstructorCall(
       Args, D, Type, ExtraArgs.Prefix, ExtraArgs.Suffix, PassPrototypeArgs);
   CGCallee Callee = CGCallee::forDirect(CalleePtr, GlobalDecl(D, Type));
-  EmitCall(Info, Callee, ReturnValueSlot(), Args, nullptr, false, Loc);
+  llvm::CallBase *CallOrInvoke = nullptr;
+  EmitCall(Info, Callee, ReturnValueSlot(), Args, &CallOrInvoke, false, Loc);
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(D->getType(), CallOrInvoke);
+
 
   // Generate vtable assumptions if we're constructing a complete object
   // with a vtable.  We don't do this for base subobjects for two reasons:
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 54432353e7420d..467a7cfe50e790 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -5953,6 +5953,11 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType, const CGCallee &OrigCallee
     }
   }
 
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(QualType(FnType, 0), CallOrInvoke);
+
   return Call;
 }
 
diff --git a/clang/lib/CodeGen/CGExprCXX.cpp b/clang/lib/CodeGen/CGExprCXX.cpp
index a4fb673284ceca..4cbffd7073867f 100644
--- a/clang/lib/CodeGen/CGExprCXX.cpp
+++ b/clang/lib/CodeGen/CGExprCXX.cpp
@@ -93,9 +93,17 @@ RValue CodeGenFunction::EmitCXXMemberOrOperatorCall(
       *this, MD, This, ImplicitParam, ImplicitParamTy, CE, Args, RtlArgs);
   auto &FnInfo = CGM.getTypes().arrangeCXXMethodCall(
       Args, FPT, CallInfo.ReqArgs, CallInfo.PrefixSize);
-  return EmitCall(FnInfo, Callee, ReturnValue, Args, nullptr,
+  llvm::CallBase *CallOrInvoke = nullptr;        
+  auto Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &CallOrInvoke,
                   CE && CE == MustTailCall,
                   CE ? CE->getExprLoc() : SourceLocation());
+  
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(MD->getType(), CallOrInvoke);
+
+  return Call;                    
 }
 
 RValue CodeGenFunction::EmitCXXDestructorCall(
@@ -119,9 +127,17 @@ RValue CodeGenFunction::EmitCXXDestructorCall(
   CallArgList Args;
   commonEmitCXXMemberOrOperatorCall(*this, Dtor, This, ImplicitParam,
                                     ImplicitParamTy, CE, Args, nullptr);
-  return EmitCall(CGM.getTypes().arrangeCXXStructorDeclaration(Dtor), Callee,
-                  ReturnValueSlot(), Args, nullptr, CE && CE == MustTailCall,
+  llvm::CallBase *CallOrInvoke = nullptr;
+  auto Call = EmitCall(CGM.getTypes().arrangeCXXStructorDeclaration(Dtor), Callee,
+                  ReturnValueSlot(), Args, &CallOrInvoke, CE && CE == MustTailCall,
                   CE ? CE->getExprLoc() : SourceLocation{});
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(DtorDecl->getType(), CallOrInvoke);
+
+  return Call;
 }
 
 RValue CodeGenFunction::EmitCXXPseudoDestructorExpr(
@@ -482,10 +498,18 @@ CodeGenFunction::EmitCXXMemberPointerCallExpr(const CXXMemberCallExpr *E,
 
   // And the rest of the call args
   EmitCallArgs(Args, FPT, E->arguments());
-  return EmitCall(CGM.getTypes().arrangeCXXMethodCall(Args, FPT, required,
+  llvm::CallBase *CallOrInvoke = nullptr;
+  auto Call = EmitCall(CGM.getTypes().arrangeCXXMethodCall(Args, FPT, required,
                                                       /*PrefixSize=*/0),
-                  Callee, ReturnValue, Args, nullptr, E == MustTailCall,
+                  Callee, ReturnValue, Args, &CallOrInvoke, E == MustTailCall,
                   E->getExprLoc());
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(QualType(FPT, 0), CallOrInvoke);
+
+  return Call;                  
 }
 
 RValue
diff --git a/clang/lib/CodeGen/CGObjCMac.cpp b/clang/lib/CodeGen/CGObjCMac.cpp
index 8a599c10e1caf1..291f5aff89ec22 100644
--- a/clang/lib/CodeGen/CGObjCMac.cpp
+++ b/clang/lib/CodeGen/CGObjCMac.cpp
@@ -2220,6 +2220,14 @@ CGObjCCommonMac::EmitMessageSend(CodeGen::CodeGenFunction &CGF,
   RValue rvalue = CGF.EmitCall(MSI.CallInfo, Callee, Return, ActualArgs,
                                &CallSite);
 
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && Method && CallSite &&
+      CallSite->isIndirectCall()) {
+    if (const auto *FnType = Method->getFunctionType()) {
+      CGM.CreateFunctionTypeMetadataForIcall(QualType(FnType, 0), CallSite);
+    }
+  }
+
   // Mark the call as noreturn if the method is marked noreturn and the
   // receiver cannot be null.
   if (Method && Method->hasAttr<NoReturnAttr>() && !ReceiverCanBeNull) {
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 00b3bfcaa0bc25..e19bbee996f582 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2486,8 +2486,8 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
 
   // In the cross-dso CFI mode with canonical jump tables, we want !type
   // attributes on definitions only.
-  if (CodeGenOpts.SanitizeCfiCrossDso &&
-      CodeGenOpts.SanitizeCfiCanonicalJumpTables) {
+  if ((CodeGenOpts.SanitizeCfiCrossDso &&
+      CodeGenOpts.SanitizeCfiCanonicalJumpTables) || CodeGenOpts.CallGraphSection) {
     if (auto *FD = dyn_cast<FunctionDecl>(D)) {
       // Skip available_externally functions. They won't be codegen'ed in the
       // current module anyway.
@@ -2677,7 +2677,17 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const NamedDecl *ND) {
 
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
                                                        llvm::Function *F) {
-  // Only if we are checking indirect calls.
+  bool EmittedMDIdGeneralized = false;
+  if (CodeGenOpts.CallGraphSection &&
+      (!F->hasLocalLinkage() ||
+       F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
+                                        /* IgnoreAssumeLikeCalls */ true,
+                                        /* IgnoreLLVMUsed */ false))) {
+    F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+    EmittedMDIdGeneralized = true;
+  }
+
+  // Add additional metadata only if we are checking indirect calls with CFI.
   if (!LangOpts.Sanitize.has(SanitizerKind::CFIICall))
     return;
 
@@ -2688,7 +2698,9 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 
   llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
   F->addTypeMetadata(0, MD);
-  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+  // Add the generalized identifier if not added already.
+  if (!EmittedMDIdGeneralized)
+    F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 
   // Emit a hash-based bit set entry for cross-DSO calls.
   if (CodeGenOpts.SanitizeCfiCrossDso)
@@ -2696,6 +2708,17 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
       F->addTypeMetadata(0, llvm::ConstantAsMetadata::get(CrossDsoTypeId));
 }
 
+void CodeGenModule::CreateFunctionTypeMetadataForIcall(const QualType &QT,
+                                                       llvm::CallBase *CB) {
+  // Only if needed for call graph section and only for indirect calls.
+  if (!(CodeGenOpts.CallGraphSection && CB && CB->isIndirectCall()))
+    return;
+
+  auto *MD = CreateMetadataIdentifierGeneralized(QT);
+  auto *MDN = llvm::MDNode::get(getLLVMContext(), MD);
+  CB->setMetadata(llvm::LLVMContext::MD_type, MDN);
+}
+
 void CodeGenModule::setKCFIType(const FunctionDecl *FD, llvm::Function *F) {
   llvm::LLVMContext &Ctx = F->getContext();
   llvm::MDBuilder MDB(Ctx);
@@ -2823,7 +2846,8 @@ void CodeGenModule::SetFunctionAttributes(GlobalDecl GD, llvm::Function *F,
   // are non-canonical then we need type metadata in order to produce the local
   // jump table.
   if (!CodeGenOpts.SanitizeCfiCrossDso ||
-      !CodeGenOpts.SanitizeCfiCanonicalJumpTables)
+      !CodeGenOpts.SanitizeCfiCanonicalJumpTables ||
+      CodeGenOpts.CallGraphSection)
     CreateFunctionTypeMetadataForIcall(FD, F);
 
   if (LangOpts.Sanitize.has(SanitizerKind::KCFI))
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index 1cc447765e2c97..12ac8d0f57eecc 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1473,6 +1473,10 @@ class CodeGenModule : public CodeGenTypeCache {
   void CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
                                           llvm::Function *F);
 
+  /// Create and attach type metadata to the given call.
+  void CreateFunctionTypeMetadataForIcall(const QualType &QT,
+                                          llvm::CallBase *CB);
+
   /// Set type metadata to the given function.
   void setKCFIType(const FunctionDecl *FD, llvm::Function *F);
 
diff --git a/clang/test/CodeGen/call-graph-section-1.cpp b/clang/test/CodeGen/call-graph-section-1.cpp
new file mode 100644
index 00000000000000..25397e94422b7a
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-1.cpp
@@ -0,0 +1,110 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for C++ class and instance methods.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT %s < %t
+// RUN: FileCheck --check-prefix=CST %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions (check for indirect target metadata)
+
+class Cls1 {
+public:
+  // FT-DAG: define {{.*}} ptr @_ZN4Cls18receiverEPcPf({{.*}} !type [[F_TCLS1RECEIVER:![0-9]+]]
+  static int *receiver(char *a, float *b) { return 0; }
+};
+
+class Cls2 {
+public:
+  int *(*fp)(char *, float *);
+
+  // FT-DAG: define {{.*}} i32 @_ZN4Cls22f1Ecfd({{.*}} !type [[F_TCLS2F1:![0-9]+]]
+  int f1(char a, float b, double c) { return 0; }
+
+  // FT-DAG: define {{.*}} ptr @_ZN4Cls22f2EPcPfPd({{.*}} !type [[F_TCLS2F2:![0-9]+]]
+  int *f2(char *a, float *b, double *c) { return 0; }
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f3E4Cls1({{.*}} !type [[F_TCLS2F3F4:![0-9]+]]
+  void f3(Cls1 a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f4E4Cls1({{.*}} !type [[F_TCLS2F3F4]]
+  void f4(const Cls1 a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f5EP4Cls1({{.*}} !type [[F_TCLS2F5:![0-9]+]]
+  void f5(Cls1 *a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f6EPK4Cls1({{.*}} !type [[F_TCLS2F6:![0-9]+]]
+  void f6(const Cls1 *a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f7ER4Cls1({{.*}} !type [[F_TCLS2F7:![0-9]+]]
+  void f7(Cls1 &a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f8ERK4Cls1({{.*}} !type [[F_TCLS2F8:![0-9]+]]
+  void f8(const Cls1 &a) {}
+
+  // FT-DAG: define {{.*}} void @_ZNK4Cls22f9Ev({{.*}} !type [[F_TCLS2F9:![0-9]+]]
+  void f9() const {}
+};
+
+// FT-DAG: [[F_TCLS1RECEIVER]] = !{i64 0, !"_ZTSFPvS_S_E.generalized"}
+// FT-DAG: [[F_TCLS2F2]]   = !{i64 0, !"_ZTSFPvS_S_S_E.generalized"}
+// FT-DAG: [[F_TCLS2F1]]   = !{i64 0, !"_ZTSFicfdE.generalized"}
+// FT-DAG: [[F_TCLS2F3F4]] = !{i64 0, !"_ZTSFv4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F5]]   = !{i64 0, !"_ZTSFvPvE.generalized"}
+// FT-DAG: [[F_TCLS2F6]]   = !{i64 0, !"_ZTSFvPKvE.generalized"}
+// FT-DAG: [[F_TCLS2F7]]   = !{i64 0, !"_ZTSFvR4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F8]]   = !{i64 0, !"_ZTSFvRK4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F9]]   = !{i64 0, !"_ZTSKFvvE.generalized"}
+
+////////////////////////////////////////////////////////////////////////////////
+// Callsites (check for indirect callsite operand bundles)
+
+// CST-LABEL: define {{.*}} @_Z3foov
+void foo() {
+  Cls2 ObjCls2;
+  ObjCls2.fp = &Cls1::receiver;
+
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPvS_S_E.generalized") ]
+  ObjCls2.fp(0, 0);
+
+  auto fp_f1 = &Cls2::f1;
+  auto fp_f2 = &Cls2::f2;
+  auto fp_f3 = &Cls2::f3;
+  auto fp_f4 = &Cls2::f4;
+  auto fp_f5 = &Cls2::f5;
+  auto fp_f6 = &Cls2::f6;
+  auto fp_f7 = &Cls2::f7;
+  auto fp_f8 = &Cls2::f8;
+  auto fp_f9 = &Cls2::f9;
+
+  Cls2 *ObjCls2Ptr = &ObjCls2;
+  Cls1 Cls1Param;
+
+  // CST: call noundef i32 %{{.*}} [ "type"(metadata !"_ZTSFicfdE.generalized") ]
+  (ObjCls2Ptr->*fp_f1)(0, 0, 0);
+
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  (ObjCls2Ptr->*fp_f2)(0, 0, 0);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f3)(Cls1Param);
+
+  // CST: call void  %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f4)(Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
+  (ObjCls2Ptr->*fp_f5)(&Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
+  (ObjCls2Ptr->*fp_f6)(&Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f7)(Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f8)(Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSKFvvE.generalized") ]
+  (ObjCls2Ptr->*fp_f9)();
+}
diff --git a/clang/test/CodeGen/call-graph-section-2.cpp b/clang/test/CodeGen/call-graph-section-2.cpp
new file mode 100644
index 00000000000000..4187faea495bee
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-2.cpp
@@ -0,0 +1,95 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for C++ templates.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT    %s < %t
+// RUN: FileCheck --check-prefix=CST   %s < %t
+// RUN: FileCheck --check-prefix=CHECK %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions and template classes (check for indirect target metadata)
+
+class Cls1 {};
+
+// Cls2 is instantiated with T=Cls1 in foo(). Following checks are for this
+// instantiation.
+template <class T>
+class Cls2 {
+public:
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f1Ev({{.*}} !type [[F_TCLS2F1:![0-9]+]]
+  void f1() {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f2ES0_({{.*}} !type [[F_TCLS2F2:![0-9]+]]
+  void f2(T a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f3EPS0_({{.*}} !type [[F_TCLS2F3:![0-9]+]]
+  void f3(T *a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f4EPKS0_({{.*}} !type [[F_TCLS2F4:![0-9]+]]
+  void f4(const T *a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f5ERS0_({{.*}} !type [[F_TCLS2F5:![0-9]+]]
+  void f5(T &a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f6ERKS0_({{.*}} !type [[F_TCLS2F6:![0-9]+]]
+  void f6(const T &a) {}
+
+  // Mixed type function pointer member
+  T *(*fp)(T a, T *b, const T *c, T &d, const T &e);
+};
+
+// FT-DAG: [[F_TCLS2F1]] = !{i64 0, !"_ZTSFvvE.generalized"}
+// FT-DAG: [[F_TCLS2F2]] = !{i64 0, !"_ZTSFv4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F3]] = !{i64 0, !"_ZTSFvPvE.generalized"}
+// FT-DAG: [[F_TCLS2F4]] = !{i64 0, !"_ZTSFvPKvE.generalized"}
+// FT-DAG: [[F_TCLS2F5]] = !{i64 0, !"_ZTSFvR4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F6]] = !{i64 0, !"_ZTSFvRK4Cls1E.generalized"}
+
+////////////////////////////////////////////////////////////////////////////////
+// Callsites (check for indirect callsite operand bundles)
+
+template <class T>
+T *T_func(T a, T *b, const T *c, T &d, const T &e) { return b; }
+
+// CST-LABEL: define {{.*}} @_Z3foov
+void foo() {
+  // Methods for Cls2<Cls1> is checked above within the template description.
+  Cls2<Cls1> Obj;
+
+  // CHECK-DAG: define {{.*}} @_Z6T_funcI4Cls1EPT_S1_S2_PKS1_RS1_RS3_({{.*}} !type [[F_TFUNC_CLS1:![0-9]+]]
+  // CHECK-DAG: [[F_TFUNC_CLS1]] = !{i64 0, !"_ZTSFPv4Cls1S_PKvRS0_RKS0_E.generalized"}
+  Obj.fp = T_func<Cls1>;
+  Cls1 Cls1Obj;
+
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPv4Cls1S_PKvRS0_RKS0_E.generalized") ]
+  Obj.fp(Cls1Obj, &Cls1Obj, &Cls1Obj, Cls1Obj, Cls1Obj);
+
+  // Make indirect calls to Cls2's member methods
+  auto fp_f1 = &Cls2<Cls1>::f1;
+  auto fp_f2 = &Cls2<Cls1>::f2;
+  auto fp_f3 = &Cls2<Cls1>::f3;
+  auto fp_f4 = &Cls2<Cls1>::f4;
+  auto fp_f5 = &Cls2<Cls1>::f5;
+  auto fp_f6 = &Cls2<Cls1>::f6;
+
+  auto *Obj2Ptr = &Obj;
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvvE.generalized") ]
+  (Obj2Ptr->*fp_f1)();
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f2)(Cls1Obj);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
+  (Obj2Ptr->*fp_f3)(&Cls1Obj);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
+  (Obj2Ptr->*fp_f4)(&Cls1Obj);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f5)(Cls1Obj);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f6)(Cls1Obj);
+}
diff --git a/clang/test/CodeGen/call-graph-section-3.cpp b/clang/test/CodeGen/call-graph-section-3.cpp
new file mode 100644
index 00000000000000..77bb110c664ef6
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-3.cpp
@@ -0,0 +1,52 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for virtual methods.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT %s < %t
+// RUN: FileCheck --check-prefix=CST %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions (check for indirect target metadata)
+
+class Base {
+public:
+  // FT-DAG: define {{.*}} @_ZN4Base2vfEPc({{.*}} !type [[F_TVF:![0-9]+]]
+  virtual int vf(char *a) { return 0; };
+};
+
+class Derived : public Base {
+public:
+  // FT-DAG: define {{.*}} @_ZN7Derived2vfEPc({{.*}} !type [[F_TVF]]
+  int vf(char *a) override { return 1; }...
[truncated]

clang/lib/CodeGen/CodeGenModule.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CodeGenModule.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CGExprCXX.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CGObjCMac.cpp Outdated Show resolved Hide resolved
Prabhuk pushed a commit to Prabhuk/llvm-project that referenced this pull request Apr 19, 2024
…argets

Create and add generalized type identifier metadata to indirect calls,
and to functions that may be target to indirect calls.

Type identifiers will be used by the back-end to construct the call
graph section to precisely represent the possible targets for indirect calls.
The type information is deliberately pulled from the front-end to have extra
precision since some type information is lost at IR, and to ensure
consistent type identifiers between object files compiled at different
times (as C/C++ standards require language-level types to match).

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By:
morehouse

Differential Revision: https://reviews.llvm.org/D105909?id=358327

Pull Request: llvm#87573
Prabhuk and others added 3 commits April 22, 2024 11:34
Cleaner if checks.

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Update the comments as suggested.

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
…ion. Removed type id emitting for Objective C.

Created using spr 1.3.6-beta.1
@Prabhuk Prabhuk requested review from ilovepi and petrhosek April 24, 2024 17:25
Created using spr 1.3.6-beta.1
clang/lib/CodeGen/CGCall.cpp Outdated Show resolved Hide resolved
Created using spr 1.3.6-beta.1
Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you split out the LLVM parts from the clang parts? The LLVM bits should land first, and then clang can make use of them in a later PR. This will also simplify review, since we can focus on the LLVM vs. Clang implementations in isolation.

clang/lib/CodeGen/CGCall.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CGCall.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CGExpr.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CGCall.cpp Outdated Show resolved Hide resolved
necipfazil and others added 4 commits November 13, 2024 17:00
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
@Prabhuk Prabhuk changed the title [clang][CallGraphSection] Add type id metadata to indirect call and targets [llvm][CallGraphSection] Handle type id metadata Nov 20, 2024
@Prabhuk
Copy link
Contributor Author

Prabhuk commented Nov 20, 2024

Can you split out the LLVM parts from the clang parts? The LLVM bits should land first, and then clang can make use of them in a later PR. This will also simplify review, since we can focus on the LLVM vs. Clang implementations in isolation.

The Clang bits are moved to #117036

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message should be more specific, this is an operand bundle not metadata?

@Prabhuk Prabhuk changed the title [llvm][CallGraphSection] Handle type id metadata [llvm] Introduce type id operand bundle Dec 10, 2024
Created using spr 1.3.6-beta.1
@nikic
Copy link
Contributor

nikic commented Mar 19, 2025

I still don't think that using operand bundles for this is appropriate. If you take a look at all the other operand bundles (https://llvm.org/docs/LangRef.html#operand-bundles) they consistently have some kind of impact on the semantics of the call. Also keep in mind that the default implication of an operand bundle is that it can read and write arbitrary memory. This is not going to be super relevant in this context because indirect calls already have unknown effects, but would be relevant if devirtualization happens and the operand bundle is still attached.

If you properly design the metadata, I expect that preservation through optimization will be quite robust. There just aren't that many optimizations that act on indirect calls. The main one is merging of multiple call-sites via hoist/sink, and metadata can handle that.

I'll also add that no matter what design you use, you still need to have some kind of graceful fallback for the case where the callee_type bundle/metadata is missing, e.g. because you're linking in an object from a frontend that doesn't support it.

cc @efriedma-quic for a second opinion on what to do here.

Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request Apr 15, 2025
For each indirect callsite, we would like to emit an operand bundle that
holds the inteded type identifier of the callee. This information will
be used to construct callgraph section in subsequent PRs.

The type information will be pulled from the front-end to have extra
precision since some type information is lost at IR, and to ensure
consistent type identifiers between object files compiled at different
times (as C/C++ standards require language-level types to match).

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Pull Request: llvm#87573
Created using spr 1.3.6-beta.1
@Prabhuk Prabhuk changed the title [llvm] Introduce callee_type operand bundle [llvm] Introduce callee_type metadata Apr 19, 2025
Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request Apr 22, 2025
Introduce `callee_type` metadata which will be attached to the indirect
call instructions.

The `callee_type` metadata will be used to generate `.callgraph` section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Pull Request: llvm#87573
Prabhuk added 2 commits April 23, 2025 01:01
Created using spr 1.3.6-beta.1
llvm/test/Transforms/InstCombine/callee-type-metadata.ll Outdated Show resolved Hide resolved
llvm/test/Transforms/InstCombine/callee-type-metadata.ll Outdated Show resolved Hide resolved
llvm/test/Transforms/InstCombine/callee-type-metadata.ll Outdated Show resolved Hide resolved
Prabhuk added 2 commits April 23, 2025 23:38
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
llvm/test/Transforms/InstCombine/callee-type-metadata.ll Outdated Show resolved Hide resolved
Prabhuk added 2 commits April 28, 2025 22:48
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
llvm/lib/IR/Verifier.cpp Outdated Show resolved Hide resolved
llvm/lib/IR/Metadata.cpp Outdated Show resolved Hide resolved
llvm/lib/IR/Metadata.cpp Outdated Show resolved Hide resolved
llvm/lib/IR/Verifier.cpp Outdated Show resolved Hide resolved
Prabhuk added 2 commits May 5, 2025 23:12
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
@Prabhuk Prabhuk requested a review from nikic May 10, 2025 00:34
@Prabhuk
Copy link
Contributor Author

Prabhuk commented May 10, 2025

@nikic - Can you please check if there's anything else is needed to land this change? Thank you.

Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request May 10, 2025
Introduce `callee_type` metadata which will be attached to the indirect
call instructions.

The `callee_type` metadata will be used to generate `.callgraph` section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Pull Request: llvm#87573
llvm/docs/CalleeTypeMetadata.rst Outdated Show resolved Hide resolved
llvm/docs/CalleeTypeMetadata.rst Show resolved Hide resolved
Prabhuk added 2 commits May 14, 2025 02:06
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
llvm/test/Verifier/callee-type-metadata.ll Outdated Show resolved Hide resolved
llvm/test/Verifier/callee-type-metadata.ll Outdated Show resolved Hide resolved
llvm/test/Transforms/Inline/drop-callee-type-metadata.ll Outdated Show resolved Hide resolved
llvm/test/Transforms/Inline/drop-callee-type-metadata.ll Outdated Show resolved Hide resolved
Created using spr 1.3.6-beta.1
@@ -1252,6 +1252,12 @@ class MDNode : public Metadata {
bool isReplaceable() const { return isTemporary() || isAlwaysReplaceable(); }
bool isAlwaysReplaceable() const { return getMetadataID() == DIAssignIDKind; }

bool hasGeneralizedMDString() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks too specific to be part of the main Metadata API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Is there a place that you'd recommend to move this to?
In my patches up the stack I have use for this in llvm/lib/CodeGen/AsmPrinter and clang/lib/CodeGen besides the use in Verifier as part of this patch.

MDNode *MDNode::getMergedCalleeTypeMetadata(LLVMContext &Ctx, MDNode *A,
MDNode *B) {
SmallVector<Metadata *, 8> AB;
SmallSet<Metadata *, 8> MergedCallees;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SmallSet<Metadata *, 8> MergedCallees;
SmallPtrSet<Metadata *, 8> MergedCallees;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you.

MDNode *B) {
SmallVector<Metadata *, 8> AB;
SmallSet<Metadata *, 8> MergedCallees;
auto AddUniqueCallees = [&AB, &MergedCallees](llvm::MDNode *N) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto AddUniqueCallees = [&AB, &MergedCallees](llvm::MDNode *N) {
auto AddUniqueCallees = [&AB, &MergedCallees](MDNode *N) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you.

};
AddUniqueCallees(A);
AddUniqueCallees(B);
return llvm::MDNode::get(Ctx, AB);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return llvm::MDNode::get(Ctx, AB);
return MDNode::get(Ctx, AB);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you.

auto *TypeMD = cast<MDNode>(Op.get());
Check(TypeMD->hasGeneralizedMDString(),
"Only generalized type metadata can be part of the callee_type "
"metadata list");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CalleeTypeMetadata.rst could be clearer on this requirement. Generalizations are mentioned, but not what this means for the metadata.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a minor update to the RST file. Please take a look.

// Drop unnecessary callee_type metadata from calls that were converted
// into direct calls.
if (Call.getMetadata(LLVMContext::MD_callee_type) && !Call.isIndirectCall())
Call.setMetadata(LLVMContext::MD_callee_type, nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should indicate IR change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

!2 = !{i64 0, !"_ZTSFicE.generalized"}
!3 = !{i64 0, !"_ZTSFicE"}
!4 = !{!3}
!8 = !{i64 0, !"_ZTSFicE.generalized"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there's a bunch of unused metadata here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way FileCheck works this will pass even if the metadata is not dropped. You could try whether FileCheck --match-full-lines works. Otherwise you could use explicit CHECK-NOT or {{$}}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching this! Using {{$}} in the updated patch. I had CHECK-NOT in my previous version of this patch which the reviewers pointed out to be fragile. --match-full-lines did not work for me. PTAL.

case LLVMContext::MD_callee_type:
if (!AAOnly)
K->setMetadata(Kind, MDNode::getMergedCalleeTypeMetadata(
K->getContext(), KMD, JMD));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code appears to be untested. Check out existing metadata tests in SimplifyCFG.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Added a new test file to cover the getMergedCalleeTypeMetadata cases. PTAL.

Prabhuk added 2 commits June 11, 2025 00:16
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category llvm:ir llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.