Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[llvm] Introduce callee_type metadata #87573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 28 commits into
base: users/Prabhuk/sprmain.clangcallgraphsection-add-type-id-metadata-to-indirect-call-and-targets
Choose a base branch
Loading
from

Conversation

Prabhuk
Copy link
Contributor

@Prabhuk Prabhuk commented Apr 3, 2024

Introduce callee_type metadata which will be attached to the indirect
call instructions.

The callee_type metadata will be used to generate .callgraph section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Created using spr 1.3.6-beta.1
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. llvm:SelectionDAG SelectionDAGISel as well llvm:ir labels Apr 3, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 3, 2024

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-llvm-ir

Author: Prabhuk (Prabhuk)

Changes

Create and add generalized type identifier metadata to indirect calls,
and to functions that may be target to indirect calls.

Type identifiers will be used by the back-end to construct the call
graph section to precisely represent the possible targets for indirect calls.
The type information is deliberately pulled from the front-end to have extra
precision since some type information is lost at IR, and to ensure
consistent type identifiers between object files compiled at different
times (as C/C++ standards require language-level types to match).

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html


Patch is 29.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87573.diff

16 Files Affected:

  • (modified) clang/lib/CodeGen/CGCall.cpp (+22)
  • (modified) clang/lib/CodeGen/CGClass.cpp (+8-1)
  • (modified) clang/lib/CodeGen/CGExpr.cpp (+5)
  • (modified) clang/lib/CodeGen/CGExprCXX.cpp (+29-5)
  • (modified) clang/lib/CodeGen/CGObjCMac.cpp (+8)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+29-5)
  • (modified) clang/lib/CodeGen/CodeGenModule.h (+4)
  • (added) clang/test/CodeGen/call-graph-section-1.cpp (+110)
  • (added) clang/test/CodeGen/call-graph-section-2.cpp (+95)
  • (added) clang/test/CodeGen/call-graph-section-3.cpp (+52)
  • (added) clang/test/CodeGen/call-graph-section.c (+85)
  • (modified) llvm/include/llvm/IR/LLVMContext.h (+1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+6-4)
  • (modified) llvm/lib/IR/LLVMContext.cpp (+5)
  • (modified) llvm/lib/IR/Verifier.cpp (+5-2)
  • (modified) llvm/test/Verifier/operand-bundles.ll (+13)
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index f12765b826935b..51133b0a9d47cb 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5681,6 +5681,28 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
   AllocAlignAttrEmitter AllocAlignAttrEmitter(*this, TargetDecl, CallArgs);
   Attrs = AllocAlignAttrEmitter.TryEmitAsCallSiteAttribute(Attrs);
 
+  if (CGM.getCodeGenOpts().CallGraphSection) {
+    // FIXME: create operand bundle only for indirect calls, not for all
+
+    assert((TargetDecl && TargetDecl->getFunctionType() ||
+            Callee.getAbstractInfo().getCalleeFunctionProtoType()) &&
+           "cannot find callsite type");
+
+    QualType CST;
+    if (TargetDecl && TargetDecl->getFunctionType())
+      CST = QualType(TargetDecl->getFunctionType(), 0);
+    else if (const auto *FPT =
+                 Callee.getAbstractInfo().getCalleeFunctionProtoType())
+      CST = QualType(FPT, 0);
+
+    if (!CST.isNull()) {
+      auto *TypeIdMD = CGM.CreateMetadataIdentifierGeneralized(CST);
+      auto *TypeIdMDVal =
+          llvm::MetadataAsValue::get(getLLVMContext(), TypeIdMD);
+      BundleList.emplace_back("type", TypeIdMDVal);
+    }
+  }
+
   // Emit the actual call/invoke instruction.
   llvm::CallBase *CI;
   if (!InvokeDest) {
diff --git a/clang/lib/CodeGen/CGClass.cpp b/clang/lib/CodeGen/CGClass.cpp
index 8c1c8ee455d2e6..2c347bd4359265 100644
--- a/clang/lib/CodeGen/CGClass.cpp
+++ b/clang/lib/CodeGen/CGClass.cpp
@@ -2247,7 +2247,14 @@ void CodeGenFunction::EmitCXXConstructorCall(const CXXConstructorDecl *D,
   const CGFunctionInfo &Info = CGM.getTypes().arrangeCXXConstructorCall(
       Args, D, Type, ExtraArgs.Prefix, ExtraArgs.Suffix, PassPrototypeArgs);
   CGCallee Callee = CGCallee::forDirect(CalleePtr, GlobalDecl(D, Type));
-  EmitCall(Info, Callee, ReturnValueSlot(), Args, nullptr, false, Loc);
+  llvm::CallBase *CallOrInvoke = nullptr;
+  EmitCall(Info, Callee, ReturnValueSlot(), Args, &CallOrInvoke, false, Loc);
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(D->getType(), CallOrInvoke);
+
 
   // Generate vtable assumptions if we're constructing a complete object
   // with a vtable.  We don't do this for base subobjects for two reasons:
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 54432353e7420d..467a7cfe50e790 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -5953,6 +5953,11 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType, const CGCallee &OrigCallee
     }
   }
 
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(QualType(FnType, 0), CallOrInvoke);
+
   return Call;
 }
 
diff --git a/clang/lib/CodeGen/CGExprCXX.cpp b/clang/lib/CodeGen/CGExprCXX.cpp
index a4fb673284ceca..4cbffd7073867f 100644
--- a/clang/lib/CodeGen/CGExprCXX.cpp
+++ b/clang/lib/CodeGen/CGExprCXX.cpp
@@ -93,9 +93,17 @@ RValue CodeGenFunction::EmitCXXMemberOrOperatorCall(
       *this, MD, This, ImplicitParam, ImplicitParamTy, CE, Args, RtlArgs);
   auto &FnInfo = CGM.getTypes().arrangeCXXMethodCall(
       Args, FPT, CallInfo.ReqArgs, CallInfo.PrefixSize);
-  return EmitCall(FnInfo, Callee, ReturnValue, Args, nullptr,
+  llvm::CallBase *CallOrInvoke = nullptr;        
+  auto Call = EmitCall(FnInfo, Callee, ReturnValue, Args, &CallOrInvoke,
                   CE && CE == MustTailCall,
                   CE ? CE->getExprLoc() : SourceLocation());
+  
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(MD->getType(), CallOrInvoke);
+
+  return Call;                    
 }
 
 RValue CodeGenFunction::EmitCXXDestructorCall(
@@ -119,9 +127,17 @@ RValue CodeGenFunction::EmitCXXDestructorCall(
   CallArgList Args;
   commonEmitCXXMemberOrOperatorCall(*this, Dtor, This, ImplicitParam,
                                     ImplicitParamTy, CE, Args, nullptr);
-  return EmitCall(CGM.getTypes().arrangeCXXStructorDeclaration(Dtor), Callee,
-                  ReturnValueSlot(), Args, nullptr, CE && CE == MustTailCall,
+  llvm::CallBase *CallOrInvoke = nullptr;
+  auto Call = EmitCall(CGM.getTypes().arrangeCXXStructorDeclaration(Dtor), Callee,
+                  ReturnValueSlot(), Args, &CallOrInvoke, CE && CE == MustTailCall,
                   CE ? CE->getExprLoc() : SourceLocation{});
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(DtorDecl->getType(), CallOrInvoke);
+
+  return Call;
 }
 
 RValue CodeGenFunction::EmitCXXPseudoDestructorExpr(
@@ -482,10 +498,18 @@ CodeGenFunction::EmitCXXMemberPointerCallExpr(const CXXMemberCallExpr *E,
 
   // And the rest of the call args
   EmitCallArgs(Args, FPT, E->arguments());
-  return EmitCall(CGM.getTypes().arrangeCXXMethodCall(Args, FPT, required,
+  llvm::CallBase *CallOrInvoke = nullptr;
+  auto Call = EmitCall(CGM.getTypes().arrangeCXXMethodCall(Args, FPT, required,
                                                       /*PrefixSize=*/0),
-                  Callee, ReturnValue, Args, nullptr, E == MustTailCall,
+                  Callee, ReturnValue, Args, &CallOrInvoke, E == MustTailCall,
                   E->getExprLoc());
+
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && CallOrInvoke &&
+      CallOrInvoke->isIndirectCall())
+    CGM.CreateFunctionTypeMetadataForIcall(QualType(FPT, 0), CallOrInvoke);
+
+  return Call;                  
 }
 
 RValue
diff --git a/clang/lib/CodeGen/CGObjCMac.cpp b/clang/lib/CodeGen/CGObjCMac.cpp
index 8a599c10e1caf1..291f5aff89ec22 100644
--- a/clang/lib/CodeGen/CGObjCMac.cpp
+++ b/clang/lib/CodeGen/CGObjCMac.cpp
@@ -2220,6 +2220,14 @@ CGObjCCommonMac::EmitMessageSend(CodeGen::CodeGenFunction &CGF,
   RValue rvalue = CGF.EmitCall(MSI.CallInfo, Callee, Return, ActualArgs,
                                &CallSite);
 
+  // Set type identifier metadata of indirect calls for call graph section.
+  if (CGM.getCodeGenOpts().CallGraphSection && Method && CallSite &&
+      CallSite->isIndirectCall()) {
+    if (const auto *FnType = Method->getFunctionType()) {
+      CGM.CreateFunctionTypeMetadataForIcall(QualType(FnType, 0), CallSite);
+    }
+  }
+
   // Mark the call as noreturn if the method is marked noreturn and the
   // receiver cannot be null.
   if (Method && Method->hasAttr<NoReturnAttr>() && !ReceiverCanBeNull) {
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 00b3bfcaa0bc25..e19bbee996f582 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2486,8 +2486,8 @@ void CodeGenModule::SetLLVMFunctionAttributesForDefinition(const Decl *D,
 
   // In the cross-dso CFI mode with canonical jump tables, we want !type
   // attributes on definitions only.
-  if (CodeGenOpts.SanitizeCfiCrossDso &&
-      CodeGenOpts.SanitizeCfiCanonicalJumpTables) {
+  if ((CodeGenOpts.SanitizeCfiCrossDso &&
+      CodeGenOpts.SanitizeCfiCanonicalJumpTables) || CodeGenOpts.CallGraphSection) {
     if (auto *FD = dyn_cast<FunctionDecl>(D)) {
       // Skip available_externally functions. They won't be codegen'ed in the
       // current module anyway.
@@ -2677,7 +2677,17 @@ static void setLinkageForGV(llvm::GlobalValue *GV, const NamedDecl *ND) {
 
 void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
                                                        llvm::Function *F) {
-  // Only if we are checking indirect calls.
+  bool EmittedMDIdGeneralized = false;
+  if (CodeGenOpts.CallGraphSection &&
+      (!F->hasLocalLinkage() ||
+       F->getFunction().hasAddressTaken(nullptr, /* IgnoreCallbackUses */ true,
+                                        /* IgnoreAssumeLikeCalls */ true,
+                                        /* IgnoreLLVMUsed */ false))) {
+    F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+    EmittedMDIdGeneralized = true;
+  }
+
+  // Add additional metadata only if we are checking indirect calls with CFI.
   if (!LangOpts.Sanitize.has(SanitizerKind::CFIICall))
     return;
 
@@ -2688,7 +2698,9 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
 
   llvm::Metadata *MD = CreateMetadataIdentifierForType(FD->getType());
   F->addTypeMetadata(0, MD);
-  F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
+  // Add the generalized identifier if not added already.
+  if (!EmittedMDIdGeneralized)
+    F->addTypeMetadata(0, CreateMetadataIdentifierGeneralized(FD->getType()));
 
   // Emit a hash-based bit set entry for cross-DSO calls.
   if (CodeGenOpts.SanitizeCfiCrossDso)
@@ -2696,6 +2708,17 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
       F->addTypeMetadata(0, llvm::ConstantAsMetadata::get(CrossDsoTypeId));
 }
 
+void CodeGenModule::CreateFunctionTypeMetadataForIcall(const QualType &QT,
+                                                       llvm::CallBase *CB) {
+  // Only if needed for call graph section and only for indirect calls.
+  if (!(CodeGenOpts.CallGraphSection && CB && CB->isIndirectCall()))
+    return;
+
+  auto *MD = CreateMetadataIdentifierGeneralized(QT);
+  auto *MDN = llvm::MDNode::get(getLLVMContext(), MD);
+  CB->setMetadata(llvm::LLVMContext::MD_type, MDN);
+}
+
 void CodeGenModule::setKCFIType(const FunctionDecl *FD, llvm::Function *F) {
   llvm::LLVMContext &Ctx = F->getContext();
   llvm::MDBuilder MDB(Ctx);
@@ -2823,7 +2846,8 @@ void CodeGenModule::SetFunctionAttributes(GlobalDecl GD, llvm::Function *F,
   // are non-canonical then we need type metadata in order to produce the local
   // jump table.
   if (!CodeGenOpts.SanitizeCfiCrossDso ||
-      !CodeGenOpts.SanitizeCfiCanonicalJumpTables)
+      !CodeGenOpts.SanitizeCfiCanonicalJumpTables ||
+      CodeGenOpts.CallGraphSection)
     CreateFunctionTypeMetadataForIcall(FD, F);
 
   if (LangOpts.Sanitize.has(SanitizerKind::KCFI))
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index 1cc447765e2c97..12ac8d0f57eecc 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1473,6 +1473,10 @@ class CodeGenModule : public CodeGenTypeCache {
   void CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
                                           llvm::Function *F);
 
+  /// Create and attach type metadata to the given call.
+  void CreateFunctionTypeMetadataForIcall(const QualType &QT,
+                                          llvm::CallBase *CB);
+
   /// Set type metadata to the given function.
   void setKCFIType(const FunctionDecl *FD, llvm::Function *F);
 
diff --git a/clang/test/CodeGen/call-graph-section-1.cpp b/clang/test/CodeGen/call-graph-section-1.cpp
new file mode 100644
index 00000000000000..25397e94422b7a
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-1.cpp
@@ -0,0 +1,110 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for C++ class and instance methods.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT %s < %t
+// RUN: FileCheck --check-prefix=CST %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions (check for indirect target metadata)
+
+class Cls1 {
+public:
+  // FT-DAG: define {{.*}} ptr @_ZN4Cls18receiverEPcPf({{.*}} !type [[F_TCLS1RECEIVER:![0-9]+]]
+  static int *receiver(char *a, float *b) { return 0; }
+};
+
+class Cls2 {
+public:
+  int *(*fp)(char *, float *);
+
+  // FT-DAG: define {{.*}} i32 @_ZN4Cls22f1Ecfd({{.*}} !type [[F_TCLS2F1:![0-9]+]]
+  int f1(char a, float b, double c) { return 0; }
+
+  // FT-DAG: define {{.*}} ptr @_ZN4Cls22f2EPcPfPd({{.*}} !type [[F_TCLS2F2:![0-9]+]]
+  int *f2(char *a, float *b, double *c) { return 0; }
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f3E4Cls1({{.*}} !type [[F_TCLS2F3F4:![0-9]+]]
+  void f3(Cls1 a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f4E4Cls1({{.*}} !type [[F_TCLS2F3F4]]
+  void f4(const Cls1 a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f5EP4Cls1({{.*}} !type [[F_TCLS2F5:![0-9]+]]
+  void f5(Cls1 *a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f6EPK4Cls1({{.*}} !type [[F_TCLS2F6:![0-9]+]]
+  void f6(const Cls1 *a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f7ER4Cls1({{.*}} !type [[F_TCLS2F7:![0-9]+]]
+  void f7(Cls1 &a) {}
+
+  // FT-DAG: define {{.*}} void @_ZN4Cls22f8ERK4Cls1({{.*}} !type [[F_TCLS2F8:![0-9]+]]
+  void f8(const Cls1 &a) {}
+
+  // FT-DAG: define {{.*}} void @_ZNK4Cls22f9Ev({{.*}} !type [[F_TCLS2F9:![0-9]+]]
+  void f9() const {}
+};
+
+// FT-DAG: [[F_TCLS1RECEIVER]] = !{i64 0, !"_ZTSFPvS_S_E.generalized"}
+// FT-DAG: [[F_TCLS2F2]]   = !{i64 0, !"_ZTSFPvS_S_S_E.generalized"}
+// FT-DAG: [[F_TCLS2F1]]   = !{i64 0, !"_ZTSFicfdE.generalized"}
+// FT-DAG: [[F_TCLS2F3F4]] = !{i64 0, !"_ZTSFv4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F5]]   = !{i64 0, !"_ZTSFvPvE.generalized"}
+// FT-DAG: [[F_TCLS2F6]]   = !{i64 0, !"_ZTSFvPKvE.generalized"}
+// FT-DAG: [[F_TCLS2F7]]   = !{i64 0, !"_ZTSFvR4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F8]]   = !{i64 0, !"_ZTSFvRK4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F9]]   = !{i64 0, !"_ZTSKFvvE.generalized"}
+
+////////////////////////////////////////////////////////////////////////////////
+// Callsites (check for indirect callsite operand bundles)
+
+// CST-LABEL: define {{.*}} @_Z3foov
+void foo() {
+  Cls2 ObjCls2;
+  ObjCls2.fp = &Cls1::receiver;
+
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPvS_S_E.generalized") ]
+  ObjCls2.fp(0, 0);
+
+  auto fp_f1 = &Cls2::f1;
+  auto fp_f2 = &Cls2::f2;
+  auto fp_f3 = &Cls2::f3;
+  auto fp_f4 = &Cls2::f4;
+  auto fp_f5 = &Cls2::f5;
+  auto fp_f6 = &Cls2::f6;
+  auto fp_f7 = &Cls2::f7;
+  auto fp_f8 = &Cls2::f8;
+  auto fp_f9 = &Cls2::f9;
+
+  Cls2 *ObjCls2Ptr = &ObjCls2;
+  Cls1 Cls1Param;
+
+  // CST: call noundef i32 %{{.*}} [ "type"(metadata !"_ZTSFicfdE.generalized") ]
+  (ObjCls2Ptr->*fp_f1)(0, 0, 0);
+
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPvS_S_S_E.generalized") ]
+  (ObjCls2Ptr->*fp_f2)(0, 0, 0);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f3)(Cls1Param);
+
+  // CST: call void  %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f4)(Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
+  (ObjCls2Ptr->*fp_f5)(&Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
+  (ObjCls2Ptr->*fp_f6)(&Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f7)(Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
+  (ObjCls2Ptr->*fp_f8)(Cls1Param);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSKFvvE.generalized") ]
+  (ObjCls2Ptr->*fp_f9)();
+}
diff --git a/clang/test/CodeGen/call-graph-section-2.cpp b/clang/test/CodeGen/call-graph-section-2.cpp
new file mode 100644
index 00000000000000..4187faea495bee
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-2.cpp
@@ -0,0 +1,95 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for C++ templates.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT    %s < %t
+// RUN: FileCheck --check-prefix=CST   %s < %t
+// RUN: FileCheck --check-prefix=CHECK %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions and template classes (check for indirect target metadata)
+
+class Cls1 {};
+
+// Cls2 is instantiated with T=Cls1 in foo(). Following checks are for this
+// instantiation.
+template <class T>
+class Cls2 {
+public:
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f1Ev({{.*}} !type [[F_TCLS2F1:![0-9]+]]
+  void f1() {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f2ES0_({{.*}} !type [[F_TCLS2F2:![0-9]+]]
+  void f2(T a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f3EPS0_({{.*}} !type [[F_TCLS2F3:![0-9]+]]
+  void f3(T *a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f4EPKS0_({{.*}} !type [[F_TCLS2F4:![0-9]+]]
+  void f4(const T *a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f5ERS0_({{.*}} !type [[F_TCLS2F5:![0-9]+]]
+  void f5(T &a) {}
+
+  // FT: define {{.*}} void @_ZN4Cls2I4Cls1E2f6ERKS0_({{.*}} !type [[F_TCLS2F6:![0-9]+]]
+  void f6(const T &a) {}
+
+  // Mixed type function pointer member
+  T *(*fp)(T a, T *b, const T *c, T &d, const T &e);
+};
+
+// FT-DAG: [[F_TCLS2F1]] = !{i64 0, !"_ZTSFvvE.generalized"}
+// FT-DAG: [[F_TCLS2F2]] = !{i64 0, !"_ZTSFv4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F3]] = !{i64 0, !"_ZTSFvPvE.generalized"}
+// FT-DAG: [[F_TCLS2F4]] = !{i64 0, !"_ZTSFvPKvE.generalized"}
+// FT-DAG: [[F_TCLS2F5]] = !{i64 0, !"_ZTSFvR4Cls1E.generalized"}
+// FT-DAG: [[F_TCLS2F6]] = !{i64 0, !"_ZTSFvRK4Cls1E.generalized"}
+
+////////////////////////////////////////////////////////////////////////////////
+// Callsites (check for indirect callsite operand bundles)
+
+template <class T>
+T *T_func(T a, T *b, const T *c, T &d, const T &e) { return b; }
+
+// CST-LABEL: define {{.*}} @_Z3foov
+void foo() {
+  // Methods for Cls2<Cls1> is checked above within the template description.
+  Cls2<Cls1> Obj;
+
+  // CHECK-DAG: define {{.*}} @_Z6T_funcI4Cls1EPT_S1_S2_PKS1_RS1_RS3_({{.*}} !type [[F_TFUNC_CLS1:![0-9]+]]
+  // CHECK-DAG: [[F_TFUNC_CLS1]] = !{i64 0, !"_ZTSFPv4Cls1S_PKvRS0_RKS0_E.generalized"}
+  Obj.fp = T_func<Cls1>;
+  Cls1 Cls1Obj;
+
+  // CST: call noundef ptr %{{.*}} [ "type"(metadata !"_ZTSFPv4Cls1S_PKvRS0_RKS0_E.generalized") ]
+  Obj.fp(Cls1Obj, &Cls1Obj, &Cls1Obj, Cls1Obj, Cls1Obj);
+
+  // Make indirect calls to Cls2's member methods
+  auto fp_f1 = &Cls2<Cls1>::f1;
+  auto fp_f2 = &Cls2<Cls1>::f2;
+  auto fp_f3 = &Cls2<Cls1>::f3;
+  auto fp_f4 = &Cls2<Cls1>::f4;
+  auto fp_f5 = &Cls2<Cls1>::f5;
+  auto fp_f6 = &Cls2<Cls1>::f6;
+
+  auto *Obj2Ptr = &Obj;
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvvE.generalized") ]
+  (Obj2Ptr->*fp_f1)();
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFv4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f2)(Cls1Obj);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPvE.generalized") ]
+  (Obj2Ptr->*fp_f3)(&Cls1Obj);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvPKvE.generalized") ]
+  (Obj2Ptr->*fp_f4)(&Cls1Obj);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvR4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f5)(Cls1Obj);
+
+  // CST: call void %{{.*}} [ "type"(metadata !"_ZTSFvRK4Cls1E.generalized") ]
+  (Obj2Ptr->*fp_f6)(Cls1Obj);
+}
diff --git a/clang/test/CodeGen/call-graph-section-3.cpp b/clang/test/CodeGen/call-graph-section-3.cpp
new file mode 100644
index 00000000000000..77bb110c664ef6
--- /dev/null
+++ b/clang/test/CodeGen/call-graph-section-3.cpp
@@ -0,0 +1,52 @@
+// Tests that we assign appropriate identifiers to indirect calls and targets
+// specifically for virtual methods.
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux -fcall-graph-section -S \
+// RUN: -emit-llvm -o %t %s
+// RUN: FileCheck --check-prefix=FT %s < %t
+// RUN: FileCheck --check-prefix=CST %s < %t
+
+////////////////////////////////////////////////////////////////////////////////
+// Class definitions (check for indirect target metadata)
+
+class Base {
+public:
+  // FT-DAG: define {{.*}} @_ZN4Base2vfEPc({{.*}} !type [[F_TVF:![0-9]+]]
+  virtual int vf(char *a) { return 0; };
+};
+
+class Derived : public Base {
+public:
+  // FT-DAG: define {{.*}} @_ZN7Derived2vfEPc({{.*}} !type [[F_TVF]]
+  int vf(char *a) override { return 1; }...
[truncated]

clang/lib/CodeGen/CodeGenModule.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CodeGenModule.cpp Outdated Show resolved Hide resolved
Prabhuk pushed a commit to Prabhuk/llvm-project that referenced this pull request Apr 19, 2024
…argets

Create and add generalized type identifier metadata to indirect calls,
and to functions that may be target to indirect calls.

Type identifiers will be used by the back-end to construct the call
graph section to precisely represent the possible targets for indirect calls.
The type information is deliberately pulled from the front-end to have extra
precision since some type information is lost at IR, and to ensure
consistent type identifiers between object files compiled at different
times (as C/C++ standards require language-level types to match).

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewed By:
morehouse

Differential Revision: https://reviews.llvm.org/D105909?id=358327

Pull Request: llvm#87573
Prabhuk and others added 3 commits April 22, 2024 11:34
Cleaner if checks.

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Update the comments as suggested.

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
…ion. Removed type id emitting for Objective C.

Created using spr 1.3.6-beta.1
@Prabhuk Prabhuk requested review from ilovepi and petrhosek April 24, 2024 17:25
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you split out the LLVM parts from the clang parts? The LLVM bits should land first, and then clang can make use of them in a later PR. This will also simplify review, since we can focus on the LLVM vs. Clang implementations in isolation.

clang/lib/CodeGen/CGCall.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CGCall.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CGExpr.cpp Outdated Show resolved Hide resolved
clang/lib/CodeGen/CGCall.cpp Outdated Show resolved Hide resolved
necipfazil and others added 4 commits November 13, 2024 17:00
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
@Prabhuk Prabhuk changed the title [clang][CallGraphSection] Add type id metadata to indirect call and targets [llvm][CallGraphSection] Handle type id metadata Nov 20, 2024
@Prabhuk
Copy link
Contributor Author

Prabhuk commented Nov 20, 2024

Can you split out the LLVM parts from the clang parts? The LLVM bits should land first, and then clang can make use of them in a later PR. This will also simplify review, since we can focus on the LLVM vs. Clang implementations in isolation.

The Clang bits are moved to #117036

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message should be more specific, this is an operand bundle not metadata?

@Prabhuk Prabhuk changed the title [llvm][CallGraphSection] Handle type id metadata [llvm] Introduce type id operand bundle Dec 10, 2024
@Prabhuk
Copy link
Contributor Author

Prabhuk commented Feb 11, 2025

@nikic, Is the content of this OB the same as CFI !type metadata on function definitions? — Yes. It is the same. I’ve udpated the OB name to callee_type.

I misunderstood the correctness question assuming that it was about whether there will be miscompilation if callee_type OB was dropped. Echoing @ilovepi's response, dropping the callee_type OB will result in less precise (possibly incorrect) call graph information in the callgraph section. Hopefully the loss in precision can be caught by the tests that are added in the set of follow up patches to this one.

OB vs Metadata (extending type metadata or introducing new list of types metadata) — I am not opposed to any of the three approaches but handling the callee_type information through operand bundle feels appropriate and cleaner. @nikic Can you please weigh in on this to decide the direction of this patch? Thank you.

@Prabhuk Prabhuk requested a review from nikic February 19, 2025 00:15
Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request Mar 12, 2025
For each indirect callsite, we would like to emit an operand bundle that
holds the inteded type identifier of the callee. This information will
be used to construct callgraph section in subsequent PRs.

The type information will be pulled from the front-end to have extra
precision since some type information is lost at IR, and to ensure
consistent type identifiers between object files compiled at different
times (as C/C++ standards require language-level types to match).

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Pull Request: llvm#87573
@Prabhuk
Copy link
Contributor Author

Prabhuk commented Mar 15, 2025

@nikic -- ping

Created using spr 1.3.6-beta.1
@nikic
Copy link
Contributor

nikic commented Mar 19, 2025

I still don't think that using operand bundles for this is appropriate. If you take a look at all the other operand bundles (https://llvm.org/docs/LangRef.html#operand-bundles) they consistently have some kind of impact on the semantics of the call. Also keep in mind that the default implication of an operand bundle is that it can read and write arbitrary memory. This is not going to be super relevant in this context because indirect calls already have unknown effects, but would be relevant if devirtualization happens and the operand bundle is still attached.

If you properly design the metadata, I expect that preservation through optimization will be quite robust. There just aren't that many optimizations that act on indirect calls. The main one is merging of multiple call-sites via hoist/sink, and metadata can handle that.

I'll also add that no matter what design you use, you still need to have some kind of graceful fallback for the case where the callee_type bundle/metadata is missing, e.g. because you're linking in an object from a frontend that doesn't support it.

cc @efriedma-quic for a second opinion on what to do here.

Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request Apr 15, 2025
For each indirect callsite, we would like to emit an operand bundle that
holds the inteded type identifier of the callee. This information will
be used to construct callgraph section in subsequent PRs.

The type information will be pulled from the front-end to have extra
precision since some type information is lost at IR, and to ensure
consistent type identifiers between object files compiled at different
times (as C/C++ standards require language-level types to match).

Original RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-June/151044.html
Updated RFC: https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Pull Request: llvm#87573
Created using spr 1.3.6-beta.1
@Prabhuk Prabhuk changed the title [llvm] Introduce callee_type operand bundle [llvm] Introduce callee_type metadata Apr 19, 2025
Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request Apr 22, 2025
Introduce `callee_type` metadata which will be attached to the indirect
call instructions.

The `callee_type` metadata will be used to generate `.callgraph` section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Pull Request: llvm#87573
Prabhuk added 2 commits April 23, 2025 01:01
Created using spr 1.3.6-beta.1
Prabhuk added 2 commits April 23, 2025 23:38
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Prabhuk added 2 commits April 28, 2025 22:48
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
llvm/lib/IR/Verifier.cpp Outdated Show resolved Hide resolved
llvm/lib/IR/Metadata.cpp Outdated Show resolved Hide resolved
llvm/lib/IR/Metadata.cpp Outdated Show resolved Hide resolved
llvm/lib/IR/Verifier.cpp Outdated Show resolved Hide resolved
Prabhuk added 2 commits May 5, 2025 23:12
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
@Prabhuk Prabhuk requested a review from nikic May 10, 2025 00:34
@Prabhuk
Copy link
Contributor Author

Prabhuk commented May 10, 2025

@nikic - Can you please check if there's anything else is needed to land this change? Thank you.

Prabhuk added a commit to Prabhuk/llvm-project that referenced this pull request May 10, 2025
Introduce `callee_type` metadata which will be attached to the indirect
call instructions.

The `callee_type` metadata will be used to generate `.callgraph` section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Pull Request: llvm#87573
llvm/docs/CalleeTypeMetadata.rst Outdated Show resolved Hide resolved
llvm/docs/CalleeTypeMetadata.rst Show resolved Hide resolved
Prabhuk added 2 commits May 14, 2025 02:06
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category llvm:ir llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.