-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[FuncAttrs] Relax norecurse attribute inference #139943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
or is part of a recursive call chain which does not include the caller.
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-transforms Author: Usha Gupta (usha1830) ChangesThe current code adds This is a little conservative check and prevents cases if there is self-recursive callee or if the callee is in a recursive chain which does not include current caller. Example scenarios:
main and callee1 could have Similarly, main can have the This PR relaxes the requirement that all callees must have the Patch is 22.25 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139943.diff 4 Files Affected:
diff --git a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
index f918d7e059b63..293699a1639c6 100644
--- a/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionAttrs.cpp
@@ -2059,8 +2059,49 @@ static void inferAttrsFromFunctionBodies(const SCCNodeSet &SCCNodes,
AI.run(SCCNodes, Changed);
}
+/// Returns true if N or any function it (transitively) calls makes a call
+/// to an unknown external function: either an indirect call or a declaration
+/// without the NoCallback attribute.
+static bool callsUnknownExternal(LazyCallGraph::Node *N) {
+ std::deque<LazyCallGraph::Node *> Worklist;
+ DenseSet<LazyCallGraph::Node *> Visited;
+ Worklist.push_back(N);
+ Visited.insert(N);
+
+ while (!Worklist.empty()) {
+ auto *Cur = Worklist.front();
+ Worklist.pop_front();
+
+ Function &F = Cur->getFunction();
+ for (auto &BB : F) {
+ for (auto &I : BB.instructionsWithoutDebug()) {
+ if (auto *CB = dyn_cast<CallBase>(&I)) {
+ const Function *Callee = CB->getCalledFunction();
+ // Indirect call, treat as unknown external function.
+ if (!Callee)
+ return true;
+ // Extern declaration without NoCallback attribute
+ if (Callee->isDeclaration() &&
+ !Callee->hasFnAttribute(Attribute::NoCallback))
+ return true;
+ }
+ }
+ }
+
+ // Enqueue all direct call-edge successors for further scanning
+ for (auto &Edge : Cur->populate().calls()) {
+ LazyCallGraph::Node *Succ = &Edge.getNode();
+ if (Visited.insert(Succ).second)
+ Worklist.push_back(Succ);
+ }
+ }
+
+ return false;
+}
+
static void addNoRecurseAttrs(const SCCNodeSet &SCCNodes,
- SmallSet<Function *, 8> &Changed) {
+ SmallSet<Function *, 8> &Changed,
+ LazyCallGraph &CG) {
// Try and identify functions that do not recurse.
// If the SCC contains multiple nodes we know for sure there is recursion.
@@ -2071,24 +2112,35 @@ static void addNoRecurseAttrs(const SCCNodeSet &SCCNodes,
if (!F || !F->hasExactDefinition() || F->doesNotRecurse())
return;
- // If all of the calls in F are identifiable and are to norecurse functions, F
- // is norecurse. This check also detects self-recursion as F is not currently
- // marked norecurse, so any called from F to F will not be marked norecurse.
- for (auto &BB : *F)
- for (auto &I : BB.instructionsWithoutDebug())
+ // If all of the calls in F are identifiable and can be proven to not
+ // callback F, F is norecurse. This check also detects self-recursion
+ // as F is not currently marked norecurse, so any call from F to F
+ // will not be marked norecurse.
+ for (auto &BB : *F) {
+ for (auto &I : BB.instructionsWithoutDebug()) {
if (auto *CB = dyn_cast<CallBase>(&I)) {
Function *Callee = CB->getCalledFunction();
- if (!Callee || Callee == F ||
- (!Callee->doesNotRecurse() &&
- !(Callee->isDeclaration() &&
- Callee->hasFnAttribute(Attribute::NoCallback))))
- // Function calls a potentially recursive function.
+
+ if (!Callee || Callee == F)
+ return;
+
+ // External call with NoCallback attribute.
+ if (Callee->isDeclaration()) {
+ if (Callee->hasFnAttribute(Attribute::NoCallback))
+ continue;
return;
+ }
+
+ if (auto *CNode = CG.lookup(*Callee))
+ if (callsUnknownExternal(CNode))
+ return;
}
+ }
+ }
- // Every call was to a non-recursive function other than this function, and
- // we have no indirect recursion as the SCC size is one. This function cannot
- // recurse.
+ // Every call was either to an external function guaranteed to not make a
+ // call to this function or a direct call to internal function and we have no
+ // indirect recursion as the SCC size is one. This function cannot recurse.
F->setDoesNotRecurse();
++NumNoRecurse;
Changed.insert(F);
@@ -2248,7 +2300,7 @@ static SCCNodesResult createSCCNodeSet(ArrayRef<Function *> Functions) {
template <typename AARGetterT>
static SmallSet<Function *, 8>
deriveAttrsInPostOrder(ArrayRef<Function *> Functions, AARGetterT &&AARGetter,
- bool ArgAttrsOnly) {
+ bool ArgAttrsOnly, LazyCallGraph &CG) {
SCCNodesResult Nodes = createSCCNodeSet(Functions);
// Bail if the SCC only contains optnone functions.
@@ -2279,7 +2331,7 @@ deriveAttrsInPostOrder(ArrayRef<Function *> Functions, AARGetterT &&AARGetter,
addNoAliasAttrs(Nodes.SCCNodes, Changed);
addNonNullAttrs(Nodes.SCCNodes, Changed);
inferAttrsFromFunctionBodies(Nodes.SCCNodes, Changed);
- addNoRecurseAttrs(Nodes.SCCNodes, Changed);
+ addNoRecurseAttrs(Nodes.SCCNodes, Changed, CG);
}
// Finally, infer the maximal set of attributes from the ones we've inferred
@@ -2323,7 +2375,7 @@ PreservedAnalyses PostOrderFunctionAttrsPass::run(LazyCallGraph::SCC &C,
}
auto ChangedFunctions =
- deriveAttrsInPostOrder(Functions, AARGetter, ArgAttrsOnly);
+ deriveAttrsInPostOrder(Functions, AARGetter, ArgAttrsOnly, CG);
if (ChangedFunctions.empty())
return PreservedAnalyses::all();
diff --git a/llvm/test/Transforms/FunctionAttrs/norecurse_multiSCC_indirect_recursion.ll b/llvm/test/Transforms/FunctionAttrs/norecurse_multiSCC_indirect_recursion.ll
new file mode 100644
index 0000000000000..c0af16b069bde
--- /dev/null
+++ b/llvm/test/Transforms/FunctionAttrs/norecurse_multiSCC_indirect_recursion.ll
@@ -0,0 +1,159 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --version 5
+; RUN: opt < %s -passes=function-attrs -S | FileCheck %s
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
+target triple = "aarch64-unknown-linux-gnu"
+
+; This test includes a call graph with multiple SCCs. The purpose of this is
+; to check that norecurse is not added when a function is part of non-singular
+; SCC.
+; There are three different SCCs in this test:
+; SCC#1: main, foo, bar, foo1, bar1
+; SCC#2: bar2, bar3, bar4
+; SCC#3: baz, fun
+; None of these functions should be marked as norecurse
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @bar1() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @bar1(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[CALL:%.*]] = tail call i32 @main()
+; CHECK-NEXT: ret void
+;
+entry:
+ %call = tail call i32 @main()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local noundef i32 @main() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local noundef i32 @main(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @foo()
+; CHECK-NEXT: tail call void @bar2()
+; CHECK-NEXT: tail call void @baz()
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ tail call void @foo()
+ tail call void @bar2()
+ tail call void @baz()
+ ret i32 0
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @foo1() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @foo1(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar1()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @bar1()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @bar() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @bar(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @foo1()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @foo1()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @foo() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @foo(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @bar()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @bar4() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @bar4(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar2()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @bar2()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @bar2() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @bar2(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar3()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @bar3()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @bar3() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @bar3(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar4()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @bar4()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @fun() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @fun(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @baz()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @baz()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @baz() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @baz(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @fun()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @fun()
+ ret void
+}
+
+attributes #0 = { nofree noinline nosync nounwind memory(none) uwtable "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,-fmv" }
diff --git a/llvm/test/Transforms/FunctionAttrs/norecurse_multiSCC_indirect_recursion1.ll b/llvm/test/Transforms/FunctionAttrs/norecurse_multiSCC_indirect_recursion1.ll
new file mode 100644
index 0000000000000..60bbaab2a7d66
--- /dev/null
+++ b/llvm/test/Transforms/FunctionAttrs/norecurse_multiSCC_indirect_recursion1.ll
@@ -0,0 +1,102 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --version 5
+; RUN: opt < %s -passes=function-attrs -S | FileCheck %s
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
+target triple = "aarch64-unknown-linux-gnu"
+
+; This test includes a call graph with multiple SCCs. The purpose of this is
+; to check that norecurse is added to a function which calls a function which
+; is indirectly recursive but is not part of the recursive chain.
+; There are two SCCs in this test:
+; SCC#1: bar2, bar3, bar4
+; SCC#2: baz, fun
+; main() calls bar2 and baz, both of which are part of some indirect recursive
+; chain. but does not call back main() and hence main() can be marked as
+; norecurse.
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local noundef i32 @main() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline norecurse nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local noundef i32 @main(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar2()
+; CHECK-NEXT: tail call void @baz()
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ tail call void @bar2()
+ tail call void @baz()
+ ret i32 0
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @bar4() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @bar4(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR1:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar2()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @bar2()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @bar2() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @bar2(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR1]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar3()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @bar3()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @bar3() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @bar3(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR1]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @bar4()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @bar4()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @fun() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @fun(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR1]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @baz()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @baz()
+ ret void
+}
+
+; Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+define dso_local void @baz() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline nosync nounwind memory(none) uwtable
+; CHECK-LABEL: define dso_local void @baz(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR1]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @fun()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @fun()
+ ret void
+}
+
+attributes #0 = { nofree noinline nosync nounwind memory(none) uwtable "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,-fmv" }
diff --git a/llvm/test/Transforms/FunctionAttrs/norecurse_self_recursive_callee.ll b/llvm/test/Transforms/FunctionAttrs/norecurse_self_recursive_callee.ll
new file mode 100644
index 0000000000000..9b87c4e4ad76c
--- /dev/null
+++ b/llvm/test/Transforms/FunctionAttrs/norecurse_self_recursive_callee.ll
@@ -0,0 +1,173 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-attributes --version 5
+; RUN: opt < %s -passes=function-attrs -S | FileCheck %s
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
+target triple = "aarch64-unknown-linux-gnu"
+
+; This test includes a call graph with a self recursive function.
+; The purpose of this is to check that norecurse is added to functions
+; which have a self-recursive function in the call-chain.
+; The call-chain in this test is as follows
+; main -> bob -> callee1 -> callee2 -> callee3 -> callee4 -> callee5
+; where callee5 is self recursive.
+
+@x = dso_local global i32 4, align 4
+@y = dso_local global i32 2, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind memory(readwrite, argmem: none) uwtable
+define dso_local void @callee6() local_unnamed_addr #0 {
+; CHECK: Function Attrs: nofree noinline norecurse nounwind memory(readwrite, argmem: none) uwtable
+; CHECK-LABEL: define dso_local void @callee6(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[TMP0:%.*]] = load volatile i32, ptr @y, align 4, !tbaa [[TBAA8:![0-9]+]]
+; CHECK-NEXT: [[INC:%.*]] = add nsw i32 [[TMP0]], 1
+; CHECK-NEXT: store volatile i32 [[INC]], ptr @y, align 4, !tbaa [[TBAA8]]
+; CHECK-NEXT: ret void
+;
+entry:
+ %0 = load volatile i32, ptr @y, align 4, !tbaa !8
+ %inc = add nsw i32 %0, 1
+ store volatile i32 %inc, ptr @y, align 4, !tbaa !8
+ ret void
+}
+
+; Function Attrs: nofree noinline nounwind uwtable
+define dso_local void @callee5(i32 noundef %x) local_unnamed_addr #1 {
+; CHECK: Function Attrs: nofree noinline nounwind uwtable
+; CHECK-LABEL: define dso_local void @callee5(
+; CHECK-SAME: i32 noundef [[X:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp sgt i32 [[X]], 0
+; CHECK-NEXT: br i1 [[CMP]], label %[[IF_THEN:.*]], label %[[IF_END:.*]]
+; CHECK: [[IF_THEN]]:
+; CHECK-NEXT: tail call void @callee5(i32 noundef [[X]])
+; CHECK-NEXT: br label %[[IF_END]]
+; CHECK: [[IF_END]]:
+; CHECK-NEXT: tail call void @callee6()
+; CHECK-NEXT: ret void
+;
+entry:
+ %cmp = icmp sgt i32 %x, 0
+ br i1 %cmp, label %if.then, label %if.end
+
+if.then: ; preds = %entry
+ tail call void @callee5(i32 noundef %x)
+ br label %if.end
+
+if.end: ; preds = %if.then, %entry
+ tail call void @callee6()
+ ret void
+}
+
+; Function Attrs: nofree noinline nounwind uwtable
+define dso_local void @callee4() local_unnamed_addr #1 {
+; CHECK: Function Attrs: nofree noinline norecurse nounwind uwtable
+; CHECK-LABEL: define dso_local void @callee4(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR2:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[TMP0:%.*]] = load volatile i32, ptr @x, align 4, !tbaa [[TBAA8]]
+; CHECK-NEXT: tail call void @callee5(i32 noundef [[TMP0]])
+; CHECK-NEXT: ret void
+;
+entry:
+ %0 = load volatile i32, ptr @x, align 4, !tbaa !8
+ tail call void @callee5(i32 noundef %0)
+ ret void
+}
+
+; Function Attrs: nofree noinline nounwind uwtable
+define dso_local void @callee3() local_unnamed_addr #1 {
+; CHECK: Function Attrs: nofree noinline norecurse nounwind uwtable
+; CHECK-LABEL: define dso_local void @callee3(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR2]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @callee4()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @callee4()
+ ret void
+}
+
+; Function Attrs: nofree noinline nounwind uwtable
+define dso_local void @callee2() local_unnamed_addr #1 {
+; CHECK: Function Attrs: nofree noinline norecurse nounwind uwtable
+; CHECK-LABEL: define dso_local void @callee2(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR2]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @callee3()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @callee3()
+ ret void
+}
+
+; Function Attrs: nofree noinline nounwind uwtable
+define dso_local void @callee1() local_unnamed_addr #1 {
+; CHECK: Function Attrs: nofree noinline norecurse nounwind uwtable
+; CHECK-LABEL: define dso_local void @callee1(
+; CHECK-SAME: ) local_unnamed_addr #[[ATTR2]] {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: tail call void @callee2()
+; CHECK-NEXT: ret void
+;
+entry:
+ tail call void @callee2()
+ ret void
+}
+
+; Function Attrs: nofree noinline nounwind uwtable
+define dso_local void @bob() local...
[truncated]
|
Worklist.pop_front(); | ||
|
||
Function &F = Cur->getFunction(); | ||
for (auto &BB : F) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there perhaps already a utility routine that can return a list of all calls made by a Function? Or could we make the same check in the loop below that looks at all the edges? I took a look at CalleeInfo
in llvm/include/llvm/IR/ModuleSummaryIndex.h and it doesn't have any information about callbacks, but perhaps we could add one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into it.
Is there perhaps already a utility routine that can return a list of all calls made by a Function? Or could we make the same check in the loop below that looks at all the edges?
I think if I populate the edges early, I can use the calls() function instead of iterating over every instruction in every BB. I will refactor these loops.
I took a look at CalleeInfo in llvm/include/llvm/IR/ModuleSummaryIndex.h and it doesn't have any information about callbacks, but perhaps we could add one?
Yeah. I can look at that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed this helper now and updated my fix approach to check if the address of this function is taken. If the address is not taken, we can presume that the external function deeper in call stack could not reach back the current function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think FunctionAttrs should be performing inference that inspects more than the current SCC.
Hi @nikic, Thanks for reviewing! Are you referring to the check added to detect unknown external calls in a callee’s call graph? This check is now necessary because we are no longer checking norecurse on each callee. Without this, we might incorrectly infer norecurse for a function that calls another function with unknown external call. For example: An alternative approach could be to ensure that the function F has only direct call users. If there are any non-call uses of F (e.g., being stored to a variable or passed as a function pointer), we should avoid inferring norecurse. This could further uncover more opportunities for inferring norecurse attribute. |
…ould be called back in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reasoning about uses of F is only going to work if the function has local linkage. hasAddressTaken() does not take into account potential uses in other modules.
// marked norecurse, so any called from F to F will not be marked norecurse. | ||
for (auto &BB : *F) | ||
for (auto &I : BB.instructionsWithoutDebug()) | ||
if (F->hasAddressTaken()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, thinking about this now I'm not sure it solves problems like this:
foo() -> bar() -> moo() -> foo()
or
boo() -> foo() -> bar() -> moo() -> boo()
where effectively foo() is recursive despite foo() not directly calling itself? I think the previous version of your patch solved this problem, although @nikic mentioned FunctionAttrs probably isn't the right place for deeper inspection. For a full analysis I think you still need to inspect the call graph. I'm not sure how useful this is, but you could check that all functions called by foo()
don't make any function calls themselves? I don't know if this is a quick and easy check, or whether it's just as computationally expensive as your previous version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also may not add much value because you're only going one level deeper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only place where your current version will work is at LTO time when foo()
is actually main()
, there are no other callers or addresses taken of main()
.
The current code adds
norecurse
attribute to a function only if all of its callee functions havenorecurse
attribute.This is a little conservative check and prevents cases if there is self-recursive callee or if the callee is in a recursive chain which does not include current caller.
Example scenarios:
main -> callee1 -> callee2 -> callee2
main and callee1 could have
norecurse
in this case given the other constraints are satisfied.Similarly, main can have the
norecurse
attribute in the following case.main -> callee1 -> callee2 -> callee1
This PR relaxes the requirement that all callees must have the
norecurse
attribute. It adds a check to prevent propagatingnorecurse
to functions that may indirectly call unknown external functions lacking theNoCallback
attribute, as these may introduce callbacks.