Support stepping through Darwin "branch islands" #139301
This patch allows lldb to step in across "branch islands", which are the Darwin linker's way of dealing with immediate branches whose targets are too far away for the displacement to fit in the instruction's immediate slot. I submitted this a couple of days ago and it failed on the arm64 bot. I was able to match the bot's OS and tool versions (they are a bit old at this point) and ran the test there, but sadly it succeeded. The x86_64 bot also failed, but that was my bad: I used @skipUnlessDarwin when I should have used @skipUnlessAppleSilicon. So this resubmission has the proper decorator on the test, plus a bunch of debug output that is printed in case of failure. With any luck, if this resubmission fails again I'll be able to see what's going on.
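For readers unfamiliar with the range limit, here is a rough sketch of the arithmetic. It assumes the standard AArch64 B/BL encoding (a signed 26-bit immediate counted in 4-byte instructions); the function names are made up for illustration and are not part of the patch.

```python
# Sketch only: AArch64 B/BL carry a signed 26-bit immediate that counts
# 4-byte instructions, which bounds how far a single branch can reach.
IMM_BITS = 26
INSN_SIZE = 4


def branch_range():
    """Return (lowest, highest) byte displacement one B/BL can encode."""
    lowest = -(1 << (IMM_BITS - 1)) * INSN_SIZE
    highest = ((1 << (IMM_BITS - 1)) - 1) * INSN_SIZE
    return lowest, highest


def needs_island(displacement):
    """True when a direct branch cannot cover the displacement (in bytes)."""
    lowest, highest = branch_range()
    return not (lowest <= displacement <= highest)


# Each padding*.s file in the test reserves 120 MiB of text.
PADDING = 120 * 1024 * 1024

one_block = needs_island(PADDING)          # a single block is still reachable
four_blocks = needs_island(-4 * PADDING)   # main -> foo, back across all four
```

With the link order used in the test Makefile (foo.o first, main.o last), the call from main to foo has to travel backward across all four padding blocks, which is well outside the roughly 128 MiB reach and should force the linker to emit islands.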
@llvm/pr-subscribers-lldb
Author: None (jimingham)
Changes
When an intra-module jump doesn't fit in the immediate branch slot, the Darwin linker inserts "branch island" symbols, and emits code to jump from branch island to branch island till it makes it to the actual function.
The previous submissions failed because in that environment the linker was putting the foo.island symbol at the same address as the padding symbol we were emitting to make our faked-up large binary. This submission jams a byte after the padding symbol so that the other symbols can't overlap it.
Full diff: https://github.com/llvm/llvm-project/pull/139301.diff
9 Files Affected:
diff --git a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp
index e25c4ff55e408..578ab12268ea3 100644
--- a/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp
+++ b/lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderDarwin.cpp
@@ -26,6 +26,7 @@
#include "lldb/Target/Thread.h"
#include "lldb/Target/ThreadPlanCallFunction.h"
#include "lldb/Target/ThreadPlanRunToAddress.h"
+#include "lldb/Target/ThreadPlanStepInstruction.h"
#include "lldb/Utility/DataBuffer.h"
#include "lldb/Utility/DataBufferHeap.h"
#include "lldb/Utility/LLDBLog.h"
@@ -923,15 +924,15 @@ DynamicLoaderDarwin::GetStepThroughTrampolinePlan(Thread &thread,
if (current_symbol != nullptr) {
std::vector<Address> addresses;
+ ConstString current_name =
+ current_symbol->GetMangled().GetName(Mangled::ePreferMangled);
if (current_symbol->IsTrampoline()) {
- ConstString trampoline_name =
- current_symbol->GetMangled().GetName(Mangled::ePreferMangled);
- if (trampoline_name) {
+ if (current_name) {
const ModuleList &images = target_sp->GetImages();
SymbolContextList code_symbols;
- images.FindSymbolsWithNameAndType(trampoline_name, eSymbolTypeCode,
+ images.FindSymbolsWithNameAndType(current_name, eSymbolTypeCode,
code_symbols);
for (const SymbolContext &context : code_symbols) {
Address addr = context.GetFunctionOrSymbolAddress();
@@ -945,8 +946,8 @@ DynamicLoaderDarwin::GetStepThroughTrampolinePlan(Thread &thread,
}
SymbolContextList reexported_symbols;
- images.FindSymbolsWithNameAndType(
- trampoline_name, eSymbolTypeReExported, reexported_symbols);
+ images.FindSymbolsWithNameAndType(current_name, eSymbolTypeReExported,
+ reexported_symbols);
for (const SymbolContext &context : reexported_symbols) {
if (context.symbol) {
Symbol *actual_symbol =
@@ -968,7 +969,7 @@ DynamicLoaderDarwin::GetStepThroughTrampolinePlan(Thread &thread,
}
SymbolContextList indirect_symbols;
- images.FindSymbolsWithNameAndType(trampoline_name, eSymbolTypeResolver,
+ images.FindSymbolsWithNameAndType(current_name, eSymbolTypeResolver,
indirect_symbols);
for (const SymbolContext &context : indirect_symbols) {
@@ -1028,6 +1029,23 @@ DynamicLoaderDarwin::GetStepThroughTrampolinePlan(Thread &thread,
thread_plan_sp = std::make_shared<ThreadPlanRunToAddress>(
thread, load_addrs, stop_others);
}
+ // One more case we have to consider is "branch islands". These are regular
+ // TEXT symbols but their names end in .island plus maybe a .digit suffix.
+ // They are to allow arm64 code to branch further than the size of the
+ // address slot allows. We just need to single-instruction step in that
+ // case.
+ static const char *g_branch_island_pattern = "\\.island\\.?[0-9]*$";
+ static RegularExpression g_branch_island_regex(g_branch_island_pattern);
+
+ bool is_branch_island = g_branch_island_regex.Execute(current_name);
+ if (!thread_plan_sp && is_branch_island) {
+ thread_plan_sp = std::make_shared<ThreadPlanStepInstruction>(
+ thread,
+ /* step_over= */ false, /* stop_others */ false, eVoteNoOpinion,
+ eVoteNoOpinion);
+ LLDB_LOG(log, "Stepping one instruction over branch island: '{0}'.",
+ current_name);
+ }
} else {
LLDB_LOGF(log, "Could not find symbol for step through.");
}
diff --git a/lldb/test/API/macosx/branch-islands/Makefile b/lldb/test/API/macosx/branch-islands/Makefile
new file mode 100644
index 0000000000000..062e947f6d6ee
--- /dev/null
+++ b/lldb/test/API/macosx/branch-islands/Makefile
@@ -0,0 +1,16 @@
+C_SOURCES := main.c foo.c
+CFLAGS_EXTRAS := -std=c99
+
+include Makefile.rules
+
+a.out: main.o padding1.o padding2.o padding3.o padding4.o foo.o
+ ${CC} ${LDFLAGS} foo.o padding1.o padding2.o padding3.o padding4.o main.o -o a.out
+
+%.o: $(SRCDIR)/%.s
+ ${CC} -c $<
+
+#padding1.o: padding1.s
+# ${CC} -c $(SRCDIR)/padding1.s
+
+#padding2.o: padding2.s
+# ${CC} -c $(SRCDIR)/padding2.s
diff --git a/lldb/test/API/macosx/branch-islands/TestBranchIslands.py b/lldb/test/API/macosx/branch-islands/TestBranchIslands.py
new file mode 100644
index 0000000000000..d4885b6ead63f
--- /dev/null
+++ b/lldb/test/API/macosx/branch-islands/TestBranchIslands.py
@@ -0,0 +1,35 @@
+"""
+Make sure that we can step in across an arm64 branch island
+"""
+
+
+import lldb
+import lldbsuite.test.lldbutil as lldbutil
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test.decorators import *
+
+
+class TestBranchIslandStepping(TestBase):
+ NO_DEBUG_INFO_TESTCASE = True
+
+ @skipUnlessAppleSilicon
+ def test_step_in_branch_island(self):
+ """Make sure we can step in across a branch island"""
+ self.build()
+ self.main_source_file = lldb.SBFileSpec("main.c")
+ self.do_test()
+
+ def do_test(self):
+ (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
+ self, "Set a breakpoint here", self.main_source_file
+ )
+
+ # Make sure that we did manage to generate a branch island for foo:
+ syms = target.FindSymbols("foo.island", lldb.eSymbolTypeCode)
+ self.assertEqual(len(syms), 1, "We did generate an island for foo")
+
+ thread.StepInto()
+ stop_frame = thread.frames[0]
+ self.assertIn("foo", stop_frame.name, "Stepped into foo")
+ var = stop_frame.FindVariable("a_variable_in_foo")
+ self.assertTrue(var.IsValid(), "Found the variable in foo")
diff --git a/lldb/test/API/macosx/branch-islands/foo.c b/lldb/test/API/macosx/branch-islands/foo.c
new file mode 100644
index 0000000000000..a5dd2e59e1d82
--- /dev/null
+++ b/lldb/test/API/macosx/branch-islands/foo.c
@@ -0,0 +1,6 @@
+#include <stdio.h>
+
+void foo() {
+ int a_variable_in_foo = 10;
+ printf("I am foo: %d.\n", a_variable_in_foo);
+}
diff --git a/lldb/test/API/macosx/branch-islands/main.c b/lldb/test/API/macosx/branch-islands/main.c
new file mode 100644
index 0000000000000..b5578bdd715df
--- /dev/null
+++ b/lldb/test/API/macosx/branch-islands/main.c
@@ -0,0 +1,6 @@
+extern void foo();
+
+int main() {
+ foo(); // Set a breakpoint here
+ return 0;
+}
diff --git a/lldb/test/API/macosx/branch-islands/padding1.s b/lldb/test/API/macosx/branch-islands/padding1.s
new file mode 100644
index 0000000000000..04abef5455c12
--- /dev/null
+++ b/lldb/test/API/macosx/branch-islands/padding1.s
@@ -0,0 +1,5 @@
+.text
+_padding1:
+.p2align 2
+.byte 0x10
+.space 120*1024*1024
diff --git a/lldb/test/API/macosx/branch-islands/padding2.s b/lldb/test/API/macosx/branch-islands/padding2.s
new file mode 100644
index 0000000000000..dc66686cc779f
--- /dev/null
+++ b/lldb/test/API/macosx/branch-islands/padding2.s
@@ -0,0 +1,5 @@
+.text
+_padding2:
+.p2align 2
+.byte 0x10
+.space 120*1024*1024
diff --git a/lldb/test/API/macosx/branch-islands/padding3.s b/lldb/test/API/macosx/branch-islands/padding3.s
new file mode 100644
index 0000000000000..bf920e2e4f643
--- /dev/null
+++ b/lldb/test/API/macosx/branch-islands/padding3.s
@@ -0,0 +1,5 @@
+.text
+_padding3:
+.p2align 2
+.byte 0x10
+.space 120*1024*1024
diff --git a/lldb/test/API/macosx/branch-islands/padding4.s b/lldb/test/API/macosx/branch-islands/padding4.s
new file mode 100644
index 0000000000000..1430fd2fd9729
--- /dev/null
+++ b/lldb/test/API/macosx/branch-islands/padding4.s
@@ -0,0 +1,5 @@
+.text
+_padding4:
+.p2align 2
+.byte 0x10
+.space 120*1024*1024
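As a quick way to see which symbol names the island pattern in the patch accepts, the same regular expression can be exercised in Python. This is only a sketch: the helper name is invented here, the sample names are made up, and it assumes the C++ check behaves like an unanchored search, which is how the pattern (with only a trailing $) reads.

```python
import re

# Same pattern as g_branch_island_pattern in the patch, with the C++
# string escaping removed.
BRANCH_ISLAND_RE = re.compile(r"\.island\.?[0-9]*$")


def is_branch_island(symbol_name):
    """Hypothetical helper: does the name end in .island plus optional digits?"""
    return BRANCH_ISLAND_RE.search(symbol_name) is not None


# Made-up sample names, not taken from a real symbol table.
samples = ["foo.island", "foo.island.2", "foo.island2", "foo", "foo.islander"]
results = {name: is_branch_island(name) for name in samples}
```

Note that because the dot before the digits is optional, a name such as foo.island2 also matches, while foo.islander does not, since the pattern must run to the end of the name.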
LGTM
No, this change wasn't sufficient. There's something in the symbol table emitted by the tools on the builder that causes lldb to miscalculate the extent of the "spacing" such that its range encompasses the branch island symbols. That's why, when we stop at the branch island, the stop address is attributed to the padding symbol at the beginning of the section and not to the islands that the linker inserted at the end of this space.
We chased this down to a bug in the Xcode 15.2 linker that causes lldb to miscalculate the symbol sizes, so that the "padding" symbol overlaps and shadows the ".island" symbol. That bug was fixed in the 15.3 linker, and I don't want to gate this PR on working around an old linker bug. So I added a bit to the test that detects this error and returns from the test. It would be cleaner if we could turn this into an expected failure, but I couldn't see any way in unittest to decide partway through a test method to mark that method as "expected fail".
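To make the shadowing problem concrete, here is a small standalone model of it. This is not the check that was added to the test; the Sym type, the function names, and the addresses are all hypothetical, and it only assumes that an address is attributed to the lowest-starting symbol whose range contains it.

```python
from typing import NamedTuple, Optional, Sequence


class Sym(NamedTuple):
    """Hypothetical stand-in for a symbol table entry."""
    name: str
    start: int
    size: int


def owner_of(addr: int, symbols: Sequence[Sym]) -> Optional[Sym]:
    """Return the lowest-starting symbol whose [start, start + size) holds addr."""
    for sym in sorted(symbols, key=lambda s: s.start):
        if sym.start <= addr < sym.start + sym.size:
            return sym
    return None


def island_is_shadowed(island: Sym, symbols: Sequence[Sym]) -> bool:
    """True when the island's own address is attributed to another symbol."""
    owner = owner_of(island.start, symbols)
    return owner is not None and owner.name != island.name


# Made-up addresses: in the second table the padding size is overestimated
# by four bytes, so it swallows the island that follows it.
island = Sym("foo.island", 0x1100, 4)
good_table = [Sym("padding4", 0x1000, 0x100), island]
bad_table = [Sym("padding4", 0x1000, 0x104), island]

good = island_is_shadowed(island, good_table)
bad = island_is_shadowed(island, bad_table)
```

In the bad table a stop at the island's address would be reported as being inside padding4, which matches the symptom described above.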
I used |
LGTM!
When an intra-module jump doesn't fit in the immediate branch slot, the Darwin linker inserts "branch island" symbols, and emits code to jump from branch island to branch island till it makes it to the actual function.
The previous submissions failed because in that environment the linker was putting the foo.island symbol at the same address as the padding symbol we were emitting to make our faked-up large binary. This submission jams a byte after the padding symbol so that the other symbols can't overlap it.