-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[MC][DebugInfo] Emit linetable entries with known offsets immediately #134677
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-debuginfo Author: Jeremy Morse (jmorse) ChangesDWARF linetable entries are usually emitted as a sequence of MCDwarfLineAddrFragment fragments containing the line-number difference and an MCExpr describing the instruction-range the linetable entry covers. These then get relaxed during assembly emission. However, a large number of these instruction-range expressions are ranges within a fixed MCDataFragment, i.e. a range over fixed-size instructions that are not subject to relaxation at a later stage. Thus, we can compute the address-delta immediately, and not spend time and memory describing that computation so it can be deferred. ~ A slightly earlier+dirtier version of this patch scored a cool 5% reduction in max-rss on most compile-time tracker samples -- see [0], scroll down to stage1-releaselto-g and observe the improvement in ".link" jobs. The intermediate jobs are LTO bitcode compilations so don't codegen, thus only the .link jobs are representative. The three compiles that don't improve (tramp3d, mafft, SPASS) are due to the domination of variable-location storage for tramp3d, and LiveDebugValues hitting max-rss for the last two. I'm not intensely familiar with the MCFragment layer so I'm not sure what the best test strategy is. A stage2reldeb build of clang produced an identical binary, but as a performance improvement it's not clear if this needs a functional test. If so, I imagine using Note that the RISCV opt-out comes from precedent set in Full diff: https://github.com/llvm/llvm-project/pull/134677.diff 1 Files Affected:
diff --git a/llvm/lib/MC/MCObjectStreamer.cpp b/llvm/lib/MC/MCObjectStreamer.cpp
index e228418fea987..8d1eab03290a9 100644
--- a/llvm/lib/MC/MCObjectStreamer.cpp
+++ b/llvm/lib/MC/MCObjectStreamer.cpp
@@ -462,6 +462,24 @@ void MCObjectStreamer::emitDwarfAdvanceLineAddr(int64_t LineDelta,
Label, PointerSize);
return;
}
+
+ // If the two labels are within the same fragment and it's a plain data
+ // fragment, then the address-offset is already a fixed constant and is not
+ // relaxable. Emit the advance-line-addr data immediately to save time and
+ // memory.
+ // As per commit bbea64250f6548, RISCV always desires symbolic relocations.
+ MCFragment *LastFrag = LastLabel->getFragment();
+ if (!getAssembler().getContext().getTargetTriple().isRISCV() &&
+ LastFrag->getKind() == MCFragment::FT_Data &&
+ Label->getFragment() == LastFrag) {
+ uint64_t AddrDelta = Label->getOffset() - LastLabel->getOffset();
+ SmallString<16> Tmp;
+ MCDwarfLineAddr::encode(getContext(), Assembler->getDWARFLinetableParams(),
+ LineDelta, AddrDelta, Tmp);
+ emitBytes(Tmp);
+ return;
+ }
+
const MCExpr *AddrDelta = buildSymbolDiff(*this, Label, LastLabel, SMLoc());
insert(getContext().allocFragment<MCDwarfLineAddrFragment>(LineDelta,
*AddrDelta));
|
The code does exhibit inefficiencies. Simplifying the code would enhance maintainability and reduce potential errors, particularly for linker relaxation targets like RISC-V, LoongArch, and potentially Xtensa. (My familiarity with those targets is limited. Many non-RISC-V targets probably haven't investigated much in this area yet.) If we are going to introduce some complexity - yeah, testing BTW, have you tried moving the logic to |
llvm/lib/MC/MCObjectStreamer.cpp
Outdated
// fragment, then the address-offset is already a fixed constant and is not | ||
// relaxable. Emit the advance-line-addr data immediately to save time and | ||
// memory. | ||
// As per commit bbea64250f6548, RISCV always desires symbolic relocations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps use absoluteSymbolDiff
and remove the isRISCV
check. I've improved absoluteSymbolDiff
and you may need to rebase
e5bcdbf
to
da53ea3
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
Thanks for the review, the simple bits first:
The most unclear thing to me right now is the RISCV behaviour, and in particular the test llvm/test/DebugInfo/RISCV/dwarf-riscv-relocs.ll . With todays trunk LLVM, it produces a .debug_line section that uses DW_LNS_fixed_advance_pc and a relocation to compute each address-delta in the linetable. With my patch applied, it instead uses a linetable special-opcode to advance the address in the linetable, because the address delta becomes a fixed value at assembly time. The only way to preserve this behaviour would be to add a target hook to let backends opt-out of the optimisation that I'm adding, such as With that in mind, I'll signal the RISCV folks to see whether this is a behaviour they want to see preserved or whether it's a spurious test difference. |
Ah -- actually I think I see what's going on with that test. It has portions in .debug_line that cross an With that in mind, I'll remove the |
This is a pre-commit for a patch (#134677) -- the test behaviour is not being changed, instead I've added "-v" to the llvm-dwarfdump commandline for --debug-line. This prints out how the linetable is encoded, not just what the linetable means, and it's important to illustrate that in the upcoming patch. I've separated out the llvm-dwarfdump line for .debug_info so that it's not affected by adding -v.
DWARF linetable entries are usually emitted as a sequence of MCDwarfLineAddrFragment fragments containing the line-number difference and an MCExpr describing the instruction-range the linetable entry covers. These then get relaxed during assembly emission. However, a large number of these instruction-range expressions are ranges within a fixed MCDataFragment, i.e. a range over fixed-size instructions that are not subject to relaxation at a later stage. Thus, we can compute the address-delta immediately, and not spend time and memory describing that computation so it can be deferred.
See the comments on bbea642, the RISCV relocation model does something differently, and thus we need to emit linetables in a fully symbolic way.
…elocs Specifically: bbea642 took away a hook for whether a target wanted symbol differences to be expressed in relocations or not, and here I've found a use case for it.
Turns out this isn't needed, each fragment records whether it's relaxable
da53ea3
to
ba944d4
Compare
I believe this latest version now gets it right (TM) for RISCV as well -- observe the changes to the encoding of dwarf-riscv-relocs.ll. The assembly for this file reads:
i.e. there's a linker-relaxable relocation at address 0x10. The line table retains a Thanks for the blog post at https://maskray.me/blog/2021-03-14-the-dark-side-of-riscv-linker-relaxation , that's educated me about what's going on here @MaskRay . |
llvm/lib/MC/MCObjectStreamer.cpp
Outdated
@@ -464,6 +464,19 @@ void MCObjectStreamer::emitDwarfAdvanceLineAddr(int64_t LineDelta, | ||
Label, PointerSize); | ||
return; | ||
} | ||
|
||
// If the two labels are within the same fragment and it's a plain data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the two labels are within the same data fragment, then the address-offset is ...
(a data fragment is a MCDataFragment. There is no special or plain data fragment)
@@ -0,0 +1,94 @@ | ||
; RUN: llc %s -o %t.o -filetype=obj | ||
; RUN: llvm-dwarfdump --debug-line %t.o | FileCheck %s --check-prefix=LINES | ||
; RUN: llc %s -o %t.o -filetype=obj -debug-only=mc-dump 2>&1 | FileCheck %s --check-prefix=FRAGMENTS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding -debug-only tests requires caution. They need REQUIRES: asserts
Reworded comment, added REQUIRES, thanks for the review! |
DWARF linetable entries are usually emitted as a sequence of MCDwarfLineAddrFragment fragments containing the line-number difference and an MCExpr describing the instruction-range the linetable entry covers. These then get relaxed during assembly emission.
However, a large number of these instruction-range expressions are ranges within a fixed MCDataFragment, i.e. a range over fixed-size instructions that are not subject to relaxation at a later stage. Thus, we can compute the address-delta immediately, and not spend time and memory describing that computation so it can be deferred.
~
A slightly earlier+dirtier version of this patch scored a cool 5% reduction in max-rss on most compile-time tracker samples -- see [0], scroll down to stage1-releaselto-g and observe the improvement in ".link" jobs. The intermediate jobs are LTO bitcode compilations so don't codegen, thus only the .link jobs are representative. The three compiles that don't improve (tramp3d, mafft, SPASS) are due to the domination of variable-location storage for tramp3d, and LiveDebugValues hitting max-rss for the last two.
I'm not intensely familiar with the MCFragment layer so I'm not sure what the best test strategy is. A stage2reldeb build of clang produced an identical binary, but as a performance improvement it's not clear if this needs a functional test. If so, I imagine using
-debug-only=mc-dump
and examining what fragments are produced would be the way forwards?Note that the RISCV opt-out comes from precedent set in
MCObjectStreamer::emitAbsoluteSymbolDiff
and friends, RISCV appears to want to opt out of these space optimisations.[0] https://llvm-compile-time-tracker.com/compare.php?from=e2c43ba981620cf71ce3ccf004db7c0db4caf8a7&to=ca512762f92108bea21b2e2517f9e17f9d680a33&stat=max-rss&details=on