-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[ELFObject] Added conditions to print removed symbols and removed sections #124692
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This patch adds dbgs() statement inside removeSections() and removeSymbol() function to print the removed sections and symbols. Fixes: llvm#123041
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-binary-utilities Author: Kshitij Paranjape (kshitijvp) ChangesThis patch adds dbgs() statement inside removeSections() and removeSymbol() function to print the removed sections and symbols. Fixes: #123041 Full diff: https://github.com/llvm/llvm-project/pull/124692.diff 2 Files Affected:
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.cpp b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
index 45c7ea49b5d938..94c73b62de12e1 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
@@ -766,8 +766,14 @@ Error SymbolTableSection::removeSymbols(
function_ref<bool(const Symbol &)> ToRemove) {
Symbols.erase(
std::remove_if(std::begin(Symbols) + 1, std::end(Symbols),
- [ToRemove](const SymPtr &Sym) { return ToRemove(*Sym); }),
- std::end(Symbols));
+ [ToRemove](const SymPtr &Sym) {
+ if (ToRemove(*Sym)) {
+ dbgs()<<"Symbols Removed:"<<Sym->Name<< "\n";
+ return true;
+ }
+ return false;
+ }));
+
auto PrevSize = Size;
Size = Symbols.size() * EntrySize;
if (Size < PrevSize)
@@ -2249,10 +2255,18 @@ Error Object::removeSections(
// Transfer removed sections into the Object RemovedSections container for use
// later.
- std::move(Iter, Sections.end(), std::back_inserter(RemovedSections));
- // Now finally get rid of them all together.
- Sections.erase(Iter, std::end(Sections));
- return Error::success();
+ for(auto &KeepSec : make_range(std::begin(Sections) , Iter))
+ {
+
+ if (Error E = KeepSec->removeSectionReferences(
+ AllowBrokenLinks, [&RemoveSections](const SectionBase *Sec) {
+ return RemoveSections.find(Sec) != RemoveSections.end();
+ }))
+ std::move(Iter, Sections.end(), std::back_inserter(RemovedSections));
+ dbgs()<<"Sections Removed:"<<KeepSec->Name<<'\n';
+ Sections.erase(Iter, std::end(Sections));
+ return Error::success();
+ }
}
Error Object::replaceSections(
diff --git a/llvm/tools/llvm-objcopy/CommonOpts.td b/llvm/tools/llvm-objcopy/CommonOpts.td
index c247c93f6e0f27..5b15191f546051 100644
--- a/llvm/tools/llvm-objcopy/CommonOpts.td
+++ b/llvm/tools/llvm-objcopy/CommonOpts.td
@@ -117,6 +117,8 @@ def regex
def version : Flag<["--"], "version">,
HelpText<"Print the version and exit.">;
+def verbose : Flag<["--"], "verbose">,
+ HelpText<"Prints the removed symbols and sections">;
def V : Flag<["-"], "V">,
Alias<version>,
HelpText<"Alias for --version">;
|
You can test this locally with the following command:git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- llvm/include/llvm/ObjCopy/CommonConfig.h llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp llvm/lib/ObjCopy/ELF/ELFObject.cpp llvm/lib/ObjCopy/ELF/ELFObject.h llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp llvm/tools/llvm-objcopy/ObjcopyOptions.cpp View the diff from clang-format here.diff --git a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
index 6eec21216..ad31f7ed7 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
@@ -547,7 +547,8 @@ static Error replaceAndRemoveSections(const CommonConfig &Config,
};
}
- if (Error E = Obj.removeSections(ELFConfig.AllowBrokenLinks, RemovePred, Config.Verbose))
+ if (Error E = Obj.removeSections(ELFConfig.AllowBrokenLinks, RemovePred,
+ Config.Verbose))
return E;
if (Error E = Obj.compressOrDecompressSections(Config))
@@ -792,7 +793,8 @@ static Error handleArgs(const CommonConfig &Config, const ELFConfig &ELFConfig,
if (!Config.SplitDWO.empty() && Config.ExtractDWO) {
return Obj.removeSections(
ELFConfig.AllowBrokenLinks,
- [&Obj](const SectionBase &Sec) { return onlyKeepDWOPred(Obj, Sec); }, Config.Verbose);
+ [&Obj](const SectionBase &Sec) { return onlyKeepDWOPred(Obj, Sec); },
+ Config.Verbose);
}
// Dump sections before add/remove for compatibility with GNU objcopy.
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.cpp b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
index fece7a88b..cc80dc365 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
@@ -764,16 +764,15 @@ void SymbolTableSection::updateSymbols(function_ref<void(Symbol &)> Callable) {
Error SymbolTableSection::removeSymbols(
function_ref<bool(const Symbol &)> ToRemove) {
- Symbols.erase(
- std::remove_if(std::begin(Symbols) + 1, std::end(Symbols),
- [&](const SymPtr &Sym) {
- if (ToRemove(*Sym)) {
- if(VerboseOutput)
- outs() << "Symbols Removed:" << Sym->Name<< "\n";
- return true;
- }
- return false;
- }));
+ Symbols.erase(std::remove_if(
+ std::begin(Symbols) + 1, std::end(Symbols), [&](const SymPtr &Sym) {
+ if (ToRemove(*Sym)) {
+ if (VerboseOutput)
+ outs() << "Symbols Removed:" << Sym->Name << "\n";
+ return true;
+ }
+ return false;
+ }));
auto PrevSize = Size;
Size = Symbols.size() * EntrySize;
@@ -2200,8 +2199,9 @@ Error Object::updateSectionData(SectionBase &S, ArrayRef<uint8_t> Data) {
return updateSectionData(*It, Data);
}
-Error Object::removeSections(
- bool AllowBrokenLinks, std::function<bool(const SectionBase &)> ToRemove, bool VerboseOutput) {
+Error Object::removeSections(bool AllowBrokenLinks,
+ std::function<bool(const SectionBase &)> ToRemove,
+ bool VerboseOutput) {
auto Iter = std::stable_partition(
std::begin(Sections), std::end(Sections), [=](const SecPtr &Sec) {
@@ -2282,15 +2282,16 @@ Error Object::replaceSections(
if (Error E = removeSections(
/*AllowBrokenLinks=*/false,
- [=](const SectionBase &Sec) { return FromTo.count(&Sec) > 0; }, false))
+ [=](const SectionBase &Sec) { return FromTo.count(&Sec) > 0; },
+ false))
return E;
llvm::sort(Sections, SectionIndexLess);
return Error::success();
}
Error Object::removeSymbols(function_ref<bool(const Symbol &)> ToRemove) {
- if (SymbolTable){
- for (const SecPtr &Sec : Sections){
+ if (SymbolTable) {
+ for (const SecPtr &Sec : Sections) {
if (Error E = Sec->removeSymbols(ToRemove))
return E;
outs() << "removed symbols:" << Sec->Name;
@@ -2580,9 +2581,12 @@ static Error removeUnneededSections(Object &Obj) {
auto *StrTab = Obj.SymbolTable->getStrTab() == Obj.SectionNames
? nullptr
: Obj.SymbolTable->getStrTab();
- return Obj.removeSections(false, [&](const SectionBase &Sec) {
- return &Sec == Obj.SymbolTable || &Sec == StrTab;
- }, false);
+ return Obj.removeSections(
+ false,
+ [&](const SectionBase &Sec) {
+ return &Sec == Obj.SymbolTable || &Sec == StrTab;
+ },
+ false);
}
template <class ELFT> Error ELFWriter<ELFT>::finalize() {
@@ -2633,10 +2637,12 @@ template <class ELFT> Error ELFWriter<ELFT>::finalize() {
// references to it.
if (Obj.SectionIndexTable != nullptr) {
// We do not support sections referring to the section index table.
- if (Error E = Obj.removeSections(false /*AllowBrokenLinks*/,
- [this](const SectionBase &Sec) {
- return &Sec == Obj.SectionIndexTable;
- }, false))
+ if (Error E = Obj.removeSections(
+ false /*AllowBrokenLinks*/,
+ [this](const SectionBase &Sec) {
+ return &Sec == Obj.SectionIndexTable;
+ },
+ false))
return E;
}
}
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.h b/llvm/lib/ObjCopy/ELF/ELFObject.h
index bcda10034..42a6cab11 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.h
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.h
@@ -816,6 +816,7 @@ class SymbolTableSection : public SectionBase {
private:
bool VerboseOutput;
+
protected:
std::vector<std::unique_ptr<Symbol>> Symbols;
StringTableSection *SymbolNames = nullptr;
@@ -1228,7 +1229,8 @@ public:
ConstRange<Segment> segments() const { return make_pointee_range(Segments); }
Error removeSections(bool AllowBrokenLinks,
- std::function<bool(const SectionBase &)> ToRemove, bool VerboseOutput);
+ std::function<bool(const SectionBase &)> ToRemove,
+ bool VerboseOutput);
Error compressOrDecompressSections(const CommonConfig &Config);
Error replaceSections(const DenseMap<SectionBase *, SectionBase *> &FromTo);
Error removeSymbols(function_ref<bool(const Symbol &)> ToRemove);
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index 3e245532d..b5a3482c4 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -2872,32 +2872,33 @@ Instruction *InstCombinerImpl::visitAnd(BinaryOperator &I) {
/*SimplifyOnly*/ false, *this))
return BinaryOperator::CreateAnd(Op0, V);
// I is the 'and' instruction
-if (auto *Add = dyn_cast<BinaryOperator>(I.getOperand(0))) {
- if (Add->getOpcode() == Instruction::Add) {
+ if (auto *Add = dyn_cast<BinaryOperator>(I.getOperand(0))) {
+ if (Add->getOpcode() == Instruction::Add) {
Value *LHS = Add->getOperand(0); // should be shl
ConstantInt *AddConst = dyn_cast<ConstantInt>(Add->getOperand(1));
ConstantInt *AndConst = dyn_cast<ConstantInt>(I.getOperand(1));
if (AddConst && AndConst) {
- // check if AddConst is 47 and AndConst is -32
- if (AddConst->equalsInt(47) && AndConst->getSExtValue() == -32) {
- // Check if LHS is shl i64 %a, 5
- if (auto *Shl = dyn_cast<BinaryOperator>(LHS)) {
- if (Shl->getOpcode() == Instruction::Shl) {
- ConstantInt *ShlConst = dyn_cast<ConstantInt>(Shl->getOperand(1));
- if (ShlConst && ShlConst->equalsInt(5)) {
- // You've matched the pattern!
- // Replace with: shl i64 (add i64 %a, 1), 5
- IRBuilder<> Builder(&I);
- Value *NewAdd = Builder.CreateAdd(Shl->getOperand(0), ConstantInt::get(I.getType(), 1));
- Value *NewShl = Builder.CreateShl(NewAdd, ShlConst);
- I.replaceAllUsesWith(NewShl);
- }
- }
+ // check if AddConst is 47 and AndConst is -32
+ if (AddConst->equalsInt(47) && AndConst->getSExtValue() == -32) {
+ // Check if LHS is shl i64 %a, 5
+ if (auto *Shl = dyn_cast<BinaryOperator>(LHS)) {
+ if (Shl->getOpcode() == Instruction::Shl) {
+ ConstantInt *ShlConst = dyn_cast<ConstantInt>(Shl->getOperand(1));
+ if (ShlConst && ShlConst->equalsInt(5)) {
+ // You've matched the pattern!
+ // Replace with: shl i64 (add i64 %a, 1), 5
+ IRBuilder<> Builder(&I);
+ Value *NewAdd = Builder.CreateAdd(
+ Shl->getOperand(0), ConstantInt::get(I.getType(), 1));
+ Value *NewShl = Builder.CreateShl(NewAdd, ShlConst);
+ I.replaceAllUsesWith(NewShl);
}
+ }
}
+ }
}
+ }
}
-}
return nullptr;
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The symbols part of this looks reasonable to me, but I'm not sure what you've done with the Section part and there's a more obvious place for that change (see my inline comments).
Some other notes:
- Make sure to write a test.
- Don't use
dbgs()
to print things: usellvm::outs()
(orllvm::errs()
, depending on whether GNU strip prints to stdout or stderr), guarded by anif (Verbose)
style check. - Don't forget to update the documentation.
I have made the changes (except for adding tests and updating the documentation, I will do them as soon as possible) |
@jh7370 I am not sure how to generate the tests for this. Could you guide me? I am a first-time contributor. |
I'll take a look at this PR next week - I have other priorities for this week, I'm afraid.
The best bet is to take a look at how other tests check things. Tests for llvm-objcopy and the corresponding library are primarily contained in llvm/test/tools/llvm-objcopy. You'll want an input that contains sections/symbols you want output for. This input will be in the form of a YAML document that you use yaml2obj to turn into an object file that you then feed into llvm-objcopy with the appropriate options. You'll feed the stdout/stderr (I haven't looked at which you've used) to FileCheck, which is a pattern matcher that matches against CHECK directives contained in the file. These CHECK directives should be used to match against the new messages. There aren't many llvm-objcopy tests that check messages, so you might want to look at e.g. llvm-readobj tests that do similar things, for more examples. |
On this note, please could nobody merge this without my review. |
Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay in looking at this. I've taken a look now.
Please update the PR title and description to remove references to dbgs()
, since that isn't what this change is doing.
Also, please add the new option to the documentation at llvm/docs/CommandGuide/llvm-objcopy.rst
and the llvm-strip
equivalent.
If you haven't already, please review the LLVM coding standards, as there are a few places (mentioned inline) where you've not stuck to them.
Please also take a look at how llvm-objcopy and llvm-strip are tested in llvm/test/tools/llvm-objcopy. You should extend the existing tests for removing sections and symbols to include a case where --verbose
is specified, to show that the expected output is printed (and no unexpected output is printed). I can assist with some guidance, if you can't figure it out yourself from existing behaviours. You may also want to look at tests of llvm-readobj etc that print things, to see how to check output in tests.
As @MaskRay has mentioned, the GNU objcopy/strip verbose option prints which files have been modified/copied. Please take a look at the existing GNU behaviour and add this functionality. If you would prefer to do that in a separate, subsequent PR, that's okay too. However, it is important, because without printing which file we're working on, when multiple files are being operated on (e.g. in an archive file, or multiple command-line inputs to llvm-strip), it's impossible to determine which file is referenced by a specific verbose output line (e.g. if multiple of the same named symbol exist in different objects, we need to know which ones the symbol was removed from).
llvm/lib/ObjCopy/ELF/ELFObject.h
Outdated
@@ -825,6 +825,7 @@ class SymbolTableSection : public SectionBase { | ||
public: | ||
SymbolTableSection() { Type = OriginalType = ELF::SHT_SYMTAB; } | ||
|
||
bool isVerboseFlag = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM coding standards say to use UpperCamelCase
for variables.
Also, the normal style is to put member variables after the methods of a class.
Also, can this be private
?
Finally, I'd rename this to VerboseOutput
or similar. The fact that a flag is used to enable it is an implementation detail unrelated to this class itself.
llvm/lib/ObjCopy/ELF/ELFObject.cpp
Outdated
if (isVerboseEnabled) { | ||
llvm::outs() << "Removed Symbols:" << Sec->Name; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is incorrect here. Error
is treated as true
if there was an error, so all this is doing is "if there was an error when trying to run removeSymbols
on a section, print "Removed Symbols: <section name>" which is clearly not correct.
What are you trying to achieve with this part of the change?
llvm/lib/ObjCopy/ELF/ELFObject.h
Outdated
@@ -1195,6 +1196,7 @@ class Object { | ||
uint32_t Flags; | ||
|
||
bool HadShdrs = true; | ||
bool isVerboseEnabled = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than using state for this, I'd prefer it if the removeSection
signature be expanded to take an VerboseOutput
parameter.
llvm/lib/ObjCopy/ELF/ELFObject.cpp
Outdated
if (isVerboseEnabled) { | ||
llvm::outs() << "Removed Section: " << (RemoveSec.get()->Name); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM style doesn't use braces around single-line ifs, typically.
if (isVerboseEnabled) { | |
llvm::outs() << "Removed Section: " << (RemoveSec.get()->Name); | |
} | |
if (isVerboseEnabled) | |
llvm::outs() << "Removed Section: " << (RemoveSec.get()->Name); |
Also, the GNU output uses lower case for its verbose output when printing which files are copied, so it makes sense to print it more like "removed section".
[ToRemove](const SymPtr &Sym) { return ToRemove(*Sym); }), | ||
std::end(Symbols)); | ||
[&](const SymPtr &Sym) { | ||
if (ToRemove(*Sym)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need for the braces here (see other comment about coding standards).
llvm/lib/ObjCopy/ELF/ELFObject.cpp
Outdated
[&](const SymPtr &Sym) { | ||
if (ToRemove(*Sym)) { | ||
if(isVerboseFlag) | ||
llvm::outs() << "Symbols Removed:" << Sym->Name<< "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need for the llvm::
qualifier before outs()
, so just drop it. Same goes throughout this patch.
Frankly I'm not sure how useful this option would be. GNU objcopy --verbose doesn't print the information. There might be some value when using --strip-* with a pattern (either --regex or --wildcard) but this is likely not useful when stripping exact pattern sections/symbols. As a user, I am used to performing a binary diff (readelf) after a strip/objcopy operation. |
I'm sorry for the delay in responding to you. I couldn’t make the changes earlier because I was tied up with my college exams. I’ll add the remaining changes soon. |
Hi @kshitijvp, please avoid force pushing to active PRs. It's generally not needed - use a merge commit from I've made some comments on the original issue, having given the topic some more though. You might want to pause further work on this PR until we've come to a conclusion there. You could implement the |
I sincerely apologize for that. I will ensure it doesn’t happen again moving forward. |
As discussed on #123041 , I don't think printing removed symbols/sections belongs to the proposed --verbose. |
This patch adds conditions to print the removed sections and symbols when verbose output is given on the terminal.
Fixes: #123041