Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[ELFObject] Added conditions to print removed symbols and removed sections #124692

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
Loading
from

Conversation

kshitijvp
Copy link
Contributor

@kshitijvp kshitijvp commented Jan 28, 2025

This patch adds conditions to print the removed sections and symbols when verbose output is given on the terminal.
Fixes: #123041

This patch adds dbgs() statement inside removeSections()
and removeSymbol() function to print the removed sections
and symbols.

Fixes: llvm#123041
@llvmbot
Copy link
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-binary-utilities

Author: Kshitij Paranjape (kshitijvp)

Changes

This patch adds dbgs() statement inside removeSections() and removeSymbol() function to print the removed sections and symbols.

Fixes: #123041


Full diff: https://github.com/llvm/llvm-project/pull/124692.diff

2 Files Affected:

  • (modified) llvm/lib/ObjCopy/ELF/ELFObject.cpp (+20-6)
  • (modified) llvm/tools/llvm-objcopy/CommonOpts.td (+2)
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.cpp b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
index 45c7ea49b5d938..94c73b62de12e1 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
@@ -766,8 +766,14 @@ Error SymbolTableSection::removeSymbols(
     function_ref<bool(const Symbol &)> ToRemove) {
   Symbols.erase(
       std::remove_if(std::begin(Symbols) + 1, std::end(Symbols),
-                     [ToRemove](const SymPtr &Sym) { return ToRemove(*Sym); }),
-      std::end(Symbols));
+               [ToRemove](const SymPtr &Sym) {
+                   if (ToRemove(*Sym)) {
+                       dbgs()<<"Symbols Removed:"<<Sym->Name<< "\n";
+                       return true;
+                   }
+                   return false;
+               }));
+
   auto PrevSize = Size;
   Size = Symbols.size() * EntrySize;
   if (Size < PrevSize)
@@ -2249,10 +2255,18 @@ Error Object::removeSections(
 
   // Transfer removed sections into the Object RemovedSections container for use
   // later.
-  std::move(Iter, Sections.end(), std::back_inserter(RemovedSections));
-  // Now finally get rid of them all together.
-  Sections.erase(Iter, std::end(Sections));
-  return Error::success();
+  for(auto &KeepSec : make_range(std::begin(Sections) , Iter))
+  {
+    
+    if (Error E = KeepSec->removeSectionReferences(
+            AllowBrokenLinks, [&RemoveSections](const SectionBase *Sec) {
+              return RemoveSections.find(Sec) != RemoveSections.end();
+            }))
+      std::move(Iter, Sections.end(), std::back_inserter(RemovedSections));
+      dbgs()<<"Sections Removed:"<<KeepSec->Name<<'\n';
+      Sections.erase(Iter, std::end(Sections));
+      return Error::success();
+  }
 }
 
 Error Object::replaceSections(
diff --git a/llvm/tools/llvm-objcopy/CommonOpts.td b/llvm/tools/llvm-objcopy/CommonOpts.td
index c247c93f6e0f27..5b15191f546051 100644
--- a/llvm/tools/llvm-objcopy/CommonOpts.td
+++ b/llvm/tools/llvm-objcopy/CommonOpts.td
@@ -117,6 +117,8 @@ def regex
 
 def version : Flag<["--"], "version">,
               HelpText<"Print the version and exit.">;
+def verbose : Flag<["--"], "verbose">,
+              HelpText<"Prints the removed symbols and sections">;
 def V : Flag<["-"], "V">,
         Alias<version>,
         HelpText<"Alias for --version">;

Copy link

github-actions bot commented Jan 28, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- llvm/include/llvm/ObjCopy/CommonConfig.h llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp llvm/lib/ObjCopy/ELF/ELFObject.cpp llvm/lib/ObjCopy/ELF/ELFObject.h llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp llvm/tools/llvm-objcopy/ObjcopyOptions.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
index 6eec21216..ad31f7ed7 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
@@ -547,7 +547,8 @@ static Error replaceAndRemoveSections(const CommonConfig &Config,
     };
   }
 
-  if (Error E = Obj.removeSections(ELFConfig.AllowBrokenLinks, RemovePred, Config.Verbose))
+  if (Error E = Obj.removeSections(ELFConfig.AllowBrokenLinks, RemovePred,
+                                   Config.Verbose))
     return E;
 
   if (Error E = Obj.compressOrDecompressSections(Config))
@@ -792,7 +793,8 @@ static Error handleArgs(const CommonConfig &Config, const ELFConfig &ELFConfig,
   if (!Config.SplitDWO.empty() && Config.ExtractDWO) {
     return Obj.removeSections(
         ELFConfig.AllowBrokenLinks,
-        [&Obj](const SectionBase &Sec) { return onlyKeepDWOPred(Obj, Sec); }, Config.Verbose);
+        [&Obj](const SectionBase &Sec) { return onlyKeepDWOPred(Obj, Sec); },
+        Config.Verbose);
   }
 
   // Dump sections before add/remove for compatibility with GNU objcopy.
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.cpp b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
index fece7a88b..cc80dc365 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.cpp
@@ -764,16 +764,15 @@ void SymbolTableSection::updateSymbols(function_ref<void(Symbol &)> Callable) {
 
 Error SymbolTableSection::removeSymbols(
     function_ref<bool(const Symbol &)> ToRemove) {
-  Symbols.erase(
-      std::remove_if(std::begin(Symbols) + 1, std::end(Symbols),
-               [&](const SymPtr &Sym) {
-                   if (ToRemove(*Sym)) {
-                    if(VerboseOutput)
-                      outs() << "Symbols Removed:" << Sym->Name<< "\n";
-                    return true;
-                   }
-                   return false;
-               }));
+  Symbols.erase(std::remove_if(
+      std::begin(Symbols) + 1, std::end(Symbols), [&](const SymPtr &Sym) {
+        if (ToRemove(*Sym)) {
+          if (VerboseOutput)
+            outs() << "Symbols Removed:" << Sym->Name << "\n";
+          return true;
+        }
+        return false;
+      }));
 
   auto PrevSize = Size;
   Size = Symbols.size() * EntrySize;
@@ -2200,8 +2199,9 @@ Error Object::updateSectionData(SectionBase &S, ArrayRef<uint8_t> Data) {
   return updateSectionData(*It, Data);
 }
 
-Error Object::removeSections(
-    bool AllowBrokenLinks, std::function<bool(const SectionBase &)> ToRemove, bool VerboseOutput) {
+Error Object::removeSections(bool AllowBrokenLinks,
+                             std::function<bool(const SectionBase &)> ToRemove,
+                             bool VerboseOutput) {
 
   auto Iter = std::stable_partition(
       std::begin(Sections), std::end(Sections), [=](const SecPtr &Sec) {
@@ -2282,15 +2282,16 @@ Error Object::replaceSections(
 
   if (Error E = removeSections(
           /*AllowBrokenLinks=*/false,
-          [=](const SectionBase &Sec) { return FromTo.count(&Sec) > 0; }, false))
+          [=](const SectionBase &Sec) { return FromTo.count(&Sec) > 0; },
+          false))
     return E;
   llvm::sort(Sections, SectionIndexLess);
   return Error::success();
 }
 
 Error Object::removeSymbols(function_ref<bool(const Symbol &)> ToRemove) {
-  if (SymbolTable){
-    for (const SecPtr &Sec : Sections){
+  if (SymbolTable) {
+    for (const SecPtr &Sec : Sections) {
       if (Error E = Sec->removeSymbols(ToRemove))
         return E;
       outs() << "removed symbols:" << Sec->Name;
@@ -2580,9 +2581,12 @@ static Error removeUnneededSections(Object &Obj) {
   auto *StrTab = Obj.SymbolTable->getStrTab() == Obj.SectionNames
                      ? nullptr
                      : Obj.SymbolTable->getStrTab();
-  return Obj.removeSections(false, [&](const SectionBase &Sec) {
-    return &Sec == Obj.SymbolTable || &Sec == StrTab;
-  }, false);
+  return Obj.removeSections(
+      false,
+      [&](const SectionBase &Sec) {
+        return &Sec == Obj.SymbolTable || &Sec == StrTab;
+      },
+      false);
 }
 
 template <class ELFT> Error ELFWriter<ELFT>::finalize() {
@@ -2633,10 +2637,12 @@ template <class ELFT> Error ELFWriter<ELFT>::finalize() {
     // references to it.
     if (Obj.SectionIndexTable != nullptr) {
       // We do not support sections referring to the section index table.
-      if (Error E = Obj.removeSections(false /*AllowBrokenLinks*/,
-                                       [this](const SectionBase &Sec) {
-                                         return &Sec == Obj.SectionIndexTable;
-                                       }, false))
+      if (Error E = Obj.removeSections(
+              false /*AllowBrokenLinks*/,
+              [this](const SectionBase &Sec) {
+                return &Sec == Obj.SectionIndexTable;
+              },
+              false))
         return E;
     }
   }
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.h b/llvm/lib/ObjCopy/ELF/ELFObject.h
index bcda10034..42a6cab11 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.h
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.h
@@ -816,6 +816,7 @@ class SymbolTableSection : public SectionBase {
 
 private:
   bool VerboseOutput;
+
 protected:
   std::vector<std::unique_ptr<Symbol>> Symbols;
   StringTableSection *SymbolNames = nullptr;
@@ -1228,7 +1229,8 @@ public:
   ConstRange<Segment> segments() const { return make_pointee_range(Segments); }
 
   Error removeSections(bool AllowBrokenLinks,
-                       std::function<bool(const SectionBase &)> ToRemove, bool VerboseOutput);
+                       std::function<bool(const SectionBase &)> ToRemove,
+                       bool VerboseOutput);
   Error compressOrDecompressSections(const CommonConfig &Config);
   Error replaceSections(const DenseMap<SectionBase *, SectionBase *> &FromTo);
   Error removeSymbols(function_ref<bool(const Symbol &)> ToRemove);
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index 3e245532d..b5a3482c4 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -2872,32 +2872,33 @@ Instruction *InstCombinerImpl::visitAnd(BinaryOperator &I) {
                                       /*SimplifyOnly*/ false, *this))
     return BinaryOperator::CreateAnd(Op0, V);
   // I is the 'and' instruction
-if (auto *Add = dyn_cast<BinaryOperator>(I.getOperand(0))) {
-  if (Add->getOpcode() == Instruction::Add) {
+  if (auto *Add = dyn_cast<BinaryOperator>(I.getOperand(0))) {
+    if (Add->getOpcode() == Instruction::Add) {
       Value *LHS = Add->getOperand(0); // should be shl
       ConstantInt *AddConst = dyn_cast<ConstantInt>(Add->getOperand(1));
       ConstantInt *AndConst = dyn_cast<ConstantInt>(I.getOperand(1));
       if (AddConst && AndConst) {
-          // check if AddConst is 47 and AndConst is -32
-          if (AddConst->equalsInt(47) && AndConst->getSExtValue() == -32) {
-              // Check if LHS is shl i64 %a, 5
-              if (auto *Shl = dyn_cast<BinaryOperator>(LHS)) {
-                  if (Shl->getOpcode() == Instruction::Shl) {
-                      ConstantInt *ShlConst = dyn_cast<ConstantInt>(Shl->getOperand(1));
-                      if (ShlConst && ShlConst->equalsInt(5)) {
-                          // You've matched the pattern!
-                          // Replace with: shl i64 (add i64 %a, 1), 5
-                          IRBuilder<> Builder(&I);
-                          Value *NewAdd = Builder.CreateAdd(Shl->getOperand(0), ConstantInt::get(I.getType(), 1));
-                          Value *NewShl = Builder.CreateShl(NewAdd, ShlConst);
-                          I.replaceAllUsesWith(NewShl);
-                      }
-                  }
+        // check if AddConst is 47 and AndConst is -32
+        if (AddConst->equalsInt(47) && AndConst->getSExtValue() == -32) {
+          // Check if LHS is shl i64 %a, 5
+          if (auto *Shl = dyn_cast<BinaryOperator>(LHS)) {
+            if (Shl->getOpcode() == Instruction::Shl) {
+              ConstantInt *ShlConst = dyn_cast<ConstantInt>(Shl->getOperand(1));
+              if (ShlConst && ShlConst->equalsInt(5)) {
+                // You've matched the pattern!
+                // Replace with: shl i64 (add i64 %a, 1), 5
+                IRBuilder<> Builder(&I);
+                Value *NewAdd = Builder.CreateAdd(
+                    Shl->getOperand(0), ConstantInt::get(I.getType(), 1));
+                Value *NewShl = Builder.CreateShl(NewAdd, ShlConst);
+                I.replaceAllUsesWith(NewShl);
               }
+            }
           }
+        }
       }
+    }
   }
-}
   return nullptr;
 }
 

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The symbols part of this looks reasonable to me, but I'm not sure what you've done with the Section part and there's a more obvious place for that change (see my inline comments).

Some other notes:

  1. Make sure to write a test.
  2. Don't use dbgs() to print things: use llvm::outs() (or llvm::errs(), depending on whether GNU strip prints to stdout or stderr), guarded by an if (Verbose) style check.
  3. Don't forget to update the documentation.

llvm/lib/ObjCopy/ELF/ELFObject.cpp Outdated Show resolved Hide resolved
llvm/lib/ObjCopy/ELF/ELFObject.cpp Outdated Show resolved Hide resolved
@kshitijvp
Copy link
Contributor Author

I have made the changes (except for adding tests and updating the documentation, I will do them as soon as possible)

llvm/include/llvm/ObjCopy/CommonConfig.h Outdated Show resolved Hide resolved
llvm/lib/ObjCopy/ELF/ELFObject.cpp Outdated Show resolved Hide resolved
llvm/lib/ObjCopy/ELF/ELFObject.cpp Outdated Show resolved Hide resolved
llvm/lib/ObjCopy/ELF/ELFObject.h Outdated Show resolved Hide resolved
@kshitijvp
Copy link
Contributor Author

@jh7370 I am not sure how to generate the tests for this. Could you guide me? I am a first-time contributor.

@jh7370
Copy link
Collaborator

jh7370 commented Feb 10, 2025

I'll take a look at this PR next week - I have other priorities for this week, I'm afraid.

@jh7370 I am not sure how to generate the tests for this. Could you guide me? I am a first-time contributor.

The best bet is to take a look at how other tests check things. Tests for llvm-objcopy and the corresponding library are primarily contained in llvm/test/tools/llvm-objcopy. You'll want an input that contains sections/symbols you want output for. This input will be in the form of a YAML document that you use yaml2obj to turn into an object file that you then feed into llvm-objcopy with the appropriate options. You'll feed the stdout/stderr (I haven't looked at which you've used) to FileCheck, which is a pattern matcher that matches against CHECK directives contained in the file. These CHECK directives should be used to match against the new messages. There aren't many llvm-objcopy tests that check messages, so you might want to look at e.g. llvm-readobj tests that do similar things, for more examples.

@jh7370
Copy link
Collaborator

jh7370 commented Feb 10, 2025

I'll take a look at this PR next week - I have other priorities for this week, I'm afraid.

On this note, please could nobody merge this without my review.

@kshitijvp
Copy link
Contributor Author

Thanks!

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in looking at this. I've taken a look now.

Please update the PR title and description to remove references to dbgs(), since that isn't what this change is doing.

Also, please add the new option to the documentation at llvm/docs/CommandGuide/llvm-objcopy.rst and the llvm-strip equivalent.

If you haven't already, please review the LLVM coding standards, as there are a few places (mentioned inline) where you've not stuck to them.

Please also take a look at how llvm-objcopy and llvm-strip are tested in llvm/test/tools/llvm-objcopy. You should extend the existing tests for removing sections and symbols to include a case where --verbose is specified, to show that the expected output is printed (and no unexpected output is printed). I can assist with some guidance, if you can't figure it out yourself from existing behaviours. You may also want to look at tests of llvm-readobj etc that print things, to see how to check output in tests.

As @MaskRay has mentioned, the GNU objcopy/strip verbose option prints which files have been modified/copied. Please take a look at the existing GNU behaviour and add this functionality. If you would prefer to do that in a separate, subsequent PR, that's okay too. However, it is important, because without printing which file we're working on, when multiple files are being operated on (e.g. in an archive file, or multiple command-line inputs to llvm-strip), it's impossible to determine which file is referenced by a specific verbose output line (e.g. if multiple of the same named symbol exist in different objects, we need to know which ones the symbol was removed from).

@@ -825,6 +825,7 @@ class SymbolTableSection : public SectionBase {
public:
SymbolTableSection() { Type = OriginalType = ELF::SHT_SYMTAB; }

bool isVerboseFlag = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM coding standards say to use UpperCamelCase for variables.

Also, the normal style is to put member variables after the methods of a class.

Also, can this be private?

Finally, I'd rename this to VerboseOutput or similar. The fact that a flag is used to enable it is an implementation detail unrelated to this class itself.

Comment on lines 2296 to 2298
if (isVerboseEnabled) {
llvm::outs() << "Removed Symbols:" << Sec->Name;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect here. Error is treated as true if there was an error, so all this is doing is "if there was an error when trying to run removeSymbols on a section, print "Removed Symbols: <section name>" which is clearly not correct.

What are you trying to achieve with this part of the change?

@@ -1195,6 +1196,7 @@ class Object {
uint32_t Flags;

bool HadShdrs = true;
bool isVerboseEnabled = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than using state for this, I'd prefer it if the removeSection signature be expanded to take an VerboseOutput parameter.

Comment on lines 2240 to 2241
if (isVerboseEnabled) {
llvm::outs() << "Removed Section: " << (RemoveSec.get()->Name);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style doesn't use braces around single-line ifs, typically.

Suggested change
if (isVerboseEnabled) {
llvm::outs() << "Removed Section: " << (RemoveSec.get()->Name);
}
if (isVerboseEnabled)
llvm::outs() << "Removed Section: " << (RemoveSec.get()->Name);

Also, the GNU output uses lower case for its verbose output when printing which files are copied, so it makes sense to print it more like "removed section".

[ToRemove](const SymPtr &Sym) { return ToRemove(*Sym); }),
std::end(Symbols));
[&](const SymPtr &Sym) {
if (ToRemove(*Sym)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need for the braces here (see other comment about coding standards).

[&](const SymPtr &Sym) {
if (ToRemove(*Sym)) {
if(isVerboseFlag)
llvm::outs() << "Symbols Removed:" << Sym->Name<< "\n";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need for the llvm:: qualifier before outs(), so just drop it. Same goes throughout this patch.

@MaskRay
Copy link
Member

MaskRay commented Feb 21, 2025

Frankly I'm not sure how useful this option would be. GNU objcopy --verbose doesn't print the information.
llvm-strip's default behavior with --strip-all removes all symbols and sections, which can result in a very large output file for anything beyond the simplest object.

There might be some value when using --strip-* with a pattern (either --regex or --wildcard) but this is likely not useful when stripping exact pattern sections/symbols.
We also do not mention update modifications llvm-objcopy might perform, which could lead to potential confusion.

As a user, I am used to performing a binary diff (readelf) after a strip/objcopy operation.

@kshitijvp kshitijvp changed the title [ELFObject] Added dbgs() statement in removeSections() [ELFObject] Added conditions to print removed symbols and removed sections Mar 11, 2025
@kshitijvp
Copy link
Contributor Author

I'm sorry for the delay in responding to you. I couldn’t make the changes earlier because I was tied up with my college exams. I’ll add the remaining changes soon.
Thank you!

@jh7370
Copy link
Collaborator

jh7370 commented Mar 11, 2025

Hi @kshitijvp, please avoid force pushing to active PRs. It's generally not needed - use a merge commit from main if you need to rebase, or do fixup commits (these will disappear when the PR is squashed and merged) and makes reviewing harder, because it trashes the "view changes since last review" option. The LLVM guidelines advise against it too. Also, please don't resolve conversations I've initiated. I use these to check through changes I've requested to ensure you've made the change I asked you to make and marking them as resolved makes it hard to see which ones I've confirmed and which ones I haven't yet (there's a whole discussion on the LLVM forums about the topic).

I've made some comments on the original issue, having given the topic some more though. You might want to pause further work on this PR until we've come to a conclusion there. You could implement the --verbose option first as just printing the input files, per the current GNU behaviour. This is definitely going to be wanted and is uncontroversial.

@kshitijvp
Copy link
Contributor Author

I sincerely apologize for that. I will ensure it doesn’t happen again moving forward.

@MaskRay
Copy link
Member

MaskRay commented May 16, 2025

As discussed on #123041 , I don't think printing removed symbols/sections belongs to the proposed --verbose.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a --verbose option for llvm-strip
5 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.