LLVM symbolizer gsym support - attempt 2 #139686

Merged (2 commits) on May 13, 2025

sfc-gh-mkwiczala (Contributor) commented May 13, 2025

Add support for gsym files to llvm-symbolizer.

co-author @sfc-gh-sgiesecke

Notes:
An earlier PR with this change was approved and merged (#134847), then reverted (#139660) because of buildbot failures:
https://lab.llvm.org/buildbot/#/builders/66/builds/13851 - looks related
https://lab.llvm.org/buildbot/#/builders/51/builds/16018 - looks related
https://lab.llvm.org/buildbot/#/builders/146/builds/2905 - does not look related to these changes

Fix:
To fix the missing GSYM symbols:

+ diff -u expected.new undefined.new
+_ZN4llvm4gsym10GsymReader8openFileENS_9StringRefE U
+_ZN4llvm4gsym10GsymReaderC1EOS1_ U
+_ZN4llvm4gsym10GsymReaderD1Ev U
+_ZN4llvm4gsym13GsymDIContextC1ENSt20__InternalSymbolizer10unique_ptrINS0_10GsymReaderENS2_14default_deleteIS4_EEEE U
+ echo 'Failed: unexpected symbols'

LLVMDebugInfoGSYM was added to the script compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh.
Please check commit ba55425.
That is the only change compared to #134847.

llvmbot (Member) commented May 13, 2025

@llvm/pr-subscribers-llvm-binary-utilities

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Mariusz Kwiczala (sfc-gh-mkwiczala)

Patch is 25.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139686.diff

14 Files Affected:

  • (modified) compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh (+2-1)
  • (modified) llvm/include/llvm/DebugInfo/DIContext.h (+1-1)
  • (added) llvm/include/llvm/DebugInfo/GSYM/GsymDIContext.h (+66)
  • (modified) llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h (+3)
  • (modified) llvm/lib/DebugInfo/GSYM/CMakeLists.txt (+1)
  • (added) llvm/lib/DebugInfo/GSYM/GsymDIContext.cpp (+166)
  • (modified) llvm/lib/DebugInfo/Symbolize/CMakeLists.txt (+1)
  • (modified) llvm/lib/DebugInfo/Symbolize/Symbolize.cpp (+71-23)
  • (added) llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe ()
  • (added) llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe.gsym ()
  • (added) llvm/test/tools/llvm-symbolizer/sym-gsymonly.test (+93)
  • (modified) llvm/tools/llvm-symbolizer/Opts.td (+5)
  • (modified) llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp (+2)
  • (modified) llvm/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn (+1)
diff --git a/compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh b/compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh
index a7b78f885eea4..fe49d944d3a2f 100755
--- a/compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh
+++ b/compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh
@@ -145,7 +145,7 @@ if [[ ! -f ${LLVM_BUILD}/build.ninja ]]; then
   $LLVM_SRC
 fi
 cd ${LLVM_BUILD}
-ninja LLVMSymbolize LLVMObject LLVMBinaryFormat LLVMDebugInfoDWARF LLVMSupport LLVMDebugInfoPDB LLVMDebuginfod LLVMMC LLVMDemangle LLVMTextAPI LLVMTargetParser LLVMCore
+ninja LLVMSymbolize LLVMObject LLVMBinaryFormat LLVMDebugInfoDWARF LLVMDebugInfoGSYM LLVMSupport LLVMDebugInfoPDB LLVMDebuginfod LLVMMC LLVMDemangle LLVMTextAPI LLVMTargetParser LLVMCore
 
 cd ${BUILD_DIR}
 rm -rf ${SYMBOLIZER_BUILD}
@@ -174,6 +174,7 @@ $LINK $LIBCXX_ARCHIVE_DIR/libc++.a \
       $LLVM_BUILD/lib/libLLVMObject.a \
       $LLVM_BUILD/lib/libLLVMBinaryFormat.a \
       $LLVM_BUILD/lib/libLLVMDebugInfoDWARF.a \
+      $LLVM_BUILD/lib/libLLVMDebugInfoGSYM.a \
       $LLVM_BUILD/lib/libLLVMSupport.a \
       $LLVM_BUILD/lib/libLLVMDebugInfoPDB.a \
       $LLVM_BUILD/lib/libLLVMDebugInfoMSF.a \
diff --git a/llvm/include/llvm/DebugInfo/DIContext.h b/llvm/include/llvm/DebugInfo/DIContext.h
index c90b99987f1db..0347f90c236d1 100644
--- a/llvm/include/llvm/DebugInfo/DIContext.h
+++ b/llvm/include/llvm/DebugInfo/DIContext.h
@@ -238,7 +238,7 @@ struct DIDumpOptions {
 
 class DIContext {
 public:
-  enum DIContextKind { CK_DWARF, CK_PDB, CK_BTF };
+  enum DIContextKind { CK_DWARF, CK_PDB, CK_BTF, CK_GSYM };
 
   DIContext(DIContextKind K) : Kind(K) {}
   virtual ~DIContext() = default;
diff --git a/llvm/include/llvm/DebugInfo/GSYM/GsymDIContext.h b/llvm/include/llvm/DebugInfo/GSYM/GsymDIContext.h
new file mode 100644
index 0000000000000..396c08c608d25
--- /dev/null
+++ b/llvm/include/llvm/DebugInfo/GSYM/GsymDIContext.h
@@ -0,0 +1,66 @@
+//===-- GsymDIContext.h --------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===/
+
+#ifndef LLVM_DEBUGINFO_GSYM_GSYMDICONTEXT_H
+#define LLVM_DEBUGINFO_GSYM_GSYMDICONTEXT_H
+
+#include "llvm/DebugInfo/DIContext.h"
+#include <cstdint>
+#include <memory>
+#include <string>
+
+namespace llvm {
+
+namespace gsym {
+
+class GsymReader;
+
+/// GSYM DI Context
+/// This data structure is the top level entity that deals with GSYM
+/// symbolication.
+/// This data structure exists only when there is a need for a transparent
+/// interface to different symbolication formats (e.g. GSYM, PDB and DWARF).
+/// More control and power over the debug information access can be had by using
+/// the GSYM interfaces directly.
+class GsymDIContext : public DIContext {
+public:
+  GsymDIContext(std::unique_ptr<GsymReader> Reader);
+
+  GsymDIContext(GsymDIContext &) = delete;
+  GsymDIContext &operator=(GsymDIContext &) = delete;
+
+  static bool classof(const DIContext *DICtx) {
+    return DICtx->getKind() == CK_GSYM;
+  }
+
+  void dump(raw_ostream &OS, DIDumpOptions DIDumpOpts) override;
+
+  std::optional<DILineInfo> getLineInfoForAddress(
+      object::SectionedAddress Address,
+      DILineInfoSpecifier Specifier = DILineInfoSpecifier()) override;
+  std::optional<DILineInfo>
+  getLineInfoForDataAddress(object::SectionedAddress Address) override;
+  DILineInfoTable getLineInfoForAddressRange(
+      object::SectionedAddress Address, uint64_t Size,
+      DILineInfoSpecifier Specifier = DILineInfoSpecifier()) override;
+  DIInliningInfo getInliningInfoForAddress(
+      object::SectionedAddress Address,
+      DILineInfoSpecifier Specifier = DILineInfoSpecifier()) override;
+
+  std::vector<DILocal>
+  getLocalsForAddress(object::SectionedAddress Address) override;
+
+private:
+  const std::unique_ptr<GsymReader> Reader;
+};
+
+} // end namespace gsym
+
+} // end namespace llvm
+
+#endif // LLVM_DEBUGINFO_GSYM_GSYMDICONTEXT_H
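
The `classof` hook in the header above, together with the new `CK_GSYM` enumerator, is what lets kind-based casts distinguish a `GsymDIContext` from other `DIContext` subclasses. The following is a minimal, self-contained sketch of that pattern. `Context`, `DwarfContext`, `GsymContext`, `isaKind`, and `dynCastKind` are hypothetical stand-ins written for illustration; they are not the LLVM classes or the real `llvm::isa` / `llvm::dyn_cast` templates.

```cpp
// Hypothetical stand-in for DIContext: the base class records its own kind.
class Context {
public:
  enum Kind { CK_DWARF, CK_PDB, CK_BTF, CK_GSYM };
  explicit Context(Kind K) : TheKind(K) {}
  virtual ~Context() = default;
  Kind getKind() const { return TheKind; }

private:
  const Kind TheKind;
};

// Each subclass passes its kind to the base and exposes a static classof().
class DwarfContext : public Context {
public:
  DwarfContext() : Context(CK_DWARF) {}
  static bool classof(const Context *C) { return C->getKind() == CK_DWARF; }
};

class GsymContext : public Context {
public:
  GsymContext() : Context(CK_GSYM) {}
  static bool classof(const Context *C) { return C->getKind() == CK_GSYM; }
};

// Simplified, illustrative versions of kind-based type tests and casts.
template <typename To> bool isaKind(const Context *C) {
  return To::classof(C);
}

template <typename To> const To *dynCastKind(const Context *C) {
  return isaKind<To>(C) ? static_cast<const To *>(C) : nullptr;
}
```

With this shape, adding a new context type only requires a new enumerator and a matching `classof`, which is the same two-part change the patch makes in `DIContext.h` and `GsymDIContext.h`.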
diff --git a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
index 5747ad99d0f13..7c6beaa2189b7 100644
--- a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
+++ b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
@@ -58,11 +58,13 @@ class LLVMSymbolizer {
     bool RelativeAddresses = false;
     bool UntagAddresses = false;
     bool UseDIA = false;
+    bool DisableGsym = false;
     std::string DefaultArch;
     std::vector<std::string> DsymHints;
     std::string FallbackDebugPath;
     std::string DWPName;
     std::vector<std::string> DebugFileDirectory;
+    std::vector<std::string> GsymFileDirectory;
     size_t MaxCacheSize =
         sizeof(size_t) == 4
             ? 512 * 1024 * 1024 /* 512 MiB */
@@ -177,6 +179,7 @@ class LLVMSymbolizer {
   ObjectFile *lookUpBuildIDObject(const std::string &Path,
                                   const ELFObjectFileBase *Obj,
                                   const std::string &ArchName);
+  std::string lookUpGsymFile(const std::string &Path);
 
   bool findDebugBinary(const std::string &OrigPath,
                        const std::string &DebuglinkName, uint32_t CRCHash,
diff --git a/llvm/lib/DebugInfo/GSYM/CMakeLists.txt b/llvm/lib/DebugInfo/GSYM/CMakeLists.txt
index c27d648db62f6..724b5b213d643 100644
--- a/llvm/lib/DebugInfo/GSYM/CMakeLists.txt
+++ b/llvm/lib/DebugInfo/GSYM/CMakeLists.txt
@@ -4,6 +4,7 @@ add_llvm_component_library(LLVMDebugInfoGSYM
   FileWriter.cpp
   FunctionInfo.cpp
   GsymCreator.cpp
+  GsymDIContext.cpp
   GsymReader.cpp
   InlineInfo.cpp
   LineTable.cpp
diff --git a/llvm/lib/DebugInfo/GSYM/GsymDIContext.cpp b/llvm/lib/DebugInfo/GSYM/GsymDIContext.cpp
new file mode 100644
index 0000000000000..68024a9c9e782
--- /dev/null
+++ b/llvm/lib/DebugInfo/GSYM/GsymDIContext.cpp
@@ -0,0 +1,166 @@
+//===-- GsymDIContext.cpp ------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===/
+
+#include "llvm/DebugInfo/GSYM/GsymDIContext.h"
+
+#include "llvm/DebugInfo/GSYM/GsymReader.h"
+#include "llvm/Support/Path.h"
+
+using namespace llvm;
+using namespace llvm::gsym;
+
+GsymDIContext::GsymDIContext(std::unique_ptr<GsymReader> Reader)
+    : DIContext(CK_GSYM), Reader(std::move(Reader)) {}
+
+void GsymDIContext::dump(raw_ostream &OS, DIDumpOptions DumpOpts) {}
+
+static bool fillLineInfoFromLocation(const SourceLocation &Location,
+                                     DILineInfoSpecifier Specifier,
+                                     DILineInfo &LineInfo) {
+  // FIXME Demangle in case of DINameKind::ShortName
+  if (Specifier.FNKind != DINameKind::None) {
+    LineInfo.FunctionName = Location.Name.str();
+  }
+
+  switch (Specifier.FLIKind) {
+  case DILineInfoSpecifier::FileLineInfoKind::RelativeFilePath:
+    // We have no information to determine the relative path, so we fall back to
+    // returning the absolute path.
+  case DILineInfoSpecifier::FileLineInfoKind::RawValue:
+  case DILineInfoSpecifier::FileLineInfoKind::AbsoluteFilePath:
+    if (Location.Dir.empty()) {
+      if (Location.Base.empty())
+        LineInfo.FileName = DILineInfo::BadString;
+      else
+        LineInfo.FileName = Location.Base.str();
+    } else {
+      SmallString<128> Path(Location.Dir);
+      sys::path::append(Path, Location.Base);
+      LineInfo.FileName = static_cast<std::string>(Path);
+    }
+    break;
+
+  case DILineInfoSpecifier::FileLineInfoKind::BaseNameOnly:
+    LineInfo.FileName = Location.Base.str();
+    break;
+
+  default:
+    return false;
+  }
+  LineInfo.Line = Location.Line;
+
+  // We don't have information in GSYM to fill any of the Source, Column,
+  // StartFileName or StartLine attributes.
+
+  return true;
+}
+
+std::optional<DILineInfo>
+GsymDIContext::getLineInfoForAddress(object::SectionedAddress Address,
+                                     DILineInfoSpecifier Specifier) {
+  if (Address.SectionIndex != object::SectionedAddress::UndefSection)
+    return {};
+
+  auto ResultOrErr = Reader->lookup(Address.Address);
+
+  if (!ResultOrErr) {
+    consumeError(ResultOrErr.takeError());
+    return {};
+  }
+
+  const auto &Result = *ResultOrErr;
+
+  DILineInfo LineInfo;
+
+  if (Result.Locations.empty()) {
+    // No debug info for this, we just had a symbol from the symbol table.
+
+    // FIXME Demangle in case of DINameKind::ShortName
+    if (Specifier.FNKind != DINameKind::None)
+      LineInfo.FunctionName = Result.FuncName.str();
+  } else if (!fillLineInfoFromLocation(Result.Locations.front(), Specifier,
+                                       LineInfo))
+    return {};
+
+  LineInfo.StartAddress = Result.FuncRange.start();
+
+  return LineInfo;
+}
+
+std::optional<DILineInfo>
+GsymDIContext::getLineInfoForDataAddress(object::SectionedAddress Address) {
+  // We can't implement this, there's no such information in the GSYM file.
+
+  return {};
+}
+
+DILineInfoTable
+GsymDIContext::getLineInfoForAddressRange(object::SectionedAddress Address,
+                                          uint64_t Size,
+                                          DILineInfoSpecifier Specifier) {
+  if (Size == 0)
+    return DILineInfoTable();
+
+  if (Address.SectionIndex != llvm::object::SectionedAddress::UndefSection)
+    return DILineInfoTable();
+
+  if (auto FuncInfoOrErr = Reader->getFunctionInfo(Address.Address)) {
+    DILineInfoTable Table;
+    if (FuncInfoOrErr->OptLineTable) {
+      const gsym::LineTable &LT = *FuncInfoOrErr->OptLineTable;
+      const uint64_t StartAddr = Address.Address;
+      const uint64_t EndAddr = Address.Address + Size;
+      for (const auto &LineEntry : LT) {
+        if (StartAddr <= LineEntry.Addr && LineEntry.Addr < EndAddr) {
+          // Use LineEntry.Addr, LineEntry.File (which is a file index into the
+          // files tables from the GsymReader), and LineEntry.Line (source line
+          // number) to add stuff to the DILineInfoTable
+        }
+      }
+    }
+    return Table;
+  } else {
+    consumeError(FuncInfoOrErr.takeError());
+    return DILineInfoTable();
+  }
+}
+
+DIInliningInfo
+GsymDIContext::getInliningInfoForAddress(object::SectionedAddress Address,
+                                         DILineInfoSpecifier Specifier) {
+  auto ResultOrErr = Reader->lookup(Address.Address);
+
+  if (!ResultOrErr)
+    return {};
+
+  const auto &Result = *ResultOrErr;
+
+  DIInliningInfo InlineInfo;
+
+  for (const auto &Location : Result.Locations) {
+    DILineInfo LineInfo;
+
+    if (!fillLineInfoFromLocation(Location, Specifier, LineInfo))
+      return {};
+
+    // Hm, that's probably something that should only be filled in the first or
+    // last frame?
+    LineInfo.StartAddress = Result.FuncRange.start();
+
+    InlineInfo.addFrame(LineInfo);
+  }
+
+  return InlineInfo;
+}
+
+std::vector<DILocal>
+GsymDIContext::getLocalsForAddress(object::SectionedAddress Address) {
+  // We can't implement this, there's no such information in the GSYM file.
+
+  return {};
+}
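
The file-name selection in `fillLineInfoFromLocation` above can be sketched in isolation as follows. This is an illustrative approximation rather than the patch's code: `FileKind`, `kBadString`, and `pickFileName` are hypothetical names, the placeholder text is assumed to match `DILineInfo::BadString`, and a plain `/` join stands in for `sys::path::append`, which uses the host's path separator.

```cpp
#include <string>

// Hypothetical mirror of DILineInfoSpecifier::FileLineInfoKind.
enum class FileKind { RawValue, BaseNameOnly, AbsoluteFilePath, RelativeFilePath };

// Assumption: same placeholder text as DILineInfo::BadString.
static const char *const kBadString = "<invalid>";

// Choose the reported file name from a directory and a base name.
std::string pickFileName(FileKind Kind, const std::string &Dir,
                         const std::string &Base) {
  if (Kind == FileKind::BaseNameOnly)
    return Base;

  // RawValue, AbsoluteFilePath and RelativeFilePath share one code path:
  // there is no information to compute a relative path, so the absolute
  // path is returned for all three.
  if (Dir.empty())
    return Base.empty() ? std::string(kBadString) : Base;

  // Simplification: join with '/' instead of the native separator.
  std::string Path = Dir;
  if (Path.back() != '/')
    Path += '/';
  return Path + Base;
}
```

For example, `pickFileName(FileKind::RelativeFilePath, "/tmp", "x.c")` yields `/tmp/x.c`, which is consistent with the test expectations in this PR that print the full `tmp/x.c` path.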
diff --git a/llvm/lib/DebugInfo/Symbolize/CMakeLists.txt b/llvm/lib/DebugInfo/Symbolize/CMakeLists.txt
index 29f62bf6156fc..7aef3b0d79a3a 100644
--- a/llvm/lib/DebugInfo/Symbolize/CMakeLists.txt
+++ b/llvm/lib/DebugInfo/Symbolize/CMakeLists.txt
@@ -10,6 +10,7 @@ add_llvm_component_library(LLVMSymbolize
 
   LINK_COMPONENTS
   DebugInfoDWARF
+  DebugInfoGSYM
   DebugInfoPDB
   DebugInfoBTF
   Object
diff --git a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
index 1d8217ad587ec..78a1421005de2 100644
--- a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
@@ -15,6 +15,8 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/DebugInfo/BTF/BTFContext.h"
 #include "llvm/DebugInfo/DWARF/DWARFContext.h"
+#include "llvm/DebugInfo/GSYM/GsymDIContext.h"
+#include "llvm/DebugInfo/GSYM/GsymReader.h"
 #include "llvm/DebugInfo/PDB/PDB.h"
 #include "llvm/DebugInfo/PDB/PDBContext.h"
 #include "llvm/DebugInfo/Symbolize/SymbolizableObjectFile.h"
@@ -498,6 +500,34 @@ bool LLVMSymbolizer::getOrFindDebugBinary(const ArrayRef<uint8_t> BuildID,
   return false;
 }
 
+std::string LLVMSymbolizer::lookUpGsymFile(const std::string &Path) {
+  if (Opts.DisableGsym)
+    return {};
+
+  auto CheckGsymFile = [](const llvm::StringRef &GsymPath) {
+    sys::fs::file_status Status;
+    std::error_code EC = llvm::sys::fs::status(GsymPath, Status);
+    return !EC && !llvm::sys::fs::is_directory(Status);
+  };
+
+  // First, look beside the binary file
+  if (const auto GsymPath = Path + ".gsym"; CheckGsymFile(GsymPath))
+    return GsymPath;
+
+  // Then, look in the directories specified by GsymFileDirectory
+
+  for (const auto &Directory : Opts.GsymFileDirectory) {
+    SmallString<16> GsymPath = llvm::StringRef{Directory};
+    llvm::sys::path::append(GsymPath,
+                            llvm::sys::path::filename(Path) + ".gsym");
+
+    if (CheckGsymFile(GsymPath))
+      return static_cast<std::string>(GsymPath);
+  }
+
+  return {};
+}
+
 Expected<LLVMSymbolizer::ObjectPair>
 LLVMSymbolizer::getOrCreateObjectPair(const std::string &Path,
                                       const std::string &ArchName) {
@@ -634,30 +664,48 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
   std::unique_ptr<DIContext> Context;
   // If this is a COFF object containing PDB info and not containing DWARF
   // section, use a PDBContext to symbolize. Otherwise, use DWARF.
-  if (auto CoffObject = dyn_cast<COFFObjectFile>(Objects.first)) {
-    const codeview::DebugInfo *DebugInfo;
-    StringRef PDBFileName;
-    auto EC = CoffObject->getDebugPDBInfo(DebugInfo, PDBFileName);
-    // Use DWARF if there're DWARF sections.
-    bool HasDwarf =
-        llvm::any_of(Objects.first->sections(), [](SectionRef Section) -> bool {
-          if (Expected<StringRef> SectionName = Section.getName())
-            return SectionName.get() == ".debug_info";
-          return false;
-        });
-    if (!EC && !HasDwarf && DebugInfo != nullptr && !PDBFileName.empty()) {
-      using namespace pdb;
-      std::unique_ptr<IPDBSession> Session;
-
-      PDB_ReaderType ReaderType =
-          Opts.UseDIA ? PDB_ReaderType::DIA : PDB_ReaderType::Native;
-      if (auto Err = loadDataForEXE(ReaderType, Objects.first->getFileName(),
-                                    Session)) {
-        Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>());
-        // Return along the PDB filename to provide more context
-        return createFileError(PDBFileName, std::move(Err));
+  // Create a DIContext to symbolize as follows:
+  // - If there is a GSYM file, create a GsymDIContext.
+  // - Otherwise, if this is a COFF object containing PDB info, create a
+  // PDBContext.
+  // - Otherwise, create a DWARFContext.
+  const auto GsymFile = lookUpGsymFile(BinaryName.str());
+  if (!GsymFile.empty()) {
+    auto ReaderOrErr = gsym::GsymReader::openFile(GsymFile);
+
+    if (ReaderOrErr) {
+      std::unique_ptr<gsym::GsymReader> Reader =
+          std::make_unique<gsym::GsymReader>(std::move(*ReaderOrErr));
+
+      Context = std::make_unique<gsym::GsymDIContext>(std::move(Reader));
+    }
+  }
+  if (!Context) {
+    if (auto CoffObject = dyn_cast<COFFObjectFile>(Objects.first)) {
+      const codeview::DebugInfo *DebugInfo;
+      StringRef PDBFileName;
+      auto EC = CoffObject->getDebugPDBInfo(DebugInfo, PDBFileName);
+      // Use DWARF if there're DWARF sections.
+      bool HasDwarf = llvm::any_of(
+          Objects.first->sections(), [](SectionRef Section) -> bool {
+            if (Expected<StringRef> SectionName = Section.getName())
+              return SectionName.get() == ".debug_info";
+            return false;
+          });
+      if (!EC && !HasDwarf && DebugInfo != nullptr && !PDBFileName.empty()) {
+        using namespace pdb;
+        std::unique_ptr<IPDBSession> Session;
+
+        PDB_ReaderType ReaderType =
+            Opts.UseDIA ? PDB_ReaderType::DIA : PDB_ReaderType::Native;
+        if (auto Err = loadDataForEXE(ReaderType, Objects.first->getFileName(),
+                                      Session)) {
+          Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>());
+          // Return along the PDB filename to provide more context
+          return createFileError(PDBFileName, std::move(Err));
+        }
+        Context.reset(new PDBContext(*CoffObject, std::move(Session)));
       }
-      Context.reset(new PDBContext(*CoffObject, std::move(Session)));
     }
   }
   if (!Context)
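
The search order used by `lookUpGsymFile` in the diff above (first `<binary>.gsym` beside the binary, then `<basename>.gsym` in each configured directory, unless GSYM is disabled) can be sketched with the standard library alone. This is a hedged approximation: `findGsymFile` and `isExistingNonDirectory` are hypothetical names, `std::filesystem` stands in for `llvm::sys::fs` and `llvm::sys::path`, and the injectable `IsFile` predicate is an addition made here so the logic can be exercised without touching the disk.

```cpp
#include <filesystem>
#include <functional>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// True if Path names something that exists and is not a directory.
bool isExistingNonDirectory(const std::string &Path) {
  std::error_code EC;
  const fs::file_status Status = fs::status(Path, EC);
  return fs::exists(Status) && !fs::is_directory(Status);
}

// Returns the first matching .gsym path, or an empty string if none is found.
std::string findGsymFile(
    const std::string &BinaryPath, const std::vector<std::string> &GsymDirs,
    bool DisableGsym,
    const std::function<bool(const std::string &)> &IsFile =
        isExistingNonDirectory) {
  if (DisableGsym)
    return {};

  // First, look beside the binary.
  const std::string Beside = BinaryPath + ".gsym";
  if (IsFile(Beside))
    return Beside;

  // Then, look for "<basename>.gsym" in each configured directory, in order.
  const std::string Name = fs::path(BinaryPath).filename().string() + ".gsym";
  for (const std::string &Dir : GsymDirs) {
    const std::string Candidate = (fs::path(Dir) / Name).string();
    if (IsFile(Candidate))
      return Candidate;
  }
  return {};
}
```

An empty result corresponds to the case where the symbolizer in the patch falls through to the PDB and DWARF paths.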
diff --git a/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe b/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe
new file mode 100755
index 0000000000000..f6f013b245822
Binary files /dev/null and b/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe differ
diff --git a/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe.gsym b/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe.gsym
new file mode 100644
index 0000000000000..a46f78b9d880c
Binary files /dev/null and b/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe.gsym differ
diff --git a/llvm/test/tools/llvm-symbolizer/sym-gsymonly.test b/llvm/test/tools/llvm-symbolizer/sym-gsymonly.test
new file mode 100644
index 0000000000000..0d00c002a2bdb
--- /dev/null
+++ b/llvm/test/tools/llvm-symbolizer/sym-gsymonly.test
@@ -0,0 +1,93 @@
+# This test is a variant of sym.test. It uses a binary without DWARF debug
+# info, but a corresponding .gsym file. The expectations are the same, except
+# for the fact that GSYM doesn't provide us with column numbers.
+#
+# Source:
+# #include <stdio.h>
+# static inline int inctwo (int *a) {
+#   printf ("%d\n",(*a)++);
+#   return (*a)++;
+# }
+# static inline int inc (int *a) {
+#   printf ("%d\n",inctwo(a));
+#   return (*a)++;
+# }
+#
+#
+# int main () {
+#   int x = 1;
+#   return inc(&x);
+# }
+#
+# Build as : clang -g -O2 addr.c
+extract gsym file as : llvm-gsymutil --convert=%p/Inputs/addr.exe --out-file=%p/Inputs/addr-gsymonly.exe.gsym
+strip debug as : llvm-objcopy --strip-debug %p/Inputs/addr.exe %p/Inputs/addr-gsymonly.exe
+
+
+RUN: llvm-symbolizer --print-address --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck %s
+RUN: llvm-symbolizer --addresses --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck %s
+RUN: llvm-symbolizer -a --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck %s
+
+CHECK: ??:0:0
+CHECK-EMPTY:
+CHECK-NEXT: 0x40054d
+CHECK-NEXT: inctwo
+CHECK-NEXT: {{[/\]+}}tmp{{[/\]+}}x.c:3:0
+CHECK-NEXT: inc
+CHECK-NEXT: {{[/\]+}}tmp{{[/\]+}}x.c:7:0
+CHECK-NEXT: main
+CHECK-NEXT: {{[/\]+}}tmp{{[/\]+}}x.c:14:0
+CHECK-EMPTY:
+CHECK-NEXT: ??
+CHECK-NEXT: ??:0:0
+
+RUN: llvm-symbolizer --inlining --print-address --pretty-print --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s 
+RUN: llvm-symbolizer --inlining --print-address -p --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+RUN: llvm-symbolizer --inlines --print-address --pretty-print --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+RUN: llvm-symbolizer --inlines --print-address -p --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+RUN: llvm-symbolizer -i --print-address --pretty-print --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+RUN: llvm-symbolizer -i --print-address -p --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+
+# Before 2020-08-04, asan_symbolize.py passed --inlining=true.
+# Support this compatibility alias for a while.
+RUN: llvm-symbolizer --inlining=true --print-address -p --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+
+PRETTY: ??:0:0
+PRETTY: {{[0x]+}}40054d: inctwo at {{[/\]+}}tmp{{[/\]+}}x.c:3:0
+PRE...
[truncated]
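
In the `GsymDIContext.cpp` diff above, `getLineInfoForAddressRange` selects line-table entries in the half-open range `[Address, Address + Size)` but leaves the loop body as a comment, so the returned table is empty. The sketch below shows only the range filter, under stated assumptions: `LineRow` and `rowsInRange` are hypothetical names, `LineRow` is a simplified stand-in for a GSYM line entry, and the overflow guard is an addition that is not in the patch.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical, simplified stand-in for one GSYM line-table row.
struct LineRow {
  uint64_t Addr; // address of the row
  uint32_t File; // index into the file table
  uint32_t Line; // source line number
};

// Collect the rows whose address lies in [Start, Start + Size).
std::vector<LineRow> rowsInRange(const std::vector<LineRow> &Table,
                                 uint64_t Start, uint64_t Size) {
  std::vector<LineRow> Out;
  if (Size == 0)
    return Out;

  // Guard against Start + Size wrapping around; if it would wrap, the range
  // is treated as extending to the top of the address space.
  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  const bool Wraps = Size > Max - Start;
  const uint64_t End = Wraps ? Max : Start + Size;

  for (const LineRow &Row : Table)
    if (Start <= Row.Addr && (Wraps || Row.Addr < End))
      Out.push_back(Row);
  return Out;
}
```

Turning each selected row into a `DILineInfo` would additionally require resolving the file index through the reader, which this sketch does not attempt.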

llvmbot (Member) commented May 13, 2025

@llvm/pr-subscribers-debuginfo

Author: Mariusz Kwiczala (sfc-gh-mkwiczala)

Patch is 25.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139686.diff

14 Files Affected:

  • (modified) compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh (+2-1)
  • (modified) llvm/include/llvm/DebugInfo/DIContext.h (+1-1)
  • (added) llvm/include/llvm/DebugInfo/GSYM/GsymDIContext.h (+66)
  • (modified) llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h (+3)
  • (modified) llvm/lib/DebugInfo/GSYM/CMakeLists.txt (+1)
  • (added) llvm/lib/DebugInfo/GSYM/GsymDIContext.cpp (+166)
  • (modified) llvm/lib/DebugInfo/Symbolize/CMakeLists.txt (+1)
  • (modified) llvm/lib/DebugInfo/Symbolize/Symbolize.cpp (+71-23)
  • (added) llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe ()
  • (added) llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe.gsym ()
  • (added) llvm/test/tools/llvm-symbolizer/sym-gsymonly.test (+93)
  • (modified) llvm/tools/llvm-symbolizer/Opts.td (+5)
  • (modified) llvm/tools/llvm-symbolizer/llvm-symbolizer.cpp (+2)
  • (modified) llvm/utils/gn/secondary/llvm/lib/DebugInfo/GSYM/BUILD.gn (+1)
diff --git a/compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh b/compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh
index a7b78f885eea4..fe49d944d3a2f 100755
--- a/compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh
+++ b/compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh
@@ -145,7 +145,7 @@ if [[ ! -f ${LLVM_BUILD}/build.ninja ]]; then
   $LLVM_SRC
 fi
 cd ${LLVM_BUILD}
-ninja LLVMSymbolize LLVMObject LLVMBinaryFormat LLVMDebugInfoDWARF LLVMSupport LLVMDebugInfoPDB LLVMDebuginfod LLVMMC LLVMDemangle LLVMTextAPI LLVMTargetParser LLVMCore
+ninja LLVMSymbolize LLVMObject LLVMBinaryFormat LLVMDebugInfoDWARF LLVMDebugInfoGSYM LLVMSupport LLVMDebugInfoPDB LLVMDebuginfod LLVMMC LLVMDemangle LLVMTextAPI LLVMTargetParser LLVMCore
 
 cd ${BUILD_DIR}
 rm -rf ${SYMBOLIZER_BUILD}
@@ -174,6 +174,7 @@ $LINK $LIBCXX_ARCHIVE_DIR/libc++.a \
       $LLVM_BUILD/lib/libLLVMObject.a \
       $LLVM_BUILD/lib/libLLVMBinaryFormat.a \
       $LLVM_BUILD/lib/libLLVMDebugInfoDWARF.a \
+      $LLVM_BUILD/lib/libLLVMDebugInfoGSYM.a \
       $LLVM_BUILD/lib/libLLVMSupport.a \
       $LLVM_BUILD/lib/libLLVMDebugInfoPDB.a \
       $LLVM_BUILD/lib/libLLVMDebugInfoMSF.a \
diff --git a/llvm/include/llvm/DebugInfo/DIContext.h b/llvm/include/llvm/DebugInfo/DIContext.h
index c90b99987f1db..0347f90c236d1 100644
--- a/llvm/include/llvm/DebugInfo/DIContext.h
+++ b/llvm/include/llvm/DebugInfo/DIContext.h
@@ -238,7 +238,7 @@ struct DIDumpOptions {
 
 class DIContext {
 public:
-  enum DIContextKind { CK_DWARF, CK_PDB, CK_BTF };
+  enum DIContextKind { CK_DWARF, CK_PDB, CK_BTF, CK_GSYM };
 
   DIContext(DIContextKind K) : Kind(K) {}
   virtual ~DIContext() = default;
diff --git a/llvm/include/llvm/DebugInfo/GSYM/GsymDIContext.h b/llvm/include/llvm/DebugInfo/GSYM/GsymDIContext.h
new file mode 100644
index 0000000000000..396c08c608d25
--- /dev/null
+++ b/llvm/include/llvm/DebugInfo/GSYM/GsymDIContext.h
@@ -0,0 +1,66 @@
+//===-- GsymDIContext.h --------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===/
+
+#ifndef LLVM_DEBUGINFO_GSYM_GSYMDICONTEXT_H
+#define LLVM_DEBUGINFO_GSYM_GSYMDICONTEXT_H
+
+#include "llvm/DebugInfo/DIContext.h"
+#include <cstdint>
+#include <memory>
+#include <string>
+
+namespace llvm {
+
+namespace gsym {
+
+class GsymReader;
+
+/// GSYM DI Context
+/// This data structure is the top level entity that deals with GSYM
+/// symbolication.
+/// This data structure exists only when there is a need for a transparent
+/// interface to different symbolication formats (e.g. GSYM, PDB and DWARF).
+/// More control and power over the debug information access can be had by using
+/// the GSYM interfaces directly.
+class GsymDIContext : public DIContext {
+public:
+  GsymDIContext(std::unique_ptr<GsymReader> Reader);
+
+  GsymDIContext(GsymDIContext &) = delete;
+  GsymDIContext &operator=(GsymDIContext &) = delete;
+
+  static bool classof(const DIContext *DICtx) {
+    return DICtx->getKind() == CK_GSYM;
+  }
+
+  void dump(raw_ostream &OS, DIDumpOptions DIDumpOpts) override;
+
+  std::optional<DILineInfo> getLineInfoForAddress(
+      object::SectionedAddress Address,
+      DILineInfoSpecifier Specifier = DILineInfoSpecifier()) override;
+  std::optional<DILineInfo>
+  getLineInfoForDataAddress(object::SectionedAddress Address) override;
+  DILineInfoTable getLineInfoForAddressRange(
+      object::SectionedAddress Address, uint64_t Size,
+      DILineInfoSpecifier Specifier = DILineInfoSpecifier()) override;
+  DIInliningInfo getInliningInfoForAddress(
+      object::SectionedAddress Address,
+      DILineInfoSpecifier Specifier = DILineInfoSpecifier()) override;
+
+  std::vector<DILocal>
+  getLocalsForAddress(object::SectionedAddress Address) override;
+
+private:
+  const std::unique_ptr<GsymReader> Reader;
+};
+
+} // end namespace gsym
+
+} // end namespace llvm
+
+#endif // LLVM_DEBUGINFO_GSYM_GSYMDICONTEXT_H
diff --git a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
index 5747ad99d0f13..7c6beaa2189b7 100644
--- a/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
+++ b/llvm/include/llvm/DebugInfo/Symbolize/Symbolize.h
@@ -58,11 +58,13 @@ class LLVMSymbolizer {
     bool RelativeAddresses = false;
     bool UntagAddresses = false;
     bool UseDIA = false;
+    bool DisableGsym = false;
     std::string DefaultArch;
     std::vector<std::string> DsymHints;
     std::string FallbackDebugPath;
     std::string DWPName;
     std::vector<std::string> DebugFileDirectory;
+    std::vector<std::string> GsymFileDirectory;
     size_t MaxCacheSize =
         sizeof(size_t) == 4
             ? 512 * 1024 * 1024 /* 512 MiB */
@@ -177,6 +179,7 @@ class LLVMSymbolizer {
   ObjectFile *lookUpBuildIDObject(const std::string &Path,
                                   const ELFObjectFileBase *Obj,
                                   const std::string &ArchName);
+  std::string lookUpGsymFile(const std::string &Path);
 
   bool findDebugBinary(const std::string &OrigPath,
                        const std::string &DebuglinkName, uint32_t CRCHash,
diff --git a/llvm/lib/DebugInfo/GSYM/CMakeLists.txt b/llvm/lib/DebugInfo/GSYM/CMakeLists.txt
index c27d648db62f6..724b5b213d643 100644
--- a/llvm/lib/DebugInfo/GSYM/CMakeLists.txt
+++ b/llvm/lib/DebugInfo/GSYM/CMakeLists.txt
@@ -4,6 +4,7 @@ add_llvm_component_library(LLVMDebugInfoGSYM
   FileWriter.cpp
   FunctionInfo.cpp
   GsymCreator.cpp
+  GsymDIContext.cpp
   GsymReader.cpp
   InlineInfo.cpp
   LineTable.cpp
diff --git a/llvm/lib/DebugInfo/GSYM/GsymDIContext.cpp b/llvm/lib/DebugInfo/GSYM/GsymDIContext.cpp
new file mode 100644
index 0000000000000..68024a9c9e782
--- /dev/null
+++ b/llvm/lib/DebugInfo/GSYM/GsymDIContext.cpp
@@ -0,0 +1,166 @@
+//===-- GsymDIContext.cpp ------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===/
+
+#include "llvm/DebugInfo/GSYM/GsymDIContext.h"
+
+#include "llvm/DebugInfo/GSYM/GsymReader.h"
+#include "llvm/Support/Path.h"
+
+using namespace llvm;
+using namespace llvm::gsym;
+
+GsymDIContext::GsymDIContext(std::unique_ptr<GsymReader> Reader)
+    : DIContext(CK_GSYM), Reader(std::move(Reader)) {}
+
+void GsymDIContext::dump(raw_ostream &OS, DIDumpOptions DumpOpts) {}
+
+static bool fillLineInfoFromLocation(const SourceLocation &Location,
+                                     DILineInfoSpecifier Specifier,
+                                     DILineInfo &LineInfo) {
+  // FIXME Demangle in case of DINameKind::ShortName
+  if (Specifier.FNKind != DINameKind::None) {
+    LineInfo.FunctionName = Location.Name.str();
+  }
+
+  switch (Specifier.FLIKind) {
+  case DILineInfoSpecifier::FileLineInfoKind::RelativeFilePath:
+    // We have no information to determine the relative path, so we fall back to
+    // returning the absolute path.
+  case DILineInfoSpecifier::FileLineInfoKind::RawValue:
+  case DILineInfoSpecifier::FileLineInfoKind::AbsoluteFilePath:
+    if (Location.Dir.empty()) {
+      if (Location.Base.empty())
+        LineInfo.FileName = DILineInfo::BadString;
+      else
+        LineInfo.FileName = Location.Base.str();
+    } else {
+      SmallString<128> Path(Location.Dir);
+      sys::path::append(Path, Location.Base);
+      LineInfo.FileName = static_cast<std::string>(Path);
+    }
+    break;
+
+  case DILineInfoSpecifier::FileLineInfoKind::BaseNameOnly:
+    LineInfo.FileName = Location.Base.str();
+    break;
+
+  default:
+    return false;
+  }
+  LineInfo.Line = Location.Line;
+
+  // We don't have information in GSYM to fill any of the Source, Column,
+  // StartFileName or StartLine attributes.
+
+  return true;
+}
+
+std::optional<DILineInfo>
+GsymDIContext::getLineInfoForAddress(object::SectionedAddress Address,
+                                     DILineInfoSpecifier Specifier) {
+  if (Address.SectionIndex != object::SectionedAddress::UndefSection)
+    return {};
+
+  auto ResultOrErr = Reader->lookup(Address.Address);
+
+  if (!ResultOrErr) {
+    consumeError(ResultOrErr.takeError());
+    return {};
+  }
+
+  const auto &Result = *ResultOrErr;
+
+  DILineInfo LineInfo;
+
+  if (Result.Locations.empty()) {
+    // No debug info for this, we just had a symbol from the symbol table.
+
+    // FIXME Demangle in case of DINameKind::ShortName
+    if (Specifier.FNKind != DINameKind::None)
+      LineInfo.FunctionName = Result.FuncName.str();
+  } else if (!fillLineInfoFromLocation(Result.Locations.front(), Specifier,
+                                       LineInfo))
+    return {};
+
+  LineInfo.StartAddress = Result.FuncRange.start();
+
+  return LineInfo;
+}
+
+std::optional<DILineInfo>
+GsymDIContext::getLineInfoForDataAddress(object::SectionedAddress Address) {
+  // We can't implement this, there's no such information in the GSYM file.
+
+  return {};
+}
+
+DILineInfoTable
+GsymDIContext::getLineInfoForAddressRange(object::SectionedAddress Address,
+                                          uint64_t Size,
+                                          DILineInfoSpecifier Specifier) {
+  if (Size == 0)
+    return DILineInfoTable();
+
+  if (Address.SectionIndex != llvm::object::SectionedAddress::UndefSection)
+    return DILineInfoTable();
+
+  if (auto FuncInfoOrErr = Reader->getFunctionInfo(Address.Address)) {
+    DILineInfoTable Table;
+    if (FuncInfoOrErr->OptLineTable) {
+      const gsym::LineTable &LT = *FuncInfoOrErr->OptLineTable;
+      const uint64_t StartAddr = Address.Address;
+      const uint64_t EndAddr = Address.Address + Size;
+      for (const auto &LineEntry : LT) {
+        if (StartAddr <= LineEntry.Addr && LineEntry.Addr < EndAddr) {
+          // Use LineEntry.Addr, LineEntry.File (which is a file index into the
+          // files tables from the GsymReader), and LineEntry.Line (source line
+          // number) to add stuff to the DILineInfoTable
+        }
+      }
+    }
+    return Table;
+  } else {
+    consumeError(FuncInfoOrErr.takeError());
+    return DILineInfoTable();
+  }
+}
+
+DIInliningInfo
+GsymDIContext::getInliningInfoForAddress(object::SectionedAddress Address,
+                                         DILineInfoSpecifier Specifier) {
+  auto ResultOrErr = Reader->lookup(Address.Address);
+
+  if (!ResultOrErr)
+    return {};
+
+  const auto &Result = *ResultOrErr;
+
+  DIInliningInfo InlineInfo;
+
+  for (const auto &Location : Result.Locations) {
+    DILineInfo LineInfo;
+
+    if (!fillLineInfoFromLocation(Location, Specifier, LineInfo))
+      return {};
+
+    // Hm, that's probably something that should only be filled in the first or
+    // last frame?
+    LineInfo.StartAddress = Result.FuncRange.start();
+
+    InlineInfo.addFrame(LineInfo);
+  }
+
+  return InlineInfo;
+}
+
+std::vector<DILocal>
+GsymDIContext::getLocalsForAddress(object::SectionedAddress Address) {
+  // We can't implement this, there's no such information in the GSYM file.
+
+  return {};
+}
diff --git a/llvm/lib/DebugInfo/Symbolize/CMakeLists.txt b/llvm/lib/DebugInfo/Symbolize/CMakeLists.txt
index 29f62bf6156fc..7aef3b0d79a3a 100644
--- a/llvm/lib/DebugInfo/Symbolize/CMakeLists.txt
+++ b/llvm/lib/DebugInfo/Symbolize/CMakeLists.txt
@@ -10,6 +10,7 @@ add_llvm_component_library(LLVMSymbolize
 
   LINK_COMPONENTS
   DebugInfoDWARF
+  DebugInfoGSYM
   DebugInfoPDB
   DebugInfoBTF
   Object
diff --git a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
index 1d8217ad587ec..78a1421005de2 100644
--- a/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
+++ b/llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
@@ -15,6 +15,8 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/DebugInfo/BTF/BTFContext.h"
 #include "llvm/DebugInfo/DWARF/DWARFContext.h"
+#include "llvm/DebugInfo/GSYM/GsymDIContext.h"
+#include "llvm/DebugInfo/GSYM/GsymReader.h"
 #include "llvm/DebugInfo/PDB/PDB.h"
 #include "llvm/DebugInfo/PDB/PDBContext.h"
 #include "llvm/DebugInfo/Symbolize/SymbolizableObjectFile.h"
@@ -498,6 +500,34 @@ bool LLVMSymbolizer::getOrFindDebugBinary(const ArrayRef<uint8_t> BuildID,
   return false;
 }
 
+std::string LLVMSymbolizer::lookUpGsymFile(const std::string &Path) {
+  if (Opts.DisableGsym)
+    return {};
+
+  auto CheckGsymFile = [](const llvm::StringRef &GsymPath) {
+    sys::fs::file_status Status;
+    std::error_code EC = llvm::sys::fs::status(GsymPath, Status);
+    return !EC && !llvm::sys::fs::is_directory(Status);
+  };
+
+  // First, look beside the binary file
+  if (const auto GsymPath = Path + ".gsym"; CheckGsymFile(GsymPath))
+    return GsymPath;
+
+  // Then, look in the directories specified by GsymFileDirectory
+
+  for (const auto &Directory : Opts.GsymFileDirectory) {
+    SmallString<16> GsymPath = llvm::StringRef{Directory};
+    llvm::sys::path::append(GsymPath,
+                            llvm::sys::path::filename(Path) + ".gsym");
+
+    if (CheckGsymFile(GsymPath))
+      return static_cast<std::string>(GsymPath);
+  }
+
+  return {};
+}
+
 Expected<LLVMSymbolizer::ObjectPair>
 LLVMSymbolizer::getOrCreateObjectPair(const std::string &Path,
                                       const std::string &ArchName) {
@@ -634,30 +664,48 @@ LLVMSymbolizer::getOrCreateModuleInfo(StringRef ModuleName) {
   std::unique_ptr<DIContext> Context;
   // If this is a COFF object containing PDB info and not containing DWARF
   // section, use a PDBContext to symbolize. Otherwise, use DWARF.
-  if (auto CoffObject = dyn_cast<COFFObjectFile>(Objects.first)) {
-    const codeview::DebugInfo *DebugInfo;
-    StringRef PDBFileName;
-    auto EC = CoffObject->getDebugPDBInfo(DebugInfo, PDBFileName);
-    // Use DWARF if there're DWARF sections.
-    bool HasDwarf =
-        llvm::any_of(Objects.first->sections(), [](SectionRef Section) -> bool {
-          if (Expected<StringRef> SectionName = Section.getName())
-            return SectionName.get() == ".debug_info";
-          return false;
-        });
-    if (!EC && !HasDwarf && DebugInfo != nullptr && !PDBFileName.empty()) {
-      using namespace pdb;
-      std::unique_ptr<IPDBSession> Session;
-
-      PDB_ReaderType ReaderType =
-          Opts.UseDIA ? PDB_ReaderType::DIA : PDB_ReaderType::Native;
-      if (auto Err = loadDataForEXE(ReaderType, Objects.first->getFileName(),
-                                    Session)) {
-        Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>());
-        // Return along the PDB filename to provide more context
-        return createFileError(PDBFileName, std::move(Err));
+  // Create a DIContext to symbolize as follows:
+  // - If there is a GSYM file, create a GsymDIContext.
+  // - Otherwise, if this is a COFF object containing PDB info, create a
+  // PDBContext.
+  // - Otherwise, create a DWARFContext.
+  const auto GsymFile = lookUpGsymFile(BinaryName.str());
+  if (!GsymFile.empty()) {
+    auto ReaderOrErr = gsym::GsymReader::openFile(GsymFile);
+
+    if (ReaderOrErr) {
+      std::unique_ptr<gsym::GsymReader> Reader =
+          std::make_unique<gsym::GsymReader>(std::move(*ReaderOrErr));
+
+      Context = std::make_unique<gsym::GsymDIContext>(std::move(Reader));
+    }
+  }
+  if (!Context) {
+    if (auto CoffObject = dyn_cast<COFFObjectFile>(Objects.first)) {
+      const codeview::DebugInfo *DebugInfo;
+      StringRef PDBFileName;
+      auto EC = CoffObject->getDebugPDBInfo(DebugInfo, PDBFileName);
+      // Use DWARF if there're DWARF sections.
+      bool HasDwarf = llvm::any_of(
+          Objects.first->sections(), [](SectionRef Section) -> bool {
+            if (Expected<StringRef> SectionName = Section.getName())
+              return SectionName.get() == ".debug_info";
+            return false;
+          });
+      if (!EC && !HasDwarf && DebugInfo != nullptr && !PDBFileName.empty()) {
+        using namespace pdb;
+        std::unique_ptr<IPDBSession> Session;
+
+        PDB_ReaderType ReaderType =
+            Opts.UseDIA ? PDB_ReaderType::DIA : PDB_ReaderType::Native;
+        if (auto Err = loadDataForEXE(ReaderType, Objects.first->getFileName(),
+                                      Session)) {
+          Modules.emplace(ModuleName, std::unique_ptr<SymbolizableModule>());
+          // Return along the PDB filename to provide more context
+          return createFileError(PDBFileName, std::move(Err));
+        }
+        Context.reset(new PDBContext(*CoffObject, std::move(Session)));
       }
-      Context.reset(new PDBContext(*CoffObject, std::move(Session)));
     }
   }
   if (!Context)
diff --git a/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe b/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe
new file mode 100755
index 0000000000000..f6f013b245822
Binary files /dev/null and b/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe differ
diff --git a/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe.gsym b/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe.gsym
new file mode 100644
index 0000000000000..a46f78b9d880c
Binary files /dev/null and b/llvm/test/tools/llvm-symbolizer/Inputs/addr-gsymonly.exe.gsym differ
diff --git a/llvm/test/tools/llvm-symbolizer/sym-gsymonly.test b/llvm/test/tools/llvm-symbolizer/sym-gsymonly.test
new file mode 100644
index 0000000000000..0d00c002a2bdb
--- /dev/null
+++ b/llvm/test/tools/llvm-symbolizer/sym-gsymonly.test
@@ -0,0 +1,93 @@
+# This test is a variant of sym.test. It uses a binary without DWARF debug
+# info, but a corresponding .gsym file. The expectations are the same, except
+# for the fact that GSYM doesn't provide us with column numbers.
+#
+# Source:
+# #include <stdio.h>
+# static inline int inctwo (int *a) {
+#   printf ("%d\n",(*a)++);
+#   return (*a)++;
+# }
+# static inline int inc (int *a) {
+#   printf ("%d\n",inctwo(a));
+#   return (*a)++;
+# }
+#
+#
+# int main () {
+#   int x = 1;
+#   return inc(&x);
+# }
+#
+# Build as : clang -g -O2 addr.c
extract gsym file as : llvm-gsymutil --convert=%p/Inputs/addr.exe --out-file=%p/Inputs/addr-gsymonly.exe.gsym
+strip debug as : llvm-objcopy --strip-debug %p/Inputs/addr.exe %p/Inputs/addr-gsymonly.exe
+
+
+RUN: llvm-symbolizer --print-address --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck %s
+RUN: llvm-symbolizer --addresses --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck %s
+RUN: llvm-symbolizer -a --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck %s
+
+CHECK: ??:0:0
+CHECK-EMPTY:
+CHECK-NEXT: 0x40054d
+CHECK-NEXT: inctwo
+CHECK-NEXT: {{[/\]+}}tmp{{[/\]+}}x.c:3:0
+CHECK-NEXT: inc
+CHECK-NEXT: {{[/\]+}}tmp{{[/\]+}}x.c:7:0
+CHECK-NEXT: main
+CHECK-NEXT: {{[/\]+}}tmp{{[/\]+}}x.c:14:0
+CHECK-EMPTY:
+CHECK-NEXT: ??
+CHECK-NEXT: ??:0:0
+
+RUN: llvm-symbolizer --inlining --print-address --pretty-print --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s 
+RUN: llvm-symbolizer --inlining --print-address -p --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+RUN: llvm-symbolizer --inlines --print-address --pretty-print --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+RUN: llvm-symbolizer --inlines --print-address -p --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+RUN: llvm-symbolizer -i --print-address --pretty-print --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+RUN: llvm-symbolizer -i --print-address -p --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+
+# Before 2020-08-04, asan_symbolize.py passed --inlining=true.
+# Support this compatibility alias for a while.
+RUN: llvm-symbolizer --inlining=true --print-address -p --obj=%p/Inputs/addr-gsymonly.exe < %p/Inputs/addr.inp | FileCheck -check-prefix="PRETTY" %s
+
+PRETTY: ??:0:0
+PRETTY: {{[0x]+}}40054d: inctwo at {{[/\]+}}tmp{{[/\]+}}x.c:3:0
+PRE...
[truncated]

@sfc-gh-mkwiczala
Contributor Author

@dwblaikie can we try to approve/merge it one more time?

@dwblaikie
Collaborator

Could you include more details directly in the PR description about the failures and what has changed in this patch to address them?

@sfc-gh-mkwiczala
Contributor Author

> Could you include more details directly in the PR description about the failures and what has changed in this patch to address them?

@dwblaikie I added a "Fix:" section to the PR description.
See commit ba55425.
That's the only change compared to #134847.

@dwblaikie dwblaikie merged commit f4b80b9 into llvm:main May 13, 2025
12 of 15 checks passed
@llvm-ci
Collaborator

llvm-ci commented May 13, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime-2 running on rocm-worker-hw-02 while building compiler-rt,llvm at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/5274

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libarcher :: races/parallel-simple.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 13
/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp  -gdwarf-4 -O1 -fsanitize=thread  -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src   /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp 2>&1 | tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp.log | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp -gdwarf-4 -O1 -fsanitize=thread -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp -latomic
# note: command had no output on stdout or stderr
# executed command: env TSAN_OPTIONS=ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1 /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp
# note: command had no output on stdout or stderr
# executed command: tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp.log
# note: command had no output on stdout or stderr
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c
# note: command had no output on stdout or stderr
# RUN: at line 14
/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp  -gdwarf-4 -O1 -fsanitize=thread  -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src   /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp -latomic && env ARCHER_OPTIONS="ignore_serial=1 report_data_leak=1" env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp 2>&1 | tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp.log | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp -gdwarf-4 -O1 -fsanitize=thread -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp -latomic
# note: command had no output on stdout or stderr
# executed command: env 'ARCHER_OPTIONS=ignore_serial=1 report_data_leak=1' env TSAN_OPTIONS=ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1 /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp
# note: command had no output on stdout or stderr
# executed command: tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/parallel-simple.c.tmp.log
# note: command had no output on stdout or stderr
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c
# .---command stderr------------
# | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c:36:11: error: CHECK: expected string not found in input
# | // CHECK: ThreadSanitizer: reported {{[1-7]}} warnings
# |           ^
# | <stdin>:26:5: note: scanning from here
# | DONE
# |     ^
# | <stdin>:27:1: note: possible intended match here
# | ThreadSanitizer: thread T4 finished with ignores enabled, created at:
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             .
# |             .
# |             .
# |            21:  #0 pthread_create /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1045:3 (parallel-simple.c.tmp+0xa2ada) 
# |            22:  #1 __kmp_create_worker z_Linux_util.cpp (libomp.so+0xcac82) 
# |            23:  
# |            24: SUMMARY: ThreadSanitizer: data race /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/parallel-simple.c:23:8 in main.omp_outlined_debug__ 
# |            25: ================== 
...

@llvm-ci
Collaborator

llvm-ci commented May 13, 2025

LLVM Buildbot has detected a new failure on builder premerge-monolithic-windows running on premerge-windows-1 while building compiler-rt,llvm at step 5 "clean-build-dir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/35/builds/10050

Here is the relevant piece of the build log for the reference
Step 5 (clean-build-dir) failure: Delete failed. (failure)
Step 8 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'lit :: timeout-hang.py' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 13
not env -u FILECHECK_OPTS "C:\Python39\python.exe" C:\ws\buildbot\premerge-monolithic-windows\llvm-project\llvm\utils\lit\lit.py -j1 --order=lexical Inputs/timeout-hang/run-nonexistent.txt  --timeout=1 --param external=0 | "C:\Python39\python.exe" C:\ws\buildbot\premerge-monolithic-windows\build\utils\lit\tests\timeout-hang.py 1
# executed command: not env -u FILECHECK_OPTS 'C:\Python39\python.exe' 'C:\ws\buildbot\premerge-monolithic-windows\llvm-project\llvm\utils\lit\lit.py' -j1 --order=lexical Inputs/timeout-hang/run-nonexistent.txt --timeout=1 --param external=0
# .---command stderr------------
# | lit.py: C:\ws\buildbot\premerge-monolithic-windows\llvm-project\llvm\utils\lit\lit\main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 1 seconds was requested on the command line. Forcing timeout to be 1 seconds.
# `-----------------------------
# executed command: 'C:\Python39\python.exe' 'C:\ws\buildbot\premerge-monolithic-windows\build\utils\lit\tests\timeout-hang.py' 1
# .---command stdout------------
# | Testing took as long or longer than timeout
# `-----------------------------
# error: command failed with exit status: 1

--

********************


/// interface to different symbolication formats (e.g. GSYM, PDB and DWARF).
/// More control and power over the debug information access can be had by using
/// the GSYM interfaces directly.
class GsymDIContext : public DIContext {
Member

This is just a nit, but I think GsymContext as a name would be more consistent with other formats, since we already have BTFContext, DWARFContext and PDBContext.

Contributor Author

@petrhosek Thank you.
PR created here: #140227
@petrhosek @dwblaikie could you review it?

dwblaikie pushed a commit that referenced this pull request May 16, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 16, 2025