Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

[clang] Catch missing format attributes #105479

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
Loading
from

Conversation

budimirarandjelovichtec
Copy link
Contributor

Enable flag -Wmissing-format-attribute to catch missing attributes.

Fixes #60718

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Aug 21, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 21, 2024

@llvm/pr-subscribers-clang

Author: Budimir Aranđelović (budimirarandjelovichtec)

Changes

Enable flag -Wmissing-format-attribute to catch missing attributes.

Fixes #60718


Patch is 40.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/105479.diff

9 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+3)
  • (modified) clang/include/clang/Basic/DiagnosticGroups.td (-1)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+3)
  • (modified) clang/include/clang/Sema/Attr.h (+7)
  • (modified) clang/include/clang/Sema/Sema.h (+4)
  • (modified) clang/lib/Sema/SemaDecl.cpp (+2)
  • (modified) clang/lib/Sema/SemaDeclAttr.cpp (+214-2)
  • (added) clang/test/Sema/attr-format-missing.c (+393)
  • (added) clang/test/Sema/attr-format-missing.cpp (+174)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 6adf57da42e656..f9fd4a20ea6bf6 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -707,6 +707,9 @@ Improvements to Clang's diagnostics
 
 - Clang now diagnoses integer constant expressions that are folded to a constant value as an extension in more circumstances. Fixes #GH59863
 
+- Clang now diagnoses missing format attributes for non-template functions and
+  class/struct/union members. Fixes #GH60718
+
 Improvements to Clang's time-trace
 ----------------------------------
 
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index 2241f8481484e2..da6a3b2fe3571b 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -525,7 +525,6 @@ def MainReturnType : DiagGroup<"main-return-type">;
 def MaxUnsignedZero : DiagGroup<"max-unsigned-zero">;
 def MissingBraces : DiagGroup<"missing-braces">;
 def MissingDeclarations: DiagGroup<"missing-declarations">;
-def : DiagGroup<"missing-format-attribute">;
 def MissingIncludeDirs : DiagGroup<"missing-include-dirs">;
 def MissingNoreturn : DiagGroup<"missing-noreturn">;
 def MultiChar : DiagGroup<"multichar">;
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 940f9ac226ca87..75e51c5aa166d2 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -1031,6 +1031,9 @@ def err_opencl_invalid_param : Error<
 def err_opencl_invalid_return : Error<
   "declaring function return value of type %0 is not allowed %select{; did you forget * ?|}1">;
 def warn_enum_value_overflow : Warning<"overflow in enumeration value">;
+def warn_missing_format_attribute : Warning<
+  "diagnostic behavior may be improved by adding the %0 format attribute to the declaration of %1">,
+  InGroup<DiagGroup<"missing-format-attribute">>, DefaultIgnore;
 def warn_pragma_options_align_reset_failed : Warning<
   "#pragma options align=reset failed: %0">,
   InGroup<IgnoredPragmas>;
diff --git a/clang/include/clang/Sema/Attr.h b/clang/include/clang/Sema/Attr.h
index 3f0b10212789a4..37c124ca7b454a 100644
--- a/clang/include/clang/Sema/Attr.h
+++ b/clang/include/clang/Sema/Attr.h
@@ -123,6 +123,13 @@ inline bool isInstanceMethod(const Decl *D) {
   return false;
 }
 
+inline bool checkIfMethodHasImplicitObjectParameter(const Decl *D) {
+  if (const auto *MethodDecl = dyn_cast<CXXMethodDecl>(D))
+    return MethodDecl->isInstance() &&
+           !MethodDecl->hasCXXExplicitFunctionObjectParameter();
+  return false;
+}
+
 /// Diagnose mutually exclusive attributes when present on a given
 /// declaration. Returns true if diagnosed.
 template <typename AttrTy>
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 57994f4033922c..9022ac86edf817 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -4602,6 +4602,10 @@ class Sema final : public SemaBase {
 
   enum class RetainOwnershipKind { NS, CF, OS };
 
+  void DiagnoseMissingFormatAttributes(Stmt *Body, const FunctionDecl *FDecl);
+  std::vector<FormatAttr *>
+  GetMissingFormatAttributes(Stmt *Body, const FunctionDecl *FDecl);
+
   UuidAttr *mergeUuidAttr(Decl *D, const AttributeCommonInfo &CI,
                           StringRef UuidAsWritten, MSGuidDecl *GuidDecl);
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 66eeaa8e6f7777..29c28d4b567513 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -15925,6 +15925,8 @@ Decl *Sema::ActOnFinishFunctionBody(Decl *dcl, Stmt *Body,
         }
       }
 
+      DiagnoseMissingFormatAttributes(Body, FD);
+
       // We might not have found a prototype because we didn't wish to warn on
       // the lack of a missing prototype. Try again without the checks for
       // whether we want to warn on the missing prototype.
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index f2cd46d1e7c932..4634dd8b9cce80 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -3508,7 +3508,7 @@ static void handleFormatAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
 
   // In C++ the implicit 'this' function parameter also counts, and they are
   // counted from one.
-  bool HasImplicitThisParam = isInstanceMethod(D);
+  bool HasImplicitThisParam = checkIfMethodHasImplicitObjectParameter(D);
   unsigned NumArgs = getFunctionOrMethodNumParams(D) + HasImplicitThisParam;
 
   IdentifierInfo *II = AL.getArgAsIdent(0)->Ident;
@@ -3621,7 +3621,7 @@ static void handleCallbackAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
     return;
   }
 
-  bool HasImplicitThisParam = isInstanceMethod(D);
+  bool HasImplicitThisParam = checkIfMethodHasImplicitObjectParameter(D);
   int32_t NumArgs = getFunctionOrMethodNumParams(D);
 
   FunctionDecl *FD = D->getAsFunction();
@@ -5320,6 +5320,218 @@ static void handlePreferredTypeAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
   D->addAttr(::new (S.Context) PreferredTypeAttr(S.Context, AL, ParmTSI));
 }
 
+// This function is called only if function call is not inside template body.
+// TODO: Add call for function calls inside template body.
+// Emit warnings if parent function misses format attributes.
+void Sema::DiagnoseMissingFormatAttributes(Stmt *Body,
+                                           const FunctionDecl *FDecl) {
+  assert(FDecl);
+
+  // If there are no function body, exit.
+  if (!Body)
+    return;
+
+  // Get missing format attributes
+  std::vector<FormatAttr *> MissingFormatAttributes =
+      GetMissingFormatAttributes(Body, FDecl);
+  if (MissingFormatAttributes.empty())
+    return;
+
+  // Check if there are more than one format type found. In that case do not
+  // emit diagnostic.
+  const clang::IdentifierInfo *AttrType = MissingFormatAttributes[0]->getType();
+  if (llvm::any_of(MissingFormatAttributes, [&](const FormatAttr *Attr) {
+        return AttrType != Attr->getType();
+      }))
+    return;
+
+  for (const FormatAttr *FA : MissingFormatAttributes) {
+    // If format index and first-to-check argument index are negative, it means
+    // that this attribute is only saved for multiple format types checking.
+    if (FA->getFormatIdx() < 0 || FA->getFirstArg() < 0)
+      continue;
+
+    // Emit diagnostic
+    SourceLocation Loc = FDecl->getLocation();
+    Diag(Loc, diag::warn_missing_format_attribute)
+        << FA->getType() << FDecl
+        << FixItHint::CreateInsertion(Loc,
+                                      (llvm::Twine("__attribute__((format(") +
+                                       FA->getType()->getName() + ", " +
+                                       llvm::Twine(FA->getFormatIdx()) + ", " +
+                                       llvm::Twine(FA->getFirstArg()) + ")))")
+                                          .str());
+  }
+}
+
+// Returns vector of format attributes. There are no two attributes with same
+// arguments in returning vector. There can be attributes that effectivelly only
+// store information about format type.
+std::vector<FormatAttr *>
+Sema::GetMissingFormatAttributes(Stmt *Body, const FunctionDecl *FDecl) {
+  unsigned int FunctionFormatArgumentIndexOffset =
+      checkIfMethodHasImplicitObjectParameter(FDecl) ? 2 : 1;
+
+  std::vector<FormatAttr *> MissingAttributes;
+
+  // Iterate over body statements.
+  for (auto *Child : Body->children()) {
+    // If child statement is compound statement, recursively get missing
+    // attributes.
+    if (dyn_cast_or_null<CompoundStmt>(Child)) {
+      std::vector<FormatAttr *> CompoundStmtMissingAttributes =
+          GetMissingFormatAttributes(Child, FDecl);
+
+      // If there are already missing attributes with same arguments, do not add
+      // duplicates.
+      for (FormatAttr *FA : CompoundStmtMissingAttributes) {
+        if (!llvm::any_of(MissingAttributes, [&](const FormatAttr *Attr) {
+              return FA->getType() == Attr->getType() &&
+                     FA->getFormatIdx() == Attr->getFormatIdx() &&
+                     FA->getFirstArg() == Attr->getFirstArg();
+            }))
+          MissingAttributes.push_back(FA);
+      }
+
+      continue;
+    }
+
+    ValueStmt *VS = dyn_cast_or_null<ValueStmt>(Child);
+    if (!VS)
+      continue;
+    Expr *TheExpr = VS->getExprStmt();
+    if (!TheExpr)
+      continue;
+    CallExpr *TheCall = dyn_cast_or_null<CallExpr>(TheExpr);
+    if (!TheCall)
+      continue;
+    const FunctionDecl *ChildFunction =
+        dyn_cast_or_null<FunctionDecl>(TheCall->getCalleeDecl());
+    if (!ChildFunction)
+      continue;
+
+    Expr **Args = TheCall->getArgs();
+    unsigned int NumArgs = TheCall->getNumArgs();
+
+    // If child expression is function, check if it is format function.
+    // If it is, check if parent function misses format attributes.
+
+    // If child function is format function and format arguments are not
+    // relevant to emit diagnostic, save only information about format type
+    // (format index and first-to-check argument index are set to -1).
+    // Information about format type is later used to determine if there are
+    // more than one format type found.
+
+    unsigned int ChildFunctionFormatArgumentIndexOffset =
+        checkIfMethodHasImplicitObjectParameter(ChildFunction) ? 2 : 1;
+
+    // Check if function has format attribute with forwarded format string.
+    IdentifierInfo *AttrType;
+    const ParmVarDecl *FormatArg;
+    if (llvm::any_of(ChildFunction->specific_attrs<FormatAttr>(),
+                     [&](const FormatAttr *Attr) {
+                       AttrType = Attr->getType();
+
+                       int OffsetFormatIndex =
+                           Attr->getFormatIdx() -
+                           ChildFunctionFormatArgumentIndexOffset;
+                       if (OffsetFormatIndex < 0 ||
+                           (unsigned)OffsetFormatIndex >= NumArgs)
+                         return true;
+
+                       const auto *FormatArgExpr = dyn_cast<DeclRefExpr>(
+                           Args[OffsetFormatIndex]->IgnoreParenCasts());
+                       if (!FormatArgExpr)
+                         return true;
+
+                       FormatArg = dyn_cast_or_null<ParmVarDecl>(
+                           FormatArgExpr->getReferencedDeclOfCallee());
+                       if (!FormatArg)
+                         return true;
+
+                       return false;
+                     })) {
+      MissingAttributes.push_back(
+          FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1));
+      continue;
+    }
+
+    // Do not add in a vector format attributes whose type is different than
+    // parent function attribute type.
+    if (llvm::any_of(FDecl->specific_attrs<FormatAttr>(),
+                     [&](const FormatAttr *FunctionAttr) {
+                       return AttrType != FunctionAttr->getType();
+                     }))
+      continue;
+
+    // Check if format string argument is parent function parameter.
+    int StringIndex = 0;
+    if (!llvm::any_of(FDecl->parameters(), [&](const ParmVarDecl *Param) {
+          if (Param != FormatArg)
+            return false;
+
+          StringIndex = Param->getFunctionScopeIndex() +
+                        FunctionFormatArgumentIndexOffset;
+
+          return true;
+        })) {
+      MissingAttributes.push_back(
+          FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1));
+      continue;
+    }
+
+    unsigned NumOfParentFunctionParams = FDecl->getNumParams();
+
+    // Compare parent and calling function format attribute arguments (archetype
+    // and format string).
+    if (llvm::any_of(
+            FDecl->specific_attrs<FormatAttr>(), [&](const FormatAttr *Attr) {
+              if (Attr->getType() != AttrType)
+                return false;
+              int OffsetFormatIndex =
+                  Attr->getFormatIdx() - FunctionFormatArgumentIndexOffset;
+
+              if (OffsetFormatIndex < 0 ||
+                  (unsigned)OffsetFormatIndex >= NumOfParentFunctionParams)
+                return false;
+
+              if (FDecl->parameters()[OffsetFormatIndex] != FormatArg)
+                return false;
+
+              return true;
+            })) {
+      MissingAttributes.push_back(
+          FormatAttr::CreateImplicit(getASTContext(), AttrType, -1, -1));
+      continue;
+    }
+
+    // Get first argument index
+    int FirstToCheck = [&]() -> unsigned {
+      if (!FDecl->isVariadic())
+        return 0;
+      const auto *FirstToCheckArg = Args[NumArgs - 1]->IgnoreParenCasts();
+      if (!FirstToCheckArg ||
+          FirstToCheckArg->getType().getCanonicalType() !=
+              Context.getBuiltinVaListType().getCanonicalType())
+        return 0;
+      return NumOfParentFunctionParams + FunctionFormatArgumentIndexOffset;
+    }();
+
+    // If there are already attributes which arguments matches arguments
+    // detected in this iteration, do not add new attribute as it would be
+    // duplicate.
+    if (!llvm::any_of(MissingAttributes, [&](const FormatAttr *Attr) {
+          return Attr->getType() == AttrType &&
+                 Attr->getFormatIdx() == StringIndex &&
+                 Attr->getFirstArg() == FirstToCheck;
+        }))
+      MissingAttributes.push_back(FormatAttr::CreateImplicit(
+          getASTContext(), AttrType, StringIndex, FirstToCheck));
+  }
+
+  return MissingAttributes;
+}
+
 //===----------------------------------------------------------------------===//
 // Microsoft specific attribute handlers.
 //===----------------------------------------------------------------------===//
diff --git a/clang/test/Sema/attr-format-missing.c b/clang/test/Sema/attr-format-missing.c
new file mode 100644
index 00000000000000..e887e11d767040
--- /dev/null
+++ b/clang/test/Sema/attr-format-missing.c
@@ -0,0 +1,393 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=expected,c_diagnostics -Wmissing-format-attribute %s
+// RUN: %clang_cc1 -fsyntax-only -Wmissing-format-attribute -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s --check-prefixes=CHECK,C-CHECK
+// RUN: %clang_cc1 -fsyntax-only -x c++ -verify=expected,cpp_diagnostics -Wmissing-format-attribute %s
+// RUN: %clang_cc1 -fsyntax-only -x c++ -verify=expected,cpp_diagnostics -std=c++2b -Wmissing-format-attribute %s
+// RUN: %clang_cc1 -fsyntax-only -x c++ -verify=expected,cpp_diagnostics -std=c++23 -Wmissing-format-attribute %s
+// RUN: not %clang_cc1 -fsyntax-only -x c++ -Wmissing-format-attribute -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s
+
+#ifndef __cplusplus
+typedef __CHAR16_TYPE__ char16_t;
+typedef __CHAR32_TYPE__ char32_t;
+typedef __WCHAR_TYPE__ wchar_t;
+#endif
+
+typedef __SIZE_TYPE__ size_t;
+typedef __builtin_va_list va_list;
+
+__attribute__((__format__(__printf__, 1, 2)))
+int printf(const char *, ...); // #printf
+
+__attribute__((__format__(__scanf__, 1, 2)))
+int scanf(const char *, ...); // #scanf
+
+__attribute__((__format__(__printf__, 1, 0)))
+int vprintf(const char *, va_list); // #vprintf
+
+__attribute__((__format__(__scanf__, 1, 0)))
+int vscanf(const char *, va_list); // #vscanf
+
+__attribute__((__format__(__printf__, 2, 0)))
+int vsprintf(char *, const char *, va_list); // #vsprintf
+
+__attribute__((__format__(__printf__, 3, 0)))
+int vsnprintf(char *ch, size_t, const char *, va_list); // #vsnprintf
+
+__attribute__((__format__(__scanf__, 1, 4)))
+void f1(char *out, const size_t len, const char *format, ... /* args */) // #f1
+{
+    va_list args;
+    vsnprintf(out, len, format, args); // expected-no-warning@#f1
+}
+
+__attribute__((__format__(__printf__, 1, 4)))
+void f2(char *out, const size_t len, const char *format, ... /* args */) // #f2
+{
+    va_list args;
+    vsnprintf(out, len, format, args); // expected-warning@#f2 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f2'}}
+                                       // CHECK: fix-it:"{{.*}}":{[[@LINE-4]]:6-[[@LINE-4]]:6}:"__attribute__((format(printf, 3, 4)))"
+}
+
+void f3(char *out, va_list args) // #f3
+{
+    vprintf(out, args); // expected-warning@#f3 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f3'}}
+                        // CHECK: fix-it:"{{.*}}":{[[@LINE-3]]:6-[[@LINE-3]]:6}:"__attribute__((format(printf, 1, 0)))"
+}
+
+void f4(char* out, ... /* args */) // #f4
+{
+    va_list args;
+    vprintf("test", args); // expected-no-warning@#f4
+
+    const char *ch;
+    vprintf(ch, args); // expected-no-warning@#f4
+}
+
+void f5(va_list args) // #f5
+{
+    char *ch;
+    vscanf(ch, args); // expected-no-warning@#f5
+}
+
+void f6(char *out, va_list args) // #f6
+{
+    char *ch;
+    vprintf(ch, args); // expected-no-warning@#f6
+    vprintf("test", args); // expected-no-warning@#f6
+    vprintf(out, args); // expected-warning@#f6 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f6'}}
+                        // CHECK: fix-it:"{{.*}}":{[[@LINE-6]]:6-[[@LINE-6]]:6}:"__attribute__((format(printf, 1, 0)))"
+}
+
+void f7(const char *out, ... /* args */) // #f7
+{
+    va_list args;
+
+    vscanf(out, args); // expected-warning@#f7 {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f7'}}
+                       // CHECK: fix-it:"{{.*}}":{[[@LINE-5]]:6-[[@LINE-5]]:6}:"__attribute__((format(scanf, 1, 2)))"
+}
+
+void f8(const char *out, ... /* args */) // #f8
+{
+    va_list args;
+
+    vscanf(out, args); // expected-no-warning@#f8
+    vprintf(out, args); // expected-no-warning@#f8
+}
+
+void f9(const char out[], ... /* args */) // #f9
+{
+    va_list args;
+    char *ch;
+    vprintf(ch, args); // expected-no-warning
+    vsprintf(ch, out, args); // expected-warning@#f9 {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f9'}}
+                             // CHECK: fix-it:"{{.*}}":{[[@LINE-6]]:6-[[@LINE-6]]:6}:"__attribute__((format(printf, 1, 2)))"
+}
+
+void f10(const wchar_t *out, ... /* args */) // #f10
+{
+    va_list args;
+    vscanf(out, args);
+#if (defined(__aarch64__) && !defined(_WIN64)) || (defined(__arm__) && !defined(_WIN32))
+                        // c_diagnostics-warning@-2 {{incompatible pointer types passing 'const wchar_t *' (aka 'const unsigned int *') to parameter of type 'const char *'}}
+#elif __SIZEOF_WCHAR_T__ == 4
+                        // c_diagnostics-warning@-4 {{incompatible pointer types passing 'const wchar_t *' (aka 'const int *') to parameter of type 'const char *'}}
+#else
+                        // c_diagnostics-warning@-6 {{incompatible pointer types passing 'const wchar_t *' (aka 'const unsigned short *') to parameter of type 'const char *'}}
+#endif
+                        // c_diagnostics-note@#vscanf {{passing argument to parameter here}}
+                        // c_diagnostics-warning@#f10 {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f10'}}
+                        // C-CHECK: fix-it:"{{.*}}":{[[@LINE-13]]:6-[[@LINE-13]]:6}:"__attribute__((format(scanf, 1, 2)))"
+                        // cpp_diagnostics-error@-11 {{no matching function for call to 'vscanf'}}
+                        // cpp_diagnostics-note@#vscanf {{candidate function not viable: no known conversion from 'const wchar_t *' to 'const char *' for 1st argument}}
+}
+
+void f11(const wchar_t *out, ... /* args */) // #f11
+{
+    va_list args;
+    vscanf((const char *) out, args); // expected-warning@#f11 {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f11'}}
+                                      // CHECK: fix-it:"{{.*}}":{[[@LINE-4]]:6-[[@LINE-4]]:6}:"__attribute__((format(scanf, 1, 2)))"
+}
+
+void f12(const wchar_t *out, ... /* args */) // #f12
+{
+    va_list args;
+    vscanf((char *) out, args); // expected-warning@#f12 {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f12'}}
+                                // CHECK: fix-it:"{{.*}}":{[[@LINE-4]]:6-[[@LINE-4]]:6}:"__attribute__((format(sc...
[truncated]

@budimirarandjelovichtec
Copy link
Contributor Author

@AaronBallman
Copy link
Collaborator

Can you explain what changes were made to address the various issues which caused the earlier revert? I've tried diffing the PRs to see what the changes are, but my git-fu is insufficient to the task. :-D

@vitalybuka
Copy link
Collaborator

Can you explain what changes were made to address the various issues which caused the earlier revert? I've tried diffing the PRs to see what the changes are, but my git-fu is insufficient to the task. :-D

Just for convenience I extracted update diff into #106649

@budimirarandjelovichtec
Copy link
Contributor Author

Can you explain what changes were made to address the various issues which caused the earlier revert? I've tried diffing the PRs to see what the changes are, but my git-fu is insufficient to the task. :-D

I made changes in several places. Here are major changes:

  1. determining first argument index. Here I removed dynamic cast to DeclRefAttr of last function argument. In C++ mode casting to DeclRefAttr resulted in unrecognizing argument type as va_list for several architectures and OS. So, some architectures gave 0 as output and others gave non-zero value. Unrecognizing argument type as va_list was a result of defining canonical va_list type differently depending on architecture.
  2. added check if calling function has format attribute before iterating through its format attributes. This check was added because one pointer (AttrType) was initialized in iteration through format attributes and later used to get name if missing attribute was caught. In previous merge this was resulting in throwing error that there were use of uninitialized variable for some architectures.
  3. edited some parts of C code where passed arguments were not valid. Examples are passing &args[0] or char * as va_list argument (functions f8() and f41()).

Relevant parts of code are reviewed in #106649 .

clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
clang/test/Sema/attr-format-missing.c Outdated Show resolved Hide resolved
clang/test/Sema/attr-format-missing.c Outdated Show resolved Hide resolved
clang/test/Sema/attr-format-missing.c Outdated Show resolved Hide resolved
clang/test/Sema/attr-format-missing.c Outdated Show resolved Hide resolved
clang/test/Sema/attr-format-missing.c Outdated Show resolved Hide resolved
clang/test/Sema/attr-format-missing.cpp Outdated Show resolved Hide resolved
clang/test/Sema/attr-format-missing.cpp Outdated Show resolved Hide resolved
@budimirarandjelovichtec budimirarandjelovichtec force-pushed the missing-format-attribute branch 3 times, most recently from e57e2fb to 11aef7c Compare August 30, 2024 14:53
Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@budimirarandjelovichtec
Copy link
Contributor Author

Let's wait few days if there are any opinions. Then, someone from my company will merge it.

clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
for (auto *Child : Body->children()) {
// If child statement is compound statement, recursively get missing
// attributes.
if (dyn_cast_or_null<CompoundStmt>(Child)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I get this right that we're only looking into CompoundStmts, nothing else? Like a very basic CFG visitor? So if we have a branch around the call, we won't emit the warning? Sorry if this was discussed before, but I thought we would check on encountering a CallExpr to a format-annotated function. (Which would also imply that we don't pay for the analysis when there are no relevant function calls.)

The reason why I'm asking is that I thought of the GCC warning as a way to "propagate" the attribute through wrappers to call sites with literal arguments, where then the regular check should apply. So in a sense, we should either be able to check the format string against the arguments, or warn that the arguments come from callers and that the function should be annotated.

But maybe we want to exclude calls behind branches, because the function could just be a very generic wrapper? I just think it's more common that for example a trace function checks the trace output level first and only prints when that check succeeds, and it seems we'd miss those cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of this part of code is to enable diagnosing missing format attributes if there are branches around call, ie. call is not direct child of caller function. This part of code is aimed to handle examples like this:

void f(char *out, va_list args)
{
    {
        vprintf(out, args);
    }
}

Without this check, missing format attribute in example above would not be diagnosed. This part of code enables recursively checking children statements.

If statement is not compound, next check is if it is a function call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you write about "branches around call", do you mean braces? However, I meant branches:

void f(char *out, va_list args)
{
    if (tracing_enabled) {
        vprintf(out, args);
    }
}

This is not a compound statement, so you'd not be looking into that.

My proposal would be to check on encountering an annotated CallExpr instead of traversing the function body. This is also better for performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant braces instead of branches. This snippet does not produce warning, so I'm working to fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed logic to detect missing format attributes on an annotated function calls. Here are differences: link.

clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
Comment on lines 5513 to 5567
if (FDecl->parameters()[OffsetFormatIndex] != FormatArg)
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't we already found FormatArg in FDecl->parameters() in the previous any_of? I presume that parameters have to be unique.

Couldn't we just compare Attr->getFormatIdx() with StringIndex? The attribute that we'll be suggesting below contains exactly that index.

In fact I wonder if we couldn't also rely on the check for duplicates here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In previous, any_of it was found callee format argument. In this any_of, caller format argument is checked with callee one. If they mismatch, it means that caller misses format attribute.

clang/lib/Sema/SemaDeclAttr.cpp Outdated Show resolved Hide resolved
clang/test/Sema/attr-format-missing.cpp Outdated Show resolved Hide resolved
@budimirarandjelovichtec budimirarandjelovichtec force-pushed the missing-format-attribute branch 2 times, most recently from a8a958f to de95cbd Compare October 25, 2024 14:38
@budimirarandjelovichtec budimirarandjelovichtec force-pushed the missing-format-attribute branch 3 times, most recently from ccb3dca to 97b7071 Compare November 6, 2024 16:29
@budimirarandjelovichtec budimirarandjelovichtec force-pushed the missing-format-attribute branch 4 times, most recently from e4faa01 to 3d7798a Compare November 14, 2024 07:42
@budimirarandjelovichtec budimirarandjelovichtec force-pushed the missing-format-attribute branch 2 times, most recently from 38d9734 to 3ffb989 Compare November 20, 2024 15:35
@budimirarandjelovichtec budimirarandjelovichtec force-pushed the missing-format-attribute branch 3 times, most recently from 848ebca to 1a096f6 Compare December 24, 2024 11:34
@Endilll Endilll removed their request for review January 15, 2025 18:48
@budimirarandjelovichtec
Copy link
Contributor Author

Before looking into details, may I ask you to:

1. Re-base

2. address or reply to existing comments

3. click re-request review.

Thank you!

  1. Rebased to the last commit as of February 14, 2025.

  2. Replied on existing comments.

  3. Re-requested review for all reviewers.

@vitalybuka vitalybuka removed their request for review February 27, 2025 05:30
@budimirarandjelovichtec
Copy link
Contributor Author

Ping @vitalybuka @aaronpuchert

@cor3ntin cor3ntin requested a review from apple-fcloutier May 16, 2025 09:43
Copy link
Contributor

@apple-fcloutier apple-fcloutier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand your change, it can only trip when -Wformat-nonliteral also trips. I think that you should implement this as a note/fixit on -Wformat-nonliteral. You should be able to leverage checkFormatString's value traversal to have the same precision as -Wformat checking. I don't think that your change correctly identifies this case, for instance:

void foo(const char *a, ...) {
	va_list ap;
	const char *const b = a;
	va_start(ap, a);
	vprintf(b, ap);
	va_end(ap);
}

whereas -Wformat will dig that deep.


// va_list is not intended to be passed to variadic function.
if (Callee->isVariadic())
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that there's just one code base in the world that does it, but Clang supports multiple format attributes on the same function, where one of them has a non-zero format index and any others pass a va_list. See #129954. In that context, bailing out because the function is variadic is too eager.

if (const auto *FormatArgExpr = dyn_cast<DeclRefExpr>(
Args[OffsetFormatIndex]->IgnoreParenCasts()))
if (FormatStringArg = dyn_cast_or_null<ParmVarDecl>(
FormatArgExpr->getReferencedDeclOfCallee()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct for what you're trying to do, but doesn't clang emit a warning for un-parenthesized assignments in if conditions?

// vector. Vector could contains attributes that only store information about
// format type (format string and first to check argument are set to -1).
namespace {
std::vector<FormatAttr *> MissingAttributes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang does not use global mutable state. You need to find some other way to keep that around.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Clang flag -Wmissing-format-attribute doesn't catch missing attributes
7 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.