[Flang] Add parser support for prefetch directive #139702

Open
wants to merge 1 commit into
base: main

Thirumalai-Shaktivel
Member

Implementation details:

  • Recognize prefetch directive in the parser as !dir$ prefetch ...
  • Unparse the prefetch directive
  • Add required tests

Details on the prefetch directive:
!dir$ prefetch designator[, designator]..., where each designator is a
variable or an array reference. The directive hints the code generator
to insert prefetch instructions for the listed memory references.
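
To make the accepted form concrete, here is a minimal, self-contained C++ sketch that splits a directive line into its designator strings. It is not the Flang implementation (the patch uses the parser combinators `nonemptyList(indirect(designator))`); `splitPrefetch`, `trim`, and `startsWithNoCase` are hypothetical names, and the sketch assumes that designators are separated only by commas outside parentheses and that at least one blank follows the keyword.

```cpp
#include <cctype>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not part of Flang): strips leading/trailing blanks.
static std::string trim(const std::string &s) {
  std::size_t b = 0, e = s.size();
  while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
  while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
  return s.substr(b, e - b);
}

// Hypothetical helper: case-insensitive match of a lowercase `word` at `pos`.
static bool startsWithNoCase(
    const std::string &s, std::size_t pos, const std::string &word) {
  if (s.size() < pos + word.size()) return false;
  for (std::size_t i = 0; i < word.size(); ++i) {
    if (std::tolower(static_cast<unsigned char>(s[pos + i])) != word[i])
      return false;
  }
  return true;
}

// Splits "!dir$ prefetch d1, d2, ..." into designator strings.
// Commas inside parentheses (subscript lists) do not end a designator.
// Returns std::nullopt for other directives, an empty list, an empty item,
// or unbalanced parentheses.
std::optional<std::vector<std::string>> splitPrefetch(const std::string &line) {
  const std::string text = trim(line);
  const std::string sentinel = "!dir$";
  const std::string keyword = "prefetch";
  if (!startsWithNoCase(text, 0, sentinel)) return std::nullopt;
  std::size_t pos = sentinel.size();
  while (pos < text.size() && text[pos] == ' ') ++pos;
  if (!startsWithNoCase(text, pos, keyword)) return std::nullopt;
  pos += keyword.size();
  if (pos >= text.size() || text[pos] != ' ') return std::nullopt;

  std::vector<std::string> designators;
  std::string current;
  int depth = 0; // current parenthesis nesting level
  for (; pos < text.size(); ++pos) {
    const char c = text[pos];
    if (c == '(') ++depth;
    else if (c == ')') --depth;
    if (depth < 0) return std::nullopt;
    if (c == ',' && depth == 0) {
      const std::string item = trim(current);
      if (item.empty()) return std::nullopt;
      designators.push_back(item);
      current.clear();
    } else {
      current += c;
    }
  }
  if (depth != 0) return std::nullopt;
  const std::string last = trim(current);
  if (last.empty()) return std::nullopt;
  designators.push_back(last);
  return designators;
}
```

For the test input `!dir$ prefetch p%x(i, :), a`, this sketch yields the two designators `p%x(i, :)` and `a`, which matches the two `Designator` nodes in the expected parse tree.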

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:parser labels May 13, 2025
@llvmbot
Member

llvmbot commented May 13, 2025

@llvm/pr-subscribers-flang-parser

Author: Thirumalai Shaktivel (Thirumalai-Shaktivel)

Changes

Implementation details:

  • Recognize prefetch directive in the parser as !dir$ prefetch ...
  • Unparse the prefetch directive
  • Add required tests

Details on the prefetch directive:
!dir$ prefetch designator[, designator]..., where the designator list
can be a variable or an array reference. This directive is used to
insert a hint to the code generator to prefetch instructions for
memory references.


Full diff: https://github.com/llvm/llvm-project/pull/139702.diff

6 Files Affected:

  • (modified) flang/docs/Directives.md (+3)
  • (modified) flang/include/flang/Parser/dump-parse-tree.h (+1)
  • (modified) flang/include/flang/Parser/parse-tree.h (+7-2)
  • (modified) flang/lib/Parser/Fortran-parsers.cpp (+4)
  • (modified) flang/lib/Parser/unparse.cpp (+4)
  • (added) flang/test/Parser/prefetch.f90 (+80)
diff --git a/flang/docs/Directives.md b/flang/docs/Directives.md
index 91c27cb510ea0..9216516494523 100644
--- a/flang/docs/Directives.md
+++ b/flang/docs/Directives.md
@@ -50,6 +50,9 @@ A list of non-standard directives supported by Flang
   integer that specifying the unrolling factor. When `N` is `0` or `1`, the loop 
   should not be unrolled at all. If `N` is omitted the optimizer will
   selects the number of times to unroll the loop.
+* `!dir$ prefetch designator[, designator]...`, where the designator list can be
+  a variable or an array reference. This directive is used to insert a hint to
+  the code generator to prefetch instructions for memory references.
 * `!dir$ novector` disabling vectorization on the following loop.
 * `!dir$ nounroll` disabling unrolling on the following loop.
 * `!dir$ nounroll_and_jam` disabling unrolling and jamming on the following loop.
diff --git a/flang/include/flang/Parser/dump-parse-tree.h b/flang/include/flang/Parser/dump-parse-tree.h
index df9278697346f..c62d9b695108d 100644
--- a/flang/include/flang/Parser/dump-parse-tree.h
+++ b/flang/include/flang/Parser/dump-parse-tree.h
@@ -214,6 +214,7 @@ class ParseTreeDumper {
   NODE(CompilerDirective, NoVector)
   NODE(CompilerDirective, NoUnroll)
   NODE(CompilerDirective, NoUnrollAndJam)
+  NODE(CompilerDirective, Prefetch)
   NODE(parser, ComplexLiteralConstant)
   NODE(parser, ComplexPart)
   NODE(parser, ComponentArraySpec)
diff --git a/flang/include/flang/Parser/parse-tree.h b/flang/include/flang/Parser/parse-tree.h
index 254236b510544..cba7653be83d3 100644
--- a/flang/include/flang/Parser/parse-tree.h
+++ b/flang/include/flang/Parser/parse-tree.h
@@ -3354,6 +3354,7 @@ struct StmtFunctionStmt {
 // !DIR$ NOVECTOR
 // !DIR$ NOUNROLL
 // !DIR$ NOUNROLL_AND_JAM
+// !DIR$ PREFETCH designator[, designator]...
 // !DIR$ <anything else>
 struct CompilerDirective {
   UNION_CLASS_BOILERPLATE(CompilerDirective);
@@ -3379,14 +3380,18 @@ struct CompilerDirective {
   struct UnrollAndJam {
     WRAPPER_CLASS_BOILERPLATE(UnrollAndJam, std::optional<std::uint64_t>);
   };
+  struct Prefetch {
+    WRAPPER_CLASS_BOILERPLATE(
+        Prefetch, std::list<common::Indirection<Designator>>);
+  };
   EMPTY_CLASS(NoVector);
   EMPTY_CLASS(NoUnroll);
   EMPTY_CLASS(NoUnrollAndJam);
   EMPTY_CLASS(Unrecognized);
   CharBlock source;
   std::variant<std::list<IgnoreTKR>, LoopCount, std::list<AssumeAligned>,
-      VectorAlways, std::list<NameValue>, Unroll, UnrollAndJam, Unrecognized,
-      NoVector, NoUnroll, NoUnrollAndJam>
+      VectorAlways, std::list<NameValue>, Unroll, UnrollAndJam, Prefetch,
+      Unrecognized, NoVector, NoUnroll, NoUnrollAndJam>
       u;
 };
 
diff --git a/flang/lib/Parser/Fortran-parsers.cpp b/flang/lib/Parser/Fortran-parsers.cpp
index fbe629ab52935..782dff8a967b6 100644
--- a/flang/lib/Parser/Fortran-parsers.cpp
+++ b/flang/lib/Parser/Fortran-parsers.cpp
@@ -1294,6 +1294,7 @@ TYPE_PARSER(construct<StatOrErrmsg>("STAT =" >> statVariable) ||
 // !DIR$ LOOP COUNT (n1[, n2]...)
 // !DIR$ name[=value] [, name[=value]]...
 // !DIR$ UNROLL [n]
+// !DIR$ PREFETCH designator[, designator]...
 // !DIR$ <anything else>
 constexpr auto ignore_tkr{
     "IGNORE_TKR" >> optionalList(construct<CompilerDirective::IgnoreTKR>(
@@ -1308,6 +1309,8 @@ constexpr auto vectorAlways{
     "VECTOR ALWAYS" >> construct<CompilerDirective::VectorAlways>()};
 constexpr auto unroll{
     "UNROLL" >> construct<CompilerDirective::Unroll>(maybe(digitString64))};
+constexpr auto prefetch{"PREFETCH" >>
+    construct<CompilerDirective::Prefetch>(nonemptyList(indirect(designator)))};
 constexpr auto unrollAndJam{"UNROLL_AND_JAM" >>
     construct<CompilerDirective::UnrollAndJam>(maybe(digitString64))};
 constexpr auto novector{"NOVECTOR" >> construct<CompilerDirective::NoVector>()};
@@ -1321,6 +1324,7 @@ TYPE_PARSER(beginDirective >> "DIR$ "_tok >>
                 construct<CompilerDirective>(vectorAlways) ||
                 construct<CompilerDirective>(unrollAndJam) ||
                 construct<CompilerDirective>(unroll) ||
+                construct<CompilerDirective>(prefetch) ||
                 construct<CompilerDirective>(novector) ||
                 construct<CompilerDirective>(nounrollAndJam) ||
                 construct<CompilerDirective>(nounroll) ||
diff --git a/flang/lib/Parser/unparse.cpp b/flang/lib/Parser/unparse.cpp
index a626888b7dfe5..e4dbb16a6346c 100644
--- a/flang/lib/Parser/unparse.cpp
+++ b/flang/lib/Parser/unparse.cpp
@@ -1854,6 +1854,10 @@ class UnparseVisitor {
               Word("!DIR$ UNROLL");
               Walk(" ", unroll.v);
             },
+            [&](const CompilerDirective::Prefetch &prefetch) {
+              Word("!DIR$ PREFETCH");
+              Walk(" ", prefetch.v);
+            },
             [&](const CompilerDirective::UnrollAndJam &unrollAndJam) {
               Word("!DIR$ UNROLL_AND_JAM");
               Walk(" ", unrollAndJam.v);
diff --git a/flang/test/Parser/prefetch.f90 b/flang/test/Parser/prefetch.f90
new file mode 100644
index 0000000000000..1013a09c92117
--- /dev/null
+++ b/flang/test/Parser/prefetch.f90
@@ -0,0 +1,80 @@
+!RUN: %flang_fc1 -fdebug-unparse-no-sema %s 2>&1 | FileCheck %s -check-prefix=UNPARSE
+!RUN: %flang_fc1 -fdebug-dump-parse-tree-no-sema %s 2>&1 | FileCheck %s -check-prefix=TREE
+
+subroutine test_prefetch_01(a, b)
+    integer, intent(in) :: a
+    integer, intent(inout) :: b(5)
+    integer :: i = 2
+    integer :: res
+
+!TREE: | | DeclarationConstruct -> SpecificationConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> Name = 'a'
+
+!UNPARSE:    !DIR$ PREFETCH a
+    !dir$ prefetch a
+    b(1) = a
+
+!TREE: | | ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> Name = 'b'
+
+!UNPARSE:    !DIR$ PREFETCH b
+    !dir$ prefetch b
+    res = sum(b)
+
+!TREE: | | ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> Name = 'a'
+!TREE: | | Designator -> DataRef -> ArrayElement
+!TREE: | | | DataRef -> Name = 'b'
+!TREE: | | | SectionSubscript -> SubscriptTriplet
+!TREE: | | | | Scalar -> Integer -> Expr -> LiteralConstant -> IntLiteralConstant = '3'
+!TREE: | | | | Scalar -> Integer -> Expr -> LiteralConstant -> IntLiteralConstant = '5'
+
+!UNPARSE:    !DIR$ PREFETCH a, b(3:5)
+    !dir$ prefetch a, b(3:5)
+    res = a + b(4)
+
+!TREE: | | ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> Name = 'res'
+!TREE: | | Designator -> DataRef -> ArrayElement
+!TREE: | | | DataRef -> Name = 'b'
+!TREE: | | | SectionSubscript -> Integer -> Expr -> Add
+!TREE: | | | | Expr -> Designator -> DataRef -> Name = 'i'
+!TREE: | | | | Expr -> LiteralConstant -> IntLiteralConstant = '2'
+
+!UNPARSE:    !DIR$ PREFETCH res, b(i+2)
+    !dir$ prefetch res, b(i+2)
+    res = res + b(i+2)
+end subroutine
+
+subroutine test_prefetch_02(n, a)
+    integer, intent(in) :: n
+    integer, intent(in) :: a(n)
+    type :: t
+        real, allocatable :: x(:, :)
+    end type t
+    type(t) :: p
+
+    do i = 1, n
+!TREE: | | | | ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> ArrayElement
+!TREE: | | | | | DataRef -> StructureComponent
+!TREE: | | | | | | DataRef -> Name = 'p'
+!TREE: | | | | | | Name = 'x'
+!TREE: | | | | | SectionSubscript -> Integer -> Expr -> Designator -> DataRef -> Name = 'i'
+!TREE: | | | | | SectionSubscript -> SubscriptTriplet
+!TREE: | | | | Designator -> DataRef -> Name = 'a'
+
+!UNPARSE:  !DIR$ PREFETCH p%x(i,:), a
+        !dir$ prefetch p%x(i, :), a
+        do j = 1, n
+!TREE: | | | | | | ExecutionPartConstruct -> ExecutableConstruct -> CompilerDirective -> Prefetch -> Designator -> DataRef -> ArrayElement
+!TREE: | | | | | | | DataRef -> StructureComponent
+!TREE: | | | | | | | | DataRef -> Name = 'p'
+!TREE: | | | | | | | | Name = 'x'
+!TREE: | | | | | | | SectionSubscript -> Integer -> Expr -> Designator -> DataRef -> Name = 'i'
+!TREE: | | | | | | | SectionSubscript -> Integer -> Expr -> Designator -> DataRef -> Name = 'j'
+!TREE: | | | | | | Designator -> DataRef -> ArrayElement
+!TREE: | | | | | | | DataRef -> Name = 'a'
+!TREE: | | | | | | | SectionSubscript -> Integer -> Expr -> Designator -> DataRef -> Name = 'i'
+
+!UNPARSE:   !DIR$ PREFETCH p%x(i,j), a(i)
+            !dir$ prefetch p%x(i, j), a(i)
+            p%x(i, j) = p%x(i, j) ** a(j)
+        end do
+    end do
+end subroutine

Contributor

@tblah tblah left a comment

LGTM. Could you please add a TODO(loc, "!dir$ prefetch") in lowering so that the directive is not silently ignored until the codegen lands?
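
The idea behind this request can be sketched in isolation. The block below is a self-contained C++17 illustration, not Flang's lowering code: `Directive`, `lowerDirective`, and `NotYetImplemented` are hypothetical stand-ins, and the exception only imitates what a "not yet implemented" diagnostic would do. The point is that the new `Prefetch` alternative gets an explicit handler that fails loudly instead of falling through unnoticed.

```cpp
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Simplified stand-ins for parse-tree directive nodes (illustrative only).
struct Unroll { unsigned factor; };
struct NoVector {};
struct Prefetch { std::vector<std::string> designators; };
using Directive = std::variant<Unroll, NoVector, Prefetch>;

// Hypothetical stand-in for a "not yet implemented" diagnostic.
struct NotYetImplemented : std::runtime_error {
  explicit NotYetImplemented(const std::string &what)
      : std::runtime_error("not yet implemented: " + what) {}
};

// Builds one visitor out of several lambdas (common C++17 idiom).
template <class... Ts> struct Overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Returns a description of the action taken for a directive. The Prefetch
// handler throws so that the directive cannot be dropped silently.
std::string lowerDirective(const Directive &dir) {
  return std::visit(
      Overloaded{
          [](const Unroll &u) {
            return "unroll by " + std::to_string(u.factor);
          },
          [](const NoVector &) { return std::string{"disable vectorization"}; },
          [](const Prefetch &) -> std::string {
            throw NotYetImplemented("!dir$ prefetch");
          },
      },
      dir);
}
```

Because every alternative of the variant has a handler, adding a new directive without deciding how lowering treats it becomes a compile-time error in this sketch rather than a silent no-op.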

Contributor

@NimishMishra NimishMishra left a comment

LGTM. Please add the TODO as Tom mentioned.

@Thirumalai-Shaktivel
Member Author

Thanks for the reviews!

I will add the required changes soon.

@kiranchandramohan
Contributor

Just a pass-through comment.
IBM and HPE have prefetch directives that have more options. Might be good to check with @kkwli @DanielCChen @tmjbios to see whether they are OK with the syntax in this PR.
https://www.intel.com/content/www/us/en/docs/fortran-compiler/developer-guide-reference/2024-2/prefetch-and-noprefetch-general-directives.html
https://support.hpe.com/hpesc/public/docDisplay?docId=a00115296en_us&page=PREFETCH.html&docLocale=en_US

@tmjbios

tmjbios commented May 14, 2025

Thanks for the poke, Kiran.

Cray Compiler Environment documentation link: https://cpe.ext.hpe.com/docs/latest/cce/index.html

Our syntax is slightly different:

!DIR$ PREFETCH [([lines(num)][, level(num)] [, write][, nt])] var[, var]...

With this provided example showing it in practice:

real*8 a(m,n), b(n,p), c(m,p), arow(n)
...
do j = 1, p
!dir$ prefetch (lines(3), nt) arow(1),b(1,j)
    do k = 1, n, 4
!dir$ prefetch (nt) arow(k+24),b(k+24,j)
        c(i,j) = c(i,j) + arow(k) * b(k,j)
        c(i,j) = c(i,j) + arow(k+1) * b(k+1,j)
        c(i,j) = c(i,j) + arow(k+2) * b(k+2,j)
        c(i,j) = c(i,j) + arow(k+3) * b(k+3,j)
    enddo
enddo
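
As a rough illustration of the extra information this syntax carries, the following self-contained C++17 sketch parses the text between the outer parentheses of the option clause (for example `lines(3), nt`). It is not CCE's or Flang's parser: `PrefetchOptions`, `parseOptions`, `normalize`, and `parseNumbered` are hypothetical names, and the sketch assumes that options are comma-separated, case-insensitive, and may appear in any order.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>

// Options from the Cray-style clause quoted above (field names are made up).
struct PrefetchOptions {
  std::optional<int> lines; // lines(num)
  std::optional<int> level; // level(num)
  bool write = false;       // write
  bool nonTemporal = false; // nt
};

// Removes all blanks and lowercases the remaining characters.
static std::string normalize(const std::string &s) {
  std::string out;
  for (unsigned char c : s) {
    if (!std::isspace(c)) out += static_cast<char>(std::tolower(c));
  }
  return out;
}

// Parses "name(digits)" and returns the number, or nullopt on mismatch.
static std::optional<int> parseNumbered(
    const std::string &item, const std::string &name) {
  const std::string head = name + "(";
  if (item.size() <= head.size() + 1 ||
      item.compare(0, head.size(), head) != 0 || item.back() != ')')
    return std::nullopt;
  const std::string digits =
      item.substr(head.size(), item.size() - head.size() - 1);
  if (digits.size() > 6 ||
      !std::all_of(digits.begin(), digits.end(),
          [](unsigned char c) { return std::isdigit(c) != 0; }))
    return std::nullopt;
  return std::stoi(digits);
}

// Parses the text between the outer parentheses, e.g. "lines(3), nt".
// Unknown or empty items make the whole clause invalid in this sketch.
std::optional<PrefetchOptions> parseOptions(const std::string &clause) {
  PrefetchOptions opts;
  const std::string text = normalize(clause);
  std::size_t start = 0;
  while (start <= text.size()) {
    std::size_t comma = text.find(',', start);
    if (comma == std::string::npos) comma = text.size();
    const std::string item = text.substr(start, comma - start);
    if (item == "write") {
      opts.write = true;
    } else if (item == "nt") {
      opts.nonTemporal = true;
    } else if (auto linesValue = parseNumbered(item, "lines")) {
      opts.lines = linesValue;
    } else if (auto levelValue = parseNumbered(item, "level")) {
      opts.level = levelValue;
    } else {
      return std::nullopt;
    }
    start = comma + 1;
  }
  return opts;
}
```

Applied to the first directive in the example, `parseOptions("lines(3), nt")` sets `lines` to 3 and `nonTemporal` to true, leaving `level` unset and `write` false.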

@kiranchandramohan
Contributor

Thanks @tmjbios for the quick reply. The question is whether the syntax proposed in this PR !dir$ prefetch designator[, designator]... is OK with you. If what is proposed here is a subset of the functionality you have in CCE then I think it is OK and if you require, you can extend it later.

@kkwli
Collaborator

kkwli commented May 14, 2025

The IBM Open XL Fortran compiler has slightly different syntax for the prefetch_* directives, e.g. !ibm* prefetch_by_load (var, ...). I think the proposed syntax is consistent with other supported directives (without the parentheses). It looks fine to me. Thanks.

https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.3?topic=prefetch-by-load
https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.3?topic=prefetch-by-stream
https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.3?topic=prefetch-load
https://www.ibm.com/docs/en/openxl-fortran-aix/17.1.3?topic=prefetch-store

@tmjbios

tmjbios commented May 14, 2025

Both Cray CCE ftn and Intel's ifx throw multiple errors with the Flang example (test) code in this pull request.

Intel's ifx will warn, but not error, with the Cray example code.

A current main-branch build of Flang will warn, but not error, with the Cray CCE example.

So I would object to this new implementation's syntax being incompatible with the existing Fortran compilers.

Is there a compelling reason for Flang to be different from the major vendors? This is especially concerning when there are existing codes in the wild which will break or see significantly degraded performance if they use flang.

@kkwli
Collaborator

kkwli commented May 14, 2025

One more thought. If we want to specialize the prefetch operation (e.g. for store or load) without introducing a new directive, the current syntax may be very limited (i.e. no way to distinguish a keyword and a variable name).
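
The ambiguity comes from Fortran having no reserved words: an identifier such as `write` or `nt` is also a legal variable name. The self-contained C++ sketch below makes that check explicit. The keyword set is an assumption borrowed from the Cray option list quoted earlier in this thread, and `isFortranName`, `isOptionKeyword`, and `isAmbiguousWithoutParens` are hypothetical helpers, not Flang functions.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Lowercases a copy of the string (Fortran names are case-insensitive).
static std::string toLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
      [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return s;
}

// A Fortran name: a letter followed by letters, digits, or underscores,
// at most 63 characters long.
bool isFortranName(const std::string &s) {
  if (s.empty() || s.size() > 63 ||
      !std::isalpha(static_cast<unsigned char>(s[0])))
    return false;
  return std::all_of(s.begin(), s.end(),
      [](unsigned char c) { return std::isalnum(c) != 0 || c == '_'; });
}

// Assumed option keywords for a future extension (illustrative only).
bool isOptionKeyword(const std::string &s) {
  static const std::array<std::string, 2> keywords{"write", "nt"};
  const std::string lowered = toLower(s);
  return std::find(keywords.begin(), keywords.end(), lowered) != keywords.end();
}

// In a bare list such as "!dir$ prefetch write, a", an item that is both a
// valid name and an option keyword has two possible readings.
bool isAmbiguousWithoutParens(const std::string &item) {
  return isFortranName(item) && isOptionKeyword(item);
}
```

A parenthesized option clause, as in the Cray form, avoids the problem because a designator list cannot begin with `(`.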

@kiranchandramohan
Contributor

> Is there a compelling reason for Flang to be different from the major vendors? This is especially concerning when there are existing codes in the wild which will break or see significantly degraded performance if they use flang.

The syntax proposed here is similar to the syntax that is or was supported in pgfortran and classic-flang based compilers (AOCC, Huawei compilers, Arm compilers). They all had the syntax !$mem prefetch <var1>[,<var2>[,...]]. For Flang this was changed to !dir$ prefetch <var1>[,<var2>[,...]] to match the other directives.

https://docs.nvidia.com/hpc-sdk/pgi-compilers/19.1/x86/pgi-ref-guide/index.htm#prefetch
https://developer.arm.com/documentation/101380/2404/Optimize/Directives/prefetch
https://www.amd.com/content/dam/amd/en/documents/pdfs/developer/aocc/aocc-v4.0-ga-user-guide.pdf (Section 4.1.5)

@kiranchandramohan
Contributor

> Both Cray CCE ftn and Intel's ifx throw multiple errors with the Flang example (test) code in this pull request.

Is that because the prefetch directive is only applicable in limited contexts like loops? From the syntax in the links that you posted, it looks like the syntax accepted in this patch is a subset.

@tmjbios

tmjbios commented May 14, 2025

Yes, sorta - this seems more of a superset of what we support in that we require more specificity from the user.

CCE will tend to disallow prefetching a whole array in this manner. Instead we allow the user to specify a scalar or an array element along with a number of cache lines, whether it is for read or write, whether the data is temporal or non-temporal, and which level of cache to work with.

I'm not suggesting anyone block or disapprove this PR - this is certainly a step in the right direction. I'm just reminded of xkcd 927.

Labels
flang:parser flang Flang issues not falling into any other category