[MachineScheduler] Optional scheduling of single-MI regions #129704

lucas-rami · Mar 4, 2025

Following 15e295d the machine scheduler no longer filters-out single-MI regions when emitting regions to schedule. While this has no functional impact at the moment, it generally has a negative compile-time impact (see #128739).

Since all targets but AMDGPU do not care for this behavior, this introduces an off-by-default flag to ScheduleDAGInstrs to control whether such regions are going to be scheduled, effectively reverting 15e295d for all targets but AMDGPU (currently the only target enabling this flag).

llvmbot · Mar 4, 2025

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-arm

@llvm/pr-subscribers-backend-x86

Author: Lucas Ramirez (lucas-rami)

Changes

Following 15e295d the machine scheduler no longer filters-out single-MI regions when emitting regions to schedule. While this has no functional impact at the moment, it generally has a negative compile-time impact (see #128739).

Since all targets but AMDGPU do not care for this behavior, this introduces an off-by-default flag to ScheduleDAGInstrs to control whether such regions are going to be scheduled, effectively reverting 15e295d for all targets but AMDGPU (currently the only target enabling this flag).

Full diff: https://github.com/llvm/llvm-project/pull/129704.diff

5 Files Affected:

(modified) llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h (+8)
(modified) llvm/lib/CodeGen/MachineScheduler.cpp (+3-5)
(modified) llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp (+4)
(modified) llvm/test/CodeGen/ARM/misched-branch-targets.mir (+5-6)
(modified) llvm/test/CodeGen/X86/fake-use-scheduler.mir (-6)

diff --git a/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h b/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
index 7c534805b8333..ab7f0a6c42866 100644
--- a/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
+++ b/llvm/include/llvm/CodeGen/ScheduleDAGInstrs.h
@@ -124,6 +124,9 @@ namespace llvm {
     /// rescheduling).
     bool RemoveKillFlags;
 
+    /// True if regions with a single MI should be scheduled.
+    bool ScheduleSingleMIRegions = false;
+
     /// The standard DAG builder does not normally include terminators as DAG
     /// nodes because it does not create the necessary dependencies to prevent
     /// reordering. A specialized scheduler can override
@@ -288,6 +291,11 @@ namespace llvm {
       return Topo.IsReachable(SU, TargetSU);
     }
 
+    /// Whether regions with a single MI should be scheduled.
+    bool shouldScheduleSingleMIRegions() const {
+      return ScheduleSingleMIRegions;
+    }
+
     /// Returns an iterator to the top of the current scheduling region.
     MachineBasicBlock::iterator begin() const { return RegionBegin; }
 
diff --git a/llvm/lib/CodeGen/MachineScheduler.cpp b/llvm/lib/CodeGen/MachineScheduler.cpp
index 0c0c616aac660..988ab87426a4e 100644
--- a/llvm/lib/CodeGen/MachineScheduler.cpp
+++ b/llvm/lib/CodeGen/MachineScheduler.cpp
@@ -769,6 +769,7 @@ void MachineSchedulerBase::scheduleRegions(ScheduleDAGInstrs &Scheduler,
 
     MBBRegionsVector MBBRegions;
     getSchedRegions(&*MBB, MBBRegions, Scheduler.doMBBSchedRegionsTopDown());
+    bool ScheduleSingleMI = Scheduler.shouldScheduleSingleMIRegions();
     for (const SchedRegion &R : MBBRegions) {
       MachineBasicBlock::iterator I = R.RegionBegin;
       MachineBasicBlock::iterator RegionEnd = R.RegionEnd;
@@ -778,11 +779,8 @@ void MachineSchedulerBase::scheduleRegions(ScheduleDAGInstrs &Scheduler,
       // it. Perhaps it still needs to be bundled.
       Scheduler.enterRegion(&*MBB, I, RegionEnd, NumRegionInstrs);
 
-      // Skip empty scheduling regions but include single-MI regions; we want
-      // those to be scheduled so that backends which move MIs across regions
-      // during scheduling can reason about and schedule those regions
-      // correctly.
-      if (I == RegionEnd) {
+      // Skip empty scheduling regions and, conditionally, region with a single MI.
+      if (I == RegionEnd || (!ScheduleSingleMI && I == std::prev(RegionEnd))) {
         // Close the current region. Bundle the terminator if needed.
         // This invalidates 'RegionEnd' and 'I'.
         Scheduler.exitRegion();
diff --git a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
index 176586e3fbbb6..c277223de13ac 100644
--- a/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
@@ -760,6 +760,10 @@ GCNScheduleDAGMILive::GCNScheduleDAGMILive(
       StartingOccupancy(MFI.getOccupancy()), MinOccupancy(StartingOccupancy),
       RegionLiveOuts(this, /*IsLiveOut=*/true) {
 
+  // We want regions with a single MI to be scheduled so that we can reason
+  // about them correctly during scheduling stages that move MIs between regions
+  // (e.g., rematerialization).
+  ScheduleSingleMIRegions = true;
   LLVM_DEBUG(dbgs() << "Starting occupancy is " << StartingOccupancy << ".\n");
   if (RelaxedOcc) {
     MinOccupancy = std::min(MFI.getMinAllowedOccupancy(), StartingOccupancy);
diff --git a/llvm/test/CodeGen/ARM/misched-branch-targets.mir b/llvm/test/CodeGen/ARM/misched-branch-targets.mir
index 4b27ed93119c9..2e30ac893b4bc 100644
--- a/llvm/test/CodeGen/ARM/misched-branch-targets.mir
+++ b/llvm/test/CodeGen/ARM/misched-branch-targets.mir
@@ -1,7 +1,7 @@
-# RUN: llc -o - -run-pass=machine-scheduler -misched=shuffle %s | FileCheck -check-prefixes=CHECK,CHECK-MISCHED %s
-# RUN: llc -o - -passes=machine-scheduler -misched=shuffle %s | FileCheck -check-prefixes=CHECK,CHECK-MISCHED %s
-# RUN: llc -o - -run-pass=postmisched %s | FileCheck -check-prefixes=CHECK,CHECK-POSTMISCHED %s
-# RUN: llc -o - -passes=postmisched %s | FileCheck -check-prefixes=CHECK,CHECK-POSTMISCHED %s
+# RUN: llc -o - -run-pass=machine-scheduler -misched=shuffle %s | FileCheck %s
+# RUN: llc -o - -passes=machine-scheduler -misched=shuffle %s | FileCheck %s
+# RUN: llc -o - -run-pass=postmisched %s | FileCheck %s
+# RUN: llc -o - -passes=postmisched %s | FileCheck %s
 
 # REQUIRES: asserts
 # -misched=shuffle is only available with assertions enabled
@@ -147,8 +147,7 @@ body:             |
 
 # CHECK-LABEL: name:            foo_setjmp
 # CHECK:       body:
-# CHECK-MISCHED:         tBL 14 /* CC::al */, $noreg, @setjmp, csr_aapcs, implicit-def dead $lr, implicit $sp, implicit $r0, implicit-def $sp, implicit-def $r0
-# CHECK-POSTMISCHED:     tBL 14 /* CC::al */, $noreg, @setjmp, csr_aapcs, implicit-def dead $lr, implicit $sp, implicit killed $r0, implicit-def $sp, implicit-def $r0
+# CHECK: tBL 14 /* CC::al */, $noreg, @setjmp, csr_aapcs, implicit-def dead $lr, implicit $sp, implicit killed $r0, implicit-def $sp, implicit-def $r0
 # CHECK-NEXT:    t2BTI
 
 ---
diff --git a/llvm/test/CodeGen/X86/fake-use-scheduler.mir b/llvm/test/CodeGen/X86/fake-use-scheduler.mir
index 20c9ffec61383..8b82c4ed2485d 100644
--- a/llvm/test/CodeGen/X86/fake-use-scheduler.mir
+++ b/llvm/test/CodeGen/X86/fake-use-scheduler.mir
@@ -9,12 +9,6 @@
 #
 # CHECK: ********** MI Scheduling **********
 # CHECK-NEXT: foo:%bb.0 entry
-# CHECK-NEXT:   From: $rax = COPY %5:gr64
-# CHECK-NEXT:     To: RET 0, killed $rax
-# CHECK-NEXT:  RegionInstrs: 1
-#
-# CHECK: ********** MI Scheduling **********
-# CHECK-NEXT: foo:%bb.0 entry
 # CHECK-NEXT:   From: %0:gr64 = COPY $rdi
 # CHECK-NEXT:     To: FAKE_USE %5:gr64
 # CHECK-NEXT:  RegionInstrs: 7

) Following 15e295d the machine scheduler no longer filters-out single-MI regions when emitting regions to schedule. While this has no functional impact at the moment, it generally has a negative compile-time impact (see llvm#128739). Since all targets but AMDGPU do not care for this behavior, this introduces an off-by-default flag to `ScheduleDAGInstrs` to control whether such regions are going to be scheduled, effectively reverting 15e295d for all targets but AMDGPU (currently the only target enabling this flag).

) Following 15e295d the machine scheduler no longer filters-out single-MI regions when emitting regions to schedule. While this has no functional impact at the moment, it generally has a negative compile-time impact (see llvm#128739). Since all targets but AMDGPU do not care for this behavior, this introduces an off-by-default flag to `ScheduleDAGInstrs` to control whether such regions are going to be scheduled, effectively reverting 15e295d for all targets but AMDGPU (currently the only target enabling this flag). Change-Id: Ib38f9b7e8d2bb1073cb43ed7e58eaf251ffdce48

Made single-MI region scheduling conditional

5448e89

llvmbot added backend:ARM backend:AMDGPU backend:X86 labels Mar 4, 2025

lucas-rami requested review from jayfoad, arsenm, nikic and shiltian March 4, 2025 13:31

Fix format/spelling

db0bfa7

nikic approved these changes Mar 4, 2025

View reviewed changes

lucas-rami merged commit 03677f6 into llvm:main Mar 4, 2025
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[MachineScheduler] Optional scheduling of single-MI regions #129704

[MachineScheduler] Optional scheduling of single-MI regions #129704

Uh oh!

lucas-rami commented Mar 4, 2025

Uh oh!

llvmbot commented Mar 4, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Search code, repositories, users, issues, pull requests...

[MachineScheduler] Optional scheduling of single-MI regions #129704

[MachineScheduler] Optional scheduling of single-MI regions #129704

Uh oh!

Conversation

lucas-rami commented Mar 4, 2025

Uh oh!

llvmbot commented Mar 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Mar 4, 2025 •

edited

Loading