Commit 17ea305

[AMDGPU] Run LowerLDS at the end of the fullLTO pipeline
This change allows `--lto-partitions` to be used in some cases (it is not guaranteed to work perfectly), because LDS is now lowered before the module is split for parallel codegen.
1 parent d4569d4 commit 17ea305

1 file changed: 9 additions, 0 deletions

llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -793,6 +793,15 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(
 
         PM.addPass(createCGSCCToFunctionPassAdaptor(std::move(FPM)));
       });
+
+  PB.registerFullLinkTimeOptimizationLastEPCallback(
+      [this](ModulePassManager &PM, OptimizationLevel Level) {
+        // We want to support the -lto-partitions=N option as "best effort".
+        // For that, we need to lower LDS earlier in the pipeline before the
+        // module is partitioned for codegen.
+        if (EnableLowerModuleLDS)
+          PM.addPass(AMDGPULowerModuleLDSPass(*this));
+      });
 }
 
 int64_t AMDGPUTargetMachine::getNullPointerValue(unsigned AddrSpace) {
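
The ordering this commit relies on can be illustrated with a small standalone model. This is only a sketch and does not use LLVM: `ToyPipeline`, `ToyPassBuilder`, `runToyLTO`, and the step names are hypothetical stand-ins invented for illustration. It assumes, as the commit message states, that callbacks registered at the end of the full LTO pipeline run before the module is partitioned for parallel codegen.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a module pass manager: an ordered list of names.
struct ToyPipeline {
  std::vector<std::string> Passes;
  void addPass(std::string Name) { Passes.push_back(std::move(Name)); }
};

// Hypothetical stand-in for a pass builder exposing a "last" extension point
// that fires at the end of the full LTO pipeline.
class ToyPassBuilder {
  std::vector<std::function<void(ToyPipeline &)>> LastEPCallbacks;

public:
  void registerLastEPCallback(std::function<void(ToyPipeline &)> CB) {
    LastEPCallbacks.push_back(std::move(CB));
  }

  ToyPipeline buildFullLTOPipeline() const {
    ToyPipeline PM;
    PM.addPass("generic-lto-optimizations"); // illustrative label only
    // Registered callbacks append their passes after the generic ones.
    for (const auto &CB : LastEPCallbacks)
      CB(PM);
    return PM;
  }
};

// Returns the ordered steps of the toy flow. Partitioning and codegen are
// appended after the pipeline, mirroring "lower LDS, then split the module".
std::vector<std::string> runToyLTO(bool EnableLowerLDS) {
  ToyPassBuilder PB;
  PB.registerLastEPCallback([EnableLowerLDS](ToyPipeline &PM) {
    if (EnableLowerLDS)
      PM.addPass("lower-module-lds"); // illustrative label only
  });

  std::vector<std::string> Steps = PB.buildFullLTOPipeline().Passes;
  Steps.push_back("partition-module");
  Steps.push_back("parallel-codegen");
  return Steps;
}
```

In this model, `runToyLTO(true)` places the lowering step immediately before the partitioning step, while `runToyLTO(false)` omits it, which corresponds to the `EnableLowerModuleLDS` guard in the diff.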
