AMDGPU: Start considering new atomicrmw metadata on integer operations #122138

Open
wants to merge 1 commit into
base: users/arsenm/amdgpu-expand-system-atomics
llvm/lib/Target/AMDGPU/SIISelLowering.cpp (65 changes: 53 additions & 12 deletions)
@@ -16939,19 +16939,60 @@ SITargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *RMW) const {
case AtomicRMWInst::UDecWrap: {
if (AMDGPU::isFlatGlobalAddrSpace(AS) ||
AS == AMDGPUAS::BUFFER_FAT_POINTER) {
// Always expand system scope atomics.
if (HasSystemScope) {
if (Op == AtomicRMWInst::Sub || Op == AtomicRMWInst::Or ||
Op == AtomicRMWInst::Xor) {
// Atomic sub/or/xor do not work over PCI express, but atomic add
// does. InstCombine transforms these with 0 to or, so undo that.
if (Constant *ConstVal = dyn_cast<Constant>(RMW->getValOperand());
ConstVal && ConstVal->isNullValue())
return AtomicExpansionKind::Expand;
}

return AtomicExpansionKind::CmpXChg;
// On most subtargets, for atomicrmw operations other than add/xchg,
// whether or not the instructions will behave correctly depends on where
// the address physically resides and what interconnect is used in the
// system configuration. On some targets the instruction will nop,
// and in others synchronization will only occur at degraded device scope.
//
// If the allocation is known local to the device, the instructions should
// work correctly.
if (RMW->hasMetadata("amdgpu.no.remote.memory"))
return atomicSupportedIfLegalIntType(RMW);

// If fine-grained remote memory works at device scope, we don't need to
// do anything.
if (!HasSystemScope &&
Subtarget->supportsAgentScopeFineGrainedRemoteMemoryAtomics())
return atomicSupportedIfLegalIntType(RMW);

// If we are targeting a remote allocated address, it depends what kind of
// allocation the address belongs to.
//
// If the allocation is fine-grained (in host memory, or in PCIe peer
// device memory), the operation may fail, depending on the target.
//
// Note fine-grained host memory access does work on APUs or if XGMI is
// used, but we do not know if we are targeting an APU or the system
// configuration from the ISA version/target-cpu.
if (RMW->hasMetadata("amdgpu.no.fine.grained.memory"))
return atomicSupportedIfLegalIntType(RMW);

if (Op == AtomicRMWInst::Sub || Op == AtomicRMWInst::Or ||
Op == AtomicRMWInst::Xor) {
// Atomic sub/or/xor do not work over PCI express, but atomic add
// does. InstCombine transforms these with 0 to or, so undo that.
if (Constant *ConstVal = dyn_cast<Constant>(RMW->getValOperand());
ConstVal && ConstVal->isNullValue())
return AtomicExpansionKind::Expand;
}

// If the allocation could be in remote, fine-grained memory, the rmw
// instructions may fail. cmpxchg should work, so emit that. On some
// system configurations, PCIe atomics aren't supported so cmpxchg won't
// even work, so you're out of luck anyway.

// In summary:
//
// Cases that may fail:
// - fine-grained pinned host memory
// - fine-grained migratable host memory
// - fine-grained PCIe peer device
//
// Cases that should work, but may be treated overly conservatively:
// - fine-grained host memory on an APU
// - fine-grained XGMI peer device
return AtomicExpansionKind::CmpXChg;
}

return atomicSupportedIfLegalIntType(RMW);
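
The order of checks in the hunk above can be easier to follow as a standalone model. The following is a minimal sketch, not the LLVM implementation: `Op`, `Expansion`, `RMWModel`, `SubtargetModel`, `supportedIfLegal` and `classify` are hypothetical names introduced only for illustration. The boolean fields stand in for the metadata and subtarget queries seen in the diff, and `supportedIfLegal` assumes that `atomicSupportedIfLegalIntType` keeps the instruction for a legal integer width and otherwise falls back to cmpxchg. The sketch only covers the integer operations that reach this branch.

```cpp
// Hypothetical model of the decision order; these types are not LLVM's.
enum class Op { Sub, And, Or, Xor, Other };
enum class Expansion { None, Expand, CmpXChg };

struct RMWModel {
  Op op;
  bool globalLikeAddressSpace; // flat/global or buffer fat pointer
  bool hasSystemScope;         // system-wide sync scope
  bool noRemoteMemory;         // models !amdgpu.no.remote.memory
  bool noFineGrainedMemory;    // models !amdgpu.no.fine.grained.memory
  bool valueIsConstantZero;    // value operand is a null constant
  bool legalIntType;           // assumed input to the legality helper
};

struct SubtargetModel {
  // Models supportsAgentScopeFineGrainedRemoteMemoryAtomics().
  bool agentScopeFineGrainedRemoteAtomics;
};

// Assumption: legal integer widths keep the native instruction.
constexpr Expansion supportedIfLegal(const RMWModel &rmw) {
  return rmw.legalIntType ? Expansion::None : Expansion::CmpXChg;
}

constexpr Expansion classify(const RMWModel &rmw, const SubtargetModel &st) {
  if (!rmw.globalLikeAddressSpace)
    return supportedIfLegal(rmw);
  // Allocation known to be local to the device.
  if (rmw.noRemoteMemory)
    return supportedIfLegal(rmw);
  // Agent scope on a subtarget where fine-grained remote memory works.
  if (!rmw.hasSystemScope && st.agentScopeFineGrainedRemoteAtomics)
    return supportedIfLegal(rmw);
  // Allocation known not to be fine-grained.
  if (rmw.noFineGrainedMemory)
    return supportedIfLegal(rmw);
  // sub/or/xor with 0 gets the dedicated expansion.
  if ((rmw.op == Op::Sub || rmw.op == Op::Or || rmw.op == Op::Xor) &&
      rmw.valueIsConstantZero)
    return Expansion::Expand;
  // Possibly remote, fine-grained memory: fall back to cmpxchg.
  return Expansion::CmpXChg;
}

// System-scope "or 0" with no metadata takes the dedicated expansion.
static_assert(classify(RMWModel{Op::Or, true, true, false, false, true, true},
                       SubtargetModel{false}) == Expansion::Expand,
              "or-with-zero should expand");
// The same operation marked as not fine-grained keeps the instruction.
static_assert(classify(RMWModel{Op::Or, true, true, false, true, true, true},
                       SubtargetModel{false}) == Expansion::None,
              "metadata should allow the native instruction");
```

In this model a system-scope operation is no longer expanded unconditionally: either metadata flag short-circuits to the legality check before the cmpxchg fallback is considered, which matches the ordering of the added lines in the diff.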