[AMDGPU] Improve s_delay_alu insertion for instructions with multiple defs #163589

@jayfoad

Description


See https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AMDGPU/fcopysign.bf16.ll#L1233

The VOPD pair `v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3` is treated like a single instruction that writes to both `v0` and `v1`.

`s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_2)` says to wait first for the VOPD pair to complete before the use of `v0`, and then again for the same VOPD pair to complete before the use of `v1`. The second wait is redundant, because the first wait already guarantees that the pair has completed, and it potentially decreases code density.
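To make the redundancy concrete, here is a toy model in Python. It is only a sketch under stated assumptions and not the logic of the actual LLVM pass: the `Inst` type, the `plan_delays` function, and the `dedupe` flag are hypothetical names invented for illustration. It assumes every instruction is a VALU instruction (so the `VALU_DEP_n` distance is just the index difference) and that waiting once for a producer is enough for all of its results. Real constraints such as the maximum encodable distance are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Hypothetical stand-in for a machine instruction (illustration only)."""
    name: str
    defs: tuple = ()  # registers written
    uses: tuple = ()  # registers read


def plan_delays(insts, dedupe):
    """Return, for each instruction, the VALU_DEP_n distances it would wait on.

    With dedupe=False every used register requests its own wait.
    With dedupe=True a producer that was already waited on is skipped.
    """
    last_def = {}   # register -> index of the instruction that last wrote it
    waited = set()  # indices of producers that some earlier use waited on
    plan = []
    for i, inst in enumerate(insts):
        deps = []
        for reg in inst.uses:
            producer = last_def.get(reg)
            if producer is None:
                continue  # value comes from outside the modelled sequence
            if dedupe and producer in waited:
                continue  # this producer is already known to have completed
            dist = i - producer
            if dist not in deps:
                deps.append(dist)
            waited.add(producer)
        plan.append(deps)
        for reg in inst.defs:
            last_def[reg] = i
    return plan


pair = Inst("v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3", defs=("v0", "v1"))
use_v0 = Inst("use of v0", uses=("v0",))
use_v1 = Inst("use of v1", uses=("v1",))

print(plan_delays([pair, use_v0, use_v1], dedupe=False))  # [[], [1], [2]]
print(plan_delays([pair, use_v0, use_v1], dedupe=True))   # [[], [1], []]
```

In this model the per-register plan corresponds to `instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_2)`, while the deduplicated plan needs only the first wait.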
