-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[LAA] Hoist setting condition for RT-checks #128045
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-analysis Author: Ramkumar Ramachandra (artagnon) ChangesStrip ShouldRetyWithRuntimeCheck from the DepedenceDistanceStrideAndSizeInfo struct, and free isDependent from the responsibility of setting the condition for when runtime-checks are needed, transferring this responsibility to getDependenceDistanceStrideAndSize. The change is simple enough, but proving that this was the original intent of the code, making the change non-functional is non-trivial. The proof is as follows. The condition for runtime-checks is only respected when we return a Dependence::DepType of Dependence::Unknown, and the only possible place that could introduce a functional change with this patch is when Dist is possibly zero, but HasSameSize is false. Previously, we weren't checking that the Dist is not a SCEVConstant, and setting the runtime-check condition. Now, with the runtime-check condition hoisted, we do. We note that for a functional change to occur, the distance must be both non-constant, not provably non-positive and non-negative, and this is an impossibility. Full diff: https://github.com/llvm/llvm-project/pull/128045.diff 2 Files Affected:
diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index cb6f47e3a76be..b724078346903 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -372,8 +372,6 @@ class MemoryDepChecker {
uint64_t MaxStride;
std::optional<uint64_t> CommonStride;
- bool ShouldRetryWithRuntimeCheck;
-
/// TypeByteSize is either the common store size of both accesses, or 0 when
/// store sizes mismatch.
uint64_t TypeByteSize;
@@ -383,11 +381,9 @@ class MemoryDepChecker {
DepDistanceStrideAndSizeInfo(const SCEV *Dist, uint64_t MaxStride,
std::optional<uint64_t> CommonStride,
- bool ShouldRetryWithRuntimeCheck,
uint64_t TypeByteSize, bool AIsWrite,
bool BIsWrite)
: Dist(Dist), MaxStride(MaxStride), CommonStride(CommonStride),
- ShouldRetryWithRuntimeCheck(ShouldRetryWithRuntimeCheck),
TypeByteSize(TypeByteSize), AIsWrite(AIsWrite), BIsWrite(BIsWrite) {}
};
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index cab70c5c01a45..5e0302760aeac 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -2006,14 +2006,14 @@ MemoryDepChecker::getDependenceDistanceStrideAndSize(
if (StrideAScaled == StrideBScaled)
CommonStride = StrideAScaled;
- // TODO: Historically, we don't retry with runtime checks unless the
- // (unscaled) strides are the same. Fix this once the condition for runtime
- // checks in isDependent is fixed.
- bool ShouldRetryWithRuntimeCheck = StrideAPtrInt == StrideBPtrInt;
+ // TODO: FoundNonConstantDistanceDependence is used as a necessary condition
+ // to consider retrying with runtime checks. Historically, we did not set it
+ // when (unscaled) strides were different but there is no inherent reason to.
+ if (!isa<SCEVConstant>(Dist))
+ FoundNonConstantDistanceDependence |= StrideAPtrInt == StrideBPtrInt;
return DepDistanceStrideAndSizeInfo(Dist, MaxStride, CommonStride,
- ShouldRetryWithRuntimeCheck, TypeByteSize,
- AIsWrite, BIsWrite);
+ TypeByteSize, AIsWrite, BIsWrite);
}
MemoryDepChecker::Dependence::DepType
@@ -2028,15 +2028,11 @@ MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,
if (std::holds_alternative<Dependence::DepType>(Res))
return std::get<Dependence::DepType>(Res);
- auto &[Dist, MaxStride, CommonStride, ShouldRetryWithRuntimeCheck,
- TypeByteSize, AIsWrite, BIsWrite] =
+ auto &[Dist, MaxStride, CommonStride, TypeByteSize, AIsWrite, BIsWrite] =
std::get<DepDistanceStrideAndSizeInfo>(Res);
bool HasSameSize = TypeByteSize > 0;
if (isa<SCEVCouldNotCompute>(Dist)) {
- // TODO: Relax requirement that there is a common unscaled stride to retry
- // with non-constant distance dependencies.
- FoundNonConstantDistanceDependence |= ShouldRetryWithRuntimeCheck;
LLVM_DEBUG(dbgs() << "LAA: Dependence because of uncomputable distance.\n");
return Dependence::Unknown;
}
@@ -2096,14 +2092,8 @@ MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,
// forward dependency will allow vectorization using any width.
if (IsTrueDataDependence && EnableForwardingConflictDetection) {
- if (!ConstDist) {
- // TODO: FoundNonConstantDistanceDependence is used as a necessary
- // condition to consider retrying with runtime checks. Historically, we
- // did not set it when strides were different but there is no inherent
- // reason to.
- FoundNonConstantDistanceDependence |= ShouldRetryWithRuntimeCheck;
+ if (!ConstDist)
return Dependence::Unknown;
- }
if (!HasSameSize ||
couldPreventStoreLoadForward(
ConstDist->getAPInt().abs().getZExtValue(), TypeByteSize)) {
@@ -2119,22 +2109,8 @@ MemoryDepChecker::isDependent(const MemAccessInfo &A, unsigned AIdx,
int64_t MinDistance = SE.getSignedRangeMin(Dist).getSExtValue();
// Below we only handle strictly positive distances.
- if (MinDistance <= 0) {
- FoundNonConstantDistanceDependence |= ShouldRetryWithRuntimeCheck;
+ if (MinDistance <= 0)
return Dependence::Unknown;
- }
-
- if (!ConstDist) {
- // Previously this case would be treated as Unknown, possibly setting
- // FoundNonConstantDistanceDependence to force re-trying with runtime
- // checks. Until the TODO below is addressed, set it here to preserve
- // original behavior w.r.t. re-trying with runtime checks.
- // TODO: FoundNonConstantDistanceDependence is used as a necessary
- // condition to consider retrying with runtime checks. Historically, we
- // did not set it when strides were different but there is no inherent
- // reason to.
- FoundNonConstantDistanceDependence |= ShouldRetryWithRuntimeCheck;
- }
if (!HasSameSize) {
LLVM_DEBUG(dbgs() << "LAA: ReadWrite-Write positive dependency with "
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought when this was added originally there were cases where this needed to be checked on demand before exiting. It might not be needed after recent changes, although we might miss some test coverage. Will check to see if there's missing coverage.
Gentle ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, should hopefully be able to do this by Monday, if not could you ping me on Monday?
Gentle ping. |
1 similar comment
Gentle ping. |
#122318 is blocked on this patch. I'm quite happy with the natural-language proof I've provided in the summary for asserting that this patch is an NFC. |
Adds extra test coverage showing change by #128045.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added one case where this changes behavior in 8bdcd0a
Not looked closely at it, but I think previously the code should only retry with runtime checks if there is an unknown dependence with non-constant distance.
Now we will always retry with runtime checks, even if we had a non-constant distance that isn't unknown?
Interesting, I seem to have misunderstood something.
My misunderstanding is the following: wouldn't ShouldRetryWithRuntimeChecks only be respected when the dependence is unknown? Why would we retry with runtime checks if the dependence is Forward, Backward etc? |
…checks. Adds extra test coverage showing change by llvm/llvm-project#128045.
537f7ef
to
35c2a0f
Compare
Okay, it doesn't look like I misunderstood anything, and this is an edge-case I didn't account for: it returns unknown straight away for two access-pairs, not hitting any code changed by the patch, and returns a valid DependenceDistanceStrideAndSize (later determined as NoDep) for the third access-pair while setting the condition for RT-checks. While I think the logic is far from precise, I think the patch and the test change are correct, within the limitations of the existing code. |
Gentle ping. |
So I thought about this patch, and I think we are already quite unprincipled in our criterion for retrying with runtime checks. The unintended side-effect of this patch is that we retry with runtime checks in one additional corner case, but this is not a problem for correctness: I hence think we should go ahead with this patch, if only for the reason of making progress in terms of having a more maintainable LAA and to unblock #122318. |
I looked at the From what I can tell, we can have multiple Is this the intended behaviour? If so, is there a rationale for treating Also, as noted above, the patch does not appear to be NFC in practice, but that might be acceptable. Making it truly NFC would require hoisting quite a bit of logic from In any case, I’d appreciate some insight into the retry logic. |
Yes, this is correct.
Using a single
As far as I can tell, there is no rationale for this, although I'm unable to dissect the original intent of the code from the history. It should be noted that // TODO: Relax requirement that there is a common unscaled stride to retry
// with non-constant distance dependencies. It is actually an ad-hoc condition for retrying with runtime checks, which is set when either there is a non-constant dependence distance, or if there is no common unscaled stride (what is the semantic meaning of the latter?). Thanks for your write-up! I couldn't have written it more clearly myself, and I will adapt your write-up for the commit message. |
Strip ShouldRetyWithRuntimeCheck from the DepedenceDistanceStrideAndSizeInfo struct, and free isDependent from the responsibility of setting the condition for when runtime-checks are needed, transferring this responsibility to getDependenceDistanceStrideAndSize. The change is simple enough, but proving that this was the original intent of the code, making the change non-functional is non-trivial. The proof is as follows. The condition for runtime-checks is only respected when we return a Dependence::DepType of Dependence::Unknown, and the only possible place that could introduce a functional change with this patch is when Dist is possibly zero, but HasSameSize is false. Previously, we weren't checking that the Dist is not a SCEVConstant, and setting the runtime-check condition. Now, with the runtime-check condition hoisted, we do. We note that for a functional change to occur, the distance must be both non-constant, not provably non-positive and non-negative, and this is an impossibility.
35c2a0f
to
9c5c2fe
Compare
Gentle ping. Is there some kind of conclusion we've reached about the patch? |
Leaving it to @fhahn to review this patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the latest update, will check again how frequently this changes behavior to make sure there are no unexpected surprises.
Gentle ping. |
Strip ShouldRetyWithRuntimeCheck from the DepedenceDistanceStrideAndSizeInfo struct, and free isDependent from the responsibility of setting the condition for when runtime-checks are needed, transferring this responsibility to getDependenceDistanceStrideAndSize.
We can have multiple DepType::Unknown dependences that, by themselves, do not trigger the retrying with runtime memory checks, and therefore block vectorization. But once a single FoundNonConstantDistanceDependence is found, the analysis seems to switch to the "LAA: Retrying with memory checks" path and allows all these dependences to be handled via runtime checks. There is hence no rationale for predicating FoundNonConstantDependenceDistance on DepType::Unknown, and removing this predication is one of the side-effects of this patch.