-
-
Notifications
You must be signed in to change notification settings - Fork 226
Use new Interlocked.Exchange/CompareExchange overloads that support bool values #4585
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use new Interlocked.Exchange/CompareExchange overloads that support bool values #4585
Conversation
e79b72c
to
34bec13
Compare
If you accept this, please consider adding the |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4585 +/- ##
==========================================
+ Coverage 73.49% 73.51% +0.01%
==========================================
Files 482 483 +1
Lines 17678 17688 +10
Branches 3493 3492 -1
==========================================
+ Hits 12993 13003 +10
+ Misses 3797 3796 -1
- Partials 888 889 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…d of `bool` because of the lack of support in Interlocked.Exchange/CompareExchange to use the now-supported type, but with backwards-compatible support for .NET pre-9.0.
e409bbc
to
f0d95a3
Compare
Rebased to resolve merge conflict in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @logiclrd!
I think it looks pretty good as is. About the only thing I'm wondering if we can improve upon is that there's quite a bit of repetition of this pattern:
#if NET9_0_OR_GREATER
private bool _someVariable;
const bool TRUE = true;
const bool FALSE = false;
#else
private int _someVariable;
const int TRUE = 1;
const int FALSE = 0;
#endif
Perhaps we could centralise that logic in an InterlockedHelpers
or InterlockedShim
file somewhere:
#if NET9_0_OR_GREATER
using LockedBool = System.Boolean;
const bool TRUE = true;
const bool FALSE = false;
#else
using LockedBool = System.Int32;
const int TRUE = 1;
const int FALSE = 0;
#endif
And then we could use this in other source files like:
private LockedBool _someVariable;
I think that way we'd only need the conditional directives in the helper/shim file and the rest of the code could remain relatively tidy.
@Flash0ver what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @logiclrd for the PR!
Is there a way we can consolidate / centralize the pattern without deteriorating the Editor-experience?
What about wrapping the
This one file would still be subject to Rider objecting to having two names for |
I have proofed this out, I think it fits the bill :-) No repetition, concise usage sites, centralization of the version-specific logic.
And:
|
@Flash0ver I installed Rider to see this suggestion ("Use type alias") for myself and see if I could find a way to disable it locally, but on a fresh installation, it doesn't seem to be giving that suggestion. Is there something I need to toggle to enable it, do you know?? |
Never mind, I figured it out :-) Looks like that suggestion only applies to |
This reverts commit f0d95a3. Per discussion, this is not a user-facing change.
Per the discussion earlier, I've reverted the change to And, I've committed a change that switches this over to the proposal I most recently outlined. Let me know if you'd like me to squash the commits down before merging (to reformulate the commit message), or if you'll just do it in the PR merge or whatnot. 🙂 It also occurs to me that this is now a new testable unit. |
…ting. Added InterlockedBooleanTests.cs to Sentry.Tests. Added a missing using statement to SentryDatabaseLogging.cs.
…n test theory InterlockedBoolean_CompareExchange_ReturnsExpected.
This is a weird build error and I'm not sure how to go about debugging it. In an earlier iteration, I had attributed the members of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great implementation. And great test coverage!
I got a couple of nitpicks about style and naming,
and a candidate for a follow-up PR around performance and maintainability.
I'll take a look at CS8785
... this seems to be unrelated to this PR ... and on Windows only ...
i.e. the usage of MethodImplAttribute
should have no effect on this and you should be able to bring it back.
|
||
public InterlockedBoolean(bool value) { _value = value ? True : False; } | ||
|
||
public static implicit operator bool(InterlockedBoolean? _this) => (_this != null) && (_this.Value._value != False); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: do not pass Nullable<InterlockedBoolean>
From what I can see, we don't have any usages converting a Nullable<InterlockedBoolean>
to a System.Boolean
.
In order to avoid some implicit conversions there, please change this to be non-nullable, for best performance.
public bool Exchange(bool newValue) | ||
{ | ||
TBool localNewValue = newValue ? True : False; | ||
|
||
TBool localReturnValue = Interlocked.Exchange(ref _value, localNewValue); | ||
|
||
return (localReturnValue != False); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: potentially more efficient implementation ... candidate for a follow-up PR
I am wondering if we can make the implementation for NET9_0_OR_GREATER
more efficient by removing TBool localNewValue = newValue ? True : False;
which is only required for older TFMs.
Something like:
// ...
#if NET9_0_OR_GREATER
internal volatile bool _value;
#else
internal volatile int _value;
#endif
// ...
public bool Exchange(bool newValue)
{
#if NET9_0_OR_GREATER
return Interlocked.Exchange(ref _value, newValue);
#else
int localNewValue = newValue ? 1 : 0;
int localReturnValue = Interlocked.Exchange(ref _value, localNewValue);
return (localReturnValue != 0);
#endif
}
// ...
This way we wouldn't need any TBool
alias anymore, which in my opinion becomes a bit more readable,
and the NET9_0_OR_GREATER
implementation should be every so slightly more efficient (unless the JIT sees through it and optimized it).
However, I am quite happy to having this as a follow-up PR, if you'd like ... as the current implementation is functional and this change is about style and performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not able to predict exactly what it'll do, but I actually wouldn't be surprised if the JIT's code optimization just treated the same storage as bool
and int
contextually. I'm pretty sure in the ABI, a parameter on the call stack is always machine word-aligned, so on x86 the bool
is internally an Int32
and on x64 the bool
is an Int64
. The optimizer might notice this and simply pass the memory address of the parameter on the stack in as the ref int
for Interlocked.Exchange
. I wouldn't depend on this, of course, but just to say that there might not really be a hugely noticeable speed difference.
Now I'm curious 🙂 I might do some poking just to see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I took a peek at what the JIT is doing.
In Debug
mode, it's doing exactly what's written, no special efficiency.
Source code:
TBool localNewValue = newValue ? True : False;
Compiled code:
// Load the parameter value as a 32-bit value, but then explicitly zero-extend the least significant byte
00007FFE80D61C39 mov eax,dword ptr [rbp+68h]
00007FFE80D61C3C movzx eax,al
// EAX is now either 0 or 1 based on the bool value of newValue. Branch based on whether it is zero or not.
00007FFE80D61C3F test eax,eax
00007FFE80D61C41 jnz Sentry.Internal.InterlockedBoolean.Exchange(Boolean)+03Ah (07FFE80D61C4Ah)
// - It is zero, write the value of False, which is also bool false, into temporary storage
00007FFE80D61C43 xor eax,eax
00007FFE80D61C45 mov dword ptr [rbp+30h],eax
00007FFE80D61C48 jmp Sentry.Internal.InterlockedBoolean.Exchange(Boolean)+041h (07FFE80D61C51h)
// - It is non-zero, write the value of True, which is bool true, into temporary storage
00007FFE80D61C4A mov dword ptr [rbp+30h],1
// Now, just as when we read the parameter value, read the temporary storage as a 32-bit value, but then explicitly zero-extend the least significant byte (totally unnecessary 🙂)
00007FFE80D61C51 mov eax,dword ptr [rbp+30h]
00007FFE80D61C54 movzx eax,al
// Write the result to localNewValue
00007FFE80D61C57 mov dword ptr [rbp+3Ch],eax
So that's kind of disappointing but also not a huge surprise. 🙂
Interestingly, the return value isn't quite so horrifically inefficient:
Source code, in which localReturnValue is a bool because I'm running in .NET 9:
return (localReturnValue != False);
Compiled code:
// Load the value of localReturnValue
00007FFE80D61C79 mov eax,dword ptr [rbp+38h]
// Write it to temporary storage
00007FFE80D61C7C mov dword ptr [rbp+34h],eax
00007FFE80D61C7F nop
// Load the temporary storage into EAX in preparation for RET (totally unnecessary 🙂 it's already there)
00007FFE80D61C80 mov eax,dword ptr [rbp+34h]
So, in the return code, it recognized that boolValue != false
is just boolValue
, but in the lead-in, it didn't recognize that boolValue ? true : false
is just boolValue
.
Anyway, that's the Debug
profile, no optimization. Now for Release
.
Source code:
TBool localNewValue = newValue ? True : False;
Compiled code:
// With the fastcall calling convention, the this argument is in ECX and the parameter newValue is in EDX. Stash in temporary storage.
00007FFE6B3C7B53 mov dword ptr [rbp+18h],edx
// Similar to the Debug profile, load the value as a 32-bit word and then zero-extend.
00007FFE6B3C7B56 mov eax,dword ptr [rbp+18h]
00007FFE6B3C7B59 movzx eax,al
// Test if it is zero or not.
00007FFE6B3C7B5C test eax,eax
// No branching, ma! Set AL to 1 or 0 based on whether the bool value was true or false.
00007FFE6B3C7B5E setnz al
// Zero-extend, just in case (this clears the top bytes of EAX), then store it in localNewValue
00007FFE6B3C7B61 movzx eax,al
00007FFE6B3C7B64 mov dword ptr [rbp-4],eax
As for the return, well, that gets interesting :-) The Interlocked.Exchange
actually got treated as an intrinsic, and the result, which was computed into ECX, was then copied directly to EAX. The metadata didn't even attribute a byte range to the return
line at all.
Source code:
TBool localReturnValue = Interlocked.Exchange(ref _value, localNewValue);
return (localReturnValue != False);
Compiled code:
// Load the memory address of _value
00007FFE6B3C7B67 mov rax,qword ptr [rbp+10h]
// This opcode is weird, possibly a bug in the JIT? It compares the byte at this memory address _with the least significant byte of the memory address_
00007FFE6B3C7B6B cmp byte ptr [rax],al
// Load the memory address of _value again (unnecessary)
00007FFE6B3C7B6D mov rax,qword ptr [rbp+10h]
// Load the value of localNewValue into ECX
00007FFE6B3C7B71 mov ecx,dword ptr [rbp-4]
// Interlocked.Exchange, a single instruction :-)
00007FFE6B3C7B74 xchg cl,byte ptr [rax]
// XCHG has swapped the byte in _value with the CL register. Now we zero-extend, turning it into a 32-bit representation of the bool value.
00007FFE6B3C7B76 movzx ecx,cl
// Store in EAX for return. All done.
00007FFE6B3C7B79 mov eax,ecx
So how does that compare against just a direct call to Interlocked.Exchange
, if the target framework supports it?
Source code:
return Interlocked.Exchange(ref _value, newValue);
Compiled code:
// Move the fastcall parameter from EDX into temporary storage
00007FFF5D766C78 mov dword ptr [rbp+18h],edx
// Load the memory address of _value
00007FFF5D766C7B mov rax,qword ptr [rbp+10h]
// This opcode is weird, possibly a bug in the JIT? It compares the byte at this memory address _with the least significant byte of the memory address_
00007FFF5D766C7F cmp byte ptr [rax],al
// Load the memory address of _value again (unnecessary)
00007FFF5D766C81 mov rax,qword ptr [rbp+10h]
// Load the value of localNewValue into ECX
00007FFF5D766C85 mov ecx,dword ptr [rbp+18h]
00007FFF5D766C88 movzx ecx,cl
// Interlocked.Exchange, a single instruction :-)
00007FFF5D766C8B xchg cl,byte ptr [rax]
// XCHG has swapped the byte in _value with the CL register. Now we zero-extend, turning it into a 32-bit representation of the bool value.
00007FFF5D766C8D movzx ecx,cl
// Store in EAX for return. All done.
00007FFF5D766C90 mov eax,ecx
So basically identical to the tail end of the generic implementation.
It's not perfectly-optimized, but it's pretty good.
What does this mean for performance?
It means the difference between being able to do myInterlockedBoolean.Exchange(value)
254 million times per second and only being able to do 251 million calls per second.
It appears that that extra lead-up is almost entirely inconsequential. I made a program to benchmark it, and very consistently, the "slow" path is only 1.2% slower than the "fast" path.
I'll polish up the benchmark code and throw it up into a repo somewhere, because maybe I made a silly mistake somewhere. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even in Debug
mode, the "slow" version is only about 11% slower than the "fast" version, ~158 million calls per second compared to ~140 million calls per second.
Here's the benchmark:
https://github.com/logiclrd/InterlockedExchangePerformanceTest
Warning (as error) We're using an older version there, for .NET Standard and .NET Framework, triggered in It should work again, when you change the test #if NET9_0_OR_GREATER
-using TBool = bool;
+using TBool = System.Boolean;
#else
-using TBool = int;
+using TBool = System.Int32;
#endif So you should be able to bring the |
done ✅ (as topic) |
- Moved using statements outside of namespaces in SentryDatabaseLogging.cs and SentryCountdownLock.cs. Specified preference for this in .editorconfig. - Renamed _init back to Init in SentryDatabaseLogging.cs. - Moved type aliases in InterlockedBoolean.cs and InterlockedBooleanTests.cs outside of the namespace and changed them from built-in type names to explicit references to the underlying types. - Made _value in InterlockedBoolean private and added an internal test-specific property instead. - Added MethodImpl hints to the methods it InterlockedBoolean.cs requesting aggressive inlining. - Changed parameters to the implicit conversion operatons in InterlockedBoolean from _this to @this. - Standardized .Should.Be(..) assertions in InterlockedBooleanTests.cs so that the expectation is always named expected.
Okay, I think that got all of the points of feedback. I checked three times, but if I missed any, let me know. |
…rTests property setter in SampleProfilerSession.cs. One straggler from the PR review feedback that I somehow missed: explicitly discard an unused return value in one of the tests.
…erlockedBoolean in SampleProfilerSection.cs in case the assignment isn't atomic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done!
I only got a final nitpick ... and then it's ready to merge.
With a potential follow-up about perf: https://github.com/getsentry/sentry-dotnet/pull/4585/files/71fc795107b11363cb6cb581482146e8ff439096#r2413655262
…rom the file local to Sentry.Azure.Functions.Worker to the root file, and changed it to csharp_using_directive_placement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@logiclrd huge thank you for the PR and for working through all our feedback!
❤️ 🚀
This PR updates places that use
Interlocked.Exchange
and/orInterlocked.CompareExchange
withint
values that are really booleans because, historically, the typebool
wasn't supported to use the new overloads in .NET 9.0+ that do supportbool
. As the code has targets older than .NET 9.0, these are all used conditionally, with fallback toint
on older platforms. Constants namedTRUE
andFALSE
are used, tracking the actual type of the backing field, so that the actual code doesn't need separate implementations.Closes: #3686
#skip-changelog