Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Commit a0c6c20

Browse filesBrowse files
committed
BUG, SIMD: Fix floating-point errors with positive infinity input in sqrt on armhf
Guards against passing positive infinity to vrsqrteq_f32 in sqrt operation, which would raise invalid floating-point errors on ARMv7 architectures.
1 parent 09c515d commit a0c6c20
Copy full SHA for a0c6c20

File tree

1 file changed

+6
-6
lines changed
Filter options
  • numpy/_core/src/common/simd/neon

1 file changed

+6
-6
lines changed

‎numpy/_core/src/common/simd/neon/math.h

Copy file name to clipboardExpand all lines: numpy/_core/src/common/simd/neon/math.h
+6-6Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -28,11 +28,13 @@ NPY_FINLINE npyv_f32 npyv_square_f32(npyv_f32 a)
2828
// Based on ARM doc, see https://developer.arm.com/documentation/dui0204/j/CIHDIACI
2929
NPY_FINLINE npyv_f32 npyv_sqrt_f32(npyv_f32 a)
3030
{
31+
const npyv_f32 one = vdupq_n_f32(1.0f);
3132
const npyv_f32 zero = vdupq_n_f32(0.0f);
3233
const npyv_u32 pinf = vdupq_n_u32(0x7f800000);
3334
npyv_u32 is_zero = vceqq_f32(a, zero), is_inf = vceqq_u32(vreinterpretq_u32_f32(a), pinf);
34-
// guard against floating-point division-by-zero error
35-
npyv_f32 guard_byz = vbslq_f32(is_zero, vreinterpretq_f32_u32(pinf), a);
35+
npyv_u32 is_special = vorrq_u32(is_zero, is_inf);
36+
// guard against division-by-zero and infinity input to vrsqrte to avoid invalid fp error
37+
npyv_f32 guard_byz = vbslq_f32(is_special, one, a);
3638
// estimate to (1/√a)
3739
npyv_f32 rsqrte = vrsqrteq_f32(guard_byz);
3840
/**
@@ -47,10 +49,8 @@ NPY_FINLINE npyv_f32 npyv_square_f32(npyv_f32 a)
4749
rsqrte = vmulq_f32(vrsqrtsq_f32(vmulq_f32(a, rsqrte), rsqrte), rsqrte);
4850
// a * (1/√a)
4951
npyv_f32 sqrt = vmulq_f32(a, rsqrte);
50-
// return zero if the a is zero
51-
// - return zero if a is zero.
52-
// - return positive infinity if a is positive infinity
53-
return vbslq_f32(vorrq_u32(is_zero, is_inf), a, sqrt);
52+
// Handle special cases: return a for zeros and positive infinities
53+
return vbslq_f32(is_special, a, sqrt);
5454
}
5555
#endif // NPY_SIMD_F64
5656

0 commit comments

Comments
0 (0)
Morty Proxy This is a proxified and sanitized view of the page, visit original site.