Commit e30697f

Merge pull request opencv#27002 from GenshinImpactStarts:magnitude

[HAL RVV] impl magnitude | add perf test opencv#27002

Implement through the existing `cv_hal_magnitude32f` and `cv_hal_magnitude64f` interfaces.

**UPDATE**: UI (Universal Intrinsics) is enabled. The only difference between UI and HAL now is that HAL uses an approximate `sqrt`.

Perf test done on MUSE-PI.

```sh
$ opencv_test_core --gtest_filter="*Magnitude*"
$ opencv_perf_core --gtest_filter="*Magnitude*" --perf_min_samples=300 --perf_force_samples=300
```

Test result between enabled UI and HAL:

```
Name of Test                                        ui       rvv      rvv vs ui (x-factor)
Magnitude::MagnitudeFixture::(127x61, 32FC1)        0.029    0.016    1.75
Magnitude::MagnitudeFixture::(127x61, 64FC1)        0.057    0.036    1.57
Magnitude::MagnitudeFixture::(640x480, 32FC1)       1.063    0.648    1.64
Magnitude::MagnitudeFixture::(640x480, 64FC1)       2.261    1.530    1.48
Magnitude::MagnitudeFixture::(1280x720, 32FC1)      3.261    2.118    1.54
Magnitude::MagnitudeFixture::(1280x720, 64FC1)      6.802    4.682    1.45
Magnitude::MagnitudeFixture::(1920x1080, 32FC1)     7.287    4.738    1.54
Magnitude::MagnitudeFixture::(1920x1080, 64FC1)     15.226   10.334   1.47
```

Test result before and after enabling UI:

```
Name of Test                                        orig     pr       pr vs orig (x-factor)
Magnitude::MagnitudeFixture::(127x61, 32FC1)        0.032    0.029    1.11
Magnitude::MagnitudeFixture::(127x61, 64FC1)        0.067    0.057    1.17
Magnitude::MagnitudeFixture::(640x480, 32FC1)       1.228    1.063    1.16
Magnitude::MagnitudeFixture::(640x480, 64FC1)       2.786    2.261    1.23
Magnitude::MagnitudeFixture::(1280x720, 32FC1)      3.762    3.261    1.15
Magnitude::MagnitudeFixture::(1280x720, 64FC1)      8.549    6.802    1.26
Magnitude::MagnitudeFixture::(1920x1080, 32FC1)     8.408    7.287    1.15
Magnitude::MagnitudeFixture::(1920x1080, 64FC1)     18.884   15.226   1.24
```

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [ ] The PR is proposed to the proper branch
- [ ] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake

1 parent 71fe903 commit e30697f
5 files changed: +73 -7 lines changed

‎3rdparty/hal_rvv/hal_rvv.hpp

1 addition & 0 deletions
@@ -30,6 +30,7 @@
 #include "hal_rvv_1p0/minmax.hpp" // core
 #include "hal_rvv_1p0/atan.hpp" // core
 #include "hal_rvv_1p0/split.hpp" // core
+#include "hal_rvv_1p0/magnitude.hpp" // core
 #include "hal_rvv_1p0/flip.hpp" // core
 #include "hal_rvv_1p0/lut.hpp" // core
 #include "hal_rvv_1p0/exp.hpp" // core

New file: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+// This file is part of OpenCV project.
+// It is subject to the license terms in the LICENSE file found in the top-level directory
+// of this distribution and at http://opencv.org/license.html.
+
+// Copyright (C) 2025, Institute of Software, Chinese Academy of Sciences.
+
+#ifndef OPENCV_HAL_RVV_MAGNITUDE_HPP_INCLUDED
+#define OPENCV_HAL_RVV_MAGNITUDE_HPP_INCLUDED
+
+#include <riscv_vector.h>
+
+#include "hal_rvv_1p0/sqrt.hpp"
+#include "hal_rvv_1p0/types.hpp"
+
+namespace cv { namespace cv_hal_rvv {
+
+#undef cv_hal_magnitude32f
+#define cv_hal_magnitude32f cv::cv_hal_rvv::magnitude<cv::cv_hal_rvv::Sqrt32f<cv::cv_hal_rvv::RVV_F32M8>>
+#undef cv_hal_magnitude64f
+#define cv_hal_magnitude64f cv::cv_hal_rvv::magnitude<cv::cv_hal_rvv::Sqrt64f<cv::cv_hal_rvv::RVV_F64M8>>
+
+template <typename SQRT_T, typename T = typename SQRT_T::T::ElemType>
+inline int magnitude(const T* x, const T* y, T* dst, int len)
+{
+    size_t vl;
+    for (; len > 0; len -= (int)vl, x += vl, y += vl, dst += vl)
+    {
+        vl = SQRT_T::T::setvl(len);
+
+        auto vx = SQRT_T::T::vload(x, vl);
+        auto vy = SQRT_T::T::vload(y, vl);
+
+        auto vmag = detail::sqrt<SQRT_T::iter_times>(__riscv_vfmadd(vx, vx, __riscv_vfmul(vy, vy, vl), vl), vl);
+        SQRT_T::T::vstore(dst, vmag, vl);
+    }
+
+    return CV_HAL_ERROR_OK;
+}
+
+}}  // namespace cv::cv_hal_rvv
+
+#endif // OPENCV_HAL_RVV_MAGNITUDE_HPP_INCLUDED
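
The loop in the new header is strip-mined: each pass asks the hardware how many elements it can process (`setvl`), handles that many, and advances the pointers, so no separate tail loop is needed. A minimal scalar sketch of the same control flow follows. It is not the HAL code: `magnitude_model` and `kMaxLanes` are hypothetical names, the fixed lane count stands in for the hardware-dependent vector length, and `std::sqrt` replaces the approximate `detail::sqrt`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the hardware vector length. In the real code the
// chunk size comes from SQRT_T::T::setvl(len) and depends on the CPU.
constexpr std::size_t kMaxLanes = 8;

// Scalar model of the strip-mined loop: every outer iteration handles
// vl = min(remaining, kMaxLanes) elements, including the final partial chunk.
template <typename T>
int magnitude_model(const T* x, const T* y, T* dst, int len)
{
    assert(len >= 0);
    std::size_t vl;
    for (; len > 0; len -= (int)vl, x += vl, y += vl, dst += vl)
    {
        vl = std::min<std::size_t>((std::size_t)len, kMaxLanes);
        for (std::size_t i = 0; i < vl; ++i)
        {
            // Same shape as vfmadd(vx, vx, vfmul(vy, vy)): x*x + (y*y).
            T sum = std::fma(x[i], x[i], y[i] * y[i]);
            // Exact sqrt here; the HAL uses an approximation instead.
            dst[i] = std::sqrt(sum);
        }
    }
    return 0; // stands in for CV_HAL_ERROR_OK
}
```

Calling `magnitude_model(x.data(), y.data(), dst.data(), 19)` on 19-element vectors exercises two full chunks plus a 3-element tail through the same loop body.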

‎3rdparty/hal_rvv/hal_rvv_1p0/sqrt.hpp

7 additions & 5 deletions
@@ -45,11 +45,12 @@ inline VEC_T sqrt(VEC_T x, size_t vl)
         t = __riscv_vfrsub(t, 1.5, vl);
         y = __riscv_vfmul(t, y, vl);
     }
-    // just to prevent the compiler from calculating mask before the invSqrt, which will run out
+    // just to prevent the compiler from calculating mask before the iteration, which will run out
     // of registers and cause memory access.
     asm volatile("" ::: "memory");
-    auto mask = __riscv_vmfne(x, 0.0, vl);
-    mask = __riscv_vmfne_mu(mask, mask, x, INFINITY, vl);
+    auto classified = __riscv_vfclass(x, vl);
+    // block -0, +0, positive subnormal number, +inf
+    auto mask = __riscv_vmseq(__riscv_vand(classified, 0b10111000, vl), 0, vl);
     return __riscv_vfmul_mu(mask, x, x, y, vl);
 }
 
@@ -58,8 +59,9 @@ inline VEC_T sqrt(VEC_T x, size_t vl)
 template <size_t iter_times, typename VEC_T>
 inline VEC_T invSqrt(VEC_T x, size_t vl)
 {
-    auto mask = __riscv_vmfne(x, 0.0, vl);
-    mask = __riscv_vmfne_mu(mask, mask, x, INFINITY, vl);
+    auto classified = __riscv_vfclass(x, vl);
+    // block -0, +0, positive subnormal number, +inf
+    auto mask = __riscv_vmseq(__riscv_vand(classified, 0b10111000, vl), 0, vl);
     auto x2 = __riscv_vfmul(x, 0.5, vl);
     auto y = __riscv_vfrsqrt7(x, vl);
 #pragma unroll
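
The new mask relies on the RISC-V `fclass` encoding, where each input sets exactly one of ten bits; `0b10111000` selects bits 3, 4, 5 and 7 (-0, +0, positive subnormal, +inf). Lanes in those classes keep `x` unchanged, and all other lanes receive `x * y`, where `y` is the refined inverse square root. The scalar sketch below models that behaviour under stated assumptions: `fclass_model`, `use_newton`, `rsqrt_estimate` and `sqrt_model` are hypothetical names, the 7-bit starting estimate is emulated by truncating an exact result rather than reproducing `vfrsqrt7`, and the iteration count is a free template parameter because the value used by `Sqrt32f`/`Sqrt64f` is not shown in this diff.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Scalar model of the fclass result: exactly one bit is set per input.
inline std::uint32_t fclass_model(float v)
{
    const bool neg = std::signbit(v);
    switch (std::fpclassify(v))
    {
    case FP_INFINITE:  return neg ? 1u << 0 : 1u << 7;
    case FP_NORMAL:    return neg ? 1u << 1 : 1u << 6;
    case FP_SUBNORMAL: return neg ? 1u << 2 : 1u << 5;
    case FP_ZERO:      return neg ? 1u << 3 : 1u << 4;
    default:
        assert(std::isnan(v));
        return 1u << 9; // NaN, modelled as quiet NaN only
    }
}

// True when the lane takes the Newton result; false for -0, +0,
// positive subnormal and +inf, which pass through unchanged.
inline bool use_newton(float v)
{
    return (fclass_model(v) & 0b10111000u) == 0;
}

// Assumption: emulate a roughly 7-bit estimate by keeping the sign, the
// exponent and the top 7 mantissa bits of the exact 1/sqrt(x).
inline float rsqrt_estimate(float x)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "32-bit float expected");
    float exact = 1.0f / std::sqrt(x);
    std::uint32_t bits;
    std::memcpy(&bits, &exact, sizeof bits);
    bits &= 0xFFFF0000u;
    std::memcpy(&exact, &bits, sizeof bits);
    return exact;
}

// Newton-Raphson refinement y <- y * (1.5 - 0.5 * x * y^2), then sqrt = x * y.
template <int iter_times>
inline float sqrt_model(float x)
{
    const float x2 = 0.5f * x;
    float y = rsqrt_estimate(x);
    for (int i = 0; i < iter_times; ++i)
    {
        float t = y * y * x2;
        t = 1.5f - t;      // mirrors vfrsub(t, 1.5)
        y = t * y;
    }
    return use_newton(x) ? x * y : x; // mirrors vfmul_mu(mask, x, x, y)
}
```

In this model a negative input, -inf or NaN is not blocked, so it flows through the iteration and comes out as NaN, while a positive subnormal input is returned as-is rather than square-rooted.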

‎modules/core/perf/perf_math.cpp

21 additions & 0 deletions
@@ -36,6 +36,27 @@ PERF_TEST_P(VectorLength, phase64f, testing::Values(128, 1000, 128*1024, 512*102
     SANITY_CHECK(angle, 5e-5);
 }
 
+///////////// Magnitude /////////////
+
+typedef Size_MatType MagnitudeFixture;
+
+PERF_TEST_P(MagnitudeFixture, Magnitude,
+    testing::Combine(testing::Values(TYPICAL_MAT_SIZES), testing::Values(CV_32F, CV_64F)))
+{
+    cv::Size size = std::get<0>(GetParam());
+    int type = std::get<1>(GetParam());
+
+    cv::Mat x(size, type);
+    cv::Mat y(size, type);
+    cv::Mat magnitude(size, type);
+
+    declare.in(x, y, WARMUP_RNG).out(magnitude);
+
+    TEST_CYCLE() cv::magnitude(x, y, magnitude);
+
+    SANITY_CHECK_NOTHING();
+}
+
 // generates random vectors, performs Gram-Schmidt orthogonalization on them
 Mat randomOrtho(int rows, int ftype, RNG& rng)
 {

‎modules/core/src/mathfuncs_core.simd.hpp

2 additions & 2 deletions
@@ -273,7 +273,7 @@ void magnitude32f(const float* x, const float* y, float* mag, int len)
 
     int i = 0;
 
-#if CV_SIMD
+#if (CV_SIMD || CV_SIMD_SCALABLE)
     const int VECSZ = VTraits<v_float32>::vlanes();
     for( ; i < len; i += VECSZ*2 )
     {
@@ -306,7 +306,7 @@ void magnitude64f(const double* x, const double* y, double* mag, int len)
 
     int i = 0;
 
-#if CV_SIMD_64F
+#if (CV_SIMD_64F || CV_SIMD_SCALABLE_64F)
     const int VECSZ = VTraits<v_float64>::vlanes();
     for( ; i < len; i += VECSZ*2 )
     {
