ENH: Improve polyval performance by using inplace operations #26885

Open · wants to merge 3 commits into main
Conversation

@eendebakpt (Contributor) commented Jul 6, 2024

Improve the performance of polynomial evaluation by using in-place operations.

Benchmark results on main:

· Running 7 total benchmarks (1 commits * 1 environments * 7 benchmarks)
[ 0.00%] ·· Benchmarking existing-pyC__projects_misc_numpy_.venv_Scripts_python.exe
[ 7.14%] ··· Running (bench_polynomial.Polynomial.time_polynomial_addition--).......
[57.14%] ··· bench_polynomial.Polynomial.time_polynomial_addition                                           14.8±0.9μs
[64.29%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_array_1000                              16.1±0.9μs
[71.43%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_array_1_000_000                           17.1±2ms
[78.57%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_array_3                                 5.30±0.1μs
[85.71%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_python_float                           1.64±0.09μs
[92.86%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_scalar                                 1.60±0.04μs
[100.00%] ··· bench_polynomial.Polynomial.time_polyval                                                         13.0±2ms

PR

· Running 7 total benchmarks (1 commits * 1 environments * 7 benchmarks)
[ 0.00%] ·· Benchmarking existing-pyC__projects_misc_numpy_.venv_Scripts_python.exe
[ 7.14%] ··· Running (bench_polynomial.Polynomial.time_polynomial_addition--).......
[57.14%] ··· bench_polynomial.Polynomial.time_polynomial_addition                                           14.0±0.5μs
[64.29%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_array_1000                              15.1±0.5μs
[71.43%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_array_1_000_000                         12.0±0.4ms
[78.57%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_array_3                                 5.70±0.4μs
[85.71%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_python_float                           1.58±0.02μs
[92.86%] ··· bench_polynomial.Polynomial.time_polynomial_evaluation_scalar                                 1.60±0.03μs
[100.00%] ··· bench_polynomial.Polynomial.time_polyval                                                       5.95±0.7ms

For large arrays, polyval is twice as fast. The tricky part is that in-place operations cannot promote the dtype. In the polyval implementation, all type promotions have already happened before the for loop: the line c0 = c[-1] + x*0 broadcasts the result to the right shape and dtype, in combination with c = c + 0.0 promoting integer coefficients. An exception is dtype=object arrays, but there type promotion applies to the individual elements rather than to the array as a whole.
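As a rough illustration of the idea (a simplified sketch, not the actual numpy.polynomial implementation): once the dtype of the accumulator is fixed up front, Horner's scheme can run with in-place multiply and add, avoiding a temporary array per loop iteration.

```python
import numpy as np

def polyval_inplace(x, c):
    """Evaluate sum(c[i] * x**i) via Horner's scheme with in-place updates.

    Simplified sketch of the PR's approach; ignores the dtype=object case.
    """
    c = np.asarray(c) + 0.0   # promote integer coefficients to float
    x = np.asarray(x)
    # c[-1] + x*0 broadcasts to x's shape and settles the result dtype,
    # so the in-place updates below never need to promote.
    c0 = c[-1] + x * 0
    for i in range(2, len(c) + 1):
        c0 *= x               # in-place multiply: no temporary array
        c0 += c[-i]           # in-place add
    return c0

x = np.array([0.0, 1.0, 2.0])
coeffs = [1, 2, 3]            # 1 + 2*x + 3*x**2
np.testing.assert_allclose(polyval_inplace(x, coeffs), [1.0, 6.0, 17.0])
```

The out-of-place variant `c0 = c0*x + c[-i]` allocates two fresh arrays per coefficient, which is what dominates the runtime for large x.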

@eendebakpt changed the title from "Draft: ENH: Improve polyval performance by using inplace operations" to "ENH: Improve polyval performance by using inplace operations" Jul 7, 2024
@jorenham (Member) commented Jul 8, 2024

Since only linear operations are involved, I believe it should be possible to get rid of that loop altogether and replace it with, e.g., a single matrix multiplication or an einsum. It might even be faster 🤷🏻.

@eendebakpt (Contributor, Author) commented

> Since only linear operations are involved, I believe it should be possible to get rid of that loop altogether and replace it with, e.g., a single matrix multiplication or an einsum. It might even be faster 🤷🏻.

From the argument x one can indeed create a vector (x**n, x**(n-1), ..., x**2, x, 1) and then use a matrix multiplication c @ x for the computation. However, for large input arrays x the intermediate array requires much more memory (on the order of the polynomial degree times x.size).
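This alternative can be sketched with np.vander (a hypothetical helper name, not what numpy.polynomial uses): it gives the right answer in one matrix product, but the intermediate Vandermonde matrix has shape (x.size, n+1), i.e. the full powers-of-x table lives in memory at once.

```python
import numpy as np

def polyval_matmul(x, c):
    """Evaluate sum(c[i] * x**i) as a single matrix product.

    Builds an (x.size, len(c)) Vandermonde matrix, so memory use grows
    with the polynomial degree, unlike Horner's scheme.
    """
    x = np.asarray(x)
    c = np.asarray(c)
    V = np.vander(x, N=len(c), increasing=True)  # columns: 1, x, x**2, ...
    return V @ c

x = np.array([0.0, 1.0, 2.0])
np.testing.assert_allclose(polyval_matmul(x, [1, 2, 3]), [1.0, 6.0, 17.0])
```

For a degree-n polynomial evaluated at a million points, V holds (n+1) million elements, which is exactly the memory overhead noted above.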

@jorenham (Member) commented Jul 8, 2024

Ah, I see. I was hoping that something like numpy.lib.stride_tricks.sliding_window_view could be used for this, or some neat np.einsum trick.
