Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

gh-130167: Improve speed of _pydecimal._all_zeros and _pydecimal._exact_half by replacing re #132065

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 10 commits into from

Conversation

Marius-Juston
Copy link
Contributor

@Marius-Juston Marius-Juston commented Apr 4, 2025

Benchmark for all zeros

Benchmark regex_decimal non_regex_ends_fast_decimal
exact 230 ns 83.5 ns: 2.76x faster
No zeros 1 199 ns 83.9 ns: 2.38x faster
No zeros 10 207 ns 84.1 ns: 2.46x faster
No zeros 100 206 ns 86.4 ns: 2.38x faster
No zeros 1000 201 ns 94.1 ns: 2.13x faster
Mixed 1 204 ns 169 ns: 1.21x faster
Mixed 10 200 ns 170 ns: 1.18x faster
Mixed 100 208 ns 175 ns: 1.19x faster
Mixed 1000 198 ns 231 ns: 1.17x slower
Zero 1 231 ns 150 ns: 1.54x faster
Zero 10 237 ns 180 ns: 1.32x faster
Zero 100 287 ns 185 ns: 1.55x faster
Zero 1000 783 ns 245 ns: 3.20x faster
Geometric mean (ref) 1.72x faster

Benchmark for exact half

Benchmark half_regex half_non_regex_correct
exact 176 ns 97.1 ns: 1.81x faster
No half 1 178 ns 124 ns: 1.44x faster
No half 10 177 ns 125 ns: 1.42x faster
No half 100 177 ns 125 ns: 1.42x faster
No half 1000 180 ns 126 ns: 1.42x faster
Mixed 1 177 ns 126 ns: 1.41x faster
Mixed 10 177 ns 124 ns: 1.42x faster
Mixed 100 178 ns 125 ns: 1.42x faster
Mixed 1000 177 ns 127 ns: 1.39x faster
Half 1 232 ns 230 ns: 1.01x faster
Half 10 235 ns 269 ns: 1.14x slower
Half 100 288 ns 268 ns: 1.08x faster
Half 1000 780 ns 338 ns: 2.31x faster
Geometric mean (ref) 1.38x faster

The tests benchmarks

import pyperf

def setup_tests_zero():
    names = [
        ('exact', Decimal('3.5')._int)
    ]

    for i in [1, 10, 100, 1000]:
        names.append((f'No zeros {i}', Decimal('3.1' + '12' * i)._int))

    for i in [1, 10, 100, 1000]:
        names.append((f'Mixed {i}', Decimal('3.12' + '0' * i)._int))
        
    for i in [1, 10, 100, 1000]:
        names.append((f'Zero {i}', Decimal('3.1' + '0' * i)._int))
        
    return names

def setup_tests_half():
    names = [
        ('exact', Decimal('3.5')._int)
    ]

    for i in [1, 10, 100, 1000]:
        names.append((f'No half {i}', Decimal('3.1' + '12' * i)._int))

    for i in [1, 10, 100, 1000]:
        names.append((f'Mixed {i}', Decimal('3.5' + '0' * i)._int))
        
    for i in [1, 10, 100, 1000]:
        names.append((f'Half {i}', Decimal('3.15' + '0' * i)._int))
        
    return names

method = ''
benchmark_method = 'half'

benchmark = setup_tests_half if benchmark_method == 'half' else set 

func = _round_down if method == 'regex' else _is_all_zeros

runner = pyperf.Runner()

for n, v in benchmark():
    # print(func(v))
    runner.bench_func(n, func, v)

Lib/_pydecimal.py Outdated Show resolved Hide resolved
Lib/_pydecimal.py Outdated Show resolved Hide resolved
@eendebakpt
Copy link
Contributor

eendebakpt commented Apr 4, 2025

I have some doubts this is worthwhile. The _all_zeros is indeed faster, but is that a bottleneck in actual computations? I would like to see at least a benchmark with a public method from the decimal module and a benchmark with huge numbers.

_all_zeros = re.compile('0*$').match
_exact_half = re.compile('50*$').match
# Checks for regex 0*$
def _all_zeros(d_int,prec=0):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the prec argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we need it, since when it gets called in:

def _round_down(self, prec):
    """Also known as round-towards-0, truncate."""
    if _all_zeros(self._int, prec):
        return 0
    else:
        return -1

and

def _round_half_up(self, prec):
    """Rounds 5 up (away from 0)"""
    if self._int[prec] in '56789':
        return 1
    elif _all_zeros(self._int, prec):
        return 0
    else:
        return -1

they both need the argument

Marius-Juston and others added 3 commits April 4, 2025 07:03
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
@Marius-Juston
Copy link
Contributor Author

Benchmark regex_decimal non_regex_ends_fast_decimal non_regex_while
exact 230 ns 83.5 ns: 2.76x faster 119 ns: 1.93x faster
No zeros 1 199 ns 83.9 ns: 2.38x faster 119 ns: 1.67x faster
No zeros 10 207 ns 84.1 ns: 2.46x faster 119 ns: 1.74x faster
No zeros 100 206 ns 86.4 ns: 2.38x faster 119 ns: 1.74x faster
No zeros 1000 201 ns 94.1 ns: 2.13x faster 127 ns: 1.59x faster
Mixed 1 204 ns 169 ns: 1.21x faster 119 ns: 1.72x faster
Mixed 10 200 ns 170 ns: 1.18x faster 119 ns: 1.68x faster
Mixed 100 208 ns 175 ns: 1.19x faster 120 ns: 1.74x faster
Mixed 1000 198 ns 231 ns: 1.17x slower 126 ns: 1.57x faster
Zero 1 231 ns 150 ns: 1.54x faster 120 ns: 1.92x faster
Zero 10 237 ns 180 ns: 1.32x faster 120 ns: 1.98x faster
Zero 100 287 ns 185 ns: 1.55x faster 120 ns: 2.40x faster
Zero 1000 783 ns 245 ns: 3.20x faster 126 ns: 6.20x faster
Geometric mean (ref) 1.72x faster 1.97x faster

with the new while loop ( should have done this method first lol )

@Marius-Juston
Copy link
Contributor Author

Benchmark half_regex half_non_regex_correct half_non_regex_while
exact 176 ns 97.1 ns: 1.81x faster 125 ns: 1.41x faster
No half 1 178 ns 124 ns: 1.44x faster 127 ns: 1.40x faster
No half 10 177 ns 125 ns: 1.42x faster 126 ns: 1.40x faster
No half 100 177 ns 125 ns: 1.42x faster 126 ns: 1.41x faster
No half 1000 180 ns 126 ns: 1.42x faster 128 ns: 1.40x faster
Mixed 1 232 ns 230 ns: 1.01x faster 126 ns: 1.84x faster
Mixed 10 235 ns 269 ns: 1.14x slower 126 ns: 1.87x faster
Mixed 100 288 ns 268 ns: 1.08x faster 126 ns: 2.28x faster
Mixed 1000 780 ns 338 ns: 2.31x faster 128 ns: 6.10x faster
Half 1 177 ns 126 ns: 1.41x faster 126 ns: 1.41x faster
Half 10 177 ns 124 ns: 1.42x faster 126 ns: 1.40x faster
Half 100 178 ns 125 ns: 1.42x faster 126 ns: 1.41x faster
Half 1000 177 ns 127 ns: 1.39x faster 128 ns: 1.38x faster
Geometric mean (ref) 1.38x faster 1.70x faster

With the new while loop method for the _exact_half as well

@Marius-Juston
Copy link
Contributor Author

I have some doubts this is worthwhile. The _all_zeros is indeed faster, but is that a bottleneck in actual computations? I would like to see at least a benchmark with a public method from the decimal module and a benchmark with huge numbers.

This is probably not a bottleneck in actual computation ( it's a nice and easy optimization for sure ) and i was planning to use public methods later for benchmarking ( separating them this way though made it simpler to optimize individually ). And yeah I was planning to do larger numbers as well to test this out with repeating patterns as well to see. ( though since the prec sets the initial location of the match or the while loop it does not actually really matter how large the number is but really what the precision is )

Though these addition benchmarks will take me some time to do since I will be very busy, so if you wanted to test it yourself feel free!

@picnixz
Copy link
Member

picnixz commented Apr 4, 2025

I don't think this change needs to be done. _pydecimal is only used as a fallback when the decimal C extension is not available. In addition, this is not really better from a readability PoV, so I would prefer keeping this as is.

@skirpichev
Copy link
Member

This is probably not a bottleneck in actual computation ( it's a nice and easy optimization for sure )

If so, hardly it's a "nice optimization". E.g. the re dependency is not removed.

BTW, what's you use case for the pure-Python decimal module? In most cases, people will just use C-coded extension.

@Marius-Juston
Copy link
Contributor Author

Fair enough, I don't actually have a use case for the _pydecimal I was just looking through files and functions to see if there was anything that interested me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.