base64 module: Link against SIMD library for 10x performance. #124951

Open
@gpshead

Description


Performance enhancement

Proposal:

https://pypi.org/project/pybase64/ (source: https://github.com/mayeut/pybase64, BSD licensed) exists. On top of some SIMD code of its own for the base64 module's extra features (character translation)^, it links against https://github.com/aklomp/base64, a BSD licensed C99 library with SIMD acceleration that gives 5-20x the performance on base64 encoding and decoding operations compared with our existing generic byte-based base64 C code.

We could adopt a bunch of the pybase64 code to make the default base64 module experience better; it is relatively straightforward extension module code (as one would expect). On the other hand, I expect pybase64 to remain where new development and further improvements in this space happen, as people who care strongly about performance need the latest and greatest from PyPI regardless of their current CPython version. (Looping in @mayeut for thoughts on that.)
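For context on what "latest and greatest from PyPI" tends to look like in user code today, here is a minimal sketch of the common fallback-import pattern. The wrapper names `encode` and `decode` are hypothetical; the only library calls assumed are `b64encode` and `b64decode(..., validate=True)`, which both pybase64 and the stdlib base64 module provide.

```python
import base64 as _stdlib_base64

try:
    # Prefer the SIMD-accelerated implementation when it is installed.
    import pybase64 as _b64
except ImportError:
    # Otherwise fall back to the stdlib module; same call signatures are used below.
    _b64 = _stdlib_base64


def encode(data: bytes) -> bytes:
    """Encode bytes to standard base64 (hypothetical helper)."""
    return _b64.b64encode(data)


def decode(data: bytes) -> bytes:
    """Decode standard base64, rejecting non-alphabet characters (hypothetical helper)."""
    return _b64.b64decode(data, validate=True)


payload = b"hello, base64"
assert decode(encode(payload)) == payload
```

If the stdlib module linked against a SIMD library itself, this kind of shim would matter only to users who want improvements newer than their CPython version ships.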

Practicalities: Library availability? We'd vendor a libbase64 build for use in our binary distributions. I don't think it is currently widely available as a package on Linux distributions (I only did a quick search on Ubuntu), so to be fair and match the good performance there we'd currently need to vendor our own copy in tree. Yuck, but ideally only temporary until distros pick it up as a package of its own. Consider it similar to the Modules/_decimal/libmpdec/ situation: our configure.ac finds an installed one and distros link against that.

Risks: It is a new C library dependency, so security concerns within it become our own, and base64 is frequently used to process untrusted input. But its surface of possible problems is limited (very simple data format). We should ensure the library gets proper oss-fuzz test coverage before adoption (@aklomp for visibility).
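To illustrate the untrusted-input case with the stdlib API as it exists today, here is a small sketch of strict decoding. The helper name `decode_untrusted` is hypothetical; `base64.b64decode(..., validate=True)` and `binascii.Error` are the existing stdlib pieces. Any replacement backend would presumably need to preserve this behavior.

```python
import base64
import binascii
from typing import Optional


def decode_untrusted(data: bytes) -> Optional[bytes]:
    """Strictly decode base64, returning None for malformed input (hypothetical helper)."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(data, validate=True)
    except binascii.Error:
        return None
```

With `validate=False` (the default) non-alphabet characters are discarded before decoding, so the two modes exercise different code paths and would both be worth fuzzing.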


^ bytes.translate, bytearray.translate, or str.translate might benefit from similar SIMD treatment. Would that be better from a CPython perspective than only doing it within this module? If so, let's file a new issue just for that bit.
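For readers unfamiliar with the "character translation" step, the sketch below shows how an alternate alphabet can be produced by a `bytes.translate` pass over standard base64 output. The names `_URLSAFE` and `b64encode_altchars` are hypothetical, and this is an illustration of the technique rather than a statement of how any particular implementation does it.

```python
import base64

# Translation table mapping the two standard-alphabet characters
# '+' and '/' to the URL-safe alternatives '-' and '_'.
_URLSAFE = bytes.maketrans(b"+/", b"-_")


def b64encode_altchars(data: bytes) -> bytes:
    """Encode, then translate into the alternate alphabet (hypothetical helper)."""
    return base64.b64encode(data).translate(_URLSAFE)


sample = bytes([0xFB, 0xFF])
# The result matches what the stdlib produces for altchars=b"-_".
assert b64encode_altchars(sample) == base64.b64encode(sample, altchars=b"-_")
```

The translate pass touches every output byte, which is why a faster `bytes.translate` could help callers well beyond base64.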


❯ python -m pybase64 benchmark `which python`
pybase64 1.4.0 (C extension active - NEON)  # running on my Apple M3
bench: altchars=None, validate=False
pybase64._pybase64.encodebytes:   4776.815 MB/s (5,936,128 bytes -> 8,018,983 bytes)
pybase64._pybase64.b64encode:    11989.872 MB/s (5,936,128 bytes -> 7,914,840 bytes)
pybase64._pybase64.b64decode:     3039.329 MB/s (7,914,840 bytes -> 5,936,128 bytes)
base64.encodebytes:                292.876 MB/s (5,936,128 bytes -> 8,018,983 bytes)
base64.b64encode:                  601.307 MB/s (5,936,128 bytes -> 7,914,840 bytes)
base64.b64decode:                  492.088 MB/s (7,914,840 bytes -> 5,936,128 bytes)
bench: altchars=None, validate=True
pybase64._pybase64.b64encode:    12327.286 MB/s (5,936,128 bytes -> 7,914,840 bytes)
pybase64._pybase64.b64decode:     8611.733 MB/s (7,914,840 bytes -> 5,936,128 bytes)
base64.b64encode:                  597.389 MB/s (5,936,128 bytes -> 7,914,840 bytes)
base64.b64decode:                  472.430 MB/s (7,914,840 bytes -> 5,936,128 bytes)
bench: altchars=b'-_', validate=False
pybase64._pybase64.b64encode:     1287.615 MB/s (5,936,128 bytes -> 7,914,840 bytes)
pybase64._pybase64.b64decode:     2524.966 MB/s (7,914,840 bytes -> 5,936,128 bytes)
base64.b64encode:                  473.320 MB/s (5,936,128 bytes -> 7,914,840 bytes)
base64.b64decode:                  406.411 MB/s (7,914,840 bytes -> 5,936,128 bytes)
bench: altchars=b'-_', validate=True
pybase64._pybase64.b64encode:     1283.111 MB/s (5,936,128 bytes -> 7,914,840 bytes)
pybase64._pybase64.b64decode:     6745.809 MB/s (7,914,840 bytes -> 5,936,128 bytes)
base64.b64encode:                  464.526 MB/s (5,936,128 bytes -> 7,914,840 bytes)
base64.b64decode:                  391.959 MB/s (7,914,840 bytes -> 5,936,128 bytes)
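As a quick sanity check on the "5-20x" figure, the snippet below computes speedup ratios from a few of the throughput numbers reported above (MB/s, copied from this single Apple M3 run). The dictionary layout and names are only for illustration.

```python
# (pybase64 MB/s, stdlib base64 MB/s), copied from the benchmark output above.
reported = {
    "encodebytes (altchars=None, validate=False)": (4776.815, 292.876),
    "b64encode (altchars=None, validate=False)": (11989.872, 601.307),
    "b64decode (altchars=None, validate=False)": (3039.329, 492.088),
    "b64encode (altchars=b'-_', validate=False)": (1287.615, 473.320),
}

# Speedup is simply the ratio of the two throughputs.
speedups = {name: fast / slow for name, (fast, slow) in reported.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x")
```

The altchars case shows a much smaller gain than the default alphabet in this run, which is consistent with the translation step being the slower part.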

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

If we spawn Discuss threads around this, let's edit and drop links here.

Metadata

Assignees

No one assigned

    Labels

    performance (Performance or resource usage), stdlib (Python modules in the Lib dir)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
