⚡️ Speed up function readable_bytes_string by 5% #52

Open
codeflash-ai[bot] wants to merge 1 commit into codeflash-ai/python-aiplatform:main from codeflash-ai/python-aiplatform:codeflash/optimize-readable_bytes_string-mglqgggz
@codeflash-ai codeflash-ai Bot commented Oct 11, 2025


📄 5% (0.05x) speedup for readable_bytes_string in google/cloud/aiplatform/tensorboard/upload_tracker.py

⏱️ Runtime: 962 microseconds → 915 microseconds (best of 340 runs)

📝 Explanation and details

The optimized code applies two key micro-optimizations that together achieve a 5% speedup:

1. Pre-computed constants instead of power operations

  • Replaced 2**20 with 1048576 and 2**10 with 1024
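  • Intended to avoid exponentiation at call time; note that CPython typically folds literal expressions such as 2**20 into constants at compile time, so the saving from this change alone is likely small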
  • The line profiler shows reduced time in the comparison operations (194.6ns vs 205.9ns per hit for the first condition)
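
Whether a literal such as 2**20 is actually recomputed on each call depends on the interpreter. The following is a minimal sketch for checking this, assuming CPython (which, as far as I know, folds constant expressions at compile time). The function name `uses_power_literal` is hypothetical and only stands in for the threshold comparison described above.

```python
import dis


def uses_power_literal(n):
    # Hypothetical stand-in for the threshold comparison; not the repository code.
    return n >= 2**20


# If the compiler folded 2**20, the result is stored among the code object's constants.
folded = 1048576 in uses_power_literal.__code__.co_consts

# Opcode names of the compiled function; a surviving power operation would appear
# as BINARY_POWER (older CPython) or an extra BINARY_OP (newer CPython).
opnames = [ins.opname for ins in dis.get_instructions(uses_power_literal)]

print("2**20 folded to a constant:", folded)
print(opnames)
```

If `folded` is true on the interpreter in use, replacing the literal with 1048576 should not change the bytecode that runs per call, and the measured difference for this line would have to come from elsewhere (for example, timing noise).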

2. Removed unnecessary float() casts

  • Changed float(bytes) / 2**20 to bytes / 1048576
  • In Python 3, the / operator performs true division and returns a float even when both operands are integers, making the explicit cast redundant
  • Saves function call overhead, particularly visible in the formatting lines where time per hit improved significantly (446.9ns vs 533.4ns for MB formatting)
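
The two changes can be sketched side by side. This is a reconstruction from the description above, not a copy of the repository code: the function names, the parameter name `num_bytes`, and the `"%d B"` format for the byte range are assumptions made for illustration.

```python
def readable_bytes_string_before(num_bytes):
    # Reconstructed "before" version: power literals and explicit float() casts.
    if num_bytes >= 2**20:
        return "%.1f MB" % (float(num_bytes) / 2**20)
    elif num_bytes >= 2**10:
        return "%.1f kB" % (float(num_bytes) / 2**10)
    else:
        return "%d B" % num_bytes  # assumed format for values below 1 kB


def readable_bytes_string_after(num_bytes):
    # Reconstructed "after" version: integer constants and plain true division.
    if num_bytes >= 1048576:
        return "%.1f MB" % (num_bytes / 1048576)
    elif num_bytes >= 1024:
        return "%.1f kB" % (num_bytes / 1024)
    else:
        return "%d B" % num_bytes  # assumed format for values below 1 kB


# Compare both versions on a handful of values around the thresholds.
samples = [0, 1, 1023, 1024, 1536, 1048575, 1048576, 1234567, 1500.5, -1024]
mismatches = [
    n for n in samples
    if readable_bytes_string_before(n) != readable_bytes_string_after(n)
]
print("mismatches:", mismatches)
```

Under these assumptions the two versions should agree on every sample, since float() of an integer in this range is exact and both divisions are then rounded the same way.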

Performance characteristics:
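The gains show up mainly in the kB and MB ranges, which are the only paths where the division changed: most single-call cases there run faster, by up to about 16%, and the loops over many kB or MB values improve by roughly 7-10%. This aligns with the line profiler data showing the biggest per-hit time reduction in the MB formatting path. The gains are not uniform, however: a number of individual cases, especially below 1 kB, measure 1-9% slower, and single calls that take only a few hundred nanoseconds are sensitive to timing noise. One test case, the largest 64-bit signed integer, shows a 51% improvement; this may be due to reduced overhead when handling large integers, but a single measurement does not establish the cause.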
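
Because the per-call differences are only tens of nanoseconds, it can help to re-measure each range separately. The sketch below uses the standard `timeit` module; the `before` and `after` functions are reconstructions based on the description in this PR (with an assumed `"%d B"` format), and the helper `best_ns_per_call` is a hypothetical name, not part of Codeflash.

```python
import timeit


def before(n):
    # Reconstructed original: power literals plus float() casts.
    if n >= 2**20:
        return "%.1f MB" % (float(n) / 2**20)
    elif n >= 2**10:
        return "%.1f kB" % (float(n) / 2**10)
    return "%d B" % n  # assumed byte-range format


def after(n):
    # Reconstructed optimized version: integer constants, no float() cast.
    if n >= 1048576:
        return "%.1f MB" % (n / 1048576)
    elif n >= 1024:
        return "%.1f kB" % (n / 1024)
    return "%d B" % n  # assumed byte-range format


def best_ns_per_call(func, value, number=20000, repeat=5):
    # Best-of-`repeat` wall time per call, in nanoseconds.
    timings = timeit.repeat(lambda: func(value), number=number, repeat=repeat)
    return min(timings) / number * 1e9


results = {}
for label, value in [("B", 512), ("kB", 1536), ("MB", 1234567)]:
    results[label] = (best_ns_per_call(before, value), best_ns_per_call(after, value))
    print("%-2s before=%.1f ns  after=%.1f ns" % ((label,) + results[label]))
```

The numbers include the overhead of the lambda wrapper and will vary between machines and runs, so they are only useful for comparing the two versions within one run.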

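The changes are intended as purely computational optimizations with no behavioral modifications: formatting and logic remain identical for every input exercised by the generated tests. One edge case can differ: for an integer too large to convert with float(), the original expression raises OverflowError, whereas true division of two integers can still succeed if the quotient fits in a float.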
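
One place where dropping the float() cast can matter is an integer far outside the float range. The sketch below illustrates this, assuming the original expression is exactly `float(bytes) / 2**20` as quoted above; the value `huge` is artificial and chosen only because it exceeds what a float can represent.

```python
huge = 10**310  # artificial value, far larger than any realistic byte count

# Original-style expression: the float() conversion itself overflows.
try:
    "%.1f MB" % (float(huge) / 2**20)
    before_outcome = "ok"
except OverflowError:
    before_outcome = "OverflowError"

# Optimized-style expression: int / int true division works on the integers
# directly, and the quotient (about 9.5e303) still fits in a float.
try:
    "%.1f MB" % (huge / 1048576)
    after_outcome = "ok"
except OverflowError:
    after_outcome = "OverflowError"

print("before:", before_outcome)
print("after: ", after_outcome)
```

This difference is unlikely to matter for real upload sizes, but it is a case the generated tests do not cover.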

Correctness verification report:

Test | Status
⚙️ Existing Unit Tests | 🔘 None Found
🌀 Generated Regression Tests | ✅ 3401 Passed
⏪ Replay Tests | 🔘 None Found
🔎 Concolic Coverage Tests | 🔘 None Found
📊 Tests Coverage | 100.0%
🌀 Generated Regression Tests and Runtime
import pytest  # used for our unit tests
from aiplatform.tensorboard.upload_tracker import readable_bytes_string

# unit tests

# --- Basic Test Cases ---

def test_bytes_under_1_kb():
    # 0 bytes
    codeflash_output = readable_bytes_string(0) # 780ns -> 809ns (3.58% slower)
    # 1 byte
    codeflash_output = readable_bytes_string(1) # 348ns -> 377ns (7.69% slower)
    # 512 bytes
    codeflash_output = readable_bytes_string(512) # 250ns -> 261ns (4.21% slower)
    # 1023 bytes (just under 1 kB)
    codeflash_output = readable_bytes_string(1023) # 245ns -> 249ns (1.61% slower)

def test_exactly_1_kb():
    # 1024 bytes == 1.0 kB
    codeflash_output = readable_bytes_string(1024) # 1.44μs -> 1.43μs (1.05% faster)

def test_bytes_in_kb_range():
    # 2048 bytes == 2.0 kB
    codeflash_output = readable_bytes_string(2048) # 1.13μs -> 1.15μs (0.961% slower)
    # 1536 bytes == 1.5 kB
    codeflash_output = readable_bytes_string(1536) # 599ns -> 552ns (8.51% faster)
    # 4096 bytes == 4.0 kB
    codeflash_output = readable_bytes_string(4096) # 348ns -> 367ns (5.18% slower)
    # 9999 bytes
    expected = "%.1f kB" % (9999 / 1024)
    codeflash_output = readable_bytes_string(9999) # 315ns -> 306ns (2.94% faster)

def test_exactly_1_mb():
    # 1048576 bytes == 1.0 MB
    codeflash_output = readable_bytes_string(1048576) # 1.02μs -> 891ns (14.1% faster)

def test_bytes_in_mb_range():
    # 2 MB
    codeflash_output = readable_bytes_string(2 * 1048576) # 985ns -> 954ns (3.25% faster)
    # 1.5 MB
    codeflash_output = readable_bytes_string(int(1.5 * 1048576)) # 486ns -> 481ns (1.04% faster)
    # 10 MB
    codeflash_output = readable_bytes_string(10 * 1048576) # 515ns -> 522ns (1.34% slower)
    # 1234567 bytes
    expected = "%.1f MB" % (1234567 / 1048576)
    codeflash_output = readable_bytes_string(1234567) # 309ns -> 296ns (4.39% faster)

# --- Edge Test Cases ---

def test_bytes_just_below_and_above_thresholds():
    # Just below 1 kB
    codeflash_output = readable_bytes_string(1023) # 657ns -> 677ns (2.95% slower)
    # Exactly 1 kB
    codeflash_output = readable_bytes_string(1024) # 842ns -> 797ns (5.65% faster)
    # Just above 1 kB
    expected = "%.1f kB" % (1025 / 1024)
    codeflash_output = readable_bytes_string(1025) # 420ns -> 392ns (7.14% faster)
    # Just below 1 MB
    codeflash_output = readable_bytes_string(1048575) # 541ns -> 504ns (7.34% faster)
    # Exactly 1 MB
    codeflash_output = readable_bytes_string(1048576) # 361ns -> 333ns (8.41% faster)
    # Just above 1 MB
    expected = "%.1f MB" % (1048577 / 1048576)
    codeflash_output = readable_bytes_string(1048577) # 334ns -> 317ns (5.36% faster)

def test_negative_bytes():
    # Negative bytes should be formatted as B
    codeflash_output = readable_bytes_string(-1) # 658ns -> 670ns (1.79% slower)
    codeflash_output = readable_bytes_string(-1024) # 340ns -> 368ns (7.61% slower)
    codeflash_output = readable_bytes_string(-1048576) # 293ns -> 308ns (4.87% slower)

def test_non_integer_input():
    # Float input less than 1 kB
    codeflash_output = readable_bytes_string(123.45) # 917ns -> 863ns (6.26% faster)
    # Float input in kB range
    codeflash_output = readable_bytes_string(1500.5) # 1.04μs -> 1.06μs (1.79% slower)
    # Float input in MB range
    codeflash_output = readable_bytes_string(2_500_000.75) # 405ns -> 431ns (6.03% slower)

def test_large_integer_just_below_100mb():
    # 100MB in bytes: 104857600
    codeflash_output = readable_bytes_string(104857599) # 1.28μs -> 1.21μs (6.29% faster)
    # Exactly 100MB
    codeflash_output = readable_bytes_string(104857600) # 435ns -> 477ns (8.81% slower)

def test_large_integer_above_100mb():
    # 150MB
    codeflash_output = readable_bytes_string(157286400) # 1.04μs -> 1.00μs (3.59% faster)
    # 999MB
    codeflash_output = readable_bytes_string(1048576 * 999) # 563ns -> 541ns (4.07% faster)

def test_type_error_on_invalid_input():
    # String input should raise TypeError
    with pytest.raises(TypeError):
        readable_bytes_string("1000") # 1.27μs -> 1.29μs (1.62% slower)
    # None input should raise TypeError
    with pytest.raises(TypeError):
        readable_bytes_string(None) # 822ns -> 817ns (0.612% faster)
    # List input should raise TypeError
    with pytest.raises(TypeError):
        readable_bytes_string([1024]) # 571ns -> 573ns (0.349% slower)

def test_zero_bytes():
    # Zero bytes should be formatted as "0 B"
    codeflash_output = readable_bytes_string(0) # 735ns -> 684ns (7.46% faster)

def test_extremely_large_number():
    # Largest 64-bit signed integer
    max_int = 9223372036854775807
    expected = "%.1f MB" % (max_int / 1048576)
    codeflash_output = readable_bytes_string(max_int) # 995ns -> 657ns (51.4% faster)

# --- Large Scale Test Cases ---

def test_many_sequential_sizes():
    # Test a sequence of sizes from 0 to 999; every value is below 1 kB,
    # so only the byte-range branch is exercised
    for i in range(0, 1000):
        codeflash_output = readable_bytes_string(i)

def test_many_kb_sizes():
    # Test a sequence of kB sizes from 1024 to 1048575 (just below 1MB)
    for i in range(1024, 1048576, 1024*10):  # step by 10kB
        expected = "%.1f kB" % (i / 1024)
        codeflash_output = readable_bytes_string(i) # 34.0μs -> 30.9μs (9.91% faster)

def test_many_mb_sizes():
    # Test a sequence of MB sizes from 1MB to 100MB
    for i in range(1048576, 104857600, 1048576):  # step by 1MB
        expected = "%.1f MB" % (i / 1048576)
        codeflash_output = readable_bytes_string(i) # 31.3μs -> 28.6μs (9.71% faster)

def test_performance_large_inputs():
    # Check that the function runs efficiently for large input
    # (not a strict timing test, but ensures no exceptions for large input)
    for i in range(10**7, 10**8, 10**7):  # 10MB to 100MB, step by 10MB
        codeflash_output = readable_bytes_string(i) # 3.85μs -> 3.62μs (6.27% faster)

def test_float_inputs_large_scale():
    # Test float values in kB and MB ranges
    for i in range(1, 1000, 100):
        bytes_val = float(i) * 1024  # kB range
        expected = "%.1f kB" % (bytes_val / 1024)
        codeflash_output = readable_bytes_string(bytes_val) # 3.77μs -> 3.70μs (1.81% faster)
    for i in range(1, 100, 10):
        bytes_val = float(i) * 1048576  # MB range
        expected = "%.1f MB" % (bytes_val / 1048576)
        codeflash_output = readable_bytes_string(bytes_val) # 3.04μs -> 3.01μs (0.864% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from aiplatform.tensorboard.upload_tracker import readable_bytes_string

# unit tests

# -----------------------
# 1. Basic Test Cases
# -----------------------

def test_bytes_under_1kb():
    # Test for bytes less than 1024 (1 kB)
    codeflash_output = readable_bytes_string(0) # 705ns -> 673ns (4.75% faster)
    codeflash_output = readable_bytes_string(1) # 377ns -> 405ns (6.91% slower)
    codeflash_output = readable_bytes_string(512) # 259ns -> 267ns (3.00% slower)
    codeflash_output = readable_bytes_string(1023) # 236ns -> 239ns (1.26% slower)

def test_exactly_1kb_and_above():
    # Test for bytes at and above 1 kB but below 1 MB
    codeflash_output = readable_bytes_string(1024) # 1.22μs -> 1.21μs (1.07% faster)
    codeflash_output = readable_bytes_string(1536) # 615ns -> 528ns (16.5% faster)
    codeflash_output = readable_bytes_string(2047) # 407ns -> 418ns (2.63% slower)
    codeflash_output = readable_bytes_string(4096) # 333ns -> 311ns (7.07% faster)

def test_exactly_1mb_and_above():
    # Test for bytes at and above 1 MB
    codeflash_output = readable_bytes_string(2**20) # 959ns -> 904ns (6.08% faster)
    codeflash_output = readable_bytes_string(2**20 + 512*1024) # 567ns -> 505ns (12.3% faster)
    codeflash_output = readable_bytes_string(2**20 * 2) # 319ns -> 318ns (0.314% faster)
    codeflash_output = readable_bytes_string(5 * 2**20) # 305ns -> 293ns (4.10% faster)

# -----------------------
# 2. Edge Test Cases
# -----------------------

def test_negative_bytes():
    # Negative values: should be handled as-is, returning "<n> B"
    codeflash_output = readable_bytes_string(-1) # 638ns -> 619ns (3.07% faster)
    codeflash_output = readable_bytes_string(-1024) # 338ns -> 333ns (1.50% faster)
    codeflash_output = readable_bytes_string(-1048576) # 283ns -> 300ns (5.67% slower)

def test_boundary_values():
    # Values exactly at boundaries
    codeflash_output = readable_bytes_string(1023) # 564ns -> 559ns (0.894% faster)
    codeflash_output = readable_bytes_string(1024) # 1.06μs -> 1.02μs (4.82% faster)
    codeflash_output = readable_bytes_string(1048575) # 652ns -> 638ns (2.19% faster)
    codeflash_output = readable_bytes_string(1048576) # 369ns -> 361ns (2.22% faster)

def test_float_input():
    # Should handle float input gracefully (cast to float for calculation)
    codeflash_output = readable_bytes_string(1024.0) # 1.03μs -> 1.11μs (7.73% slower)
    codeflash_output = readable_bytes_string(1048576.0) # 458ns -> 435ns (5.29% faster)
    codeflash_output = readable_bytes_string(1536.5) # 457ns -> 493ns (7.30% slower)

def test_large_non_mb_values():
    # Large values that are not a round MB/kB
    codeflash_output = readable_bytes_string(123456) # 1.18μs -> 1.14μs (2.89% faster)
    codeflash_output = readable_bytes_string(6543210) # 567ns -> 512ns (10.7% faster)

def test_unusual_types():
    # Accepts ints and floats, but not strings or other types
    with pytest.raises(TypeError):
        readable_bytes_string("1024") # 1.35μs -> 1.37μs (1.17% slower)
    with pytest.raises(TypeError):
        readable_bytes_string(None) # 823ns -> 756ns (8.86% faster)
    with pytest.raises(TypeError):
        readable_bytes_string([1024]) # 539ns -> 567ns (4.94% slower)

def test_rounding_behavior():
    # Check rounding to one decimal place for kB and MB
    codeflash_output = readable_bytes_string(1100) # 1.27μs -> 1.25μs (1.20% faster)
    codeflash_output = readable_bytes_string(1150) # 552ns -> 496ns (11.3% faster)
    codeflash_output = readable_bytes_string(1177) # 338ns -> 334ns (1.20% faster)
    codeflash_output = readable_bytes_string(1200) # 354ns -> 349ns (1.43% faster)

# -----------------------
# 3. Large Scale Test Cases
# -----------------------

def test_very_large_values():
    # Test for large values up to 100 MB
    codeflash_output = readable_bytes_string(50 * 2**20) # 1.11μs -> 1.05μs (5.71% faster)
    codeflash_output = readable_bytes_string(99 * 2**20) # 569ns -> 553ns (2.89% faster)
    codeflash_output = readable_bytes_string(100 * 2**20) # 345ns -> 332ns (3.92% faster)

def test_large_range_of_values():
    # Test values from 0 up to 999 kB in 1 kB steps to check consistency;
    # all values stay below 2**20, so the MB branch below is never reached
    for b in range(0, 1024000, 1024):
        if b < 1024:
            expected = f"{b} B"
        elif b < 2**20:
            expected = "%.1f kB" % (float(b) / 2**10)
        else:
            expected = "%.1f MB" % (float(b) / 2**20)
        codeflash_output = readable_bytes_string(b) # 312μs -> 290μs (7.35% faster)

def test_maximum_allowed_size():
    # Test a large round value (100 MB); other tests pass larger values, so this is not a hard limit
    max_bytes = 100 * 2**20  # 100 MB
    codeflash_output = readable_bytes_string(max_bytes) # 1.56μs -> 1.40μs (11.4% faster)

def test_performance_on_large_list():
    # Test performance and correctness for a list of values up to 1000 elements
    test_values = [i * 1024 for i in range(1000)]  # 0 kB to 999 kB
    for val in test_values:
        # For each value, check that output is correct
        if val < 1024:
            expected = f"{val} B"
        else:
            expected = "%.1f kB" % (float(val) / 2**10)
        # MB not reached in this range
        codeflash_output = readable_bytes_string(val) # 311μs -> 289μs (7.71% faster)

def test_incremental_mb_values():
    # Check that MB formatting is correct for increments up to 100MB
    for i in range(1, 101):  # 1 MB to 100 MB
        val = i * 2**20
        expected = "%.1f MB" % (float(val) / 2**20)
        codeflash_output = readable_bytes_string(val) # 31.4μs -> 29.2μs (7.80% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
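
The comparison mentioned in the comment above can be pictured roughly as follows. Codeflash's internals are not shown in this PR, so this is only a hypothetical sketch of the idea: run both versions on the same inputs and compare either the return value or the type of exception raised. The helper names `capture` and `find_mismatches` are invented for illustration, and `original` and `optimized` are reconstructions with an assumed `"%d B"` format.

```python
def original(n):
    # Reconstructed original version (assumption, not the repository code).
    if n >= 2**20:
        return "%.1f MB" % (float(n) / 2**20)
    elif n >= 2**10:
        return "%.1f kB" % (float(n) / 2**10)
    return "%d B" % n


def optimized(n):
    # Reconstructed optimized version (assumption, not the repository code).
    if n >= 1048576:
        return "%.1f MB" % (n / 1048576)
    elif n >= 1024:
        return "%.1f kB" % (n / 1024)
    return "%d B" % n


def capture(func, value):
    # Record either the return value or the name of the exception type raised.
    try:
        return ("ok", func(value))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def find_mismatches(inputs):
    # Inputs for which the two versions behave differently.
    return [v for v in inputs if capture(original, v) != capture(optimized, v)]


inputs = [0, 1023, 1024, 1048576, 2_500_000.75, -1, "1000", None, [1024]]
mismatches = find_mismatches(inputs)
print("mismatches:", mismatches)
```

Comparing exception types as well as return values lets invalid inputs such as strings or None be checked in the same pass as valid ones.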

To edit these changes git checkout codeflash/optimize-readable_bytes_string-mglqgggz and push.

@codeflash-ai codeflash-ai Bot requested a review from mashraf-222 October 11, 2025 03:46
@codeflash-ai codeflash-ai Bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 11, 2025