⚡️ Speed up function _get_closest_match_prebuilt_container_uri by 139% #47

Open
codeflash-ai[bot] wants to merge 1 commit into codeflash-ai/python-aiplatform:main from codeflash-ai/python-aiplatform:codeflash/optimize-_get_closest_match_prebuilt_container_uri-mglkps8i

codeflash-ai bot commented on Oct 11, 2025

📄 139% (1.39x) speedup for _get_closest_match_prebuilt_container_uri in google/cloud/aiplatform/helpers/container_uri_builders.py

⏱️ Runtime : 1.45 milliseconds → 608 microseconds (best of 108 runs)
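
As a quick check on the headline figure, the arithmetic can be sketched as below. It assumes the speedup is defined as (original - optimized) / optimized, which is the reading under which "139%" and "1.39x" agree; the runtimes are the rounded values reported above, so the result lands at about 138.5% rather than exactly 139%.

```python
# Runtimes reported above, both expressed in microseconds.
original_us = 1450.0   # 1.45 milliseconds
optimized_us = 608.0   # 608 microseconds

# Assumed convention: time saved relative to the optimized runtime.
speedup = (original_us - optimized_us) / optimized_us

# Plain ratio of old runtime to new runtime, for comparison.
ratio = original_us / optimized_us

print(f"speedup: {speedup:.1%}")
print(f"runtime ratio: {ratio:.2f}x")
```

Under this convention the optimized function runs in well under half the original time, even though the headline multiplier reads 1.39x.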

📝 Explanation and details

The optimized code achieves a 139% speedup through three key optimizations:

1. Early Return for Exact Matches (Fast Path)
Added a direct string lookup before any version parsing:

if framework_version in accelerator_map:
    return accelerator_map[framework_version]

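This skips version.Version() construction entirely when the requested version string is already a key in the map, which is described as the common case. In the generated tests below, most exact-match calls drop from roughly 45-77μs to about 1.7-2.9μs (around 2,000-3,900% faster); the two calls that resolve the region from the default configuration are about 770-960% faster.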
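
The effect of the fast path can be illustrated with a small self-contained sketch. Everything here is hypothetical: `parse_version` is a simplified tuple-based stand-in for version.Version, `find_uri` and `demo_map` are illustration names, and the fallback rule (smallest version at or above the request within the same major) is an assumption taken from the test comments, not the library's actual code.

```python
from typing import Dict, Optional, Tuple

# Counts how often a version string is parsed, to make the saving visible.
PARSE_CALLS = {"count": 0}


def parse_version(text: str) -> Tuple[int, ...]:
    """Hypothetical stand-in for version.Version: '2.3.1' -> (2, 3, 1)."""
    PARSE_CALLS["count"] += 1
    return tuple(int(part) for part in text.split("."))


def find_uri(accelerator_map: Dict[str, str], framework_version: str) -> Optional[str]:
    # Fast path: a plain string key lookup, with no parsing at all.
    if framework_version in accelerator_map:
        return accelerator_map[framework_version]

    # Slow path (assumed rule): smallest version >= requested in the same major.
    requested = parse_version(framework_version)
    parsed = [(parse_version(key), uri) for key, uri in accelerator_map.items()]
    higher = [
        (ver, uri) for ver, uri in parsed
        if ver[0] == requested[0] and ver >= requested
    ]
    return min(higher)[1] if higher else None


demo_map = {"2.3": "uri-for-2.3", "2.4": "uri-for-2.4"}
print(find_uri(demo_map, "2.4"), PARSE_CALLS["count"])    # exact match, nothing parsed
print(find_uri(demo_map, "2.3.1"), PARSE_CALLS["count"])  # falls through to parsing
```

The exact-match call never touches the parser, while the fallback call parses the request plus every key in the map.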

2. Reduced Dictionary Lookups
Cached intermediate dictionary lookups in variables:

region_map = URI_MAP.get(region)
framework_map = region_map.get(framework)  
accelerator_map = framework_map.get(accelerator)

This eliminates redundant nested dictionary traversals throughout the function.
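
A minimal sketch of the cached-lookup pattern follows. The function name `resolve_accelerator_map`, the error messages, and the sample map are assumptions made for illustration; only the idea of holding each intermediate dictionary in a local variable comes from the description above.

```python
from typing import Dict

UriMap = Dict[str, Dict[str, Dict[str, Dict[str, str]]]]


def resolve_accelerator_map(
    uri_map: UriMap, region: str, framework: str, accelerator: str
) -> Dict[str, str]:
    # Each level is fetched once and kept in a local, so later checks and
    # error messages never re-walk uri_map[region][framework][...].
    region_map = uri_map.get(region)
    if region_map is None:
        raise ValueError(f"Unsupported region: {region!r}")

    framework_map = region_map.get(framework)
    if framework_map is None:
        raise ValueError(f"Unsupported framework: {framework!r}")

    accelerator_map = framework_map.get(accelerator)
    if accelerator_map is None:
        raise ValueError(f"Unsupported accelerator: {accelerator!r}")

    return accelerator_map


sample_map: UriMap = {
    "us": {"sklearn": {"cpu": {"0.23": "us/sklearn/cpu/0.23"}}},
}
print(resolve_accelerator_map(sample_map, "us", "sklearn", "cpu"))
```

Checking each level for None right after the lookup also means a missing key surfaces as a ValueError rather than as an AttributeError on a chained .get() call.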

3. Optimized Version Comparison Loop
Replaced the expensive list comprehension that created all version.Version objects upfront:

# OLD: Creates ALL version objects first, on every call
available_version_list = [version.Version(v) for v in keys]
closest_version = min([v for v in available_version_list if condition])

# NEW: Runs only after the exact-match lookup misses; parses and filters in one pass
candidates = []
for ver_str in accelerator_map.keys():
    ver_obj = version.Version(ver_str)
    if condition:
        candidates.append(ver_obj)

The line profiler shows the original code spent 68.1% of time creating the full version list, while the optimized version only spends 48.1% creating version objects on-demand and exits early when possible.
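
The selection step can also be written as a single pass that keeps only the best candidate seen so far. This is a variant for illustration, not the loop from the PR: `parse_version` is a hypothetical tuple-based stand-in for version.Version, and the rule (closest version at or above the request within the same major) is assumed from the test comments.

```python
from typing import Dict, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Hypothetical stand-in for version.Version: '2.3.1' -> (2, 3, 1)."""
    return tuple(int(part) for part in text.split("."))


def closest_higher_in_major(
    accelerator_map: Dict[str, str], framework_version: str
) -> Optional[str]:
    requested = parse_version(framework_version)
    best_version: Optional[Tuple[int, ...]] = None
    best_key: Optional[str] = None

    for ver_str in accelerator_map:
        ver = parse_version(ver_str)
        # Skip other majors and anything below the requested version.
        if ver[0] != requested[0] or ver < requested:
            continue
        # Keep the smallest qualifying version seen so far.
        if best_version is None or ver < best_version:
            best_version, best_key = ver, ver_str

    return best_key


versions = {"2.3": "a", "2.4": "b", "2.5": "c", "3.0": "d"}
print(closest_higher_in_major(versions, "2.3.1"))
print(closest_higher_in_major(versions, "2.99"))
```

Note that every key is still parsed once in this path, so the saving over the original comes from avoiding the intermediate lists, not from parsing fewer versions.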

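These optimizations are most effective for exact version matches (described as the most common use case) and for smaller version lists, and they are intended to leave the function's behavior unchanged. The gains are concentrated in the exact-match path: in the generated tests, calls that fall through to version parsing are roughly unchanged (between 12% slower and 2% faster), and the calls that find no compatible version and raise ValueError became 35-40% slower (for example 47.5μs -> 79.5μs).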

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 24 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 95.2% |

🌀 Generated Regression Tests and Runtime
import warnings

# imports
import pytest  # used for our unit tests
from aiplatform.helpers.container_uri_builders import \
    _get_closest_match_prebuilt_container_uri
from packaging import version


# Mocks for dependencies
class DummyGlobalConfig:
    location = "us-central1"

class DummyInitializer:
    global_config = DummyGlobalConfig()

# Simulated container URI map for testing
# Structure: region -> framework -> accelerator -> version -> uri
DUMMY_URI_MAP = {
    "us": {
        "tensorflow": {
            "cpu": {
                "2.3": "us/tensorflow/cpu/2.3",
                "2.4": "us/tensorflow/cpu/2.4",
                "2.5": "us/tensorflow/cpu/2.5",
            },
            "gpu": {
                "2.3": "us/tensorflow/gpu/2.3",
                "2.4": "us/tensorflow/gpu/2.4",
            },
        },
        "xgboost": {
            "cpu": {
                "1.2": "us/xgboost/cpu/1.2",
                "1.3": "us/xgboost/cpu/1.3",
            }
        },
        "sklearn": {
            "cpu": {
                "0.23": "us/sklearn/cpu/0.23",
                "0.24": "us/sklearn/cpu/0.24",
            }
        },
    },
    "europe": {
        "tensorflow": {
            "cpu": {
                "2.4": "europe/tensorflow/cpu/2.4",
                "2.5": "europe/tensorflow/cpu/2.5",
            }
        }
    }
}

DUMMY_DOCS_URL = "https://dummy.docs/containers"
# NOTE: the dummy classes and maps above are defined for reference only; nothing
# shown in this file installs them into the module under test.

# -------------------- UNIT TESTS --------------------

# BASIC TEST CASES

def test_exact_version_match_cpu():
    # Test exact match for tensorflow 2.4 cpu in us region
    codeflash_output = _get_closest_match_prebuilt_container_uri("tensorflow", "2.4", "us", "cpu") # 76.9μs -> 2.79μs (2661% faster)

def test_exact_version_match_gpu():
    # Test exact match for tensorflow 2.3 gpu in us region
    codeflash_output = _get_closest_match_prebuilt_container_uri("tensorflow", "2.3", "us", "gpu") # 67.8μs -> 1.97μs (3347% faster)

def test_case_insensitive_framework():
    # Framework should be case-insensitive
    codeflash_output = _get_closest_match_prebuilt_container_uri("TensorFlow", "2.4", "us", "cpu") # 67.0μs -> 1.78μs (3664% faster)

def test_default_region_from_initializer():
    # Should use DummyInitializer.global_config.location ("us-central1") -> "us"
    codeflash_output = _get_closest_match_prebuilt_container_uri("xgboost", "1.2") # 56.1μs -> 6.48μs (766% faster)

def test_default_accelerator():
    # Should use default accelerator "cpu"
    codeflash_output = _get_closest_match_prebuilt_container_uri("sklearn", "0.23", "us") # 45.1μs -> 1.86μs (2322% faster)

def test_exact_match_europe_region():
    # Should work for europe region
    codeflash_output = _get_closest_match_prebuilt_container_uri("tensorflow", "2.4", "europe", "cpu") # 66.2μs -> 1.82μs (3535% faster)

# EDGE TEST CASES

def test_version_not_found_raises():
    # Version not present and no higher version in same major
    with pytest.raises(ValueError) as excinfo:
        _get_closest_match_prebuilt_container_uri("sklearn", "0.25", "us", "cpu") # 47.5μs -> 79.5μs (40.2% slower)

def test_accelerator_not_supported_raises():
    # Accelerator not present for framework
    with pytest.raises(ValueError) as excinfo:
        _get_closest_match_prebuilt_container_uri("sklearn", "0.23", "us", "gpu") # 3.08μs -> 3.04μs (1.22% faster)



def test_closest_higher_version_selected_and_warned():
    # Should select closest higher version in same major and warn
    with warnings.catch_warnings(record=True) as w:
        warnings.simplefilter("always")
        codeflash_output = _get_closest_match_prebuilt_container_uri("tensorflow", "2.3.1", "us", "cpu"); uri = codeflash_output # 85.4μs -> 84.6μs (0.959% faster)


def test_version_lower_than_all_available():
    # Should pick the lowest available version in same major if input is lower than all
    codeflash_output = _get_closest_match_prebuilt_container_uri("xgboost", "1.1", "us", "cpu") # 62.1μs -> 2.76μs (2152% faster)


def test_non_numeric_version_string():
    # Should handle a version string with an explicit patch component ("2.3.0")
    codeflash_output = _get_closest_match_prebuilt_container_uri("tensorflow", "2.3.0", "us", "cpu") # 76.5μs -> 76.7μs (0.141% slower)

def test_region_with_hyphen():
    # Should split region correctly
    codeflash_output = _get_closest_match_prebuilt_container_uri("tensorflow", "2.4", "us-central1", "cpu") # 66.9μs -> 2.17μs (2985% faster)

# LARGE SCALE TEST CASES




#------------------------------------------------
import warnings
from typing import Optional

# imports
import pytest  # used for our unit tests
from aiplatform.helpers.container_uri_builders import \
    _get_closest_match_prebuilt_container_uri
# --- FUNCTION UNDER TEST ---
from packaging import version


# --- MOCK DEPENDENCIES ---
# Mock for google.cloud.aiplatform.initializer
class MockGlobalConfig:
    location = "us-central1"

class MockInitializer:
    global_config = MockGlobalConfig()

# Mock for google.cloud.aiplatform.constants.prediction
class MockPrediction:
    _SERVING_CONTAINER_DOCUMENTATION_URL = "https://container-docs.example.com"
    # Example container URI map for testing
    _SERVING_CONTAINER_URI_MAP = {
        "us": {
            "tensorflow": {
                "cpu": {
                    "2.3": "us-tf-cpu-2.3-uri",
                    "2.4": "us-tf-cpu-2.4-uri",
                    "2.5": "us-tf-cpu-2.5-uri",
                    "3.0": "us-tf-cpu-3.0-uri",
                },
                "gpu": {
                    "2.3": "us-tf-gpu-2.3-uri",
                    "2.5": "us-tf-gpu-2.5-uri",
                },
            },
            "sklearn": {
                "cpu": {
                    "0.22": "us-sk-cpu-0.22-uri",
                    "0.23": "us-sk-cpu-0.23-uri",
                }
            }
        },
        "europe": {
            "xgboost": {
                "cpu": {
                    "1.0": "eu-xgb-cpu-1.0-uri",
                    "1.2": "eu-xgb-cpu-1.2-uri",
                }
            }
        }
    }

# NOTE: these assignments only bind names in this test module; on their own they
# do not replace the initializer or prediction modules used by the function under test.
initializer = MockInitializer
prediction = MockPrediction

# --- UNIT TESTS ---

# ----------- BASIC TEST CASES -----------

def test_exact_match_basic_cpu():
    # Test exact version match for TensorFlow CPU in US region
    codeflash_output = _get_closest_match_prebuilt_container_uri("TensorFlow", "2.4", "us", "cpu"); uri = codeflash_output # 75.6μs -> 2.88μs (2524% faster)

def test_exact_match_basic_gpu():
    # Test exact version match for TensorFlow GPU in US region
    codeflash_output = _get_closest_match_prebuilt_container_uri("TensorFlow", "2.5", "us", "gpu"); uri = codeflash_output # 67.5μs -> 1.99μs (3292% faster)

def test_case_insensitive_framework():
    # Framework name should be case-insensitive
    codeflash_output = _get_closest_match_prebuilt_container_uri("SKLEARN", "0.23", "us", "cpu"); uri = codeflash_output # 45.9μs -> 1.88μs (2344% faster)

def test_default_region_from_initializer():
    # Should use initializer.global_config.location if region not provided
    codeflash_output = _get_closest_match_prebuilt_container_uri("TensorFlow", "2.5", None, "cpu"); uri = codeflash_output # 69.4μs -> 6.53μs (963% faster)

def test_default_accelerator_cpu():
    # Should default to CPU if accelerator not provided
    codeflash_output = _get_closest_match_prebuilt_container_uri("TensorFlow", "2.3", "us"); uri = codeflash_output # 66.7μs -> 1.79μs (3631% faster)

# ----------- EDGE TEST CASES -----------

def test_version_round_up_to_next_patch():
    # Should round up to the next available version in the same major
    with warnings.catch_warnings(record=True) as w:
        codeflash_output = _get_closest_match_prebuilt_container_uri("TensorFlow", "2.2", "us", "cpu"); uri = codeflash_output # 66.1μs -> 1.66μs (3884% faster)

def test_version_round_up_to_next_minor():
    # Should round up to next available minor version in same major
    with warnings.catch_warnings(record=True) as w:
        codeflash_output = _get_closest_match_prebuilt_container_uri("TensorFlow", "2.4.1", "us", "cpu"); uri = codeflash_output # 77.2μs -> 87.8μs (12.1% slower)


def test_version_too_high():
    # Should raise if requested version is higher than any available
    with pytest.raises(ValueError) as excinfo:
        _get_closest_match_prebuilt_container_uri("TensorFlow", "2.99", "us", "cpu") # 77.4μs -> 119μs (35.4% slower)


def test_accelerator_not_supported():
    # Should raise if accelerator is not supported for framework
    with pytest.raises(ValueError) as excinfo:
        _get_closest_match_prebuilt_container_uri("sklearn", "0.22", "us", "gpu") # 3.89μs -> 3.89μs (0.000% faster)


def test_version_string_with_leading_zeros():
    # Should handle a version string with a trailing zero patch component ("0.22.0")
    codeflash_output = _get_closest_match_prebuilt_container_uri("sklearn", "0.22.0", "us", "cpu"); uri = codeflash_output # 56.4μs -> 56.0μs (0.625% faster)

def test_version_string_with_extra_patch():
    # Should round up if patch version is not available
    with warnings.catch_warnings(record=True) as w:
        codeflash_output = _get_closest_match_prebuilt_container_uri("sklearn", "0.22.1", "us", "cpu"); uri = codeflash_output # 57.4μs -> 56.1μs (2.24% faster)

def test_region_with_dash():
    # Should correctly parse region with dash (e.g., "us-central1")
    codeflash_output = _get_closest_match_prebuilt_container_uri("TensorFlow", "2.3", "us-central1", "cpu"); uri = codeflash_output # 67.7μs -> 2.09μs (3134% faster)
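
The dummy maps in the tests above are assigned to names in the test module but are not shown being installed into the code under test. One way such a map could be swapped in for the duration of a test is `unittest.mock.patch.dict`. The sketch below is self-contained and uses a hypothetical module-level `URI_MAP` and `lookup` function in place of the real constants and function, so it illustrates the patching pattern only.

```python
from unittest.mock import patch

# Hypothetical stand-in for a module-level container URI map.
URI_MAP = {"us": {"tensorflow": {"cpu": {"2.4": "real-us-tf-cpu-2.4"}}}}

# Test data that should replace the real map only while a test runs.
DUMMY_URI_MAP = {"us": {"tensorflow": {"cpu": {"2.4": "dummy-us-tf-cpu-2.4"}}}}


def lookup(region: str, framework: str, accelerator: str, version_key: str) -> str:
    # Reads the module-level map at call time, so patching the dict takes effect.
    return URI_MAP[region][framework][accelerator][version_key]


def test_lookup_uses_patched_map() -> None:
    # clear=True empties the dict before applying the dummy contents;
    # the original contents are restored when the block exits.
    with patch.dict(URI_MAP, DUMMY_URI_MAP, clear=True):
        assert lookup("us", "tensorflow", "cpu", "2.4") == "dummy-us-tf-cpu-2.4"
    assert lookup("us", "tensorflow", "cpu", "2.4") == "real-us-tf-cpu-2.4"


test_lookup_uses_patched_map()
```

With the map patched this way, test expectations depend only on the dummy data, which also makes it clearer which tests exercise the exact-match path and which exercise the fallback.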

To edit these changes, run git checkout codeflash/optimize-_get_closest_match_prebuilt_container_uri-mglkps8i, commit your edits, and push.

codeflash-ai bot requested a review from mashraf-222 on October 11, 2025, 01:05
codeflash-ai bot added the ⚡️ codeflash label (Optimization PR opened by Codeflash AI) on Oct 11, 2025