FIX `LogisticRegressionCV.score` and `_BaseScorer` metadata routing #30859

adrinjalali · Feb 19, 2025

Fixes #30817

Two issues fixed in this PR:

- LogisticRegressionCV had a sample_weight arg in its score, which made it a consumer of sample_weight while also being a router. This PR removes sample_weight as a consumer arg from that method.
- _BaseScorer wasn't implementing a get_metadata_routing, and as a result the default implementation wasn't correctly detecting sample_weight in the __call__ signature of those scorers.

Needs tests and refining the error message regarding scorer.__call__

cc @Dalesrox, @antoinebaker

github-actions · Feb 19, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 76c7e31. Link to the linter CI: here

StefanieSenger · Feb 19, 2025

How did you discover these bugs?

Edit: it was related to the issue; sorry, I didn't see that in time.

adrinjalali · Mar 10, 2025

sklearn/metrics/tests/test_score_objects.py

+        err_msg = re.escape(
+            "[sample_weight] are passed but are not explicitly set as requested or not"
+            " requested for _Scorer.score, which is used within test.score. Call"
+            " `_Scorer.set_score_request({metadata}=True/False)` for each"


we can replace _Scorer with __repr__ once #30946 is merged.

adrinjalali · Mar 10, 2025

sklearn/linear_model/_logistic.py

@@ -1762,6 +1764,8 @@ class LogisticRegressionCV(LogisticRegression, LinearClassifierMixin, BaseEstima
    0.98...
    """

+    # TODO(1.9): remove this when sample_weight is removed from the `score` signature
+    __metadata_request__score = {"sample_weight": metadata_routing.UNUSED}


This is to avoid set_score_request being present on this class.

adrinjalali · Mar 10, 2025

sklearn/linear_model/tests/test_logistic.py

@@ -2262,18 +2262,18 @@ def test_lr_cv_scores_differ_when_sample_weight_is_requested():
    sample_weight[: len(y) // 2] = 2
    kwargs = {"sample_weight": sample_weight}

-    scorer1 = get_scorer("accuracy")
+    scorer1 = get_scorer("accuracy").set_score_request(sample_weight=False)


Now that scorers properly request their routing, this is required since we're passing sample weight below.

adrinjalali · Mar 10, 2025

sklearn/metrics/_scorer.py

+                score_method=self._score_func,
+                ignore_params={"y_true", "y_pred"},


We explicitly pass the _score_func so that the right metadata can be deduced from its signature.

And we need to ignore y_true and y_pred in that process.

adrinjalali · Mar 10, 2025

@OmarManzoor @antoinebaker this is ready for review now.

OmarManzoor

Thank you for the PR @adrinjalali

OmarManzoor · Mar 11, 2025

sklearn/linear_model/_logistic.py

+    # TODO(1.9): remove this when sample_weight is removed from the `score` signature
+    # and remove `sample_weight` from the `score` signature


Just to make it clear

Suggested change

-    # TODO(1.9): remove this when sample_weight is removed from the `score` signature
-    # and remove `sample_weight` from the `score` signature
+    # TODO(1.9): remove this decorator along with `sample_weight` from the `score`
+    # signature

OmarManzoor · Mar 11, 2025

sklearn/linear_model/_logistic.py

@@ -2231,13 +2242,14 @@ def score(self, X, y, sample_weight=None, **score_params):
            Score of self.predict(X) w.r.t. y.
        """
        _raise_for_params(score_params, self, "score")
+        if sample_weight is not None:
+            score_params["sample_weight"] = sample_weight


Since we intend to remove sample_weight, shouldn't we also update the condition where routing is not enabled to be
`if "sample_weight" in score_params:` instead of `if sample_weight is not None:`?
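One Python detail bears on this suggestion: while `sample_weight` remains a named parameter of `score`, a keyword argument with that name binds to the parameter and is never collected into `**score_params`. The sketch below uses hypothetical stand-in functions (they are not the scikit-learn methods) to illustrate that behaviour.

```python
def score_with_named_arg(X, y, sample_weight=None, **score_params):
    """Hypothetical stand-in for a `score` that still has `sample_weight`."""
    # Python binds the keyword to the named parameter, so at this point it
    # has not been collected into **score_params.
    found_in_params = "sample_weight" in score_params
    if sample_weight is not None:
        score_params["sample_weight"] = sample_weight
    return found_in_params, score_params


def score_without_named_arg(X, y, **score_params):
    """Hypothetical stand-in for `score` once `sample_weight` is removed."""
    return "sample_weight" in score_params, score_params


weights = [1.0, 2.0]
before = score_with_named_arg(None, None, sample_weight=weights)
after = score_without_named_arg(None, None, sample_weight=weights)
```

Under these assumptions, a `"sample_weight" in score_params` check would only start to match once the named parameter has been removed from the signature.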

antoinebaker

A first round of reviews, but from my limited understanding of the metadata routing API, I probably don't get all the logic right 😉

antoinebaker · Mar 18, 2025

sklearn/metrics/_scorer.py

+                score_method=self._score_func,
+                ignore_params={"y_true", "y_pred"},


Do we need to ignore y_prob, y_proba, y_score, labels_true, labels_pred, pred_decision as well? (These are some of the names that the first two args of a score function can have.) Maybe an easier way to skip them all would be to ignore the first two positional arguments of the score function?
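The two options raised here can be compared with a small `inspect`-based sketch. The helper names and the toy metric functions are hypothetical and are not scikit-learn's implementation; they only illustrate filtering a score function's parameters by name versus by position.

```python
import inspect


def params_ignoring_names(func, ignore_params):
    """Return parameter names of `func`, dropping those listed by name."""
    return [
        name
        for name in inspect.signature(func).parameters
        if name not in ignore_params
    ]


def params_ignoring_first_two(func):
    """Return parameter names of `func`, dropping its first two positionals."""
    positional_kinds = (
        inspect.Parameter.POSITIONAL_ONLY,
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
    )
    var_kinds = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    names = []
    skipped = 0
    for param in inspect.signature(func).parameters.values():
        if param.kind in var_kinds:
            # *args / **kwargs do not name a specific piece of metadata.
            continue
        if param.kind in positional_kinds and skipped < 2:
            skipped += 1
            continue
        names.append(param.name)
    return names


# Toy metrics with differently named leading arguments (hypothetical).
def toy_accuracy(y_true, y_pred, *, sample_weight=None):
    ...


def toy_hinge(y_true, pred_decision, *, labels=None, sample_weight=None):
    ...


by_name = params_ignoring_names(toy_hinge, {"y_true", "y_pred"})
by_position = params_ignoring_first_two(toy_hinge)
```

In this toy setup, the name-based filter leaves `pred_decision` in the result for `toy_hinge`, while the position-based filter drops it without needing to know its name.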

antoinebaker · Mar 18, 2025

sklearn/linear_model/tests/test_logistic.py

+    # sample_weight cannot be passed to lr_cv1.score since it's unrequested.
+    score_1 = lr_cv1.score(X_t, y_t)


Maybe we could test here that lr_cv1.score(X_t, y_t, **kwargs) raises the appropriate error.

antoinebaker · Mar 18, 2025

sklearn/utils/_metadata_requests.py

+                cls._build_request_for_signature(
+                    method_name=method,
+                    method_obj=score_method if score_method != "score" else None,
+                    ignore_params=ignore_params,
+                ),


Out of curiosity, why do the generic _get_default_requests and _build_request_for_signature need to be modified specifically for scorers? Is it because in general we rely on the method signature, but here for scorers we instead rely on the scorer._score_func signature?

If so, I am wondering if we should rather redefine _get_default_requests and _build_request_for_signature in _BaseScorer (instead of changing the _MetadataRequester mixin).

antoinebaker · Mar 18, 2025

sklearn/metrics/_scorer.py

+    # TODO (1.9): remove in 1.9
+    @_deprecate_positional_args(version="1.9")
+    def __call__(self, estimator, X, y_true, *, sample_weight=None, **kwargs):


Out of curiosity, why do we need to deprecate sample_weight as a positional arg? Is it directly related to this PR or is it a general guideline?
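For context on what such a deprecation does mechanically, here is a simplified sketch. The decorator `warn_on_extra_positional` and the function `toy_scorer_call` are hypothetical stand-ins; the actual helper used in the diff may differ in behaviour and message.

```python
import functools
import inspect
import warnings


def warn_on_extra_positional(func):
    """Hypothetical decorator: warn when keyword-only args are passed positionally."""
    params = list(inspect.signature(func).parameters.values())
    positional = [p.name for p in params if p.kind == p.POSITIONAL_OR_KEYWORD]
    keyword_only = [p.name for p in params if p.kind == p.KEYWORD_ONLY]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        extra = args[len(positional):]
        if extra:
            # Map the surplus positional values onto the keyword-only names.
            names = keyword_only[: len(extra)]
            warnings.warn(
                f"Pass {names} as keyword arguments.", FutureWarning, stacklevel=2
            )
            kwargs.update(zip(names, extra))
            args = args[: len(positional)]
        return func(*args, **kwargs)

    return wrapper


@warn_on_extra_positional
def toy_scorer_call(estimator, X, y_true, *, sample_weight=None, **kwargs):
    return sample_weight


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = toy_scorer_call("est", "X", "y", [1, 2])
```

In this sketch, the old positional call style keeps working during the deprecation period but emits a `FutureWarning`, while keyword calls pass through silently.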

FIX LogisticRegressionCV.score and _BaseScorer metadata routing

e17acbb

github-actions bot added module:linear_model module:metrics module:utils labels Feb 19, 2025

adrinjalali mentioned this pull request Mar 4, 2025

Pipeline score asks to explicitly request sample_weight #30937

Open

adrinjalali added 4 commits March 10, 2025 18:53

...

38e18d6

...

8c3bf31

Merge remote-tracking branch 'upstream/main' into logregcv_score

21e4907

changelog

76c7e31

adrinjalali commented Mar 10, 2025

adrinjalali marked this pull request as ready for review March 10, 2025 20:48

OmarManzoor reviewed Mar 11, 2025

antoinebaker reviewed Mar 18, 2025
