Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

FIX LogisticRegressionCV.score and _BaseScorer metadata routing #30859

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
Loading
from

Conversation

adrinjalali
Copy link
Member

Fixes #30817

Two issues fixed in this PR:

  • LogisticRegressionCV had a sample_weight arg in its score, which makes it a consumer of it, while being a router. This PR removes sample_weight as a consumer arg from that method
  • _BaseScorer wasn't implementing a get_metadata_routing and as a result the default implementation wasn't correctly detecting sample_weight in the __call__ signature of those scorers

Needs tests and refining the error message regarding scorer.__call__

cc @Dalesrox, @antoinebaker

Copy link

github-actions bot commented Feb 19, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 76c7e31. Link to the linter CI: here

@StefanieSenger
Copy link
Contributor

StefanieSenger commented Feb 19, 2025

How did you discover these bugs?

Edit: he it was related to the issue, sorry didn't see in time.

err_msg = re.escape(
"[sample_weight] are passed but are not explicitly set as requested or not"
" requested for _Scorer.score, which is used within test.score. Call"
" `_Scorer.set_score_request({metadata}=True/False)` for each"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can replace _Scorer with __repr__ once #30946 is merged.

@@ -1762,6 +1764,8 @@ class LogisticRegressionCV(LogisticRegression, LinearClassifierMixin, BaseEstima
0.98...
"""

# TODO(1.9): remove this when sample_weight is removed from the `score` signature
__metadata_request__score = {"sample_weight": metadata_routing.UNUSED}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is to avoid set_score_request be present on this class

@@ -2262,18 +2262,18 @@ def test_lr_cv_scores_differ_when_sample_weight_is_requested():
sample_weight[: len(y) // 2] = 2
kwargs = {"sample_weight": sample_weight}

scorer1 = get_scorer("accuracy")
scorer1 = get_scorer("accuracy").set_score_request(sample_weight=False)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that scorers properly request their routing, this is required since we're passing sample weight bellow.

Comment on lines +375 to +376
score_method=self._score_func,
ignore_params={"y_true", "y_pred"},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we explicitly pass the _score_func so that the right metadata can be deducted from its signature.

And we need to ignore y_true and y_pred in that process.

Copy link
Contributor

@antoinebaker antoinebaker Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to ignore y_prob, y_proba, y_score, labels_true, labels_pred, pred_decision as well ? (Some of the various names the first two args of a score function can have). Maybe an easier way to skip them all would be to ignore the first two positional arguments of the score function ?

@adrinjalali adrinjalali marked this pull request as ready for review March 10, 2025 20:48
@adrinjalali
Copy link
Member Author

@OmarManzoor @antoinebaker this is ready for review now.

Copy link
Contributor

@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @adrinjalali

Comment on lines +2213 to +2214
# TODO(1.9): remove this when sample_weight is removed from the `score` signature
# and remove `sample_weight` from the `score` signature
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make it clear

Suggested change
# TODO(1.9): remove this when sample_weight is removed from the `score` signature
# and remove `sample_weight` from the `score` signature
# TODO(1.9): remove this decorator along with `sample_weight` from the `score`
# signature

@@ -2231,13 +2242,14 @@ def score(self, X, y, sample_weight=None, **score_params):
Score of self.predict(X) w.r.t. y.
"""
_raise_for_params(score_params, self, "score")
if sample_weight is not None:
score_params["sample_weight"] = sample_weight
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we intend on removing sample_weight then shouldn't we also update the condition where routing is not enabled to be:
if "sample_weight " in score_params: instead of if sample_weight is not None:

Copy link
Contributor

@antoinebaker antoinebaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A first round of reviews, but from my limited understanding of the metadata routing API, I probably don't get all the logic right 😉

Comment on lines +375 to +376
score_method=self._score_func,
ignore_params={"y_true", "y_pred"},
Copy link
Contributor

@antoinebaker antoinebaker Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to ignore y_prob, y_proba, y_score, labels_true, labels_pred, pred_decision as well ? (Some of the various names the first two args of a score function can have). Maybe an easier way to skip them all would be to ignore the first two positional arguments of the score function ?

Comment on lines +2275 to +2276
# sample_weight cannot be passed to lr_cv1.score since it's unrequested.
score_1 = lr_cv1.score(X_t, y_t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could test here that lr_cv1.score(X_t, y_t, **kwargs) raises the appropriate error.

Comment on lines +1473 to +1477
cls._build_request_for_signature(
method_name=method,
method_obj=score_method if score_method != "score" else None,
ignore_params=ignore_params,
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why the generic _get_default_requests and _build_request_for_signature need to be modified specifically for scorers ? Is it because in general we rely on the method signature, but here for scorers we instead rely on the scorer._scorer_func signature ?

Copy link
Contributor

@antoinebaker antoinebaker Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, I am wondering if we should rather redefine _get_default_requests and _build_request_for_signature in _BaseScorer (instead of changing the _MetadataRequestermixin).

Comment on lines +264 to +266
# TODO (1.9): remove in 1.9
@_deprecate_positional_args(version="1.9")
def __call__(self, estimator, X, y_true, *, sample_weight=None, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why do we need to deprecate sample_weight as a positional arg ? Is it directly related to this PR or is it a general guideline ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sample_weight is silently ignored in LogisticRegressionCV.score when metadata routing is enabled
4 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.