TST improve messages raised in check_classifier_multioutput #30235

adrinjalali · Nov 7, 2024

This PR improves the error messages raised in check_classifier_multioutput.

github-actions · Nov 7, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 75caa5b. Link to the linter CI: here

adrinjalali · Nov 7, 2024

sklearn/utils/estimator_checks.py

+        "The prediction for multioutput data is expected to be of the same type "
+        f"as the input y. Got dtype={y_pred.dtype}, dtype.kind={y_pred.dtype.kind} "
+        f"instead, while input was dtype={y.dtype}, dtype.kind={y.dtype.kind}."


I suppose this is what we're testing, but I'm not entirely sure.
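For reference, a minimal sketch of the kind of assertion this message could accompany. It assumes the check compares `dtype.kind` of the prediction against that of the input `y`, which the diff does not confirm; the helper name `check_pred_dtype` is hypothetical.

```python
import numpy as np


def check_pred_dtype(y, y_pred):
    # Hypothetical helper; assumes the check compares dtype.kind only.
    assert y_pred.dtype.kind == y.dtype.kind, (
        "The prediction for multioutput data is expected to be of the same type "
        f"as the input y. Got dtype={y_pred.dtype}, dtype.kind={y_pred.dtype.kind} "
        f"instead, while input was dtype={y.dtype}, dtype.kind={y.dtype.kind}."
    )


y = np.array([[0, 1], [1, 2]])

# Integer predictions for integer targets pass silently.
check_pred_dtype(y, y.copy())

# Float predictions for integer targets trigger the improved message.
try:
    check_pred_dtype(y, y.astype(float))
except AssertionError as exc:
    message = str(exc)
```

With this wording, a failing third-party estimator sees both the offending dtype and the expected one in a single message.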

adrinjalali · Nov 7, 2024

sklearn/utils/estimator_checks.py


    if hasattr(estimator, "decision_function"):
        decision = estimator.decision_function(X)
-        assert isinstance(decision, np.ndarray)
+        assert isinstance(decision, np.ndarray), (


So here we're checking the actual type of the output of decision_function. But this seems wrong(?) when I think of the array API work. WDYT @betatim @ogrisel @OmarManzoor

Maybe we could check for xp.array?
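One possible shape for a less NumPy-specific check, as a rough sketch only. It assumes that "array" means either a NumPy `ndarray` or any object exposing the `__array_namespace__` method defined by the array API standard; the helper name `looks_like_array` is hypothetical and is not an existing scikit-learn utility.

```python
import numpy as np


def looks_like_array(obj):
    # Hypothetical helper: accept NumPy arrays, plus any object that
    # advertises array API support through __array_namespace__.
    return isinstance(obj, np.ndarray) or hasattr(obj, "__array_namespace__")


decision = np.array([[0.3, -1.2], [-0.5, 2.0]])
assert looks_like_array(decision), (
    f"decision_function is expected to return an array, got {type(decision)}."
)
```

A plain Python list would be rejected by this check, while arrays from other array API libraries would be accepted.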

adrinjalali · Nov 7, 2024

sklearn/utils/estimator_checks.py

-        dec_pred = (decision > 0).astype(int)
-        dec_exp = estimator.classes_[dec_pred]
-        assert_array_equal(dec_exp, y_pred)


This part is just odd to me. estimator.classes_=[0, 1, 2], and we do some fancy indexing using the results from the decision function. I can't imagine a case where this could fail or be meaningful, especially since the value 2 from classes_ would never be chosen here.
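A small illustration of the point above, using made-up decision values: thresholding at zero only ever yields the indices 0 and 1, so the third class can never be selected by the removed lines.

```python
import numpy as np

# classes_ as described in the comment; the decision values are made up.
classes_ = np.array([0, 1, 2])
decision = np.array([[-1.5, 0.2], [3.0, -0.1], [0.0, 4.2]])

# The boolean mask cast to int can only contain 0 and 1.
dec_pred = (decision > 0).astype(int)

# Fancy indexing therefore only ever reaches classes_[0] and classes_[1].
dec_exp = classes_[dec_pred]
selected = np.unique(dec_exp)
```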

adrinjalali · Nov 7, 2024

sklearn/utils/estimator_checks.py

-                    np.argmax(y_prob[i], axis=1).astype(int), y_pred[:, i]
-                )
-        elif not tags.classifier_tags.poor_score:
+                if not tags.classifier_tags.poor_score:


Decoupling the shape check from the value check, and skipping only the value check if poor_score is set.
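Roughly, the decoupled structure looks like the sketch below. This is not the code from the PR: the function name `check_proba_outputs` and its parameters are hypothetical, `y_prob` is assumed to be a list with one `(n_samples, n_classes)` array per output, and the class labels are assumed to be `0..n_classes-1` so that `argmax` indices match the labels.

```python
import numpy as np


def check_proba_outputs(y_prob, y_pred, n_classes, poor_score=False):
    # Hypothetical helper illustrating the decoupled checks.
    n_samples, n_outputs = y_pred.shape
    assert len(y_prob) == n_outputs, (
        f"Expected one probability array per output ({n_outputs}), "
        f"got {len(y_prob)}."
    )
    for i in range(n_outputs):
        # The shape check always runs, regardless of poor_score.
        assert y_prob[i].shape == (n_samples, n_classes), (
            f"Expected shape {(n_samples, n_classes)} for output {i}, "
            f"got {y_prob[i].shape}."
        )
        # Only the value check is skipped for poorly scoring estimators.
        if not poor_score:
            np.testing.assert_array_equal(
                np.argmax(y_prob[i], axis=1).astype(int), y_pred[:, i]
            )


y_pred = np.array([[0, 1], [2, 0]])
y_prob = [
    np.array([[0.8, 0.1, 0.1], [0.1, 0.2, 0.7]]),
    np.array([[0.2, 0.7, 0.1], [0.6, 0.3, 0.1]]),
]
check_proba_outputs(y_prob, y_pred, n_classes=3)
```

With this layout, an estimator tagged with poor_score still fails on a wrongly shaped predict_proba output, which the previous `elif` branch would have let through.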

adrinjalali · Nov 8, 2024

@glemaitre I think you've worked on this quite a bit in the past.

TST improve messages raised in check_classifier_multioutput

75caa5b

adrinjalali added the Developer API (Third party developer API related) and No Changelog Needed labels Nov 7, 2024

github-actions bot added the module:utils label Nov 7, 2024


dokterbob mentioned this pull request Mar 21, 2025

[Mathijs] Task 13 (Augment): TST improve error messages in _check_transformer Agent-Benchmarking/scikit-learn#10

Open


OmarManzoor Nov 13, 2024
