ENH: Make brier_score_loss Array API compatible #31191

lithomas1 · Apr 13, 2025

Reference Issues/PRs

xref #26024
Depends on #30878

What does this implement/fix? Explain your changes.

Makes brier_score_loss Array API compatible.

Any other comments?

github-actions · Apr 13, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 4020016. Link to the linter CI: here

OmarManzoor

Thanks for the PR @lithomas1

OmarManzoor · Apr 14, 2025

sklearn/metrics/_classification.py

+        y_proba,
+        ensure_2d=False,
+        dtype=tuple(
+            xp.__array_namespace_info__().dtypes(kind="real floating").values()


Do we really need this change, aside from just replacing np with xp in the float data types? Is there some other float dtype that we want to support?

I did this since some libraries like PyTorch MPS don't support xp.float64.
(Also, float16 is not in the array API standard. Should we make a special exception for np.float16?)

Maybe it would be good to put this in a helper in the array API utils module?

So far, we used _find_matching_floating_dtype for this use case. We could update that utility to leverage __array_namespace_info__ as you did here.

We could also improve check_array to accept dtype="floating" and do device/namespace specific conversion when provided with integer inputs.
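A minimal sketch of the kind of helper suggested in this thread, assuming the namespace implements the array API inspection utilities. The name `supported_real_floating_dtypes` is hypothetical and is not an existing scikit-learn utility; the fallback branch is also an assumption, for namespaces that lack `__array_namespace_info__`.

```python
import numpy as np


def supported_real_floating_dtypes(xp, device=None):
    """Hypothetical helper: list the real floating dtypes a namespace offers.

    Prefers the array API inspection utilities and falls back to probing
    common attribute names when the namespace does not provide them.
    """
    if hasattr(xp, "__array_namespace_info__"):
        info = xp.__array_namespace_info__()
        return tuple(info.dtypes(device=device, kind="real floating").values())
    # Fallback (assumption): the namespace exposes float32/float64 as
    # attributes whenever it supports them.
    return tuple(
        getattr(xp, name) for name in ("float32", "float64") if hasattr(xp, name)
    )


float_dtypes = supported_real_floating_dtypes(np)
```

The resulting tuple could then be passed as the `dtype` argument of `check_array`, as the diff above does inline, so that only dtypes available for the given namespace and device are accepted.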

OmarManzoor · Apr 14, 2025

sklearn/metrics/_classification.py

+        transformed_labels = xp.asarray(transformed_labels, device=device)
+        y_proba = xp.asarray(y_proba, device=device)


Shouldn't these be on the device already? I think y_proba might be shifted to cpu because of the check_array function, but assuming that y_true and y_prob are on the expected device, transformed_labels should be on the device as well. Or is this just handling for the array-api-strict?

y_proba might be shifted to cpu because of the check_array

In which case could y_proba be shifted to cpu?

If it is possible that check_array alters the device, we probably need to call get_namespace_and_device before any check_array calls. For example, in _validate_binary_probabilistic_prediction we do a column_or_1d first, which calls check_array internally. I wonder if it would be good practice to just always do get_namespace_and_device first?

Or is this just handling for the array-api-strict?

Question, why would it be needed for array-api-strict?

OmarManzoor · Apr 14, 2025

sklearn/metrics/_classification.py

+
+    # If transformed_labels is integer array, cast it to the floating dtype of
+    # y_proba
+    transformed_labels = xp.astype(transformed_labels, y_proba.dtype, device=device)


Here we are again moving it to the device?
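For context on why the cast in the diff matters, here is a rough sketch of the binary Brier score written against a generic namespace `xp`. The function name `binary_brier_score` is hypothetical, and this is not the scikit-learn implementation: it ignores sample weights, label validation, and device handling. The labels are converted to the floating dtype of the probabilities before the subtraction so that both operands share one dtype.

```python
import numpy as np


def binary_brier_score(y_true, y_prob, pos_label=1, xp=np):
    """Hypothetical sketch: mean squared difference between labels and probabilities."""
    y_prob = xp.asarray(y_prob)
    # Boolean indicator of the positive class, cast to the floating dtype of
    # y_prob so the arithmetic below does not mix integer and float arrays.
    labels = xp.asarray(xp.asarray(y_true) == pos_label, dtype=y_prob.dtype)
    return float(xp.mean((labels - y_prob) ** 2))


score = binary_brier_score([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.3])
```

With these inputs the squared errors are 0.01, 0.01, 0.04 and 0.09, so the score is their mean, 0.0375.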

OmarManzoor · Apr 14, 2025

sklearn/utils/validation.py

-            or np.array_equal(classes, [1])
+        xp, _, device = get_namespace_and_device(y_true)
+        classes = xp.unique_values(y_true)
+        if (_is_numpy_namespace(xp) and classes.dtype.kind in "OUS") or not (


I seem to recall seeing a similar kind of change in another PR?

This is the part from #30878.

lucyleeow

Looks good but had some comments/questions. Thanks!

lucyleeow · May 21, 2025

doc/whats_new/upcoming_changes/array-api/31191.feature.rst

@@ -0,0 +1,2 @@
+- :func:`sklearn.metrics.brier_score_loss` now support Array API compatible inputs for the binary class case.


Suggested change

-- :func:`sklearn.metrics.brier_score_loss` now support Array API compatible inputs for the binary class case.
+- :func:`sklearn.metrics.brier_score_loss` now supports Array API compatible inputs for the binary class case.

nit

lucyleeow · May 21, 2025

sklearn/metrics/_classification.py

-    if y_prob.max() > 1:
+    if xp.max(y_prob) > 1:
        raise ValueError(f"y_prob contains values greater than 1: {y_prob.max()}")
-    if y_prob.min() < 0:
+    if xp.min(y_prob) < 0:


Not necessarily for this PR, but this code seems to be repeated four times in this module. Maybe we could refactor it out?
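One possible shape for such a refactor, as a sketch only: the helper name `check_probabilities_in_unit_interval` is hypothetical, and the wording of the lower-bound message is an assumption, since only the upper-bound message appears in the diff above.

```python
import numpy as np


def check_probabilities_in_unit_interval(y_prob, xp=np):
    """Hypothetical helper: raise ValueError if any value lies outside [0, 1]."""
    y_max = xp.max(y_prob)
    if y_max > 1:
        raise ValueError(f"y_prob contains values greater than 1: {float(y_max)}")
    y_min = xp.min(y_prob)
    if y_min < 0:
        # Message wording is assumed; it mirrors the upper-bound message.
        raise ValueError(f"y_prob contains values less than 0: {float(y_min)}")


# Valid probabilities pass silently.
check_probabilities_in_unit_interval(np.asarray([0.0, 0.5, 1.0]))
```

Each of the repeated call sites could then be reduced to a single call that passes the namespace already obtained for the inputs.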

lucyleeow · May 21, 2025

sklearn/metrics/_classification.py

+        transformed_labels = xp.asarray(transformed_labels, device=device)
+        y_proba = xp.asarray(y_proba, device=device)


y_proba might be shifted to cpu because of the check_array

In which case could y_proba be shifted to cpu?

If it is possible that check_array alters the device, we probably need to call get_namespace_and_device before any check_array calls. For example, in _validate_binary_probabilistic_prediction we do a column_or_1d first, which calls check_array internally. I wonder if it would be good practice to just always do get_namespace_and_device first?

Or is this just handling for the array-api-strict?

Question, why would it be needed for array-api-strict?

lucyleeow · May 21, 2025

sklearn/utils/validation.py

-            - If `sample_weight.dtype` is one of `{np.float64, np.float32}`,
+            - If `sample_weight.dtype` is one of `{xp.float64, xp.float32}`,


If we are not changing types in public functions, I wonder if we should keep private ones as is too, for consistency?

lucyleeow · May 21, 2025

sklearn/utils/validation.py

@@ -2169,17 +2177,18 @@ def _check_sample_weight(
        Validated sample weight. It is guaranteed to be "C" contiguous.


Just checking, are these the same changes as those made in #30878? Since that one has the additional tests, maybe it should be merged first?

lithomas1 added 2 commits April 13, 2025 16:53

ENH: Make brier score Array API compatible

bdc5f40

Merge branch 'main' of github.com:scikit-learn/scikit-learn into brie…

28367b5

…r-array-api

github-actions bot added module:metrics module:utils labels Apr 13, 2025

lithomas1 changed the title ~~Brier array api~~ ENH: Make brier_score_loss Array API compatible Apr 13, 2025

whatsnew

4020016

lithomas1 marked this pull request as ready for review April 13, 2025 21:57

lucyleeow mentioned this pull request Apr 14, 2025

ENH Add Array API compatibility to Binarizer #31190

Merged

OmarManzoor reviewed Apr 14, 2025

virchan added the Array API label Apr 24, 2025

lesteve mentioned this pull request May 19, 2025

Contains code not allowed for commercial use #31390

Closed

lucyleeow reviewed May 21, 2025

lucyleeow mentioned this pull request May 22, 2025

Make more of the "tools" of scikit-learn Array API compatible #26024

Open
