Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

fix: mps device support in entropy #29321

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jul 8, 2024

Conversation

EdAbati
Copy link
Contributor

@EdAbati EdAbati commented Jun 20, 2024

Reference Issues/PRs

As mentioned in #29300, we have some tests regarding the Array API that are failing in main.

FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[accuracy_score-check_array_api_multilabel_classification_metric-cupy-None-None] - ValueError: unrecognized csr_matrix constructor usage
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[accuracy_score-check_array_api_multilabel_classification_metric-cupy.array_api-None-None] - TypeError: bool is only allowed on arrays with 0 dimensions
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[accuracy_score-check_array_api_multilabel_classification_metric-torch-cuda-float64] - ValueError: unrecognized csr_matrix constructor usage
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[accuracy_score-check_array_api_multilabel_classification_metric-torch-cuda-float32] - ValueError: unrecognized csr_matrix constructor usage
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[zero_one_loss-check_array_api_multilabel_classification_metric-cupy-None-None] - ValueError: unrecognized csr_matrix constructor usage
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[zero_one_loss-check_array_api_multilabel_classification_metric-cupy.array_api-None-None] - TypeError: bool is only allowed on arrays with 0 dimensions
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[zero_one_loss-check_array_api_multilabel_classification_metric-torch-cuda-float64] - ValueError: unrecognized csr_matrix constructor usage
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[zero_one_loss-check_array_api_multilabel_classification_metric-torch-cuda-float32] - ValueError: unrecognized csr_matrix constructor usage

and

FAILED sklearn/metrics/cluster/tests/test_supervised.py::test_entropy_array_api[torch-mps-float32] - TypeError: Cannot convert a MPS Tensor to float64 dtype as the MPS framework doesn't support float64. Please use float32 instead.
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[accuracy_score-check_array_api_multilabel_classification_metric-torch-mps-float32] - ValueError: unrecognized csr_matrix constructor usage
FAILED sklearn/metrics/tests/test_common.py::test_array_api_compliance[zero_one_loss-check_array_api_multilabel_classification_metric-torch-mps-float32] - ValueError: unrecognized csr_matrix constructor usage

What does this implement/fix? Explain your changes.

The first commit fixes the issue with mps.

I believe that the others were caused by #29269.
If I remember correctly, the Array API does not support sparse matrices, and therefore should not work with those metrics in the multilabel case. It seems that the code introduced in that PR only works for numpy and torch in cpu, or am I missing something?

Should we revert this change?

cc @ogrisel

Copy link

github-actions bot commented Jun 20, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 31b0477. Link to the linter CI: here

@EdAbati EdAbati changed the title fix cast to float64 for mps fix: array api metrics tests Jun 20, 2024
@Tialo
Copy link
Contributor

Tialo commented Jun 20, 2024

Thanks for the fix. I see 3 possible solutions for problem with sparse matrices.

  1. Don't allow Array API for multilabel metrics.
  2. If we use array_api_dispatch, convert sparse matrix to array here https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/metrics/_classification.py#L217 (If I am not mistaken, exception raises because _average gets a sparse matrix which is create here)
  3. If use array_api_dispatch do not convert array to sparse matrix here https://github.com/scikit-learn/scikit-learn/blob/main/sklearn/metrics/_classification.py#L132

Also, I will try to check each PR in colab more carefully.

@EdAbati
Copy link
Contributor Author

EdAbati commented Jun 21, 2024

Yes, you are right! I think it could be nice to support multilabel.
And given that the sparse matrices are created internally and not by the user, I'm more for option 3. But we decide for option 1, I think we should have some meaningful error message.

@EdAbati EdAbati marked this pull request as draft June 21, 2024 06:42
@EdAbati EdAbati changed the title fix: array api metrics tests fix: multilabel support for accuracy and mps support for entropy Jun 21, 2024
@EdAbati
Copy link
Contributor Author

EdAbati commented Jun 21, 2024

I've implemented something in the direction of option 3.

I will work on mypy later today.

Not sure if I should make 2 PRs, since the fixes for entropy and for accuracy are unrelated. 🤔

@ogrisel
Copy link
Member

ogrisel commented Jun 21, 2024

Not sure if I should make 2 PRs, since the fixes for entropy and for accuracy are unrelated. 🤔

That would be great thanks. That would ease the review in case one of the changes is controversial.

sklearn/utils/_array_api.py Show resolved Hide resolved
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your analysis @Tialo and for the fix @EdAbati.

I don't understand the mypy failure yet but overall I think I agree with then plan to follow option 3 as well.

Let's add a comment though.

sklearn/metrics/_classification.py Outdated Show resolved Hide resolved
@EdAbati EdAbati changed the title fix: multilabel support for accuracy and mps support for entropy fix: mps device support in entropy Jun 22, 2024
@EdAbati EdAbati marked this pull request as ready for review June 22, 2024 13:12
@EdAbati
Copy link
Contributor Author

EdAbati commented Jun 22, 2024

Created the 2nd PR #29336 for accuracy and zero_loss :)

sklearn/utils/_array_api.py Outdated Show resolved Hide resolved
sklearn/metrics/cluster/_supervised.py Show resolved Hide resolved
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sklearn/utils/_array_api.py Outdated Show resolved Hide resolved
Copy link
Contributor

@OmarManzoor OmarManzoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @EdAbati

@OmarManzoor OmarManzoor merged commit a408a59 into scikit-learn:main Jul 8, 2024
30 checks passed
@EdAbati EdAbati deleted the fix-array-api-tests branch July 8, 2024 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.