Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

FIX Error for sparse matrix in OrdinalEncoder.inverse_transform #19879

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 13, 2021

Conversation

glemaitre
Copy link
Member

closes #19878

OrdinalEncoder.inverse_transform should not support sparse matrix. It was already failing but with an obscure message.
This PR adds a non-regression test to check for the error message.

@jeremiedbb
Copy link
Member

shouldn't the inverse transform support sparse matrices since the transform method can return a sparse matrix ?

@glemaitre
Copy link
Member Author

transform method can return a sparse matrix

How to trigger this behaviour? I was not able to.

When fitting on dense X and trying to use inverse_transform on sparse, I got the following error:

encoder.inverse_transform(sparse.csr_matrix(X_tr))
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/xxx.py in 
----> 32 encoder.inverse_transform(sparse.csr_matrix(X_tr))

~/Documents/packages/scikit-learn/sklearn/preprocessing/_encoders.py in inverse_transform(self, X)
    882                 found_unknown[i] = unknown_labels
    883             else:
--> 884                 X_tr[:, i] = self.categories_[i][labels]
    885 
    886         # insert None values for unknown values

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

but it might be because I am not able to trigger a sparse X_trans when calling transform.

@glemaitre
Copy link
Member Author

And I assume that OrdinalEncoder in general should not really benefit from sparse matrix. You would need to always have the category 0 but creating a sparse matrix would be rather weird regarding the semantic of the category).

@jeremiedbb
Copy link
Member

jeremiedbb commented Apr 13, 2021

transform can't return a sparse matrix indeed. The docstring is wrong.
Edit: actually it's not wrong anymore. I was looking at an old version of the code :)

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Thanks @glemaitre !

@glemaitre
Copy link
Member Author

Edit: actually it's not wrong anymore. I was looking at an old version of the code :)

Yes, I might solve it in the previous PR :)

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thomasjpfan thomasjpfan changed the title FIX do not support sparse matrix in OrdinalEncoder.inverse_transform FIX Error for sparse matrix in OrdinalEncoder.inverse_transform Apr 13, 2021
@thomasjpfan thomasjpfan merged commit 8a3939a into scikit-learn:main Apr 13, 2021
thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Apr 19, 2021
@glemaitre glemaitre mentioned this pull request Apr 22, 2021
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OrdinalEncoder accept and failed with sparse matrix in inverse_transform
3 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.