[MRG] Make drop_idx_ a masked array in OneHotEncoder #16554

Closed
wants to merge 4 commits into from

cmarmo
Contributor

@cmarmo cmarmo commented Feb 26, 2020

Reference Issues/PRs

Fixes #16552.

What does this implement/fix? Explain your changes.

Make drop_idx_ a NumPy masked array in order to manage column selection.
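
For readers unfamiliar with the idea, here is a minimal sketch of how a masked array could encode "drop this index" versus "drop nothing" per feature. It is not the diff from this PR: the `categories` data, the placeholder value 0, and the `kept` helper list are all assumptions made up for illustration.

```python
import numpy as np

# Hypothetical fitted categories for three features; only the second is binary.
categories = [
    np.array(["a", "b", "c"]),
    np.array(["no", "yes"]),
    np.array(["x", "y", "z"]),
]

# With drop="if_binary", only features with exactly two categories drop one.
is_binary = np.array([len(cats) == 2 for cats in categories])

# Assumption: 0 is a placeholder index everywhere; the mask hides the entries
# for features where no category should be dropped.
drop_idx = np.ma.masked_array(
    np.zeros(len(categories), dtype=int), mask=~is_binary
)

# Column selection: keep every category index unless it is the unmasked drop index.
mask = np.ma.getmaskarray(drop_idx)
kept = []
for i, cats in enumerate(categories):
    if mask[i]:
        kept.append(list(range(len(cats))))
    else:
        kept.append([j for j in range(len(cats)) if j != drop_idx.data[i]])

print(drop_idx)
print(kept)
```

In this sketch only the binary feature loses a column, which is the behaviour the linked issue asks for.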

@cmarmo
Contributor Author

cmarmo commented Feb 26, 2020

Sorry... I forgot to finish the tests... :(

@cmarmo cmarmo changed the title [MRG] Make drop_idx_ a masked array in OneHotEncoder [WIP] Make drop_idx_ a masked array in OneHotEncoder Feb 26, 2020
@cmarmo cmarmo changed the title [WIP] Make drop_idx_ a masked array in OneHotEncoder [MRG] Make drop_idx_ a masked array in OneHotEncoder Feb 26, 2020
@glemaitre
Member

I am not sure that we should use a masked array. If it were only used internally, without exposing the resulting array, I think it would be OK, because we already use masked arrays in SimpleImputer. The issue here is that this attribute is exposed publicly, and I am worried that some of our users are not familiar with NumPy masked arrays.

I think it would be friendlier to use a list or a NumPy array with object dtype, where each entry is either None or the index of the category to be dropped.
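
The alternative suggested here can be sketched in plain Python. This is only an illustration of the None-or-index encoding, not code from scikit-learn; the `categories` data and the `kept` list are hypothetical. Wrapping the list with `np.array(drop_idx, dtype=object)` would give the object-dtype array variant.

```python
# Hypothetical fitted categories for three features; only the second is binary.
categories = [["a", "b", "c"], ["no", "yes"], ["x", "y", "z"]]

# None means "drop nothing for this feature"; an integer is the index to drop.
drop_idx = [0 if len(cats) == 2 else None for cats in categories]

# Column selection: comparing an index against None never matches,
# so non-binary features keep all of their categories.
kept = [
    [j for j in range(len(cats)) if j != idx]
    for cats, idx in zip(categories, drop_idx)
]

print(drop_idx)
print(kept)
```

A user inspecting this attribute sees ordinary None values and integers, with no need to know about masks.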


Successfully merging this pull request may close these issues.

OneHotEncoder drop 'if_binary' drop one column from all categorical variables
2 participants