Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

FIX improve error message when no samples are available in mutual information #25192

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
Loading
from

Conversation

makoeppel
Copy link
Contributor

@makoeppel makoeppel commented Dec 15, 2022

Reference Issues/PRs

closes #25179

check if instances are left after masking of unique labels

check if instances are left after masking of unique labels
Copy link
Contributor

@Micky774 Micky774 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! This looks like a good clarification on an otherwise annoying error. When you get the chance, please include a test that verifies that the appropriate ValueError is raised.

Let me know if you have any questions or concerns :)

doc/whats_new/v1.3.rst Outdated Show resolved Hide resolved
sklearn/feature_selection/_mutual_info.py Outdated Show resolved Hide resolved
@makoeppel
Copy link
Contributor Author

Hi @Micky774 thank you for your feedback! I updated the code and added a test.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @makoeppel. Here are a few suggestions

sklearn/feature_selection/tests/test_mutual_info.py Outdated Show resolved Hide resolved
sklearn/feature_selection/_mutual_info.py Outdated Show resolved Hide resolved
doc/whats_new/v1.3.rst Outdated Show resolved Hide resolved
@makoeppel
Copy link
Contributor Author

Hi @jeremiedbb, thanks for your feedback. I implemented them all!

@Micky774
Copy link
Contributor

Micky774 commented Jan 4, 2023

I haven't checked out the wider context here yet -- is this a risk/concern with either the continuous/continuous or discrete/discrete case? I'll take a look later if you aren't readily sure.

@makoeppel
Copy link
Contributor Author

It's more about improving the usability of that function. In case people use it wrongly (like I did) you left with a strange error message which does not hint you to the problem. So I had to understand the code first to find out what the problem is. That's why I thought having a clear error here could improve the user experience.

@Micky774
Copy link
Contributor

Micky774 commented Jan 4, 2023

It's more about improving the usability of that function. In case people use it wrongly (like I did) you left with a strange error message which does not hint you to the problem. So I had to understand the code first to find out what the problem is. That's why I thought having a clear error here could improve the user experience.

Sure, the error is objectively a helpful addition! I just meant to ask if this error also present in any of the other cases or is it unique to the mixed continuous/discrete function where the check currently is.

@makoeppel
Copy link
Contributor Author

It's more about improving the usability of that function. In case people use it wrongly (like I did) you left with a strange error message which does not hint you to the problem. So I had to understand the code first to find out what the problem is. That's why I thought having a clear error here could improve the user experience.

Sure, the error is objectively a helpful addition! I just meant to ask if this error also present in any of the other cases or is it unique to the mixed continuous/discrete function where the check currently is.

True, did not check this so fare. I only quickly looked over it and code wise it should work.

However, thinking about this I am wondering if the mutual information or more specific the entropy of a single value of a random variable is actually defined. Which would be the reason for masking unique values in the first place in this function. Which is than opening the question what to do with unique values for all the oder functions even if it would work code wise.

@glemaitre glemaitre changed the title improve error handling for _compute_mi_cd FIX improve error message when no samples are available in mutual information Jan 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve error handling in _mutual_info.py
4 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.