-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Made _weighted_percentile
raise value error when weights are all zero.
#31041
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably needs a changelog too.
cc @lucyleeow
sklearn/utils/tests/test_stats.py
Outdated
@@ -69,8 +69,8 @@ def test_weighted_percentile_zero_weight(): | ||
y.fill(1.0) | ||
sw = np.ones(102, dtype=np.float64) | ||
sw.fill(0.0) | ||
value = _weighted_percentile(y, sw, 50) | ||
assert approx(value) == 1.0 | ||
with pytest.raises(ValueError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's obvious what this is testing but let's add a message match anyway.
I have made the changes you two have recommended. Could you please look over this again when you have time? |
@@ -0,0 +1,2 @@ | ||
- In :mod:`utils.stats` updated _weighted_percentile raise an value error when all weights are zero. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- In :mod:`utils.stats` updated _weighted_percentile raise an value error when all weights are zero. | |
- Updated `utils.stats._weighted_percentile` to raise an error when all weights are zero. |
Could you please also list all the downstream public classes/functions that this would affect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather not name private API in the changelog. Instead, identify how this code path can be reached from the public API and document the user-facing behavior change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this code is not reachable from public API (but only from unit tests calling directly into the private API), then we don't need to document it in the changelog.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this code is not reachable from public API
It is reachable by public functions that use _weighted_percentile
, and results in an incorrect result being given. e.g.:
In [8]: y_true = [3, -0.5, 2, 7]
...:
...: y_pred = [2.5, 0.0, 2, 8]
...:
...: median_absolute_error(y_true, y_pred, sample_weight=[0,0,0,0])
Out[8]: 1.0
Just for completeness I checked numpys quantile function:
np.quantile([1,2,3,4], 0.5, weights=[0,0,0,0], method='inverted_cdf')
gives the error:
RuntimeWarning: invalid value encountered in divide
cdf /= cdf[-1, ...] # normalization to 1
Error message could be improved to something specific for this case, but at least an error is given.
Edit: just realise that is a warning and it gives output: array(1)
- will raise an issue over in numpy
@@ -0,0 +1,2 @@ | ||
- In :mod:`utils.stats` updated _weighted_percentile raise an value error when all weights are zero. | ||
Also changed test_weighted_percentile_zero_weight to check for that error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not needed.
_weighted_percentile
raise value error when weights are all zero.
Sorry for taking so long; I have gone and looked at all things that use it and have added them to the docs. |
- In :mod:`metrics._regession` updated set_huber_delta raise an value error when all weights are zero. | ||
- In :mod:`metrics._regession` updated median_absolute_error raise an value error when all weights are zero. | ||
- In :mod:`metrics._regession` updated d2_pinball_score raise an value error when all weights are zero. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these can be a single entry that lists all 3 functions.
Please use the syntax:
:func:`~metrics._regession.updated d2_pinball_score`
etc
@@ -0,0 +1,2 @@ | ||
- In :mod:`dummy` made it so that if you use the median strategy with all zero weights it will raise a value error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is true, np.median
is used in this case.
Co-authored-by: Lucy Liu <jliu176@gmail.com>
Co-authored-by: Lucy Liu <jliu176@gmail.com>
Made _weighted_percentile raise value error when weights are all zero, as requested in #31032.
Also changed test_weighted_percentile_zero_weight_zero_percentile to check for that error and that it's message is correct.
Added change log file.
(made changed recommended to me by adrinjalali and lucyleeow ).