Add sampling uncertainty on precision-recall and ROC curves #26192

Open
wants to merge 2 commits into
base: main

stephanecollot
Contributor

Reference Issues/PRs

Closes #25856

What does this implement/fix? Explain your changes.

Add sampling uncertainty on precision-recall and ROC curves.
See more details in the Issue above.

@stephanecollot
Contributor Author

Hi,

Here is a first version of the PR, adding this feature for precision-recall only. Once we agree on the integration for this one, I will add the ROC in an analogous way in this PR.
Thank you in advance for taking an initial look at sklearn/metrics/_plot/precision_recall_curve.py

I will add unit tests and more function docstrings in sklearn/metrics/_plot/uncertainty.py soon.

@glemaitre @betatim @lorentzenchr

"""
TODO: Documentation

AISTAT 2023 `Sampling uncertainties on the Precision-Recall curve`
Contributor

Correct title of paper is: Pointwise sampling uncertainties on the Precision-Recall curve
with authors: R.E.Q. Urlus, M.A. Baak, S. Collot, I. Fridman Rojas

@RUrlus
Contributor

RUrlus commented Apr 18, 2023

@stephanecollot This implementation does not match the implementation in MMU and is not as described in the paper.

This code creates a grid of a fixed shape for each P,R point and evaluates the chi2 score.
This means that different thresholds have different resolutions, e.g. 100 bins to cover 0.5 - 0.55 whereas other thresholds might only have a region that covers 0.9 - 0.91 with the same number of bins. You draw a region for each threshold but the overlap is no longer comparable as they are computed with different resolutions.

The reference implementation creates a P, R grid with a set number of points per axis.
For each threshold we determine which region of the P,R grid to evaluate, and for each predetermined point in this region we compute the chi2 score. We only store the minimum chi2 score for each grid point and draw a single contour for the grid.
Not only is this much faster, it is also consistent, as the whole curve has the same resolution.
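
As a rough illustration of the shared-grid approach described above, here is a minimal sketch. It is not the MMU code and not the score from the paper: `binomial_chi2` is a hypothetical stand-in that assumes independent binomial errors on precision and recall, the function names and `n_bins` are made up for the example, and for brevity it scores the whole grid for every threshold instead of only the relevant region.

```python
import numpy as np


def binomial_chi2(p_grid, r_grid, tp, fp, fn):
    """Hypothetical stand-in score (an assumption, not the paper's chi2).

    Squared distance of each grid point from the observed (P, R),
    in units of independent binomial variances.
    Assumes tp + fp > 0 and tp + fn > 0.
    """
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    # Floor the variances so that P or R equal to 0 or 1 does not divide by zero.
    var_p = max(prec * (1.0 - prec) / (tp + fp), 1e-12)
    var_r = max(rec * (1.0 - rec) / (tp + fn), 1e-12)
    return (p_grid - prec) ** 2 / var_p + (r_grid - rec) ** 2 / var_r


def min_chi2_on_shared_grid(confusion_counts, n_bins=200):
    """Keep, per grid point, the minimum score over all thresholds.

    confusion_counts: iterable of (tp, fp, fn), one tuple per threshold.
    Returns the shared axis and an (n_bins, n_bins) array whose rows
    index precision and whose columns index recall.
    """
    axis = np.linspace(0.0, 1.0, n_bins)
    # Default "xy" indexing: columns vary along recall, rows along precision.
    r_grid, p_grid = np.meshgrid(axis, axis)
    min_chi2 = np.full(p_grid.shape, np.inf)
    for tp, fp, fn in confusion_counts:
        score = binomial_chi2(p_grid, r_grid, tp, fp, fn)
        # Same grid for every threshold, so scores are directly comparable.
        np.minimum(min_chi2, score, out=min_chi2)
    return axis, min_chi2


# Toy counts for two thresholds (illustrative only).
counts = [(50, 10, 5), (40, 5, 15)]
axis, min_chi2 = min_chi2_on_shared_grid(counts, n_bins=101)

# Chi2 level enclosing about 68.27% for 2 degrees of freedom,
# from the closed-form CDF 1 - exp(-x / 2).
level_1sigma = -2.0 * np.log(1.0 - 0.6827)
```

A single contour can then be drawn with something like `plt.contour(axis, axis, min_chi2, levels=[level_1sigma])`, with recall on the x-axis and precision on the y-axis, so every part of the curve is rendered at the same resolution.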

@stephanecollot
Contributor Author

> @stephanecollot This implementation does not match the implementation in MMU and is not as described in the paper.
>
> This code creates a grid of a fixed shape for each P,R point and evaluates the chi2 score. This means that different thresholds have different resolutions, e.g. 100 bins to cover 0.5 - 0.55 whereas other thresholds might only have a region that covers 0.9 - 0.91 with the same number of bins. You draw a region for each threshold but the overlap is no longer comparable as they are computed with different resolutions.
>
> The reference implementation creates a P, R grid with a set number of points per axis. For each threshold we determine which region of the P,R grid to evaluate, and for each predetermined point in this region we compute the chi2 score. We only store the minimum chi2 score for each grid point and draw a single contour for the grid. Not only is this much faster, it is also consistent, as the whole curve has the same resolution.

That is correct. I tried three different methods for the grid: a fixed number of grid points per P,R point (the current one), the one from the paper, and an adaptive grid (i.e. "x points per cm²"). They had different pros and cons, and I picked the best in terms of plotting smoothness and execution time. But it is true that I'm not taking the minimum chi2, which can make the plot look different.
I'm going to take a closer look at this.

@lorentzenchr
Member
