Commit 6a6217f (parent bea9211)

ENH Add mean_pinball_loss metric for quantile regression (#19415)

Authored by sdpython
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>

9 files changed, +595 −65 lines

doc/modules/classes.rst (+1 line)

@@ -991,6 +991,7 @@ details.
    metrics.mean_poisson_deviance
    metrics.mean_gamma_deviance
    metrics.mean_tweedie_deviance
+   metrics.mean_pinball_loss

 Multilabel ranking metrics
 --------------------------

doc/modules/model_evaluation.rst (+68 −3 lines)

@@ -416,7 +416,7 @@ defined as
 .. math::

-  \texttt{accuracy}(y, \hat{y}) = \frac{1}{n_\text{samples}} \sum_{i=0}^{n_\text{samples}-1} 1(\hat{y}_i = y_i)
+  \texttt{accuracy}(y, \hat{y}) = \frac{1}{n_\text{samples}} \sum_{i=0}^{n_\text{samples}-1} 1(\hat{y}_i = y_i)

(whitespace-only change to the indentation of the math line)

 where :math:`1(x)` is the `indicator function
 <https://en.wikipedia.org/wiki/Indicator_function>`_.

@@ -1960,8 +1960,8 @@ Regression metrics
 The :mod:`sklearn.metrics` module implements several loss, score, and utility
 functions to measure regression performance. Some of those have been enhanced
 to handle the multioutput case: :func:`mean_squared_error`,
-:func:`mean_absolute_error`, :func:`explained_variance_score` and
-:func:`r2_score`.
+:func:`mean_absolute_error`, :func:`explained_variance_score`,
+:func:`r2_score` and :func:`mean_pinball_loss`.

 These functions have a ``multioutput`` keyword argument which specifies the

@@ -2354,6 +2354,71 @@ the difference in errors decreases. Finally, by setting ``power=2``::
 we would get identical errors. The deviance when ``power=2`` is thus only
 sensitive to relative errors.

+.. _pinball_loss:
+
+Pinball loss
+------------
+
+The :func:`mean_pinball_loss` function is used to evaluate the predictive
+performance of quantile regression models. The `pinball loss
+<https://en.wikipedia.org/wiki/Quantile_regression#Computation>`_ is equivalent
+to :func:`mean_absolute_error` when the quantile parameter ``alpha`` is set to
+0.5.
+
+.. math::
+
+  \text{pinball}(y, \hat{y}) = \frac{1}{n_{\text{samples}}} \sum_{i=0}^{n_{\text{samples}}-1} \alpha \max(y_i - \hat{y}_i, 0) + (1 - \alpha) \max(\hat{y}_i - y_i, 0)
+
+Here is a small example of usage of the :func:`mean_pinball_loss` function::
+
+  >>> from sklearn.metrics import mean_pinball_loss
+  >>> y_true = [1, 2, 3]
+  >>> mean_pinball_loss(y_true, [0, 2, 3], alpha=0.1)
+  0.03...
+  >>> mean_pinball_loss(y_true, [1, 2, 4], alpha=0.1)
+  0.3...
+  >>> mean_pinball_loss(y_true, [0, 2, 3], alpha=0.9)
+  0.3...
+  >>> mean_pinball_loss(y_true, [1, 2, 4], alpha=0.9)
+  0.03...
+  >>> mean_pinball_loss(y_true, y_true, alpha=0.1)
+  0.0
+  >>> mean_pinball_loss(y_true, y_true, alpha=0.9)
+  0.0
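Not part of the diff: the formula and the doctest values above can be cross-checked with a short NumPy sketch. The `pinball_loss` helper here is a hypothetical standalone reimplementation for verification, not the scikit-learn function being added.

```python
import numpy as np

def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball loss, mirroring the formula in the new section:
    alpha * max(y - y_hat, 0) + (1 - alpha) * max(y_hat - y, 0), averaged."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.mean(alpha * np.maximum(diff, 0.0)
                   + (1.0 - alpha) * np.maximum(-diff, 0.0))

# Reproduce two of the doctest values from the hunk above:
print(round(pinball_loss([1, 2, 3], [0, 2, 3], alpha=0.1), 4))  # 0.0333
print(round(pinball_loss([1, 2, 3], [1, 2, 4], alpha=0.1), 4))  # 0.3
```

With ``alpha=0.1`` the single underestimate in the first call is penalized lightly (0.1 weight), while the single overestimate in the second call is penalized heavily (0.9 weight), which is exactly the asymmetry the doctests illustrate.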
+It is possible to build a scorer object with a specific choice of ``alpha``::
+
+  >>> from sklearn.metrics import make_scorer
+  >>> mean_pinball_loss_95p = make_scorer(mean_pinball_loss, alpha=0.95)
+
+Such a scorer can be used to evaluate the generalization performance of a
+quantile regressor via cross-validation:
+
+  >>> from sklearn.datasets import make_regression
+  >>> from sklearn.model_selection import cross_val_score
+  >>> from sklearn.ensemble import GradientBoostingRegressor
+  >>>
+  >>> X, y = make_regression(n_samples=100, random_state=0)
+  >>> estimator = GradientBoostingRegressor(
+  ...     loss="quantile",
+  ...     alpha=0.95,
+  ...     random_state=0,
+  ... )
+  >>> cross_val_score(estimator, X, y, cv=5, scoring=mean_pinball_loss_95p)
+  array([11.1..., 10.4..., 24.4..., 9.2..., 12.9...])
+
+It is also possible to build scorer objects for hyper-parameter tuning. The
+sign of the loss must be switched to ensure that greater means better, as
+explained in the example linked below.
+
+.. topic:: Example:
+
+  * See :ref:`sphx_glr_auto_examples_ensemble_plot_gradient_boosting_quantile.py`
+    for an example of using the pinball loss to evaluate and tune the
+    hyper-parameters of quantile regression models on data with non-symmetric
+    noise and outliers.
+

 .. _clustering_metrics:

 Clustering metrics

doc/whats_new/v1.0.rst (+4 lines)

@@ -145,6 +145,10 @@ Changelog
     class methods and will be removed in 1.2.
     :pr:`18543` by `Guillaume Lemaitre`_.

+- |Feature| :func:`metrics.mean_pinball_loss` exposes the pinball loss for
+  quantile regression. :pr:`19415` by :user:`Xavier Dupré <sdpython>`
+  and :user:`Olivier Grisel <ogrisel>`.
+
 :mod:`sklearn.naive_bayes`
 ..........................
