FEA add pinball loss to SGDRegressor #22043
Conversation
ping @lorentzenchr @glemaitre
)
from .._quantile import QuantileRegressor

benchmark = QuantileRegressor(alpha=1e-6, solver="highs").fit(X, y).score(X, y)
I assume that this will fail with older scipy because the solver is unknown.
Now that I am seeing the `alpha`, it makes me think that we only support the `l1` penalty in the `QuantileRegressor`. @lorentzenchr is it fine to support `l2` and `elasticnet` in SGD?
Yes, this needs a `skip_if` for scipy >= 1.6, IIRC.
And yes, SGD already comes with L2 and L1+L2, so why not support it. It would be more code to forbid it :wink:
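For concreteness, a version-gated skip could look like the sketch below. This is only an illustration (the constant and test names are hypothetical), assuming the comparison relies on `QuantileRegressor(solver="highs")`, whose LP backend requires scipy >= 1.6.

```python
import pytest
import scipy

# True when the installed scipy predates the "highs" linprog solver.
SCIPY_OLDER_THAN_1_6 = tuple(
    int(part) for part in scipy.__version__.split(".")[:2]
) < (1, 6)


@pytest.mark.skipif(
    SCIPY_OLDER_THAN_1_6,
    reason='QuantileRegressor(solver="highs") requires scipy >= 1.6',
)
def test_sgd_pinball_close_to_quantile_regressor():
    ...  # hypothetical test body: fit both estimators and compare their scores
```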
Do we need test cases for other penalties?
> And yes, SGD already comes with L2 and L1+L2, so why not support it. It would be more code to forbid it

@lorentzenchr so the reason not to support L2 and L1+L2 in `QuantileRegressor` is that we cannot reformulate the problem as a linear programming problem?
> Do we need test cases for other penalties?

Yes, that would be good. I don't know how easy this is, but it would be great to have an integration test checking that the penalty has the expected impact on the `coef_` and, at the same time, that we are indeed fitting the expected quantile.
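Such an integration test might look roughly like the sketch below. The `loss="pinball"` and `quantile` parameter names are assumptions about this PR's API (not confirmed), and the tolerances are purely illustrative.

```python
import numpy as np

from sklearn.linear_model import SGDRegressor


def test_sgd_pinball_quantile_and_penalty_effect():
    rng = np.random.RandomState(0)
    X = rng.normal(size=(1000, 5))
    y = X @ np.array([1.0, 2.0, 0.0, 0.0, 0.5]) + rng.normal(size=1000)

    for q in (0.25, 0.5, 0.75):
        reg = SGDRegressor(
            loss="pinball",  # hypothetical loss name introduced by this PR
            quantile=q,      # hypothetical parameter for the target quantile
            penalty="l2",
            alpha=1e-4,
            max_iter=10_000,
            tol=1e-5,
            random_state=0,
        ).fit(X, y)

        # Roughly a fraction q of the targets should fall below the prediction.
        coverage = np.mean(y <= reg.predict(X))
        assert abs(coverage - q) < 0.05  # loose, illustrative tolerance

        # A much stronger L2 penalty should shrink the coefficients towards zero.
        strong = SGDRegressor(
            loss="pinball", quantile=q, penalty="l2", alpha=10.0,
            max_iter=10_000, tol=1e-5, random_state=0,
        ).fit(X, y)
        assert np.linalg.norm(strong.coef_) < np.linalg.norm(reg.coef_)
```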
I'm not sure how to proceed further now; any pointers would be helpful. Thanks.
Here is the first round of review.
Regardless of the fact that I'm not an expert on SGD and don't know how well it works for non-smooth objectives, this PR would need unit tests similar to …
These are great pointers, I will look into it. Thanks.
Only the equivariance test is failing, in both the with-intercept and without-intercept cases.
Thank you for this contribution, @venkyyuvy.
Here are a few comments. If you can't continue this work, @ArturoAmorQ might be interested in pursuing it.
To follow up on one of the proposals made by @lorentzenchr in #20132 (comment), I am personally in favour of refactoring `_sgd_fast.pyx` to use the common loss private submodule and thus avoid duplication. This submodule not only implements the pinball loss, but also other losses that are used in several modules.
If we choose to perform this refactoring first, then all the terminal and concrete `LossFunction`s that are defined in this file must be moved to this private submodule and refactored to extend the `CyLossFunction` extension type and the `BaseLoss` base class:
scikit-learn/sklearn/_loss/_loss.pxd
Lines 27 to 30 in b571d64
cdef class CyLossFunction:
    cdef double cy_loss(self, double y_true, double raw_prediction) nogil
    cdef double cy_gradient(self, double y_true, double raw_prediction) nogil
    cdef double_pair cy_grad_hess(self, double y_true, double raw_prediction) nogil
scikit-learn/sklearn/_loss/loss.py
Lines 44 to 110 in b571d64
# Note: The shape of raw_prediction for multiclass classifications are
# - GradientBoostingClassifier: (n_samples, n_classes)
# - HistGradientBoostingClassifier: (n_classes, n_samples)
#
# Note: Instead of inheritance like
#
#     class BaseLoss(BaseLink, CyLossFunction):
#         ...
#
#     # Note: Naturally, we would inherit in the following order
#     #     class HalfSquaredError(IdentityLink, CyHalfSquaredError, BaseLoss)
#     # But because of https://github.com/cython/cython/issues/4350 we set BaseLoss as
#     # the last one. This, of course, changes the MRO.
#     class HalfSquaredError(IdentityLink, CyHalfSquaredError, BaseLoss):
#
# we use composition. This way we improve maintainability by avoiding the above
# mentioned Cython edge case and have easier to understand code (which method calls
# which code).
class BaseLoss:
    """Base class for a loss function of 1-dimensional targets.

    Conventions:
      - y_true.shape = sample_weight.shape = (n_samples,)
      - y_pred.shape = raw_prediction.shape = (n_samples,)
      - If is_multiclass is true (multiclass classification), then
        y_pred.shape = raw_prediction.shape = (n_samples, n_classes)
        Note that this corresponds to the return value of decision_function.

    y_true, y_pred, sample_weight and raw_prediction must either be all float64
    or all float32.
    gradient and hessian must be either both float64 or both float32.

    Note that y_pred = link.inverse(raw_prediction).

    Specific loss classes can inherit specific link classes to satisfy
    BaseLink's abstractmethods.

    Parameters
    ----------
    sample_weight : {None, ndarray}
        If sample_weight is None, the hessian might be constant.
    n_classes : {None, int}
        The number of classes for classification, else None.

    Attributes
    ----------
    closs: CyLossFunction
    link : BaseLink
    interval_y_true : Interval
        Valid interval for y_true
    interval_y_pred : Interval
        Valid Interval for y_pred
    differentiable : bool
        Indicates whether or not loss function is differentiable in
        raw_prediction everywhere.
    need_update_leaves_values : bool
        Indicates whether decision trees in gradient boosting need to uptade
        leave values after having been fit to the (negative) gradients.
    approx_hessian : bool
        Indicates whether the hessian is approximated or exact. If,
        approximated, it should be larger or equal to the exact one.
    constant_hessian : bool
        Indicates whether the hessian is one for this loss.
    is_multiclass : bool
        Indicates whether n_classes > 2 is allowed.
    """
What do maintainers think? Would you first perform such a refactoring?
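As a side note on the loss itself: whichever path is taken, the pinball loss and its subgradient are small to state. Below is a plain-NumPy sketch of the math such a Cython loss class (or `BaseLoss` wrapper) would encode; it is an illustration, not the private `_loss` API.

```python
import numpy as np


def pinball_loss(y_true, raw_prediction, quantile):
    # loss_i = quantile * (y - p)        if y >= p  (under-prediction)
    #        = (1 - quantile) * (p - y)  otherwise  (over-prediction)
    diff = y_true - raw_prediction
    return np.where(diff >= 0, quantile * diff, (quantile - 1.0) * diff)


def pinball_gradient(y_true, raw_prediction, quantile):
    # Derivative with respect to raw_prediction; a subgradient is used at y == p.
    diff = y_true - raw_prediction
    return np.where(diff >= 0, -quantile, 1.0 - quantile)
```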
- |Fix| :class:`linear_model.LassoLarsIC` now correctly computes AIC
  and BIC. An error is now raised when `n_features > n_samples` and
  when the noise variance is not provided.
  :pr:`21481` by :user:`Guillaume Lemaitre <glemaitre>` and
  :user:`Andrés Babino <ababino>`.
This is a spurious entry.
- |Feature| :class:`linear_model.SGDRegressor` now supports the pinball loss.
  :pr:`22043` by :user:`Venkatachalam Natchiappan <venkyyuvy>`.
This needs to be moved to `doc/whats_new/v1.2.rst`.
cdef double residual = fabs(y - p) - self.epsilon
return residual if residual > 0 else 0
This change makes sense, but that's out of the scope of this PR. Hence this should be reverted.
- cdef double residual = fabs(y - p) - self.epsilon
- return residual if residual > 0 else 0
+ cdef double ret = fabs(y - p) - self.epsilon
+ return ret if ret > 0 else 0
cdef double residual = fabs(y - p) - self.epsilon
return residual * residual if residual > 0 else 0
Same suggestion as previously.
- cdef double residual = fabs(y - p) - self.epsilon
- return residual * residual if residual > 0 else 0
+ cdef double ret = fabs(y - p) - self.epsilon
+ return ret * ret if ret > 0 else 0
I would prefer such a refactoring, but I'm not 100% sure it is feasible. It would certainly slow down the delivery of this PR's feature, so I'm OK both ways.
Thanks for the review comments.
Reference Issues/PRs
Partially addresses: #20132
What does this implement/fix? Explain your changes.
Added the pinball loss to sgd_fast.pyx.
Any other comments?
The results don't seem to be comparable with `QuantileRegressor`. I need help investigating this.
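One way to make that comparison concrete is to score both models with the same pinball metric. Below is a hedged sketch; the `loss="pinball"` and `quantile` parameters on the SGD side are assumptions about this PR's API, not the released scikit-learn interface.

```python
import numpy as np

from sklearn.datasets import make_regression
from sklearn.linear_model import QuantileRegressor, SGDRegressor
from sklearn.metrics import mean_pinball_loss

X, y = make_regression(n_samples=2000, n_features=10, noise=10.0, random_state=0)
q = 0.5

lp_model = QuantileRegressor(quantile=q, alpha=1e-6, solver="highs").fit(X, y)
sgd_model = SGDRegressor(
    loss="pinball",   # hypothetical loss name introduced by this PR
    quantile=q,       # hypothetical parameter for the target quantile
    alpha=1e-6,
    max_iter=100_000,
    tol=1e-6,
    random_state=0,
).fit(X, y)

# Both models target the same quantile, so their mean pinball losses on the
# training data should be of comparable magnitude if SGD converged.
print("QuantileRegressor:", mean_pinball_loss(y, lp_model.predict(X), alpha=q))
print("SGDRegressor     :", mean_pinball_loss(y, sgd_model.predict(X), alpha=q))
```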