ENH Allows disabling refitting of CV estimators #30463
base: main
Conversation
❌ Linting issues: This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling … You can see the details of the linting issues under the …
I think I'm done from my side with the code changes and documentation. Code changes for …
@kayo09, please do not make such reviews - it confuses us. Only @paulAdrienMarie and I are assigned to work on this PR. Edit: @kayo09, could you try to remove the review.
Update _coordinate_descent.py
@paulAdrienMarie The refit parameter was supposed to appear after the cv parameter. This is not a programming issue, but a conflict with the sklearn docs automation: I followed the scheme used by GridSearchCV and accordingly wrote the documentation in the same order, i.e. refit after cv.
Just sharing this as feedback so that it may be useful for you in the future - not as criticism.
This was the error message from which I understood:
AssertionError: Docstring Error:
In function: sklearn.linear_model._coordinate_descent.ElasticNetCV.__init__
There's a parameter name mismatch in function docstring w.r.t. function signature, at index 8 diff: 'cv' != 'refit'
Full diff:
['l1_ratio',
'eps',
'n_alphas',
'alphas',
'fit_intercept',
'precompute',
'max_iter',
'tol',
+ 'refit',
'cv',
'copy_X',
- 'refit',
'verbose',
'n_jobs',
'positive',
'random_state',
'selection']
In function: sklearn.linear_model._coordinate_descent.MultiTaskElasticNetCV.__init__
There's a parameter name mismatch in function docstring w.r.t. function signature, at index 7 diff: 'cv' != 'refit'
Full diff:
['l1_ratio',
'eps',
'n_alphas',
'alphas',
'fit_intercept',
'max_iter',
'tol',
+ 'refit',
'cv',
'copy_X',
- 'refit',
'verbose',
'n_jobs',
'random_state',
'selection']
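The assertion above comes from scikit-learn's docstring consistency check: parameters must be documented in the exact order they appear in the function signature. A toy sketch of that rule, using a hypothetical ExampleCV class (not part of scikit-learn), showing refit documented after cv to match its position in the signature:

```python
# Hypothetical class illustrating the ordering rule the sklearn docstring
# check enforces: the "Parameters" section must list arguments in the exact
# order they appear in the signature.
import inspect

class ExampleCV:
    def __init__(self, eps=1e-3, cv=None, refit=True):  # refit AFTER cv
        """Toy estimator.

        Parameters
        ----------
        eps : float, default=1e-3
            Length of the regularization path.
        cv : int, default=None
            Cross-validation splitting strategy.
        refit : bool, default=True
            Whether to refit on the full data (documented after cv,
            in signature order).
        """

# Parameter names in signature order (skipping self).
sig_params = list(inspect.signature(ExampleCV.__init__).parameters)[1:]

# Parameter names in docstring order, taken from "name : type" lines.
doc = inspect.getdoc(ExampleCV.__init__)
doc_params = [line.split(" :")[0].strip()
              for line in doc.splitlines()
              if " : " in line]

assert doc_params == sig_params  # the check that failed in this PR
```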
Could you help with fixing the code coverage issue? What would be the best way to go about it?
Hi @AhmedThahir I think we need to add unit tests for the feature we added to the models using cross-validation. I can take a look at it!
Hi @paulAdrienMarie, were you able to make any progress? @jeremiedbb and @alifa98, could you please support us? I've not been able to understand how to write appropriate tests.
Thanks for the PR @AhmedThahir
For the tests maybe you could try fitting once with refit=True and once with refit=False and check that the attributes of interest like the alphas are similar. You could do this for all the CV estimators that you modified to include this new behavior.
Once that is taken care of we might also need a ChangeLog entry.
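The suggested test might look roughly like the sketch below. Note that the refit parameter is what this PR proposes and does not exist in released scikit-learn, so the second fit is shown with the default constructor as a placeholder; once the PR lands, it would be swapped for ElasticNetCV(refit=False, ...):

```python
# Rough test sketch: the CV selection results (alpha_, alphas_, mse_path_)
# should not depend on whether a final refit on the full data happens.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV

X, y = make_regression(n_samples=50, n_features=5, random_state=0)

est_refit = ElasticNetCV(cv=3, random_state=0).fit(X, y)

# Proposed (this PR): ElasticNetCV(cv=3, refit=False, random_state=0)
est_norefit = ElasticNetCV(cv=3, random_state=0).fit(X, y)  # placeholder

# Selection attributes should agree between the two fits.
assert est_norefit.alpha_ == est_refit.alpha_
np.testing.assert_allclose(est_norefit.alphas_, est_refit.alphas_)
np.testing.assert_allclose(est_norefit.mse_path_, est_refit.mse_path_)
```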
Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
@OmarManzoor, thanks for the response :)
Noted, can you check if this is a good idea to compare the
Okay, will contact you once that's done
Hi there, can anyone support with this?
Reference Issues/PRs
What does this implement/fix? Explain your changes.
What
Adds a refit parameter to toggle this behavior.
Why
Users do not want to waste resources on refitting when they only want one or more of the following, which do not involve refitting:
This is especially impactful for large datasets.
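For context, GridSearchCV already exposes exactly this toggle in released scikit-learn, which is the precedent this PR extends to the coordinate-descent CV estimators. With refit=False only the cross-validation search runs: best_params_ is still available, but no best_estimator_ is fitted.

```python
# Existing precedent: GridSearchCV's refit=False skips the final fit on the
# full dataset while keeping the cross-validation results.
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=50, n_features=5, random_state=0)

search = GridSearchCV(ElasticNet(), {"alpha": [0.1, 1.0]}, cv=3, refit=False)
search.fit(X, y)

print(search.best_params_)                     # CV results are available...
assert not hasattr(search, "best_estimator_")  # ...but nothing was refit
```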
How
Adds a refit argument which disables refitting if refit=False. By default, refit=True is set to prevent breaking existing usage.
Any other comments?
This is my first (hopefully first of many) PR for scikit-learn. If you have any feedback on my implementation/PR documentation/etc, feel free to share - I'd really appreciate it.
Thanks to @paulAdrienMarie for all the support!