-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Refactor BaseEstimator's and Kernel's get_params and set_params #14175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. There's a lot going on, so I hope you can accept a slow review!
sklearn/base.py
Outdated
try: | ||
param_names = get_param_names_from_constructor(self.__class__) | ||
except SignatureError as e: | ||
raise SignatureError("scikit-learn estimators should always " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still more or less duplicated with kernels. Should we share this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still more or less duplicated with kernels. Should we share this?
I left that duplication to keep the error message of each class (scikit-learn estimators...
, scikit-learn kernels...
). I thought of 1) having the same scikit-learn estimators...
message for both, or 2) passing a string argument to get_param_names_from_constructor
to edit the message, but both felt awkward.
Though looking at it again, I noticed that set_params
is duplicated too, so maybe Kernel
could just inherit from BaseEstimator
. Are there any historical reasons as to why it does not does so? By inheriting It could also benefit from the pickling and repr stuff.
Also, Kernels do not include any nested estimator-like parameters right? get_params
is only shallow which seems enough, but set_params
can handle nested parameters, why is this?
No worries! 👌 |
Kernel does not have fit. We can have something else that is a common base
class.
…On Thu., 27 Jun. 2019, 12:03 am alegonz, ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In sklearn/base.py
<#14175 (comment)>
:
> @@ -190,14 +175,17 @@ def get_params(self, deep=True):
params : mapping of string to any
Parameter names mapped to their values.
"""
- out = dict()
- for key in self._get_param_names():
- value = getattr(self, key, None)
- if deep and hasattr(value, 'get_params'):
- deep_items = value.get_params().items()
- out.update((key + '__' + k, val) for k, val in deep_items)
- out[key] = value
- return out
+ try:
+ param_names = get_param_names_from_constructor(self.__class__)
+ except SignatureError as e:
+ raise SignatureError("scikit-learn estimators should always "
This is still more or less duplicated with kernels. Should we share this?
I left that duplication to keep the error message of each class (scikit-learn
estimators..., scikit-learn kernels...). I thought of 1) having the same scikit-learn
estimators... message for both, or 2) passing a string argument to
get_param_names_from_constructor to edit the message, but both felt
awkward.
Though looking at it again, I noticed that set_params is duplicated too,
so maybe Kernel could just inherit from BaseEstimator . Are there any
historical reasons as to why it does not does so? By inheriting It could
also benefit from the pickling and repr stuff.
Also, Kernels do not include any nested estimator-like parameters right?
get_params is only shallow which seems enough, but set_params can handle
nested parameters, why is this?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#14175?email_source=notifications&email_token=AAATH25ZAZJQZ2WZ7OMNSG3P4NZJXA5CNFSM4H27BY62YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOB4WW4BQ#discussion_r297683358>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAATH234RZMOX3HRBUL5BJLP4NZJXANCNFSM4H27BY6Q>
.
|
Oh, right. Forgot about that. I'm now realizing that even though Estimators are conceptually defined as having a fit method,
Right, I'll see what I can do. |
Just updated the PR with further refactoring. This time I extracted While it looks like the pickling and repr stuff in Note that |
sklearn/base.py
Outdated
""" | ||
try: | ||
return super().get_params(deep=deep) | ||
except SignatureError as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would be happier with super().__get_params(deep=deep, name='estimator')
than the introduction of these new exception classes as sentinels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll remove these exception classes and pass the name as an argument.
sklearn/exceptions.py
Outdated
self.param_name = param_name | ||
|
||
|
||
class SignatureError(RuntimeError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see why this should be public
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I didn't think about that. Anyway, this will be gone as per the comment above.
e2de51e
to
5a89f4d
Compare
I removed the custom exception classes and rephrased the error messages. I felt that, like the custom classes I had added, introducing a |
"specify their parameters in the signature" | ||
" of their __init__ (no varargs)." | ||
" %s doesn't follow this convention." | ||
% self.__class__) from e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The from e
will show the problematic signature in the error raised by get_param_names_from_constructor
.
" %s doesn't follow this convention." | ||
% self.__class__) from e | ||
|
||
def set_params(self, **params): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left to preserve the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just use the inherited method instead of overriding, but I left it as I wasn't sure if there are any rules regarding docstrings.
for arg in args: | ||
params[arg] = getattr(self, arg, None) | ||
return params | ||
% self.__class__) from e | ||
|
||
def set_params(self, **params): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
6fe90a6
to
9928673
Compare
@jnothman Just rebased onto latest master. The |
Sorry I'm short on review time at the moment. Give us reasonable time to
get to this and ping if you need to
… |
No worries 👍 |
…tion. This is meant to reuse it in gaussian_process.kernels.Kernel. Also added SignatureError to pass the signature and print the proper error message depending on the class.
changed BaseEstimator and gaussian_process.kernels.Kernel to inherit from it. Also added InvalidParameterError to pass the parameter name and print the proper error message in set_params depending on the class.
…ams and set_params.
9928673
to
8031fc0
Compare
@alegonz
Subseqeuntly, all grandchildren have automagically proprer signature. However, there are a few disadvantages:
I've looked at your solution, particularly |
Hi there. Sorry for the late reply.
It is not enough to inherit from class MySubclass(SomeSKLearnEstimator):
def __init__(self, myparam=42, **kwargs):
# Just pass **kwargs.
# No duplicated code.
super().__init__(**kwargs)
self.myparam = myparam
def get_params(self, deep=True):
# Gotta override get_params. But it shouldn't be hard.
# params from parent
parent_names = self._get_param_names(super())
params = self._get_params_from(parent_names, deep=deep)
# Add my params
names = self._get_param_names(self.__class__)
params.update(self._get_params_from(names, deep=deep))
# or just simply
# params['myparam'] = self.myparam
return params
x = MySubclass(myparam=42, param1=2, param2='bar')
print(x.get_params())
# {'param1': 2, 'param2': 'bar', 'myparam': 42} Currently AFAIK, without this PR one is left with three options:
|
@alegonz this looks pretty good, would you be able to rebase on |
Hi @adrinjalali , sorry for the late reply. Sure, I'd be happy to retake this PR. It's been a while since I opened this PR and the related issue so please give me some time to regain the context, re-study the code and address the rebase conflicts. Do you have any specific timeline for this PR? |
@alegonz not specific timeline. Just wanted to know if we should close this or give it to somebody else. But if you're happy to have another go, that'd be perfect. |
Reference Issues/PRs
Hi there, this is a PR for #13555 .
What does this implement/fix? Explain your changes.
_get_param_names
out ofBaseEstimator
as utility function now calledget_param_names_from_constructor
.gaussian_process.kernels.Kernel
.SignatureError
class to pass the signature from the utility function up to the appropriate exception message inBaseEstimator
andgaussian_process.kernels.Kernel
.gaussian_process.kernels.Kernel
to use the new utility function and remove duplication.GenericUnivariateSelect._make_selector
to useget_params
instead of the new utility function so that it raises the proper error in case of a selector with inappropriate signature.get_params
to call from a new_get_params_from
method. This is soget_params
can be easily overridden to take a signature of, for example, the parent class in custom subclasses of client code.Any other comments?
SignatureError
(subclassed fromRuntimeError
) these changes don't add or change current behavior, so I haven't added any new tests. If new tests might seem necessary, please let me know.