## Description

This issue is an RFC to clarify the expected behavior of `max_iter` and `n_iter_` (or, equivalently, `n_estimators` and `len(estimators_)`) when used with `warm_start=True`.

## Estimators to be considered

The estimators to be considered can be found in the following manner:
```python
from inspect import signature

from sklearn.utils import all_estimators

type_filter = ["classifier", "regressor"]
estimators = []
for name, klass in all_estimators(type_filter=type_filter):
    params = signature(klass).parameters
    if (
        any(it_param in params for it_param in ["max_iter", "n_estimators"])
        and "warm_start" in params
    ):
        print(name)
```
which gives:

```
BaggingClassifier
BaggingRegressor
ElasticNet
ExtraTreesClassifier
ExtraTreesRegressor
GammaRegressor
GradientBoostingClassifier
GradientBoostingRegressor
HistGradientBoostingClassifier
HistGradientBoostingRegressor
HuberRegressor
Lasso
LogisticRegression
MLPClassifier
MLPRegressor
MultiTaskElasticNet
MultiTaskLasso
PassiveAggressiveClassifier
PassiveAggressiveRegressor
Perceptron
PoissonRegressor
RandomForestClassifier
RandomForestRegressor
SGDClassifier
SGDRegressor
TweedieRegressor
```
## Review the different behaviors

We will evaluate the behavior with the following experiment:

- set `max_iter=2` (or `n_estimators=2`) and `warm_start=True`, `fit` the estimator, and check `n_iter_` (or `len(estimators_)`)
- set `max_iter=3` (or `n_estimators=3`), `fit` the estimator, and check `n_iter_` (or `len(estimators_)`)

The idea is to check whether we report the total number of iterations or just the number of iterations of the latest `fit` call.
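The snippets below use `X_reg`, `y_reg`, `X_clf`, and `y_clf` without defining them. A minimal sketch of such data, assuming synthetic datasets (the actual data used is not shown), could be:

```python
from sklearn.datasets import make_classification, make_regression

# Hypothetical stand-ins for the X_reg/y_reg and X_clf/y_clf used below.
X_reg, y_reg = make_regression(n_samples=200, n_features=10, random_state=0)
# The GLM estimators (Gamma/Poisson/Tweedie) require strictly positive
# targets, so shift the regression targets above zero.
y_reg = y_reg - y_reg.min() + 1
X_clf, y_clf = make_classification(n_samples=200, n_features=10, random_state=0)
```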
GLM estimators
from sklearn.linear_model import GammaRegressor, PoissonRegressor, TweedieRegressor
Estimators = [GammaRegressor, PoissonRegressor, TweedieRegressor]
for klass in Estimators:
print(klass.__name__)
estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_reg, y_reg)
print(f"{estimator.n_iter_=}")
print("---------------------------------------------------------------------------")
estimator.set_params(max_iter=3)
estimator.fit(X_reg, y_reg)
print(f"{estimator.n_iter_=}")
print("---------------------------------------------------------------------------")
In this case, `n_iter_` is reported to be 2 and then 3. However, the `verbose` option shows that the model effectively performed 5 iterations in total.
Ensemble estimators
from sklearn.ensemble import (
BaggingClassifier,
BaggingRegressor,
ExtraTreesClassifier,
ExtraTreesRegressor,
GradientBoostingClassifier,
GradientBoostingRegressor,
RandomForestClassifier,
RandomForestRegressor,
)
Estimators = [
BaggingRegressor,
ExtraTreesRegressor,
GradientBoostingRegressor,
RandomForestRegressor,
]
for klass in Estimators:
print(klass.__name__)
estimator = klass(warm_start=True, n_estimators=2).fit(X_reg, y_reg)
print(f"{len(estimator.estimators_)=}")
print("---------------------------------------------------------------------------")
estimator.set_params(n_estimators=3)
estimator.fit(X_reg, y_reg)
print(f"{len(estimator.estimators_)=}")
print("---------------------------------------------------------------------------")
Estimators = [
BaggingClassifier,
ExtraTreesClassifier,
GradientBoostingClassifier,
RandomForestClassifier,
]
for klass in Estimators:
print(klass.__name__)
estimator = klass(warm_start=True, n_estimators=2).fit(X_clf, y_clf)
print(f"{len(estimator.estimators_)=}")
print("---------------------------------------------------------------------------")
estimator.set_params(n_estimators=3)
estimator.fit(X_clf, y_clf)
print(f"{len(estimator.estimators_)=}")
print("---------------------------------------------------------------------------")
In this case, `len(estimators_)` is reported to be 2 and then 3. This differs from the GLMs above because only 3 estimators are trained in total: the second `fit` only adds the one missing estimator.
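One way to see why the ensembles differ from the GLMs: with `warm_start=True`, refitting with a larger `n_estimators` keeps the already-fitted estimators and only trains the missing ones. A sketch on synthetic data (not the original `X_reg`/`y_reg`):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=100, n_features=5, random_state=0)

forest = RandomForestRegressor(warm_start=True, n_estimators=2, random_state=0)
forest.fit(X, y)
first_two = list(forest.estimators_)

forest.set_params(n_estimators=3)
forest.fit(X, y)  # only the third tree is trained here

# The first two trees are reused as-is, so len(estimators_) == 3 means
# "3 estimators in total", not "3 estimators trained by the latest fit".
assert forest.estimators_[:2] == first_two
assert len(forest.estimators_) == 3
```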
The behavior is similar for the `HistGradientBoosting` estimators:
```python
from sklearn.ensemble import (
    HistGradientBoostingClassifier,
    HistGradientBoostingRegressor,
)

Estimators = [HistGradientBoostingRegressor, HistGradientBoostingClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
```
### Estimators using coordinate descent

```python
from sklearn.linear_model import ElasticNet, Lasso

Estimators = [ElasticNet, Lasso]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2).fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
```
The behavior will be similar for `MultiTaskElasticNet` and `MultiTaskLasso`. This is equivalent to the GLMs: `_path` is called with `self.max_iter` without taking `n_iter_` into account, so the total number of iterations will be 5.
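This per-fit reporting can be checked directly; a sketch with `Lasso` on synthetic data (an assumption, the original data is not shown):

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=100, n_features=50, noise=1.0, random_state=0)

with warnings.catch_warnings():
    # Neither fit converges in so few passes, hence the warnings.
    warnings.simplefilter("ignore", ConvergenceWarning)
    lasso = Lasso(warm_start=True, max_iter=2).fit(X, y)
    assert lasso.n_iter_ == 2  # only this fit's passes are reported

    lasso.set_params(max_iter=3).fit(X, y)
    assert lasso.n_iter_ == 3  # again capped by the latest max_iter
```

So 2 + 3 = 5 coordinate-descent passes are performed in total, while `n_iter_` only ever reflects the latest `fit` call.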
### MLP estimators

```python
from sklearn.neural_network import MLPClassifier, MLPRegressor

Estimators = [MLPRegressor, MLPClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
```
`n_iter_` is reported to be 2 and then 5. The `fit` behavior is therefore consistent with the other linear models, but the reported `n_iter_` gives the cumulative number of iterations.
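The cumulative reporting can be verified directly; a sketch assuming synthetic data:

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=100, random_state=0)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    mlp = MLPClassifier(warm_start=True, max_iter=2, random_state=0).fit(X, y)
    assert mlp.n_iter_ == 2  # first fit: 2 iterations

    mlp.set_params(max_iter=3).fit(X, y)  # 3 more iterations
    assert mlp.n_iter_ == 5  # n_iter_ reports the running total
```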
### SGD estimators

```python
from sklearn.linear_model import SGDClassifier, SGDRegressor

Estimators = [SGDRegressor, SGDClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
```
`n_iter_` is reported to be 2 and then 3, with 5 iterations performed in total, in line with the GLMs. `Perceptron` exposes the same behavior.
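The same pattern can be asserted for both `SGDClassifier` and `Perceptron`; a sketch on synthetic data (an assumption, as above):

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Perceptron, SGDClassifier

X, y = make_classification(n_samples=100, random_state=0)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    for klass in (SGDClassifier, Perceptron):
        clf = klass(warm_start=True, max_iter=2, random_state=0).fit(X, y)
        assert clf.n_iter_ == 2  # epochs of the latest fit only

        clf.set_params(max_iter=3).fit(X, y)
        assert clf.n_iter_ == 3  # again only the latest fit's epochs
```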
### Other estimators

`HuberRegressor` behaves the same as the GLMs.