## Description

This issue is an RFC to clarify the expected behavior of `max_iter` and `n_iter_` (or, equivalently, `n_estimators` and `len(estimators_)`) when used with `warm_start=True`.

## Estimators to be considered

The estimators to be considered can be found in the following manner:
```python
from inspect import signature

from sklearn.utils import all_estimators

type_filter = ["classifier", "regressor"]
estimators = []
for name, klass in all_estimators(type_filter=type_filter):
    params = signature(klass).parameters
    if (
        any(it_param in params for it_param in ["max_iter", "n_estimators"])
        and "warm_start" in params
    ):
        print(name)
```
which gives:

```
BaggingClassifier
BaggingRegressor
ElasticNet
ExtraTreesClassifier
ExtraTreesRegressor
GammaRegressor
GradientBoostingClassifier
GradientBoostingRegressor
HistGradientBoostingClassifier
HistGradientBoostingRegressor
HuberRegressor
Lasso
LogisticRegression
MLPClassifier
MLPRegressor
MultiTaskElasticNet
MultiTaskLasso
PassiveAggressiveClassifier
PassiveAggressiveRegressor
Perceptron
PoissonRegressor
RandomForestClassifier
RandomForestRegressor
SGDClassifier
SGDRegressor
TweedieRegressor
```
## Review the different behaviors

We will evaluate the behavior with the following experiment:

- set `max_iter=2` (or `n_estimators=2`) and `warm_start=True`, `fit` the estimator, and check `n_iter_` (or `len(estimators_)`)
- set `max_iter=3` (or `n_estimators=3`), `fit` the estimator, and check `n_iter_` (or `len(estimators_)`)

The idea is to check whether we report the total number of iterations or just the number of iterations of the latest `fit` call.
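The snippets below use `X_reg`, `y_reg`, `X_clf`, and `y_clf` without defining them. A minimal sketch of such data, assuming synthetic datasets (the actual data used is not shown), could be:

```python
from sklearn.datasets import make_classification, make_regression

# Hypothetical stand-ins for the X_reg/y_reg and X_clf/y_clf used below.
X_reg, y_reg = make_regression(n_samples=200, n_features=10, random_state=0)
# The GLM estimators (Gamma/Poisson/Tweedie) require strictly positive
# targets, so shift the regression targets above zero.
y_reg = y_reg - y_reg.min() + 1
X_clf, y_clf = make_classification(n_samples=200, n_features=10, random_state=0)
```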
GLM estimators
from sklearn.linear_model import GammaRegressor, PoissonRegressor, TweedieRegressor
Estimators = [GammaRegressor, PoissonRegressor, TweedieRegressor]
for klass in Estimators:
print(klass.__name__)
estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_reg, y_reg)
print(f"{estimator.n_iter_=}")
print("---------------------------------------------------------------------------")
estimator.set_params(max_iter=3)
estimator.fit(X_reg, y_reg)
print(f"{estimator.n_iter_=}")
print("---------------------------------------------------------------------------")
In this case, `n_iter_` is reported to be 2 and then 3. However, the `verbose` option shows that the model effectively performed 5 iterations in total.
Ensemble estimators
from sklearn.ensemble import (
BaggingClassifier,
BaggingRegressor,
ExtraTreesClassifier,
ExtraTreesRegressor,
GradientBoostingClassifier,
GradientBoostingRegressor,
RandomForestClassifier,
RandomForestRegressor,
)
Estimators = [
BaggingRegressor,
ExtraTreesRegressor,
GradientBoostingRegressor,
RandomForestRegressor,
]
for klass in Estimators:
print(klass.__name__)
estimator = klass(warm_start=True, n_estimators=2).fit(X_reg, y_reg)
print(f"{len(estimator.estimators_)=}")
print("---------------------------------------------------------------------------")
estimator.set_params(n_estimators=3)
estimator.fit(X_reg, y_reg)
print(f"{len(estimator.estimators_)=}")
print("---------------------------------------------------------------------------")
Estimators = [
BaggingClassifier,
ExtraTreesClassifier,
GradientBoostingClassifier,
RandomForestClassifier,
]
for klass in Estimators:
print(klass.__name__)
estimator = klass(warm_start=True, n_estimators=2).fit(X_clf, y_clf)
print(f"{len(estimator.estimators_)=}")
print("---------------------------------------------------------------------------")
estimator.set_params(n_estimators=3)
estimator.fit(X_clf, y_clf)
print(f"{len(estimator.estimators_)=}")
print("---------------------------------------------------------------------------")
In this case, `len(estimators_)` is reported to be 2 and then 3. This differs from the GLMs above because only 3 estimators are trained in total: the second `fit` only adds the one missing estimator.
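One way to see why the ensembles differ from the GLMs: with `warm_start=True`, refitting with a larger `n_estimators` keeps the already-fitted estimators and only trains the missing ones. A sketch on synthetic data (not the original `X_reg`/`y_reg`):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=100, n_features=5, random_state=0)

forest = RandomForestRegressor(warm_start=True, n_estimators=2, random_state=0)
forest.fit(X, y)
first_two = list(forest.estimators_)

forest.set_params(n_estimators=3)
forest.fit(X, y)  # only the third tree is trained here

# The first two trees are reused as-is, so len(estimators_) == 3 means
# "3 estimators in total", not "3 estimators trained by the latest fit".
assert forest.estimators_[:2] == first_two
assert len(forest.estimators_) == 3
```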
The behavior is similar for the `HistGradientBoosting` estimators:
```python
from sklearn.ensemble import (
    HistGradientBoostingClassifier,
    HistGradientBoostingRegressor,
)

Estimators = [HistGradientBoostingRegressor, HistGradientBoostingClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
```
### Estimators using coordinate descent

```python
from sklearn.linear_model import ElasticNet, Lasso

Estimators = [ElasticNet, Lasso]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2).fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_reg, y_reg)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
```
The behavior will be similar for `MultiTaskElasticNet` and `MultiTaskLasso`. This is equivalent to the GLMs: `_path` is called with `self.max_iter` without taking `n_iter_` into account, so the total number of iterations will be 5.
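This per-fit reporting can be checked directly; a sketch with `Lasso` on synthetic data (an assumption, the original data is not shown):

```python
import warnings

from sklearn.datasets import make_regression
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=100, n_features=50, noise=1.0, random_state=0)

with warnings.catch_warnings():
    # Neither fit converges in so few passes, hence the warnings.
    warnings.simplefilter("ignore", ConvergenceWarning)
    lasso = Lasso(warm_start=True, max_iter=2).fit(X, y)
    assert lasso.n_iter_ == 2  # only this fit's passes are reported

    lasso.set_params(max_iter=3).fit(X, y)
    assert lasso.n_iter_ == 3  # again capped by the latest max_iter
```

So 2 + 3 = 5 coordinate-descent passes are performed in total, while `n_iter_` only ever reflects the latest `fit` call.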
### MLP estimators

```python
from sklearn.neural_network import MLPClassifier, MLPRegressor

Estimators = [MLPRegressor, MLPClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
```
`n_iter_` is reported to be 2 and then 5. The `fit` behavior is therefore consistent with the other linear models, but the reported `n_iter_` gives the cumulative number of iterations.
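The cumulative reporting can be verified directly; a sketch assuming synthetic data:

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=100, random_state=0)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    mlp = MLPClassifier(warm_start=True, max_iter=2, random_state=0).fit(X, y)
    assert mlp.n_iter_ == 2  # first fit: 2 iterations

    mlp.set_params(max_iter=3).fit(X, y)  # 3 more iterations
    assert mlp.n_iter_ == 5  # n_iter_ reports the running total
```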
### SGD estimators

```python
from sklearn.linear_model import SGDClassifier, SGDRegressor

Estimators = [SGDRegressor, SGDClassifier]
for klass in Estimators:
    print(klass.__name__)
    estimator = klass(warm_start=True, max_iter=2, verbose=True).fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
    estimator.set_params(max_iter=3)
    estimator.fit(X_clf, y_clf)
    print(f"{estimator.n_iter_=}")
    print("---------------------------------------------------------------------------")
```
`n_iter_` is reported to be 2 and then 3, with 5 iterations performed in total, in line with the GLMs. `Perceptron` exposes the same behavior.
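The same pattern can be asserted for both `SGDClassifier` and `Perceptron`; a sketch on synthetic data (an assumption, as above):

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Perceptron, SGDClassifier

X, y = make_classification(n_samples=100, random_state=0)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    for klass in (SGDClassifier, Perceptron):
        clf = klass(warm_start=True, max_iter=2, random_state=0).fit(X, y)
        assert clf.n_iter_ == 2  # epochs of the latest fit only

        clf.set_params(max_iter=3).fit(X, y)
        assert clf.n_iter_ == 3  # again only the latest fit's epochs
```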
### Other estimators

`HuberRegressor` behaves the same as the GLMs.