TST remove _required_parameters and improve instance generation #29707

adrinjalali · Aug 23, 2024

This basically requires #29699 and #29702 to be merged first.

This PR refactors instance generation so that there is no more need for _required_parameters. This also means estimators are allowed to have init parameters with non-default values, which is already the case.
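As a rough illustration of the idea (not the code in this PR), test instances can be built from a mapping of class to constructor arguments, so a class whose `__init__` has a parameter without a default needs no special class attribute. `ToyScaler`, `ToyWrapper`, `EXAMPLE_INIT_PARAMS`, `params_without_default` and `construct_instances` are hypothetical names used only for this sketch.

```python
import inspect


class ToyScaler:
    """Hypothetical estimator whose parameters all have defaults."""

    def __init__(self, with_mean=True):
        self.with_mean = with_mean


class ToyWrapper:
    """Hypothetical meta-estimator: `estimator` has no default value."""

    def __init__(self, estimator, cv=5):
        self.estimator = estimator
        self.cv = cv


# Assumed registry: constructor arguments to use when building test instances.
EXAMPLE_INIT_PARAMS = {
    ToyWrapper: dict(estimator=ToyScaler(), cv=3),
}


def params_without_default(cls):
    """Return the names of __init__ parameters that have no default."""
    sig = inspect.signature(cls.__init__)
    return [
        name
        for name, p in sig.parameters.items()
        if name != "self" and p.default is inspect.Parameter.empty
    ]


def construct_instances(classes):
    """Yield one instance per class, using the registry when it has an entry."""
    for cls in classes:
        kwargs = EXAMPLE_INIT_PARAMS.get(cls, {})
        missing = set(params_without_default(cls)) - set(kwargs)
        if missing:
            raise ValueError(f"{cls.__name__} needs values for {sorted(missing)}")
        yield cls(**kwargs)


instances = list(construct_instances([ToyScaler, ToyWrapper]))
```

Because the checks receive ready-made instances, they never have to guess how to call a constructor.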

github-actions · Aug 23, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 4a99a3f.

adrinjalali

@glemaitre this is where I further the work on instance creation, which you also mentioned on the other PR.

adrinjalali · Sep 4, 2024

sklearn/utils/_test_common/instance_generator.py

@@ -430,15 +444,8 @@ def _get_check_estimator_ids(obj):
            return re.sub(r"\s", "", str(obj))


-def _generate_column_transformer_instances():


This isn't removed, it's now yielded beside all other estimators.

adrinjalali · Sep 4, 2024

sklearn/utils/_test_common/instance_generator.py

-    HalvingGridSearchCV: dict(cv=3),
-    HalvingRandomSearchCV: dict(cv=3),


these two are manually set in instance generation.

adrinjalali · Sep 4, 2024

sklearn/compose/_column_transformer.py

@@ -1320,6 +1320,21 @@ def get_metadata_routing(self):

        return router

+    def _more_tags(self):
+        return {
+            "_xfail_checks": {


ColumnTransformer is now tested alongside all other estimators, hence more checks are failing; they need to be ignored for now and fixed later.
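A minimal sketch of how an `_xfail_checks` mapping could be consumed by a test runner, so that known failures are reported rather than raised. This is not scikit-learn's actual machinery; `ToyTransformer`, the two toy checks and `run_checks` are hypothetical.

```python
class ToyTransformer:
    """Hypothetical estimator that marks one check as a known failure."""

    def _more_tags(self):
        return {"_xfail_checks": {"check_always_fails": "FIXME: known failure"}}


def check_always_passes(estimator):
    assert estimator is not None


def check_always_fails(estimator):
    raise AssertionError("this check is expected to fail for now")


def run_checks(estimator, checks):
    """Run each check; report known failures as 'xfail' instead of raising."""
    xfail = estimator._more_tags().get("_xfail_checks", {})
    results = {}
    for check in checks:
        name = check.__name__
        try:
            check(estimator)
        except AssertionError:
            if name not in xfail:
                raise  # an unexpected failure still surfaces
            results[name] = f"xfail ({xfail[name]})"
        else:
            results[name] = "passed"
    return results


results = run_checks(ToyTransformer(), [check_always_passes, check_always_fails])
```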

adrinjalali · Sep 4, 2024

sklearn/model_selection/_search_successive_halving.py

@@ -379,6 +379,9 @@ def _more_tags(self):
                    "Fail during parameter check since min/max resources requires"
                    " more samples"
                ),
+                "check_estimators_nan_inf": "FIXME",


These estimators are now also tested with the others, and these checks fail. They need fixes in another PR.

adrinjalali · Sep 4, 2024

sklearn/pipeline.py

@@ -1881,6 +1881,15 @@ def get_metadata_routing(self):

        return router

+    def _more_tags(self):
+        return {
+            "_xfail_checks": {


adrinjalali · Sep 4, 2024

sklearn/tests/test_common.py

@@ -309,6 +311,8 @@ def _estimators_that_predict_in_fit():
    "estimator", column_name_estimators, ids=_get_check_estimator_ids
 )
 def test_pandas_column_name_consistency(estimator):
+    if isinstance(estimator, ColumnTransformer):


This test and test_check_param_validation are being moved to estimator_checks, where they can be skipped based on an estimator tag. For now the skip needs to be hardcoded here. The PR moving these tests will fix this issue as well.

adrinjalali · Sep 4, 2024

sklearn/utils/estimator_checks.py

+def check_estimator_cloneable(name, estimator_orig):
+    """Checks whether the estimator can be cloned."""
+    try:
+        clone(estimator_orig)
+    except Exception as e:
+        raise AssertionError(f"Cloning of {name} failed with error: {e}.") from e
+
+
+def check_estimator_repr(name, estimator_orig):
+    """Check that the estimator has a functioning repr."""
+    estimator = clone(estimator_orig)
+    try:
+        repr(estimator)
+    except Exception as e:
+        raise AssertionError(f"Repr of {name} failed with error: {e}.") from e


These two were inside check_parameters_default_constructible and have now been moved out into their own tests.
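To illustrate what the two split-out checks catch, here is a self-contained sketch. `copy.deepcopy` stands in for `sklearn.base.clone` purely so the example has no dependencies, and `BrokenRepr` is a hypothetical class.

```python
import copy


def clone(estimator):
    # Stand-in for sklearn.base.clone, an assumption made for this sketch only.
    return copy.deepcopy(estimator)


def check_estimator_cloneable(name, estimator_orig):
    """Checks whether the estimator can be cloned."""
    try:
        clone(estimator_orig)
    except Exception as e:
        raise AssertionError(f"Cloning of {name} failed with error: {e}.") from e


def check_estimator_repr(name, estimator_orig):
    """Check that the estimator has a functioning repr."""
    estimator = clone(estimator_orig)
    try:
        repr(estimator)
    except Exception as e:
        raise AssertionError(f"Repr of {name} failed with error: {e}.") from e


class BrokenRepr:
    """Hypothetical estimator: clones fine, but its repr raises."""

    def __repr__(self):
        raise RuntimeError("no repr available")


# The clone check passes silently for this class.
check_estimator_cloneable("BrokenRepr", BrokenRepr())

# The repr check fails, and only the repr check.
try:
    check_estimator_repr("BrokenRepr", BrokenRepr())
except AssertionError as exc:
    message = str(exc)
```

With separate checks, a failure points at either cloning or repr directly.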

adrinjalali · Sep 4, 2024

doc/developers/develop.rst

@@ -659,15 +659,6 @@ Even if it is not recommended, it is possible to override the method
 any of the keys documented above is not present in the output of `_get_tags()`,
 an error will occur.

-In addition to the tags, estimators also need to declare any non-optional


This is now not required, and there's no need for a replacement since we now pass instances to estimator checks and not classes.

glemaitre

LGTM. Just one question about having both TEST_PARAMS and INIT_PARAMS. It should be possible to converge towards a single dictionary.

sklearn/base.py

sklearn/tests/test_common.py

glemaitre · Sep 4, 2024

sklearn/utils/_test_common/instance_generator.py

@@ -304,48 +316,93 @@ def _generate_pipeline():
        )


+INIT_PARAMS = {


The name here is a bit confusing because we now have both TEST_PARAMS and INIT_PARAMS, and at first glance it is not easy to see why we have two dictionaries.

Why would INIT_PARAMS not be enough to run all tests?

I didn't want to change much behaviour with this PR. Merging the two would mean we'd be setting the parameters according to TEST_PARAMS in all tests, which we're not doing now. I'll see if anything fails if I merge them.
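One way to picture the merge being discussed, as a sketch only: the dictionary contents below are made up, and `ToyClassifier` and `merge_param_dicts` are hypothetical names, not part of this PR.

```python
class ToyClassifier:
    """Hypothetical estimator used only for this sketch."""

    def __init__(self, estimator=None, max_iter=100, cv=5):
        self.estimator = estimator
        self.max_iter = max_iter
        self.cv = cv


# Made-up contents: arguments needed just to construct an instance ...
INIT_PARAMS = {ToyClassifier: dict(estimator="placeholder")}
# ... and arguments that make the common tests run faster.
TEST_PARAMS = {ToyClassifier: dict(max_iter=5, cv=2)}


def merge_param_dicts(init_params, test_params):
    """Combine two class -> kwargs mappings; test_params wins on conflicts."""
    merged = {}
    for cls in init_params.keys() | test_params.keys():
        merged[cls] = {**init_params.get(cls, {}), **test_params.get(cls, {})}
    return merged


MERGED = merge_param_dicts(INIT_PARAMS, TEST_PARAMS)
instance = ToyClassifier(**MERGED[ToyClassifier])
```

Once merged, every test that builds instances from the single dictionary also sees the test-oriented values, which is the behaviour change mentioned above.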

glemaitre · Sep 4, 2024

sklearn/utils/_test_common/instance_generator.py

+        cv=2,
+        error_score="raise",
+    ),
+    HalvingRandomSearchCV: dict(


You probably need to tweak the parameters for this one a bit more to avoid the current failure.

This is odd: I couldn't reproduce it locally with a single job, but running the tests in parallel I see the failure.

Yep, this is weird. It looks like a side effect that changes the state of the random number generator and thus the behaviour (or data).
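The kind of side effect suspected here can be sketched with the standard library's `random` module. This is only an illustration of shared generator state, not a diagnosis of the actual failure, and the function names are hypothetical.

```python
import random


def draw_with_global_state(n):
    """Draw from the module-level generator, which every caller shares."""
    return [random.random() for _ in range(n)]


def draw_with_local_state(n, seed):
    """Draw from a private generator, unaffected by other callers."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


# Same seed, but an unrelated call in between consumes the shared state.
random.seed(0)
first = draw_with_global_state(3)

random.seed(0)
random.random()  # stand-in for another test touching the global generator
second = draw_with_global_state(3)

# A private generator gives the same values regardless of what ran before.
local_a = draw_with_local_state(3, seed=0)
random.random()
local_b = draw_with_local_state(3, seed=0)
```

Here `first` and `second` differ even though both follow a `random.seed(0)` call, while `local_a` and `local_b` are identical. Test order and parallel scheduling can change results in the same way when state is shared.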


adrinjalali · Sep 5, 2024

@OmarManzoor this should be a relatively easy second review.

OmarManzoor

LGTM. Thanks @adrinjalali

sklearn/utils/_test_common/instance_generator.py

sklearn/utils/tests/test_estimator_checks.py

adrinjalali added 10 commits August 21, 2024 15:50

TST allow categorisation of tests into API and legacy

d28c3cb

TST refactor instance generation and parameter setting

13a8e27

add legacy to check_estimator

3474eea

fix tests

3975f17

Merge remote-tracking branch 'upstream/main' into tests/legacy

d460786

remove unnecessary vars

e27edd3

Merge remote-tracking branch 'upstream/main' into test/param_set

3cabeb7

TST remove _required_parameters

7eec068

Merge branch 'tests/legacy' into test/required_parameters

aa4d808

TST remove _required_parameters

c3f1249

ignore failing tests

7c4a3b2

adrinjalali added No Changelog Needed Developer API Third party developer API related labels Aug 26, 2024

adrinjalali added 3 commits September 4, 2024 14:48

a few more fixes

02f99a1

skipping tests that should be skipped

1f220c4

remove _required_parameters

f434406

adrinjalali commented Sep 4, 2024


adrinjalali marked this pull request as ready for review September 4, 2024 13:50

glemaitre changed the title ~~[WIP] TST remove _required_parameters and improve instance generation~~ TST remove _required_parameters and improve instance generation Sep 4, 2024

glemaitre self-requested a review September 4, 2024 14:36

glemaitre approved these changes Sep 4, 2024


adrinjalali added 6 commits September 5, 2024 09:16

trying different params

650bb8e

merge the two dicts

148e5d5

reduce diff

32a6ec6

Merge remote-tracking branch 'upstream/main' into test/required_parameters

1ac2a8e

use new tags

0c7366c

test error messages of tests

27d315d

OmarManzoor approved these changes Sep 6, 2024


sklearn/utils/_test_common/instance_generator.py (outdated, resolved)

sklearn/utils/tests/test_estimator_checks.py (outdated, resolved)

OmarManzoor and others added 3 commits September 6, 2024 11:14

Update sklearn/utils/_test_common/instance_generator.py

912b14a

Update sklearn/utils/tests/test_estimator_checks.py

fd03329

Merge branch 'main' into test/required_parameters

4a99a3f

OmarManzoor enabled auto-merge (squash) September 6, 2024 06:15

OmarManzoor merged commit 95e9459 into scikit-learn:main Sep 6, 2024
28 checks passed

adrinjalali deleted the test/required_parameters branch September 6, 2024 07:22
