The Wayback Machine - https://web.archive.org/web/20210916111623/https://github.com/scikit-learn/scikit-learn/pull/13987
[MRG] Avoid unnecessary copies in sklearn.preprocessing #13987

Merged
merged 3 commits into from Jun 1, 2019

Conversation

Member

@rth rth commented May 30, 2019

Partially addresses #13986

This removes the copy=True in the fit methods of StandardScaler, MinMaxScaler, MaxAbsScaler and RobustScaler, where a copy is typically not necessary to compute the scaling factors.

In practice, this makes StandardScaler().fit_transform 10%-20% faster on the few examples I have tried.

If that copy were necessary and this PR mistakenly removed it, check_transformer_general(.., readonly_memmap=True) would fail in the common tests.
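The reasoning above can be illustrated with a minimal NumPy-only sketch. `TinyStandardScaler` is a hypothetical stand-in written for this example, not scikit-learn's implementation: its `fit` only reads the input, so it works on a read-only array (used here to emulate a read-only memmap), while `transform` is the only place where `copy` matters.

```python
import numpy as np


class TinyStandardScaler:
    """Hypothetical stand-in for illustration; not scikit-learn's StandardScaler."""

    def __init__(self, copy=True):
        self.copy = copy

    def fit(self, X):
        # fit only reads X, so no copy is requested here.
        X = np.asarray(X, dtype=np.float64)
        self.mean_ = X.mean(axis=0)
        self.scale_ = X.std(axis=0)
        self.scale_[self.scale_ == 0.0] = 1.0  # avoid division by zero
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=np.float64)
        if self.copy:
            X = X.copy()  # protect the caller's data
        X -= self.mean_   # in-place on the copy (or on the input if copy=False)
        X /= self.scale_
        return X


rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
X.setflags(write=False)  # emulate a read-only input

# fit succeeds on read-only data because it never writes to X.
scaler = TinyStandardScaler(copy=True).fit(X)
Xt = scaler.transform(X)

# With copy=False the transform reuses the (writable) input buffer.
X_writable = X.copy()
Xt_inplace = TinyStandardScaler(copy=False).fit(X_writable).transform(X_writable)
```

If `fit` tried to modify its input, the read-only array would raise an error, which is the kind of failure the read-only memmap check is meant to surface.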

Member

@NicolasHug NicolasHug left a comment

LGTM.

Looks like in these specific cases inplace would have been a more descriptive parameter name than copy, and might have prevented this.

Member

@thomasjpfan thomasjpfan left a comment

LGTM

Member

@thomasjpfan thomasjpfan commented May 30, 2019

Does this need a whats_new entry as an enhancement or a bug fix?

Member Author

@rth rth commented May 31, 2019

Thanks for the reviews! Added a what's new.

Member

@thomasjpfan thomasjpfan commented May 31, 2019

QuantileTransformer has a _check_inputs that copies during fit and transform. What do you think about adding a copy parameter to _check_inputs and setting it to False during fit and to self.copy during transform?
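A rough sketch of what this suggestion could look like, assuming a simplified transformer. The class `SketchQuantileTransformer`, its quantile logic and the exact `_check_inputs` signature are hypothetical and written only for this example; they are not the actual QuantileTransformer code.

```python
import numpy as np


class SketchQuantileTransformer:
    """Hypothetical simplification; not scikit-learn's QuantileTransformer."""

    def __init__(self, n_quantiles=11, copy=True):
        self.n_quantiles = n_quantiles
        self.copy = copy

    def _check_inputs(self, X, copy):
        # Validation helper with an explicit copy flag (assumed signature).
        X = np.asarray(X, dtype=np.float64)
        if X.ndim != 2:
            raise ValueError("Expected a 2D array")
        return X.copy() if copy else X

    def fit(self, X):
        # fit only reads X, so it never needs a copy.
        X = self._check_inputs(X, copy=False)
        self.references_ = np.linspace(0, 1, self.n_quantiles)
        self.quantiles_ = np.quantile(X, self.references_, axis=0)
        return self

    def transform(self, X):
        # transform writes in place, so it honours the user's copy setting.
        X = self._check_inputs(X, copy=self.copy)
        for j in range(X.shape[1]):
            X[:, j] = np.interp(X[:, j], self.quantiles_[:, j], self.references_)
        return X


rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
X_train.setflags(write=False)  # read-only input
before = X_train.copy()

qt = SketchQuantileTransformer(copy=True).fit(X_train)
out = qt.transform(X_train)

buf = X_train.copy()  # writable buffer for the in-place variant
out_inplace = SketchQuantileTransformer(copy=False).fit(buf).transform(buf)
```

Passing `copy` explicitly keeps the decision at each call site: `fit` skips the copy unconditionally, and only `transform` depends on `self.copy`.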

@thomasjpfan thomasjpfan merged commit 9661a64 into scikit-learn:master Jun 1, 2019
16 checks passed
LGTM analysis: C/C++ No code changes detected
LGTM analysis: JavaScript No code changes detected
LGTM analysis: Python No new or fixed alerts
ci/circleci: deploy Your tests passed on CircleCI!
ci/circleci: doc Your tests passed on CircleCI!
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
ci/circleci: lint Your tests passed on CircleCI!
codecov/patch 100% of diff hit (target 96.8%)
codecov/project 96.8% (+<.01%) compared to 3ed2002
scikit-learn.scikit-learn Build #20190531.31 succeeded
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
scikit-learn.scikit-learn (Linux py35_np_atlas) Linux py35_np_atlas succeeded
scikit-learn.scikit-learn (Linux pylatest_conda) Linux pylatest_conda succeeded
scikit-learn.scikit-learn (Windows py35_32) Windows py35_32 succeeded
scikit-learn.scikit-learn (Windows py37_64) Windows py37_64 succeeded
scikit-learn.scikit-learn (macOS pylatest_conda) macOS pylatest_conda succeeded
Member

@thomasjpfan thomasjpfan commented Jun 1, 2019

Thank you @rth!

3 participants