[MRG] PERF Significant performance improvements in Partial Least Squares (PLS) Regression #23876

Open · wants to merge 73 commits into main
Conversation

@fractionalhare fractionalhare commented Jul 9, 2022

Description of Changes

Motivating Summary

This PR dramatically improves the speed of PLS regression by implementing the modified kernel algorithm of Dayal and MacGregor (Dayal-MacGregor 1997). It yields significant performance improvements over the current default, NIPALS, without sacrificing accuracy or numerical stability.

The Dayal-MacGregor algorithm accurately computes the weight and rotation matrices for the maximal-covariance projection(s) of X without explicitly calculating the X score matrix and without deflating the Y target(s) at all. It is therefore (much) faster than NIPALS in the base case of univariate Y, and becomes asymptotically faster as the number of Y targets increases. Testing shows speed improvements of over 40% in the base case, rising to over 95% when Y is multivariate.

In other words, this PR makes PLS regression 2x - 20x faster for equivalent results. The paper gives detailed numerical stability and accuracy results, which are confirmed by my empirical tests below.
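For readers unfamiliar with the method, here is a minimal NumPy sketch of the modified kernel algorithm, which operates only on the cross-product matrices X'X and X'Y and deflates X'Y rather than Y. This is an illustration, not the PR's actual implementation: the function name is mine, and X and Y are assumed to be already centered/scaled as desired.

```python
import numpy as np

def dayal_macgregor_pls(X, Y, n_components):
    """Illustrative sketch of Dayal-MacGregor modified kernel PLS (1997)."""
    n, m = X.shape
    Y = Y.reshape(n, -1)
    ny = Y.shape[1]
    XtX = X.T @ X
    XtY = X.T @ Y
    P = np.zeros((m, n_components))   # X loadings
    Q = np.zeros((ny, n_components))  # Y loadings
    R = np.zeros((m, n_components))   # X rotations
    for a in range(n_components):
        if ny == 1:
            w = XtY[:, 0]
        else:
            # w is X'Y times the dominant eigenvector of the small
            # (ny x ny) symmetric matrix (X'Y)'(X'Y)
            _, evecs = np.linalg.eigh(XtY.T @ XtY)
            w = XtY @ evecs[:, -1]
        w = w / np.linalg.norm(w)
        r = w - R[:, :a] @ (P[:, :a].T @ w)  # orthogonalize vs earlier loadings
        tt = r @ XtX @ r                     # squared norm of the score t = X @ r
        p = (XtX @ r) / tt                   # X loading
        q = (XtY.T @ r) / tt                 # Y loading
        XtY = XtY - tt * np.outer(p, q)      # deflate X'Y; Y itself is never touched
        P[:, a], Q[:, a], R[:, a] = p, q, r
    return R @ Q.T                           # regression coefficient matrix B

# Sanity check: with as many components as features, PLS coincides with
# least squares, so an exactly linear Y is recovered exactly.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
Y = X @ rng.normal(size=(5, 2))
B = dayal_macgregor_pls(X, Y, n_components=5)
assert np.allclose(X @ B, Y)
```

Note that the per-component cost depends on the number of X features and Y targets only through these small cross-product matrices, which is where the asymptotic advantage over NIPALS for multivariate Y comes from.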

Specific Changes

  • The parameter constraints on the algorithm parameter of the _PLS base class were modified so that a user can select the new algorithm via algorithm="dayalmacgregor" or algorithm="kernel". See the comments below; I'm not attached to having both values refer to the new algorithm, but I've implemented it this way for now.
  • The PLSRegression child class was modified to accept the algorithm parameter at instantiation (previously a user could not select the algorithm, and the regression child class implicitly took the base class default). The argument defaults to algorithm="nipals" for backwards compatibility.
  • The fit method was augmented with the Dayal-MacGregor implementation, inside a conditional branch on the value of the algorithm attribute.
  • A new test was added to cover the new algorithm.

Performance & Accuracy Results

Small X (20 x 10) and Univariate Y (20 x 1) with 5 Components

NIPALS: 344 µs ± 4.77 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Dayal-MacGregor: 198 µs ± 3.87 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

The output predictions agree to 6 decimal places.

43% faster

Small X (20 x 10) and Multivariate Y (20 x 2) with 5 Components

NIPALS: 700 µs ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Dayal-MacGregor: 268 µs ± 4.92 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

The output predictions are equal to 4 decimal places.

62% faster

Moderate X (200 x 100) and Univariate Y (200 x 1) with 50 Components

NIPALS: 32.6 ms ± 2.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Dayal-MacGregor: 15 ms ± 470 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The output predictions agree to 6 decimal places.

54% faster

Moderate X (200 x 100) and Multivariate Y (200 x 2) with 50 Components

NIPALS: 226 ms ± 21.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Dayal-MacGregor: 15.8 ms ± 280 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The output predictions are equal to 6 decimal places.

94% faster

Moderate X (200 x 100) and Multivariate Y (200 x 3) with 50 Components

NIPALS: 346 ms ± 29.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Dayal-MacGregor: 17 ms ± 1.48 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

The output predictions are equal to 5 decimal places.

95% faster

Large X (2000 x 1000) and Univariate Y (2000 x 1) with 100 Components

NIPALS: 859 ms ± 27.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Dayal-MacGregor: 271 ms ± 20.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The output predictions are equal to 6 decimal places.

59% faster

Large X (2000 x 1000) and Multivariate Y (2000 x 2) with 100 Components

NIPALS: 5.71 s ± 390 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Dayal-MacGregor: 233 ms ± 10.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The output predictions are equal to 6 decimal places.

96% faster

Large X (2000 x 1000) and Multivariate Y (2000 x 3) with 100 Components

NIPALS: 6.43 s ± 398 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Dayal-MacGregor: 236 ms ± 13.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The output predictions agree to 6 decimal places.

97% faster

Testing

Dev Environment & Architecture

My testing environment is as follows:

  • Hardware: MacBook Pro (16-inch, 2021) with Apple M1 Max chip and 64GB RAM
  • OS: macOS Monterey v12.4
  • Python: 3.10.5
  • Installer/Package Manager: Anaconda3 (for Apple Silicon/arm64)
  • Building Compiler: llvm-openmp

I don't think this will materially impact the results, but wanted to note it.

Testing Script

I put together a performance and accuracy testing script.

After creating the environment with conda create -n sklearn-dev -c conda-forge python numpy scipy cython joblib threadpoolctl pytest compilers llvm-openmp, activating it with conda activate sklearn-dev, and running pip install --verbose --no-build-isolation --editable . in the repository directory, the following testing script (run in IPython/Jupyter, since it uses the %%timeit cell magic) should validate the foregoing performance and accuracy assessment. It randomly generates an m x n matrix X and an m x n_targets matrix Y, fits PLS regression models with each of the NIPALS and Dayal-MacGregor algorithms, and then assesses performance and accuracy.

```python
import numpy as np
from numpy.testing import assert_array_almost_equal
from sklearn.cross_decomposition import PLSRegression

#%% Config
m = 1000
n = 200
n_targets = 2

X = np.random.rand(m, n)
y = np.random.rand(m, n_targets)
n_components = 50
scale = True

####
# Performance Testing
####

#%% NIPALS
%%timeit
nipals = PLSRegression(n_components=n_components, scale=scale, algorithm='nipals')
nipals.fit(X=X, Y=y)
nipals.predict(X)

#%% Dayal-MacGregor Kernel
%%timeit
dayal = PLSRegression(n_components=n_components, scale=scale, algorithm='dayalmacgregor')
dayal.fit(X=X, Y=y)
dayal.predict(X)

#%%
####
# Accuracy/Stability Testing
####

control = PLSRegression(n_components=n_components, scale=scale, algorithm='nipals')
test = PLSRegression(n_components=n_components, scale=scale, algorithm='dayalmacgregor')
control.fit(X, y)
test.fit(X, y)
assert_array_almost_equal(test.predict(X), control.predict(X), decimal=4)
```

Comments

  • I kept NIPALS as the default algorithm. If this PR is accepted, we might want to discuss whether NIPALS should remain the default, given that Dayal-MacGregor is strictly superior for PLSRegression. Maybe keep NIPALS as the default, mention the new algorithm prominently in the documentation, and issue a warning that the new algorithm will become the default in a release or two?

  • Given that Dayal-MacGregor doesn't deflate the Y target(s), there are no y_scores_, y_weights_ and y_rotations_ class attributes. This means that Dayal-MacGregor is only implemented for PLS Regression (not PLS Canonical), and a ValueError check is implemented to that effect. Moreover, the PLSRegression class documentation was edited to reflect the fact that these three attributes are only given for NIPALS/SVD.

  • Should this algorithm be available via both algorithm="dayalmacgregor" and algorithm="kernel", or is it better to choose one? I currently have both implemented. I can see arguments either way but I'm not strongly opinionated. The Dayal-MacGregor algorithm is a kernel algorithm, but it's not the only one that exists in the literature (e.g. SIMPLS is also a kernel algorithm, albeit numerically unstable).

@fractionalhare fractionalhare changed the title PERF Implement Dayal-MacGregor Kernel algorithm for significant performance improvements in Partial Least Squares (PLS) Regression ENH / PERF Implement Dayal-MacGregor Kernel algorithm for significant performance improvements in Partial Least Squares (PLS) Regression Jul 10, 2022
@fractionalhare (Author)

@ogrisel yep, done

@ogrisel (Member) left a comment

Thanks for the updates. However #23876 (comment) has still not been addressed. Other than that, LGTM.

doc/whats_new/v1.2.rst (outdated review thread, resolved)
@ogrisel (Member) commented Jul 20, 2022

Another remark: the sign of the extracted coefficients can sometimes flip when switching algorithms (see the referenced plots, not reproduced here).

Would it be possible to use sklearn.utils.extmath.svd_flip or _svd_flip_1d to avoid this artifact? If not easy to achieve, no big deal.

@fractionalhare (Author) commented Jul 20, 2022

@ogrisel Implemented the change covering your last outstanding review comment about attributes.

Would it be possible to use sklearn.utils.extmath.svd_flip or _svd_flip_1d to avoid this artifact? If not easy to achieve, no big deal.

It doesn't seem easy to achieve: we don't actually use both the U and Vt from the decomposition to form the x_weights and y_weights as we do with NIPALS, so we can't iteratively flip the 1-D ndarrays computed in each n_components round with _svd_flip_1d. The computation isn't similar to NIPALS in this sense.

So what ends up happening is that the Dayal-MacGregor coefficient matrix has the right signs to match the coefficient matrix obtained from NIPALS, but the X rotation matrix and the Y loading matrix may have column vectors with opposite signs compared to the corresponding NIPALS matrices.

This quirk is harmless for prediction: the coefficient matrix is x_rotations_ @ y_loadings_.T, so when the signs of individual column vectors are opposite due to the SVD, the coefficient matrices still match.
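To illustrate why the flipped column signs cancel out in the product: flipping the sign of any component's column in both the rotations and the loadings leaves the coefficient matrix unchanged. A toy check with made-up stand-in matrices (not actual fitted attributes):

```python
import numpy as np

rng = np.random.default_rng(42)
x_rotations = rng.normal(size=(10, 3))  # stand-in for x_rotations_ (n_features x n_components)
y_loadings = rng.normal(size=(2, 3))    # stand-in for y_loadings_  (n_targets  x n_components)

signs = np.array([1.0, -1.0, 1.0])      # flip the second component's sign

coef = x_rotations @ y_loadings.T
coef_flipped = (x_rotations * signs) @ (y_loadings * signs).T

# (R D)(Q D)' = R D D Q' = R Q', since D is diagonal with entries +/- 1
assert np.allclose(coef, coef_flipped)
```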

@fractionalhare (Author)

@ogrisel Who else do we need for a review?

@fractionalhare fractionalhare requested a review from ogrisel July 30, 2022 16:11
@fractionalhare (Author)

@ogrisel what is this waiting on?

@ogrisel (Member) commented Aug 12, 2022

@ogrisel what is this waiting on?

Unfortunately the documentation building CI has timed out. Can you please try to push another commit (e.g. yet another merge with the current main) to see if this problem has been resolved? I am not sure about the cause.

It would be great to be able to check out the rendered HTML for the examples with your latest changes.

@ogrisel (Member) left a comment

This PR overall looks good to me but I would like to get the CI working prior to considering a merge.


```python
for k in range(n_components):
    if n_targets == 1:
        w = S
```

Could you please update the tests to also cover the case where n_targets == 1?

```python
)
except StopIteration as e:
    if str(e) != "Y residual is constant":
        raise
```

I think we can drop this if / raise branch: StopIteration can never be raised when the Y residual is not constant. This will also make the coverage report happier.

@ogrisel (Member) commented Aug 12, 2022

This quirk is harmless for prediction: the coefficient matrix is x_rotations_ @ y_loadings_.T, so when the signs of individual column vectors are opposite due to the SVD, the coefficient matrices still match.

I agree, but the fact that the sign of the coefficients changes when switching solvers is practically annoying, e.g. when writing documentation with an explanation of a plot of the latent space: the top right part of the figure could silently become the bottom left if the code is switched from nipals to the new solver.

Since we don't plan to change the default PLS solver right away, this is not necessarily a big problem, as this subtle behavior change will not happen just by updating the version of scikit-learn. Still, if there were a way to enforce this stability, it would be a nice convenience for our future users. See the Principle of Least Astonishment.
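One way such stability could be enforced is sketched below. This is a hypothetical helper, not part of the PR and not sklearn's svd_flip itself, but it mimics the same idea: pick a deterministic sign per component (here, make the largest-magnitude entry of each rotation column positive) and flip the matching loading column, so predictions are unchanged.

```python
import numpy as np

def flip_component_signs(x_rotations, y_loadings):
    """Hypothetical sign convention, in the spirit of
    sklearn.utils.extmath.svd_flip."""
    k = x_rotations.shape[1]
    idx = np.abs(x_rotations).argmax(axis=0)        # row of largest |entry| per column
    signs = np.sign(x_rotations[idx, np.arange(k)])
    signs[signs == 0] = 1.0                         # guard against an exact zero
    # Flip both factors so x_rotations @ y_loadings.T is unchanged.
    return x_rotations * signs, y_loadings * signs

rng = np.random.default_rng(0)
R, Q = rng.normal(size=(10, 3)), rng.normal(size=(2, 3))
R2, Q2 = flip_component_signs(R, Q)
assert np.allclose(R @ Q.T, R2 @ Q2.T)              # predictions unchanged
assert (R2[np.abs(R2).argmax(axis=0), np.arange(3)] > 0).all()
```

Applied to both solvers after fitting, a convention like this would make the signs of the reported rotations and loadings independent of which algorithm produced them.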

@lorentzenchr (Member)

I'm not an expert on PLS, but it seems to me that the proposed Dayal-MacGregor modified kernel algorithm is superior to the current one. Therefore, in the long run, I would not introduce a new solver parameter.
Due to the sign change, the open question is how to run a proper deprecation cycle. Could we introduce a new parameter solver that we deprecate right from the beginning, saying that the default will change and that this parameter will be removed in two versions?

@ogrisel (Member) commented Aug 16, 2022

The new solver does not compute all the attributes that the nipals algorithm can compute. So I would still keep the solver parameter, to be able to select the nipals algorithm should we decide to make Dayal-MacGregor the default in a future version of scikit-learn.

@lorentzenchr (Member)

Conclusion: let's introduce the solver parameter.

A possible (to me quite likely) future path, outside of this PR: change the default to the new solver, and finally remove the solver parameter and get rid of the old solver.

@fractionalhare fractionalhare closed this by deleting the head repository Nov 27, 2022
@lorentzenchr (Member)

@fractionalhare Why did you close?

@lorentzenchr lorentzenchr reopened this Dec 10, 2022
@fractionalhare (Author)

@ogrisel what is required to get this merged now?

@ayaanhossain

@fractionalhare @ogrisel @lorentzenchr @glemaitre hey guys any update on this?

@lorentzenchr (Member)

any update on this?

This PR needs 2 reviews, our scarcest resource. It usually helps a lot if the CI is all green.

@fractionalhare (Author)

Last status was an initial review from @ogrisel - enough time has passed that the review should probably be at least refreshed, and then one other person needs to do a review.

I can update the branch and see what coverage checks are still failing; we had the CI all green before so it should just be a matter of update and then tweaking tests.

@glemaitre (Member)

From what I see, we first need the comments of @ogrisel to be addressed. I can provide an additional review once the CI is working.

@ayaanhossain

any update on this?
