[MRG] PERF Significant performance improvements in Partial Least Squares (PLS) Regression #23876
base: main
Conversation
…ns for y weight, score and rotation matrices in the class docstring
@ogrisel yep, done
Thanks for the updates. However #23876 (comment) has still not been addressed. Other than that, LGTM.
Another remark: the sign of the extracted coef can sometimes flip when switching algorithms, e.g. look at the plots of:
Would it be possible to use …
@ogrisel Implemented the change covering your last outstanding review comment about attributes.
It doesn't seem easy to achieve; we don't actually use both the […]. So what ends up happening is that the Dayal-MacGregor coefficient matrix has the right signs to match the coefficient matrix you obtain from NIPALS, but the X rotation matrix and the Y loading matrix may have opposite signs in their column vectors compared to the corresponding matrices from NIPALS. This quirk is harmless for prediction because the coefficient matrix is what is actually used to predict.
@ogrisel Who else do we need for a review?
@ogrisel what is this waiting on?
Unfortunately the documentation building CI has timed out. Can you please try to push another commit (e.g. yet another merge with the current main branch)? It would be great to be able to check out the rendered HTML for the examples with your latest changes.
This PR overall looks good to me but I would like to get the CI working prior to considering a merge.
```python
for k in range(n_components):
    if n_targets == 1:
        w = S
```
Could you please update the tests to also cover the case where `n_targets == 1`?
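For instance, a minimal sketch of such a test (the test name and tolerance are mine, and it assumes the `algorithm="dayalmacgregor"` option added by this PR's branch):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression


def test_dayalmacgregor_single_target():
    # n_targets == 1 exercises the branch that skips the eigendecomposition
    rng = np.random.RandomState(0)
    X = rng.randn(20, 10)
    y = rng.randn(20)  # single target, i.e. n_targets == 1
    pred_nipals = (
        PLSRegression(n_components=5, algorithm="nipals").fit(X, y).predict(X)
    )
    pred_dm = (
        PLSRegression(n_components=5, algorithm="dayalmacgregor")
        .fit(X, y)
        .predict(X)
    )
    np.testing.assert_allclose(pred_nipals, pred_dm, atol=1e-6)
```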
```python
except StopIteration as e:
    if str(e) != "Y residual is constant":
        raise
```
I think we can drop this if / raise branch: `StopIteration` can never be raised when the Y residual is not constant. This will also make the coverage report happier.
I agree, but the fact that the sign of the coef changes when switching solvers is practically annoying, e.g. when writing a doc with some explanation of a plot of the latent space: the top right part of the figure could silently become the bottom left of the figure if the code is switched from nipals to the new solver. Since we don't plan to change the default PLS solver right away, this is not necessarily a big problem, as this subtle behavior change will not happen just by updating the version of scikit-learn. Still, if there had been a way to enforce this stability, that would be a nice convenience to our future users. See the Principle of Least Astonishment.
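One possible way to enforce such stability (a sketch, not part of the PR; the helper name is mine, and it mimics the spirit of `sklearn.utils.extmath.svd_flip`) is to fix a deterministic sign convention per component, e.g. flip each column so its largest-magnitude entry is positive:

```python
import numpy as np


def flip_component_signs(x_rotations, y_loadings):
    """Flip paired column signs deterministically.

    Flipping the same columns of both matrices leaves the coefficient
    matrix x_rotations @ y_loadings.T unchanged, so predictions are
    unaffected while the latent space becomes reproducible across solvers.
    """
    cols = np.arange(x_rotations.shape[1])
    # sign of the largest-magnitude entry in each column of x_rotations
    signs = np.sign(x_rotations[np.abs(x_rotations).argmax(axis=0), cols])
    signs[signs == 0] = 1.0
    return x_rotations * signs, y_loadings * signs
```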
I'm not an expert on PLS, but it seems to me that the proposed Dayal-MacGregor modified kernel algorithm is superior to the current algorithm. Therefore, in the long run, I would not introduce a new `algorithm` parameter.
The new solver does not compute all the attributes that the nipals algorithm could compute. So I would still keep the solver attribute, to be able to select the nipals algorithm should we decide to move to Dayal-MacGregor as the default in a future version of scikit-learn.
Conclusion: let's introduce the `algorithm` parameter. Possible (for me quite likely) future path outside of this PR: change the default to the new solver, finally remove the `nipals` option.
@fractionalhare Why did you close?
@ogrisel what is required to get this merged now?
@fractionalhare @ogrisel @lorentzenchr @glemaitre hey guys, any update on this?
This PR needs 2 reviews, our scarcest resource. It usually helps a lot if CI is all green.
Last status was an initial review from @ogrisel - enough time has passed that the review should probably at least be refreshed, and then one other person needs to do a review. I can update the branch and see what coverage checks are still failing; we had the CI all green before, so it should just be a matter of updating and then tweaking tests.
From what I can see, we first need the comments from @ogrisel addressed. I can provide an additional review once the CI is working.
any update on this? |
Description of Changes
Motivating Summary
This PR dramatically improves the speed of PLS regression by implementing the Dayal-MacGregor modified kernel algorithm (Dayal & MacGregor, 1997). This yields significant performance improvements over the current default, NIPALS, without sacrificing accuracy or numerical stability.
The Dayal-MacGregor algorithm accurately computes the weight and rotation matrices for the maximal covariance projection(s) of X without explicitly calculating the X score matrix and without deflating the Y target(s) at all. Thus it is (much) faster than NIPALS in the base case of univariate Y, and becomes asymptotically faster as the number of Y targets increases. Testing shows speed improvements of over 40% in the base case, rising to over 95% when Y is multivariate.
In other words, this PR makes PLS regression 2x - 20x faster for equivalent results. The paper contains detailed numerical stability characteristics and accuracy results which are confirmed by my empirical tests below.
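For reference, here is a minimal NumPy sketch of the modified kernel algorithm as I read it from the paper (a simplified illustration, not the PR's actual code; the function name is mine, and it assumes X and Y are already centered):

```python
import numpy as np


def pls_coef_dayal_macgregor(X, Y, n_components):
    """Sketch of Dayal & MacGregor (1997) modified kernel algorithm #1.

    Assumes X (n x p) and Y (n x q) are centered. Returns the (p x q)
    coefficient matrix B such that Y ~= X @ B.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float).reshape(X.shape[0], -1)
    p, q = X.shape[1], Y.shape[1]
    XtX = X.T @ X  # only X'X and X'Y are needed; scores are never formed
    S = X.T @ Y    # running covariance matrix X'Y, deflated in place
    P = np.zeros((p, n_components))  # X loadings
    Q = np.zeros((q, n_components))  # Y loadings
    R = np.zeros((p, n_components))  # rotations
    for a in range(n_components):
        if q == 1:
            w = S[:, 0]  # univariate Y: no eigendecomposition needed
        else:
            # Y-side direction = dominant eigenvector of the small q x q S'S
            _, vecs = np.linalg.eigh(S.T @ S)
            w = S @ vecs[:, -1]
        w /= np.linalg.norm(w)
        # orthogonalize the weight against previous loadings -> rotation
        r = w - R[:, :a] @ (P[:, :a].T @ w)
        tt = r @ XtX @ r  # squared norm of the score t = X @ r
        P[:, a] = (XtX @ r) / tt
        Q[:, a] = (S.T @ r) / tt
        S = S - tt * np.outer(P[:, a], Q[:, a])  # deflate X'Y only, never Y
        R[:, a] = r
    return R @ Q.T
```

Once X'X and X'Y are formed, the loop only touches p x p and p x q matrices, so the per-component cost is independent of the number of samples and of any repeated deflation of Y, which is consistent with the speedups reported below.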
Specific Changes
- The `algorithm` parameter for the `_PLS` base class was modified so that a user can select the new algorithm via `algorithm="dayalmacgregor"` or `algorithm="kernel"` (see comments below; I'm not attached to having both arguments refer to the new algorithm, but I've implemented it this way for now).
- The `PLSRegression` child class was modified to allow selection of the `algorithm` parameter during instantiation (previously a user could not select the algorithm, and the regression child class implicitly took the default of the base class). This argument defaults to `algorithm="nipals"` for backwards compatibility.
- The `fit` method was augmented to include the implementation of Dayal-MacGregor, within a conditional branch on the value of the `algorithm` attribute. A usage sketch follows below.
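For illustration, usage on this PR's branch would look like the following (a sketch; the `algorithm` keyword does not exist in released scikit-learn):

```python
from sklearn.cross_decomposition import PLSRegression

# default, unchanged behavior
pls_nipals = PLSRegression(n_components=5)  # algorithm="nipals"

# opt in to the faster solver proposed in this PR
pls_dm = PLSRegression(n_components=5, algorithm="dayalmacgregor")
```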
Performance & Accuracy Results
Small X (20 x 10) and Univariate Y (20 x 1) with 5 Components
NIPALS: 344 µs ± 4.77 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Dayal-MacGregor: 198 µs ± 3.87 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
The output predictions agree to 6 decimal places.
43% faster
Small X (20 x 10) and Multivariate Y (20 x 2) with 5 Components
NIPALS: 700 µs ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Dayal-MacGregor: 268 µs ± 4.92 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
The output predictions agree to 4 decimal places.
62% faster
Moderate X (200 x 100) and Univariate Y (200 x 1) with 50 Components
NIPALS: 32.6 ms ± 2.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Dayal-MacGregor: 15 ms ± 470 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The output predictions agree to 6 decimal places.
54% faster
Moderate X (200 x 100) and Multivariate Y (200 x 2) with 50 Components
NIPALS: 226 ms ± 21.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Dayal-MacGregor: 15.8 ms ± 280 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
The output predictions agree to 6 decimal places.
94% faster
Moderate X (200 x 100) and Multivariate Y (200 x 3) with 50 Components
NIPALS: 346 ms ± 29.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Dayal-MacGregor: 17 ms ± 1.48 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
The output predictions agree to 5 decimal places.
95% faster
Large X (2000 x 1000) and Univariate Y (2000 x 1) with 100 Components
NIPALS: 859 ms ± 27.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Dayal-MacGregor: 271 ms ± 20.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The output predictions agree to 6 decimal places.
59% faster
Large X (2000 x 1000) and Multivariate Y (2000 x 2) with 100 Components
NIPALS: 5.71 s ± 390 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Dayal-MacGregor: 233 ms ± 10.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The output predictions agree to 6 decimal places.
96% faster
Large X (2000 x 1000) and Multivariate Y (2000 x 3) with 100 Components
NIPALS: 6.43 s ± 398 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Dayal-MacGregor: 236 ms ± 13.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The output predictions agree to 6 decimal places.
97% faster
Testing
Dev Environment & Architecture
My testing environment is as follows:
I don't think this will materially impact the results, but wanted to note it.
Testing Script
I put together a performance and accuracy testing script.
After running

```
conda create -n sklearn-dev -c conda-forge python numpy scipy cython joblib threadpoolctl pytest compilers llvm-openmp
```

then `conda activate sklearn-dev` and `pip install --verbose --no-build-isolation --editable .` in the repository directory, the following testing script should validate the foregoing performance and accuracy assessment. It randomly generates an m x n matrix X and an m x n_targets matrix Y, fits PLS regression models with each of the NIPALS and Dayal-MacGregor algorithms, and then assesses performance and accuracy.
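Along those lines, a minimal benchmark sketch (dimensions and tolerance chosen for illustration; `algorithm="dayalmacgregor"` assumes this PR's branch):

```python
import time

import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
m, n, n_targets, n_components = 200, 100, 2, 50
X = rng.standard_normal((m, n))
Y = rng.standard_normal((m, n_targets))

preds = {}
for algo in ("nipals", "dayalmacgregor"):
    pls = PLSRegression(n_components=n_components, algorithm=algo)
    tic = time.perf_counter()
    pls.fit(X, Y)
    print(f"{algo}: fit took {(time.perf_counter() - tic) * 1e3:.1f} ms")
    preds[algo] = pls.predict(X)

# accuracy: predictions from the two solvers should agree closely
np.testing.assert_allclose(preds["nipals"], preds["dayalmacgregor"], atol=1e-4)
```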
Comments
I kept NIPALS as the default algorithm. If this PR is accepted, we might want to discuss whether or not NIPALS should remain the default given that Dayal-MacGregor is strictly superior for `PLSRegression`. Maybe keep NIPALS as the default, mention the new algorithm prominently in the documentation, and issue a warning that the new algorithm will become the default in a release or two?

Given that Dayal-MacGregor doesn't deflate the Y target(s), there are no `y_scores_`, `y_weights_` and `y_rotations_` class attributes. This means that Dayal-MacGregor is only implemented for PLS Regression (not PLS Canonical), and a `ValueError` check is implemented to that effect. Moreover, the `PLSRegression` class documentation was edited to reflect the fact that these three attributes are only given for NIPALS/SVD.

Should this algorithm be available via both `algorithm="dayalmacgregor"` and `algorithm="kernel"`, or is it better to choose one? I currently have both implemented. I can see arguments either way, but I'm not strongly opinionated. The Dayal-MacGregor algorithm is a kernel algorithm, but it's not the only one that exists in the literature (e.g. SIMPLS is also a kernel algorithm, albeit numerically unstable).