[MRG] PERF Significant performance improvements in Partial Least Squares (PLS) Regression #23876


Open
wants to merge 73 commits into
base: main
73 commits
33e904b
add initial implementation of simpls and dayalmacgregor to test
Jul 9, 2022
1ea8e12
we also need to make the algorithm externally accessible from the chi…
Jul 9, 2022
c0b9a77
we also need to relax the parameter constraints to allow algorithm
Jul 9, 2022
885a7a2
add dayalmacgregor and simpls to parameter constraints
Jul 9, 2022
964133a
change the convention for checking univariate characteristic
Jul 9, 2022
97c4af7
change where we expose x scores and y scores
Jul 9, 2022
2b0ac82
change orientation of the coefficient matrix
Jul 9, 2022
b2658f4
add the _ at the end of the intercept attribute name
Jul 9, 2022
8897da9
remove simpls, just use dayalmacgregor, and augment to support multiv…
Jul 9, 2022
a6db0f7
only support regression with dayalmacgregor kernel;
Jul 9, 2022
d3092b1
need uppercase Y for scikitlearn conventions
Jul 9, 2022
e8d3132
working for multivate y
Jul 9, 2022
3c3596d
make compatible with black
Jul 9, 2022
b1d6419
add documentation for the new algorithm param in the PLSRegression class
Jul 9, 2022
c2f90ed
add more clarity about the intercept
Jul 10, 2022
91005a5
use single quotes in documentation
Jul 10, 2022
cf48a58
fix colon for sphinx documentation
Jul 10, 2022
730b0d3
found the indent
Jul 10, 2022
a61bf1a
make the non-regression exception more explicit
Jul 10, 2022
922254a
add a comment about obtaining eigenvector for largest eigenvalue
Jul 10, 2022
bde5725
change variable names to be in-line with nipals/svd implementation
Jul 10, 2022
2a47790
also expose the x weights
Jul 10, 2022
051b3ff
make notes about which attributes dont exist
Jul 10, 2022
01e8f48
black formatting, again
Jul 10, 2022
a0a2e22
use standard conventions for the algorithm str options in the docstring
Jul 10, 2022
9a67e8d
standardize way we refer to multiple str options to express limitatio…
Jul 10, 2022
8b4df86
add dayal macgregor to test coverage
Jul 10, 2022
ff723c1
add test indicating that dayal macgregor is only valid for regression…
Jul 10, 2022
e42af1c
make tests compliant with black formatting
Jul 10, 2022
08d8e9d
better test name
Jul 10, 2022
5b85218
fix linting error in test
Jul 10, 2022
903ce8b
properly import _PLS for tests
Jul 10, 2022
9b2c6b8
blackify tests again
Jul 10, 2022
8ae4a92
dont need test for inaccessible functionality
Jul 10, 2022
8655e68
dont import _PLS
Jul 10, 2022
96f7d94
formatting
Jul 10, 2022
754bf64
remove a conditional check on shape of Y, making iterative computatio…
Jul 11, 2022
cfce477
shorter comment
Jul 11, 2022
2126837
remove kernel option entirely, only dayalmacgregor
Jul 11, 2022
3d75953
fix test typo
Jul 11, 2022
ca91e4a
dont set the y scores, y weights or y rotations at all in dayal macgr…
Jul 11, 2022
e118bc8
change test to use assert_allclose
Jul 11, 2022
0431e9c
parameterized the dayal and nipals randomized test with scale=True an…
Jul 11, 2022
42aba7d
implement a test with a global seed
Jul 11, 2022
0e26e31
test formatting
Jul 11, 2022
bc094f5
add references
Jul 11, 2022
1cebe71
use global random seed
Jul 11, 2022
e3d618d
make the reference formatting compliant
Jul 11, 2022
194a016
by rearranging code in _PLS we dont need a dedicated fit method on PL…
Jul 11, 2022
c1f87d3
get rid of superfluous double transpose - originally it was transposi…
Jul 11, 2022
229b4b8
all algorithms have a single code block for setting the class attribu…
Jul 11, 2022
93ceb83
formatting
Jul 11, 2022
0c7d703
break out nipals_svd and dayal_macgregor into separately named privat…
Jul 11, 2022
29133e4
fix errant test
Jul 11, 2022
f8805c2
only take the real part of the eigenvector for multivariate Y - the c…
Jul 11, 2022
a7103e4
resolve RFE test regression
Jul 11, 2022
0075411
Merge branch 'main' into pls_algos
Jul 13, 2022
71faf51
Merge branch 'main' into pls_algos
fractionalhare Jul 15, 2022
0bb9154
Merge branch 'main' into pls_algos
Jul 18, 2022
6f8e1ca
make changes from further review
Jul 18, 2022
e245d1c
Merge branch 'pls_algos' of ssh://github.com/fractionalhare/scikit-le…
Jul 18, 2022
f3ae6fb
add changelog entry
Jul 18, 2022
f9cf920
fix error in eig
Jul 18, 2022
afec96c
we actually dont need to check attrs since algorithm is set at initia…
Jul 18, 2022
46a5e98
formatting
Jul 18, 2022
96080ca
change examples to use dayalmacgregor
Jul 19, 2022
7ecfeeb
Merge remote-tracking branch 'upstream/main' into pls_algos
Jul 19, 2022
d6c9dfa
Merge remote-tracking branch 'upstream/main' into pls_algos
Jul 20, 2022
1ce2488
implement the attribute correction for y_scores, y_rotations and corr…
Jul 20, 2022
478ec37
implement ogrisels recommended doc change for changelog
Jul 20, 2022
29ffdda
formatting
Jul 20, 2022
4998bb1
Merge branch 'main' into pls_algos
fractionalhare Jul 27, 2022
1c9a206
Merge branch 'main' into pls_algos
fractionalhare Jul 30, 2022
doc/whats_new/v1.2.rst (8 changes: 8 additions & 0 deletions)
@@ -107,6 +107,14 @@ Changelog
  :class:`cluster.AgglomerativeClustering` and will be renamed to `metric` in v1.4.
  :pr:`23470` by :user:`Meekail Zain <micky774>`.

:mod:`sklearn.cross_decomposition`
..................................

- |Enhancement| The Dayal-MacGregor kernel algorithm was implemented for
  :class:`cross_decomposition.PLSRegression` with the `algorithm="dayalmacgregor"`
  parameter, which provides an asymptotic performance improvement over the default
  algorithm, NIPALS. :pr:`23876` by :user:`Dylan Houlihan <fractionalhare>`.
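
For readers unfamiliar with the kernel approach named in this entry, the following is a minimal NumPy sketch of the improved kernel PLS recursion as published by Dayal and MacGregor (1997). It is an illustration only and not the code in this PR: the function name `kernel_pls` and its return values are hypothetical, the data are only centered (no scaling, unlike the `scale=True` default of `PLSRegression`), and the multivariate branch uses `np.linalg.eigh` on the symmetric matrix, an assumption made here so that the eigenvectors are real. The point it shows is that only the small `X.T @ Y` cross-product is deflated, never `X` or `Y` themselves.

```python
import numpy as np


def kernel_pls(X, Y, n_components):
    """Hypothetical sketch of improved kernel PLS (centering only, no scaling).

    Returns B of shape (n_features, n_targets) and an intercept of shape
    (n_targets,) such that Y is approximated by X @ B + intercept.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]  # treat univariate y as a single column

    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc = X - x_mean
    XtY = Xc.T @ (Y - y_mean)  # (n_features, n_targets) cross-product

    n_features, n_targets = XtY.shape
    P = np.zeros((n_features, n_components))  # X loadings
    R = np.zeros((n_features, n_components))  # X rotations
    Q = np.zeros((n_targets, n_components))   # Y loadings

    for a in range(n_components):
        if n_targets == 1:
            # Univariate y: the weight vector is X^T y, no eigenproblem needed.
            w = XtY[:, 0].copy()
        else:
            # Multivariate Y: take the eigenvector of the largest eigenvalue
            # of the small symmetric matrix (X^T Y)^T (X^T Y). eigh returns
            # eigenvalues in ascending order, so the last column is dominant.
            _, vecs = np.linalg.eigh(XtY.T @ XtY)
            w = XtY @ vecs[:, -1]
        w /= np.linalg.norm(w)

        # Build the rotation so that scores come from the undeflated X.
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]

        t = Xc @ r               # X scores for this component
        tt = t @ t
        p_a = Xc.T @ t / tt      # X loadings
        q_a = XtY.T @ r / tt     # Y loadings

        # Deflate only the cross-product matrix.
        XtY = XtY - tt * np.outer(p_a, q_a)
        P[:, a], R[:, a], Q[:, a] = p_a, r, q_a

    B = R @ Q.T
    intercept = y_mean - x_mean @ B
    return B, intercept


# Usage on synthetic data shaped like the examples touched by this PR.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 10))
y_demo = X_demo[:, 0] + 2 * X_demo[:, 1] + rng.normal(size=500) + 5
B_demo, intercept_demo = kernel_pls(X_demo, y_demo, n_components=3)
print("Estimated betas")
print(np.round(B_demo[:, 0], 1))
```

A convenient sanity check for a sketch like this is that, when `n_components` equals the number of features and `X` has full column rank, the coefficients should coincide with ordinary least squares on the centered data.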

:mod:`sklearn.datasets`
.......................

@@ -136,7 +136,7 @@
# each Yj = 1*X1 + 2*X2 + noize
Y = np.dot(X, B) + np.random.normal(size=n * q).reshape((n, q)) + 5

-pls2 = PLSRegression(n_components=3)
+pls2 = PLSRegression(n_components=3, algorithm="dayalmacgregor")
pls2.fit(X, Y)
print("True B (such that: Y = XB + Err)")
print(B)
@@ -153,7 +153,7 @@
p = 10
X = np.random.normal(size=n * p).reshape((n, p))
y = X[:, 0] + 2 * X[:, 1] + np.random.normal(size=n * 1) + 5
-pls1 = PLSRegression(n_components=3)
+pls1 = PLSRegression(n_components=3, algorithm="dayalmacgregor")
pls1.fit(X, y)
# note that the number of components exceeds 1 (the dimension of y)
print("Estimated betas")
examples/cross_decomposition/plot_pcr_vs_pls.py (2 changes: 1 addition & 1 deletion)
@@ -112,7 +112,7 @@
pcr.fit(X_train, y_train)
pca = pcr.named_steps["pca"] # retrieve the PCA step of the pipeline

-pls = PLSRegression(n_components=1)
+pls = PLSRegression(n_components=1, algorithm="dayalmacgregor")
pls.fit(X_train, y_train)

fig, axes = plt.subplots(1, 2, figsize=(10, 3))