[MRG+1] Allow sparse input to incremental PCA #13960
Conversation
880a7a3
to
17d7248
|
Looks good to me. |
17d7248
to
dc276bf
|
Please add an entry to the change log at |
3afe748
to
9e119f3
yes, but if the way you treat sparse matrices is to densify them, you might as well get the user to do that... |
A clarification: if the user wants to batchwise densify and fit multiple sparse matrices, then this is not currently possible. Example scenario: you want to fit a single estimator to multiple large |
|
Why can't the user currently pass them, dense, one by one to partial_fit?
|
|
I would presume that turning the entire sparse matrix to dense at once would be undesirable for memory reasons. |
|
That's why we require the user to make the data dense in situations where
doing it automatically may be deleterious.
|
9e119f3
to
5b45e71
|
Note: it's failing tests due to some other unrelated change, I think - failing tests in grid search CV. |
28c78a0
to
7ea580c
|
@NicolasHug @jnothman any further comments? |
|
Sorry my review time has been limited. Please add a check that partial_fit still raises an appropriate error when passed sparse X. Otherwise lgtm, thanks! |
|
As Joel mentioned please test the error in Also all the methods that accept/return a sparse Otherwise LGTM too! |
7ea580c
to
bf978a8
|
@jnothman @NicolasHug should be good to go now. |
bf978a8
to
cb5a185
|
Thanks @scottgigante ! |
|
Thanks @scottgigante ! |
df7dd83
into
scikit-learn:master
|
Thanks Scott! |
|
Thanks @jnothman and @NicolasHug for the detailed reviews! |


Reference Issues/PRs
Fixes #13957.
What does this implement/fix? Explain your changes.
Implements sparse input for IncrementalPCA. IncrementalPCA is by design suited to accepting sparse input; this allows the input to be sparse, and if it is so, converts the data to dense on a batch-wise basis.