DictionaryLearning: Fix several issues in the dict update #19198
Conversation
    newd = Y[random_state.choice(n_samples)]
    # add small noise to avoid making the sparse coding ill conditioned
Sampling from the data without adding noise triggered warnings in the sparse coding at some point about an ill conditioned problem. Adding small noise fixed it. Do you have an opinion on that strategy @tomMoral or @agramfort ?
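The strategy under discussion could be sketched as follows. This is a minimal illustration, not the actual scikit-learn implementation; the helper name and the noise scale are assumptions:

```python
import numpy as np

def resample_dead_atom(Y, random_state, noise_level=0.01):
    """Replace an unused atom with a random data sample plus small noise.

    Hypothetical sketch of the strategy discussed above: sampling a data
    point directly can make the sparse coding ill conditioned, so a small
    Gaussian perturbation is added before renormalizing.
    """
    n_samples = Y.shape[0]
    newd = Y[random_state.choice(n_samples)].copy()
    # Small noise keeps the new atom from being an exact copy of a data point.
    newd += noise_level * np.linalg.norm(newd) * random_state.standard_normal(newd.shape)
    # Keep atoms on the unit sphere, as is conventional in dictionary learning.
    return newd / max(np.linalg.norm(newd), 1e-12)
```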
Interesting trick. Out of curiosity, which specific operation would trigger the warning? By default sparse coding is using our coordinate descent solver, no? I would have assumed that this would not raise any warning on ill conditioned problems. But maybe I am wrong?
Yes, it is usually a good idea to add some noise. Otherwise, there is a chance that you sample an atom that is only used to encode one variable. Adding some noise makes it somewhat more robust.
Note that instead of sampling the data, you could also sample from the residuals. These approaches are particularly appealing because they are related to greedy algorithms that come with some convergence guarantees.
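The residual-sampling alternative could look like this. A hypothetical sketch only; the function name and the proportional-sampling choice are assumptions, not code from this PR:

```python
import numpy as np

def resample_from_residuals(X, dictionary, code, random_state):
    """Pick a replacement atom from the reconstruction residuals.

    Hypothetical sketch of the alternative suggested above: samples that
    are poorly reconstructed are natural candidates for a new atom, which
    echoes greedy atom-replacement schemes such as the one used in K-SVD.
    """
    residuals = X - code @ dictionary            # (n_samples, n_features)
    norms = np.linalg.norm(residuals, axis=1)
    # Sample proportionally to the residual norm (argmax would be the
    # fully greedy variant).
    probs = norms / norms.sum()
    idx = random_state.choice(len(norms), p=probs)
    newd = residuals[idx]
    return newd / max(np.linalg.norm(newd), 1e-12)
```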
@ogrisel The warning came from here:

    if diag < 1e-7:
@jeremiedbb what do you think of @tomMoral's comment above? The current PR is already a big net improvement so maybe it's not necessary. But maybe it would converge even faster, or to an even better value of the objective function.
But if it makes the code more complex to organize, maybe this could be explored in a later PR to keep this PR simple to review and quick to merge.
I was really satisfied with the strategy I implemented, in the sense that all atoms are now used after only a few iterations.
Sampling from the residuals should give very good results too, but it requires keeping track of the residuals. The previous implementation did keep track of the residuals, but it was much slower and very hard to follow.
This is quite an impressive impact. Could you please re-run the convergence plots with 10 lines for 10 random starts for each version of the code, to check that the run above was not just luck (and to quantify the variability induced by the random init)?
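Such a variability study could be sketched as below, assuming a data matrix `X`; the helper name and the objective formula (the standard sparse-coding objective) are illustrative, not part of this PR's benchmarks:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def objective_over_seeds(X, n_seeds=10, **params):
    """Fit the estimator with several seeds and return the final objectives.

    Sketch for quantifying the variability induced by the random parts of
    the algorithm: plot one convergence line per seed, or compare the
    spread of the final objective values between code versions.
    """
    objectives = []
    for seed in range(n_seeds):
        est = MiniBatchDictionaryLearning(random_state=seed, **params).fit(X)
        code = est.transform(X)
        recon = code @ est.components_
        # Standard dict-learning objective: 0.5 * ||X - UV||^2 + alpha * ||U||_1
        obj = 0.5 * np.linalg.norm(X - recon) ** 2 + est.alpha * np.abs(code).sum()
        objectives.append(obj)
    return np.array(objectives)
```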
LGTM overall. Nice improvements both in terms of convergence quality and code readability. Just a few comments. I like the fact that dictionary is now (n_components, n_features) everywhere.
Please do not forget to add an entry to the what's new for 1.0.
See below. It's not random starts, since the init is deterministic in dict learning, but different splits of the data. I added 2 strategies: set the atom to all zeros, and leave the atom unchanged (these seem to be the only 2 strategies in spams). The proposed strategy is consistently better. The other 3 seem to be roughly the same (which makes sense, since with these strategies unused atoms remain unused for the whole run).
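The strategies being compared could be summarized in one hypothetical helper; names and the noise scale are illustrative, not code from this PR or from spams:

```python
import numpy as np

def handle_unused_atom(atom, Y, random_state, strategy="resample"):
    """Sketch of the strategies compared above for an unused atom.

    'resample' mirrors the strategy adopted in the PR (sample a data point
    plus small noise); 'zero' and 'unchanged' mirror the two alternatives
    reported from spams, under which a dead atom stays dead.
    """
    if strategy == "resample":
        newd = Y[random_state.choice(Y.shape[0])].copy()
        newd += 0.01 * np.linalg.norm(newd) * random_state.standard_normal(newd.shape)
        return newd / max(np.linalg.norm(newd), 1e-12)
    if strategy == "zero":
        return np.zeros_like(atom)
    # 'unchanged': leave the atom as-is; it remains unused for the whole run.
    return atom
```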
Thank you very much. This is very convincing.
I confirm +1 for merging. Maybe a quick second review by @arthurmensch @jakirkham @agramfort or @GaelVaroquaux?
LGTM, thank you @jeremiedbb for those improvements!
I just added a few minor suggestions.
    @@ -575,6 +578,31 @@ def test_sparse_coder_n_features_in():
        assert sc.n_features_in_ == d.shape[1]


    def test_update_dict():
Suggested change:

    def test_update_dict():
        """Non-regression test for issue #4866."""
We don't use docstrings for the tests, only comments. I added that as a comment.
Actually, I think @thomasjpfan mentioned elsewhere that we are transitioning to docstrings even in tests, now that we no longer use nose (I think this is the reason).
No strong opinion for this PR.
we were transitioning to docstring even in tests now that we no longer use nose (I think this is the reason).
Exactly. Since we don't make a strong statement, this still depends on who is reviewing the PR :)
I only have small nitpicks for my future self reading the code :)
    @@ -807,7 +794,7 @@ def dict_learning_online(X, n_components=2, *, alpha=1, n_iter=100,
        else:
            X_train = X

    -   dictionary = check_array(dictionary.T, order='F', dtype=np.float64,
    +   dictionary = check_array(dictionary, order='F', dtype=np.float64,
It might be sufficient to call asfortranarray instead of doing a full check_array.
Maybe also add the same comment explaining why Fortran order is used, as in the previous function.
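The trade-off behind this suggestion can be shown with a small comparison; both calls below use real APIs, and the example array is illustrative:

```python
import numpy as np
from sklearn.utils import check_array

a = np.arange(6, dtype=np.float64).reshape(2, 3)

# check_array validates dtype and finiteness and converts to Fortran order,
# which is appropriate for arrays that may come from the user ...
d1 = check_array(a, order='F', dtype=np.float64)

# ... while np.asfortranarray only ensures the memory layout, which is
# cheaper but skips validation -- fine for trusted internal arrays.
d2 = np.asfortranarray(a, dtype=np.float64)

assert d1.flags['F_CONTIGUOUS'] and d2.flags['F_CONTIGUOUS']
assert np.array_equal(d1, d2)
```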
the initial dictionary can be provided by the user so it's better to use check_array. We should call check_array in dict_learning as well, but I'd rather do that in a separate PR where I refactor some parts (e.g. #18975).
OK, makes sense then.
Thanks for the fix @jeremiedbb and the reviews @glemaitre @jjerphan and @tomMoral!
Super cool to see this one land in scikit-learn! :)
thx heaps @jeremiedbb 🎉 🍻
…rn#19198) Co-authored-by: Olivier Grisel <olivier.grisel@gmail.com>
Fixes #4866 (Block coordinate descent for dictionary update has a non optimal step): minor bug in the dict update when used by MiniBatchDictionaryLearning. I added a test which currently fails on master.

_update_dict seems to do smart things to update the residuals incrementally, but I found that it's actually much faster (~10x to 20x) to write the function more naively and compute the objective function from scratch at the end. The impact on the time for the whole dict learning is negligible since the bottleneck is the sparse coding, and I find this version much more readable (judging by the related issues, I'm not the only one).

When an atom is not used, the current strategy is to generate a new one from a normal distribution. But it's very likely that it will still not be used. A discussion with @tomMoral led us to think that sampling a new atom from the data may be a better strategy. Below is a plot of the objective function for both strategies.
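The naive block coordinate descent described here could be sketched as follows. This is a simplified illustration of the approach, not the actual scikit-learn `_update_dict`; the function name, thresholds, and the dead-atom resampling details are assumptions:

```python
import numpy as np

def update_dict_naive(dictionary, X, code, random_state):
    """Naive block coordinate descent for the dictionary update.

    Sketch following the PR description: each atom is updated in turn from
    the full residual, recomputed once up front instead of maintained by
    clever incremental bookkeeping. Shapes: dictionary is
    (n_components, n_features), code is (n_samples, n_components).
    """
    n_components = dictionary.shape[0]
    R = X - code @ dictionary                      # full residual, from scratch
    for k in range(n_components):
        R += np.outer(code[:, k], dictionary[k])   # add back atom k's contribution
        ck_norm2 = code[:, k] @ code[:, k]
        if ck_norm2 < 1e-10:
            # Unused atom: resample from the data plus small noise
            # (the strategy adopted in this PR).
            newd = X[random_state.choice(X.shape[0])].copy()
            newd += 0.01 * np.linalg.norm(newd) * random_state.standard_normal(newd.shape)
            dictionary[k] = newd / max(np.linalg.norm(newd), 1e-12)
        else:
            # Least-squares update of atom k given the other atoms fixed.
            dictionary[k] = code[:, k] @ R / ck_norm2
            norm = np.linalg.norm(dictionary[k])
            if norm > 1:                           # project onto the unit ball
                dictionary[k] /= norm
        R -= np.outer(code[:, k], dictionary[k])   # subtract updated contribution
    return dictionary
```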