Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

ENH: Allow the condition for removal of an atom to be externally set. #6385

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

jakirkham
Copy link
Contributor

Allows the threshold for when to generate a new atom for the dictionary to be set externally. Still needs tests.

@jakirkham
Copy link
Contributor Author

Thoughts on this?

@jnothman
Copy link
Member

jnothman commented Oct 4, 2016

It looks like we need more active core devs familiar with the dictionary learning code. Sorry. Perhaps I'll find a moment to brush up.

@jnothman
Copy link
Member

jnothman commented Oct 6, 2016

I agree that it seems strange for this constant to be hard-coded, but I'm also not familiar with the code. Can you explain the use-case and why this is the appropriate solution?

@jakirkham
Copy link
Contributor Author

jakirkham commented Oct 6, 2016

To give an example, if you have very noisy data, it may occur that some of the atoms are basically reconstructing the noise in the data. Instead of keeping those noise representing atoms for repeated refinement, we would better off to reinitialize them and possibly capture a true feature of the data.

Edit: Should add that was constitutes noise can vary between datasets. Hence why it would be desirable to set a different threshold here for reinitialization.

@jakirkham
Copy link
Contributor Author

Thanks for taking a look. 😄

@jnothman
Copy link
Member

jnothman commented Nov 1, 2016

Ping @vene. I'd like to be able to look at these PRs, @jairkham, but my unfamiliarity with dictionary learning is making me down-prioritise them within a long list of reviewing.

@jakirkham
Copy link
Contributor Author

Based on discussion, it sounds like this shouldn't be parameterizable. However we still would like to bump the threshold for when a basis gets tossed and reseeded from the data. The crucial point being it is better to not continue with a bad atom for too long. Please feel free to correct me if I'm missing something.

cc @GaelVaroquaux

@amueller amueller added the Needs Decision Requires decision label Aug 5, 2019
Base automatically changed from master to main January 22, 2021 10:48
@jeremiedbb
Copy link
Member

The _update_dict method has been refactored and an atom is not resampled based on its norm anymore. Instead it's resampled based on its use, more precisely it's based on ||code[:,k]|| (where code is (n_samples, n_components)). The value is still arbitrary, but consistent with the spams implementation.

Based on the experiments in #19198, this refactoring has greatly improved the resampling of useless atoms and eventually the quality of the dictionary.

I think that we can consider that #19198 kind of fixed this. Feel free to reopen if you think otherwise.

@jeremiedbb jeremiedbb closed this Mar 30, 2022
@jakirkham jakirkham deleted the dict_prune_cond branch March 30, 2022 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.