WIP: Sparse coder #456

Merged
merged 36 commits into from
Dec 20, 2011

Conversation

vene
Member

@vene vene commented Dec 6, 2011

This quick and dirty pull request aims to add an estimator (transformer) object that implements sparse coding against a fixed dictionary, in the form of the SparseCoder object.

At the same time this will address the small inconsistencies and missing info in docs that came up.
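
To make "sparse coding against a fixed dictionary" concrete, here is a toy sketch of the idea, not the code in this pull request. The class name `ToySparseCoder`, its parameters, and the choice of greedy matching pursuit are assumptions made for illustration only; the real object delegates to the library's existing sparse encoding routines.

```python
class ToySparseCoder:
    """Hypothetical stand-in: encodes signals against a fixed dictionary."""

    def __init__(self, dictionary, n_nonzero_coefs=1):
        # Each row of `dictionary` is one atom, assumed to have unit norm.
        self.dictionary = dictionary
        self.n_nonzero_coefs = n_nonzero_coefs

    def fit(self, X, y=None):
        # Nothing to learn: the dictionary is supplied up front.
        return self

    def _encode_one(self, signal):
        residual = list(signal)
        code = [0.0] * len(self.dictionary)
        for _ in range(self.n_nonzero_coefs):
            # Correlate every atom with what is left of the signal.
            correlations = [
                sum(a * r for a, r in zip(atom, residual))
                for atom in self.dictionary
            ]
            # Greedy step: keep the atom with the largest correlation.
            best = max(range(len(correlations)),
                       key=lambda k: abs(correlations[k]))
            code[best] += correlations[best]
            residual = [
                r - correlations[best] * a
                for r, a in zip(residual, self.dictionary[best])
            ]
        return code

    def transform(self, X, y=None):
        # One sparse code (list of coefficients) per input signal.
        return [self._encode_one(signal) for signal in X]


s = 2 ** -0.5
toy_dictionary = [[1.0, 0.0], [0.0, 1.0], [s, s]]
toy_codes = ToySparseCoder(toy_dictionary).fit(None).transform([[3.0, 3.0]])
```

With one allowed nonzero coefficient, the signal `[3.0, 3.0]` is explained entirely by the diagonal atom, so only the third entry of its code is nonzero. The point of the transformer shape is that `fit` does no work, which is what lets a fixed dictionary slot into a pipeline.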

@@ -708,6 +708,77 @@ def transform(self, X, y=None):
return code


class SparseCoder(BaseDictionaryLearning):
""" Sparse coding
Member

Please:

    """Sparse coding

PEP257 :)

@GaelVaroquaux
Member

With regard to precomputing wavelet dictionaries, I don't think that this is a good idea, because a wavelet transform can be implemented much faster than a dot product. In addition, it is an orthogonal basis, thus sparse coding should preferably be done using soft thresholding. Finally, this would be fairly image- or signal-specific, and I don't like the idea of such application-specific code creeping into a general-purpose object.
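
The soft-thresholding remark can be sketched as follows. For an orthonormal dictionary, the l1-penalized code of a signal is obtained by projecting onto each atom and shrinking the result toward zero by the penalty. The helper names and the tiny length-4 Haar basis below are assumptions chosen for illustration, and plain dot products are used for readability even though a fast wavelet transform would avoid them.

```python
import math


def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha; small values become zero."""
    return math.copysign(max(abs(value) - alpha, 0.0), value)


def haar_atoms():
    """Orthonormal Haar basis for length-4 signals, one atom per row."""
    s = 1.0 / math.sqrt(2.0)
    return [
        [0.5, 0.5, 0.5, 0.5],
        [0.5, 0.5, -0.5, -0.5],
        [s, -s, 0.0, 0.0],
        [0.0, 0.0, s, -s],
    ]


def encode_orthonormal(signal, atoms, alpha):
    """Minimize 0.5 * ||signal - sum_k c_k atom_k||^2 + alpha * ||c||_1.

    The closed form below is only valid when the atoms are orthonormal.
    """
    projections = [
        sum(a * x for a, x in zip(atom, signal)) for atom in atoms
    ]
    return [soft_threshold(p, alpha) for p in projections]


haar_code = encode_orthonormal([4.0, 4.0, 2.0, 2.1], haar_atoms(), alpha=0.5)
```

For this piecewise-constant signal only the two coarse coefficients survive; the two fine-scale coefficients fall below the threshold and are set to zero, with no iterative solver involved.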

@@ -708,8 +708,79 @@ def transform(self, X, y=None):
return code


class SparseCoder(BaseDictionaryLearning):
Member

This object should be imported in the __init__.py of decomposition.

It should also be added in docs/modules/classes.rst

@GaelVaroquaux
Member

My biggest comment is that it is missing narrative documentation. In
addition, an example would be useful. You could do an example with
wavelets, maybe using a simple 1D signal to keep the computation
light.

Gael

@@ -129,7 +129,8 @@ def sparse_encode(X, Y, gram=None, cov=None, algorithm='lasso_lars',
max_iter=1000)
for k in xrange(n_features):
# A huge amount of time is spent in this loop. It needs to be
# tight.
# tight

Member

pep8 whitespace ;)

@amueller
Member

I think it would be good if SparseCoder were referenced in the DictionaryLearning "see also" section.
BTW, I think the "see also" sections of both SparseCoder and DictionaryLearning should be cleaned up.
They contain a description that gets screwed up when generating the HTML docs.


if algorithm == 'lasso_lars':
    if alpha is None:
        alpha = 1.
    alpha /= n_features  # account for scaling
Member Author

Note that before, dict_learning and dict_learning_online would perform very differently on exactly the same data and for the same alpha. This was caused by sometimes forgetting to divide by n_features, and this change fixes it.
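
A minimal sketch of why the division matters, assuming the lasso solver minimizes a squared error averaged over the rows of its design matrix (as the scikit-learn Lasso documentation states), and that in sparse coding those rows are the n_features entries of the signal. The one-atom closed forms and function names below are illustrative assumptions, not code from this pull request.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def correlation(atom, signal):
    return sum(a * x for a, x in zip(atom, signal))


def code_sum_objective(signal, atom, alpha):
    """argmin_c 0.5 * ||signal - c * atom||^2 + alpha * |c| (unit-norm atom)."""
    return soft_threshold(correlation(atom, signal), alpha)


def code_mean_objective(signal, atom, alpha):
    """argmin_c ||signal - c * atom||^2 / (2 * n_features) + alpha * |c|.

    Averaging the squared error makes the same alpha act n_features
    times more strongly than in the summed objective.
    """
    n_features = len(signal)
    return soft_threshold(correlation(atom, signal), n_features * alpha)


atom = [0.5, 0.5, 0.5, 0.5]
signal = [2.0, 2.0, 2.0, 2.0]
alpha = 0.5

reference = code_sum_objective(signal, atom, alpha)                # 3.5
unscaled = code_mean_objective(signal, atom, alpha)                # 2.0
rescaled = code_mean_objective(signal, atom, alpha / len(signal))  # 3.5
```

Passing alpha unchanged to the averaged objective over-shrinks the coefficient, while dividing by n_features first recovers the reference value. A code path that skips the division would therefore behave like a different alpha on the same data.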

fabianp pushed a commit that referenced this pull request Dec 20, 2011
@fabianp fabianp merged commit 9a43b03 into scikit-learn:master Dec 20, 2011
@fabianp
Member

fabianp commented Dec 20, 2011

merged, thanks

gbolmier added a commit to gbolmier/scikit-learn that referenced this pull request May 29, 2020
gbolmier added a commit to gbolmier/scikit-learn that referenced this pull request May 30, 2020
gbolmier added a commit to gbolmier/scikit-learn that referenced this pull request Aug 9, 2020
5 participants