Add link to plot_feature_selection.py #30255


thedataninja1786
Reference Issues/PRs

#26927

What does this implement/fix? Explain your changes.

This PR adds a module-level docstring to the feature_selection directory that references plot_feature_selection.py, the example of using feature selection to improve classification on a noisy dataset.

Any other comments?

github-actions bot commented Nov 9, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: edb3eb2. Link to the linter CI: here

@StefanieSenger
Contributor

Hi @thedataninja1786,

thank you for making a PR.

Unfortunately, it is not quite correct, and you'll need to work on it a bit more. The goal of the main issue is to add links to the examples in the documentation on our website, not in our internal documentation. What you did, adding the link to one of our modules' __init__.py files, only affects internal documentation.

Please refer to our guide to documentation enhancements for information on which parts of the project are rendered into HTML and on how to build the documentation.

@thedataninja1786
Author

thedataninja1786 commented Nov 12, 2024

Hi @StefanieSenger, thank you for the input :) I followed the original example #26926, and that is how I understood it. I have also checked the guidelines you provided, and at https://github.com/scikit-learn/scikit-learn/blob/main/doc/modules/feature_selection.rst it seems that the link is already included. Tbh I'm not sure what I am supposed to do :)

@StefanieSenger
Contributor

No worries, @thedataninja1786, I will try to explain:

Certain parts of our documentation get rendered into HTML by a project called Sphinx. This is called "building the documentation". These are the parts that will appear on our website, https://scikit-learn.org.

This is what this issue is about: we want to make the examples more discoverable by linking them in the places where they matter most, in the API documentation and in the user guide.

You have discovered that your example is already linked in the user guide.

Now the question is whether it also makes sense to put it anywhere in the API documentation. This is the question we ask you to answer. If so, you can provide the link in the docstring of a class like SelectKBest (which is used in your example) or possibly others. But if it doesn't make any specific sense, then you would comment on the issue that you have checked and that no link needs to be added. (This might well be the case, and it is still a contribution to scikit-learn.) The criterion on which to decide is: does it add value for the user? We don't want to overwhelm users with too much information.

I hope it helps.
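To make the suggestion above concrete, here is a minimal sketch of what a link to a gallery example can look like inside a class docstring. It assumes the sphinx-gallery convention of reference labels of the form `sphx_glr_auto_examples_<directory>_<file>.py`. The class `SelectKBestSketch` is a hypothetical stand-in, not the real `SelectKBest`, and the wording of the docstring is illustrative only.

```python
# Hypothetical stand-in class used only to show the docstring layout;
# the real SelectKBest lives in sklearn.feature_selection.
class SelectKBestSketch:
    """Select features according to the k highest scores.

    Examples
    --------
    For an example of univariate feature selection on a noisy dataset, see
    :ref:`sphx_glr_auto_examples_feature_selection_plot_feature_selection.py`.
    """


# The :ref: role is only resolved when Sphinx builds the documentation;
# in plain Python the docstring is just a string.
print(SelectKBestSketch.__doc__)
```

Whether such a link belongs in a given docstring is still the judgment call described above: it should be added only if it helps users discover something they would not find otherwise.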

@thedataninja1786
Author

Thank you @StefanieSenger for explaining, yes it does make sense. In light of this new information, I believe that plot_feature_selection.py can be added to all univariate feature selection algorithms: GenericUnivariateSelect, SelectFdr, SelectFpr, SelectFwe, SelectKBest, SelectPercentile, since as stated "This notebook is an example of using univariate feature selection to improve classification accuracy on a noisy dataset." However, there is already an example of the implementation of each algorithm, so I am not entirely sure if it's necessary. Maybe it can be added in the Gallery Examples section (although I'm not sure if it falls under the same scope).
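One way to check whether a docstring already links a gallery example is to search it for sphinx-gallery reference labels. The sketch below is an assumption-laden helper, not part of scikit-learn: `find_gallery_refs` and `GALLERY_REF` are hypothetical names, and the pattern assumes labels start with `sphx_glr_` and appear inside a `:ref:` role, with or without a custom link title.

```python
import re

# Matches :ref:`sphx_glr_...` and :ref:`Some title <sphx_glr_...>`,
# capturing only the label. Hypothetical helper, not a scikit-learn API.
GALLERY_REF = re.compile(r":ref:`(?:[^`<]*<)?(sphx_glr_[\w.]+)>?`")


def find_gallery_refs(docstring):
    """Return the gallery reference labels found in a docstring."""
    # A class without a docstring has __doc__ set to None.
    return GALLERY_REF.findall(docstring or "")


sample = """Select features according to the k highest scores.

See :ref:`sphx_glr_auto_examples_feature_selection_plot_feature_selection.py`.
"""
print(find_gallery_refs(sample))
```

With scikit-learn installed, the same helper could be applied to `SelectKBest.__doc__` and the other univariate selectors to see which of them already reference an example.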

@StefanieSenger
Contributor

Glad it helped. Now your thinking is on the right track.

> I believe that plot_feature_selection.py can be added to all univariate feature selection algorithms: GenericUnivariateSelect, SelectFdr, SelectFpr, SelectFwe, SelectKBest, SelectPercentile, since as stated "This notebook is an example of using univariate feature selection to improve classification accuracy on a noisy dataset." However, there is already an example of the implementation of each algorithm, so I am not entirely sure if it's necessary.

I agree, it seems unnecessary to add a link just to demonstrate usage, especially since these all have examples already.

> Maybe it can be added in the Gallery Examples section (although I'm not sure if it falls under the same scope).

No, it is not in this scope, since the Example Gallery is auto-generated and serves another purpose (though this is only a theoretical distinction).

Now that I have looked into it, I think there is no other place (than the user guide) to put a link that would help users discover something new. This example focuses on usage/methodology and not on a specific fancy thing you can do with a certain estimator. So it is good it got linked in the user guide, and that's it.

I would therefore propose to proceed like this: close this issue and comment on the main issue that plot_feature_selection.py is done. Then you can pick out the next example and work on that one.
