Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

DOC Fix load iris datasets #19729

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 21, 2021
Merged

DOC Fix load iris datasets #19729

merged 2 commits into from
Mar 21, 2021

Conversation

maliozer
Copy link
Contributor

since the iris datasets is not defined above, the "name iris is not defined" error is thrown.

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Any other comments?

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @maliozer !

I would prefer this be defined during the "Using the Iris dataset" section in line 131:

    >>> from sklearn.datasets import load_iris
    >>> from sklearn import tree
    >>> iris = load_iris()
    >>> X, y = iris.data, iris.target
    >>> clf = tree.DecisionTreeClassifier()
    >>> clf = clf.fit(X, y)

@maliozer
Copy link
Contributor Author

maliozer commented Mar 19, 2021

Thank you for the PR @maliozer !

I would prefer this be defined during the "Using the Iris dataset" section in line 131:

    >>> from sklearn.datasets import load_iris
    >>> from sklearn import tree
    >>> iris = load_iris()
    >>> X, y = iris.data, iris.target
    >>> clf = tree.DecisionTreeClassifier()
    >>> clf = clf.fit(X, y)

The reason I chose not to add it at the Using the Iris dataset section, is because it would be meaningless why the iris datasets is loaded twice until you don't apply the graphviz example. The above example demonstrates the return_X_y parameter usage. I tried to leave the example as it is.

@thomasjpfan
Copy link
Member

For this specific case, I think it is nice to show the complete usage of the load_iris() Bunch object. The first part

    >>> from sklearn.datasets import load_iris
    >>> from sklearn import tree
    >>> iris = load_iris()
    >>> X, y = iris.data, iris.target
    >>> clf = tree.DecisionTreeClassifier()
    >>> clf = clf.fit(X, y)

shows how to extract the data and target from the iris Bunch object. Later, the graphviz example part shows how to get the feature names from the iris Bunch object.

Side note: Most of the user guide uses load_iris(return_X_y=True), so having an example with the default return_X_y=False would be beneficial.

DOC Fix move definition of iris Bunch object top of the related section
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thomasjpfan thomasjpfan merged commit 1db2681 into scikit-learn:main Mar 21, 2021
@glemaitre glemaitre mentioned this pull request Apr 22, 2021
12 tasks
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Apr 22, 2021
glemaitre pushed a commit that referenced this pull request Apr 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.