[MRG] FEA ICE lines individually colored by feature values #25807

Open: 26 commits to be merged into base: main

@lurue101 lurue101 commented Mar 10, 2023

Reference Issues/PRs

What does this implement/fix? Explain your changes.

This PR extends the Individual Conditional Expectation (ICE) plots.
Until now, the user could choose only a single color for all ICE lines (blue in this case):

display = PartialDependenceDisplay.from_estimator(
    hgbdt_model,
    X_train,
    **features_info,
    ax=ax,
    **common_params,
)

[Figure: standard]

This extension colors each line according to the value of another feature. The user only has to pass a list of values. Here we use the discrete "workingday" feature from the bike rental dataset: True indicates a working day, False the weekend. It is enough to pass the series as the "color" entry of ice_lines_kw:

display = PartialDependenceDisplay.from_estimator(
    hgbdt_model,
    X_train,
    **features_info,
    ax=ax,
    **common_params,
    ice_lines_kw={"color": X_train["workingday"], "palette": "Paired_r"},
)

[Figure: workingday]
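To illustrate the idea of one color per category, here is a minimal sketch using only matplotlib and NumPy. It is not the code from this PR: the helper name `colors_for_categories` is hypothetical, and spreading the categories evenly over the palette is an assumption about how a discrete feature could be mapped to colors.

```python
import matplotlib
import numpy as np


def colors_for_categories(values, palette="Paired_r"):
    """Return an (n, 4) RGBA array with one color per entry of `values`.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    # Encode each distinct category as an integer code 0..k-1.
    categories, codes = np.unique(np.asarray(values).ravel(), return_inverse=True)
    cmap = matplotlib.colormaps[palette]  # registry available in matplotlib >= 3.5
    # Assumption: spread the k categories evenly over the palette range [0, 1].
    positions = np.linspace(0.0, 1.0, num=len(categories))
    return cmap(positions[codes])


# Four ICE lines, colored by a boolean feature such as "workingday".
workingday = [True, False, True, True]
line_colors = colors_for_categories(workingday)
```

Each row of `line_colors` could then be passed as the `color` argument of an individual `ax.plot` call, so that lines sharing a category share a color.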

It also works with continuous features, for example, the hour of the bike rental:


display = PartialDependenceDisplay.from_estimator(
    hgbdt_model,
    X_train,
    **features_info,
    ax=ax,
    **common_params,
    ice_lines_kw={"color": X_train["hour"], "palette": "cool"},
)

[Figure: hour]

The advantage is a more detailed picture that can reveal dependencies between features.
In the first example (all lines in blue), the plot only tells us that roughly half the lines show a sharp increase in bike rentals (partial dependence) once the temperature rises above 18°; it is unclear why some lines show this increase and others do not.

With the new functionality, it is possible to dig deeper: the increase in bike rentals happens almost exclusively on the weekend. A possible explanation is that during the week people need to get to and from work regardless of the temperature, whereas on the weekend warmer weather increases the number of bike rentals.

Any other comments?

The example and data are taken from here:
https://scikit-learn.org/stable/auto_examples/inspection/plot_partial_dependence.html

The plots above have also been added to that page. The documentation rendered well when I tested it, but the output doesn't show up.

I can't work out why the checks involving matplotlib.colormaps fail; the code runs locally. This attribute is also the one suggested by a DeprecationWarning in matplotlib.
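One possible explanation, not confirmed anywhere in this thread, is that a CI job runs against a matplotlib release older than 3.5, where the `matplotlib.colormaps` registry does not exist yet. Under that assumption, a small compatibility helper along the following lines could work on both old and new versions. The name `get_colormap` is hypothetical.

```python
import matplotlib


def get_colormap(name):
    """Look up a colormap by name on old and new matplotlib versions.

    Hypothetical helper for illustration; not part of this PR.
    """
    try:
        # Registry introduced in matplotlib 3.5.
        return matplotlib.colormaps[name]
    except AttributeError:
        # Older matplotlib: fall back to the legacy lookup function.
        from matplotlib import cm

        return cm.get_cmap(name)


cool = get_colormap("cool")
```

The fallback branch is only reached when the registry attribute is missing, so the deprecated `cm.get_cmap` is never called on recent matplotlib versions.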


github-actions bot commented Aug 24, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 9f090f3. Link to the linter CI: here
