Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

DOC Fix UserWarning in plot_gpr_prior_posterior #29268

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

DOC Fix UserWarning in plot_gpr_prior_posterior #29268

wants to merge 1 commit into from

Conversation

ovenpickled
Copy link
Contributor

Reference Issues/PRs

Fixes #29055

What does this implement/fix? Explain your changes.

The dev documentation on the Scikit-learn website displayed an error while executing the dot product kernel code block. I scaled the data as the error requested and increased the number of iterations.

Any other comments?

Copy link

❌ Linting issues

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here


black

black detected issues. Please run black . locally and push the changes. Here you can see the detected issues. Note that running black might also fix some of the issues which might be detected by ruff. Note that the installed black version is black=24.3.0.


--- /home/runner/work/scikit-learn/scikit-learn/examples/gaussian_process/plot_gpr_prior_posterior.py	2024-06-14 18:50:36.782432+00:00
+++ /home/runner/work/scikit-learn/scikit-learn/examples/gaussian_process/plot_gpr_prior_posterior.py	2024-06-14 18:50:46.668581+00:00
@@ -204,24 +204,30 @@
 # Scale the data
 scaler = StandardScaler()
 X_train_scaled = scaler.fit_transform(X_train)
 
 # Define the kernel
-kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)
+kernel = ConstantKernel(0.1, (0.01, 10.0)) * (
+    DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2
+)
 
 # Increase the number of iterations
-gpr = GaussianProcessRegressor(kernel=kernel, random_state=0, n_restarts_optimizer=10, optimizer='fmin_l_bfgs_b')
+gpr = GaussianProcessRegressor(
+    kernel=kernel, random_state=0, n_restarts_optimizer=10, optimizer="fmin_l_bfgs_b"
+)
 
 # Plot prior
 fig, axs = plt.subplots(nrows=2, sharex=True, sharey=True, figsize=(10, 8))
 plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[0])
 axs[0].set_title("Samples from prior distribution")
 
 # Fit the model and plot posterior
 gpr.fit(X_train_scaled, y_train)
 plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
-axs[1].scatter(X_train_scaled[:, 0], y_train, color="red", zorder=10, label="Observations")
+axs[1].scatter(
+    X_train_scaled[:, 0], y_train, color="red", zorder=10, label="Observations"
+)
 axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
 axs[1].set_title("Samples from posterior distribution")
 
 fig.suptitle("Dot-product kernel", fontsize=18)
 plt.tight_layout()
would reformat /home/runner/work/scikit-learn/scikit-learn/examples/gaussian_process/plot_gpr_prior_posterior.py

Oh no! 💥 💔 💥
1 file would be reformatted, 926 files would be left unchanged.

ruff

ruff detected issues. Please run ruff check --fix --output-format=full . locally, fix the remaining issues, and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.4.9.


examples/gaussian_process/plot_gpr_prior_posterior.py:199:1: I001 [*] Import block is un-sorted or un-formatted
    |
197 |   # Dot-product kernel
198 |   # ..................
199 | / from sklearn.gaussian_process import GaussianProcessRegressor
200 | | from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct
201 | | from sklearn.preprocessing import StandardScaler
202 | | import matplotlib.pyplot as plt
203 | | 
204 | | # Scale the data
    | |_^ I001
205 |   scaler = StandardScaler()
206 |   X_train_scaled = scaler.fit_transform(X_train)
    |
    = help: Organize imports

examples/gaussian_process/plot_gpr_prior_posterior.py:209:89: E501 Line too long (103 > 88)
    |
208 | # Define the kernel
209 | kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)
    |                                                                                         ^^^^^^^^^^^^^^^ E501
210 | 
211 | # Increase the number of iterations
    |

examples/gaussian_process/plot_gpr_prior_posterior.py:212:89: E501 Line too long (113 > 88)
    |
211 | # Increase the number of iterations
212 | gpr = GaussianProcessRegressor(kernel=kernel, random_state=0, n_restarts_optimizer=10, optimizer='fmin_l_bfgs_b')
    |                                                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^ E501
213 | 
214 | # Plot prior
    |

examples/gaussian_process/plot_gpr_prior_posterior.py:222:89: E501 Line too long (91 > 88)
    |
220 | gpr.fit(X_train_scaled, y_train)
221 | plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
222 | axs[1].scatter(X_train_scaled[:, 0], y_train, color="red", zorder=10, label="Observations")
    |                                                                                         ^^^ E501
223 | axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
224 | axs[1].set_title("Samples from posterior distribution")
    |

Found 4 errors.
[*] 1 fixable with the `--fix` option.

Generated for commit: bb87338. Link to the linter CI: here

@ArturoAmorQ ArturoAmorQ changed the title fix UserWarning in documentation DOC Fix UserWarning in plot_gpr_prior_posterior Jun 17, 2024
Copy link
Member

@ArturoAmorQ ArturoAmorQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ovenpickled, thanks for the PR. Here is a first batch of comments. I just have to mention that I have the impression that you used an LLM to assist you with this issue. Even if using them is not discouraged per se, do try to keep the quality of the codebase by:

  • avoiding repeated imports
  • keeping line length below 88 characters
  • not making unrelated changes
  • most importantly: do not contribute code that you don't understand

Comment on lines +199 to +202
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matplotlib.pyplot is already imported in line 33
GaussianProcessRegressor is already imported in line 100

Suggested change
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct
from sklearn.preprocessing import StandardScaler

Comment on lines +208 to +209
# Define the kernel
kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid unrelated changes.

Suggested change
# Define the kernel
kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)
kernel = ConstantKernel(0.1, (0.01, 10.0)) * (
DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2
)


# plot prior
# Increase the number of iterations
gpr = GaussianProcessRegressor(kernel=kernel, random_state=0, n_restarts_optimizer=10, optimizer='fmin_l_bfgs_b')
Copy link
Member

@ArturoAmorQ ArturoAmorQ Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding comments to the code is mostly required when showing the reasoning behind a complex decision, for instance here, a comment would rather explain the need for the "fmin_l_bfgs_b" optimizer.

Also, try to keep lines under 88 characters (see the linter comment on this PR).

@ArturoAmorQ
Copy link
Member

Hi @ovenpickled are you still working on this PR? It actually suffices to set normalize_y=True in the GaussianProcessRegressor. Can you make this change and revert all the other changes?

@ovenpickled
Copy link
Contributor Author

Hi @ovenpickled are you still working on this PR? It actually suffices to set normalize_y=True in the GaussianProcessRegressor. Can you make this change and revert all the other changes?

yess sure

@ovenpickled ovenpickled closed this by deleting the head repository Jul 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UserWarnings in the documentation
2 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.