DOC Fix UserWarning in plot_gpr_prior_posterior #29268

ovenpickled · Jun 14, 2024

Reference Issues/PRs

What does this implement/fix? Explain your changes.

The dev documentation on the Scikit-learn website displayed an error while executing the dot product kernel code block. I scaled the data as the error requested and increased the number of iterations.

Any other comments?

github-actions · Jun 14, 2024

❌ Linting issues

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here

`black`

black detected issues. Please run black . locally and push the changes. Here you can see the detected issues. Note that running black might also fix some of the issues which might be detected by ruff. Note that the installed black version is black=24.3.0.


--- /home/runner/work/scikit-learn/scikit-learn/examples/gaussian_process/plot_gpr_prior_posterior.py	2024-06-14 18:50:36.782432+00:00
+++ /home/runner/work/scikit-learn/scikit-learn/examples/gaussian_process/plot_gpr_prior_posterior.py	2024-06-14 18:50:46.668581+00:00
@@ -204,24 +204,30 @@
 # Scale the data
 scaler = StandardScaler()
 X_train_scaled = scaler.fit_transform(X_train)
 
 # Define the kernel
-kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)
+kernel = ConstantKernel(0.1, (0.01, 10.0)) * (
+    DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2
+)
 
 # Increase the number of iterations
-gpr = GaussianProcessRegressor(kernel=kernel, random_state=0, n_restarts_optimizer=10, optimizer='fmin_l_bfgs_b')
+gpr = GaussianProcessRegressor(
+    kernel=kernel, random_state=0, n_restarts_optimizer=10, optimizer="fmin_l_bfgs_b"
+)
 
 # Plot prior
 fig, axs = plt.subplots(nrows=2, sharex=True, sharey=True, figsize=(10, 8))
 plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[0])
 axs[0].set_title("Samples from prior distribution")
 
 # Fit the model and plot posterior
 gpr.fit(X_train_scaled, y_train)
 plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
-axs[1].scatter(X_train_scaled[:, 0], y_train, color="red", zorder=10, label="Observations")
+axs[1].scatter(
+    X_train_scaled[:, 0], y_train, color="red", zorder=10, label="Observations"
+)
 axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
 axs[1].set_title("Samples from posterior distribution")
 
 fig.suptitle("Dot-product kernel", fontsize=18)
 plt.tight_layout()
would reformat /home/runner/work/scikit-learn/scikit-learn/examples/gaussian_process/plot_gpr_prior_posterior.py

Oh no! 💥 💔 💥
1 file would be reformatted, 926 files would be left unchanged.

`ruff`

ruff detected issues. Please run ruff check --fix --output-format=full . locally, fix the remaining issues, and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.4.9.


examples/gaussian_process/plot_gpr_prior_posterior.py:199:1: I001 [*] Import block is un-sorted or un-formatted
    |
197 |   # Dot-product kernel
198 |   # ..................
199 | / from sklearn.gaussian_process import GaussianProcessRegressor
200 | | from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct
201 | | from sklearn.preprocessing import StandardScaler
202 | | import matplotlib.pyplot as plt
203 | | 
204 | | # Scale the data
    | |_^ I001
205 |   scaler = StandardScaler()
206 |   X_train_scaled = scaler.fit_transform(X_train)
    |
    = help: Organize imports

examples/gaussian_process/plot_gpr_prior_posterior.py:209:89: E501 Line too long (103 > 88)
    |
208 | # Define the kernel
209 | kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)
    |                                                                                         ^^^^^^^^^^^^^^^ E501
210 | 
211 | # Increase the number of iterations
    |

examples/gaussian_process/plot_gpr_prior_posterior.py:212:89: E501 Line too long (113 > 88)
    |
211 | # Increase the number of iterations
212 | gpr = GaussianProcessRegressor(kernel=kernel, random_state=0, n_restarts_optimizer=10, optimizer='fmin_l_bfgs_b')
    |                                                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^ E501
213 | 
214 | # Plot prior
    |

examples/gaussian_process/plot_gpr_prior_posterior.py:222:89: E501 Line too long (91 > 88)
    |
220 | gpr.fit(X_train_scaled, y_train)
221 | plot_gpr_samples(gpr, n_samples=n_samples, ax=axs[1])
222 | axs[1].scatter(X_train_scaled[:, 0], y_train, color="red", zorder=10, label="Observations")
    |                                                                                         ^^^ E501
223 | axs[1].legend(bbox_to_anchor=(1.05, 1.5), loc="upper left")
224 | axs[1].set_title("Samples from posterior distribution")
    |

Found 4 errors.
[*] 1 fixable with the `--fix` option.

_{Generated for commit: bb87338. Link to the linter CI: here}

ArturoAmorQ

Hi @ovenpickled, thanks for the PR. Here is a first batch of comments. I just have to mention that I have the impression that you used an LLM to assist you with this issue. Even if using them is not discouraged per se, do try to keep the quality of the codebase by:

avoiding repeated imports
keeping line length below 88 characters
not making unrelated changes
most importantly: do not contribute code that you don't understand

ArturoAmorQ · Jun 17, 2024

examples/gaussian_process/plot_gpr_prior_posterior.py

+from sklearn.gaussian_process import GaussianProcessRegressor
 from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct
+from sklearn.preprocessing import StandardScaler
+import matplotlib.pyplot as plt


matplotlib.pyplot is already imported in line 33
GaussianProcessRegressor is already imported in line 100

Suggested change

from sklearn.gaussian_process import GaussianProcessRegressor

from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct

from sklearn.preprocessing import StandardScaler

import matplotlib.pyplot as plt

from sklearn.gaussian_process.kernels import ConstantKernel, DotProduct

from sklearn.preprocessing import StandardScaler

ArturoAmorQ · Jun 17, 2024

examples/gaussian_process/plot_gpr_prior_posterior.py

+# Define the kernel
+kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)


Please avoid unrelated changes.

Suggested change

# Define the kernel

kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)

kernel = ConstantKernel(0.1, (0.01, 10.0)) * (

DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2

)

ArturoAmorQ · Jun 17, 2024

examples/gaussian_process/plot_gpr_prior_posterior.py


-# plot prior
+# Increase the number of iterations
+gpr = GaussianProcessRegressor(kernel=kernel, random_state=0, n_restarts_optimizer=10, optimizer='fmin_l_bfgs_b')


Adding comments to the code is mostly required when showing the reasoning behind a complex decision, for instance here, a comment would rather explain the need for the "fmin_l_bfgs_b" optimizer.

Also, try to keep lines under 88 characters (see the linter comment on this PR).

ArturoAmorQ · Jul 1, 2024

Hi @ovenpickled are you still working on this PR? It actually suffices to set normalize_y=True in the GaussianProcessRegressor. Can you make this change and revert all the other changes?

ovenpickled · Jul 1, 2024

Hi @ovenpickled are you still working on this PR? It actually suffices to set normalize_y=True in the GaussianProcessRegressor. Can you make this change and revert all the other changes?

yess sure

Fix UserWarning

bb87338

ArturoAmorQ changed the title ~~fix UserWarning in documentation~~ DOC Fix UserWarning in plot_gpr_prior_posterior Jun 17, 2024

github-actions bot added the Documentation label Jun 17, 2024

ArturoAmorQ mentioned this pull request Jun 17, 2024

UserWarnings in the documentation #29055

Closed

7 tasks

ArturoAmorQ reviewed Jun 17, 2024

View reviewed changes

ArturoAmorQ mentioned this pull request Jun 18, 2024

DOC Add guidelines for automated contributions #29287

Merged

ovenpickled closed this by deleting the head repository Jul 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

DOC Fix UserWarning in plot_gpr_prior_posterior #29268

DOC Fix UserWarning in plot_gpr_prior_posterior #29268

Uh oh!

ovenpickled commented Jun 14, 2024

Uh oh!

github-actions bot commented Jun 14, 2024

Uh oh!

ArturoAmorQ left a comment

Uh oh!

ArturoAmorQ Jun 17, 2024

Uh oh!

ArturoAmorQ Jun 17, 2024

Uh oh!

ArturoAmorQ Jun 17, 2024 •

edited

Loading

Uh oh!

ArturoAmorQ commented Jul 1, 2024

Uh oh!

ovenpickled commented Jul 1, 2024

Uh oh!

Uh oh!

		# Define the kernel
		kernel = ConstantKernel(0.1, (0.01, 10.0)) * (DotProduct(sigma_0=1.0, sigma_0_bounds=(0.1, 10.0)) ** 2)

Search code, repositories, users, issues, pull requests...

Uh oh!

DOC Fix UserWarning in plot_gpr_prior_posterior #29268

DOC Fix UserWarning in plot_gpr_prior_posterior #29268

Uh oh!

Conversation

ovenpickled commented Jun 14, 2024

Reference Issues/PRs

What does this implement/fix? Explain your changes.

Any other comments?

Uh oh!

github-actions bot commented Jun 14, 2024

❌ Linting issues

black

ruff

Uh oh!

ArturoAmorQ left a comment

Choose a reason for hiding this comment

Uh oh!

ArturoAmorQ Jun 17, 2024

Choose a reason for hiding this comment

Uh oh!

ArturoAmorQ Jun 17, 2024

Choose a reason for hiding this comment

Uh oh!

ArturoAmorQ Jun 17, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ArturoAmorQ commented Jul 1, 2024

Uh oh!

ovenpickled commented Jul 1, 2024

Uh oh!

Uh oh!

`black`

`ruff`

ArturoAmorQ Jun 17, 2024 •

edited

Loading