Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

Modifications to LinearRegression documentation. #22561

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
Loading
from

Conversation

thomasoliveira
Copy link

Reference Issues/PRs

Fixes #22551.

What does this implement/fix? Explain your changes.

From the documentation of https://scikit-learn.org/stable/modules/linear_model.html#ordinary-least-squares, one could understand that LinearRegression fits a model where the intercept $w_0$ is absent, that is, where $w_0=0$.

To prevent this misinterpretation, the present PR rephrases:

  1. "Across the module, we designate the vector $w = (w_1, ..., w_p)$ as coef_ and $w_0$ as intercept_." as "Across the module, we designate the vector $(w_1, ..., w_p)$ as coef_ and $w_0$ as intercept_."
  2. "LinearRegression fits a linear model with coefficients $w = (w_1, ..., w_p)" as "LinearRegression fits a linear model with coefficients $w = (w_0, w_1, ..., w_p)"
  3. "and will store the coefficients of the linear model in its coef_ member" as "and will store the coefficients of the linear model in its coef_ and intercept_ members"

Any other comments?

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @thomasoliveira !

doc/modules/linear_model.rst Outdated Show resolved Hide resolved
doc/modules/linear_model.rst Outdated Show resolved Hide resolved
@@ -13,7 +13,7 @@ value.

.. math:: \hat{y}(w, x) = w_0 + w_1 x_1 + ... + w_p x_p

Across the module, we designate the vector :math:`w = (w_1,
Across the module, we designate the vector :math:`(w_1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have 2 options:

  1. Still assign w = (w_1, ...) because this is used in the penalty terms. Then we need to add w_0 in many places.
  2. Adjust the penalty terms, because we can't use w anymore as the intercept is not penalized.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current documentation is formulated around X being a design matrix (including the constant 1 column), which implies that w_0 is in the w vector. In general, I think using X as the design matrix is consistent with literature around linear regression.

With that in mind, I prefer option 2 where we rewrite the penalty terms with summations that do not include the w_0 term.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we should state that by :math:`Xw` we mean w_0 + w_1 x_1 + .... A priori, I'm fine with all ways as long as we are consistent throughout the user guide (within chapter linear models).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With option 2, how do we write an L2 penalty? Should we define a new intercept-less vector? With which name? (I would use \tilde{w}, but this only works in math mode not in Python code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion which option to choose as long as it is correct and consistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using \tilde{w} would be a nice solution from a LaTeX notation perspective. If we want to keep it simple, we could also name the vector u, however that would be a departure from other literature (not a bad one imo). I prefer the prior.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lorentzenchr @thomasjpfan @Micky774 Regarding this, I feel \tilde{w} is better option for Penalty term. I can take this forward from here if everyone agrees and I can make a new PR with all the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LinearRegression in User Guide seems to have $w_0=0$
5 participants
Morty Proxy This is a proxified and sanitized view of the page, visit original site.