Support orthogonal polynomial features (via QR decomposition) in PolynomialFeatures

Describe the workflow you want to enable

I want to introduce support for orthogonal polynomial features via QR decomposition in PolynomialFeatures, closely mirroring the behavior of R's poly() function.

In regression modeling, orthogonal polynomials often improve numerical stability and reduce multicollinearity among the polynomial terms.

As an example of what the difference looks like in R:

# Fit a degree-3 polynomial on raw monomial terms (no orthogonal basis)
model_raw <- lm(y ~ I(x) + I(x^2) + I(x^3), data = data)
# Alternative spelling of the same raw fit:
# model_raw <- lm(y ~ poly(x, 3, raw = TRUE), data = data)

# Fit the same degree-3 polynomial using an orthogonal basis
model_poly <- lm(y ~ poly(x, 3), data = data)

This behavior cannot currently be replicated with scikit-learn's PolynomialFeatures, which only produces the raw monomial terms. As a result, transitioning from R to Python often leads to discrepancies in model behavior and performance.
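To make the collinearity point concrete on the Python side, here is a minimal NumPy sketch comparing a raw monomial design matrix with an orthonormal basis for the same column space. The grid `x` and the degree are arbitrary choices for illustration, and the orthonormal basis comes from a plain `np.linalg.qr` call rather than from any existing scikit-learn API.

```python
import numpy as np

# Illustrative input: 50 evenly spaced points on [0, 10]
x = np.linspace(0.0, 10.0, 50)

# Raw monomial design matrix with columns 1, x, x^2, x^3
raw = np.vander(x, 4, increasing=True)

# Orthonormal basis spanning the same column space, via reduced QR
q, _ = np.linalg.qr(raw)

cond_raw = np.linalg.cond(raw)
cond_orth = np.linalg.cond(q)

# Pairwise correlations between the non-constant raw columns
corr_raw = np.corrcoef(raw[:, 1:], rowvar=False)

print(f"condition number, raw basis:        {cond_raw:.3g}")
print(f"condition number, orthogonal basis: {cond_orth:.3g}")
print(f"correlation of x and x^2 (raw):     {corr_raw[0, 1]:.3f}")
```

The raw columns are strongly correlated and the design matrix is poorly conditioned, while the orthonormal columns have a condition number of essentially 1.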

Describe your proposed solution

I propose extending PolynomialFeatures with a new parameter:

PolynomialFeatures(..., method="raw")

Accepted values:

"raw" (default): retains existing behavior, returning standard raw terms
"qr": applies QR decomposition to each feature to generate orthogonal polynomial features.

Because R's poly() only operates on 1D input vectors, my thought was to apply QR decomposition feature by feature when the input is multi-dimensional. Each column is processed independently, mirroring R's approach.
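The feature-by-feature idea could be sketched roughly as follows. This is only an illustration under my own assumptions: `orthogonal_poly_1d` and `orthogonal_poly_features` are hypothetical helper names, not scikit-learn functions, and the construction (center the column, QR-decompose the monomial matrix, fix the signs, drop the constant column) is intended to follow what poly() does but has not been checked against R output here.

```python
import numpy as np


def orthogonal_poly_1d(x, degree):
    """Orthonormal polynomial columns of degree 1..degree for one feature.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    x = np.asarray(x, dtype=float)
    if np.unique(x).size <= degree:
        raise ValueError("degree must be less than the number of unique points")
    centered = x - x.mean()
    # Monomial matrix with columns 1, x, x^2, ..., x^degree
    vander = np.vander(centered, degree + 1, increasing=True)
    q, r = np.linalg.qr(vander)
    # Flip signs so diag(R) is positive, which makes the result
    # independent of the QR routine's sign convention
    q = q * np.sign(np.diag(r))
    # Drop the constant column, keeping degrees 1..degree
    return q[:, 1:]


def orthogonal_poly_features(X, degree):
    """Apply orthogonal_poly_1d to each column and stack the results."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    blocks = [orthogonal_poly_1d(X[:, j], degree) for j in range(X.shape[1])]
    return np.hstack(blocks)


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Z = orthogonal_poly_features(X, degree=3)
print(Z.shape)  # (20, 6): three columns per input feature
```

Note that with this approach the columns are orthonormal only within each feature's block; columns derived from different input features are generally still correlated with each other.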

This feature would interact with other parameters as follows:

include_bias: When method="qr", the orthogonal polynomial basis inherently includes a transformed first column. However, this column is not a plain column of ones, so include_bias=True (which adds a column of ones) becomes redundant or misleading in this context. One option is to always behave as if include_bias=False when method="qr" and return only the orthogonal columns; another is to raise a warning.
interaction_only: This would be incompatible with method="qr", since the QR-based transformation does not naturally support selective inclusion of interaction terms.
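The include_bias point can be checked on a small example. In the sketch below (the input values are arbitrary, and the sign normalization is my own choice to make the output deterministic), the first column of Q from a QR decomposition of the monomial matrix is constant but scaled to unit norm, so it is not a column of ones, and the remaining columns are orthogonal to it.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Monomial matrix of the centered input, columns 1, x, x^2
vander = np.vander(x - x.mean(), 3, increasing=True)

q, r = np.linalg.qr(vander)
# Flip signs so diag(R) is positive (deterministic sign convention)
q = q * np.sign(np.diag(r))

first_col = q[:, 0]
higher_cols = q[:, 1:]

print(first_col)                # constant entries equal to 1/sqrt(n)
print(higher_cols.sum(axis=0))  # approximately zero
```

Because the higher-degree columns sum to zero, they are orthogonal to any constant column, which is why a separate intercept can be handled by the downstream estimator instead of by the transformer.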

Describe alternatives you've considered, if relevant

Currently, users must implement the QR decomposition manually when orthogonal polynomials are needed. This is a common pattern in statistical workflows, but as far as I can tell it lacks "off the shelf" support in the major Python libraries. This feature would eliminate the need to do the decomposition by hand and would improve workflows for researchers who are used to R's statistical tools.

Additional context

This idea stemmed from a broader effort to convert statistical modeling pipelines from R to Python, where discrepancies in regression results were traced to the lack of orthogonal polynomial support in PolynomialFeatures.

I have drafted and tested a 1D implementation of this feature but wanted feedback on whether this idea aligns with scikit-learn's scope before moving on. In particular, I'd appreciate input on:

Acceptability of feature-wise orthogonalization for multi-feature input.
Preferred parameter naming (e.g., method="qr" vs. orthogonal=True).
Compatibility decisions around parameters like include_bias and interaction_only.
