Confidence intervals for linear models #6773

Closed
@lqdc

Description

Hey guys,

This is a proposal to add confidence intervals to linear models in scikit-learn.

This would be useful because statsmodels only works on small datasets and is not as user-friendly.
It is also useful in situations where one has to put more trust in the estimated probabilities, particularly where a very low FP or FN rate is desired.

This is a rough example of how to do it with logistic regression, with a test on some text data from the 20 newsgroups dataset:
https://gist.github.com/lqdc/1ea1682ad1214956d95904ebde3134a5

There are some limitations:
Standard errors have to be estimated on a validation set or some other non-training set. n-fold cross-validation on the training set would work.

It doesn't work well on data that is far from normally distributed, so it wouldn't work on sparse or unscaled data.
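
To make the idea concrete, here is a minimal sketch of one common way to get such intervals. It is not necessarily what the gist does: it assumes Wald-style intervals built from the Fisher information, with the standard errors estimated on held-out data as described above. The function name `wald_probability_intervals`, its parameters, and the synthetic data are all made up for illustration.

```python
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def wald_probability_intervals(coef, intercept, X_val, X_new, z=1.96):
    """Approximate intervals for predicted probabilities (hypothetical helper).

    coef, intercept: parameters of an already fitted logistic regression.
    X_val: held-out rows used to estimate the standard errors.
    X_new: rows to compute intervals for.
    z: normal quantile (1.96 gives roughly 95% intervals).
    """
    # Prepend a column of ones so the intercept is treated as a parameter.
    A_val = np.hstack([np.ones((X_val.shape[0], 1)), X_val])
    A_new = np.hstack([np.ones((X_new.shape[0], 1)), X_new])
    beta = np.concatenate([[intercept], coef])

    # Fisher information on the held-out data: A^T diag(p * (1 - p)) A.
    p_val = sigmoid(A_val @ beta)
    w = p_val * (1.0 - p_val)
    fisher = A_val.T @ (A_val * w[:, None])
    cov = np.linalg.pinv(fisher)  # approximate covariance of beta

    # Standard error of each linear predictor: sqrt(a^T cov a) per row a.
    eta = A_new @ beta
    var = np.einsum("ij,jk,ik->i", A_new, cov, A_new)
    se = np.sqrt(np.maximum(var, 0.0))

    # Build the interval on the logit scale, then map it to probabilities.
    return sigmoid(eta - z * se), sigmoid(eta), sigmoid(eta + z * se)


# Synthetic stand-ins for a fitted model and held-out data (assumptions).
rng = np.random.default_rng(0)
X_val = rng.normal(size=(200, 3))
X_new = rng.normal(size=(5, 3))
coef = np.array([1.0, -2.0, 0.5])
intercept = 0.1

lower, prob, upper = wald_probability_intervals(coef, intercept, X_val, X_new)
```

With a fitted binary `LogisticRegression`, `clf.coef_.ravel()` and `clf.intercept_[0]` could be passed as `coef` and `intercept`. Since scikit-learn applies L2 regularization by default, intervals obtained this way should be treated as approximate at best.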
