API make naive Bayes API consistent in regards to "priors" #27135


Open: glemaitre wants to merge 11 commits into base branch main.
Conversation

@glemaitre (Member) commented Aug 22, 2023

This PR intends to make the API of the class priors consistent across the different naive Bayes classifiers.

It deprecates class_prior and fit_prior in favor of a single parameter priors that is consistent with GaussianNB (introduced later) and LinearDiscriminantAnalysis.

I also introduce class_log_prior_ in GaussianNB to be consistent with other estimators.

This change will also be helpful for the development of #22574. We can consistently set the priors attribute for all naive Bayes implementations and define a predictable API.
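For illustration, a minimal sketch of the difference from a user's perspective; the commented-out line shows the unified parameter this PR proposes (not merged or released, the exact name and accepted values may change):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB

X = np.random.RandomState(0).randint(5, size=(6, 10))
y = np.array([0, 0, 0, 1, 1, 1])

# Today, the discrete variants take `class_prior` (and `fit_prior`) ...
clf_discrete = MultinomialNB(class_prior=[0.3, 0.7]).fit(X, y)

# ... while GaussianNB already takes `priors`.
clf_gaussian = GaussianNB(priors=[0.3, 0.7]).fit(X, y)

# With this PR (sketch only): every naive Bayes classifier would accept
# the same `priors` parameter.
# clf_unified = MultinomialNB(priors=[0.3, 0.7]).fit(X, y)
```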

@glemaitre glemaitre marked this pull request as draft August 22, 2023 16:45
github-actions bot commented Aug 22, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 84292de.

@avm19 (Contributor) commented Aug 22, 2023

On a related note, there are differences in both setting and getting priors to/from estimators, compare:

  • GaussianNB(priors=...).class_prior_ vs
  • BernoulliNB(class_prior=...).class_log_prior_.

The docstring says that class_log_prior_ is "smoothed":

class_log_prior_ : ndarray of shape (n_classes,)
Log probability of each class (smoothed).

I could not find in source code where the smoothing is applied to priors or counts of labels (as opposed to features). It is possible that the "smoothed" bit is included by mistake (or that I missed something).
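To make the asymmetry concrete, a small runnable snippet (not part of the PR; attribute names as in the current release):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, BernoulliNB

X = np.random.RandomState(0).randint(2, size=(6, 4))
y = np.array([0, 0, 0, 1, 1, 1])

# GaussianNB: the prior is set and read back as probabilities.
gnb = GaussianNB(priors=[0.25, 0.75]).fit(X, y)
print(gnb.class_prior_)               # [0.25 0.75]

# BernoulliNB: the prior is set as probabilities but read back as
# log-probabilities, under a different attribute name.
bnb = BernoulliNB(class_prior=[0.25, 0.75]).fit(X, y)
print(np.exp(bnb.class_log_prior_))   # [0.25 0.75]
```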

@glemaitre (Member, Author) commented Aug 22, 2023

Indeed, I added class_log_prior_ to GaussianNB. I refactored the code so that all classifiers call the same method to define the fitted attribute. I don't really intend to deprecate class_prior_.

> I could not find in source code where the smoothing is applied to priors or counts of labels (as opposed to features). It is possible that the "smoothed" bit is included by mistake (or that I missed something).

I have not looked into it yet. I would expect alpha to play a role in the smoothing, but it would most probably have to affect class_count_; otherwise there is no smoothing. I need to take a closer look.

Edit: I did not see any smoothing either. Not sure where this comes from; feature_log_prob_ is the attribute that includes smoothing due to the alpha parameter.
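A quick check (not part of the PR) that supports this reading: class_log_prior_ does not move with alpha, while feature_log_prob_ does.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

X = np.random.RandomState(0).randint(2, size=(8, 5))
y = np.array([0, 0, 0, 0, 0, 1, 1, 1])

for alpha in (0.5, 1.0, 100.0):
    clf = BernoulliNB(alpha=alpha).fit(X, y)
    # class_log_prior_ stays at log([5/8, 3/8]) for every alpha,
    # while the smoothed feature probabilities change.
    print(alpha, np.exp(clf.class_log_prior_), clf.feature_log_prob_[0, :2])
```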

@@ -577,30 +657,6 @@ def _check_X_y(self, X, y, reset=True):
"""Validate X and y in fit methods."""
return self._validate_data(X, y, accept_sparse="csr", reset=reset)

def _update_class_log_prior(self, class_prior=None):
glemaitre (Member Author) commented on this diff:

This function moved to _BaseNB since it can be used by all naive Bayes algorithms.
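For context, the moved method computes the class log-prior from an explicit prior, the empirical class frequencies, or a uniform distribution. Roughly (a paraphrased sketch, not the exact upstream code):

```python
import numpy as np

def _update_class_log_prior(self, class_prior=None):
    """Paraphrased sketch of the discrete naive Bayes helper."""
    n_classes = len(self.classes_)
    if class_prior is not None:
        if len(class_prior) != n_classes:
            raise ValueError("Number of priors must match number of classes.")
        self.class_log_prior_ = np.log(class_prior)
    elif self.fit_prior:
        # Empirical prior: log of the relative class frequencies seen in fit.
        self.class_log_prior_ = np.log(self.class_count_) - np.log(
            self.class_count_.sum()
        )
    else:
        # Uniform prior over the classes.
        self.class_log_prior_ = np.full(n_classes, -np.log(n_classes))
```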

@@ -439,24 +530,6 @@ def _partial_fit(self, X, y, classes=None, _refit=False, sample_weight=None):
self.var_ = np.zeros((n_classes, n_features))

self.class_count_ = np.zeros(n_classes, dtype=np.float64)

# Initialise the class prior
glemaitre (Member Author) commented on this diff:

The class prior initialization is now factorized across all the specialized naive Bayes classes.

else:
self._priors = "uniform"

def _update_class_log_prior(self):
glemaitre (Member Author) commented on this diff:

This comes from _BaseDiscreteNB and is slightly changed to use the _priors attribute.
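As an illustration of the change, the refactored method might read the prior policy from the private _priors attribute roughly as follows (a sketch based on this thread, not the exact diff):

```python
import numpy as np

def _update_class_log_prior(self):
    """Sketch: derive class_log_prior_ from the resolved _priors policy."""
    n_classes = len(self.classes_)
    if isinstance(self._priors, str):
        if self._priors == "empirical":
            # Empirical prior from the class counts accumulated during fit.
            self.class_log_prior_ = np.log(self.class_count_) - np.log(
                self.class_count_.sum()
            )
        else:  # "uniform"
            self.class_log_prior_ = np.full(n_classes, -np.log(n_classes))
    else:
        priors = np.asarray(self._priors)
        if len(priors) != n_classes:
            raise ValueError("Number of priors must match number of classes.")
        self.class_log_prior_ = np.log(priors)
```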

def __init__(self, *, priors=None):
self.priors = priors

def _validate_priors(self):
glemaitre (Member Author) commented on this diff:

This is the main change that smoothly handles the deprecation in the discrete naive Bayes classifiers.
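A possible shape for this deprecation handling, sketched from the discussion only; the "deprecated" sentinel and the exact control flow here are assumptions, not the PR's code:

```python
import warnings

def _validate_priors(self):
    """Sketch: map deprecated class_prior/fit_prior onto the new priors."""
    class_prior_set = self.class_prior != "deprecated"
    fit_prior_set = self.fit_prior != "deprecated"
    if class_prior_set or fit_prior_set:
        warnings.warn(
            "`class_prior` and `fit_prior` are deprecated; use `priors` instead.",
            FutureWarning,
        )
        if self.priors is not None:
            raise ValueError(
                "Set only `priors`; do not combine it with the deprecated parameters."
            )
        if class_prior_set and self.class_prior is not None:
            self._priors = self.class_prior
        elif fit_prior_set and not self.fit_prior:
            self._priors = "uniform"
        else:
            self._priors = "empirical"
    elif self.priors is None:
        self._priors = "empirical"
    else:
        self._priors = self.priors
```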

@glemaitre glemaitre marked this pull request as ready for review August 22, 2023 20:05
@adrinjalali (Member) left a comment

This also needs a doc build trigger to find all the places in the documentation where we need to make changes.

- if `"uniform"`: a uniform prior is used for each class;
- if `"empirical"`: the prior uses the class proportions from the
training data;
- if `None`: equivalent to `"empirical"`;
adrinjalali (Member):

Should we not change the default and deprecate None here?

glemaitre (Member Author):

Actually, None allows me to detect whether a user is setting both priors and class_prior/fit_prior. If I set the default to "empirical", I cannot tell whether the user explicitly passed "empirical" or is just using the default value.

It looks like we need to go with one of those long deprecations: introduce None while removing the old parameters, then deprecate None in 1.6 and remove it in 1.8.

adrinjalali (Member):

We could have an undocumented hidden value, None or "DEFAULT", so that you can know whether the user explicitly passed anything. We don't have to expose that value to users, do we?
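For reference, the hidden-sentinel pattern being suggested looks roughly like this (a generic illustration, not the code in this PR; a string sentinel is used so get_params and the repr stay readable):

```python
_DEFAULT = "__default__"  # hidden sentinel, never documented

class ExampleNB:
    """Toy illustration of detecting an explicitly set parameter."""

    def __init__(self, priors=_DEFAULT):
        self.priors = priors

    def _resolve_priors(self):
        if self.priors == _DEFAULT:
            # The user did not pass anything: fall back to the default policy.
            return "empirical"
        # The user explicitly set a value, even if it equals the default policy.
        return self.priors

print(ExampleNB()._resolve_priors())                    # empirical (implicit)
print(ExampleNB(priors="empirical")._resolve_priors())  # empirical (explicit)
```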

@avm19 (Contributor) commented Sep 13, 2023:

Just a side note: detecting a non-default (user-set) value is also relevant to glemaitre's suggestion of automatically propagating priors from the meta-estimator in #22574.

@avm19 (Contributor) commented Sep 11, 2023

> It deprecates class_prior and fit_prior in favor of a single parameter priors that is consistent with GaussianNB (introduced later) and LinearDiscriminantAnalysis.

I was wondering whether the name class_prior is better than just prior. Here are my thoughts:

  • As far as I can tell, the term "prior" is used by Bayesians mainly for a prior distribution over a parameter of another distribution, i.e. a hidden variable (e.g. $p(\alpha|\alpha_1, \alpha_2)$ is the prior for the parameter alpha in Bayesian ridge regression). Here, however, we have a prior distribution over the label, which is an observed variable. Although naive Bayes is not a Bayesian method, consistency with Bayesian usage won't hurt.
  • If P(Y) and P(Y|X) are referred to as the "prior" and the "posterior", then P(X) and P(X|Y) could be called that as well. The name class_prior disambiguates between the class label Y and the covariates X. We would normally call P(X) and P(X|Y) the marginal and the class-conditional distribution, respectively, but nevertheless...
