Description
Describe the workflow you want to enable
CategoricalNB (https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.CategoricalNB.html) needs proper handling of unseen values in validation data, i.e. when the validation data contains categories that do not appear in the training data.
Meanwhile, OrdinalEncoder puts all unseen values into a single category, 0.
The documentation says: CategoricalNB assumes that the sample matrix X is encoded (for instance with the help of OrdinalEncoder) such that all categories for each feature i are represented with numbers 0, ..., n_i - 1, where n_i is the number of available categories of feature i.
Using min_categories does not solve this problem either.
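A minimal sketch of the situation described above, in case it helps to reproduce it. The toy data and the variable names (X_train, X_valid, unseen_ok) are made up for illustration, and the exact exception type may differ between scikit-learn versions, so both IndexError and ValueError are caught here.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# One feature whose training data only contains the categories 0, 1 and 2.
X_train = np.array([[0], [1], [2], [1]])
y_train = np.array([0, 1, 1, 0])

clf = CategoricalNB().fit(X_train, y_train)

# Validation row holding category 3, which was never seen during fit.
X_valid = np.array([[3]])

try:
    clf.predict(X_valid)
    unseen_ok = True
except (IndexError, ValueError) as exc:
    # The fitted per-feature probability tables have no column for category 3.
    unseen_ok = False
    print(f"predict failed on an unseen category: {exc!r}")
```

The model learns the number of categories per feature from the training data (n_categories_), so an index outside that range has no entry to look up at prediction time.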
Describe your proposed solution
Not sure, but rows that contain unseen values should still be usable for inference.
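One possible workaround, shown only as a sketch and not as the proposed fix for CategoricalNB itself: let OrdinalEncoder encode unknown categories as -1 (handle_unknown="use_encoded_value", unknown_value=-1), then shift every code by +1 so that index 0 becomes a reserved "unknown" category. The helper name encode and the toy colour data are hypothetical.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

X_train = np.array([["red"], ["green"], ["blue"], ["green"]], dtype=object)
y_train = np.array([0, 1, 1, 0])
X_valid = np.array([["purple"], ["red"]], dtype=object)  # "purple" is unseen

# Unknown categories are encoded as -1 instead of raising an error.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)


def encode(X, fit=False):
    """Encode X and shift by +1 so that unknown (-1) maps to index 0."""
    codes = encoder.fit_transform(X) if fit else encoder.transform(X)
    return codes.astype(int) + 1


# Training codes are 1..n, so index 0 never occurs during fit and only
# receives the smoothing mass given by alpha.
clf = CategoricalNB(alpha=1.0).fit(encode(X_train, fit=True), y_train)

pred = clf.predict(encode(X_valid))
proba = clf.predict_proba(encode(X_valid))
```

With this encoding the unseen value contributes only the smoothed probability alpha / (N_c + alpha * n_i) for each class c, so prediction no longer fails. Whether that is the right statistical treatment of unseen values is a separate question.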