ENH: Consistent NA handling in unique(), and nunique() #61209

Open
@olek-osikowicz

Description

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Currently, Series.nunique has a default parameter dropna=True.
However, Series.unique does not accept a dropna parameter.

This can cause unexpected behaviour: s.nunique() is not necessarily equal to len(s.unique()).
See the example below:

>>> import pandas as pd
>>> s = pd.Series([pd.NA, 1, pd.NA])
>>> s.unique()
array([<NA>, 1], dtype=object)
>>> len(s.unique())
2
>>> s.nunique()
1

I believe this inconsistency should be addressed to avoid implicit behaviour.
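
For reference, here is a minimal sketch of how the two results can be reconciled today by passing dropna explicitly. It only uses the existing Series.nunique, Series.unique and Series.dropna methods; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([pd.NA, 1, pd.NA])

# Default behaviour: missing values are dropped before counting.
n_default = s.nunique()

# Counting missing values as a distinct element matches len(s.unique()).
n_with_na = s.nunique(dropna=False)
n_unique = len(s.unique())

# Going the other way: drop missing values before calling unique().
n_unique_no_na = len(s.dropna().unique())

print(n_default, n_with_na, n_unique, n_unique_no_na)
```

With the proposed default, the explicit dropna=False in the second call would no longer be needed.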

Feature Description

The simplest way to address it would be to change the default of the dropna parameter of Series.nunique to False.
Analogously, the same default would change for DataFrame.nunique.

This would be consistent with the current summary of the method:

Count number of distinct elements in specified axis.
Return Series with number of distinct elements. Can ignore NaN values.

"Can ignore NaN values." hints that ignoring them should be an optional behaviour, not one enabled by default.
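
As an illustration of what the changed default would mean for DataFrame.nunique, the sketch below compares the current default with an explicit dropna=False. The column names and values are made up for the example.

```python
import pandas as pd

# Hypothetical frame with one missing value per column.
df = pd.DataFrame({"a": [1.0, None, 1.0], "b": ["x", "y", None]})

# Current default (dropna=True): missing values are not counted.
default_counts = df.nunique().to_dict()

# Proposed default, available today via an explicit argument:
# missing values count as one additional distinct element per column.
with_na_counts = df.nunique(dropna=False).to_dict()

print(default_counts)
print(with_na_counts)
```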

Alternative Solutions

Another approach to force consistent NaN handling by default would be to adapt Series.unique to accept dropna and set it to True by default.

Although possible, this is more laborious and a more impactful change to the pandas API.
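
A rough sketch of the behaviour this alternative describes, written as a standalone helper rather than a change to pandas itself. The function unique_with_dropna is hypothetical and is not part of the pandas API; it only wraps the existing Series.dropna and Series.unique methods.

```python
import pandas as pd


def unique_with_dropna(s: pd.Series, dropna: bool = True):
    """Hypothetical helper: unique values of s, optionally without missing values."""
    values = s.dropna() if dropna else s
    return values.unique()


s = pd.Series([pd.NA, 1, pd.NA])

# With dropna=True the length agrees with the current nunique() default.
n_dropped = len(unique_with_dropna(s))

# With dropna=False the result is the same as the current s.unique().
n_kept = len(unique_with_dropna(s, dropna=False))

print(n_dropped, n_kept)
```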

Additional Context

No response

EDIT: Typos

Labels

Algos, Enhancement, Needs Discussion
