ENH: The row and column indexing mechanism of your dataframe is inefficient, leading to errors and unnecessary time consumption #61230

Open
@zyy37

Description


Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

The row and column indexing mechanism of DataFrame is inefficient and error-prone, and it costs users unnecessary time. When two DataFrames are merged or concatenated, horizontally or vertically, the result can contain duplicate index labels. If code then iterates over the index in a for loop, a label-based operation inside the loop applies to every row that shares the label, so it is effectively repeated twice in one iteration. This is a typical scenario that leads to calculation errors. For example,

import pandas as pd

# df1 and df2 are assumed to be existing DataFrames that share a 'title' column.
df = pd.concat([df1, df2]).drop_duplicates('title')
# This call must be included every time; otherwise duplicate index labels cause loop iteration errors.
df.reset_index(drop=True, inplace=True)
df['name'] = None
name_list = ['mike', 'jake', 'cook']
for idx, row in df.iterrows():
    df.at[idx, 'name'] = ",".join(name_list)

Without the df.reset_index(drop=True, inplace=True) call, df.at[idx, 'name'] returns two entries instead of the single cell the code intends, because the label idx appears twice in the index:

(Pdb) p df.at[idx, 'name'].index
Index([1, 1], dtype='int64')

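
The duplication described above can be reproduced with two small frames. This is a minimal sketch: the sample data and the variable names (stacked, renumbered, deduplicated) are assumptions made for illustration, not taken from the original report.

```python
import pandas as pd

# Hypothetical sample frames; the column values are assumptions.
df1 = pd.DataFrame({"title": ["a", "b"]})
df2 = pd.DataFrame({"title": ["c", "d"]})

# concat keeps each frame's original labels, so 0 and 1 each appear twice.
stacked = pd.concat([df1, df2])
labels = stacked.index.tolist()
unique_before = stacked.index.is_unique

# A label-based lookup matches every row that carries the label.
rows_for_label_1 = stacked.loc[1]

# ignore_index=True renumbers the result 0..n-1 as part of the concat.
renumbered = pd.concat([df1, df2], ignore_index=True)

# drop_duplicates accepts the same flag, which covers the chained call.
deduplicated = pd.concat([df1, df2]).drop_duplicates("title", ignore_index=True)
```

Here stacked.loc[1] selects two rows, while renumbered and deduplicated both carry a contiguous index, so no separate reset_index call is needed in this sketch.
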
So I hope that when the rows or columns of a DataFrame change, pandas can maintain the index automatically as an internal mechanism, the way C++ vectors and arrays do: after elements are deleted or removed, the indices stay contiguous and users do not have to manage them. This request also comes from comparing pandas against competing tools. I hope this can be improved. Thank you.
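
For comparison, the loop can be written against integer positions, which behave like the contiguous vector indices described above regardless of the labels. This is only a sketch of a possible workaround, assuming a hypothetical frame with deliberately duplicated labels; it is not a proposal for how pandas should implement the request.

```python
import pandas as pd

# Hypothetical frame whose index deliberately repeats the label 1.
df = pd.DataFrame({"title": ["a", "b", "c"]}, index=[0, 1, 1])
df["name"] = None

name_list = ["mike", "jake", "cook"]
name_col = df.columns.get_loc("name")  # integer position of the 'name' column

for pos in range(len(df)):
    # iat addresses exactly one cell by integer position and ignores labels.
    df.iat[pos, name_col] = ",".join(name_list)
```

Each row is written exactly once even though the index is left untouched. When every row receives the same value, as here, the single assignment df["name"] = ",".join(name_list) avoids the loop entirely.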

Feature Description

n/a

Alternative Solutions

n/a

Additional Context

No response

Metadata

Labels

Enhancement, Needs Triage
