ENH: The row and column indexing mechanism of your dataframe is inefficient, leading to errors and unnecessary time consumption

Feature Type

Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas

Problem Description

The row and column indexing mechanism of your DataFrame is inefficient, leading to errors and unnecessary time consumption for users. When two DataFrames are merged or concatenated, horizontally or vertically, the result can contain duplicate index labels. If code then iterates over the index in a for loop, a single iteration can act on two rows at once, which is a typical scenario that leads to calculation errors. For example:

import pandas as pd

# df1 and df2 are existing DataFrames that share a 'title' column.
df = pd.concat([df1, df2]).drop_duplicates('title')
# This line must be included every time; otherwise duplicate index labels cause errors in the loop below.
df.reset_index(drop=True, inplace=True)
df['name'] = None
name_list = ['mike', 'jake', 'cook']
for idx, row in df.iterrows():
    df.at[idx, 'name'] = ",".join(name_list)

Without df.reset_index(drop=True, inplace=True), the lookup for this cell returns two values instead of the single value written in the code. In the debugger:

(Pdb) p df.at[idx, 'name'].index
Index([1, 1], dtype='int64')

So I hope that when the rows or columns of a DataFrame change, pandas can maintain the index automatically as an internal mechanism, just like C++ vectors or arrays: after elements are deleted or removed, the index or iterator is automatically kept as a contiguous sequence of numbers, and users do not have to manage it. This is also a matter of competitor analysis and benchmarking. I hope this can be improved. Thank you.
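
The behavior described above can be reproduced with a minimal sketch. The two small frames and their contents are hypothetical, made up only for illustration. The sketch also shows the existing `ignore_index=True` option of `pd.concat` and `DataFrame.drop_duplicates`, which renumbers the rows without a separate `reset_index` call. This is offered as one possible workaround under those assumptions, not as the requested change in default behavior.

```python
import pandas as pd

# Hypothetical input frames; each gets the default labels 0 and 1.
df1 = pd.DataFrame({"title": ["a", "b"]})
df2 = pd.DataFrame({"title": ["c", "d"]})

# Plain concat keeps the original labels, so 0 and 1 each appear twice.
stacked = pd.concat([df1, df2])
labels_are_unique = stacked.index.is_unique

# Looking up a duplicated label returns a Series (two rows), not a scalar.
lookup = stacked.loc[1, "title"]

# ignore_index=True renumbers the result 0..n-1 during the concat itself.
renumbered = pd.concat([df1, df2], ignore_index=True)
single = renumbered.at[1, "title"]

# drop_duplicates accepts the same option, so the labels stay contiguous
# even after rows are removed.
deduped = pd.concat([df1, df1, df2]).drop_duplicates(
    subset="title", ignore_index=True
)
```

With contiguous labels, `at` addresses exactly one cell per label, so the loop in the report no longer touches two rows in one iteration.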

Feature Description

n/a

Alternative Solutions

n/a

Additional Context

No response

ENH: The row and column indexing mechanism of your dataframe is inefficient, leading to errors and unnecessary time consumption #61230