Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

ENH: read_csv with usecols shouldn't change column order #61386

Copy link
Copy link
Open
@amarvin

Description

@amarvin
Issue body actions

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

The documentation for pandas.read_csv(usecols=[...]) says that it treats the iterable list of columns like an unordered set (updated in #18673 and #53763), so the returned dataframe won't necessarily have the same column order. This is different behaviour from other pandas data reading methods (e.g., pandas.read_parquet(columns=[...])). I think the order should be preserved. If usecols is converted to a set, I think it should instead be converted to OrderedSet or keys of collections.OrderedDict (or just dict in Python >3.6).

Feature Description

import pandas as pd

# Example CSV file (replace with your actual file)
csv_data = """
col1,col2,col3,col4
A,1,X,10
B,2,Y,20
C,3,Z,30
"""

with open("example.csv", "w") as f:
    f.write(csv_data)

# Desired column order
desired_order = ['col3', 'col1', 'col4']

# Read CSV with usecols (selects columns but doesn't order)
df = pd.read_csv("example.csv", usecols=desired_order)

print(df)  # incorrect column order

# Reindex DataFrame to enforce desired order (a popular workaround that I think shouldn't be required)
# One solution is to include this line in `read_csv`, when using `usecols` kwarg
df = df[desired_order]

print(df)  # correct column order

Alternative Solutions

Instead of converting usecols to set, convert it to dict.keys() which preserved order in Python >3.6

Additional Context

No response

Metadata

Metadata

Assignees

Labels

EnhancementNeeds TriageIssue that has not been reviewed by a pandas team memberIssue that has not been reviewed by a pandas team member

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions

    Morty Proxy This is a proxified and sanitized view of the page, visit original site.