Description
Describe the bug
Here's something I noticed while looking into #31127.
The test
pytest sklearn/utils/tests/test_indexing.py::test_safe_indexing_pandas_no_settingwithcopy_warning
checks that a copy is produced, and that no SettingWithCopyWarning
is raised.
Indeed, no warning is raised in that test, but why is using _safe_indexing
with a slice allowed to not make a copy? Is this intentional?
Based on the responses, I can suggest what to do instead in #31127.
(I am a little surprised that the non-slice case always makes copies, given that a lot of the discussion in #28341 centered around wanting to avoid copies.)
Steps/Code to Reproduce
import numpy as np
from sklearn.utils import _safe_indexing
import pandas as pd
X = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})
subset = _safe_indexing(X, slice(0, 2), axis=0)
subset.iloc[0, 0] = 10
Expected Results
No SettingWithCopyWarning
Actual Results
/home/marcogorelli/scikit-learn-dev/t.py:13: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
subset.iloc[0, 0] = 10
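For comparison, here is a rough pandas-only sketch of the difference between a positional slice and an explicit copy. It assumes that _safe_indexing with a slice ends up doing something like X.iloc[key] on a DataFrame (I have not confirmed this is the exact code path), and warnings_on_write is a hypothetical helper made up for this illustration.

```python
import warnings

import pandas as pd


def warnings_on_write(frame):
    """Write to frame.iloc[0, 0] and return the names of any warnings emitted.

    Hypothetical helper, only for this illustration.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        frame.iloc[0, 0] = 10
    return [type(w.message).__name__ for w in caught]


X = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})

# Positional slice: assumed to resemble what _safe_indexing does for a slice key.
sliced = X.iloc[slice(0, 2)]
# Same slice followed by an explicit copy, so the result is independent of X.
copied = X.iloc[slice(0, 2)].copy()

sliced_warnings = warnings_on_write(sliced)
copied_warnings = warnings_on_write(copied)

print("slice:", sliced_warnings)
print("copy: ", copied_warnings)
```

With pandas 2.2 and copy-on-write disabled, I would expect the first list to contain SettingWithCopyWarning and the second to be empty. With copy-on-write enabled, both should be empty.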
Versions
System:
python: 3.11.11 (main, Dec 4 2024, 08:55:07) [GCC 11.4.0]
executable: /home/marcogorelli/scikit-learn-dev/.venv/bin/python
machine: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.35
Python dependencies:
sklearn: 1.7.dev0
pip: 24.2
setuptools: None
numpy: 2.1.0
scipy: 1.14.0
Cython: 3.0.11
pandas: 2.2.2
matplotlib: None
joblib: 1.4.2
threadpoolctl: 3.5.0
Built with OpenMP: True
threadpoolctl info:
user_api: blas
internal_api: openblas
num_threads: 16
prefix: libscipy_openblas
filepath: /home/marcogorelli/scikit-learn-dev/.venv/lib/python3.11/site-packages/numpy.libs/libscipy_openblas64_-ff651d7f.so
version: 0.3.27
threading_layer: pthreads
architecture: SkylakeX
user_api: blas
internal_api: openblas
num_threads: 16
prefix: libscipy_openblas
filepath: /home/marcogorelli/scikit-learn-dev/.venv/lib/python3.11/site-packages/scipy.libs/libscipy_openblas-c128ec02.so
version: 0.3.27.dev
threading_layer: pthreads
architecture: SkylakeX
user_api: openmp
internal_api: openmp
num_threads: 16
prefix: libgomp
filepath: /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
version: None