Closed
Description
Describe the bug
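KMeans does not appear to take sample_weight into account: fitting with strongly skewed weights returns the same cluster centers as fitting without weights.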
Steps/Code to Reproduce
import numpy as np
from sklearn.cluster import KMeans
x = np.array([1, 1, 5, 5, 100, 100])
# large weights for 1 and 5, small weights for 100
w = 10 ** np.array([8., 8, 8, 8, -8, -8])
x = x.reshape(-1, 1)  # KMeans requires a 2-dimensional array
centers_with_weight = KMeans(n_clusters=2, random_state=0, n_init=10).fit(x, sample_weight=w).cluster_centers_
centers_no_weight = KMeans(n_clusters=2, random_state=0, n_init=10).fit(x).cluster_centers_
Expected Results
centers_with_weight=[[1.],[5.]]
centers_no_weight=[[100.],[3.]]
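One way to read the expected weighted centers: if a centroid is the weight-averaged position of its members, the heavily weighted samples dominate, so a cluster that also absorbs the two samples at 100 would still sit at roughly 5. The pure-Python sketch below only illustrates that arithmetic; the helper name weighted_centroid is hypothetical and is not a scikit-learn function.

```python
# Samples and weights from the report, written out as plain floats.
x = [1.0, 1.0, 5.0, 5.0, 100.0, 100.0]
w = [1e8, 1e8, 1e8, 1e8, 1e-8, 1e-8]

def weighted_centroid(points, weights):
    """Weight-averaged position of a set of 1-D points (hypothetical helper)."""
    return sum(p * wt for p, wt in zip(points, weights)) / sum(weights)

# Cluster made of the two 5s plus the two low-weight 100s.
members = [2, 3, 4, 5]
center_weighted = weighted_centroid([x[i] for i in members], [w[i] for i in members])

# Same cluster with every sample weighted equally.
center_unweighted = weighted_centroid([x[i] for i in members], [1.0] * len(members))

print(center_weighted, center_unweighted)
```

With equal weights the same four samples average to 52.5, which is why the weighted and unweighted fits were expected to differ.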
Actual Results
centers_with_weight=[[100.],[3.]]
centers_no_weight=[[100.],[3.]]
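To compare the two candidate solutions directly, the sketch below scores each set of centers by a weighted sum of squared distances to the nearest center. It assumes that this is the quantity the weights are meant to influence; weighted_inertia is a hypothetical helper written for this illustration, not part of scikit-learn.

```python
# Samples and weights from the report, written out as plain floats.
x = [1.0, 1.0, 5.0, 5.0, 100.0, 100.0]
w = [1e8, 1e8, 1e8, 1e8, 1e-8, 1e-8]

def weighted_inertia(points, weights, centers):
    """Sum over samples of weight * squared distance to the nearest center."""
    return sum(
        wt * min((p - c) ** 2 for c in centers)
        for p, wt in zip(points, weights)
    )

inertia_expected = weighted_inertia(x, w, [1.0, 5.0])   # centers from "Expected Results"
inertia_actual = weighted_inertia(x, w, [100.0, 3.0])   # centers from "Actual Results"

print(inertia_expected, inertia_actual)
```

Under that assumption the reported centers score 1.6e9 while the expected centers score about 1.8e-4, so the reported fit looks like a poor local optimum of the weighted objective rather than its minimum.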
Versions
System:
python: 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)]
executable: E:\WPy64-31040\python-3.10.4.amd64\python.exe
machine: Windows-10-10.0.19045-SP0
Python dependencies:
sklearn: 1.2.1
pip: 22.3.1
setuptools: 62.1.0
numpy: 1.23.3
scipy: 1.8.1
Cython: 0.29.28
pandas: 1.4.2
matplotlib: 3.5.1
joblib: 1.2.0
threadpoolctl: 3.1.0
Built with OpenMP: True
threadpoolctl info:
user_api: blas
internal_api: openblas
prefix: libopenblas
filepath: E:\WPy64-31040\python-3.10.4.amd64\Lib\site-packages\numpy\.libs\libopenblas.FB5AE2TYXYH2IJRDKGDGQ3XBKLKTF43H.gfortran-win_amd64.dll
version: 0.3.20
threading_layer: pthreads
architecture: Haswell
num_threads: 12
user_api: blas
internal_api: openblas
prefix: libopenblas
filepath: E:\WPy64-31040\python-3.10.4.amd64\Lib\site-packages\scipy\.libs\libopenblas.XWYDX2IKJW2NMTWSFYNGFUWKQU3LYTCZ.gfortran-win_amd64.dll
version: 0.3.17
threading_layer: pthreads
architecture: Haswell
num_threads: 12
user_api: openmp
internal_api: openmp
prefix: vcomp
filepath: E:\WPy64-31040\python-3.10.4.amd64\Lib\site-packages\sklearn\.libs\vcomp140.dll
version: None
num_threads: 12