Parallelize init_bound_dense in Elkan algorithm #19052
Thanks @YusukeNagasaka. Looks good. It would make sense to also make the change in |
LGTM. Thanks for the improvement.
This gives a huge performance improvement on the A64FX CPU, though not much improvement on Intel CPUs. @YusukeNagasaka will change |
The same parallelization is applied to The schedule in prange I adopted is |
lgtm
Merged! Thank you very much @YusukeNagasaka! |
What does this implement/fix?
Parallelize the init_bound_dense function, which is the initialization step of the Elkan algorithm.
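For readers unfamiliar with this step, here is a minimal pure-Python sketch of what bound initialization in Elkan's k-means computes and why it parallelizes over samples. It is not the Cython code touched by this PR: the names `init_bounds_for_sample`, `init_bounds` and `n_threads` are hypothetical, and a thread pool stands in for `prange`. Because of the GIL, the threaded version illustrates only that samples are independent; it is not expected to show the speedup that a nogil `prange` loop can give.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def init_bounds_for_sample(x, centers, center_half_distances):
    """Return (label, upper_bound, lower_bounds) for a single sample x."""
    n_clusters = len(centers)
    lower_bounds = [0.0] * n_clusters
    best_cluster = 0
    min_dist = math.dist(x, centers[0])
    lower_bounds[0] = min_dist
    for j in range(1, n_clusters):
        # Triangle inequality: if d(x, c_best) <= d(c_best, c_j) / 2,
        # center j cannot be closer than c_best, so the distance is skipped.
        if min_dist > center_half_distances[best_cluster][j]:
            dist = math.dist(x, centers[j])
            lower_bounds[j] = dist
            if dist < min_dist:
                min_dist = dist
                best_cluster = j
    return best_cluster, min_dist, lower_bounds


def init_bounds(X, centers, n_threads=1):
    """Initialize labels, upper bounds and lower bounds for all samples."""
    half = [[0.5 * math.dist(a, b) for b in centers] for a in centers]

    def work(x):
        # Each sample reads shared data and writes only its own result,
        # so iterations can run in any order or concurrently.
        return init_bounds_for_sample(x, centers, half)

    if n_threads == 1:
        results = [work(x) for x in X]
    else:
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            results = list(pool.map(work, X))  # map preserves input order

    labels = [r[0] for r in results]
    upper_bounds = [r[1] for r in results]
    lower_bounds = [r[2] for r in results]
    return labels, upper_bounds, lower_bounds


X = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.2, 4.9]]
centers = [[0.0, 0.0], [5.0, 5.0]]
serial = init_bounds(X, centers, n_threads=1)
threaded = init_bounds(X, centers, n_threads=4)
print(serial[0])           # labels of the four samples
print(serial == threaded)  # same result regardless of thread count
```

Each iteration depends only on its own sample and on read-only shared arrays, so the serial and threaded runs return identical labels and bounds.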
Any other comments?
This change does not affect the clustering result itself. The parallelization only reduces the execution time of KMeans.fit on multi-core CPUs, and is even more beneficial on many-core CPUs.
For instance, I tested on a machine with two sockets of Intel Xeon CPUs (40 cores in total). The table below shows the time in seconds spent in KMeans.fit and in init_bound_dense. The data is generated uniformly at random with n_samples=1M, n_features=100. The KMeans parameters are n_clusters=1000, init='random', algorithm='elkan', max_iter=1000.