MAINT Introduce FastEuclideanPairwiseArgKmin #22065


jjerphan
Member

@jjerphan jjerphan commented Dec 23, 2021

Reference Issues/PRs

Part of splitting #21462 (comment)

⚠ This targets the upstream:pairwise-distances-argkmin feature branch not upstream:main.

What does this implement/fix? Explain your changes.

This introduces FastEuclideanPairwiseDistancesArgKmin, a specialization of PairwiseDistancesArgkmin for the Euclidean and squared Euclidean distances that uses the GEMM trick.
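
For readers unfamiliar with the GEMM trick, here is a minimal NumPy sketch of the idea. It is not the Cython implementation from this PR. It relies on the identity ||x - y||^2 = ||x||^2 - 2<x, y> + ||y||^2, which lets the cross term for all pairs be computed by a single matrix multiplication. The function name `argkmin_euclidean_gemm` and its signature are hypothetical, chosen only for illustration.

```python
import numpy as np


def argkmin_euclidean_gemm(X, Y, k):
    """Hypothetical helper: for each row of X, return the indices of and
    Euclidean distances to its k nearest rows of Y."""
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)

    # Squared norms, computed once per row.
    X_sq_norms = np.einsum("ij,ij->i", X, X)
    Y_sq_norms = np.einsum("ij,ij->i", Y, Y)

    # ||x - y||^2 = ||x||^2 - 2 <x, y> + ||y||^2.
    # The middle term for all pairs is one matrix product (the GEMM).
    sq_dists = X_sq_norms[:, None] - 2.0 * (X @ Y.T) + Y_sq_norms[None, :]

    # Rounding errors can make very small distances slightly negative.
    np.maximum(sq_dists, 0.0, out=sq_dists)

    # Keep the k smallest squared distances per row, in ascending order.
    indices = np.argsort(sq_dists, axis=1, kind="stable")[:, :k]
    distances = np.sqrt(np.take_along_axis(sq_dists, indices, axis=1))
    return indices, distances
```

The sketch materializes the full distance matrix, which is only reasonable for small inputs. The expansion is also less numerically accurate than computing the differences directly when points are very close to each other, which is why the clamping step is there.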

Any other comments

I will rebase and force-push whenever #22064 is updated.

@thomasjpfan
Member

thomasjpfan commented Dec 23, 2021

Marking this as a draft for now until #22064 is merged. (I also updated the opening message to signal that #22064 needs to be reviewed first)

This reverts the main changes made by 09a9527
to make the initialization in __cinit__ instead
of in __init__ because it's easier this way.

If there's a way to maintain the initialization
in __cinit__, let's do it.
@jjerphan jjerphan force-pushed the pairwise-distances-argkmin-fasteuclidean branch from e897695 to b700faa Compare January 6, 2022 13:04
jjerphan and others added 4 commits January 7, 2022 18:09
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@jjerphan jjerphan marked this pull request as ready for review January 10, 2022 14:10
Member

@ogrisel ogrisel left a comment


LGTM!

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@ogrisel
Member

ogrisel commented Jan 12, 2022

I think we can ignore the code coverage report. It seems inconsistent and it's probably because this sub-PR does not target main.

@ogrisel ogrisel requested a review from thomasjpfan January 12, 2022 13:17
@ogrisel
Member

ogrisel commented Jan 13, 2022

@thomasjpfan do you have other things in mind w.r.t. this PR?

Member

@thomasjpfan thomasjpfan left a comment


Minor comments, otherwise LGTM

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@ogrisel ogrisel merged commit 60dcd13 into scikit-learn:pairwise-distances-argkmin Jan 13, 2022
@ogrisel
Member

ogrisel commented Jan 13, 2022

Merged!

@jjerphan jjerphan deleted the pairwise-distances-argkmin-fasteuclidean branch January 13, 2022 16:00
lorentzenchr added a commit that referenced this pull request Feb 17, 2022
…min` (feature branch) (#22134)

* MAINT Introduce Pairwise Distances Reductions private submodule  (#22064)

* MAINT Introduce FastEuclideanPairwiseArgKmin  (#22065)

* fixup! Merge branch 'main' into pairwise-distances-argkmin

Remove duplicated Bunch

* MAINT Plug `PairwiseDistancesArgKmin` as a back-end (#22288)

* Forward pairwise_dist_chunk_size in the configuration

* Flip finalized results for PairwiseDistancesArgKmin

The previous order would have made the code more complex
by introducing some boilerplate for the interface plugs.

Having it this way actually simplifies the code.

This also removes the haversine branch for
test_pairwise_distances_argkmin

* Plug PairwiseDistancesArgKmin as a back-end

* Adapt test accordingly

* Add whats_new entry

* Change input validation order for kneighbors

* Remove duplicated test_neighbors_distance_metric_deprecation

* Adapt the documentation

* Add mahalanobis case to test fixtures

* Correct whats_new entry

* CLN Remove unneeded private metric attribute

This was needed when 'fast_sqeuclidean' and 'fast_euclidean'
were present to choose the best implementation based on the user
specification.

Those metrics have been removed since then, making this attribute
useless.

* TST Assert FutureWarning instead of DeprecationWarning in
test_neighbors_metrics

* MAINT Add use_pairwise_dist_activate to scikit-learn config

* TST Add a test for the 'brute' backends' results' consistency

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

* fixup! MAINT Add use_pairwise_dist_activate to scikit-learn config

* fixup! fixup! MAINT Add use_pairwise_dist_activate to scikit-learn config

* TST Filter FutureWarning for WMinkowskiDistance

* MAINT pin numpydoc in arm for now (#22292)

* fixup! TST Filter FutureWarning for WMinkowskiDistance

* Revert keywords arguments removal for the GEMM trick for 'euclidean'

* MAINT pin max numpydoc for now (#22286)

* Add 'haversine' to CDIST_PAIRWISE_DISTANCES_REDUCTION_COMMON_METRICS

* fixup! Add 'haversine' to CDIST_PAIRWISE_DISTANCES_REDUCTION_COMMON_METRICS

* Apply suggestions from code review

* MAINT Document some config parameters for maintenance

Also rename one of them.

* FIX Support and test one of 'sqeuclidean' specification

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* FIX Various typos fix and correct haversine

'haversine' is not supported by cdist.

* Directly use get_config

* CLN Apply comments from review

* Motivate swapped returned values

* TST Remove mahalanobis from test fixtures

* MNT Add comment regarding reduction functions' signatures

* TST Complete test for `pairwise_distance_{argmin,argmin_min}` (#22371)

* DOC Add sub-pull requests to the whats_new entry

* DOC place comment inside functions

* DOC move up whatsnew entry

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Jérémie du Boisberranger <jeremiedbb@users.noreply.github.com>
thomasjpfan added a commit to thomasjpfan/scikit-learn that referenced this pull request Mar 1, 2022
3 participants