Fix min_pos when all negative + speed up #19328
Conversation
Minor comment, otherwise LGTM
sklearn/utils/arrayfuncs.pyx (outdated)
""" | ||
Find the minimum value of an array over positive values | ||
|
||
Returns a huge value if none of the values are positive |
I see you are copying this over from before, but this can be slightly improved:
- Returns a huge value if none of the values are positive
+ Returns maximum representable value of the input dtype if none of the values are positive
Agreed, done.
Just realized the previous indentation was 3 spaces 0o :)
Looks good thanks!
When all input elements are <= 0, min_pos returns the max float (resp. double) if the input is float (resp. double). However, the max values are switched, so it's actually returning the max double for float input and the max float for double input. It's unlikely that this has already caused an error in practice, but it would have been a sneaky bug :)
I added a test that fails on main. I also added a generic test since the function was not tested at all.
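For illustration, here is a minimal sketch (not the PR's actual test) of the intended behavior after the fix, assuming min_pos is importable from sklearn.utils.arrayfuncs as in current scikit-learn:

```python
import numpy as np
from sklearn.utils.arrayfuncs import min_pos

for dtype in (np.float32, np.float64):
    X = np.full(10, -1.0, dtype=dtype)  # no positive values at all
    # The sentinel should be the max of the *input* dtype:
    # FLT_MAX for float32 input and DBL_MAX for float64 input,
    # not the other way around.
    assert min_pos(X) == np.finfo(dtype).max
```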
I took the opportunity to fused-type the function, so a single Cython implementation now covers both float32 and float64.
I also noticed that accessing X.dtype.name is not free, whereas X.dtype is. When the inputs are rather small arrays, for instance in dict_learning where they have shape ~(n_components,), the computation can be completely dominated by the dtype name lookup.
Turns out it was taking a significant proportion of the time spent in the sparse coding.
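As a rough illustration (not the benchmark from this PR), one could time the two attribute accesses directly; the exact numbers will of course depend on the machine and NumPy version:

```python
import timeit

import numpy as np

X = np.zeros(100, dtype=np.float32)  # small array, as in dict_learning

# Comparing the dtype object directly avoids building the name string.
t_dtype = timeit.timeit(lambda: X.dtype == np.float32, number=100_000)
t_name = timeit.timeit(lambda: X.dtype.name == "float32", number=100_000)

print(f"X.dtype comparison:      {t_dtype:.3f}s")
print(f"X.dtype.name comparison: {t_name:.3f}s")
```

With the fused-typed version, this per-call string lookup is no longer needed on the hot path.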