Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

ENH/DEP: Use a ufunc under the hood for ndarray.clip #12519

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 15, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions 25 doc/release/1.17.0-notes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -319,6 +319,31 @@ was accidental. The old behavior can be retained with
``structured_to_unstructured(arr[['a']]).squeeze(axis=-1)`` or far more simply,
``arr['a']``.

``clip`` now uses a ufunc under the hood
----------------------------------------
This means that registering clip functions for custom dtypes in C via
`descr->f->fastclip` is deprecated - they should use the ufunc registration
mechanism instead, attaching to the ``np.core.umath.clip`` ufunc.

It also means that ``clip`` accepts ``where`` and ``casting`` arguments,
and can be override with ``__array_ufunc__``.

A consequence of this change is that some behaviors of the old ``clip`` have
been deprecated:

* Passing ``nan`` to mean "do not clip" as one or both bounds. This didn't work
in all cases anyway, and can be better handled by passing infinities of the
appropriate sign.
* Using "unsafe" casting by default when an ``out`` argument is passed. Using
``casting="unsafe"`` explicitly will silence this warning.

Additionally, there are some corner cases with behavior changes:

* Padding ``max < min`` has changed to be more consistent across dtypes, but
should not be relied upon.
* Scalar ``min`` and ``max`` take part in promotion rules like they do in all
other ufuncs.

``__array_interface__`` offset now works as documented
------------------------------------------------------
The interface may use an ``offset`` value that was mistakenly ignored.
Expand Down
2 changes: 1 addition & 1 deletion 2 numpy/core/_add_newdocs.py
Original file line number Diff line number Diff line change
Expand Up @@ -2771,7 +2771,7 @@

add_newdoc('numpy.core.multiarray', 'ndarray', ('clip',
"""
a.clip(min=None, max=None, out=None)
a.clip(min=None, max=None, out=None, **kwargs)

Return an array whose values are limited to ``[min, max]``.
One of max or min must be given.
Expand Down
75 changes: 75 additions & 0 deletions 75 numpy/core/_methods.py
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@
from numpy.core import umath as um
from numpy.core._asarray import asanyarray
from numpy.core import numerictypes as nt
from numpy.core import _exceptions
from numpy._globals import _NoValue

# save those O(100) nanoseconds!
Expand Down Expand Up @@ -55,6 +56,80 @@ def _count_reduce_items(arr, axis):
items *= arr.shape[ax]
return items

# Numpy 1.17.0, 2019-02-24
# Various clip behavior deprecations, marked with _clip_dep as a prefix.

def _clip_dep_is_scalar_nan(a):
# guarded to protect circular imports
from numpy.core.fromnumeric import ndim
if ndim(a) != 0:
return False
try:
return um.isnan(a)
except TypeError:
return False

def _clip_dep_is_byte_swapped(a):
if isinstance(a, mu.ndarray):
return not a.dtype.isnative
return False

def _clip_dep_invoke_with_casting(ufunc, *args, out=None, casting=None, **kwargs):
# normal path
if casting is not None:
return ufunc(*args, out=out, casting=casting, **kwargs)

# try to deal with broken casting rules
try:
return ufunc(*args, out=out, **kwargs)
except _exceptions._UFuncOutputCastingError as e:
# Numpy 1.17.0, 2019-02-24
warnings.warn(
"Converting the output of clip from {!r} to {!r} is deprecated. "
"Pass `casting=\"unsafe\"` explicitly to silence this warning, or "
"correct the type of the variables.".format(e.from_, e.to),
DeprecationWarning,
stacklevel=2
)
return ufunc(*args, out=out, casting="unsafe", **kwargs)

def _clip(a, min=None, max=None, out=None, *, casting=None, **kwargs):
if min is None and max is None:
raise ValueError("One of max or min must be given")

# Numpy 1.17.0, 2019-02-24
# This deprecation probably incurs a substantial slowdown for small arrays,
# it will be good to get rid of it.
if not _clip_dep_is_byte_swapped(a) and not _clip_dep_is_byte_swapped(out):
using_deprecated_nan = False
if _clip_dep_is_scalar_nan(min):
min = -float('inf')
using_deprecated_nan = True
if _clip_dep_is_scalar_nan(max):
max = float('inf')
using_deprecated_nan = True
if using_deprecated_nan:
warnings.warn(
"Passing `np.nan` to mean no clipping in np.clip has always "
"been unreliable, and is now deprecated. "
"In future, this will always return nan, like it already does "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The infinity substitutions above on scalar np.nan clip limits would seem to preclude the npy_isnan() checks in clip.c.src MIN and MAX preprocessor directives aimed at propagating nans.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct - but these substitutions only occur in the scalar case, not the array case. Unfortunately this was explicitly tested-for behavior, so I felt I had to preserve it.

"when min or max are arrays that contain nan. "
"To skip a bound, pass either None or an np.inf of an "
"appropriate sign.",
DeprecationWarning,
stacklevel=2
)

if min is None:
return _clip_dep_invoke_with_casting(
um.minimum, a, max, out=out, casting=casting, **kwargs)
elif max is None:
return _clip_dep_invoke_with_casting(
um.maximum, a, min, out=out, casting=casting, **kwargs)
else:
return _clip_dep_invoke_with_casting(
um.clip, a, min, max, out=out, casting=casting, **kwargs)

def _mean(a, axis=None, dtype=None, out=None, keepdims=False):
arr = asanyarray(a)

Expand Down
11 changes: 11 additions & 0 deletions 11 numpy/core/code_generators/generate_umath.py
Original file line number Diff line number Diff line change
Expand Up @@ -511,6 +511,13 @@ def english_upper(s):
TD(noobj),
TD(O, f='npy_ObjectMin')
),
'clip':
Ufunc(3, 1, ReorderableNone,
docstrings.get('numpy.core.umath.clip'),
'PyUFunc_SimpleUniformOperationTypeResolver',
TD(noobj),
[TypeDescription('O', 'npy_ObjectClip', 'OOO', 'O')]
),
'fmax':
Ufunc(2, 1, ReorderableNone,
docstrings.get('numpy.core.umath.fmax'),
Expand Down Expand Up @@ -963,6 +970,9 @@ def indent(st, spaces):
'O': 'OO_O',
'P': 'OO_O_method',
},
(3, 1): {
'O': 'OOO_O',
}
}

#for each name
Expand Down Expand Up @@ -1139,6 +1149,7 @@ def make_code(funcdict, filename):
#include "ufunc_type_resolution.h"
#include "loops.h"
#include "matmul.h"
#include "clip.h"
%s

static int
Expand Down
50 changes: 48 additions & 2 deletions 50 numpy/core/code_generators/ufunc_docstrings.py
Original file line number Diff line number Diff line change
Expand Up @@ -42,14 +42,20 @@ def get(name):
def add_newdoc(place, name, doc):
doc = textwrap.dedent(doc).strip()

if name[0] != '_' and name != 'matmul':
# matmul is special, it does not use the OUT_SCALAR replacement strings
skip = (
# gufuncs do not use the OUT_SCALAR replacement strings
'matmul',
# clip has 3 inputs, which is not handled by this
'clip',
)
if name[0] != '_' and name not in skip:
if '\nx :' in doc:
assert '$OUT_SCALAR_1' in doc, "in {}".format(name)
elif '\nx2 :' in doc or '\nx1, x2 :' in doc:
assert '$OUT_SCALAR_2' in doc, "in {}".format(name)
else:
assert False, "Could not detect number of inputs in {}".format(name)

for k, v in subst.items():
doc = doc.replace('$' + k, v)

Expand Down Expand Up @@ -2535,6 +2541,46 @@ def add_newdoc(place, name, doc):

""")

add_newdoc('numpy.core.umath', 'clip',
"""
Clip (limit) the values in an array.

Given an interval, values outside the interval are clipped to
the interval edges. For example, if an interval of ``[0, 1]``
is specified, values smaller than 0 become 0, and values larger
than 1 become 1.

Equivalent to but faster than ``np.minimum(np.maximum(a, a_min), a_max)``.

Parameters
----------
a : array_like
Array containing elements to clip.
a_min : array_like
Minimum value.
a_max : array_like
Maximum value.
out : ndarray, optional
The results will be placed in this array. It may be the input
array for in-place clipping. `out` must be of the right shape
to hold the output. Its type is preserved.
$PARAMS

See Also
--------
numpy.clip :
Wrapper that makes the ``a_min`` and ``a_max`` arguments optional,
dispatching to one of `~numpy.core.umath.clip`,
`~numpy.core.umath.minimum`, and `~numpy.core.umath.maximum`.

Returns
-------
clipped_array : ndarray
An array with the elements of `a`, but where values
< `a_min` are replaced with `a_min`, and those > `a_max`
with `a_max`.
""")

add_newdoc('numpy.core.umath', 'matmul',
"""
Matrix product of two arrays.
Expand Down
14 changes: 11 additions & 3 deletions 14 numpy/core/fromnumeric.py
Original file line number Diff line number Diff line change
Expand Up @@ -1961,12 +1961,12 @@ def compress(condition, a, axis=None, out=None):
return _wrapfunc(a, 'compress', condition, axis=axis, out=out)


def _clip_dispatcher(a, a_min, a_max, out=None):
def _clip_dispatcher(a, a_min, a_max, out=None, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that by becoming a ufunc, we should remove the __array_function__ override. I feel the NEP was clear enough that no deprecation is necessary, but cc @shoyer just in case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - what's left of clip is just a ufunc selector based on argument presence, there's no reason to allow customization of that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the current promise is that everything directly exposed by NumPy gets its own dispatcher, either via __array_function__ or __array_ufunc__. This PR means that np.clip now uses one of three ufuncs, one of which (um.clip) isn't even publicly exposed. So I don't think we can get rid of this dispatcher yet.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider np.core.umath.clip sufficient exposure? It can never replace np.clip in the top namespace and IMO there's no clearer name available

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd think that np.core.umath is sufficient exposure (indeed, in astropy we check our ufunc coverage by enumerating what is in there). The larger question might be whether this multi-dispatch would still need an override. To me, that seems a bit odd, and counter to the idea that __array_function__ is only meant to cover what isn't covered by better matched overrides.

p.s. An analogy here may be ndarra.__pow__, which gives different ufuncs depending on the argument (though of course subclasses can override that...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest of np.core is not considered public API so I don't think that's enough exposure. Maybe we could put it as an attribute on the function, e.g., np.clip.ufunc?

I'm also reluctant to do this on an ad-hoc basis. I don't think it's worth the complexity of having exactly one special function that is neither a ufunc nor with a direct override via __array_function__. We don't need to change this all at once but at least we should have a rule for going forward that clearly applies to more than only a few functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, perhaps not that simple. Maybe for now best to finish this PR without removing the __array_function__ override?

return (a, a_min, a_max)


@array_function_dispatch(_clip_dispatcher)
def clip(a, a_min, a_max, out=None):
def clip(a, a_min, a_max, out=None, **kwargs):
"""
Clip (limit) the values in an array.

Expand All @@ -1975,6 +1975,9 @@ def clip(a, a_min, a_max, out=None):
is specified, values smaller than 0 become 0, and values larger
than 1 become 1.

Equivalent to but faster than ``np.maximum(a_min, np.minimum(a, a_max))``.
No check is performed to ensure ``a_min < a_max``.

Parameters
----------
a : array_like
Expand All @@ -1992,6 +1995,11 @@ def clip(a, a_min, a_max, out=None):
The results will be placed in this array. It may be the input
array for in-place clipping. `out` must be of the right shape
to hold the output. Its type is preserved.
**kwargs
For other keyword-only arguments, see the
:ref:`ufunc docs <ufuncs.kwargs>`.

.. versionadded:: 1.17.0

Returns
-------
Expand Down Expand Up @@ -2020,7 +2028,7 @@ def clip(a, a_min, a_max, out=None):
array([3, 4, 2, 3, 4, 5, 6, 7, 8, 8])

"""
return _wrapfunc(a, 'clip', a_min, a_max, out=out)
return _wrapfunc(a, 'clip', a_min, a_max, out=out, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to similarly suggest we remove _wrapfunc, but that would break people defining their own .clip method.

I guess the question then becomes whether having such a .clip method should start raising a FutureWarning - maybe good to at least have an issue about this.



def _sum_dispatcher(a, axis=None, dtype=None, out=None, keepdims=None,
Expand Down
2 changes: 2 additions & 0 deletions 2 numpy/core/setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -906,6 +906,8 @@ def generate_umath_c(ext, build_dir):
join('src', 'umath', 'loops.c.src'),
join('src', 'umath', 'matmul.h.src'),
join('src', 'umath', 'matmul.c.src'),
join('src', 'umath', 'clip.h.src'),
join('src', 'umath', 'clip.c.src'),
join('src', 'umath', 'ufunc_object.c'),
join('src', 'umath', 'extobj.c'),
join('src', 'umath', 'cpuid.c'),
Expand Down
3 changes: 3 additions & 0 deletions 3 numpy/core/src/common/lowlevel_strided_loops.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,9 @@
#include <npy_config.h>
#include "mem_overlap.h"

/* For PyArray_ macros used below */
#include "numpy/ndarrayobject.h"

/*
* NOTE: This API should remain private for the time being, to allow
* for further refinement. I think the 'aligned' mechanism
Expand Down
Loading
Morty Proxy This is a proxified and sanitized view of the page, visit original site.