-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
ENH: add matvec and vecmat gufuncs #25675
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3647a3c
to
a7ac211
Compare
a7ac211
to
5e5c51a
Compare
Refactored to avoid duplication of code: |
5e5c51a
to
5b5d64f
Compare
@charris - since you were the one originally asking for it, what do you think of this implementation? It is now quite lightweight, since it reuses inner loops. |
07487e7
to
09d4504
Compare
macos failure unrelated |
Moving milestone to 2.1, if we get this in we should change it again, but at least I don't think it needs to be prioritized. |
Yes, not critical for 2.0 for sure, but maybe more logical to clear the milestone rather than set it to 2.1 if there still is a chance. This PR has become fairly simple after my refactor. |
Out of curiousity, what is the status of this PR? |
09d4504
to
c373c4a
Compare
I think it would still be nice to get this in! I just rebased to make sure things still pass. @charris - as the person who originally suggested to have |
1b6ca20
to
a1e66f3
Compare
The circle-ci failure does not seem related:
EDIT: also seen in #27091 so indeed unrelated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I am still happy with the addition. A couple of doc comments that would be nice to address. I think the main point is actually that it would be good to expand matmul
docs to explain that these exist and fill a gap for stacks of vectors.
[ 0., 0., 1.], | ||
[ 6., 0., 8.]]) | ||
|
||
.. versionadded:: 2.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to be updated, and I think we should move it to the end of the intro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, updated. I agree the location is not very logical, but kept there anyway, since matmul
and vecdot
have it there too.
Matrix-vector dot product of two arrays. | ||
|
||
Let :math:`\\mathbf{A}` be a maxtrix in ``x1`` and :math:`\\mathbf{v}` be | ||
a vector in ``x2``. The matrix-vector product is defined as: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good, but if we have an idea, I don't think "a matrix in x1
" is very clear about referring to stacks of matrices/vectors.
In fact it would be nice to:
- Add
matvec
andvecmat
tomatmul
"See Also" - Probably also to its notes on stacks (IIRC we have those).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I tried to reword (will push up in a bit).
Rotate a set of vectors from Y to X along Z. | ||
|
||
>>> a = np.array([[0., 1., 0.], [-1., 0., 0.], [0., 0., 1.]]) | ||
>>> v = np.array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [0., 6., 8.]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can name these Y_vectors
, name a rotation_matrix
, and format all as with the rows below one another (like the result).
Could also call it X_vectors = ...
and then print that repr to document it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reformatted the arrays, but kept the names (since A and v are used in the description).
a1e66f3
to
16023ac
Compare
One thing that occurred to me, it might also make sense to put these into |
I think the main namespace is good enough, for now at least. It might change if the array API adopts them. |
16023ac
to
27ae492
Compare
27ae492
to
b09440e
Compare
@seberg - shall we get this in? I fixed the error I had made in the change-log entry so now also circle-ci passes. |
I am happy to get it in, someone should maybe confirm they like it as well. Am travelling right now, so may just punt for another while before doing another pass and merging, sorry. |
Did you compare performance of calls to matmul and vecmat/matvec? I would be very surprised if they were different. |
There should be no performance difference - it uses the same code. The advantage of ping also @charris who originally suggested we have these functions... |
We discussed this at a recent triage meeting and would like to put it in. It needs a final review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked through, and the only thing that stood out a bit is that vecmat and matvec may have subtly different behavior when it comes to accuracy for float32.
Overall, still happy to put this in.
numpy/_core/src/umath/matmul.c.src
Outdated
npy_intp vecdot_steps[5] = {0, is2_m, os_m, is1_n, is2_n}; | ||
for (npy_intp i = 0; i < n_outer; i++, args[0] += s0, args[1] += s1, args[2] += s2) { | ||
char *vecdot_args[3] = {args[0], args[1], args[2]}; | ||
@TYPE@_vecdot(vecdot_args, vecdot_dims, vecdot_steps, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I have two comments, that I think I can glance over if we want this in the 2.2 release, but I also think should get an issue for follow up:
- The
vecmat
code uses our vector multiplication function. This means, it ensures a float64 accumulation (at least partially), and correctly doesn't need to fall back to our implementation for huge arrays. (but I think/hope that 64bit blas is starting to be common). - We probably should be able to remove the outer loop here using
C
transpose parameter to some blas functions.
@seberg - I hadn't thought about implementation differences for float32 at all... I'm fairly sure I initially had I do think it is better if the two functions behave the same way. My tendency would be to use the better precision sums for both, which would mean repeating the complex I'll look a bit at whether one can do this fairly easily, but let me know if you have suggestions... |
b09440e
to
89ffb69
Compare
@seberg - after spending most of the day on this (grr... still do not quite get the blas dimension stuff - too much switching between fortran and C...), I ended up with a bit of a frankencode: now the blas matrix-vector What do you think? |
Sorry, I didn't mean to push you into a rabbit hole :/, just realized that there likely is a difference due to the discussion about sum and float32 and was wondering if we should worry. I suspect your frankencode is fine (and the original would probably also have been fine). @mattip should we just go ahead with this? Pushed a commit to fix the EDIT: Of course feel free to force push my commits away. |
3825a86
to
981dd6e
Compare
Thanks for pushing the fixes - I did squash them together so it is one commit again. Given that |
Yeah, I think we can just put this in for 2.2. @charris wanted to branch this weekend. I pinged the mailing list; but if anyone shouts there (or here), we can always do a backport (it'll be a few weeks before the actual release anyway). |
With a fresh look, I now can get |
Internally, they mostly just call the relevant blas, or vecdot routines.
981dd6e
to
8cec646
Compare
OK, now did the gemm for vecmat complex - should be ready! |
Thanks Marten. |
This PR adds new
matvec
andvecmat
gufunc to calculate the matrix-vector and vector-matrix product, to add to plain matrix multiplication withmatmul
and the inner vector product withvecdot
.Fixes #12348
Note that for complex numbers,
vecmat
is defined asx†A
, i.e., the complex conjugate of the vector is taken. This seems to be the standard and is what we used forvecdot
too (x†x
). However, it is not whatmatmul
does for vector-matrix or indeed vector-vector products. I think this is a bug in matmul, which I'm happy to fix here. But I'll post to the mailing list to get feedback.Separately, with these functions available, in principle these could be used in
__matmul__
and the specializations innp.matmul
removed. But that can be a separate PR (if it is wanted at all).Summary of mailing list discussion
np.matmul
for vector-matrix should not be adjusted. Some surprise that "matrix multiplcation" deals with vectors at all.@
should do.matvec
andvecmat
functions.