Add iterator to memoryview #19

ghost · Sep 2, 2020

Hi Guido,

Here's the cleaned up version, without the extra commits.

Please let me know what you think.

PS: I think a next step in this module could be to add a new memory_item() function that works exclusively with our new memoryiter_next(). This would avoid a few redundant checks that currently happen when we call memory_item() from memoryiter_next(). (Note: memory_item() serves two other functions, so those checks must stay for when it is not called from memoryiter_next(). They are only duplicated when it is called from memoryiter_next(), since memoryiter_next() already performs them itself.)

gvanrossum

Two nits and a bugfix.

Your next step sounds right -- "hoist" the checks that validate the object each time. I would think you can start by adding members to the memoryiterobject struct to hold the length and the fmt code. After that we can look into inlining ptr_from_index (which is mostly lookup_dimension). I see all kinds of redundant checks there.

In the meantime, let's keep our eye on the ball.

I used this benchmark, two lines of shell code:

./python.exe -m timeit -s "a=memoryview(b'x'*1000000)" "for x in a: pass"
./python.exe -m timeit -s "a=b'x'*1000000" "for x in a: pass"

Right now memoryview is about 11% slower than bytes.
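For anyone who prefers to run the same comparison from inside Python rather than from the shell, here is a minimal sketch using the standard timeit module. The helper name time_iteration and the small repeat counts are assumptions chosen for a quick run; the ratio it prints depends entirely on the build and platform.

```python
import timeit


def time_iteration(obj, number=5, repeat=3):
    """Return the best per-loop time in seconds for `for x in obj: pass`."""
    timer = timeit.Timer("for x in obj: pass", globals={"obj": obj})
    return min(timer.repeat(repeat=repeat, number=number)) / number


data = b"x" * 1_000_000
bytes_time = time_iteration(data)
view_time = time_iteration(memoryview(data))

# No particular ratio is implied here; it varies with build flags and hardware.
print(f"bytes:      {bytes_time * 1e3:.2f} msec per loop")
print(f"memoryview: {view_time * 1e3:.2f} msec per loop")
print(f"memoryview / bytes = {view_time / bytes_time:.2f}")
```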

gvanrossum · Sep 2, 2020

Objects/memoryobject.c

+   The function is used by memory_subscript(), memoryiter_next() and
+   memory_as_sequence.
+   Note: Iteration of multi-dimensional arrays is not yet implemented
+   (as of Python 3.10.0a0) */


I'd drop the version -- whoever implements it should remove the comment. :-)

gvanrossum · Sep 2, 2020

Objects/memoryobject.c

+    }
+    if (seq->view.ndim != 1) {
+        PyErr_SetString(PyExc_NotImplementedError,
+            "multi-dimensional sub-views are not implemented");


Add return NULL; -- the PyErr... function doesn't automagically return!

You may be able to save a few usec by moving this check to memory_iter.

(Note that the error also gets invoked for a 0-dimensional memoryview object. In the past that would give TypeError: invalid indexing of 0-dim memory, so still an error. But not really "multi-dimensional". :-)
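The dimension cases mentioned here can be observed from Python. This is only a sketch: iteration_error is a hypothetical helper, and the exact exception class reported for the 0-dimensional case depends on which version of the patch or interpreter is running, so the code reports it rather than assuming it.

```python
def iteration_error(view):
    """Return the name of the exception raised by iterating `view`, or None."""
    try:
        list(view)
    except (TypeError, NotImplementedError) as exc:
        return type(exc).__name__
    return None


one_dim = memoryview(b"abcdef")                   # ndim == 1
two_dim = one_dim.cast("B", shape=[2, 3])         # ndim == 2
zero_dim = memoryview(b"a").cast("B", shape=())   # ndim == 0

for view in (one_dim, two_dim, zero_dim):
    print(f"ndim={view.ndim}: {iteration_error(view)}")
```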

gvanrossum · Sep 2, 2020

Objects/memoryobject.c

@@ -3200,4 +3289,4 @@ PyTypeObject PyMemoryView_Type = {
    0,                                        /* tp_init */
    0,                                        /* tp_alloc */
    memoryview,                               /* tp_new */
-};
+};


Make sure the last line of the file ends in a newline character! (Most editors have a setting to make sure of this.)

ghost · Sep 3, 2020

The last 2 commits address the issues you raised.
Regarding performance compared to bytes, I can also confirm that memoryview is still slower (by even more than 11% if optimizations are enabled).

If everything checks out for you at this stage, I look forward to continuing work on this patch and trying to close the performance gap.

gvanrossum

Looks good! Now let's optimize. Do you have ideas where to start?

ghost · Sep 3, 2020

I think it is worth trying the suggestions you made above:

"you can start by adding members to the memoryiterobject struct to hold the length and the fmt code"

I actually just tried the first part and there was no noticeable difference compared to the current implementation.
The second part might be more successful, as there are multiple indirections that could be cut out.

Another idea is to write a new lookup_dimension() that would be called directly (instead of from ptr_from_index()) and could do without some of the checks in the current version.

What do you think? Any other ideas?

gvanrossum · Sep 3, 2020

Just try everything and see what makes a difference.

--Guido (mobile)

ghost · Sep 3, 2020

Ok, I will work on it tonight.

ghost · Sep 3, 2020

Hi Guido,

Here is a first attempt at optimizing our patch. It might be possible to push it further, but I would like to hear your thoughts on the current version before proceeding.

Thank you.

gvanrossum

Moving along nicely!

gvanrossum · Sep 3, 2020

Objects/memoryobject.c

+    Py_ssize_t it_index;
+    PyMemoryViewObject *it_seq; // Set to NULL when iterator is exhausted
+    Py_ssize_t length;
+    const char *fmt;


You forgot the it_ prefix here. :-)
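To make the idea behind these extra struct members concrete, here is a pure-Python model, not the C code under review. It validates the view once at creation and caches the length and format so that each step does no re-validation. The class name and the it_length, it_fmt, and it_itemsize attributes are hypothetical, and the sketch assumes a C-contiguous, 1-dimensional view.

```python
import struct
from array import array


class MemoryIterSketch:
    """Pure-Python model of an iterator that validates its view only once."""

    def __init__(self, view):
        # Checks are hoisted here so that __next__ does not repeat them.
        if view.ndim != 1:
            raise NotImplementedError("only 1-dimensional views are modelled")
        self.it_seq = view.cast("B")       # raw bytes; None once exhausted
        self.it_index = 0
        self.it_length = len(view)         # hypothetical cached member
        self.it_fmt = view.format          # hypothetical cached member
        self.it_itemsize = view.itemsize   # hypothetical cached member

    def __iter__(self):
        return self

    def __next__(self):
        if self.it_seq is None or self.it_index >= self.it_length:
            self.it_seq = None
            raise StopIteration
        offset = self.it_index * self.it_itemsize
        self.it_index += 1
        (item,) = struct.unpack_from(self.it_fmt, self.it_seq, offset)
        return item


print(list(MemoryIterSketch(memoryview(b"abc"))))
print(list(MemoryIterSketch(memoryview(array("d", [1.0, 2.5])))))
```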

gvanrossum · Sep 3, 2020

Objects/memoryobject.c

+{
+    Py_ssize_t nitems = view->shape[0];
+
+    // handle negative lookups (i.e [-2])


Why do you even handle these? In ...iter_next the index is always 0 or greater. You can replace this check with an assert.

gvanrossum · Sep 3, 2020

Objects/memoryobject.c

+    if (index < 0) {
+        index += nitems;
+    }
+    if (index < 0 || index >= nitems) {


Since your only caller already checks that the index is within bounds, you could skip the index >= nitems check as well. And then you don't even need the nitems variable.

Honest mistake: I was reading lookup_dimension() and forgot that I didn't need those extra checks when coming from memoryiter_next().
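The simplification discussed in these two comments can be modelled in a few lines of Python. Both function names are hypothetical, and the functions return a byte offset rather than a pointer. The point is only that a lookup whose sole caller already guarantees a valid index can replace the negative-index handling and bounds check with an assert.

```python
def lookup_checked(index, nitems, stride):
    """General-purpose lookup: accepts negative indices and checks bounds."""
    if index < 0:
        index += nitems
    if index < 0 or index >= nitems:
        raise IndexError("index out of bounds")
    return index * stride


def lookup_from_iterator(index, stride):
    """Lookup for a caller that already guarantees 0 <= index < nitems."""
    assert index >= 0, "the iterator never produces a negative index"
    return index * stride


nitems, stride = 5, 4
offsets = [lookup_from_iterator(i, stride) for i in range(nitems)]
print(offsets == [lookup_checked(i, nitems, stride) for i in range(nitems)])
```

The iterator-only version no longer needs nitems at all, which matches the observation that the variable can be dropped.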

gvanrossum · Sep 3, 2020

Objects/memoryobject.c

+    return 0;
+}
+
+static char *


Consider making this function inline.

I tried and it didn't yield any noticeable performance improvement. However, "inlining" the function myself did yield a very noticeable improvement: iterating over memoryview is now faster than iterating over bytes when compiling without optimizations enabled (i.e. debug mode).

Do you have an explanation as to why enabling optimizations seems to have less effect on memoryview in comparison to bytes?

It's not faster than bytes for me yet, but it's definitely the fastest I've seen it. On my platform the time went from 12.7 to 9.96 msec (looping over a memoryview of 1,000,000 bytes). Congratulations!

The main overhead that --with-pydebug adds is not so much compiler optimizations as extra asserts. E.g. remember the assert you added to your GET_SIZE macro? Probably the memoryview code still has more asserts than the bytes iterator.

A fun experiment: if the fmt code is 'B' and the data is contiguous, inline even more (you can just call PyLong_FromLong, see unpack_single).

All credit goes to you. Thank you once again for all the time you are taking to teach me, I can't wait to learn more from you and I couldn't be more grateful.

I will try your idea after dinner and report on it. On that note, wouldn't we be able to use that same hack for all the different formats, if we could infer that the data is contiguous?

Well, you don't want to duplicate all the logic from unpack_single(). In the end, who's going to care whether iterating over a memoryview takes 10 nsec or 11 nsec per iteration? For yourself, what's the most important data type when you're iterating over a numpy array?

Float.

I tried the idea you mentioned above and my specific implementation (which might be non-optimal) made things significantly slower.

Back to the current commit: when compiling with optimizations enabled, iterating over bytes is still ~30% faster than iterating over memoryview on my setup. With debug enabled, however, memoryview is ~1% faster.

Do you see any further optimizations, besides the one you mentioned, that could help close the gap between bytes and memoryview?

No, I think we should leave well enough alone. This is only a micro-optimization anyway -- nobody's going to care about any further optimizations, but the core devs would object to the added complexity of the code (e.g. if someone were to change the implementation of ptr_from_index or lookup_dimension, they would have to remember to also update memoryiter_next).

I have one further suggestion before we move on to bigger and better things. If you move PyMemoryIter_Type after the various functions, you won't need to add those forward declarations. This is how things are traditionally written. (For example, this is why PyMemoryView_Type is at the end of the file.)

Hi Guido,

I pushed a new commit that addresses your suggestion above. However, after some consideration I opted to forward declare PyMemoryIter_Type (instead of all the functions), because otherwise I would have to leave it above memory_iter() - which uses it to register with the garbage collector - and in my opinion that would hurt the reading flow.
I think the code is very readable this way, but please let me know if you would prefer no forward declarations at all.

(...) before we move on to bigger and better things.

Can't wait.

gvanrossum

Nice, just one nit. Let's get this submitted as a PR for cpython. In order to do that it's probably best to first file an issue on bugs.python.org where you show that memoryview iteration is slower than it could be. You can then submit your PR to fix the issue (read the instructions in the template carefully).

One final thing. Does your GitHub username really have to be so obscure? Most people just use some abbreviation of their name. :-)

gvanrossum · Sep 5, 2020

Objects/memoryobject.c

+   The function is used by memory_subscript(), memoryiter_next() and
+   memory_as_sequence.
+   Note: Iteration of multi-dimensional arrays is not yet implemented */


I think you can revert this whole comment, you no longer use memory_item in memoryiter_next.

ghost · Sep 5, 2020

I will re-read the dev-guide part related to PR.

Does your GitHub username really have to be so obscure?

No. I will fix that now.

gvanrossum

LGTM. Congrats!

ghost · Sep 5, 2020

Thank you! Here's to the first of many.

ghost · Sep 6, 2020

CPython PR

Add iterator to memoryview

2ece789

gvanrossum reviewed Sep 2, 2020

Diogo Flores added 2 commits September 3, 2020 10:36

Fix formatting

6890921

Bugfix - proper error handling for diff dimension memoryviews

83eef55

gvanrossum reviewed Sep 3, 2020

Code restructure - bugfix

b62bee2

gvanrossum reviewed Sep 3, 2020

Initial optimization attempt

bcf1b4d

gvanrossum reviewed Sep 3, 2020

gvanrossum mentioned this pull request Sep 3, 2020

Memoryview patch #18

Closed

Diogo Flores added 2 commits September 4, 2020 10:47

Optimization - memoryiter_next()

b1f69e6

Code restructure

2fcf41a

gvanrossum reviewed Sep 5, 2020

fix comment

a54b24d

gvanrossum approved these changes Sep 5, 2020

gvanrossum closed this Nov 23, 2020
