Commit c66ffcf

gh-129987: Selectively re-enable SLP autovectorization of _PyEval_EvalFrameDefault (#132530)
Only disable SLP autovectorization of `_PyEval_EvalFrameDefault` on newer GCCs: the optimization bug seems to exist only on GCC 12 and later, while on GCCs before 9, disabling the optimization has a dramatic performance impact.
1 parent 0879ebc commit c66ffcf
1 file changed: +8 -4 lines changed
‎Python/ceval.c

8 additions & 4 deletions
@@ -948,11 +948,15 @@ _PyObjectArray_Free(PyObject **array, PyObject **scratch)
 #include "generated_cases.c.h"
 #endif
 
-#if (defined(__GNUC__) && !defined(__clang__)) && defined(__x86_64__)
+#if (defined(__GNUC__) && __GNUC__ >= 10 && !defined(__clang__)) && defined(__x86_64__)
 /*
- * gh-129987: The SLP autovectorizer can cause poor code generation for opcode
- * dispatch, negating any benefit we get from vectorization elsewhere in the
- * interpreter loop.
+ * gh-129987: The SLP autovectorizer can cause poor code generation for
+ * opcode dispatch in some GCC versions (observed in GCCs 12 through 15,
+ * probably caused by https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115777),
+ * negating any benefit we get from vectorization elsewhere in the
+ * interpreter loop. Disabling it significantly affected older GCC versions
+ * (prior to GCC 9, 40% performance drop), so we have to selectively disable
+ * it.
  */
 #define DONT_SLP_VECTORIZE __attribute__((optimize ("no-tree-slp-vectorize")))
 #else
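
The hunk above stops at `#else`, so the fallback branch and the place where the macro is applied are not visible here. As a rough, standalone sketch of the pattern (not the actual CPython code), the snippet below reuses the same version gate and attaches the macro to a function. It assumes the `#else` branch defines `DONT_SLP_VECTORIZE` as empty; `SLP_GATE_ACTIVE`, `sum_values`, and `slp_gate_active` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Same condition as the patched line in the diff: GCC 10 or newer
 * (excluding clang, which also defines __GNUC__) targeting x86-64. */
#if (defined(__GNUC__) && __GNUC__ >= 10 && !defined(__clang__)) && defined(__x86_64__)
#  define DONT_SLP_VECTORIZE __attribute__((optimize ("no-tree-slp-vectorize")))
#  define SLP_GATE_ACTIVE 1   /* hypothetical flag, not part of CPython */
#else
   /* Assumption: on every other compiler/target the macro expands to nothing. */
#  define DONT_SLP_VECTORIZE
#  define SLP_GATE_ACTIVE 0   /* hypothetical flag, not part of CPython */
#endif

/* Hypothetical stand-in for the interpreter loop. When the gate is active,
 * GCC compiles only this function with SLP vectorization turned off. */
DONT_SLP_VECTORIZE
long sum_values(const long *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Reports whether the attribute was applied in this build. */
int slp_gate_active(void)
{
    return SLP_GATE_ACTIVE;
}
```

Because the attribute is attached per function, the rest of the translation unit keeps whatever vectorization settings the command line requested; only the annotated function is affected, and only on compilers that pass the gate.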
