Description
Of the 263 total uops, 155 are ignored by the tier two optimizer. Together, these represent over half of all uops by dynamic execution count.
This issue will serve as a checklist for auditing these missing uops and adding them where they make sense. At first glance, there's quite a bit of potential here, especially around the ability to narrow known output types (like _CONTAINS_OP_SET) and the ability to narrow and remove guards on input types (like _BINARY_OP_SUBSCR_LIST_INT). As I go through, I'll cross out anything that doesn't seem like it makes sense to add.
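To make the "narrow known output types" idea concrete, here is a toy sketch in Python. It is not how CPython's tier two optimizer is implemented; the tables `POPS` and `GUARDS`, the function `optimize`, and the example trace are all hypothetical and exist only for illustration. The sketch assumes every non-guard uop pushes exactly one value, and that `_TO_BOOL_BOOL` behaves as a guard that the top of the stack is a `bool`.

```python
# Toy model only: the uop names are borrowed from this issue, but the tables
# and the optimize() function are hypothetical, not CPython's real optimizer.

# Number of stack items each non-guard uop pops (each one pushes one result).
POPS = {"_LOAD_FAST": 0, "_CONTAINS_OP_SET": 2}

# Guards: uop name -> the type it requires on top of the stack.
GUARDS = {"_TO_BOOL_BOOL": bool}


def optimize(trace, result_types):
    """Drop guards whose input type is already known.

    result_types maps a uop name to the type of the value it pushes.
    A uop missing from result_types is "ignored": its result is unknown.
    """
    stack = []  # abstract stack: each entry is a known type, or None
    kept = []
    for uop in trace:
        if uop in GUARDS:
            wanted = GUARDS[uop]
            if stack and stack[-1] is wanted:
                continue  # type already proven, so the guard is redundant
            if stack:
                stack[-1] = wanted  # after the guard passes, the type is known
            kept.append(uop)
        else:
            del stack[max(0, len(stack) - POPS[uop]):]  # pop the inputs
            stack.append(result_types.get(uop))  # push a (maybe unknown) result
            kept.append(uop)
    return kept


TRACE = ["_LOAD_FAST", "_LOAD_FAST", "_CONTAINS_OP_SET", "_TO_BOOL_BOOL"]

# While _CONTAINS_OP_SET is ignored, the guard has to stay.
print(optimize(TRACE, {}))
# Once its result is known to be a bool, the guard can be removed.
print(optimize(TRACE, {"_CONTAINS_OP_SET": bool}))
```

In this model, teaching the optimizer about a single uop's result type is enough to delete the guard that follows it, which is the kind of win the checklist is looking for.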
First, here are the 53 missing uops that each represent at least 0.1% of all uops executed:
- _SET_IP (12.1%)
- _CHECK_VALIDITY (10.1%)
- _CHECK_VALIDITY_AND_SET_IP (6.5%)
- _CHECK_PERIODIC (3.1%)
- _MAKE_WARM (2.8%)
- _START_EXECUTOR (1.7%)
- _GUARD_NOS_INT (1.5%)
- _BINARY_OP_SUBSCR_LIST_INT (1.0%)
- _CHECK_FUNCTION (1.0%)
- _CHECK_MANAGED_OBJECT_HAS_VALUES (0.7%)
- _ITER_CHECK_LIST (0.7%)
- _CONTAINS_OP_SET (0.6%)
- _FOR_ITER_TIER_TWO (0.6%)
- _GUARD_NOT_EXHAUSTED_LIST (0.6%)
- _ITER_NEXT_LIST_TIER_TWO (0.6%)
- _SAVE_RETURN_OFFSET (0.6%)
- _CALL_LEN (0.5%)
- _CALL_LIST_APPEND (0.5%)
- _POP_TOP (0.5%)
- _RESUME_CHECK (0.5%)
- _BINARY_OP_SUBSCR_STR_INT (0.4%)
- _GUARD_DORV_VALUES_INST_ATTR_FROM_DICT (0.4%)
- _GUARD_KEYS_VERSION (0.4%)
- _BINARY_OP_SUBSCR_DICT (0.3%)
- _CALL_BUILTIN_FAST (0.3%)
- _CHECK_STACK_SPACE_OPERAND (0.3%)
- _GET_ITER (0.3%)
- _STORE_SUBSCR (0.3%)
- _GUARD_NOT_EXHAUSTED_RANGE (0.2%)
- _BINARY_SLICE (0.2%)
- _BUILD_LIST (0.2%)
- _CALL_BUILTIN_O (0.2%)
- _CALL_NON_PY_GENERAL (0.2%)
- _CHECK_IS_NOT_PY_CALLABLE (0.2%)
- _GUARD_NOS_FLOAT (0.2%)
- _ITER_CHECK_RANGE (0.2%)
- _ITER_CHECK_TUPLE (0.2%)
- _LOAD_DEREF (0.2%)
- _STORE_SUBSCR_LIST_INT (0.2%)
- _BINARY_OP_EXTEND (0.1%)
- _CALL_ISINSTANCE (0.1%)
- _CALL_METHOD_DESCRIPTOR_FAST (0.1%)
- _CALL_METHOD_DESCRIPTOR_FAST_WITH_KEYWORDS (0.1%)
- _CALL_METHOD_DESCRIPTOR_NOARGS (0.1%)
- _CALL_TYPE_1 (0.1%)
- _CHECK_ATTR_CLASS (0.1%)
- _CONTAINS_OP_DICT (0.1%)
- _GUARD_BINARY_OP_EXTEND (0.1%)
- _GUARD_NOT_EXHAUSTED_TUPLE (0.1%)
- _ITER_NEXT_TUPLE (0.1%)
- _LIST_APPEND (0.1%)
- _STORE_ATTR_SLOT (0.1%)
- _STORE_SUBSCR_DICT (0.1%)
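As a rough cross-check of the "over half" figure, the shares listed above can be tallied with a few lines of Python. The numbers are copied from the list and are already rounded to one decimal place, so the total is only approximate. The variable names are placeholders for this sketch.

```python
# Shares from the list above, in tenths of a percent so the sum stays exact.
# Each published share is already rounded, so treat the total as approximate.
shares_tenths = (
    [121, 101, 65, 31, 28, 17, 15, 10, 10, 7, 7]  # 12.1% down to 0.7%
    + [6] * 5    # five uops at 0.6%
    + [5] * 4    # four uops at 0.5%
    + [4] * 3    # three uops at 0.4%
    + [3] * 5    # five uops at 0.3%
    + [2] * 11   # eleven uops at 0.2%
    + [1] * 14   # fourteen uops at 0.1%
)

count = len(shares_tenths)
total_percent = sum(shares_tenths) / 10

print(f"{count} uops, about {total_percent:.1f}% of all uops executed")
# prints: 53 uops, about 52.5% of all uops executed
```

So the 53 uops in this list already account for a little over half of dynamic execution on their own.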
And here are the 102 missing uops that each represent less than 0.1%. These are less important, but may still net us some wins on individual benchmarks:
- _BINARY_OP_SUBSCR_CHECK_FUNC
- _BINARY_OP_SUBSCR_TUPLE_INT
- _BUILD_MAP
- _BUILD_SET
- _BUILD_SLICE
- _BUILD_STRING
- _CALL_BUILTIN_CLASS
- _CALL_BUILTIN_FAST_WITH_KEYWORDS
- _CALL_INTRINSIC_1
- _CALL_INTRINSIC_2
- _CALL_KW_NON_PY
- _CALL_METHOD_DESCRIPTOR_O
- _CALL_STR_1
- _CALL_TUPLE_1
- _CHECK_ATTR_METHOD_LAZY_DICT
- _CHECK_EG_MATCH
- _CHECK_EXC_MATCH
- _CHECK_FUNCTION_VERSION_INLINE
- _CHECK_FUNCTION_VERSION_KW
- _CHECK_IS_NOT_PY_CALLABLE_KW
- _CHECK_METHOD_VERSION
- _CHECK_METHOD_VERSION_KW
- _CHECK_PERIODIC_IF_NOT_YIELD_FROM
- _CONVERT_VALUE
- _COPY_FREE_VARS
- _DELETE_ATTR
- _DELETE_DEREF
- _DELETE_FAST
- _DELETE_GLOBAL
- _DELETE_NAME
- _DELETE_SUBSCR
- _DEOPT
- _DICT_MERGE
- _DICT_UPDATE
- _END_FOR
- _END_SEND
- _ERROR_POP_N
- _EXIT_INIT_CHECK
- _EXPAND_METHOD
- _EXPAND_METHOD_KW
- _FATAL_ERROR
- _FORMAT_SIMPLE
- _FORMAT_WITH_SPEC
- _GET_AITER
- _GET_ANEXT
- _GET_AWAITABLE
- _GET_LEN
- _GET_YIELD_FROM_ITER
- _GUARD_DORV_NO_DICT
- _GUARD_GLOBALS_VERSION
- _GUARD_TOS_FLOAT
- _GUARD_TOS_INT
- _GUARD_TYPE_VERSION_AND_LOCK
- _IMPORT_FROM
- _IMPORT_NAME
- _IS_NONE
- _LIST_EXTEND
- _LOAD_ATTR_NONDESCRIPTOR_NO_DICT
- _LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES
- _LOAD_BUILD_CLASS
- _LOAD_COMMON_CONSTANT
- _LOAD_FAST_LOAD_FAST
- _LOAD_FROM_DICT_OR_DEREF
- _LOAD_GLOBAL
- _LOAD_GLOBAL_BUILTINS
- _LOAD_GLOBAL_MODULE
- _LOAD_LOCALS
- _LOAD_NAME
- _LOAD_SUPER_ATTR_ATTR
- _LOAD_SUPER_ATTR_METHOD
- _MAKE_CALLARGS_A_TUPLE
- _MAKE_CELL
- _MAKE_FUNCTION
- _MAP_ADD
- _MATCH_CLASS
- _MATCH_KEYS
- _MATCH_MAPPING
- _MATCH_SEQUENCE
- _MAYBE_EXPAND_METHOD_KW
- _NOP
- _POP_EXCEPT
- _POP_TWO_LOAD_CONST_INLINE_BORROW
- _PUSH_EXC_INFO
- _PUSH_NULL_CONDITIONAL
- _SETUP_ANNOTATIONS
- _SET_ADD
- _SET_FUNCTION_ATTRIBUTE
- _SET_UPDATE
- _STORE_ATTR
- _STORE_ATTR_INSTANCE_VALUE
- _STORE_ATTR_WITH_HINT
- _STORE_DEREF
- _STORE_FAST_LOAD_FAST
- _STORE_FAST_STORE_FAST
- _STORE_GLOBAL
- _STORE_NAME
- _STORE_SLICE
- _TIER2_RESUME_CHECK
- _UNARY_INVERT
- _UNARY_NEGATIVE
- _UNPACK_SEQUENCE_LIST
- _WITH_EXCEPT_START
Linked PRs
- GH-131798: Allow the JIT to remove more int/float/str guards #131800
- GH-131798: Remove guard for known type in TO_BOOL_STR #131816
- gh-131798: JIT: Narrow the return type of _CONTAINS_OP_SET to bool #132057
- gh-131798: JIT: Narrow the return type of _BINARY_OP_SUBSCR_STR_INT to str #132153
- gh-131798: Allow the JIT to remove an extra _TO_BOOL_BOOL for _CONTAINS_OP_DICT #132269
- GH-131798: Skip self/NULL checks for some known non-methods #132278
- GH-131798: Remove JIT guards for dict, frozenset, list, set, and tuple #132289
- gh-131798: JIT: Split CALL_TYPE_1 into several uops #132419
- gh-131798: Use sym_new_type instead of sym_new_not_null for _BUILD_LIST, _BUILD_SLICE, and _BUILD_MAP #132434
- gh-131798: Use sym_new_type instead of sym_new_not_null for _BUILD_STRING, _BUILD_SET #132564
- gh-131798: JIT: Split CALL_STR_1 into several uops #132849
- gh-131798: JIT: Split CALL_TUPLE_1 into several uops #132851
- gh-131798: JIT: Narrow the return type of _CALL_LEN to int #132940
- gh-131798: JIT: Propagate the result in _BINARY_OP_SUBSCR_TUPLE_INT #133003
- gh-131798: JIT: Narrow the return type of isinstance for some known arguments #133172
- gh-131798: Split CALL_LEN into several uops #133180
- gh-131798: JIT: Split CALL_ISINSTANCE into several uops #133339
- gh-131798: JIT: Narrow the return type of _GET_LEN to int #133345
- gh-131798: JIT: Narrow the return type of _BINARY_SLICE to original container type #133527
- gh-131798: JIT: Assign type to sliced string #133609
- gh-131798: Split CALL_LIST_APPEND into several uops #134240
- gh-131798: JIT: Add _POP_CALL_TWO_LOAD_CONST_INLINE_BORROW #134268
- gh-131798: JIT: Optimize _POP_CALL_TWO_LOAD_CONST_INLINE_BORROW #134369
- GH-131798: Narrow types more aggressively in the JIT #134373
- GH-131798: Optimize cached class attributes and methods in the JIT #134403
- gh-131798: JIT: replace _LOAD_SMALL_INT with _LOAD_CONST_INLINE_BORROW #134406
- gh-131798: JIT: Further optimize _CALL_ISINSTANCE for class tuples #134543
- gh-131798: Small improvements to remove_unneeded_uops #134554