Description
The behavior of the symbolic values used in the tier 2 optimizer has a solid theoretical basis, but there is some subtlety involved.
The symbolic values and the lattice they form need to be well documented and tested.
What are the symbolic values?
Each symbolic value represents not a real object, but knowledge about the set of objects that could be present at a given point in the code. This is powerful, but not intuitive.
Example:
def foo(x):              # line 0
    if x is None:        # line 1
        return "None"    # line 2
    return "ok"          # line 3
At line 3, the symbolic value for x is "not None", whereas the symbolic value for x at line 0 is "unknown".
However, any object referred to by x is unchanged by foo. It is easy to think in terms of objects, but what we are really representing is a type system.
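As a rough illustration of that distinction, the sketch below models the knowledge about x separately from any object x might refer to. The names Knowledge and after_none_check_fails are hypothetical and are not part of the optimizer's API; this only mirrors the first example above.

```python
from enum import Enum


class Knowledge(Enum):
    """Hypothetical stand-in for a symbolic value: what is known about x."""
    UNKNOWN = "unknown"
    NOT_NONE = "not None"


def after_none_check_fails(before: Knowledge) -> Knowledge:
    """Knowledge about x on the fall-through path of `if x is None: return`."""
    # Reaching line 3 means the test on line 1 was false, so x is not None.
    # No object is touched here; only what we know about x changes.
    if before is Knowledge.UNKNOWN:
        return Knowledge.NOT_NONE
    return before


x_at_line_0 = Knowledge.UNKNOWN
x_at_line_3 = after_none_check_fails(x_at_line_0)
print(x_at_line_0.value, "->", x_at_line_3.value)
```

The same object flows through foo untouched; the value that changes between line 0 and line 3 describes the set of objects that could be there.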
Why we need a lattice
Example:
def foo(x):                  # line 0
    if x is None:            # line 1
        return "None"        # line 2
    if x is not None:        # line 3
        return "not None"    # line 4
    return "???"             # line 5
and this trace:
exit_if(x is None) # line 1
exit_if(x is not None) # line 3
return "???" # line 5
The trace is a correct trace; it correctly represents one path through the function.
So what symbolic value does x have for line 5? It is both "None" and "not None".
This is why the lattice needs a "top" value, even though it superficially makes no sense.
If a value is "top" for a location, then that location is unreachable.
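A minimal sketch of how such a value can arise, assuming a four-element lattice (unknown, None, not None, top). The Sym enum and the refine function are hypothetical names chosen for illustration and are not the optimizer's actual symbols API.

```python
from enum import Enum


class Sym(Enum):
    """Hypothetical four-point lattice of knowledge about one variable."""
    UNKNOWN = "unknown"     # nothing is known yet
    IS_NONE = "None"
    NOT_NONE = "not None"
    TOP = "top"             # contradictory knowledge: location unreachable


def refine(current: Sym, fact: Sym) -> Sym:
    """Combine what is already known with a newly learned fact."""
    if current is Sym.UNKNOWN:
        return fact
    if fact is Sym.UNKNOWN or current is fact:
        return current
    # Two different facts (or anything combined with TOP) contradict.
    return Sym.TOP


# Replay the path to line 5 of the second example.
x = Sym.UNKNOWN
x = refine(x, Sym.NOT_NONE)   # getting past line 1 requires x is not None
x = refine(x, Sym.IS_NONE)    # getting past line 3 requires x is None
print(x.value)
```

Under these assumptions the order in which the two facts are learned does not matter: either way the combined knowledge is contradictory, so the result is top and line 5 can be treated as unreachable.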
Linked PRs
- GH-115816: Make tier2 optimizer symbols testable, and add a few tests. #115953
- GH-115816: Assorted naming and formatting changes to improve maintainability. #115987
- gh-115816: Improve internal symbols API in optimizer #116011
- gh-115816: Improve internal symbols API in optimizer #116028
- gh-115816: Generate calls to sym_new_const() etc. without _Py_uop prefix #116077