GH-130415: Narrow str to "" based on boolean tests #130476

Merged: 9 commits, Mar 4, 2025
Correct res type and change f var to empty
fluhus committed Mar 3, 2025
commit ab263fe32a73500a8760ebb0d4f9e306247a4543
Lib/test/test_capi/test_opt.py: 12 changes (6 additions, 6 deletions)
@@ -1505,19 +1505,19 @@ def f(n):
             for i in range(n):
                 dummy = "aaa"
                 # Hopefully the optimizer can't guess what the value is.
-                # f is always "", but we can only prove that it's a string:
-                f = dummy[:0]
+                # empty is always "", but we can only prove that it's a string:
+                empty = dummy[:0]
Review comment (Member):

I can easily see the optimizer turning "aaa"[:0] into "".
empty doesn't need to be a constant; we just need it to be mostly "" for profiling.
Use something like empty = "a"[:(n % 1000) == 0].

@brandtbucher (Member), Mar 3, 2025:

Since we check the actual path taken as part of the test, we need the value to always be "", not just mostly "". So maybe:

false = i == TIER2_THRESHOLD
empty = "X"[:false]

The optimizer can't prove false is False, so it's good enough for our purposes.
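
The difference between the two suggestions can be checked in plain Python. This is a minimal sketch, assuming a stand-in value for TIER2_THRESHOLD (the real constant comes from CPython's test helpers and is not shown in this thread); the function names are hypothetical and exist only for illustration.

```python
# Stand-in value (assumption); the real TIER2_THRESHOLD is defined by
# CPython's test helpers.
TIER2_THRESHOLD = 16


def mostly_empty(n):
    # First suggestion: "" unless n is a multiple of 1000, in which case "a".
    return "a"[:(n % 1000) == 0]


def always_empty(i):
    # Second suggestion: the comparison produces a bool that cannot be
    # proven constant, and "X"[:False] is the same as "X"[:0], i.e. "".
    false = i == TIER2_THRESHOLD
    return "X"[:false]


# range() stops before TIER2_THRESHOLD, so `false` is never True here.
always = {always_empty(i) for i in range(TIER2_THRESHOLD)}
mostly = [mostly_empty(n) for n in (1, 999, 1000)]
```

Under these assumptions `always` contains only the empty string, while `mostly` ends with "a", which is why the first form would change the path recorded by the test.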

                 trace.append("A")
-                if not f:  # Kept.
+                if not empty:  # Kept.
                     trace.append("B")
-                    if not f:  # Removed!
+                    if not empty:  # Removed!
                         trace.append("C")
                     trace.append("D")
-                    if f:  # Removed!
+                    if empty:  # Removed!
                         trace.append("X")
                     trace.append("E")
                 trace.append("F")
-                if f:  # Removed!
+                if empty:  # Removed!
                     trace.append("X")
                 trace.append("G")
             return trace
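
For reference, the renamed test body can be run as ordinary Python to see the path it is expected to record. This is a sketch reassembled from the hunk above: the nesting and the `trace = []` initializer are assumptions, since they fall outside the visible diff, and it checks only the recorded path, not which guards the optimizer actually removes.

```python
def f(n):
    trace = []  # assumed initializer; not visible in the hunk
    for i in range(n):
        dummy = "aaa"
        # empty is always "", as in the version under review.
        empty = dummy[:0]
        trace.append("A")
        if not empty:  # Kept.
            trace.append("B")
            if not empty:  # Removed!
                trace.append("C")
            trace.append("D")
            if empty:  # Removed!
                trace.append("X")
            trace.append("E")
        trace.append("F")
        if empty:  # Removed!
            trace.append("X")
        trace.append("G")
    return trace


# Every iteration should walk A through G and never record "X".
path = "".join(f(2))
```

The "Kept" and "Removed" comments describe what the optimizer is expected to do with each truthiness check; the recorded path is the same either way, which is what lets the test compare it against a fixed string.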
Python/optimizer_bytecodes.c: 2 changes (1 addition, 1 deletion)
@@ -439,7 +439,7 @@ dummy_func(void) {
             // removed and the narrowed value to be invalid:
             if (next_opcode == _GUARD_IS_FALSE_POP) {
                 sym_set_const(value, Py_GetConstant(Py_CONSTANT_EMPTY_STR));
Review comment (Member):

Strictly speaking, this is incorrect. We don't know that value is "" until after the _GUARD_IS_FALSE_POP.
This matters because once we start attaching type information to side exits, as we probably will in 3.15, this could lead us to infer that value is "" on both branches, which would be wrong.

There are two possible fixes for this.

  • Combine TO_BOOL_STR and _GUARD_IS_FALSE_POP/_GUARD_IS_TRUE_POP into a single (super)instruction, then optimize that.
  • Annotate the bool value resulting from the TO_BOOL with its input, then in _GUARD_IS_FALSE_POP convert the input value to TO_BOOL.

I prefer the second option, although it may be more work, as it is more flexible and can be extended more easily.
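
The second option can be pictured with a toy model. This is a Python sketch only, not CPython's optimizer: `Sym`, `BoolOf`, `to_bool_str`, and `guard_is_false_pop` are hypothetical names invented for illustration. The idea is that TO_BOOL records where its result came from, and the guard is the point at which the source value gets narrowed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sym:
    """Hypothetical symbol: a type name plus an optional known constant."""
    type_name: str
    const: Optional[object] = None
    has_const: bool = False

    def set_const(self, value):
        self.const = value
        self.has_const = True


@dataclass
class BoolOf(Sym):
    """Hypothetical bool symbol that remembers the value it was derived from."""
    source: Optional[Sym] = None


def to_bool_str(value: Sym) -> BoolOf:
    # TO_BOOL_STR only records the link; it does not narrow `value` yet.
    return BoolOf(type_name="bool", source=value)


def guard_is_false_pop(flag: Sym) -> None:
    # Past this guard the flag is False, so a str source must be "".
    if isinstance(flag, BoolOf) and flag.source is not None:
        if flag.source.type_name == "str":
            flag.source.set_const("")


value = Sym("str")
flag = to_bool_str(value)
known_before_guard = value.has_const
guard_is_false_pop(flag)
known_after_guard = value.has_const
```

In this model nothing is known about `value` until the guard has been processed, which is the ordering the comment asks for.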

Reply (Member):

Yeah, @Fidget-Spinner and I suggested something like the latter on the issue (new symbols like JitBoolOf(JitOptSymbol *source, bool inverted) and JitEqualTo(JitOptSymbol *lhs, JitOptSymbol *rhs, bool inverted)). That's probably the direction we're headed in longer term.

However, I don't think we should let perfect be the enemy of good here. We have nice, working optimizations in these PRs; just because we might sink info onto side exits in the future probably shouldn't prevent us from making changes like this now for 3.14, which are perfectly correct for the current optimizer (which doesn't sink anything).

I'm inclined to land these changes and other similar ones for ==/!= now, and make the symbolic representation of derived boolean values more complex later as an improvement (it will also be able to handle more uncommon cases like x = y == 42; if x: ...). I'm really worried that if we try to "future-proof" optimizations based on what we could do six months from now, it will prevent actual improvements in the near term.

But I'll defer to you here. If having value be narrowed one uop too early in the instruction stream is enough to block this PR, I can work with these new contributors on the more complex solution. But as-is, this has no bugs and works as intended. We don't sink value info onto side exits, so it's correct.
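
The disagreement comes down to where the narrowed fact is recorded. A toy Python model (hypothetical names, not the real optimizer) makes the two positions concrete: narrowing at TO_BOOL_STR gives the right answer on the main path, and becomes wrong only if the facts known at the guard are also copied onto the side exit.

```python
def optimize(narrow_early, sink_to_side_exit):
    """Return what a toy optimizer believes about `value` on each path.

    All names here are hypothetical. `facts` maps a variable name to a
    known constant; a missing key means "unknown".
    """
    facts = {}

    # TO_BOOL_STR
    if narrow_early:
        facts["value"] = ""  # the PR's approach: narrow one uop early

    # _GUARD_IS_FALSE_POP: the side exit is taken when the bool is True,
    # i.e. when value is a non-empty string.
    side_exit = dict(facts) if sink_to_side_exit else {}

    # Main path after the guard: value really is "" here.
    facts["value"] = ""
    return facts, side_exit


# Current optimizer: nothing is sunk, so the side exit learns nothing.
main_now, exit_now = optimize(narrow_early=True, sink_to_side_exit=False)

# Possible future optimizer: the early fact leaks onto the side exit.
main_later, exit_later = optimize(narrow_early=True, sink_to_side_exit=True)

# Narrowing only after the guard stays correct even with sinking.
main_safe, exit_safe = optimize(narrow_early=False, sink_to_side_exit=True)
```

Only the middle case records `value` as "" on the side exit, where it is actually non-empty. That is the scenario the first comment warns about, and it cannot arise while nothing is sunk onto side exits.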

-                res = sym_new_type(ctx, &PyUnicode_Type);
+                res = sym_new_type(ctx, &PyBool_Type);
             }
         }
     }
Python/optimizer_cases.c.h: 2 changes (1 addition, 1 deletion)

Some generated files are not rendered by default.