Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Appearance settings

gh-133956 fix bug where ClassVar string annotation in @dataclass caused incorrect __init__ generation #134073

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
Loading
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Next Next commit
fix: string annotations ClassVar bug
  • Loading branch information
dzherb committed May 15, 2025
commit 92e2f769eaa1d308873ef977a2f4fd8e48a4a35f
42 changes: 27 additions & 15 deletions 42 Lib/dataclasses.py
Original file line number Diff line number Diff line change
Expand Up @@ -753,21 +753,33 @@ def _is_type(annotation, cls, a_module, a_type, is_type_predicate):
# that's defined. It was judged not worth it.

match = _MODULE_IDENTIFIER_RE.match(annotation)
if match:
ns = None
module_name = match.group(1)
if not module_name:
# No module name, assume the class's module did
# "from dataclasses import InitVar".
ns = sys.modules.get(cls.__module__).__dict__
else:
# Look up module_name in the class's module.
module = sys.modules.get(cls.__module__)
if module and module.__dict__.get(module_name) is a_module:
ns = sys.modules.get(a_type.__module__).__dict__
if ns and is_type_predicate(ns.get(match.group(2)), a_module):
return True
return False
if not match:
return False

ns = None
module_name = match.group(1)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an edge case that is not handled by this: foo.typing_filtered.ClassVar. The regex only accounts for one module level. While it is probably uncommon to import foo.typing_filtered without aliasing it, it is possible.

I don't want to change the world for a simple bug fix, but it seems like partitioning on . is more robust (and probably more performant) than using the current regex.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that splitting on a dot would be faster, there are existing tests that cover cases like dataclasses.InitVar.[int] and dataclasses.InitVar+. If we want to support multiple module levels, we should extend the current regex pattern rather than replace it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth creating a new issue to track more levels of module nesting. And I think it's a mistake to test for dataclasses.InitVar.[int] and other invalid strings.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bug was actually caught by __.typx.ClassVar in the real world. I typically stuff all of my common imports, including typing_extensions as typx, into an internals subpackage (__) so that I can do from . import __. Reduces module namespace pollution and lets me avoid setting __all__ to define module interfaces. But, I understand that my practice may not be common.

That said, any fix to multi-level traversal here or another PR is going to affect the code that is actually being fixed. Imo, it would make more sense to fix both together holistically (to save developer effort), if there is any interest in actually fixing the multi-level traversal.

type_name = match.group(2)

if not module_name:
# No module name, assume the class's module did
# "from dataclasses import InitVar".
ns = sys.modules.get(cls.__module__).__dict__
else:
# Look up module_name in the class's module.
cls_module = sys.modules.get(cls.__module__)
if not cls_module:
return False
Comment on lines +770 to +771
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is interesting that we're paranoid about the dataclass not having its module imported on only this branch and not the branch above. Not suggesting a change, but a comment could be useful if you know why we only care in the else branch and not the if not module_name branch.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was following the checks that were already present in the function. In the else branch, there was a check for if module:

module = sys.modules.get(cls.__module__)
if module and module.__dict__.get(module_name) is a_module:
      ns = sys.modules.get(a_type.__module__).__dict__

I’d actually consider removing that check — it does seem unnecessary. At least, I can’t think of a case where cls.__module__ would be missing from sys.modules. And the tests are still passing.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I saw the original code. My suggestion would be to hoist module = sys.modules.get(cls.__module__) out of the if..else. If you want to preserve the rather paranoid sanity check alongside the hoisted assignment, you could. Something like:

module = sys.modules.get(cls.__module__)
if module:  # not sure that we need to be this paranoid 
    ns = module.__dict__
if module_name:
...


a_type_module = cls_module.__dict__.get(module_name)
if (
isinstance(a_type_module, types.ModuleType)
# Consider the case when a_type does not belong
# to the namespace, e.g. 'dataclasses.ClassVar[int]'
and a_type_module.__dict__.get(type_name) is a_type
):
ns = sys.modules.get(a_type.__module__).__dict__

return ns and is_type_predicate(ns.get(type_name), a_module)


def _get_field(cls, a_name, a_type, default_kw_only):
Expand Down
10 changes: 7 additions & 3 deletions 10 Lib/test/test_dataclasses/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4093,10 +4093,14 @@ def test_classvar_module_level_import(self):
from test.test_dataclasses import dataclass_module_1_str
from test.test_dataclasses import dataclass_module_2
from test.test_dataclasses import dataclass_module_2_str
from test.test_dataclasses import dataclass_module_3
from test.test_dataclasses import dataclass_module_3_str

for m in (dataclass_module_1, dataclass_module_1_str,
dataclass_module_2, dataclass_module_2_str,
):
for m in (
dataclass_module_1, dataclass_module_1_str,
dataclass_module_2, dataclass_module_2_str,
dataclass_module_3, dataclass_module_3_str,
):
with self.subTest(m=m):
# There's a difference in how the ClassVars are
# interpreted when using string annotations or
Expand Down
6 changes: 6 additions & 0 deletions 6 Lib/test/test_dataclasses/_types_proxy.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# We need this to test a case when a type
# is imported via some other package,
# like ClassVar from typing_extensions instead of typing.
# https://github.com/python/cpython/issues/133956
from typing import ClassVar
from dataclasses import InitVar
32 changes: 32 additions & 0 deletions 32 Lib/test/test_dataclasses/dataclass_module_3.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
#from __future__ import annotations
USING_STRINGS = False

# dataclass_module_3.py and dataclass_module_3_str.py are identical
# except only the latter uses string annotations.

from dataclasses import dataclass
import test.test_dataclasses._types_proxy as tp

T_CV2 = tp.ClassVar[int]
T_CV3 = tp.ClassVar

T_IV2 = tp.InitVar[int]
T_IV3 = tp.InitVar

@dataclass
class CV:
T_CV4 = tp.ClassVar
cv0: tp.ClassVar[int] = 20
cv1: tp.ClassVar = 30
cv2: T_CV2
cv3: T_CV3
not_cv4: T_CV4 # When using string annotations, this field is not recognized as a ClassVar.

@dataclass
class IV:
T_IV4 = tp.InitVar
iv0: tp.InitVar[int]
iv1: tp.InitVar
iv2: T_IV2
iv3: T_IV3
not_iv4: T_IV4 # When using string annotations, this field is not recognized as an InitVar.
32 changes: 32 additions & 0 deletions 32 Lib/test/test_dataclasses/dataclass_module_3_str.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
from __future__ import annotations
USING_STRINGS = True

# dataclass_module_3.py and dataclass_module_2_str.py are identical
# except only the latter uses string annotations.

from dataclasses import dataclass
import test.test_dataclasses._types_proxy as tp

T_CV2 = tp.ClassVar[int]
T_CV3 = tp.ClassVar

T_IV2 = tp.InitVar[int]
T_IV3 = tp.InitVar

@dataclass
class CV:
T_CV4 = tp.ClassVar
cv0: tp.ClassVar[int] = 20
cv1: tp.ClassVar = 30
cv2: T_CV2
cv3: T_CV3
not_cv4: T_CV4 # When using string annotations, this field is not recognized as a ClassVar.

@dataclass
class IV:
T_IV4 = tp.InitVar
iv0: tp.InitVar[int]
iv1: tp.InitVar
iv2: T_IV2
iv3: T_IV3
not_iv4: T_IV4 # When using string annotations, this field is not recognized as an InitVar.
Morty Proxy This is a proxified and sanitized view of the page, visit original site.