-
-
Notifications
You must be signed in to change notification settings - Fork 32k
bpo-39337: encodings.normalize_encoding() now ignores non-ASCII characters #22219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
update test
This reverts commit 0fcafb8.
Misc/NEWS.d/next/Library/2020-09-13-02-02-18.bpo-39337.L3NXTt.rst
Outdated
Show resolved
Hide resolved
* origin/master: (147 commits) Fix the attribute names in the docstring of GenericAlias (pythonGH-22594) bpo-39337: Add a test case for normalizing of codec names (pythonGH-19069) bpo-41557: Update Windows installer to use SQLite 3.33.0 (pythonGH-21960) bpo-41976: Fix the fallback to gcc of ctypes.util.find_library when using gcc>9 (pythonGH-22598) bpo-41306: Allow scale value to not be rounded (pythonGH-21715) bpo-41970: Avoid test failure in test_lib2to3 if the module is already imported (pythonGH-22595) bpo-41376: Fix the documentation of `site.getusersitepackages()` (pythonGH-21602) Revert "bpo-26680: Incorporate is_integer in all built-in and standard library numeric types (pythonGH-6121)" (pythonGH-22584) bpo-41923: PEP 613: Add TypeAlias to typing module (python#22532) Fix comment about PyObject_IsTrue. (pythonGH-22343) bpo-38605: Make 'from __future__ import annotations' the default (pythonGH-20434) bpo-41905: Add abc.update_abstractmethods() (pythonGH-22485) bpo-41944: No longer call eval() on content received via HTTP in the UnicodeNames tests (pythonGH-22575) bpo-41944: No longer call eval() on content received via HTTP in the CJK codec tests (pythonGH-22566) Post 3.10.0a1 Python 3.10.0a1 bpo-41584: clarify when the reflected method of a binary arithemtic operator is called (python#22505) bpo-41939: Fix test_site.test_license_exists_at_url() (python#22559) bpo-41774: Tweak new programming FAQ entry (pythonGH-22562) bpo-41936. Remove macros Py_ALLOW_RECURSION/Py_END_ALLOW_RECURSION (pythonGH-22552) ...
Lib/test/test_codecs.py
Outdated
self.assertEqual(out, 'utf_8') | ||
out = encodings.normalize_encoding('utf 8') | ||
self.assertEqual(out, 'utf_8') | ||
out = encodings.normalize_encoding('UTF 8') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind to add a comment to explain that the function does not convert upper case letters to lower case letters, just to make the purpose of this test even more explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I don't know how to exact explain it~
I found a case in https://github.com/python/cpython/blob/master/Lib/locale.py#L358.
Looks like It's fine to update encodings.normalize_encoding()
to conver to lower-case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just describe the fact, Lol~
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can enhance
this encodings.normalize_encoding()
? I am not sure~
@@ -61,7 +61,8 @@ def normalize_encoding(encoding): | ||
if c.isalnum() or c == '.': | ||
if punct and chars: | ||
chars.append('_') | ||
chars.append(c) | ||
if c.isascii(): | ||
chars.append(c) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to ask you to add a ".. versionchanged:: 3.10" entry in the documentation, but then I noticed that the encodings module was never documented! Oh!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If end user will use this function or module, I can try to create the doc, but I need some time to do it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can and must be addressed in a separated PR anymore. The lack of documentation should not hold this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, copy that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
@taleinat @serhiy-storchaka or @malemburg: Do you want to double check this PR?
Misc/NEWS.d/next/Library/2020-09-13-02-02-18.bpo-39337.L3NXTt.rst
Outdated
Show resolved
Hide resolved
Merged, thanks. |
thanks victor for your tons of comments ;) |
encodings.normalize_encoding()
now ignores non-ASCII characters.https://bugs.python.org/issue39337