bpo-30708: Check for null characters in PyUnicode_AsWideCharString().#2285
bpo-30708: Check for null characters in PyUnicode_AsWideCharString().#2285serhiy-storchaka merged 4 commits intopython:masterpython/cpython:masterfrom serhiy-storchaka:PyUnicode_AsWideCharString-null-wcharsserhiy-storchaka/cpython:PyUnicode_AsWideCharString-null-wcharsCopy head branch name to clipboard
Conversation
Raise a ValueError if the second argument is NULL and the wchar_t\* string contains null characters.
|
@serhiy-storchaka, thanks for your PR! By analyzing the history of the files in this pull request, we identified @birkenfeld, @zooba and @bitdancer to be potential reviewers. |
| } | ||
| if (size == NULL && wcslen(wstr) != (size_t)buflen) { | ||
| PyErr_SetString(PyExc_ValueError, | ||
| "embedded null character"); |
There was a problem hiding this comment.
This error message doesn't really have a lot of information and could be a bit confusing. I'd suggest changing it to something like String has an embedded null character but size parameter was not provided
There was a problem hiding this comment.
This is a standard error message raised in a lot of places. Adding "but size parameter was not provided" only will confuse the end user/Python programmer, because at Python level there is no any size parameter. From Python view this exception just means that a string containing '\0' is passed to a function which rejects strings containing '\0'.
There was a problem hiding this comment.
It might be a standard error message but PyUnicode_AsWideCharString is part of the public C API, from the perspective of someone writing an extension this error would be pretty confusing, I guess someone calling it from interpreter code itself could quickly realize why this error was thrown digging into the code but I don't see how this would be more confusing if enough detail was added.
There was a problem hiding this comment.
This exception is for the users of extensions, not for the authors of extensions. PyArg_ParseTuple() raises the same exception, as well as some other functions. Using PyArg_ParseTuple() or PyUnicode_AsWideCharString() or other way for converting Python objects to C values is an implementation detail insignificant for the users of extensions. Words about a size parameter don't make sense at Python level.
The direct users of C API can read the PyUnicode_AsWideCharString() documentation when they get this exception (as well as they can read the PyArg_ParseTuple() documentation when they get the similar exception in PyArg_ParseTuple()). It is explicitly documented.
Objects/unicodeobject.c
Outdated
| memcpy(buffer, wstr, buflen * sizeof(wchar_t)); | ||
| if (size != NULL) | ||
| *size = buflen; | ||
| *size = buflen - 1; |
There was a problem hiding this comment.
In my opinion, this way of incrementing buflen to make room for the terminator and then setting this to buflen - 1 requires a bit of thinking when reading this at a glance. Maybe move setting the size to before the buflen is incremented but this isn't a big issue.
…ing(). (pythonGH-2285) Raise a ValueError if the second argument is NULL and the wchar_t\* string contains null characters.. (cherry picked from commit e613e6a)
…harString(). (pythonGH-2285) (pythonGH-2443) Raise a ValueError if the second argument is NULL and the wchar_t\* string contains null characters.. (cherry picked from commit e613e6a). (cherry picked from commit 0edffa3)
…ng(). (GH-2285) (GH-2443) (#2448) And use it instead of PyUnicode_AsWideCharString() if appropriate. _PyUnicode_AsWideCharString(unicode) is like PyUnicode_AsWideCharString(unicode, NULL), but raises a ValueError if the wchar_t* string contains null characters. (cherry picked from commit e613e6a). (cherry picked from commit 0edffa3)
Raise a
ValueErrorif the second argument isNULLand thewchar_t*string contains null characters.