gh-127146: Skip test_open_undecodable_uri on Emscripten by hoodmane · Pull Request #136510 · python/cpython

hoodmane · Jul 10, 2025

PR #136326 removed the Emscripten skip for this file but it is still broken.

Issue: Emscripten: Get test suite passing #127146

StanFromIreland · Jul 10, 2025

cc @serhiy-storchaka

serhiy-storchaka · Jul 10, 2025

I am surprised. Does test_open_with_undecodable_path pass?

freakboy3742 · Jul 11, 2025

@serhiy-storchaka Both test_open_undecodable_uri and test_open_with_undecodable_path are failing in CI at present.

FWIW - in both cases, the test path being used is b'@test_42_tmp\xe7w\xf0' - the path is defined, and it is successfully opened, so the get_undecodable_path() check is passing where previously the test was explicitly skipped on Emscripten.

When I run this locally, I get a warning in the console:

    Warning -- files was modified by test.test_sqlite3.test_dbapi
    Warning --   Before: []
    Warning --   After:  ['@test_42_tmp�w�']
    test test.test_sqlite3.test_dbapi failed

which suggests to me that the bad unicode isn't round-tripping consistently through the filesystem.

I agree that this should be a "skip both or skip neither" situation; and I'd definitely prefer to understand better why it needs to be skipped explicitly before restoring the skip.

serhiy-storchaka · Jul 11, 2025

I suppose that paths are Unicode strings on Emscripten. Python and SQLite can decode a bytes path to Unicode in different ways, so os.path.exists() does not see the file created by SQLite, and unlink() cannot remove it.

This is similar to Windows where we need to keep a separate skip.

hoodmane · Jul 11, 2025

Well '@test_42_tmp\xe7w\xf0' does round trip through JavaScript, MEMFS, and NODEFS correctly. So I agree, this test should be fixable. I will investigate more.

hoodmane · Jul 11, 2025

> Both test_open_undecodable_uri and test_open_with_undecodable_path are failing in CI at present

Interesting. Locally only test_open_with_undecodable_path is failing for me.

hoodmane · Jul 11, 2025

I changed the test to this and added syscall tracing:

    def get_undecodable_path(self):
        path = TESTFN_UNDECODABLE
        print("open", path)
        f = open(path, 'wb')
        print("close", path)
        f.close()
        print("unlink", path)
        unlink(path)
        return path

    @unittest.skipIf(sys.platform == "win32", "skipped on Windows")
    def test_open_with_undecodable_path(self):
        path = self.get_undecodable_path()
        self.addCleanup(unlink, path)
        print("sqlite.connect", path)
        c = sqlite.connect(path)
        with contextlib.closing(c) as cx:
            print("exists", path)
            exists = os.path.exists(path)
            print(" .. ", exists)
            self.assertTrue(exists)

The relevant part of the log looks like this:

    open b'@test_42_tmp\xe7w\xf0'
    ___syscall_openat @test_42_tmp緰
    close b'@test_42_tmp\xe7w\xf0'
    _fd_close 3 /home/.../test_python_worker_601748æ/@test_42_tmp緰
    unlink b'@test_42_tmp\xe7w\xf0'
    ___syscall_unlinkat @test_42_tmp緰
    sqlite.connect b'@test_42_tmp\xe7w\xf0'
    ___syscall_openat /home/.../test_python_worker_601748æ/@test_42_tmp�w�
    ___syscall_stat64 /home/.../test_python_worker_601748æ/@test_42_tmp�w�
    exists b'@test_42_tmp\xe7w\xf0'
    ___syscall_stat64 @test_42_tmp緰
     ..  False
    ___syscall_stat64 /home/.../test_python_worker_601748æ/@test_42_tmp�w�
    _fd_close 3 /home/.../test_python_worker_601748æ/@test_42_tmp�w�
    <unrelated teardown syscalls>
    FAIL
    ___syscall_unlinkat @test_42_tmp緰

hoodmane · Jul 11, 2025

Okay I got it: the problem is that UTF8ArrayToString uses a different code path on strings with more than 16 bytes and the behaviors don't exactly match:

    // When using conditional TextDecoder, skip it for short strings as the overhead of the native call is not worth it.
    if (endPtr - idx > 16 && heapOrArray.buffer && UTF8Decoder) {
      return UTF8Decoder.decode({{{ getUnsharedTextDecoderView('heapOrArray', 'idx', 'endPtr') }}});
    }

https://github.com/emscripten-core/emscripten/blob/main/src/lib/libstrings.js#L57-L59

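So when the file system decodes a path, it checks the length: strings of 16 bytes or fewer go through the hand-written JS decoder, and longer strings go through the native decoder (UTF8Decoder in the snippet above). SQLite fully resolves the path before opening the file, and the absolute path is longer than 16 bytes, so it takes the native path. The unresolved relative path is only 15 bytes long, so it takes the slow JS path. The two decoders disagree on the invalid bytes, which is why the file created by SQLite is not found under the name Python looks up. Adding two extra bytes to TESTFN_UNDECODABLE pushes the relative path over the threshold and makes the test pass.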
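The mismatch described above can be reproduced outside Emscripten with a small Python model. This is only a sketch under stated assumptions: `lenient_utf8_decode` is a hypothetical stand-in for a hand-written decoder that trusts the lead byte and never validates continuation bytes (it is not the Emscripten source), `replacing_utf8_decode` stands in for a decoder that substitutes U+FFFD for invalid sequences, and the `/tmp/` prefix is an invented placeholder for the real, elided working directory. The 16-byte threshold is taken from the snippet quoted above.

```python
THRESHOLD = 16  # from the quoted condition: endPtr - idx > 16

def lenient_utf8_decode(data: bytes) -> str:
    """Hypothetical model: trust the lead byte, never validate continuations."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        i += 1
        if lead < 0x80:
            chars.append(chr(lead))
            continue
        if lead & 0xE0 == 0xC0:
            extra, code = 1, lead & 0x1F
        elif lead & 0xF0 == 0xE0:
            extra, code = 2, lead & 0x0F
        else:
            extra, code = 3, lead & 0x07
        for _ in range(extra):
            # Take the low 6 bits of whatever byte follows (0 past the end).
            low6 = data[i] & 0x3F if i < len(data) else 0
            code = (code << 6) | low6
            i += 1
        # Guard against values outside the Unicode range.
        chars.append(chr(code) if code <= 0x10FFFF else "\ufffd")
    return "".join(chars)

def replacing_utf8_decode(data: bytes) -> str:
    """Model of a validating decoder that emits U+FFFD for bad sequences."""
    return data.decode("utf-8", errors="replace")

def decode_path(data: bytes) -> str:
    """Pick a decoder by length, mirroring the length check quoted above."""
    if len(data) > THRESHOLD:
        return replacing_utf8_decode(data)
    return lenient_utf8_decode(data)

RELATIVE = b"@test_42_tmp\xe7w\xf0"   # 15 bytes: short path
ABSOLUTE = b"/tmp/" + RELATIVE        # hypothetical prefix: long path

print(len(RELATIVE), ascii(decode_path(RELATIVE)))
print(len(ABSOLUTE), ascii(decode_path(ABSOLUTE)))
```

In this model the short path decodes `\xe7w\xf0` to the single code point U+7DF0, while the long path decodes the same bytes to U+FFFD, `w`, U+FFFD. That appears consistent with the two file names in the syscall log, and it shows why a lookup through the relative name cannot find a file created through the absolute name.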

hoodmane · Jul 11, 2025

Upstream report:
emscripten-core/emscripten#24690

pythongh-127146: Skip test_open_undecodable_uri on Emscripten

5588bb4

hoodmane requested a review from freakboy3742 July 10, 2025 14:25

hoodmane requested review from berkerpeksag and erlend-aasland as code owners July 10, 2025 14:25

hoodmane added the skip news label Jul 10, 2025

bedevere-app bot added awaiting review tests Tests in the Lib/test dir labels Jul 10, 2025

bedevere-app bot mentioned this pull request Jul 10, 2025

Emscripten: Get test suite passing #127146

Closed

serhiy-storchaka reviewed Jul 10, 2025

hoodmane closed this Jul 11, 2025

hoodmane wants to merge 1 commit into python:main from hoodmane:emscripten-skip-test_open_undecodable_uri
