Irrecoverable errors when running AST parser in parallel

Summary

I'm working on some static analysis tools that leverage RustPython's AST parser. The basic setup is that I enumerate a bunch of Python files, then in parallel (via Rayon), read them from disk, parse them with RustPython, and perform various operations on the AST.

This generally works great! However, I've noticed that the parallelized parsing can lead to irrecoverable errors (panics, segmentation faults, etc.) when the source Python files themselves contain certain contents, especially nested f-strings.

Here's an example snippet that fails for me occasionally, maybe one in twenty times, typically with Trace/BPT trap: 5, though depending on how exactly I structure the code, I can also get a Segmentation fault: 11:

use std::fs;
use std::path::Path;

use rayon::prelude::*;
use rustpython_parser::ast::Suite;
use rustpython_parser::parser;

fn main() {
    [
        Path::new("resources/test/broken/__init__.py").to_path_buf(),
        Path::new("resources/test/broken/make_string.py").to_path_buf(),
    ]
    .par_iter()
    .map(|path| {
        let contents = fs::read_to_string(path).unwrap();
        parser::parse_program(&contents).unwrap()
    })
    .collect::<Vec<Suite>>();
}

...where __init__.py is empty and make_string.py looks like (sorry, weirdly specific example, but it's been difficult to pinpoint the exact issue):

where_statement = f"""
{' OR '.join([f'column_name = "{field}"' for field in ['plate', 'well']])}
"""
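Because the failure is intermittent (roughly one run in twenty), it can help to run the compiled reproduction many times and tally how each run ended. The following is a minimal, std-only sketch rather than part of the original report: the binary path `./target/debug/repro` and the helper names `describe` and `tally` are hypothetical. On Unix, `Trace/BPT trap: 5` and `Segmentation fault: 11` would show up here as `signal 5` and `signal 11`.

```rust
use std::collections::BTreeMap;
use std::process::{Command, ExitStatus};

/// Describe how one run ended: clean exit, a signal (Unix only), or a non-zero exit code.
fn describe(status: ExitStatus) -> String {
    if status.success() {
        return "ok".to_string();
    }
    #[cfg(unix)]
    {
        use std::os::unix::process::ExitStatusExt;
        if let Some(signal) = status.signal() {
            return format!("signal {signal}");
        }
    }
    match status.code() {
        Some(code) => format!("exit code {code}"),
        None => "unknown".to_string(),
    }
}

/// Run `program` with `args` `runs` times and count each kind of outcome.
fn tally(program: &str, args: &[&str], runs: usize) -> std::io::Result<BTreeMap<String, usize>> {
    let mut counts = BTreeMap::new();
    for _ in 0..runs {
        let status = Command::new(program).args(args).status()?;
        *counts.entry(describe(status)).or_insert(0) += 1;
    }
    Ok(counts)
}

fn main() {
    // Hypothetical default path to the compiled reproduction binary;
    // pass a different path as the first argument to override it.
    let program = std::env::args()
        .nth(1)
        .unwrap_or_else(|| "./target/debug/repro".to_string());
    match tally(&program, &[], 100) {
        Ok(counts) => {
            for (outcome, count) in counts {
                println!("{outcome}: {count}");
            }
        }
        Err(err) => eprintln!("could not run {program}: {err}"),
    }
}
```

Running each attempt in a separate process matters here: a segmentation fault or trap kills the whole process, so it cannot be counted from inside the program that crashed.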

A few observations:

- On the larger Python codebase I'm using for development, I got a stack overflow 100% of the time until I removed the nested f-strings from two specific files (now it works without error every time). If helpful, I can probably come up with an example that errors consistently, but it may be more involved.
- If I remove Rayon, and do everything serially, the code never errors.
- If I run Rayon over a single file (e.g., remove the __init__.py from the above snippet), the code never errors. (I don't know if this is due to a Rayon optimization to not spin up a thread in that case, or something else.)
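One hypothesis worth ruling out, given the stack overflow and the serial-versus-parallel difference, is thread stack size. This is an assumption and is not confirmed anywhere in this report. Threads spawned through Rust's standard library default to a 2 MiB stack unless configured otherwise, which is often smaller than the main thread's stack. The std-only sketch below runs a deeply recursive stand-in on a thread with an explicit stack size. `nested_depth` and `run_with_stack` are hypothetical names, and `nested_depth` only imitates a recursive parser; it is not RustPython code.

```rust
use std::thread;

/// Hypothetical stand-in for a recursive parser: one stack frame per
/// nesting level, each holding a little local state.
fn nested_depth(levels: u32) -> u32 {
    let scratch = [0u8; 256];
    if levels == 0 {
        // black_box keeps the compiler from optimizing the buffer away.
        return u32::from(std::hint::black_box(scratch)[0]);
    }
    1 + nested_depth(levels - 1)
}

/// Run `work` on a fresh thread with `stack_bytes` of stack and return its
/// result. A panic inside `work` comes back as `Err`; a genuine stack
/// overflow still aborts the whole process.
fn run_with_stack<T, F>(stack_bytes: usize, work: F) -> thread::Result<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(work)
        .expect("failed to spawn thread")
        .join()
}

fn main() {
    // 64 MiB is an arbitrary, generous value chosen for illustration.
    let depth = run_with_stack(64 * 1024 * 1024, || nested_depth(10_000)).unwrap();
    println!("reached depth {depth}");
}
```

With Rayon, the analogous setting is `ThreadPoolBuilder::stack_size`. If raising it made the consistent stack overflow disappear, that would point toward deep recursion in the parser; it would not by itself explain the intermittent traps and segmentation faults.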

Irrecoverable errors when running AST parser in parallel #4099
