fix(bwrap): default to deny-by-default filesystem (mirror seatbelt)#482
caarlos0 wants to merge 3 commits into
The Bubblewrap backend used to bind-mount the entire host root read-only into every sandbox (`--ro-bind / /`), so the caller's $HOME, /root, /opt, /var, /sys, /run/user/<uid>, and everything else readable by the calling uid was visible inside the sandbox by default. The macOS Seatbelt backend, by contrast, starts from `(deny default)` and only allows a narrow system baseline -- bwrap now matches that posture.

The new baseline (`BASELINE_RO_BIND_PATHS`) mirrors seatbelt's `SYSTEM_READ_ALLOW` allowlist: top-level executable/library dirs (/bin, /sbin, /lib*), the /usr subpaths that seatbelt allows (without /usr/local), /etc, and the DNS stub-resolver directories under /run (/run/systemd/resolve, /run/NetworkManager, /run/resolvconf) so /etc/resolv.conf symlinks still resolve when network is allowed. $HOME, /opt, /usr/local, /var, /sys, and /run/user/<uid> are no longer visible until the caller opts in via `readonlyPaths` / `readwritePaths`.

Paths are emitted via `--ro-bind-try` so missing entries are silently skipped (e.g. /lib32 on x86_64-only systems, /run/systemd/resolve on hosts without systemd-resolved).

Files in /etc with restrictive perms (/etc/shadow, /etc/sudoers, /etc/ssh/ssh_host_*_key) remain unreadable to a non-root caller even though /etc is bound whole -- user-namespace UID mapping does not bypass kernel DAC.

Updated the existing `filesystem_policy_produces_correct_mounts` test and added 5 new tests covering the new contract (no host-root bind, required baseline paths emitted, /usr/local not exposed, confidential paths excluded, DNS dirs included, baseline precedes policy mounts). Docs in docs/bwrap-support/bubblewrap-backend.md updated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
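A minimal sketch of the argument emission this commit message describes, for readers who want to see the shape of it. The path list follows the PR description; `baseline_args` is a hypothetical helper name, and the real code in `bwrap_command.rs` may be organized differently.

```rust
/// Baseline allowlist as listed in the PR description (assumed complete here;
/// the actual constant in bwrap_command.rs is authoritative).
const BASELINE_RO_BIND_PATHS: &[&str] = &[
    "/bin", "/sbin", "/lib", "/lib32", "/lib64", "/libx32",
    "/usr/bin", "/usr/sbin", "/usr/lib", "/usr/lib32", "/usr/lib64",
    "/usr/libexec", "/usr/share",
    "/etc",
    "/run/systemd/resolve", "/run/NetworkManager", "/run/resolvconf",
];

/// Hypothetical helper: emit one `--ro-bind-try SRC DEST` triple per baseline
/// path. bwrap skips a `--ro-bind-try` mount whose source does not exist.
fn baseline_args() -> Vec<String> {
    let mut args = Vec::new();
    for path in BASELINE_RO_BIND_PATHS {
        args.extend([
            "--ro-bind-try".to_string(),
            (*path).to_string(),
            (*path).to_string(),
        ]);
    }
    args
}
```

Because the host root and `/usr` as a whole never appear in the list, anything outside it (including `/usr/local`) stays invisible unless a policy path adds it.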
`nanvix_common` is a `[build-dependency]` of `lxc` and `wxc`. Build deps are compiled for the host, so cross-compiling lxc-exec from macOS to aarch64-unknown-linux-gnu pulled nanvix_common into a host build where `target_os` was neither "windows" nor "linux" -- the `REQUIRED_BINARIES` and `NANVIXD_BINARY` constants then had no definition and the crate failed to compile.

Add empty/zero fallbacks for non-Windows/Linux hosts. The empty slice is correct because:

- NanVix only runs on Windows and Linux, so iterating `REQUIRED_BINARIES` on other hosts must be a no-op.
- The consuming build scripts (e.g. `src/core/lxc/build.rs`) already gate the surrounding logic behind `cfg(target_os = "linux")` and `feature = "microvm"`, so the fallback values are never reached in practice.

Zero runtime impact on supported platforms; pure build-time portability fix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Carlos Alexandro Becker <caarlos0@users.noreply.github.com>
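The cfg layout described in this commit message could look roughly like the sketch below. The constant names come from the commit; the Linux and Windows values and the `missing_binaries` helper are placeholders invented for illustration, not the crate's real contents.

```rust
// Placeholder values: the real lists live in nanvix_common.
#[cfg(target_os = "linux")]
pub const REQUIRED_BINARIES: &[&str] = &["nanvixd"];
#[cfg(target_os = "windows")]
pub const REQUIRED_BINARIES: &[&str] = &["nanvixd.exe"];
/// Fallback for hosts that are neither Windows nor Linux (e.g. macOS when
/// the crate is compiled as a build dependency during cross-compilation).
#[cfg(not(any(target_os = "windows", target_os = "linux")))]
pub const REQUIRED_BINARIES: &[&str] = &[];

#[cfg(target_os = "linux")]
pub const NANVIXD_BINARY: &str = "nanvixd";
#[cfg(target_os = "windows")]
pub const NANVIXD_BINARY: &str = "nanvixd.exe";
#[cfg(not(any(target_os = "windows", target_os = "linux")))]
pub const NANVIXD_BINARY: &str = "";

/// Hypothetical build-script helper: report required binaries that are not
/// in `present`. With the empty fallback slice this is a no-op.
pub fn missing_binaries(present: &[&str]) -> Vec<&'static str> {
    REQUIRED_BINARIES
        .iter()
        .copied()
        .filter(|b| !present.contains(b))
        .collect()
}
```

Exactly one definition of each constant is active on any host, so the crate always compiles, and code that iterates the slice simply does nothing on unsupported hosts.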
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR tightens the Bubblewrap backend’s default filesystem exposure by switching from a full host-root bind mount to a minimal allowlist baseline, adds regression tests for the new deny-by-default posture, and updates docs accordingly. It also adds NanVix constant fallbacks so the NanVix common crate can compile on non-Windows/Linux hosts when used as a build dependency.
Changes:
- Bubblewrap: replace `--ro-bind / /` with a minimal baseline set of `--ro-bind-try` mounts and add targeted regression tests.
- NanVix: add non-Windows/Linux fallbacks for `REQUIRED_BINARIES` and `NANVIXD_BINARY` to support host builds on macOS/BSD.
- Docs: document the Bubblewrap deny-by-default filesystem model and its consequences.
Show a summary per file
| File | Description |
|---|---|
| src/backends/nanvix/common/src/lib.rs | Adds non-Windows/Linux fallbacks for NanVix host-compiled constants to keep builds working when cross-compiling. |
| src/backends/bubblewrap/common/src/bwrap_command.rs | Introduces a minimal baseline allowlist (deny-by-default) via --ro-bind-try and expands/updates tests. |
| docs/bwrap-support/bubblewrap-backend.md | Documents the new baseline filesystem behavior and user-facing implications. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 5
/// Fallback for non-Windows/Linux hosts. See `REQUIRED_BINARIES` above.
#[cfg(not(any(target_os = "windows", target_os = "linux")))]
pub const NANVIXD_BINARY: &str = "";

for path in BASELINE_RO_BIND_PATHS {
    args.extend(["--ro-bind-try".into(), (*path).into(), (*path).into()]);
}

// ro: baseline paths are emitted via --ro-bind-try, so a bare
// --ro-bind must correspond to the user's readonlyPaths entry.
let ro_pos = args
    .windows(3)
    .position(|w| w[0] == "--ro-bind" && w[1] == "/data" && w[2] == "/data")
    .expect("readonly policy path /data should produce a --ro-bind mount");
assert!(ro_pos > 0);

let usr_local = args.iter().any(|a| a == "/usr/local");
assert!(!usr_local, "baseline must not expose /usr/local by default");
}

// /usr subpaths: mirrors seatbelt's baseline exactly, intentionally
// excluding /usr/local.
/azp run

Azure Pipelines successfully started running 1 pipeline(s).
Common consequences of this default:

- `$HOME` (e.g. `~/.aws/credentials`, `~/.ssh/id_*`, browser cookies) is not readable from the sandbox.
- `/opt` and `/usr/local` tooling is not on PATH; list either path under `readonlyPaths` if the script depends on it.
- `working_directory` must live under the baseline or a policy path; a `cwd` of `~/project` without a matching `readonlyPaths` entry will fail.
- DNS works on systemd-resolved, NetworkManager, and resolvconf hosts because the corresponding `/run/...` directories are bound. Hosts where `/etc/resolv.conf` symlinks somewhere else need that target listed in `readonlyPaths`.
question: Should we return readable errors in these cases if we know beforehand that these things may not work? Also it'd probably be worth it to create a GitHub issue if you think these can be resolved at some point in the future
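One possible shape for the readable error asked about here is a pre-flight check that runs before bwrap is spawned. This is only a sketch under stated assumptions: `check_working_directory` is a hypothetical function, the baseline list is copied from the PR description, the comparison is purely lexical (no symlink resolution), and it ignores `deniedPaths` as well as the `/dev`, `/proc`, and `/tmp` overlays.

```rust
use std::path::{Path, PathBuf};

/// Baseline paths as listed in the PR description (assumed, not authoritative).
const BASELINE_RO_BIND_PATHS: &[&str] = &[
    "/bin", "/sbin", "/lib", "/lib32", "/lib64", "/libx32",
    "/usr/bin", "/usr/sbin", "/usr/lib", "/usr/lib32", "/usr/lib64",
    "/usr/libexec", "/usr/share",
    "/etc",
    "/run/systemd/resolve", "/run/NetworkManager", "/run/resolvconf",
];

/// Hypothetical pre-flight check: succeed only if `cwd` sits under a baseline
/// path or under one of the caller's policy paths (readonlyPaths and
/// readwritePaths combined); otherwise return a message naming the fix.
pub fn check_working_directory(cwd: &Path, policy_paths: &[PathBuf]) -> Result<(), String> {
    let covered = BASELINE_RO_BIND_PATHS
        .iter()
        .map(|p| Path::new(*p))
        .chain(policy_paths.iter().map(|p| p.as_path()))
        // Path::starts_with compares whole components, so "/usr/local"
        // is not treated as being under "/usr/lib".
        .any(|root| cwd.starts_with(root));

    if covered {
        Ok(())
    } else {
        Err(format!(
            "working directory {} is not visible inside the sandbox; \
             add it (or a parent) to readonlyPaths or readwritePaths",
            cwd.display()
        ))
    }
}
```

The same prefix test could be applied to the target of an `/etc/resolv.conf` symlink to warn when DNS is likely to break, though that would need a `read_link` call on the host first.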
`/etc/resolv.conf` symlinks somewhere else need that target listed in
`readonlyPaths`.

Files in `/etc` that contain secrets (`/etc/shadow`, `/etc/sudoers`,
note: same question as above
/// (`/etc/shadow`, `/etc/sudoers`, `/etc/ssh/ssh_host_*_key`) are mode
/// `0400` / `0640` root and remain unreadable to a non-root caller;
/// user-namespace UID mapping does not bypass kernel DAC.
const BASELINE_RO_BIND_PATHS: &[&str] = &[
question: Is the addition of this a breaking change for what we have now? E.g., these were supposed to be default-deny but they were not prior to this change. We'd want to know so we can call this out via SDK versioning.
Oh wait, I may have read this wrong initially. It looks like you want to allow these read only paths and disallow everything else unless specified right? If so, then I think we're good here.
📖 Description
Change the Bubblewrap backend's default filesystem posture from "host root mounted read-only" to deny-by-default, matching the macOS Seatbelt backend's `(deny default)` baseline.

Why

`bwrap_command::build_args` used to emit `--ro-bind / /`. That bind-mounted the entire host root read-only into every sandbox, so the caller's `$HOME/.aws/credentials`, `$HOME/.ssh/id_*`, browser cookies, etc. were readable inside the sandbox by default. The Seatbelt backend on macOS starts from `(deny default)` and only allows narrow system paths (`SYSTEM_READ_ALLOW` in `src/backends/seatbelt/common/src/profile_builder.rs`), so the two backends had a meaningful asymmetry in the confidentiality guarantees they offered. This PR closes that gap.

What changes
New baseline (`BASELINE_RO_BIND_PATHS`), mirroring seatbelt's `SYSTEM_READ_ALLOW`:

- `/bin`, `/sbin`, `/lib`, `/lib32`, `/lib64`, `/libx32` (symlinks under `/usr` on merged-usr distros; bwrap follows source-side symlinks so both real-dir and symlinked distros work).
- `/usr` subpaths: `/usr/bin`, `/usr/sbin`, `/usr/lib`, `/usr/lib32`, `/usr/lib64`, `/usr/libexec`, `/usr/share`. Deliberately not `/usr` wholesale, so `/usr/local` is not implicitly exposed.
- `/etc`, whole, like seatbelt's `/private/etc`. Files with restrictive perms (`/etc/shadow`, `/etc/sudoers`, `/etc/ssh/ssh_host_*_key`) stay unreadable to a non-root caller because user-namespace UID mapping does not bypass kernel DAC.
- `/run/systemd/resolve`, `/run/NetworkManager`, `/run/resolvconf`: needed when `/etc/resolv.conf` is a symlink. Narrow subpaths so `/run/user/<uid>` (D-Bus session, keyring, ssh-agent sockets) stays hidden.

All emitted via `--ro-bind-try` so missing paths are silently skipped (e.g. `/lib32` on x86_64-only systems, `/run/systemd/resolve` on hosts without systemd-resolved).

What disappears from sandbox by default
`$HOME`, `/root`, `/home/*`, `/opt`, `/srv`, `/mnt`, `/media`, `/var`, `/sys`, `/usr/local`, `/run/user/<uid>`, `/run/dbus`. Callers who legitimately need any of these must list them under `readonlyPaths` or `readwritePaths`.

What's preserved

- `readwritePaths` / `readonlyPaths` / `deniedPaths` semantics: unchanged.
- `--unshare-*` flags, network policy handling, proxy env-var injection, working-dir, env clearing: unchanged.
- `--dev /dev` / `--proc /proc` / `--tmpfs /tmp` overlay: unchanged.

Drive-by build fix
The second commit (`fix(nanvix): compile as build-dep from non-Linux/Windows hosts`) adds empty/zero fallbacks for `REQUIRED_BINARIES` and `NANVIXD_BINARY` so `nanvix_common` compiles on macOS hosts when pulled in as a `[build-dependency]` of `lxc`/`wxc` during cross-compile. Zero runtime impact on supported platforms: the consuming build scripts already gate the surrounding logic behind `cfg(target_os = "linux"/"windows")` and `feature = "microvm"`. Separated out so it can be reviewed (or split into its own PR) independently.

Breaking change for users

This is a behavior change. Configs that implicitly relied on `$HOME` (or `/opt`, `/var`, `/usr/local`, …) being readable will start failing. The migration is to list the directory in `readonlyPaths`:

`{ "filesystem": { "readonlyPaths": ["/home/alice/project", "/usr/local"] } }`

Documented in the updated "How It Works → Deny-by-default filesystem" and "Limitations" sections of `docs/bwrap-support/bubblewrap-backend.md`.

🔗 References
No tracking issue — this came out of a direct comparison between the seatbelt and bwrap baselines while reviewing the two unprivileged backends.
🔍 Validation
Unit tests (`cargo test -p bwrap_common` from `src/`): 21/21 pass, including 5 new tests covering the new contract:

- `baseline_does_not_bind_mount_host_root`: regression test for the old `--ro-bind / /` default.
- `baseline_emits_required_ro_bind_try_paths`: `/bin`, `/sbin`, `/lib`, `/lib64`, `/usr/bin`, `/usr/lib`, `/usr/share`, `/etc` all emitted.
- `baseline_does_not_expose_usr_local`: no `--ro-bind /usr /usr` and no explicit `/usr/local` entry.
- `baseline_excludes_confidential_paths`: no `/home`, `/root`, `/opt`, `/srv`, `/var`, `/sys`, `/run/user`, `/run/dbus` bind-mounts.
- `baseline_includes_dns_stub_resolver_dirs`: all three DNS dirs emitted via `--ro-bind-try`.
- `baseline_mounts_precede_policy_mounts`: policy mounts can still shadow baseline.

Plus updated `filesystem_policy_produces_correct_mounts` to match the new contract (a bare `--ro-bind /data /data` is now unambiguously the policy mount).

Lint / format: `cargo clippy -p bwrap_common --all-targets -- -D warnings` clean, `cargo fmt --all -- --check` clean.

Linux VM verification: cross-compiled `lxc-exec` for `aarch64-unknown-linux-gnu` and ran a 6-config smoke suite on a Linux VM (see `src/target/vm-test-bundle/` locally; gitignored). The suite plants `TOP_SECRET=hunter2` in `/home/SENTINEL_DO_NOT_LEAK.txt` on the host and verifies the secret does not appear in sandbox output without an explicit `readonlyPaths: ["/home"]`, then verifies the opt-in does expose it. Also covers `/opt` / `/var` / `/sys` / `/root` / `/usr/local` being hidden, DNS resolution working with network allowed, and `/etc/shadow` staying unreadable via DAC. (Will paste the run output as a PR comment once the VM run is complete.)

✅ Checklist
📋 Issue Type