Copilot CLI bug: runaway MCP server spawning (IDE lock-file watcher re-init loop) #3701

@wibjorn

Describe the bug

Environment

  • Copilot CLI 1.0.60 (WinGet GitHub.Copilot)
  • Node v24.15.0, npm 11.12.1
  • Windows 11 Enterprise 10.0.26200, PowerShell 7 / Windows Terminal
  • VS Code running with multiple workspaces open (IDE integration active)
  • Project MCP server playwright in repo-root .mcp.json: npx @playwright/mcp@latest

Summary

When a CLI session is IDE-connected and a project .mcp.json declares a stdio MCP server that is slow to complete its handshake, the CLI spawns the server dozens of times per second and never reaps prior attempts. One ~22-min session logged 1,637 spawn attempts; another reached 156 simultaneous live @playwright/mcp trees (plus orphaned headless browsers) in ~46 s, pushing CPU to ~90%. Two triggers produce the same runaway:

A. Reconnect-on-failure with no backoff. If the server exits immediately (here, npx failing with npm E401 while resolving @latest from a private feed with an expired token), the CLI restarts the client in a tight loop:

[ERROR] MCP transport for playwright closed
[ERROR] Failed to start MCP client for playwright: McpError: MCP error -32000: Connection closed
[ERROR] Starting MCP client for playwright ...   // ~30-90 ms later, repeat
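
For contrast with the tight loop above, a minimal sketch of a bounded reconnect with exponential backoff, jitter, and a failure cap. This is not the CLI's code: `backoff_delay`, `reconnect`, `start_client`, and every default value are hypothetical names and numbers chosen for illustration.

```python
import random
import time


def backoff_delay(attempt, base=0.5, factor=2.0, cap=30.0, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-based).

    The ceiling grows exponentially and is clamped at `cap`; "full jitter"
    then picks a point in [0, ceiling) so retries do not synchronize.
    """
    ceiling = min(cap, base * factor ** attempt)
    return rng() * ceiling


def reconnect(start_client, max_attempts=6, sleep=time.sleep, **policy):
    """Call start_client() until it succeeds or max_attempts is reached."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            return start_client()
        except ConnectionError as exc:  # stand-in for "transport closed"
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt, **policy))
    # Failure cap reached: surface the error instead of looping forever.
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

With these assumed defaults, six failed attempts spread over up to roughly 15 s instead of repeating every 30-90 ms.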

B. IDE lock-file watcher re-initialization (primary defect). With auth fixed, the storm continued. VS Code updates lock files on a heartbeat; the CLI re-reads workspace MCP config and re-initializes servers on every change:

Starting IDE lock file watcher for workspace: ...\<repo>
Loaded workspace MCP config from ...\.mcp.json   x188
Starting MCP client for playwright ...           x188

Session telemetry: connected=3, registered=0, number of tools=0 — not one spawned server finished its handshake before the next re-init fired, so every fire spawned another tree.
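
A rough sketch of the debounce-and-diff behavior that would avoid this: collapse a burst of lock-file events into one reload, and re-initialize only servers whose config entry actually differs. `DebouncedReload`, `changed_servers`, and the one-second quiet period are hypothetical; timestamps are passed in explicitly so the logic is easy to follow and test.

```python
def changed_servers(old, new):
    """Names whose entry was added, removed, or altered between two server maps."""
    return sorted(n for n in old.keys() | new.keys() if old.get(n) != new.get(n))


class DebouncedReload:
    def __init__(self, quiet_seconds=1.0):
        self.quiet_seconds = quiet_seconds
        self.applied = {}       # last server map that was acted upon
        self.pending = None     # newest server map seen but not yet applied
        self.last_event = None  # time of the most recent watcher event

    def on_change(self, servers, now):
        """Record a watcher event; nothing is re-initialized here."""
        self.pending = servers
        self.last_event = now

    def poll(self, now):
        """Return server names to re-initialize, or [] while events are still arriving."""
        if self.pending is None or now - self.last_event < self.quiet_seconds:
            return []
        names = changed_servers(self.applied, self.pending)
        self.applied, self.pending = self.pending, None
        return names
```

Under this scheme a heartbeat that rewrites the lock file without changing `.mcp.json` yields an empty list, so no server is restarted.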

Isolating contrast: a lightweight server in the same setup (a small custom MCP, also via npx) does not storm — it registers once, then re-inits see it connected and skip re-spawning. The trigger is fast repeated re-init that does not dedupe a server already running / mid-handshake.
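
The missing dedupe could look roughly like the sketch below: key each server by name plus a hash of its resolved config, and record the "connecting" state before spawning so that a re-init arriving mid-handshake is a no-op. `ServerRegistry`, `config_key`, and the `spawn` callback are hypothetical, not the CLI's internals.

```python
import hashlib
import json


def config_key(name, config):
    """Stable key: server name plus SHA-256 of the canonicalized config."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return name, hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ServerRegistry:
    def __init__(self, spawn):
        self._spawn = spawn  # callable(name, config) that launches the server
        self._state = {}     # key -> "connecting" | "connected"

    def ensure(self, name, config):
        """Spawn only if no identical server is running or mid-handshake."""
        key = config_key(name, config)
        if key in self._state:
            return False
        self._state[key] = "connecting"  # recorded before the spawn, not after
        self._spawn(name, config)
        return True

    def mark_connected(self, name, config):
        self._state[config_key(name, config)] = "connected"

    def forget(self, name, config):
        """Call after teardown or a failed handshake so a later ensure() may respawn."""
        self._state.pop(config_key(name, config), None)
```

With this in place, 188 re-inits against an unchanged config would produce one spawn regardless of how long the handshake takes.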

Windows amplifier: spawned trees aren't terminated as a group. npx expands to cmd -> node npx-cli -> cmd -> node server.js (-> headless browser); killing the top process leaves the deeper node server and browser alive, so every abandoned attempt leaks.
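
A minimal sketch of tree-wide teardown, with one substitution stated plainly: it uses `taskkill /T /F` on Windows rather than a job object. `taskkill /T` walks the tree from the given PID, so it only helps while the root process is still alive; a job object would also cover the case where the root has already exited. The function names are hypothetical.

```python
import os
import signal
import subprocess
import sys


def kill_tree_argv(pid):
    """Windows command line: /T includes child processes, /F forces termination."""
    return ["taskkill", "/PID", str(pid), "/T", "/F"]


def spawn_server(argv):
    """Launch a server so that its whole tree can be addressed later."""
    if sys.platform == "win32":
        return subprocess.Popen(argv)
    # POSIX: a new session makes the child the leader of its own process group.
    return subprocess.Popen(argv, start_new_session=True)


def terminate_tree(proc):
    """Terminate the server and its descendants, not just the top process."""
    if sys.platform == "win32":
        subprocess.run(kill_tree_argv(proc.pid), check=False, capture_output=True)
        return
    try:
        # The group id equals proc.pid because of start_new_session=True.
        os.killpg(proc.pid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # the group is already gone
```

Calling `terminate_tree` on disconnect, supersede, and exit is the point; which primitive does the killing is secondary.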

Affected version

GitHub Copilot CLI 1.0.60.

Steps to reproduce the behavior

  1. Open the repo in VS Code (IDE integration on).
  2. Repo-root .mcp.json: {"mcpServers":{"playwright":{"command":"npx","args":["@playwright/mcp@latest"]}}}.
  3. Start the CLI in that workspace; issue any turn that initializes MCP.
  4. Within seconds, dozens to hundreds of node/npx (+ headless browser) processes accumulate and CPU spikes. (Optionally make the first launch fail, e.g. with invalid npm auth, to also trigger variant A.)

Expected behavior

  • One MCP server instance per session, reused across turns.
  • Re-init only when the server's effective config changes (diff), debounced — not on every IDE lock-file heartbeat.
  • A server already running or mid-handshake is not re-spawned (dedupe by name + config hash).
  • Reconnect attempts bounded with exponential backoff.
  • On disconnect/supersede/exit, the whole spawned process group is terminated (Windows job object / kill-tree).

Additional context

Actual

A new server tree is spawned on every re-init and every failed reconnect, with no backoff and no teardown; prior trees (and their headless browsers) leak until the machine is saturated.

Suggested fixes (priority order)

  1. Dedupe: track servers by name + resolved-config hash; skip launch if already running/connecting.
  2. Debounce + diff IDE lock-file / workspace-sync events; re-init only changed servers.
  3. Kill-tree on Windows: launch each server in a job object so child npx/node/browser processes are reaped on teardown.
  4. Bounded reconnect with exponential backoff + jitter and a failure cap (covers variant A).

Workaround

Pre-install the server and launch directly (no npx) so it registers fast enough to be deduped:
{"command":"node","args":["<abs-path>/@playwright/mcp/cli.js"]}.

Labels: area:mcp, area:platform-windows
