mcp: server modes, elicitation gating, audit subsystem (PR2)#652
mcp: server modes, elicitation gating, audit subsystem (PR2)#652
Conversation
The new mode-contract scenarios (MCP HTTP Mode Read Only / Delete Safe / Full Access / Audit Basic / Audit Disabled) all failed in CI with rc=2. Root cause: I synthesised SQL like select name from google.compute.networks where project = 'stackql-demo' delete from google.compute.firewalls where project = 'p' and firewall = 'f' against the test registry. google.compute.networks is registered as metadata (used by describe_resource / list_methods in earlier scenarios) but is not mocked as an executable SELECT target. The synthetic DELETE values 'p' / 'f' don't match the canonical mock either. The bundled mcp_client panics on unknown-target query errors with exit code 2. Switch all five scenarios to the canonical mocked SQL already used by the working MCP HTTPS Server Query / Exec Query Canonical scenarios: select name, id from google.storage.buckets where project = 'stackql-demo' delete from google.compute.firewalls where project = 'mutable-project' and firewall = 'deletable-firewall' The MCP HTTP Mode Safe Refuses Mutations Without Elicitation scenario already happened to pass (it expected rc != 0, which the synthetic SQL also produced) but the same SQL substitution makes its intent clearer. Also: - Clean up six empty stackql_mcp_server_*.log files inadvertently committed in the PR2 squash. These were left by unit tests that exercised audit-on paths before connectInProcess gained its disable- audit default. - Add stackql_mcp_server_*.log to .gitignore so future stray audit files don't get tracked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MCP HTTP Audit Basic Records Tool Calls failed on Windows CI with
rc=1 on the first SELECT against the audit-enabled 9923 server.
Root cause: the audit log path embedded in mcp.config was
`${CURDIR}/tmp/mcp-audit-9923.log`. On Windows, ${CURDIR} expands to
`D:\a\stackql\stackql\test\robot\functional`. Inside a JSON string,
the resulting `\a` `\s` `\t` etc are invalid backslash escapes, so
json.Unmarshal in cmd/mcp.go's runMCPServer fails silently and
config stays as zero-value. The server then binds the default
address 127.0.0.1:9876 instead of 9923, the client sees connection
refused, and rc=1.
Fix the embedded path to a relative filename `mcp-audit-9923.log`.
The stackql process resolves it through filepath.Abs against its cwd
(the directory robot was invoked from), so the file lands in the
repo root regardless of OS path-separator semantics. Update the
test's Get File call to use ${EXECDIR} to match.
Also:
- Add a preflight server_info call at the start of the audit test
so a future audit-server startup failure surfaces with a clear
message ("9923 server must be reachable...") instead of a generic
rc-comparison failure.
- Add mcp-audit-*.log to .gitignore so the audit file from a robot
run doesn't get accidentally committed.
The audit-disabled scenario is unaffected (its glob
stackql_mcp_server_*.log does not match mcp-audit-9923.log).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
This pkg/mcp_server/audit package should have the Sink interface and its implementations moved to a separate pkg/sink package and keep all of the mcp type stuff like formats and reasons can stay here for now. When new forms of sinks come in and we start doing stuff like file split / rotate / GC then it will be mega useful, not only for mcp
| @@ -0,0 +1,145 @@ | ||
| // Package policy classifies SQL queries and decides whether the server should |
There was a problem hiding this comment.
Please condense the separate QueryClass and Descision objects as return values of a composite Policy interface. The construction of same can be handled by a factory function at the client site or a New... constructor here, latter provided that only inherent types and no imports required
general-kroll-4-life
left a comment
There was a problem hiding this comment.
Overall pretty good. A couple of points to change
| // "stackql_mcp_server_<timestamp>.log") without bringing the format | ||
| // string into the sink package. When nil, a generic | ||
| // "sink_<RFC3339-utc-second>.log" filename is used. | ||
| DefaultFilename func(time.Time) string `json:"-" yaml:"-"` |
There was a problem hiding this comment.
force the client to at least specify the directory for the file sink. If the mcp server defaults this to cwd for now thats fine.
general-kroll-4-life
left a comment
There was a problem hiding this comment.
found one more thing that will do then
Summary
PR2 of the two-PR MCP server series. PR1 trimmed the tool surface to 11 + 1 prompt and shipped a single
read_only: true/falseflag. PR2 replaces that flag with a four-mode safety contract, wires MCP elicitation for approval-required modes, and adds an audit subsystem.IsReadOnly:read_only,safe(new default),delete_safe,full_access. Verb-classified gate (SELECT/INSERT|UPDATE|REPLACE|MERGE|UPSERT/DELETE/EXEC) decides allow / refuse / needs-approval per mode.elicitation/createif the client advertised the capability; otherwise refuses with a message pointing atfull_access.Sinkinterface; file sink ships with lumberjack rotation. Audit is on by default. Threefailure_modeoptions:strict(default),strict_mutations,best_effort.classify → decide → (refuse / elicit / allow) → handler → audit. Replaces PR1's per-handlerisReadOnlyConfigchecks.server_infoexposesmode(new) andis_read_only(back-compat: true iffmode == "read_only").read_only: trueJSON/YAML key still parses and maps tomode: read_only. When both are set,modewins.Breaking changes
mode: safe= "mutations require user approval." Operators running the bundled client (which does not advertise elicitation) or any non-elicitation automation must explicitly opt intomode: full_access../stackql_mcp_server_<RFC3339-utc-second>.logby default. To preserve PR1 behaviour set"audit": {"disabled": true}inmcp.config.Config.Server.IsReadOnly *boolremoved in favour ofConfig.Server.Mode string. Legacy wire form preserved.NewStackqlMCPBackendService,NewStackqlMCPReverseProxyService) no longer takeisReadOnly bool- the value is derived from the mode string carried byserverBuildInfo.Architectural invariant
pkg/mcp_server/...still imports zero symbols frominternal/...orany-sdk. Verified by grep. The newpolicyandauditpackages are pure: primitive types only, no SDK or internal dependencies.The
serverBuildInfodata carrier ininternal/stackql/mcpbackendremains an unexported interface backed by an unexported struct with constructor-only mutation, per the AGENTS.md data-carrier rule.What landed
New packages
QueryClass,ClassifyQuery,Decision,GateDecision, mode string constants. Pure functions.Event(flat, primitive-typed DTO),Sinkinterface,NopSink,NewFileSink(lumberjack-backed, fsync per record, 0600 file perms).New / changed files
addToolWithGate; init audit sink at server start; close it at shutdown.Mode+AuditConfigfields; legacyread_onlyshim via customUnmarshalJSON/UnmarshalYAML;Validate()rejects unknown modes + failure_modes.Modeadded toServerInfoOutputandServerInfoDTO;ReadOnlyretained for back-compat.serverBuildInfo;ServerInfo()returnsModeand derivesReadOnly.cfg.GetMode()), threads throughNewServerBuildInfo.mode: full_access+audit.disabled: trueso PR1 scenarios keep passing.Go module
gopkg.in/natefinch/lumberjack.v2for log rotation.Test plan
go build ./...clean.go vet ./...clean.golangci-lint run ./pkg/mcp_server/... ./internal/stackql/mcpbackend/... ./internal/stackql/cmd/...→ 0 issues.go test --tags sqlite_stackql -count=1 ./...→ all packages pass.New unit tests
TestMode_ReadOnly_RefusesAllMutationsAndLifecycleTestMode_Safe_RefusesMutationsWithoutElicitationTestMode_DeleteSafe_AllowsCreateRefusesDeleteAndLifecycleTestMode_FullAccess_AllowsEverythingTestMode_Safe_ElicitationAcceptProceeds(usesClientOptions.ElicitationHandler)TestMode_Safe_ElicitationDeclineRefusesTestMode_Safe_ElicitationCancelRefusesTestMode_DeleteSafe_ElicitationAcceptAllowsDeleteTestServerInfo_SurfacesModeTestAudit_RecordsAllToolCalls(end-to-end: in-process server → temp audit file → JSONL parse)read_only: falsedoesn't force a mode, unknown-mode rejected, audit.disabled parsing, failure_mode validation, empty-mode defaults to safe.Robot scenarios added
MCP HTTP Mode Read Only Refuses Mutations And Lifecycle(9920)MCP HTTP Mode Safe Refuses Mutations Without Elicitation(asserts the no-elicitation fallback)MCP HTTP Mode Delete Safe Allows Create Refuses Delete And Lifecycle(9921)MCP HTTP Mode Full Access Allows Everything(9922)MCP HTTP Audit Basic Records Tool Calls(9923 with known path)MCP HTTP Audit Disabled Writes No FileExisting
MCP HTTP Server Info Includes Versionextended to assert themodekey. ExistingMCP HTTPS Run Mutation Refused In Read Onlyupdated to assertread_only(underscore) - exercises the legacy shim on the 9916 server which still uses theread_only: truewire form.Out of scope (deferred)
ClientOptions.ElicitationHandlerand verified manually with elicitation-capable clients (Claude Desktop, Cursor, etc).Sinkinterface admits them as future additions.🤖 Generated with Claude Code