feat(core): activity events for recovery, metadata corruption, agent-report by illegalcall · Pull Request #1692 · ComposioHQ/agent-orchestrator

illegalcall · May 7, 2026

Summary

Wires activity events into three forensic-critical paths so RCA can reconstruct what happened after the fact (sub-issue of #1511):

Recovery subsystem — recovery.session_failed (MUST) per failed session, recovery.action_failed (SHOULD) on recoverSession outer catch. Adds "recovery" to ActivityEventSource.
Metadata corruption — metadata.corrupt_detected (MUST) when mutateMetadata side-renames a corrupt session-metadata file to .corrupt-{ts}. Includes data.renamedTo and a 200-char data.contentSample per B11.
Agent-report apply path — api.agent_report.transition_rejected (SHOULD) and api.agent_report.session_not_found (COULD).

After this lands, RCA can answer:

Did ao recover actually fix anything? Which sessions did it skip?
Has AO ever silently renamed a corrupt metadata file? When?
Did applyAgentReport reject a transition (and why)?

Closes #1660.

Invariants preserved

B22: recovery events fire per session, not per probe step (one recovery.session_failed per failed session in runRecovery's loop).
B11: metadata.corrupt_detected truncates data.contentSample to 200 chars — full file content would be dropped by the 16KB sanitizer cap.
No raw error stacks in data — errorMessage only.

Out of scope (per issue)

validator.ts per-probe catches (deliberate uncertainty handling).
cleanupSession runtime/workspace destroy silent catches (best-effort).
deleteMetadata unlink catch (expected race).

Test plan

metadata.test.ts regression: emits metadata.corrupt_detected with renamedTo, truncates contentSample to 200 chars, no emit for healthy JSON.
recovery-manager.test.ts regression: emits one recovery.session_failed per failed session, no emit when all succeed.
pnpm --filter @aoagents/ao-core typecheck passes.
All existing core tests pass (the two pre-existing OpenCode session-manager timeouts on main are unrelated).

🤖 Generated with Claude Code

…report Wires activity events into three forensic-critical paths so RCA can reconstruct what happened after the fact: 1. recovery subsystem - recovery.session_failed (MUST) per failed session in runRecovery - recovery.action_failed (SHOULD) on recoverSession outer catch Adds "recovery" to ActivityEventSource so `ao events list --source recovery` reconstructs an `ao recover` invocation timeline. 2. metadata corruption - metadata.corrupt_detected (MUST) when mutateMetadata renames a corrupt session-metadata file to .corrupt-{ts}. Includes data.renamedTo and a 200-char data.contentSample (B11) for forensic recovery. Previously only console.warn — silent overwrites had no queryable signal. 3. agent-report apply path - api.agent_report.transition_rejected (SHOULD) - api.agent_report.session_not_found (COULD) Per B22, recovery events fire per session, not per probe step. Per B11, metadata.corrupt_detected truncates contentSample to 200 chars (full file would exceed the 16KB sanitizer cap). Closes #1660

greptile-apps · May 7, 2026

Greptile Summary

This PR wires forensic activity events into three previously un-instrumented paths — the recovery subsystem, metadata-corruption detection, and the agent-report apply path — so post-incident RCA can reconstruct exactly what ao recover did, when a corrupt metadata file was silently renamed, and why an agent-report transition was rejected.

Recovery (recovery/manager.ts, recovery/actions.ts): emits recovery.session_failed once per failed session in the runRecovery loop (B22) and recovery.action_failed in the recoverSession outer catch; both events carry projectId from the assessment.
Metadata corruption (metadata.ts): emits metadata.corrupt_detected after a corrupt-JSON rename attempt, with a 200-char contentSample (B11) and a renameSucceeded flag; the summary string is built before the rename result is known, so it always reads "renamed to X" even when the rename failed.
Agent-report (agent-report.ts): emits api.agent_report.session_not_found and api.agent_report.transition_rejected; unlike every other new event, neither includes a projectId, making project-scoped RCA queries incomplete for these paths.

Confidence Score: 4/5

Safe to merge; the new event calls are best-effort (never throw), the lock behaviour is correct, and the two small inconsistencies are cosmetic rather than functional.

The core instrumentation is well-structured and consistent with the existing activity-events infrastructure. Two minor issues exist: the metadata.corrupt_detected summary string incorrectly claims a rename succeeded in failure cases, and both api.agent_report.* events omit projectId even though every other new event supplies it. Neither breaks any runtime behaviour, but they reduce the accuracy and filterability of the forensic data this PR is specifically designed to provide.

packages/core/src/agent-report.ts and packages/core/src/metadata.ts warrant a second look for the projectId gap and the misleading summary string respectively.

Important Files Changed

Filename	Overview
packages/core/src/activity-events.ts	Extends ActivityEventSource and ActivityEventKind unions with five new forensic event types and the "recovery" source; no logic changes to existing recording infrastructure.
packages/core/src/metadata.ts	Adds metadata.corrupt_detected event on corrupt-JSON rename path; summary field always claims a rename succeeded even when renameSucceeded is false.
packages/core/src/agent-report.ts	Adds api.agent_report.session_not_found and api.agent_report.transition_rejected events; both omit projectId, unlike every other new event in this PR.
packages/core/src/recovery/manager.ts	Emits recovery.session_failed once per failed session after executeAction returns {success:false}; correctly captures projectId from the assessment.
packages/core/src/recovery/actions.ts	Emits recovery.action_failed in the recoverSession outer catch; correctly extracts errorMessage before returning the failure result.
packages/core/src/tests/metadata.test.ts	Adds three new regression tests for metadata.corrupt_detected: emit on bad JSON, 200-char truncation, and no emit for healthy JSON.
packages/core/src/tests/recovery-manager.test.ts	New test file verifying recovery.session_failed fires once per failed session and is absent when all sessions succeed.

Sequence Diagram

sequenceDiagram
    participant RM as runRecovery (manager.ts)
    participant EA as executeAction (actions.ts)
    participant RS as recoverSession (actions.ts)
    participant AE as recordActivityEvent

    RM->>EA: executeAction(assessment)
    EA->>RS: recoverSession(...)
    alt recoverSession throws
        RS->>AE: recovery.action_failed
        RS-->>EA: "{success:false, error}"
    end
    EA-->>RM: RecoveryResult

    alt "result.success === false"
        RM->>AE: recovery.session_failed
    end

    participant AR as applyAgentReport (agent-report.ts)
    participant MM as mutateMetadata (metadata.ts)

    AR->>MM: readMetadataRaw
    alt session not found
        AR->>AE: api.agent_report.session_not_found
        AR-->>AR: throw Error
    end
    AR->>MM: mutateMetadata(updater)
    MM->>MM: validateAgentReportTransition
    alt transition invalid
        MM->>AE: api.agent_report.transition_rejected
        MM-->>AR: throw Error
    end

    AR->>MM: mutateMetadata (corrupt path)
    alt JSON parse fails
        MM->>MM: renameSync to .corrupt-ts
        MM->>AE: metadata.corrupt_detected
    end

Prompt To Fix All With AI

Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
packages/core/src/metadata.ts:378-384
The `summary` field is constructed before `renamed` is set, so it always reads "renamed to X" even when the `renameSync` call fails (`renameSucceeded: false`). Since this event is specifically for RCA forensics, a summary that contradicts `data.renameSucceeded` can mislead engineers who glance at the summary without inspecting the full payload.

```suggestion
          recordActivityEvent({
            projectId: inferredProjectId || undefined,
            sessionId,
            source: "session-manager",
            kind: "metadata.corrupt_detected",
            level: "error",
            summary: renamed
              ? `Corrupt metadata for session ${sessionId} renamed to ${basename(corruptPath)}`
              : `Corrupt metadata for session ${sessionId} detected (rename failed)`,
```

### Issue 2 of 3
packages/core/src/agent-report.ts:387-392
Both `api.agent_report.session_not_found` and `api.agent_report.transition_rejected` are emitted without a `projectId`, but every other new event in this PR (`recovery.*`, `metadata.corrupt_detected`) includes one. Since `applyAgentReport` receives `dataDir`, the same `basename(dirname(dataDir))` inference used in `metadata.ts` can populate `projectId` here, making these events filterable in project-scoped RCA queries.

```suggestion
  const raw = readMetadataRaw(dataDir, sessionId);
  const inferredProjectId = basename(dirname(dataDir)) || undefined;
  if (!raw) {
    recordActivityEvent({
      projectId: inferredProjectId,
      sessionId,
      source: "api",
      kind: "api.agent_report.session_not_found",
```

### Issue 3 of 3
packages/core/src/agent-report.ts:470-473
The `transition_rejected` event is also missing `projectId`, which was just inferred above. Passing it here keeps the two `api.agent_report.*` events consistent and filterable by project.

```suggestion
      recordActivityEvent({
        projectId: inferredProjectId,
        sessionId,
        source: "api",
        kind: "api.agent_report.transition_rejected",
```

_{Reviews (1): Last reviewed commit: "feat(core): activity events for recovery..." | Re-trigger Greptile}

greptile-apps · May 7, 2026

+          recordActivityEvent({
+            projectId: inferredProjectId || undefined,
+            sessionId,
+            source: "session-manager",
+            kind: "metadata.corrupt_detected",
+            level: "error",
+            summary: `Corrupt metadata for session ${sessionId} renamed to ${basename(corruptPath)}`,


The summary field is constructed before renamed is set, so it always reads "renamed to X" even when the renameSync call fails (renameSucceeded: false). Since this event is specifically for RCA forensics, a summary that contradicts data.renameSucceeded can mislead engineers who glance at the summary without inspecting the full payload.

Suggested change

recordActivityEvent({

projectId: inferredProjectId || undefined,

sessionId,

source: "session-manager",

kind: "metadata.corrupt_detected",

level: "error",

summary: `Corrupt metadata for session ${sessionId} renamed to ${basename(corruptPath)}`,

recordActivityEvent({

projectId: inferredProjectId || undefined,

sessionId,

source: "session-manager",

kind: "metadata.corrupt_detected",

level: "error",

summary: renamed

? `Corrupt metadata for session ${sessionId} renamed to ${basename(corruptPath)}`

: `Corrupt metadata for session ${sessionId} detected (rename failed)`,

Prompt To Fix With AI

This is a comment left during a code review. Path: packages/core/src/metadata.ts Line: 378-384 Comment: The `summary` field is constructed before `renamed` is set, so it always reads "renamed to X" even when the `renameSync` call fails (`renameSucceeded: false`). Since this event is specifically for RCA forensics, a summary that contradicts `data.renameSucceeded` can mislead engineers who glance at the summary without inspecting the full payload. ```suggestion recordActivityEvent({ projectId: inferredProjectId || undefined, sessionId, source: "session-manager", kind: "metadata.corrupt_detected", level: "error", summary: renamed ? `Corrupt metadata for session ${sessionId} renamed to ${basename(corruptPath)}` : `Corrupt metadata for session ${sessionId} detected (rename failed)`, ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · May 7, 2026

  const raw = readMetadataRaw(dataDir, sessionId);
  if (!raw) {
+    recordActivityEvent({
+      sessionId,
+      source: "api",
+      kind: "api.agent_report.session_not_found",


Both api.agent_report.session_not_found and api.agent_report.transition_rejected are emitted without a projectId, but every other new event in this PR (recovery.*, metadata.corrupt_detected) includes one. Since applyAgentReport receives dataDir, the same basename(dirname(dataDir)) inference used in metadata.ts can populate projectId here, making these events filterable in project-scoped RCA queries.

Suggested change

const raw = readMetadataRaw(dataDir, sessionId);

if (!raw) {

recordActivityEvent({

sessionId,

source: "api",

kind: "api.agent_report.session_not_found",

const raw = readMetadataRaw(dataDir, sessionId);

const inferredProjectId = basename(dirname(dataDir)) || undefined;

if (!raw) {

recordActivityEvent({

projectId: inferredProjectId,

sessionId,

source: "api",

kind: "api.agent_report.session_not_found",

Prompt To Fix With AI

This is a comment left during a code review. Path: packages/core/src/agent-report.ts Line: 387-392 Comment: Both `api.agent_report.session_not_found` and `api.agent_report.transition_rejected` are emitted without a `projectId`, but every other new event in this PR (`recovery.*`, `metadata.corrupt_detected`) includes one. Since `applyAgentReport` receives `dataDir`, the same `basename(dirname(dataDir))` inference used in `metadata.ts` can populate `projectId` here, making these events filterable in project-scoped RCA queries. ```suggestion const raw = readMetadataRaw(dataDir, sessionId); const inferredProjectId = basename(dirname(dataDir)) || undefined; if (!raw) { recordActivityEvent({ projectId: inferredProjectId, sessionId, source: "api", kind: "api.agent_report.session_not_found", ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · May 7, 2026

+      recordActivityEvent({
+        sessionId,
+        source: "api",
+        kind: "api.agent_report.transition_rejected",


The transition_rejected event is also missing projectId, which was just inferred above. Passing it here keeps the two api.agent_report.* events consistent and filterable by project.

Suggested change

recordActivityEvent({

sessionId,

source: "api",

kind: "api.agent_report.transition_rejected",

recordActivityEvent({

projectId: inferredProjectId,

sessionId,

source: "api",

kind: "api.agent_report.transition_rejected",

Prompt To Fix With AI

This is a comment left during a code review. Path: packages/core/src/agent-report.ts Line: 470-473 Comment: The `transition_rejected` event is also missing `projectId`, which was just inferred above. Passing it here keeps the two `api.agent_report.*` events consistent and filterable by project. ```suggestion recordActivityEvent({ projectId: inferredProjectId, sessionId, source: "api", kind: "api.agent_report.transition_rejected", ``` How can I resolve this? If you propose a fix, please make it concise.

illegalcall · May 8, 2026

@harshitsinghbhandari — flagging you as a likely reviewer/owner here based on git blame of the files this PR touches:

packages/core/src/agent-report.ts — sole-authored by you (~603 lines)
packages/core/src/metadata.ts — top contributor (~326 lines), most-recent meaningful commits are yours
packages/core/src/recovery/actions.ts — recently touched by you in refactor(core): storage redesign — projectId-based paths, JSON metadata #1466

recovery/manager.ts is originally @harsh-batheja's (#362), but you've been the active maintainer across the other three files. Would you be able to take this one over, or at least review?

greptile-apps Bot reviewed May 7, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(core): activity events for recovery, metadata corruption, agent-report#1692

feat(core): activity events for recovery, metadata corruption, agent-report#1692
illegalcall wants to merge 1 commit intomainComposioHQ/agent-orchestrator:mainfrom
feat/issue-1660ComposioHQ/agent-orchestrator:feat/issue-1660Copy head branch name to clipboard

illegalcall commented May 7, 2026

Uh oh!

greptile-apps Bot commented May 7, 2026

Important Files Changed

Uh oh!

greptile-apps Bot May 7, 2026

Uh oh!

greptile-apps Bot May 7, 2026

Uh oh!

greptile-apps Bot May 7, 2026

Uh oh!

illegalcall commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Search code, repositories, users, issues, pull requests...

Conversation

illegalcall commented May 7, 2026

Summary

Invariants preserved

Out of scope (per issue)

Test plan

Uh oh!

greptile-apps Bot commented May 7, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

illegalcall commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant