
Sending custom telemetry and metadata to frontend

Hi everyone,

I’m deploying a custom ADK agent (LlmAgent subclass) to Vertex AI Agent Engine (Reasoning Engine) and I’m running into an issue where custom telemetry events seem to be stripped or filtered out before reaching the client.

The Goal: I need to send real-time telemetry (cost per turn, energy usage, RAG source URIs) to my Nuxt 3 frontend. I implemented a “Sidecar Pattern” where I yield a final, synthetic Event containing a JSON payload after the LLM generation finishes.

The Implementation: In my _run_async_impl, I manually yield a final event after the generator loop finishes:

Python

# agent.py
import json

from google.adk.events import Event
from google.genai import types

# ...inside _run_async_impl, after the LLM generator loop finishes:
telemetry_payload = {"total_cost": 0.005, "sources": [...]}

# Synthetic "sidecar" event carrying the telemetry as a JSON text part.
sidecar_event = Event(
    invocation_id=ctx.invocation_id,
    author="system_telemetry",  # I've also tried author="model"
    content=types.Content(
        role="model",
        parts=[types.Part.from_text(text=json.dumps(telemetry_payload))],
    ),
)
yield sidecar_event

The Problem:

  1. Local Dev Works: When running locally (adk web), the event arrives perfectly.

  2. Cloud Deployment Fails: When deployed to Vertex AI, the stream closes immediately after the final LLM token. The custom sidecar event never arrives at the client.

  3. Logs Confirm Injection: My Cloud Logging stderr confirms the code is executing and injecting the event: INFO: 💉 Injecting Sidecar Telemetry: 2072 bytes

  4. Client Stream: The client receives all standard text chunks, but the stream ends without receiving the final JSON chunk.

Hypothesis: It seems the managed Agent Engine service (the A2A conversion layer) is sanitizing or filtering the output stream.

Questions:

  1. Does the Reasoning Engine enforce a strict schema that drops events with custom author tags (like system_telemetry)?

  2. If I set author="model" (which I tried), does the engine filter out events that look like “data” or don’t match the internal generation state?

  3. Is there a recommended way to pass custom metadata (like citations or cost) through the Reasoning Engine stream without it being stripped?

Environment:

  • ADK Python v1.19.0

  • Vertex AI Agent Engine (Europe-West4)

  • Frontend: Nuxt 3 consuming the streaming API

Any insights on how to bypass this sanitization or properly structure the event would be appreciated!

Hey,

Hope you’re keeping well.

In Vertex AI Agent Engine, the streaming layer enforces a schema where only recognized role/author combinations tied to the LLM turn lifecycle are forwarded to the client. Custom events injected after the final token often get dropped because the managed A2A layer closes the stream once the finish_reason is reached. To pass telemetry or metadata reliably, you’ll need to embed it in the final model message as structured text or use the tool/function_call parts, which are preserved in the stream. Another option is to emit metadata via a separate side channel, such as Cloud Pub/Sub or a REST endpoint your frontend subscribes to, rather than relying on post-turn events in the same stream. This avoids the sanitization that happens when the Agent Engine wraps and serializes the stream for delivery.
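As a rough sketch of the first option (untested, and assuming your subclass otherwise delegates to LlmAgent's default run loop), you could merge the telemetry into the final model event instead of yielding a separate event after the turn, so it travels inside the turn the engine is willing to stream. The TelemetryAgent name and the placeholder payload below are illustrative only:

Python

# Sketch: piggyback telemetry on the final model event rather than
# emitting a synthetic post-turn event.
import json

from google.adk.agents import LlmAgent
from google.genai import types


class TelemetryAgent(LlmAgent):  # stands in for your existing LlmAgent subclass
    async def _run_async_impl(self, ctx):
        async for event in super()._run_async_impl(ctx):
            if event.is_final_response() and event.content and event.content.parts:
                # Placeholder payload; compute your real cost/energy/source data here.
                telemetry_payload = {"total_cost": 0.005, "sources": []}
                event.content.parts.append(
                    types.Part.from_text(text=json.dumps(telemetry_payload))
                )
            yield event

Your Nuxt client would then need to peel the trailing JSON off the final text chunk.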

Thanks and regards,
Taz


The service drops events that do not match the allowed message structure for streamed model output. It only forwards events that represent model tokens or system events that the platform itself generates. Any event that does not fit those roles is removed before it reaches the client. Your sidecar event is valid inside local ADK runtime, but the managed runtime applies stricter rules.

The author field does not change this. Setting it to model does not bypass the filter. The engine checks whether the event belongs to the active generation step. Events outside that step are treated as metadata and are not streamed. The platform completes the generation, closes the stream, and discards anything injected after the final token.

The current public streaming path does not support sending custom metadata as a final synthetic event. The engine requires all streamed content to come from the model inference pipeline and blocks any extra user-defined messages.

The only supported options today are embedding the metadata inside the actual model output or sending it through a separate channel outside the model stream. The service does not expose a way to attach custom fields to the streamed response in the same channel that carries tokens.
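A minimal sketch of the side-channel option, assuming a hypothetical Pub/Sub topic (the project and topic names below are placeholders) that your frontend consumes through its own subscription or polling endpoint:

Python

# Sketch: publish per-turn telemetry out-of-band instead of in the model stream.
import json

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Placeholder project and topic names.
topic_path = publisher.topic_path("my-project", "agent-telemetry")


def publish_telemetry(invocation_id: str, payload: dict) -> None:
    # Key the message by invocation_id so the client can match it to the turn.
    message = {"invocation_id": invocation_id, **payload}
    publisher.publish(topic_path, data=json.dumps(message).encode("utf-8"))

The trade-off is that the client has to correlate the out-of-band message with the streamed turn, for example by invocation_id.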

Your implementation works locally because the local runtime does not enforce the production filters. The managed environment enforces them, and the sidecar pattern is removed as part of that filtering step.


I appreciate your reply! I will look at your suggestions and reply to this thread with my preferred solution and its implementation.

Thank you for your detailed reply! I will update this thread with the implementation I chose.

Damian,

Just wondering about the solution you decided on. I’m looking to implement a similar solution.

Thanks.
