Redundant MetricsCapture in trace_call produces orphan metrics with incomplete resource labels #16173

Environment details

OS type and version: macOS / Linux
Python version: 3.13
google-cloud-spanner version: 3.63.0 (current main)

Description

Every Spanner operation that goes through trace_call() produces orphan OpenTelemetry metric data points with incomplete resource labels (missing project_id and instance_id). These orphan data points persist for the process lifetime due to cumulative aggregation and are re-exported to Cloud Monitoring every 60 seconds, which rejects them with:

INVALID_ARGUMENT: One or more TimeSeries could not be written:
timeSeries[...]: the set of resource labels is incomplete, missing (instance_id)

Root cause

trace_call() in _opentelemetry_tracing.py wraps every operation with a bare MetricsCapture() (no resource_info). Meanwhile, every caller of trace_call already provides its own MetricsCapture(self._resource_info) with correct labels.

When Python evaluates the statement "with trace_call(...) as span, MetricsCapture(self._resource_info):", it enters the two context managers left to right, so two separate MetricsTracer instances are created:

tracer_A (from trace_call's internal MetricsCapture()): has instance_config, location, client_hash, client_uid, client_name from the factory, but never receives project_id or instance_id
tracer_B (from the caller's MetricsCapture(resource_info)): has correct labels, overwrites tracer_A in the context var

On exit, the context managers unwind in reverse order: tracer_B records correct metrics first, then tracer_A records metrics with incomplete labels. The SpannerMetricsTracerFactory never has project_id or instance_id in its _client_attributes; those labels are set only per tracer, via resource_info or the MetricsInterceptor. tracer_A therefore starts without them and is never populated, because the MetricsInterceptor only touches the tracer currently held in the context var (tracer_B).
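The ordering described above can be reproduced without Spanner at all. The following is a minimal sketch, not the library's code: FakeTracer, FakeMetricsCapture, fake_trace_call, fake_interceptor, and the label values are hypothetical stand-ins that only mimic the behavior described in this issue (a context var holding the current tracer, and each capture recording its labels on exit).

```python
import contextvars
from contextlib import contextmanager

# Hypothetical stand-in for the context variable that holds the current tracer.
_current_tracer = contextvars.ContextVar("current_tracer", default=None)

recorded = []  # label sets, in the order they are "recorded" on exit


class FakeTracer:
    def __init__(self, labels):
        self.labels = dict(labels)


class FakeMetricsCapture:
    """Simplified stand-in for MetricsCapture; not the library's implementation."""

    def __init__(self, resource_info=None):
        self._resource_info = resource_info or {}

    def __enter__(self):
        # Factory-level labels only; project_id/instance_id come from resource_info.
        self.tracer = FakeTracer({"client_name": "demo", **self._resource_info})
        _current_tracer.set(self.tracer)  # overwrites any tracer already set
        return self

    def __exit__(self, exc_type, exc, tb):
        recorded.append(self.tracer.labels)
        return False


@contextmanager
def fake_trace_call():
    # Mirrors the bare MetricsCapture() created inside trace_call (tracer_A).
    with FakeMetricsCapture():
        yield "span"


def fake_interceptor():
    # Like the interceptor described above: only touches the current tracer.
    _current_tracer.get().labels["interceptor_seen"] = True


resource_info = {"project_id": "p", "instance_id": "i"}
with fake_trace_call() as span, FakeMetricsCapture(resource_info):  # tracer_B
    fake_interceptor()

for labels in recorded:
    print(sorted(labels))
```

In this sketch the first recorded label set belongs to tracer_B and includes project_id and instance_id; the second belongs to tracer_A and has neither, because nothing ever reaches it after tracer_B replaces it in the context var.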

With OpenTelemetry's cumulative aggregation, once these orphan aggregation buckets are created, they persist for the process lifetime and are re-exported every 60 seconds.
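To illustrate why a single orphan recording keeps failing on every export, here is a toy model of cumulative aggregation. It is a sketch under stated assumptions, not OpenTelemetry's implementation: CumulativeCounter, REQUIRED_LABELS, and the label values are hypothetical, and the export cycle is simulated with a plain loop rather than a 60-second timer.

```python
from collections import defaultdict

# Assumed set of labels the backend requires on every time series.
REQUIRED_LABELS = {"project_id", "instance_id"}


class CumulativeCounter:
    """Toy cumulative aggregator: one bucket per distinct label set, never reset."""

    def __init__(self):
        self._buckets = defaultdict(int)

    def add(self, value, labels):
        self._buckets[frozenset(labels.items())] += value

    def collect(self):
        # Every bucket ever created is reported on every collection.
        return [(dict(key), total) for key, total in self._buckets.items()]


counter = CumulativeCounter()
# One orphan recording (no project_id/instance_id) and one valid recording.
counter.add(1, {"client_name": "demo"})
counter.add(1, {"client_name": "demo", "project_id": "p", "instance_id": "i"})

rejected_per_cycle = []
for _ in range(3):  # three simulated export cycles, no new recordings
    points = counter.collect()
    rejected = [labels for labels, _ in points
                if not REQUIRED_LABELS.issubset(labels)]
    rejected_per_cycle.append(len(rejected))

print(rejected_per_cycle)
```

Even though the orphan label set is recorded only once, its bucket is collected in every cycle, so the same incomplete series is offered to the exporter again each time.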

History

PR python-spanner#1302 ("feat: Add Attempt, Operation and GFE Metrics") introduced the metrics system. All MetricsCapture() instances were bare, including the one in trace_call. The design relied on MetricsInterceptor to populate labels during gRPC calls.
PR python-spanner#1509 ("feat: implement native asyncio support via Cross-Sync") added the _resource_info property and changed all caller sites from MetricsCapture() to MetricsCapture(self._resource_info) for eager label propagation. However, the bare MetricsCapture() inside trace_call was not removed, making it redundant and harmful.

Impact

Affects every Spanner operation (~27 code paths) on every invocation
Creates persistent orphan metric aggregation buckets
Produces repeated INVALID_ARGUMENT error logs every 60 seconds
Wastes CPU/network on exporting invalid TimeSeries
Application functionality is unaffected; valid metrics from the caller's MetricsCapture still work

Steps to reproduce

Create a spanner.Client() with metrics enabled (default)
Perform any Spanner operation (e.g., session.create(), snapshot.execute_sql())
Observe INVALID_ARGUMENT errors logged from the metrics exporter every 60 seconds

Suggested fix

Remove the bare MetricsCapture() from trace_call. It is redundant, since every caller already provides its own. See PR googleapis/python-spanner#1522.
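The intended effect of the fix can be sketched with simplified stand-ins. This is not the content of the linked PR: FakeMetricsCapture and fake_trace_call_fixed are hypothetical names, and the sketch assumes only that a capture records its labels once on exit.

```python
from contextlib import contextmanager

recorded = []  # label sets recorded on exit, in order


class FakeMetricsCapture:
    """Hypothetical stand-in for MetricsCapture; records its labels on exit."""

    def __init__(self, resource_info=None):
        self.labels = dict(resource_info or {})

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        recorded.append(self.labels)
        return False


@contextmanager
def fake_trace_call_fixed():
    # No internal FakeMetricsCapture(): metrics are left entirely to the caller.
    yield "span"


resource_info = {"project_id": "p", "instance_id": "i"}
with fake_trace_call_fixed() as span, FakeMetricsCapture(resource_info):
    pass

print(recorded)
```

With the internal capture gone, only the caller's capture records anything, so every recorded label set carries project_id and instance_id and no orphan bucket is created.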
