Commit bcadb61

Sampling context improvements (#3847)
1 parent 7c70b9c commit bcadb61
18 files changed

+221
-151
lines changed

‎MIGRATION_GUIDE.md

Lines changed: 103 additions & 96 deletions
@@ -20,102 +20,109 @@ Looking to upgrade from Sentry SDK 2.x to 3.x? Here's a comprehensive list of wh
2020
- Redis integration: In Redis pipeline spans there is no `span["data"]["redis.commands"]` that contains a dict `{"count": 3, "first_ten": ["cmd1", "cmd2", ...]}` but instead `span["data"]["redis.commands.count"]` (containing `3`) and `span["data"]["redis.commands.first_ten"]` (containing `["cmd1", "cmd2", ...]`).
2121
- clickhouse-driver integration: The query is now available under the `db.query.text` span attribute (only if `send_default_pii` is `True`).
2222
- `sentry_sdk.init` now returns `None` instead of a context manager.
23-
- The `sampling_context` argument of `traces_sampler` now additionally contains all span attributes known at span start.
24-
- If you're using the Celery integration, the `sampling_context` argument of `traces_sampler` doesn't contain the `celery_job` dictionary anymore. Instead, the individual keys are now available as:
25-
26-
| Dictionary keys | Sampling context key |
27-
| ---------------------- | -------------------- |
28-
| `celery_job["args"]` | `celery.job.args` |
29-
| `celery_job["kwargs"]` | `celery.job.kwargs` |
30-
| `celery_job["task"]` | `celery.job.task` |
31-
32-
Note that all of these are serialized, i.e., not the original `args` and `kwargs` but rather OpenTelemetry-friendly span attributes.
33-
34-
- If you're using the AIOHTTP integration, the `sampling_context` argument of `traces_sampler` doesn't contain the `aiohttp_request` object anymore. Instead, some of the individual properties of the request are accessible, if available, as follows:
35-
36-
| Request property | Sampling context key(s) |
37-
| ---------------- | ------------------------------- |
38-
| `path` | `url.path` |
39-
| `query_string` | `url.query` |
40-
| `method` | `http.request.method` |
41-
| `host` | `server.address`, `server.port` |
42-
| `scheme` | `url.scheme` |
43-
| full URL | `url.full` |
44-
45-
- If you're using the Tornado integration, the `sampling_context` argument of `traces_sampler` doesn't contain the `tornado_request` object anymore. Instead, some of the individual properties of the request are accessible, if available, as follows:
46-
47-
| Request property | Sampling context key(s) |
48-
| ---------------- | --------------------------------------------------- |
49-
| `path` | `url.path` |
50-
| `query` | `url.query` |
51-
| `protocol` | `url.scheme` |
52-
| `method` | `http.request.method` |
53-
| `host` | `server.address`, `server.port` |
54-
| `version` | `network.protocol.name`, `network.protocol.version` |
55-
| full URL | `url.full` |
56-
57-
- If you're using the generic WSGI integration, the `sampling_context` argument of `traces_sampler` doesn't contain the `wsgi_environ` object anymore. Instead, the individual properties of the environment are accessible, if available, as follows:
58-
59-
| Env property | Sampling context key(s) |
60-
| ----------------- | ------------------------------------------------- |
61-
| `PATH_INFO` | `url.path` |
62-
| `QUERY_STRING` | `url.query` |
63-
| `REQUEST_METHOD` | `http.request.method` |
64-
| `SERVER_NAME` | `server.address` |
65-
| `SERVER_PORT` | `server.port` |
66-
| `SERVER_PROTOCOL` | `server.protocol.name`, `server.protocol.version` |
67-
| `wsgi.url_scheme` | `url.scheme` |
68-
| full URL | `url.full` |
69-
70-
- If you're using the generic ASGI integration, the `sampling_context` argument of `traces_sampler` doesn't contain the `asgi_scope` object anymore. Instead, the individual properties of the scope, if available, are accessible as follows:
71-
72-
| Scope property | Sampling context key(s) |
73-
| -------------- | ------------------------------- |
74-
| `type` | `network.protocol.name` |
75-
| `scheme` | `url.scheme` |
76-
| `path` | `url.path` |
77-
| `query` | `url.query` |
78-
| `http_version` | `network.protocol.version` |
79-
| `method` | `http.request.method` |
80-
| `server` | `server.address`, `server.port` |
81-
| `client` | `client.address`, `client.port` |
82-
| full URL | `url.full` |
83-
84-
- If you're using the RQ integration, the `sampling_context` argument of `traces_sampler` doesn't contain the `rq_job` object anymore. Instead, the individual properties of the job and the queue, if available, are accessible as follows:
85-
86-
| RQ property | Sampling context key(s) |
87-
| --------------- | ---------------------------- |
88-
| `rq_job.args` | `rq.job.args` |
89-
| `rq_job.kwargs` | `rq.job.kwargs` |
90-
| `rq_job.func` | `rq.job.func` |
91-
| `queue.name` | `messaging.destination.name` |
92-
| `rq_job.id` | `messaging.message.id` |
93-
94-
Note that `rq.job.args`, `rq.job.kwargs`, and `rq.job.func` are serialized and not the actual objects on the job.
95-
96-
- If you're using the AWS Lambda integration, the `sampling_context` argument of `traces_sampler` doesn't contain the `aws_event` and `aws_context` objects anymore. Instead, the following, if available, is accessible:
97-
98-
| AWS property | Sampling context key(s) |
99-
| ------------------------------------------- | ----------------------- |
100-
| `aws_event["httpMethod"]` | `http.request.method` |
101-
| `aws_event["queryStringParameters"]` | `url.query` |
102-
| `aws_event["path"]` | `url.path` |
103-
| full URL | `url.full` |
104-
| `aws_event["headers"]["X-Forwarded-Proto"]` | `network.protocol.name` |
105-
| `aws_event["headers"]["Host"]` | `server.address` |
106-
| `aws_context["function_name"]` | `faas.name` |
107-
108-
- If you're using the GCP integration, the `sampling_context` argument of `traces_sampler` doesn't contain the `gcp_env` and `gcp_event` keys anymore. Instead, the following, if available, is accessible:
109-
110-
| Old sampling context key | New sampling context key |
111-
| --------------------------------- | -------------------------- |
112-
| `gcp_env["function_name"]` | `faas.name` |
113-
| `gcp_env["function_region"]` | `faas.region` |
114-
| `gcp_env["function_project"]` | `gcp.function.project` |
115-
| `gcp_env["function_identity"]` | `gcp.function.identity` |
116-
| `gcp_env["function_entry_point"]` | `gcp.function.entry_point` |
117-
| `gcp_event.method` | `http.request.method` |
118-
| `gcp_event.query_string` | `url.query` |
23+
- The `sampling_context` argument of `traces_sampler` and `profiles_sampler` now additionally contains all span attributes known at span start.
24+
- The integration-specific content of the `sampling_context` argument of `traces_sampler` and `profiles_sampler` now looks different.
25+
- The Celery integration doesn't add the `celery_job` dictionary anymore. Instead, the individual keys are now available as:
26+
27+
| Dictionary keys | Sampling context key | Example |
28+
| ---------------------- | --------------------------- | ------------------------------ |
29+
| `celery_job["args"]` | `celery.job.args.{index}` | `celery.job.args.0` |
30+
| `celery_job["kwargs"]` | `celery.job.kwargs.{kwarg}` | `celery.job.kwargs.kwarg_name` |
31+
| `celery_job["task"]` | `celery.job.task` | |
32+
33+
Note that all of these are serialized, i.e., not the original `args` and `kwargs` but rather OpenTelemetry-friendly span attributes.
34+
35+
- The AIOHTTP integration doesn't add the `aiohttp_request` object anymore. Instead, some of the individual properties of the request are accessible, if available, as follows:
36+
37+
| Request property | Sampling context key(s) |
38+
| ----------------- | ------------------------------- |
39+
| `path` | `url.path` |
40+
| `query_string` | `url.query` |
41+
| `method` | `http.request.method` |
42+
| `host` | `server.address`, `server.port` |
43+
| `scheme` | `url.scheme` |
44+
| full URL | `url.full` |
45+
| `request.headers` | `http.request.header.{header}` |
46+
47+
- The Tornado integration doesn't add the `tornado_request` object anymore. Instead, some of the individual properties of the request are accessible, if available, as follows:
48+
49+
| Request property | Sampling context key(s) |
50+
| ----------------- | --------------------------------------------------- |
51+
| `path` | `url.path` |
52+
| `query` | `url.query` |
53+
| `protocol` | `url.scheme` |
54+
| `method` | `http.request.method` |
55+
| `host` | `server.address`, `server.port` |
56+
| `version` | `network.protocol.name`, `network.protocol.version` |
57+
| full URL | `url.full` |
58+
| `request.headers` | `http.request.header.{header}` |
59+
60+
- The WSGI integration doesn't add the `wsgi_environ` object anymore. Instead, the individual properties of the environment are accessible, if available, as follows:
61+
62+
| Env property | Sampling context key(s) |
63+
| ----------------- | ------------------------------------------------- |
64+
| `PATH_INFO` | `url.path` |
65+
| `QUERY_STRING` | `url.query` |
66+
| `REQUEST_METHOD` | `http.request.method` |
67+
| `SERVER_NAME` | `server.address` |
68+
| `SERVER_PORT` | `server.port` |
69+
| `SERVER_PROTOCOL` | `server.protocol.name`, `server.protocol.version` |
70+
| `wsgi.url_scheme` | `url.scheme` |
71+
| full URL | `url.full` |
72+
| `HTTP_*` | `http.request.header.{header}` |
73+
74+
- The ASGI integration doesn't add the `asgi_scope` object anymore. Instead, the individual properties of the scope, if available, are accessible as follows:
75+
76+
| Scope property | Sampling context key(s) |
77+
| -------------- | ------------------------------- |
78+
| `type` | `network.protocol.name` |
79+
| `scheme` | `url.scheme` |
80+
| `path` | `url.path` |
81+
| `query` | `url.query` |
82+
| `http_version` | `network.protocol.version` |
83+
| `method` | `http.request.method` |
84+
| `server` | `server.address`, `server.port` |
85+
| `client` | `client.address`, `client.port` |
86+
| full URL | `url.full` |
87+
| `headers` | `http.request.header.{header}` |
88+
89+
- The RQ integration doesn't add the `rq_job` object anymore. Instead, the individual properties of the job and the queue, if available, are accessible as follows:
90+
91+
| RQ property | Sampling context key | Example |
92+
| --------------- | ---------------------------- | ---------------------- |
93+
| `rq_job.args` | `rq.job.args.{index}` | `rq.job.args.0` |
94+
| `rq_job.kwargs` | `rq.job.kwargs.{kwarg}` | `rq.job.kwargs.my_kwarg` |
95+
| `rq_job.func` | `rq.job.func` | |
96+
| `queue.name` | `messaging.destination.name` | |
97+
| `rq_job.id` | `messaging.message.id` | |
98+
99+
Note that `rq.job.args`, `rq.job.kwargs`, and `rq.job.func` are serialized and not the actual objects on the job.
100+
101+
- The AWS Lambda integration doesn't add the `aws_event` and `aws_context` objects anymore. Instead, the following, if available, is accessible:
102+
103+
| AWS property | Sampling context key(s) |
104+
| ------------------------------------------- | ------------------------------- |
105+
| `aws_event["httpMethod"]` | `http.request.method` |
106+
| `aws_event["queryStringParameters"]` | `url.query` |
107+
| `aws_event["path"]` | `url.path` |
108+
| full URL | `url.full` |
109+
| `aws_event["headers"]["X-Forwarded-Proto"]` | `network.protocol.name` |
110+
| `aws_event["headers"]["Host"]` | `server.address` |
111+
| `aws_context["function_name"]` | `faas.name` |
112+
| `aws_event["headers"]` | `http.request.header.{header}` |
113+
114+
- The GCP integration doesn't add the `gcp_env` and `gcp_event` keys anymore. Instead, the following, if available, is accessible:
115+
116+
| Old sampling context key | New sampling context key |
117+
| --------------------------------- | ------------------------------ |
118+
| `gcp_env["function_name"]` | `faas.name` |
119+
| `gcp_env["function_region"]` | `faas.region` |
120+
| `gcp_env["function_project"]` | `gcp.function.project` |
121+
| `gcp_env["function_identity"]` | `gcp.function.identity` |
122+
| `gcp_env["function_entry_point"]` | `gcp.function.entry_point` |
123+
| `gcp_event.method` | `http.request.method` |
124+
| `gcp_event.query_string` | `url.query` |
125+
| `gcp_event.headers` | `http.request.header.{header}` |
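
To illustrate the tables above, here is a minimal sketch of a `traces_sampler` that reads the flattened sampling context keys instead of the removed integration objects. The `/health` path, the `x-debug-trace` header, the `tasks.send_email` task name, and the returned rates are all hypothetical and chosen only for illustration; the key names are taken from the tables in this guide.

```python
def traces_sampler(sampling_context):
    """Pick a sample rate from flattened span attributes (illustrative only)."""
    # Drop health checks. "/health" is a hypothetical endpoint.
    if sampling_context.get("url.path") == "/health":
        return 0.0

    # Request headers are exposed as http.request.header.{header}, lowercased.
    # "x-debug-trace" is a hypothetical header name.
    if sampling_context.get("http.request.header.x-debug-trace") == "1":
        return 1.0

    # Celery task names are available under celery.job.task.
    # "tasks.send_email" is a hypothetical task name.
    if sampling_context.get("celery.job.task") == "tasks.send_email":
        return 0.1

    # Fallback rate for everything else (arbitrary value).
    return 0.25


# Usage (not executed here): sentry_sdk.init(dsn="...", traces_sampler=traces_sampler)
```

Using `.get()` keeps the sampler safe when a key is absent, which matters because each attribute is only present "if available".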
119126

120127

121128
### Removed

‎sentry_sdk/integrations/_wsgi_common.py

Lines changed: 15 additions & 1 deletion
@@ -3,7 +3,7 @@
33

44
import sentry_sdk
55
from sentry_sdk.scope import should_send_default_pii
6-
from sentry_sdk.utils import AnnotatedValue, logger
6+
from sentry_sdk.utils import AnnotatedValue, logger, SENSITIVE_DATA_SUBSTITUTE
77

88
try:
99
from django.http.request import RawPostDataException
@@ -221,6 +221,20 @@ def _filter_headers(headers):
221221
}
222222

223223

224+
def _request_headers_to_span_attributes(headers):
225+
# type: (dict[str, str]) -> dict[str, str]
226+
attributes = {}
227+
228+
headers = _filter_headers(headers)
229+
230+
for header, value in headers.items():
231+
if isinstance(value, AnnotatedValue):
232+
value = SENSITIVE_DATA_SUBSTITUTE
233+
attributes[f"http.request.header.{header.lower()}"] = value
234+
235+
return attributes
236+
237+
224238
def _in_http_status_code_range(code, code_ranges):
225239
# type: (object, list[HttpStatusCodeRange]) -> bool
226240
for target in code_ranges:
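
The new `_request_headers_to_span_attributes` helper in this file turns a header mapping into flat `http.request.header.{header}` attributes and substitutes a placeholder for values that `_filter_headers` has masked. The following standalone sketch mimics that behavior without importing the SDK. The `SENSITIVE_HEADERS` set, the `"[Filtered]"` placeholder value, and the `send_default_pii` parameter are assumptions made for this example and are not taken from the diff.

```python
# Assumed placeholder; the SDK uses its own SENSITIVE_DATA_SUBSTITUTE constant.
SENSITIVE_DATA_SUBSTITUTE = "[Filtered]"

# Hypothetical set of headers treated as sensitive in this sketch.
SENSITIVE_HEADERS = {"authorization", "cookie", "x-api-key"}


def headers_to_span_attributes(headers, send_default_pii=False):
    """Flatten request headers into span-attribute-style keys."""
    attributes = {}
    for header, value in headers.items():
        name = header.lower()
        # Mask sensitive values unless PII sending is explicitly enabled.
        if not send_default_pii and name in SENSITIVE_HEADERS:
            value = SENSITIVE_DATA_SUBSTITUTE
        attributes[f"http.request.header.{name}"] = value
    return attributes


example = headers_to_span_attributes(
    {"Accept": "text/html", "Authorization": "Bearer secret"}
)
```

Lowercasing the header name gives samplers a single predictable key regardless of how the client capitalized the header.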

‎sentry_sdk/integrations/aiohttp.py

Lines changed: 4 additions & 3 deletions
@@ -13,6 +13,7 @@
1313
from sentry_sdk.sessions import track_session
1414
from sentry_sdk.integrations._wsgi_common import (
1515
_filter_headers,
16+
_request_headers_to_span_attributes,
1617
request_body_within_bounds,
1718
)
1819
from sentry_sdk.tracing import (
@@ -389,11 +390,11 @@ def _prepopulate_attributes(request):
389390
except ValueError:
390391
attributes["server.address"] = request.host
391392

392-
try:
393+
with capture_internal_exceptions():
393394
url = f"{request.scheme}://{request.host}{request.path}" # noqa: E231
394395
if request.query_string:
395396
attributes["url.full"] = f"{url}?{request.query_string}"
396-
except Exception:
397-
pass
397+
398+
attributes.update(_request_headers_to_span_attributes(dict(request.headers)))
398399

399400
return attributes
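
The hunk above replaces a bare `try`/`except Exception: pass` with a context manager so that failures while building attributes are handled in one place instead of being silently dropped. A rough sketch of that pattern follows. `swallow_and_log` and `build_url_attributes` are hypothetical names for this example, and the sketch only approximates what the SDK's `capture_internal_exceptions` does.

```python
import logging
from contextlib import contextmanager

logger = logging.getLogger("sampling_context_sketch")


@contextmanager
def swallow_and_log():
    """Suppress any exception raised in the block, but log it for debugging."""
    try:
        yield
    except Exception:
        logger.debug("Failed to build span attributes", exc_info=True)


def build_url_attributes(scheme, host, path, query_string=""):
    """Mirror the shape of the hunk: url.full is set only with a query string."""
    attributes = {}
    with swallow_and_log():
        url = f"{scheme}://{host}{path}"
        if query_string:
            attributes["url.full"] = f"{url}?{query_string}"
    return attributes
```

Compared with `except Exception: pass`, the context manager keeps the caller's control flow identical while leaving a trace of what went wrong.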

‎sentry_sdk/integrations/asgi.py

Lines changed: 7 additions & 4 deletions
@@ -21,6 +21,7 @@
2121
)
2222
from sentry_sdk.integrations._wsgi_common import (
2323
DEFAULT_HTTP_METHODS_TO_CAPTURE,
24+
_request_headers_to_span_attributes,
2425
)
2526
from sentry_sdk.sessions import track_session
2627
from sentry_sdk.tracing import (
@@ -32,6 +33,7 @@
3233
)
3334
from sentry_sdk.utils import (
3435
ContextVar,
36+
capture_internal_exceptions,
3537
event_from_exception,
3638
HAS_REAL_CONTEXTVARS,
3739
CONTEXTVARS_ERROR_MESSAGE,
@@ -348,19 +350,20 @@ def _prepopulate_attributes(scope):
348350
try:
349351
host, port = scope[attr]
350352
attributes[f"{attr}.address"] = host
351-
attributes[f"{attr}.port"] = port
353+
if port is not None:
354+
attributes[f"{attr}.port"] = port
352355
except Exception:
353356
pass
354357

355-
try:
358+
with capture_internal_exceptions():
356359
full_url = _get_url(scope)
357360
query = _get_query(scope)
358361
if query:
359362
attributes["url.query"] = query
360363
full_url = f"{full_url}?{query}"
361364

362365
attributes["url.full"] = full_url
363-
except Exception:
364-
pass
366+
367+
attributes.update(_request_headers_to_span_attributes(_get_headers(scope)))
365368

366369
return attributes

‎sentry_sdk/integrations/aws_lambda.py

Lines changed: 12 additions & 3 deletions
@@ -20,7 +20,10 @@
2020
reraise,
2121
)
2222
from sentry_sdk.integrations import Integration
23-
from sentry_sdk.integrations._wsgi_common import _filter_headers
23+
from sentry_sdk.integrations._wsgi_common import (
24+
_filter_headers,
25+
_request_headers_to_span_attributes,
26+
)
2427

2528
from typing import TYPE_CHECKING
2629

@@ -162,7 +165,7 @@ def sentry_handler(aws_event, aws_context, *args, **kwargs):
162165
name=aws_context.function_name,
163166
source=TRANSACTION_SOURCE_COMPONENT,
164167
origin=AwsLambdaIntegration.origin,
165-
attributes=_prepopulate_attributes(aws_event, aws_context),
168+
attributes=_prepopulate_attributes(request_data, aws_context),
166169
):
167170
try:
168171
return handler(aws_event, aws_context, *args, **kwargs)
@@ -468,6 +471,7 @@ def _event_from_error_json(error_json):
468471

469472

470473
def _prepopulate_attributes(aws_event, aws_context):
474+
# type: (Any, Any) -> dict[str, Any]
471475
attributes = {
472476
"cloud.provider": "aws",
473477
}
@@ -486,10 +490,15 @@ def _prepopulate_attributes(aws_event, aws_context):
486490
url += f"?{aws_event['queryStringParameters']}"
487491
attributes["url.full"] = url
488492

489-
headers = aws_event.get("headers") or {}
493+
headers = {}
494+
if aws_event.get("headers") and isinstance(aws_event["headers"], dict):
495+
headers = aws_event["headers"]
496+
490497
if headers.get("X-Forwarded-Proto"):
491498
attributes["network.protocol.name"] = headers["X-Forwarded-Proto"]
492499
if headers.get("Host"):
493500
attributes["server.address"] = headers["Host"]
494501

502+
attributes.update(_request_headers_to_span_attributes(headers))
503+
495504
return attributes

‎sentry_sdk/integrations/celery/__init__.py

Lines changed: 10 additions & 3 deletions
@@ -20,7 +20,6 @@
2020
ensure_integration_enabled,
2121
event_from_exception,
2222
reraise,
23-
_serialize_span_attribute,
2423
)
2524

2625
from typing import TYPE_CHECKING
@@ -514,9 +513,17 @@ def sentry_publish(self, *args, **kwargs):
514513

515514

516515
def _prepopulate_attributes(task, args, kwargs):
516+
# type: (Any, tuple[Any, ...], dict[str, Any]) -> dict[str, str]
517517
attributes = {
518518
"celery.job.task": task.name,
519-
"celery.job.args": _serialize_span_attribute(args),
520-
"celery.job.kwargs": _serialize_span_attribute(kwargs),
521519
}
520+
521+
for i, arg in enumerate(args):
522+
with capture_internal_exceptions():
523+
attributes[f"celery.job.args.{i}"] = str(arg)
524+
525+
for kwarg, value in kwargs.items():
526+
with capture_internal_exceptions():
527+
attributes[f"celery.job.kwargs.{kwarg}"] = str(value)
528+
522529
return attributes
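
The Celery change above stops serializing `args` and `kwargs` as single values and instead emits one attribute per positional index and per keyword name. A self-contained sketch of that flattening is shown below. `flatten_job_arguments` and its `prefix` parameter are hypothetical and exist only for this example; unlike the hunk, the sketch does not guard each `str()` call against exceptions.

```python
def flatten_job_arguments(prefix, args, kwargs):
    """Expand positional and keyword arguments into flat string attributes."""
    attributes = {}

    # One key per positional argument: {prefix}.args.{index}
    for i, arg in enumerate(args):
        attributes[f"{prefix}.args.{i}"] = str(arg)

    # One key per keyword argument: {prefix}.kwargs.{name}
    for name, value in kwargs.items():
        attributes[f"{prefix}.kwargs.{name}"] = str(value)

    return attributes


flattened = flatten_job_arguments("celery.job", (1, "a"), {"retry": True})
```

Because every value goes through `str()`, a sampler sees `"1"` and `"True"` rather than the original objects, which matches the migration guide's note that these attributes are serialized.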
