Add SET TAGS profiling scripts and connector PROFILE instrumentation #780

Draft
tejassp-db wants to merge 3 commits into databricks/databricks-sql-python:main from databricks/databricks-sql-python:sql-is-profiling

@tejassp-db

Summary

  • Adds profiling scripts under examples/ that compare two ways of setting tags: a direct ALTER … SET TAGS, and a read from information_schema followed by the write. Both column tags and table tags are covered.
    Also includes a cleanup script, chart generation (plot_comparison.py), and a PROFILING_README.md.
  • Adds [PROFILE] log lines in thrift_backend.py and auth/retry.py to capture per-attempt timing, statement IDs, retry decisions, and (truncated) SQL text.
  • Scripts pass enable_telemetry=False and load credentials from gitignored examples/credentials.env.
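
To make the comparison in the first bullet concrete, here is a minimal sketch of the two strategies for column tags. It is not the code from this PR: the function names, the `execute` and `read_tags` callables, and the choice to write only tags that differ from the current state are all assumptions made for illustration.

```python
from typing import Callable, Dict


def _literal(value: str) -> str:
    """Render a tag key or value as a single-quoted SQL literal."""
    # Escaping rules are deliberately not guessed at here; awkward input is rejected.
    if "'" in value or "\\" in value:
        raise ValueError(f"unsupported character in tag text: {value!r}")
    return f"'{value}'"


def set_column_tags_sql(table: str, column: str, tags: Dict[str, str]) -> str:
    """Build one ALTER statement that sets every tag in `tags` on a column."""
    pairs = ", ".join(f"{_literal(k)} = {_literal(v)}" for k, v in tags.items())
    return f"ALTER TABLE {table} ALTER COLUMN {column} SET TAGS ({pairs})"


def direct_write(
    execute: Callable[[str], object], table: str, column: str, tags: Dict[str, str]
) -> int:
    """Strategy 1: issue the ALTER without looking at the current state."""
    execute(set_column_tags_sql(table, column, tags))
    return 1  # number of statements issued


def read_then_write(
    execute: Callable[[str], object],
    read_tags: Callable[[str, str], Dict[str, str]],
    table: str,
    column: str,
    tags: Dict[str, str],
) -> int:
    """Strategy 2: read the current tags first, then write.

    Assumption: only tags that are missing or different are written.
    """
    current = read_tags(table, column)  # stands in for an information_schema query
    pending = {k: v for k, v in tags.items() if current.get(k) != v}
    if not pending:
        return 1  # only the read was issued
    execute(set_column_tags_sql(table, column, pending))
    return 2  # one read plus one write
```

In a real run, `execute` would wrap a cursor's `execute` method and `read_tags` would query information_schema; the return value is just a convenient count of round trips to compare between strategies.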

Profiling scripts to compare direct ALTER SET TAGS vs reading from
information_schema before writing. Includes scripts for column tags,
table tags, information_schema reads, cleanup, and chart generation.

Credentials are loaded from examples/credentials.env (gitignored).
Copy credentials.env.example and fill in workspace details.
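
As a rough illustration of the credentials flow, the sketch below parses a `KEY=VALUE` file and turns it into keyword arguments for `sql.connect()`. The file format and the variable names (`DATABRICKS_SERVER_HOSTNAME`, `DATABRICKS_HTTP_PATH`, `DATABRICKS_TOKEN`) are assumptions; the actual keys are whatever credentials.env.example defines.

```python
from pathlib import Path
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def load_env_file(path: str) -> Dict[str, str]:
    """Read and parse an env file such as examples/credentials.env."""
    return parse_env_text(Path(path).read_text(encoding="utf-8"))


def connect_kwargs(env: Dict[str, str]) -> Dict[str, object]:
    """Map the (assumed) variable names onto sql.connect() keyword arguments."""
    return {
        "server_hostname": env["DATABRICKS_SERVER_HOSTNAME"],
        "http_path": env["DATABRICKS_HTTP_PATH"],
        "access_token": env["DATABRICKS_TOKEN"],
        # Keep telemetry HTTP calls out of the profiling measurements.
        "enable_telemetry": False,
    }
```

With the connector installed, the result could be used as `sql.connect(**connect_kwargs(load_env_file("examples/credentials.env")))` after `from databricks import sql`.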

Connector instrumentation adds [PROFILE] log lines in the Thrift
backend retry loop and urllib3 retry policy to capture per-attempt
timing, statement IDs, retry decisions, and SQL text.

Co-authored-by: Isaac
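
For readers unfamiliar with this kind of instrumentation, the following is a simplified sketch of per-attempt `[PROFILE]` logging around a retry loop. It does not reproduce the changes in thrift_backend.py or auth/retry.py: the logger name, the log fields, the 80-character truncation limit, and the retry policy are all hypothetical.

```python
import logging
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger("profile_sketch")  # hypothetical logger name


def truncate_sql(sql_text: str, limit: int = 80) -> str:
    """Shorten SQL text so log lines stay readable."""
    return sql_text if len(sql_text) <= limit else sql_text[:limit] + "..."


def run_with_profile(
    operation: Callable[[], T],
    sql_text: str,
    max_attempts: int = 3,
    delay_s: float = 0.0,
    retryable: Tuple[Type[BaseException], ...] = (ConnectionError,),
) -> T:
    """Run `operation`, logging timing and the retry decision for each attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    short_sql = truncate_sql(sql_text)
    for attempt in range(1, max_attempts + 1):
        start = time.perf_counter()
        try:
            result = operation()
        except retryable as exc:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            will_retry = attempt < max_attempts
            logger.info(
                "[PROFILE] attempt=%d outcome=error error=%s elapsed_ms=%.1f retry=%s sql=%r",
                attempt, type(exc).__name__, elapsed_ms, will_retry, short_sql,
            )
            if not will_retry:
                raise
            time.sleep(delay_s)
        else:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            logger.info(
                "[PROFILE] attempt=%d outcome=ok elapsed_ms=%.1f sql=%r",
                attempt, elapsed_ms, short_sql,
            )
            return result
    raise AssertionError("unreachable: the final attempt either returns or raises")
```

Logging one line per attempt, rather than one per statement, is what lets retries show up as separate timings instead of being folded into a single slow call.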
Pass enable_telemetry=False to sql.connect() to avoid telemetry
HTTP calls during profiling runs.

Co-authored-by: Isaac
- All 4 profiling scripts now accept --tables-per-iteration (defaults
  to --threads, i.e. 1 table per thread per iteration).
- NUM_TABLES bumped to 128.
- plot_comparison.py generates separate PNGs for table-level comparison
  (wall-clock, tables/sec) and individual operation detail (ops/sec,
  P50, P99, max). Column tags and table tags get their own PNGs.
- Chart labels spell out parameters (columns, tags_per_column, tables).
- All scripts pass enable_telemetry=False to sql.connect().

Co-authored-by: Isaac
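
The "defaults to --threads" behavior described above can be sketched with argparse as follows. This is an illustration only: the default thread count of 4 and the help strings are assumptions, not values taken from the scripts.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse CLI options; --tables-per-iteration falls back to --threads."""
    parser = argparse.ArgumentParser(description="SET TAGS profiling (sketch)")
    parser.add_argument(
        "--threads", type=int, default=4,  # default of 4 is hypothetical
        help="number of worker threads",
    )
    parser.add_argument(
        "--tables-per-iteration", type=int, default=None,
        help="tables processed per iteration (default: same as --threads)",
    )
    args = parser.parse_args(argv)
    # None means the flag was not given, so use 1 table per thread per iteration.
    if args.tables_per_iteration is None:
        args.tables_per_iteration = args.threads
    return args
```

Using `None` as the sentinel keeps the fallback tied to whatever `--threads` resolves to, for example `parse_args(["--threads", "8"]).tables_per_iteration` evaluates to `8`.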
tejassp-db self-assigned this on Apr 27, 2026
tejassp-db added the invalid label ("This doesn't seem right") on Apr 27, 2026