UN-3266 [FEAT] Async Executor Backend for Prompt Studio#1849
UN-3266 [FEAT] Async Executor Backend for Prompt Studio#1849jaseemjaskp merged 116 commits intomainZipstack/unstract:mainfrom feat/async-prompt-service-v2Zipstack/unstract:feat/async-prompt-service-v2Copy head branch name to clipboard
Conversation
Conflicts resolved: - docker-compose.yaml: Use main's dedicated dashboard_metric_events queue for worker-metrics - PromptCard.jsx: Keep tool_id matching condition from our async socket feature - PromptRun.jsx: Merge useEffect import from main with our branch - ToolIde.jsx: Keep fire-and-forget socket approach (spinner waits for socket event) - SocketMessages.js: Keep both session-store and socket-custom-tool imports + updateCusToolMessages dep - SocketContext.js: Keep simpler path-based socket connection approach - usePromptRun.js: Keep Celery fire-and-forget with socket delivery over polling - setupProxy.js: Accept main's deletion (migrated to Vite)
for more information, see https://pre-commit.ci
… into feat/execution-backend
for more information, see https://pre-commit.ci
… into feat/execution-backend
|
The IDE callback worker should use the unified worker image (worker-unified.Dockerfile) consistent with all other v2 workers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Frontend Lint Report (Biome)✅ All checks passed! No linting or formatting issues found. |
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
|
Resolve merge with main after PR #1849 (Async Executor Backend) landed. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the unnecessary second Celery app (worker_celery.py) introduced in PR #1849 and consolidate on the existing celery_service.app. Both apps used the same RabbitMQ broker, PostgreSQL result backend, and JSON serialization — the duplication solved a non-existent problem. Also removes dead legacy Celery tasks (run_index_document, run_fetch_response, run_single_pass_extraction) that have zero callers since the executor migration, and fixes the ide-callback worker which was missing from run-worker-docker.sh's WORKERS, WORKER_QUEUES, and WORKER_HEALTH_PORTS arrays — causing it to crash locally. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… worker (#1892) [MISC] Remove redundant worker_celery.py and fix ide-callback worker Remove the unnecessary second Celery app (worker_celery.py) introduced in PR #1849 and consolidate on the existing celery_service.app. Both apps used the same RabbitMQ broker, PostgreSQL result backend, and JSON serialization — the duplication solved a non-existent problem. Also removes dead legacy Celery tasks (run_index_document, run_fetch_response, run_single_pass_extraction) that have zero callers since the executor migration, and fixes the ide-callback worker which was missing from run-worker-docker.sh's WORKERS, WORKER_QUEUES, and WORKER_HEALTH_PORTS arrays — causing it to crash locally. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Execution backend - revamp * async flow * Streaming progress to FE * Removing multi hop in Prompt studio ide and structure tool * UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item * Added executors for agentic prompt studio * Added executors for agentic prompt studio * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * fix: write output files in agentic extraction pipeline Agentic extraction returned early without writing INFILE (JSON) or METADATA.json, causing destination connectors to read the original PDF and fail with "Expected tool output type: TXT, got: application/pdf". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update docs * UN-3266 fix: remove dead code with undefined names in fetch_response Remove unreachable code block after the async callback return in fetch_response that still referenced output_count_before and response from the old synchronous implementation, causing ruff F821 errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Un 3266 fix security hotspot tmp paths (#1851) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve ruff linting failures across multiple files - B026: pass url positionally in worker_celery.py to avoid star-arg after keyword - N803: rename MockAsyncResult to mock_async_result in test_tasks.py - E501/I001: fix long line and import sort in llm_whisperer helper - ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers - F841: remove unused workflow_id and result assignments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849 - S2259: guard against None after _discover_plugins() in loader.py to satisfy static analysis on the dict[str,type]|None field type - S1244: replace float equality checks with pytest.approx() in test_answer_prompt.py and test_phase2h.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud code smells in PR #1849 - S5799: Merge all implicit string concatenations in log messages (legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py, registry.py, variable_replacement.py, structure_tool_task.py) - S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in dispatcher.py - S1871: Merge identical elif/else branches in tasks.py and test_sanity_phase6j.py - S1186: Add comment to empty stub method in test_sanity_phase6a.py - S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j and test_phase5d.py - S117: Rename PascalCase local variables to snake_case in test_sanity_phase3/5/6i.py - S5655: Broaden tool type annotation to StreamMixin in IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config - docker:S7031: Merge consecutive RUN instructions in worker-unified.Dockerfile - javascript:S1128: Remove unused pollForCompletion import in usePromptRun.js Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: wrap long log message in dispatcher.py to fix E501 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud S117 naming violations Rename PascalCase local variables to snake_case to comply with S117: - legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results (AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc, VariableReplacementService→variable_replacement_svc, LLM→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and update all downstream usages including _apply_type_conversion and _handle_summarize - test_phase1_log_streaming.py: rename Mock* local variables to mock_* snake_case equivalents - test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls and MockShim→mock_shim_cls across all 10 test methods - test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test; fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls in _mock_prompt_deps helper Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849 - test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase local variables in _mock_prompt_deps/_mock_deps to snake_case (RetrievalService→retrieval_svc, VariableReplacementService→ variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls, AnswerPromptService→answer_prompt_svc_cls) — fixes S117 - test_sanity_phase3.py: remove unused local variable "result" — fixes S1481 - structure_tool_task.py: remove redundant json.JSONDecodeError from except clause (subclass of ValueError) — fixes S5713 - shared/workflow/execution/service.py: replace generic Exception with RuntimeError for structure tool failure — fixes S112 - run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and replace 10 literal "executor" occurrences — fixes S1192 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations - Reduce cognitive complexity in answer_prompt.py: - Extract _build_grammar_notes, _run_webhook_postprocess helpers - _is_safe_public_url: extracted _resolve_host_addresses helper - handle_json: early-return pattern eliminates nesting - construct_prompt: delegates grammar loop to _build_grammar_notes - Reduce cognitive complexity in legacy_executor.py: - Extract _execute_single_prompt, _run_table_extraction helpers - Extract _run_challenge_if_enabled, _run_evaluation_if_enabled - Extract _inject_table_settings, _finalize_pipeline_result - Extract _convert_number_answer, _convert_scalar_answer - Extract _sanitize_dict_values helper - _handle_answer_prompt CC reduced from 50 to ~7 - Reduce CC in structure_tool_task.py: guard-clause refactor - Reduce CC in backend: dto.py, deployment_helper.py, api_deployment_views.py, prompt_studio_helper.py - Fix S117: rename PascalCase local vars in test_answer_prompt.py - Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh - Fix S1172: remove unused params from structure_tool_task.py - Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py - Fix S112/S5727 in test_execution.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: rename UsageHelper params to lowercase (N803) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192 - Add @staticmethod to _sanitize_null_values (fixes S2325 missing self) - Reduce _execute_single_prompt params from 25 to 11 (S107) by grouping services as deps tuple and extracting exec params from context.executor_params - Add NOSONAR suppression for raise exc in test helper (S112) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: remove unused locals in _handle_answer_prompt (F841) execution_id, file_hash, log_events_id, custom_data are now extracted inside _execute_single_prompt from context.executor_params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve Biome linting errors in frontend source files Auto-fixed 48 lint errors across 56 files: import ordering, block statements, unused variable prefixing, and formatting issues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace dynamic import of SharePermission with static import in Workflows Resolves vite build warning about SharePermission.jsx being both dynamically and statically imported across the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve SonarCloud warnings in frontend components - Remove unnecessary try-catch around PostHog event calls - Flip negated condition in PromptOutput.handleTable for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address PR #1849 review comments: fix null guards, dead code, and test drift - Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid) - URL-encode DB_USER in worker_celery.py result backend connection string - Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app - Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist) - Move profile_manager/default_profile null checks before first dereference - Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py - Handle ProfileManager.DoesNotExist as warning, not hard failure - Wrap PostHog analytics in try/catch so failures don't block prompt execution - Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status) - Reset formData when metadata is missing in ConfigureDs.jsx - Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only) - Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix missing llm_usage_reason for summarize LLM usage tracking Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so summarization costs appear under summarize_llm in API response metadata. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor - Route _handle_structure_pipeline to _handle_single_pass_extraction when is_single_pass=True (was always calling _handle_answer_prompt) - Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry, falling back to _handle_answer_prompt if plugin not installed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing API depployment response mismatches * Fix single-pass extraction showing only one prompt result in real-time - Fix accumulation bug in usePromptOutput updateCoverage() where each loop iteration spread the original promptOutputs instead of the accumulating updatedPromptOutputs, causing only the last prompt to render - Improve index success toast to show document name - Strip adapter names from index key configs for consistent hashing - Update sdk1 uv.lock Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Move summarize from sync Django plugin to executor worker for IDE index When "Summarize" is enabled and a user indexes a new document in Prompt Studio, the backend returned a 500 because the sync Django plugin tried to read the extracted .txt file before extraction happened in the worker. Fix: defer summarization to the executor worker's _handle_ide_index (extract → summarize → index), build summarize_params in the payload, and track the summarize index in the ide_index_complete callback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Address PR #1849 review comments: null guards, thread safety - Fix null guard ordering in build_index_payload: add DefaultProfileError check before calling validators on default_profile - Fix null guard ordering in _fetch_single_pass_response: move check before .llm.id access and validators (was dead code after dereference) - Add threading.Lock to worker_celery singleton to prevent race under concurrent gunicorn threads Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add documentation to ExecutionResponse DTO describing result structure Documents the ExecutionResponse dataclass fields, especially the result attribute's per-file dict structure (output, metadata, metrics, error keys) as requested in PR review. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup - Strip result payload from task_status to prevent IDOR data leak - Move null guard before validators in build_fetch_response_payload - Roll back indexing flag on broker failure in index_document - Use explicit IDs for spinner clearing instead of ORM serialization format - Catch broad exceptions in summarize tracking to prevent false failures - Guard prompt_id lookup in fetch_response with 400/404 responses Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix CI, tests, and add async prompt studio improvements - Fix biome import ordering in frontend test files - Fix 22 stale backend test assertions in test_tasks.py (patch targets, return values, view dispatch pattern, remove TestCeleryConfig) - Add ide_prompt_complete callback tests - Add frontend vitest config and regression tests for null guards, stale closures, and single-pass loading guard - Add LLMCompat llama-index compatibility wrapper in SDK1 - Add litellm cohere embed timeout monkey-patch (v1.82.3) - Improve S3 filesystem helper (region_name, empty credential handling) - Add RetrieverLLM and lazy LLM creation for retrievers - Improve worker cleanup: api_client.close() in finally blocks, early return on setup failure in API deployment tasks - Add workers conftest.py for test environment setup Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix pre-existing biome CI errors: import ordering and formatting Auto-fix 18 pre-existing organizeImports and formatting errors across 17 frontend files that were blocking biome ci on the entire codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix ruff F821: add missing transaction import in prompt_studio_helper Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add input validation guards to bulk_fetch_response endpoint Validate prompt_ids (non-empty), document_id (required), and handle DoesNotExist for both prompts and document to return proper 400/404 instead of dispatching no-op tasks or raising unhandled 500 errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * IDE Call backs * Sonar issues fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update bun.lock to match package.json dependency ranges Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Move ExecutionContext import into TYPE_CHECKING block Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix SonarQube issues: duplication, naming, nesting, unused var - Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to deduplicate 5 identical JSON parsing blocks in internal_views.py - Rename `User` local variable to `user_model` (naming convention) - Merge _emit_result/_emit_error into unified _emit_event() in ide_callback tasks to reduce code duplication - Extract _get_task_error() helper to deduplicate AsyncResult error retrieval in ide_index_error and ide_prompt_error - Remove unused `mock_ar_cls` variable in test_ide_callback.py - Add security note documenting why @csrf_exempt is safe on internal endpoints Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Replace worker-ide-callback Dockerfile with worker-unified The IDE callback worker should use the unified worker image (worker-unified.Dockerfile) consistent with all other v2 workers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add celery_executor_agentic queue to executor worker The executor worker only consumed from celery_executor_legacy, but the agentic prompt studio dispatches tasks to celery_executor_agentic. This caused agentic operations to sit in RabbitMQ with no consumer, resulting in timeouts and stuck-in-progress states. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * FIxing email enforce type * Removing line-item from select choices * Update workers/shared/enums/worker_enums_base.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> * Update backend/workflow_manager/workflow_v2/workflow_helper.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix false success logs and silent failures in ETL destination pipelines - Widen except clause in structure_tool_task to catch FileOperationError and log write paths for diagnostics - Add diagnostic logging at all silent return-None points in destination connector so missing INFILE/METADATA paths are visible in logs - Raise RuntimeError instead of silently skipping when no tool execution result is available for DB/FS destinations, preventing false success - Remove dead retry config from execute_bin task (max_retries=0) - Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert ETL destination pipeline changes — deferring to next cut Reverts diagnostic logging and error-raising changes in structure_tool_task.py and destination_connector.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* [FIX] Executor Queue for Agentic extraction (#1893) * Execution backend - revamp * async flow * Streaming progress to FE * Removing multi hop in Prompt studio ide and structure tool * UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item * Added executors for agentic prompt studio * Added executors for agentic prompt studio * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * fix: write output files in agentic extraction pipeline Agentic extraction returned early without writing INFILE (JSON) or METADATA.json, causing destination connectors to read the original PDF and fail with "Expected tool output type: TXT, got: application/pdf". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update docs * UN-3266 fix: remove dead code with undefined names in fetch_response Remove unreachable code block after the async callback return in fetch_response that still referenced output_count_before and response from the old synchronous implementation, causing ruff F821 errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Un 3266 fix security hotspot tmp paths (#1851) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve ruff linting failures across multiple files - B026: pass url positionally in worker_celery.py to avoid star-arg after keyword - N803: rename MockAsyncResult to mock_async_result in test_tasks.py - E501/I001: fix long line and import sort in llm_whisperer helper - ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers - F841: remove unused workflow_id and result assignments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849 - S2259: guard against None after _discover_plugins() in loader.py to satisfy static analysis on the dict[str,type]|None field type - S1244: replace float equality checks with pytest.approx() in test_answer_prompt.py and test_phase2h.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud code smells in PR #1849 - S5799: Merge all implicit string concatenations in log messages (legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py, registry.py, variable_replacement.py, structure_tool_task.py) - S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in dispatcher.py - S1871: Merge identical elif/else branches in tasks.py and test_sanity_phase6j.py - S1186: Add comment to empty stub method in test_sanity_phase6a.py - S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j and test_phase5d.py - S117: Rename PascalCase local variables to snake_case in test_sanity_phase3/5/6i.py - S5655: Broaden tool type annotation to StreamMixin in IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config - docker:S7031: Merge consecutive RUN instructions in worker-unified.Dockerfile - javascript:S1128: Remove unused pollForCompletion import in usePromptRun.js Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: wrap long log message in dispatcher.py to fix E501 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud S117 naming violations Rename PascalCase local variables to snake_case to comply with S117: - legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results (AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc, VariableReplacementService→variable_replacement_svc, LLM→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and update all downstream usages including _apply_type_conversion and _handle_summarize - test_phase1_log_streaming.py: rename Mock* local variables to mock_* snake_case equivalents - test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls and MockShim→mock_shim_cls across all 10 test methods - test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test; fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls in _mock_prompt_deps helper Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849 - test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase local variables in _mock_prompt_deps/_mock_deps to snake_case (RetrievalService→retrieval_svc, VariableReplacementService→ variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls, AnswerPromptService→answer_prompt_svc_cls) — fixes S117 - test_sanity_phase3.py: remove unused local variable "result" — fixes S1481 - structure_tool_task.py: remove redundant json.JSONDecodeError from except clause (subclass of ValueError) — fixes S5713 - shared/workflow/execution/service.py: replace generic Exception with RuntimeError for structure tool failure — fixes S112 - run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and replace 10 literal "executor" occurrences — fixes S1192 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations - Reduce cognitive complexity in answer_prompt.py: - Extract _build_grammar_notes, _run_webhook_postprocess helpers - _is_safe_public_url: extracted _resolve_host_addresses helper - handle_json: early-return pattern eliminates nesting - construct_prompt: delegates grammar loop to _build_grammar_notes - Reduce cognitive complexity in legacy_executor.py: - Extract _execute_single_prompt, _run_table_extraction helpers - Extract _run_challenge_if_enabled, _run_evaluation_if_enabled - Extract _inject_table_settings, _finalize_pipeline_result - Extract _convert_number_answer, _convert_scalar_answer - Extract _sanitize_dict_values helper - _handle_answer_prompt CC reduced from 50 to ~7 - Reduce CC in structure_tool_task.py: guard-clause refactor - Reduce CC in backend: dto.py, deployment_helper.py, api_deployment_views.py, prompt_studio_helper.py - Fix S117: rename PascalCase local vars in test_answer_prompt.py - Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh - Fix S1172: remove unused params from structure_tool_task.py - Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py - Fix S112/S5727 in test_execution.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: rename UsageHelper params to lowercase (N803) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192 - Add @staticmethod to _sanitize_null_values (fixes S2325 missing self) - Reduce _execute_single_prompt params from 25 to 11 (S107) by grouping services as deps tuple and extracting exec params from context.executor_params - Add NOSONAR suppression for raise exc in test helper (S112) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: remove unused locals in _handle_answer_prompt (F841) execution_id, file_hash, log_events_id, custom_data are now extracted inside _execute_single_prompt from context.executor_params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve Biome linting errors in frontend source files Auto-fixed 48 lint errors across 56 files: import ordering, block statements, unused variable prefixing, and formatting issues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace dynamic import of SharePermission with static import in Workflows Resolves vite build warning about SharePermission.jsx being both dynamically and statically imported across the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve SonarCloud warnings in frontend components - Remove unnecessary try-catch around PostHog event calls - Flip negated condition in PromptOutput.handleTable for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address PR #1849 review comments: fix null guards, dead code, and test drift - Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid) - URL-encode DB_USER in worker_celery.py result backend connection string - Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app - Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist) - Move profile_manager/default_profile null checks before first dereference - Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py - Handle ProfileManager.DoesNotExist as warning, not hard failure - Wrap PostHog analytics in try/catch so failures don't block prompt execution - Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status) - Reset formData when metadata is missing in ConfigureDs.jsx - Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only) - Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix missing llm_usage_reason for summarize LLM usage tracking Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so summarization costs appear under summarize_llm in API response metadata. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor - Route _handle_structure_pipeline to _handle_single_pass_extraction when is_single_pass=True (was always calling _handle_answer_prompt) - Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry, falling back to _handle_answer_prompt if plugin not installed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing API depployment response mismatches * Fix single-pass extraction showing only one prompt result in real-time - Fix accumulation bug in usePromptOutput updateCoverage() where each loop iteration spread the original promptOutputs instead of the accumulating updatedPromptOutputs, causing only the last prompt to render - Improve index success toast to show document name - Strip adapter names from index key configs for consistent hashing - Update sdk1 uv.lock Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Move summarize from sync Django plugin to executor worker for IDE index When "Summarize" is enabled and a user indexes a new document in Prompt Studio, the backend returned a 500 because the sync Django plugin tried to read the extracted .txt file before extraction happened in the worker. Fix: defer summarization to the executor worker's _handle_ide_index (extract → summarize → index), build summarize_params in the payload, and track the summarize index in the ide_index_complete callback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Address PR #1849 review comments: null guards, thread safety - Fix null guard ordering in build_index_payload: add DefaultProfileError check before calling validators on default_profile - Fix null guard ordering in _fetch_single_pass_response: move check before .llm.id access and validators (was dead code after dereference) - Add threading.Lock to worker_celery singleton to prevent race under concurrent gunicorn threads Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add documentation to ExecutionResponse DTO describing result structure Documents the ExecutionResponse dataclass fields, especially the result attribute's per-file dict structure (output, metadata, metrics, error keys) as requested in PR review. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup - Strip result payload from task_status to prevent IDOR data leak - Move null guard before validators in build_fetch_response_payload - Roll back indexing flag on broker failure in index_document - Use explicit IDs for spinner clearing instead of ORM serialization format - Catch broad exceptions in summarize tracking to prevent false failures - Guard prompt_id lookup in fetch_response with 400/404 responses Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix CI, tests, and add async prompt studio improvements - Fix biome import ordering in frontend test files - Fix 22 stale backend test assertions in test_tasks.py (patch targets, return values, view dispatch pattern, remove TestCeleryConfig) - Add ide_prompt_complete callback tests - Add frontend vitest config and regression tests for null guards, stale closures, and single-pass loading guard - Add LLMCompat llama-index compatibility wrapper in SDK1 - Add litellm cohere embed timeout monkey-patch (v1.82.3) - Improve S3 filesystem helper (region_name, empty credential handling) - Add RetrieverLLM and lazy LLM creation for retrievers - Improve worker cleanup: api_client.close() in finally blocks, early return on setup failure in API deployment tasks - Add workers conftest.py for test environment setup Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix pre-existing biome CI errors: import ordering and formatting Auto-fix 18 pre-existing organizeImports and formatting errors across 17 frontend files that were blocking biome ci on the entire codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix ruff F821: add missing transaction import in prompt_studio_helper Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add input validation guards to bulk_fetch_response endpoint Validate prompt_ids (non-empty), document_id (required), and handle DoesNotExist for both prompts and document to return proper 400/404 instead of dispatching no-op tasks or raising unhandled 500 errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * IDE Call backs * Sonar issues fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update bun.lock to match package.json dependency ranges Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Move ExecutionContext import into TYPE_CHECKING block Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix SonarQube issues: duplication, naming, nesting, unused var - Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to deduplicate 5 identical JSON parsing blocks in internal_views.py - Rename `User` local variable to `user_model` (naming convention) - Merge _emit_result/_emit_error into unified _emit_event() in ide_callback tasks to reduce code duplication - Extract _get_task_error() helper to deduplicate AsyncResult error retrieval in ide_index_error and ide_prompt_error - Remove unused `mock_ar_cls` variable in test_ide_callback.py - Add security note documenting why @csrf_exempt is safe on internal endpoints Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Replace worker-ide-callback Dockerfile with worker-unified The IDE callback worker should use the unified worker image (worker-unified.Dockerfile) consistent with all other v2 workers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add celery_executor_agentic queue to executor worker The executor worker only consumed from celery_executor_legacy, but the agentic prompt studio dispatches tasks to celery_executor_agentic. This caused agentic operations to sit in RabbitMQ with no consumer, resulting in timeouts and stuck-in-progress states. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * FIxing email enforce type * Removing line-item from select choices * Update workers/shared/enums/worker_enums_base.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> * Update backend/workflow_manager/workflow_v2/workflow_helper.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix false success logs and silent failures in ETL destination pipelines - Widen except clause in structure_tool_task to catch FileOperationError and log write paths for diagnostics - Add diagnostic logging at all silent return-None points in destination connector so missing INFILE/METADATA paths are visible in logs - Raise RuntimeError instead of silently skipping when no tool execution result is available for DB/FS destinations, preventing false success - Remove dead retry config from execute_bin task (max_retries=0) - Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert ETL destination pipeline changes — deferring to next cut Reverts diagnostic logging and error-raising changes in structure_tool_task.py and destination_connector.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * [MISC] Fix blank LLM profile edit form and settings menu bold style (#1896) - Replace form.resetFields() with form.setFieldsValue() to fix blank edit form caused by Ant Design's initialValues being a one-time snapshot - Remove Strict Mode-incompatible cleanup that cleared editLlmProfileId - Clear editLlmProfileId in parent when adding new profile - Remove unconditional bold on first settings popover menu item Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FEAT] Port LINE_ITEM extraction to pluggable executor architecture LINE_ITEM enforce_type prompts previously raised an unconditional error in the new workers executor. Wire the structure-tool path to delegate to a `line_item` executor plugin (mirroring how TABLE delegates to the `table_extractor` plugin), so API deployments / ETL pipelines can run LINE_ITEM extraction once the cloud `line_item_extractor` plugin is installed. - legacy_executor.py: replace the LINE_ITEM error site in `_execute_single_prompt` with a delegation call and add a new `_run_line_item_extraction()` method structurally identical to `_run_table_extraction()`. - test_line_item_extraction.py: 5 new tests covering plugin missing, success, sub-context construction, failure path, and end-to-end Celery eager-mode delegation. - test_sanity_phase6d.py: update the existing LINE_ITEM guard test to match the new "install the line_item_extractor plugin" hint and patch `ExecutorRegistry.get` to simulate the missing plugin. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
* [FIX] Executor Queue for Agentic extraction (#1893) * Execution backend - revamp * async flow * Streaming progress to FE * Removing multi hop in Prompt studio ide and structure tool * UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item * Added executors for agentic prompt studio * Added executors for agentic prompt studio * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * fix: write output files in agentic extraction pipeline Agentic extraction returned early without writing INFILE (JSON) or METADATA.json, causing destination connectors to read the original PDF and fail with "Expected tool output type: TXT, got: application/pdf". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update docs * UN-3266 fix: remove dead code with undefined names in fetch_response Remove unreachable code block after the async callback return in fetch_response that still referenced output_count_before and response from the old synchronous implementation, causing ruff F821 errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Un 3266 fix security hotspot tmp paths (#1851) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve ruff linting failures across multiple files - B026: pass url positionally in worker_celery.py to avoid star-arg after keyword - N803: rename MockAsyncResult to mock_async_result in test_tasks.py - E501/I001: fix long line and import sort in llm_whisperer helper - ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers - F841: remove unused workflow_id and result assignments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849 - S2259: guard against None after _discover_plugins() in loader.py to satisfy static analysis on the dict[str,type]|None field type - S1244: replace float equality checks with pytest.approx() in test_answer_prompt.py and test_phase2h.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud code smells in PR #1849 - S5799: Merge all implicit string concatenations in log messages (legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py, registry.py, variable_replacement.py, structure_tool_task.py) - S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in dispatcher.py - S1871: Merge identical elif/else branches in tasks.py and test_sanity_phase6j.py - S1186: Add comment to empty stub method in test_sanity_phase6a.py - S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j and test_phase5d.py - S117: Rename PascalCase local variables to snake_case in test_sanity_phase3/5/6i.py - S5655: Broaden tool type annotation to StreamMixin in IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config - docker:S7031: Merge consecutive RUN instructions in worker-unified.Dockerfile - javascript:S1128: Remove unused pollForCompletion import in usePromptRun.js Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: wrap long log message in dispatcher.py to fix E501 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud S117 naming violations Rename PascalCase local variables to snake_case to comply with S117: - legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results (AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc, VariableReplacementService→variable_replacement_svc, LLM→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and update all downstream usages including _apply_type_conversion and _handle_summarize - test_phase1_log_streaming.py: rename Mock* local variables to mock_* snake_case equivalents - test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls and MockShim→mock_shim_cls across all 10 test methods - test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test; fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls in _mock_prompt_deps helper Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849 - test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase local variables in _mock_prompt_deps/_mock_deps to snake_case (RetrievalService→retrieval_svc, VariableReplacementService→ variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls, AnswerPromptService→answer_prompt_svc_cls) — fixes S117 - test_sanity_phase3.py: remove unused local variable "result" — fixes S1481 - structure_tool_task.py: remove redundant json.JSONDecodeError from except clause (subclass of ValueError) — fixes S5713 - shared/workflow/execution/service.py: replace generic Exception with RuntimeError for structure tool failure — fixes S112 - run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and replace 10 literal "executor" occurrences — fixes S1192 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations - Reduce cognitive complexity in answer_prompt.py: - Extract _build_grammar_notes, _run_webhook_postprocess helpers - _is_safe_public_url: extracted _resolve_host_addresses helper - handle_json: early-return pattern eliminates nesting - construct_prompt: delegates grammar loop to _build_grammar_notes - Reduce cognitive complexity in legacy_executor.py: - Extract _execute_single_prompt, _run_table_extraction helpers - Extract _run_challenge_if_enabled, _run_evaluation_if_enabled - Extract _inject_table_settings, _finalize_pipeline_result - Extract _convert_number_answer, _convert_scalar_answer - Extract _sanitize_dict_values helper - _handle_answer_prompt CC reduced from 50 to ~7 - Reduce CC in structure_tool_task.py: guard-clause refactor - Reduce CC in backend: dto.py, deployment_helper.py, api_deployment_views.py, prompt_studio_helper.py - Fix S117: rename PascalCase local vars in test_answer_prompt.py - Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh - Fix S1172: remove unused params from structure_tool_task.py - Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py - Fix S112/S5727 in test_execution.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: rename UsageHelper params to lowercase (N803) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192 - Add @staticmethod to _sanitize_null_values (fixes S2325 missing self) - Reduce _execute_single_prompt params from 25 to 11 (S107) by grouping services as deps tuple and extracting exec params from context.executor_params - Add NOSONAR suppression for raise exc in test helper (S112) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: remove unused locals in _handle_answer_prompt (F841) execution_id, file_hash, log_events_id, custom_data are now extracted inside _execute_single_prompt from context.executor_params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve Biome linting errors in frontend source files Auto-fixed 48 lint errors across 56 files: import ordering, block statements, unused variable prefixing, and formatting issues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace dynamic import of SharePermission with static import in Workflows Resolves vite build warning about SharePermission.jsx being both dynamically and statically imported across the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve SonarCloud warnings in frontend components - Remove unnecessary try-catch around PostHog event calls - Flip negated condition in PromptOutput.handleTable for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address PR #1849 review comments: fix null guards, dead code, and test drift - Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid) - URL-encode DB_USER in worker_celery.py result backend connection string - Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app - Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist) - Move profile_manager/default_profile null checks before first dereference - Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py - Handle ProfileManager.DoesNotExist as warning, not hard failure - Wrap PostHog analytics in try/catch so failures don't block prompt execution - Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status) - Reset formData when metadata is missing in ConfigureDs.jsx - Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only) - Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix missing llm_usage_reason for summarize LLM usage tracking Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so summarization costs appear under summarize_llm in API response metadata. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor - Route _handle_structure_pipeline to _handle_single_pass_extraction when is_single_pass=True (was always calling _handle_answer_prompt) - Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry, falling back to _handle_answer_prompt if plugin not installed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing API depployment response mismatches * Fix single-pass extraction showing only one prompt result in real-time - Fix accumulation bug in usePromptOutput updateCoverage() where each loop iteration spread the original promptOutputs instead of the accumulating updatedPromptOutputs, causing only the last prompt to render - Improve index success toast to show document name - Strip adapter names from index key configs for consistent hashing - Update sdk1 uv.lock Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Move summarize from sync Django plugin to executor worker for IDE index When "Summarize" is enabled and a user indexes a new document in Prompt Studio, the backend returned a 500 because the sync Django plugin tried to read the extracted .txt file before extraction happened in the worker. Fix: defer summarization to the executor worker's _handle_ide_index (extract → summarize → index), build summarize_params in the payload, and track the summarize index in the ide_index_complete callback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Address PR #1849 review comments: null guards, thread safety - Fix null guard ordering in build_index_payload: add DefaultProfileError check before calling validators on default_profile - Fix null guard ordering in _fetch_single_pass_response: move check before .llm.id access and validators (was dead code after dereference) - Add threading.Lock to worker_celery singleton to prevent race under concurrent gunicorn threads Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add documentation to ExecutionResponse DTO describing result structure Documents the ExecutionResponse dataclass fields, especially the result attribute's per-file dict structure (output, metadata, metrics, error keys) as requested in PR review. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup - Strip result payload from task_status to prevent IDOR data leak - Move null guard before validators in build_fetch_response_payload - Roll back indexing flag on broker failure in index_document - Use explicit IDs for spinner clearing instead of ORM serialization format - Catch broad exceptions in summarize tracking to prevent false failures - Guard prompt_id lookup in fetch_response with 400/404 responses Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix CI, tests, and add async prompt studio improvements - Fix biome import ordering in frontend test files - Fix 22 stale backend test assertions in test_tasks.py (patch targets, return values, view dispatch pattern, remove TestCeleryConfig) - Add ide_prompt_complete callback tests - Add frontend vitest config and regression tests for null guards, stale closures, and single-pass loading guard - Add LLMCompat llama-index compatibility wrapper in SDK1 - Add litellm cohere embed timeout monkey-patch (v1.82.3) - Improve S3 filesystem helper (region_name, empty credential handling) - Add RetrieverLLM and lazy LLM creation for retrievers - Improve worker cleanup: api_client.close() in finally blocks, early return on setup failure in API deployment tasks - Add workers conftest.py for test environment setup Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix pre-existing biome CI errors: import ordering and formatting Auto-fix 18 pre-existing organizeImports and formatting errors across 17 frontend files that were blocking biome ci on the entire codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix ruff F821: add missing transaction import in prompt_studio_helper Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add input validation guards to bulk_fetch_response endpoint Validate prompt_ids (non-empty), document_id (required), and handle DoesNotExist for both prompts and document to return proper 400/404 instead of dispatching no-op tasks or raising unhandled 500 errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * IDE Call backs * Sonar issues fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update bun.lock to match package.json dependency ranges Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Move ExecutionContext import into TYPE_CHECKING block Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix SonarQube issues: duplication, naming, nesting, unused var - Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to deduplicate 5 identical JSON parsing blocks in internal_views.py - Rename `User` local variable to `user_model` (naming convention) - Merge _emit_result/_emit_error into unified _emit_event() in ide_callback tasks to reduce code duplication - Extract _get_task_error() helper to deduplicate AsyncResult error retrieval in ide_index_error and ide_prompt_error - Remove unused `mock_ar_cls` variable in test_ide_callback.py - Add security note documenting why @csrf_exempt is safe on internal endpoints Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Replace worker-ide-callback Dockerfile with worker-unified The IDE callback worker should use the unified worker image (worker-unified.Dockerfile) consistent with all other v2 workers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add celery_executor_agentic queue to executor worker The executor worker only consumed from celery_executor_legacy, but the agentic prompt studio dispatches tasks to celery_executor_agentic. This caused agentic operations to sit in RabbitMQ with no consumer, resulting in timeouts and stuck-in-progress states. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * FIxing email enforce type * Removing line-item from select choices * Update workers/shared/enums/worker_enums_base.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> * Update backend/workflow_manager/workflow_v2/workflow_helper.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix false success logs and silent failures in ETL destination pipelines - Widen except clause in structure_tool_task to catch FileOperationError and log write paths for diagnostics - Add diagnostic logging at all silent return-None points in destination connector so missing INFILE/METADATA paths are visible in logs - Raise RuntimeError instead of silently skipping when no tool execution result is available for DB/FS destinations, preventing false success - Remove dead retry config from execute_bin task (max_retries=0) - Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert ETL destination pipeline changes — deferring to next cut Reverts diagnostic logging and error-raising changes in structure_tool_task.py and destination_connector.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * [MISC] Fix blank LLM profile edit form and settings menu bold style (#1896) - Replace form.resetFields() with form.setFieldsValue() to fix blank edit form caused by Ant Design's initialValues being a one-time snapshot - Remove Strict Mode-incompatible cleanup that cleared editLlmProfileId - Clear editLlmProfileId in parent when adding new profile - Remove unconditional bold on first settings popover menu item Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FEAT] Port LINE_ITEM extraction to pluggable executor architecture LINE_ITEM enforce_type prompts previously raised an unconditional error in the new workers executor. Wire the structure-tool path to delegate to a `line_item` executor plugin (mirroring how TABLE delegates to the `table_extractor` plugin), so API deployments / ETL pipelines can run LINE_ITEM extraction once the cloud `line_item_extractor` plugin is installed. - legacy_executor.py: replace the LINE_ITEM error site in `_execute_single_prompt` with a delegation call and add a new `_run_line_item_extraction()` method structurally identical to `_run_table_extraction()`. - test_line_item_extraction.py: 5 new tests covering plugin missing, success, sub-context construction, failure path, and end-to-end Celery eager-mode delegation. - test_sanity_phase6d.py: update the existing LINE_ITEM guard test to match the new "install the line_item_extractor plugin" hint and patch `ExecutorRegistry.get` to simulate the missing plugin. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add line-item to prompt studio enforce_type select choices The `line-item` enforce_type is defined in the `ToolStudioPrompt` model but was missing from the `select_choices.json` dropdown options served to the prompt studio frontend, so users could not select it. Add it now that LINE_ITEM extraction is wired through the new pluggable executor architecture. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fixing API depployment response mismatches
* Fix single-pass extraction showing only one prompt result in real-time
- Fix accumulation bug in usePromptOutput updateCoverage() where each
loop iteration spread the original promptOutputs instead of the
accumulating updatedPromptOutputs, causing only the last prompt to
render
- Improve index success toast to show document name
- Strip adapter names from index key configs for consistent hashing
- Update sdk1 uv.lock
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move summarize from sync Django plugin to executor worker for IDE index
When "Summarize" is enabled and a user indexes a new document in Prompt
Studio, the backend returned a 500 because the sync Django plugin tried
to read the extracted .txt file before extraction happened in the worker.
Fix: defer summarization to the executor worker's _handle_ide_index
(extract → summarize → index), build summarize_params in the payload,
and track the summarize index in the ide_index_complete callback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR #1849 review comments: null guards, thread safety
- Fix null guard ordering in build_index_payload: add DefaultProfileError
check before calling validators on default_profile
- Fix null guard ordering in _fetch_single_pass_response: move check
before .llm.id access and validators (was dead code after dereference)
- Add threading.Lock to worker_celery singleton to prevent race under
concurrent gunicorn threads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add documentation to ExecutionResponse DTO describing result structure
Documents the ExecutionResponse dataclass fields, especially the
result attribute's per-file dict structure (output, metadata, metrics,
error keys) as requested in PR review.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup
- Strip result payload from task_status to prevent IDOR data leak
- Move null guard before validators in build_fetch_response_payload
- Roll back indexing flag on broker failure in index_document
- Use explicit IDs for spinner clearing instead of ORM serialization format
- Catch broad exceptions in summarize tracking to prevent false failures
- Guard prompt_id lookup in fetch_response with 400/404 responses
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix CI, tests, and add async prompt studio improvements
- Fix biome import ordering in frontend test files
- Fix 22 stale backend test assertions in test_tasks.py (patch targets,
return values, view dispatch pattern, remove TestCeleryConfig)
- Add ide_prompt_complete callback tests
- Add frontend vitest config and regression tests for null guards,
stale closures, and single-pass loading guard
- Add LLMCompat llama-index compatibility wrapper in SDK1
- Add litellm cohere embed timeout monkey-patch (v1.82.3)
- Improve S3 filesystem helper (region_name, empty credential handling)
- Add RetrieverLLM and lazy LLM creation for retrievers
- Improve worker cleanup: api_client.close() in finally blocks,
early return on setup failure in API deployment tasks
- Add workers conftest.py for test environment setup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix pre-existing biome CI errors: import ordering and formatting
Auto-fix 18 pre-existing organizeImports and formatting errors across
17 frontend files that were blocking biome ci on the entire codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix ruff F821: add missing transaction import in prompt_studio_helper
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add input validation guards to bulk_fetch_response endpoint
Validate prompt_ids (non-empty), document_id (required), and handle
DoesNotExist for both prompts and document to return proper 400/404
instead of dispatching no-op tasks or raising unhandled 500 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* IDE Call backs
* Sonar issues fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update bun.lock to match package.json dependency ranges
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move ExecutionContext import into TYPE_CHECKING block
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix SonarQube issues: duplication, naming, nesting, unused var
- Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to
deduplicate 5 identical JSON parsing blocks in internal_views.py
- Rename `User` local variable to `user_model` (naming convention)
- Merge _emit_result/_emit_error into unified _emit_event() in
ide_callback tasks to reduce code duplication
- Extract _get_task_error() helper to deduplicate AsyncResult error
retrieval in ide_index_error and ide_prompt_error
- Remove unused `mock_ar_cls` variable in test_ide_callback.py
- Add security note documenting why @csrf_exempt is safe on internal endpoints
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Replace worker-ide-callback Dockerfile with worker-unified
The IDE callback worker should use the unified worker image
(worker-unified.Dockerfile) consistent with all other v2 workers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add celery_executor_agentic queue to executor worker
The executor worker only consumed from celery_executor_legacy, but the
agentic prompt studio dispatches tasks to celery_executor_agentic. This
caused agentic operations to sit in RabbitMQ with no consumer, resulting
in timeouts and stuck-in-progress states.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* FIxing email enforce type
* Removing line-item from select choices
* Update workers/shared/enums/worker_enums_base.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* Update backend/workflow_manager/workflow_v2/workflow_helper.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix false success logs and silent failures in ETL destination pipelines
- Widen except clause in structure_tool_task to catch FileOperationError
and log write paths for diagnostics
- Add diagnostic logging at all silent return-None points in destination
connector so missing INFILE/METADATA paths are visible in logs
- Raise RuntimeError instead of silently skipping when no tool execution
result is available for DB/FS destinations, preventing false success
- Remove dead retry config from execute_bin task (max_retries=0)
- Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Revert ETL destination pipeline changes — deferring to next cut
Reverts diagnostic logging and error-raising changes in
structure_tool_task.py and destination_connector.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix false success logs and missing data in ETL destination pipelines
The Celery-based structure tool execution never created the COPY_TO_FOLDER
directory that FS destination connectors expect, causing "file successfully
sent" logs with no data written. DB destinations silently skipped insertion
when tool_execution_result was None, also producing false success.
- Write structured output to COPY_TO_FOLDER for both regular and agentic paths
- Widen exception handling to catch FileOperationError from fs.json_dump
- Raise RuntimeError instead of silently skipping in FS/DB destination handlers
- Add warning logs at silent return-None points in result retrieval
- Surface METADATA.json write failures with error-level logging
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix missing context_retrieval metric for single pass extraction
The cloud single_pass_extraction plugin handles retrieval internally but
does not report context_retrieval timing in its metrics. This adds a
post-execution file-read measurement that injects the metric, matching
what RetrievalService.retrieve_complete_context provides for the normal
answer_prompt path. Also forces chunk-size=0 for single pass so the OSS
fallback path uses full-context retrieval instead of vector DB.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix Unstructured IO adapter PermissionError on remote storage
Rewrite UnstructuredHelper.process_document() to read file bytes via
fs.read() and wrap in BytesIO, removing the hardcoded local FileStorage
download/builtin open() pattern. The previous code assumed
input_file_path was a local filesystem path, which broke in the executor
worker container where files live in MinIO and the local staging path
isn't mounted. Now matches the pattern used by LLMWhisperer v2 and works
for both local and remote storage backends.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Defer subscription usage tracking to IDE callback workers
Move subscription usage commit from the synchronous Prompt Studio entry
points (index_document, prompt_responder) into the async IDE callback
tasks (ide_index_complete, ide_prompt_complete). With the new executor
flow, the sync entry points return before the actual execution finishes,
so the dashboard usage decorator was firing prematurely and missing real
usage data. Tracking now runs after the executor signals success via the
callback, ensuring dashboard usage metrics reflect completed runs only.
Errors from the subscription plugin are caught and logged so they never
fail the callback. Plugin is optional (OSS mode is a no-op).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix missing embedding metadata in API deployment with chunking
In the structure pipeline indexing path, _run_pipeline_index built the
INDEX executor_params without usage_kwargs. This caused EmbeddingCompat
to register its UsageHandler callback with empty kwargs, so audit DB
rows for embedding usage were stored without run_id/file_execution_id.
get_usage_by_model() in the API deployment then returned no embedding
data, leaving the embedding key absent from the response metadata.
Propagate usage_kwargs from extract_params through _run_pipeline_index
into the INDEX ExecutionContext so the embedding adapter's callback
records audit rows against the correct file_execution_id.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix email enforce type returning "NA" string and surface null in FE
The EMAIL branch in LegacyExecutor stored the literal string "NA" when
the LLM could not extract an address — first because the early-return
short-circuited before _convert_scalar_answer ran, and second because
_convert_scalar_answer never re-checked the second-pass LLM result.
The top-level NA → None sanitizer was also missing, so any "NA" that
slipped through type conversion reached the FE as a string.
Backend (workers/executor/executors/legacy_executor.py):
- _convert_scalar_answer now also returns None when the LLM
extraction itself yields "NA" (benefits both EMAIL and DATE)
- EMAIL branch in _convert_answer_by_type now delegates to
_convert_scalar_answer instead of inlining the if/else
- _sanitize_null_values re-adds the top-level scalar "NA" → None
branch as a defensive backstop, with .strip() symmetry across
dict, list, and nested helpers
Frontend (DisplayPromptResult.jsx + PromptCard.css):
- Split the null/undefined early-return: undefined still shows
"Yet to run", but null now renders an italic-grey "null" literal
so the user can distinguish "ran but produced no value" from
"never ran"
Tests:
- workers/tests/test_answer_prompt.py: TestNullSanitization NA
cases assert None instead of preserved strings; new
TestConvertScalarAnswer covers first-pass / second-pass /
success paths
- DisplayPromptResult.test.jsx: null test asserts the literal
"null" is rendered (and "Yet to run" is not)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Feat/line item executor plugin (#1899)
* [FIX] Executor Queue for Agentic extraction (#1893)
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fixing API depployment response mismatches
* Fix single-pass extraction showing only one prompt result in real-time
- Fix accumulation bug in usePromptOutput updateCoverage() where each
loop iteration spread the original promptOutputs instead of the
accumulating updatedPromptOutputs, causing only the last prompt to
render
- Improve index success toast to show document name
- Strip adapter names from index key configs for consistent hashing
- Update sdk1 uv.lock
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move summarize from sync Django plugin to executor worker for IDE index
When "Summarize" is enabled and a user indexes a new document in Prompt
Studio, the backend returned a 500 because the sync Django plugin tried
to read the extracted .txt file before extraction happened in the worker.
Fix: defer summarization to the executor worker's _handle_ide_index
(extract → summarize → index), build summarize_params in the payload,
and track the summarize index in the ide_index_complete callback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR #1849 review comments: null guards, thread safety
- Fix null guard ordering in build_index_payload: add DefaultProfileError
check before calling validators on default_profile
- Fix null guard ordering in _fetch_single_pass_response: move check
before .llm.id access and validators (was dead code after dereference)
- Add threading.Lock to worker_celery singleton to prevent race under
concurrent gunicorn threads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add documentation to ExecutionResponse DTO describing result structure
Documents the ExecutionResponse dataclass fields, especially the
result attribute's per-file dict structure (output, metadata, metrics,
error keys) as requested in PR review.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup
- Strip result payload from task_status to prevent IDOR data leak
- Move null guard before validators in build_fetch_response_payload
- Roll back indexing flag on broker failure in index_document
- Use explicit IDs for spinner clearing instead of ORM serialization format
- Catch broad exceptions in summarize tracking to prevent false failures
- Guard prompt_id lookup in fetch_response with 400/404 responses
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix CI, tests, and add async prompt studio improvements
- Fix biome import ordering in frontend test files
- Fix 22 stale backend test assertions in test_tasks.py (patch targets,
return values, view dispatch pattern, remove TestCeleryConfig)
- Add ide_prompt_complete callback tests
- Add frontend vitest config and regression tests for null guards,
stale closures, and single-pass loading guard
- Add LLMCompat llama-index compatibility wrapper in SDK1
- Add litellm cohere embed timeout monkey-patch (v1.82.3)
- Improve S3 filesystem helper (region_name, empty credential handling)
- Add RetrieverLLM and lazy LLM creation for retrievers
- Improve worker cleanup: api_client.close() in finally blocks,
early return on setup failure in API deployment tasks
- Add workers conftest.py for test environment setup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix pre-existing biome CI errors: import ordering and formatting
Auto-fix 18 pre-existing organizeImports and formatting errors across
17 frontend files that were blocking biome ci on the entire codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix ruff F821: add missing transaction import in prompt_studio_helper
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add input validation guards to bulk_fetch_response endpoint
Validate prompt_ids (non-empty), document_id (required), and handle
DoesNotExist for both prompts and document to return proper 400/404
instead of dispatching no-op tasks or raising unhandled 500 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* IDE Call backs
* Sonar issues fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update bun.lock to match package.json dependency ranges
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move ExecutionContext import into TYPE_CHECKING block
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix SonarQube issues: duplication, naming, nesting, unused var
- Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to
deduplicate 5 identical JSON parsing blocks in internal_views.py
- Rename `User` local variable to `user_model` (naming convention)
- Merge _emit_result/_emit_error into unified _emit_event() in
ide_callback tasks to reduce code duplication
- Extract _get_task_error() helper to deduplicate AsyncResult error
retrieval in ide_index_error and ide_prompt_error
- Remove unused `mock_ar_cls` variable in test_ide_callback.py
- Add security note documenting why @csrf_exempt is safe on internal endpoints
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Replace worker-ide-callback Dockerfile with worker-unified
The IDE callback worker should use the unified worker image
(worker-unified.Dockerfile) consistent with all other v2 workers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add celery_executor_agentic queue to executor worker
The executor worker only consumed from celery_executor_legacy, but the
agentic prompt studio dispatches tasks to celery_executor_agentic. This
caused agentic operations to sit in RabbitMQ with no consumer, resulting
in timeouts and stuck-in-progress states.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* FIxing email enforce type
* Removing line-item from select choices
* Update workers/shared/enums/worker_enums_base.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* Update backend/workflow_manager/workflow_v2/workflow_helper.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix false success logs and silent failures in ETL destination pipelines
- Widen except clause in structure_tool_task to catch FileOperationError
and log write paths for diagnostics
- Add diagnostic logging at all silent return-None points in destination
connector so missing INFILE/METADATA paths are visible in logs
- Raise RuntimeError instead of silently skipping when no tool execution
result is available for DB/FS destinations, preventing false success
- Remove dead retry config from execute_bin task (max_retries=0)
- Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Revert ETL destination pipeline changes — deferring to next cut
Reverts diagnostic logging and error-raising changes in
structure_tool_task.py and destination_connector.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* [MISC] Fix blank LLM profile edit form and settings menu bold style (#1896)
- Replace form.resetFields() with form.setFieldsValue() to fix blank
edit form caused by Ant Design's initialValues being a one-time snapshot
- Remove Strict Mode-incompatible cleanup that cleared editLlmProfileId
- Clear editLlmProfileId in parent when adding new profile
- Remove unconditional bold on first settings popover menu item
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [FEAT] Port LINE_ITEM extraction to pluggable executor architecture
LINE_ITEM enforce_type prompts previously raised an unconditional error
in the new workers executor. Wire the structure-tool path to delegate to
a `line_item` executor plugin (mirroring how TABLE delegates to the
`table_extractor` plugin), so API deployments / ETL pipelines can run
LINE_ITEM extraction once the cloud `line_item_extractor` plugin is
installed.
- legacy_executor.py: replace the LINE_ITEM error site in
`_execute_single_prompt` with a delegation call and add a new
`_run_line_item_extraction()` method structurally identical to
`_run_table_extraction()`.
- test_line_item_extraction.py: 5 new tests covering plugin missing,
success, sub-context construction, failure path, and end-to-end
Celery eager-mode delegation.
- test_sanity_phase6d.py: update the existing LINE_ITEM guard test
to match the new "install the line_item_extractor plugin" hint
and patch `ExecutorRegistry.get` to simulate the missing plugin.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
* Guard against undefined connector type in PostHog event lookup
ConfigureDs called type.toUpperCase() unconditionally, which threw a
TypeError when type was undefined and broke the connector save flow.
Use optional chaining and only fire the PostHog event when the key
maps to a known connector.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add worker-executor-v2 service to docker-compose under workers-v2 profile
Defines a second executor worker container that runs the unified
worker image with WORKER_TYPE=executor on port 8092. Gated behind
the workers-v2 compose profile so it is opt-in and does not affect
default deployments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Feat/line item executor plugin (#1900)
* [FIX] Executor Queue for Agentic extraction (#1893)
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <noreply@anthro…
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fixing API depployment response mismatches
* Fix single-pass extraction showing only one prompt result in real-time
- Fix accumulation bug in usePromptOutput updateCoverage() where each
loop iteration spread the original promptOutputs instead of the
accumulating updatedPromptOutputs, causing only the last prompt to
render
- Improve index success toast to show document name
- Strip adapter names from index key configs for consistent hashing
- Update sdk1 uv.lock
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move summarize from sync Django plugin to executor worker for IDE index
When "Summarize" is enabled and a user indexes a new document in Prompt
Studio, the backend returned a 500 because the sync Django plugin tried
to read the extracted .txt file before extraction happened in the worker.
Fix: defer summarization to the executor worker's _handle_ide_index
(extract → summarize → index), build summarize_params in the payload,
and track the summarize index in the ide_index_complete callback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR #1849 review comments: null guards, thread safety
- Fix null guard ordering in build_index_payload: add DefaultProfileError
check before calling validators on default_profile
- Fix null guard ordering in _fetch_single_pass_response: move check
before .llm.id access and validators (was dead code after dereference)
- Add threading.Lock to worker_celery singleton to prevent race under
concurrent gunicorn threads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add documentation to ExecutionResponse DTO describing result structure
Documents the ExecutionResponse dataclass fields, especially the
result attribute's per-file dict structure (output, metadata, metrics,
error keys) as requested in PR review.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup
- Strip result payload from task_status to prevent IDOR data leak
- Move null guard before validators in build_fetch_response_payload
- Roll back indexing flag on broker failure in index_document
- Use explicit IDs for spinner clearing instead of ORM serialization format
- Catch broad exceptions in summarize tracking to prevent false failures
- Guard prompt_id lookup in fetch_response with 400/404 responses
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix CI, tests, and add async prompt studio improvements
- Fix biome import ordering in frontend test files
- Fix 22 stale backend test assertions in test_tasks.py (patch targets,
return values, view dispatch pattern, remove TestCeleryConfig)
- Add ide_prompt_complete callback tests
- Add frontend vitest config and regression tests for null guards,
stale closures, and single-pass loading guard
- Add LLMCompat llama-index compatibility wrapper in SDK1
- Add litellm cohere embed timeout monkey-patch (v1.82.3)
- Improve S3 filesystem helper (region_name, empty credential handling)
- Add RetrieverLLM and lazy LLM creation for retrievers
- Improve worker cleanup: api_client.close() in finally blocks,
early return on setup failure in API deployment tasks
- Add workers conftest.py for test environment setup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix pre-existing biome CI errors: import ordering and formatting
Auto-fix 18 pre-existing organizeImports and formatting errors across
17 frontend files that were blocking biome ci on the entire codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix ruff F821: add missing transaction import in prompt_studio_helper
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add input validation guards to bulk_fetch_response endpoint
Validate prompt_ids (non-empty), document_id (required), and handle
DoesNotExist for both prompts and document to return proper 400/404
instead of dispatching no-op tasks or raising unhandled 500 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* IDE Call backs
* Sonar issues fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update bun.lock to match package.json dependency ranges
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move ExecutionContext import into TYPE_CHECKING block
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix SonarQube issues: duplication, naming, nesting, unused var
- Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to
deduplicate 5 identical JSON parsing blocks in internal_views.py
- Rename `User` local variable to `user_model` (naming convention)
- Merge _emit_result/_emit_error into unified _emit_event() in
ide_callback tasks to reduce code duplication
- Extract _get_task_error() helper to deduplicate AsyncResult error
retrieval in ide_index_error and ide_prompt_error
- Remove unused `mock_ar_cls` variable in test_ide_callback.py
- Add security note documenting why @csrf_exempt is safe on internal endpoints
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Replace worker-ide-callback Dockerfile with worker-unified
The IDE callback worker should use the unified worker image
(worker-unified.Dockerfile) consistent with all other v2 workers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add celery_executor_agentic queue to executor worker
The executor worker only consumed from celery_executor_legacy, but the
agentic prompt studio dispatches tasks to celery_executor_agentic. This
caused agentic operations to sit in RabbitMQ with no consumer, resulting
in timeouts and stuck-in-progress states.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* FIxing email enforce type
* Removing line-item from select choices
* Update workers/shared/enums/worker_enums_base.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* Update backend/workflow_manager/workflow_v2/workflow_helper.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix false success logs and silent failures in ETL destination pipelines
- Widen except clause in structure_tool_task to catch FileOperationError
and log write paths for diagnostics
- Add diagnostic logging at all silent return-None points in destination
connector so missing INFILE/METADATA paths are visible in logs
- Raise RuntimeError instead of silently skipping when no tool execution
result is available for DB/FS destinations, preventing false success
- Remove dead retry config from execute_bin task (max_retries=0)
- Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Revert ETL destination pipeline changes — deferring to next cut
Reverts diagnostic logging and error-raising changes in
structure_tool_task.py and destination_connector.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix false success logs and missing data in ETL destination pipelines
The Celery-based structure tool execution never created the COPY_TO_FOLDER
directory that FS destination connectors expect, causing "file successfully
sent" logs with no data written. DB destinations silently skipped insertion
when tool_execution_result was None, also producing false success.
- Write structured output to COPY_TO_FOLDER for both regular and agentic paths
- Widen exception handling to catch FileOperationError from fs.json_dump
- Raise RuntimeError instead of silently skipping in FS/DB destination handlers
- Add warning logs at silent return-None points in result retrieval
- Surface METADATA.json write failures with error-level logging
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix missing context_retrieval metric for single pass extraction
The cloud single_pass_extraction plugin handles retrieval internally but
does not report context_retrieval timing in its metrics. This adds a
post-execution file-read measurement that injects the metric, matching
what RetrievalService.retrieve_complete_context provides for the normal
answer_prompt path. Also forces chunk-size=0 for single pass so the OSS
fallback path uses full-context retrieval instead of vector DB.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix Unstructured IO adapter PermissionError on remote storage
Rewrite UnstructuredHelper.process_document() to read file bytes via
fs.read() and wrap in BytesIO, removing the hardcoded local FileStorage
download/builtin open() pattern. The previous code assumed
input_file_path was a local filesystem path, which broke in the executor
worker container where files live in MinIO and the local staging path
isn't mounted. Now matches the pattern used by LLMWhisperer v2 and works
for both local and remote storage backends.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Defer subscription usage tracking to IDE callback workers
Move subscription usage commit from the synchronous Prompt Studio entry
points (index_document, prompt_responder) into the async IDE callback
tasks (ide_index_complete, ide_prompt_complete). With the new executor
flow, the sync entry points return before the actual execution finishes,
so the dashboard usage decorator was firing prematurely and missing real
usage data. Tracking now runs after the executor signals success via the
callback, ensuring dashboard usage metrics reflect completed runs only.
Errors from the subscription plugin are caught and logged so they never
fail the callback. Plugin is optional (OSS mode is a no-op).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix missing embedding metadata in API deployment with chunking
In the structure pipeline indexing path, _run_pipeline_index built the
INDEX executor_params without usage_kwargs. This caused EmbeddingCompat
to register its UsageHandler callback with empty kwargs, so audit DB
rows for embedding usage were stored without run_id/file_execution_id.
get_usage_by_model() in the API deployment then returned no embedding
data, leaving the embedding key absent from the response metadata.
Propagate usage_kwargs from extract_params through _run_pipeline_index
into the INDEX ExecutionContext so the embedding adapter's callback
records audit rows against the correct file_execution_id.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix email enforce type returning "NA" string and surface null in FE
The EMAIL branch in LegacyExecutor stored the literal string "NA" when
the LLM could not extract an address — first because the early-return
short-circuited before _convert_scalar_answer ran, and second because
_convert_scalar_answer never re-checked the second-pass LLM result.
The top-level NA → None sanitizer was also missing, so any "NA" that
slipped through type conversion reached the FE as a string.
Backend (workers/executor/executors/legacy_executor.py):
- _convert_scalar_answer now also returns None when the LLM
extraction itself yields "NA" (benefits both EMAIL and DATE)
- EMAIL branch in _convert_answer_by_type now delegates to
_convert_scalar_answer instead of inlining the if/else
- _sanitize_null_values re-adds the top-level scalar "NA" → None
branch as a defensive backstop, with .strip() symmetry across
dict, list, and nested helpers
Frontend (DisplayPromptResult.jsx + PromptCard.css):
- Split the null/undefined early-return: undefined still shows
"Yet to run", but null now renders an italic-grey "null" literal
so the user can distinguish "ran but produced no value" from
"never ran"
Tests:
- workers/tests/test_answer_prompt.py: TestNullSanitization NA
cases assert None instead of preserved strings; new
TestConvertScalarAnswer covers first-pass / second-pass /
success paths
- DisplayPromptResult.test.jsx: null test asserts the literal
"null" is rendered (and "Yet to run" is not)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Feat/line item executor plugin (#1899)
* [FIX] Executor Queue for Agentic extraction (#1893)
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fixing API depployment response mismatches
* Fix single-pass extraction showing only one prompt result in real-time
- Fix accumulation bug in usePromptOutput updateCoverage() where each
loop iteration spread the original promptOutputs instead of the
accumulating updatedPromptOutputs, causing only the last prompt to
render
- Improve index success toast to show document name
- Strip adapter names from index key configs for consistent hashing
- Update sdk1 uv.lock
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move summarize from sync Django plugin to executor worker for IDE index
When "Summarize" is enabled and a user indexes a new document in Prompt
Studio, the backend returned a 500 because the sync Django plugin tried
to read the extracted .txt file before extraction happened in the worker.
Fix: defer summarization to the executor worker's _handle_ide_index
(extract → summarize → index), build summarize_params in the payload,
and track the summarize index in the ide_index_complete callback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR #1849 review comments: null guards, thread safety
- Fix null guard ordering in build_index_payload: add DefaultProfileError
check before calling validators on default_profile
- Fix null guard ordering in _fetch_single_pass_response: move check
before .llm.id access and validators (was dead code after dereference)
- Add threading.Lock to worker_celery singleton to prevent race under
concurrent gunicorn threads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add documentation to ExecutionResponse DTO describing result structure
Documents the ExecutionResponse dataclass fields, especially the
result attribute's per-file dict structure (output, metadata, metrics,
error keys) as requested in PR review.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup
- Strip result payload from task_status to prevent IDOR data leak
- Move null guard before validators in build_fetch_response_payload
- Roll back indexing flag on broker failure in index_document
- Use explicit IDs for spinner clearing instead of ORM serialization format
- Catch broad exceptions in summarize tracking to prevent false failures
- Guard prompt_id lookup in fetch_response with 400/404 responses
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix CI, tests, and add async prompt studio improvements
- Fix biome import ordering in frontend test files
- Fix 22 stale backend test assertions in test_tasks.py (patch targets,
return values, view dispatch pattern, remove TestCeleryConfig)
- Add ide_prompt_complete callback tests
- Add frontend vitest config and regression tests for null guards,
stale closures, and single-pass loading guard
- Add LLMCompat llama-index compatibility wrapper in SDK1
- Add litellm cohere embed timeout monkey-patch (v1.82.3)
- Improve S3 filesystem helper (region_name, empty credential handling)
- Add RetrieverLLM and lazy LLM creation for retrievers
- Improve worker cleanup: api_client.close() in finally blocks,
early return on setup failure in API deployment tasks
- Add workers conftest.py for test environment setup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix pre-existing biome CI errors: import ordering and formatting
Auto-fix 18 pre-existing organizeImports and formatting errors across
17 frontend files that were blocking biome ci on the entire codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix ruff F821: add missing transaction import in prompt_studio_helper
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add input validation guards to bulk_fetch_response endpoint
Validate prompt_ids (non-empty), document_id (required), and handle
DoesNotExist for both prompts and document to return proper 400/404
instead of dispatching no-op tasks or raising unhandled 500 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* IDE Call backs
* Sonar issues fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update bun.lock to match package.json dependency ranges
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move ExecutionContext import into TYPE_CHECKING block
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix SonarQube issues: duplication, naming, nesting, unused var
- Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to
deduplicate 5 identical JSON parsing blocks in internal_views.py
- Rename `User` local variable to `user_model` (naming convention)
- Merge _emit_result/_emit_error into unified _emit_event() in
ide_callback tasks to reduce code duplication
- Extract _get_task_error() helper to deduplicate AsyncResult error
retrieval in ide_index_error and ide_prompt_error
- Remove unused `mock_ar_cls` variable in test_ide_callback.py
- Add security note documenting why @csrf_exempt is safe on internal endpoints
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Replace worker-ide-callback Dockerfile with worker-unified
The IDE callback worker should use the unified worker image
(worker-unified.Dockerfile) consistent with all other v2 workers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add celery_executor_agentic queue to executor worker
The executor worker only consumed from celery_executor_legacy, but the
agentic prompt studio dispatches tasks to celery_executor_agentic. This
caused agentic operations to sit in RabbitMQ with no consumer, resulting
in timeouts and stuck-in-progress states.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* FIxing email enforce type
* Removing line-item from select choices
* Update workers/shared/enums/worker_enums_base.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* Update backend/workflow_manager/workflow_v2/workflow_helper.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix false success logs and silent failures in ETL destination pipelines
- Widen except clause in structure_tool_task to catch FileOperationError
and log write paths for diagnostics
- Add diagnostic logging at all silent return-None points in destination
connector so missing INFILE/METADATA paths are visible in logs
- Raise RuntimeError instead of silently skipping when no tool execution
result is available for DB/FS destinations, preventing false success
- Remove dead retry config from execute_bin task (max_retries=0)
- Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Revert ETL destination pipeline changes — deferring to next cut
Reverts diagnostic logging and error-raising changes in
structure_tool_task.py and destination_connector.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* [MISC] Fix blank LLM profile edit form and settings menu bold style (#1896)
- Replace form.resetFields() with form.setFieldsValue() to fix blank
edit form caused by Ant Design's initialValues being a one-time snapshot
- Remove Strict Mode-incompatible cleanup that cleared editLlmProfileId
- Clear editLlmProfileId in parent when adding new profile
- Remove unconditional bold on first settings popover menu item
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [FEAT] Port LINE_ITEM extraction to pluggable executor architecture
LINE_ITEM enforce_type prompts previously raised an unconditional error
in the new workers executor. Wire the structure-tool path to delegate to
a `line_item` executor plugin (mirroring how TABLE delegates to the
`table_extractor` plugin), so API deployments / ETL pipelines can run
LINE_ITEM extraction once the cloud `line_item_extractor` plugin is
installed.
- legacy_executor.py: replace the LINE_ITEM error site in
`_execute_single_prompt` with a delegation call and add a new
`_run_line_item_extraction()` method structurally identical to
`_run_table_extraction()`.
- test_line_item_extraction.py: 5 new tests covering plugin missing,
success, sub-context construction, failure path, and end-to-end
Celery eager-mode delegation.
- test_sanity_phase6d.py: update the existing LINE_ITEM guard test
to match the new "install the line_item_extractor plugin" hint
and patch `ExecutorRegistry.get` to simulate the missing plugin.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
* Guard against undefined connector type in PostHog event lookup
ConfigureDs called type.toUpperCase() unconditionally, which threw a
TypeError when type was undefined and broke the connector save flow.
Use optional chaining and only fire the PostHog event when the key
maps to a known connector.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add worker-executor-v2 service to docker-compose under workers-v2 profile
Defines a second executor worker container that runs the unified
worker image with WORKER_TYPE=executor on port 8092. Gated behind
the workers-v2 compose profile so it is opt-in and does not affect
default deployments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Feat/line item executor plugin (#1900)
* [FIX] Executor Queue for Agentic extraction (#1893)
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic…
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fixing API depployment response mismatches
* Fix single-pass extraction showing only one prompt result in real-time
- Fix accumulation bug in usePromptOutput updateCoverage() where each
loop iteration spread the original promptOutputs instead of the
accumulating updatedPromptOutputs, causing only the last prompt to
render
- Improve index success toast to show document name
- Strip adapter names from index key configs for consistent hashing
- Update sdk1 uv.lock
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move summarize from sync Django plugin to executor worker for IDE index
When "Summarize" is enabled and a user indexes a new document in Prompt
Studio, the backend returned a 500 because the sync Django plugin tried
to read the extracted .txt file before extraction happened in the worker.
Fix: defer summarization to the executor worker's _handle_ide_index
(extract → summarize → index), build summarize_params in the payload,
and track the summarize index in the ide_index_complete callback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR #1849 review comments: null guards, thread safety
- Fix null guard ordering in build_index_payload: add DefaultProfileError
check before calling validators on default_profile
- Fix null guard ordering in _fetch_single_pass_response: move check
before .llm.id access and validators (was dead code after dereference)
- Add threading.Lock to worker_celery singleton to prevent race under
concurrent gunicorn threads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add documentation to ExecutionResponse DTO describing result structure
Documents the ExecutionResponse dataclass fields, especially the
result attribute's per-file dict structure (output, metadata, metrics,
error keys) as requested in PR review.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup
- Strip result payload from task_status to prevent IDOR data leak
- Move null guard before validators in build_fetch_response_payload
- Roll back indexing flag on broker failure in index_document
- Use explicit IDs for spinner clearing instead of ORM serialization format
- Catch broad exceptions in summarize tracking to prevent false failures
- Guard prompt_id lookup in fetch_response with 400/404 responses
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix CI, tests, and add async prompt studio improvements
- Fix biome import ordering in frontend test files
- Fix 22 stale backend test assertions in test_tasks.py (patch targets,
return values, view dispatch pattern, remove TestCeleryConfig)
- Add ide_prompt_complete callback tests
- Add frontend vitest config and regression tests for null guards,
stale closures, and single-pass loading guard
- Add LLMCompat llama-index compatibility wrapper in SDK1
- Add litellm cohere embed timeout monkey-patch (v1.82.3)
- Improve S3 filesystem helper (region_name, empty credential handling)
- Add RetrieverLLM and lazy LLM creation for retrievers
- Improve worker cleanup: api_client.close() in finally blocks,
early return on setup failure in API deployment tasks
- Add workers conftest.py for test environment setup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix pre-existing biome CI errors: import ordering and formatting
Auto-fix 18 pre-existing organizeImports and formatting errors across
17 frontend files that were blocking biome ci on the entire codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix ruff F821: add missing transaction import in prompt_studio_helper
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add input validation guards to bulk_fetch_response endpoint
Validate prompt_ids (non-empty), document_id (required), and handle
DoesNotExist for both prompts and document to return proper 400/404
instead of dispatching no-op tasks or raising unhandled 500 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* IDE Call backs
* Sonar issues fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update bun.lock to match package.json dependency ranges
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move ExecutionContext import into TYPE_CHECKING block
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix SonarQube issues: duplication, naming, nesting, unused var
- Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to
deduplicate 5 identical JSON parsing blocks in internal_views.py
- Rename `User` local variable to `user_model` (naming convention)
- Merge _emit_result/_emit_error into unified _emit_event() in
ide_callback tasks to reduce code duplication
- Extract _get_task_error() helper to deduplicate AsyncResult error
retrieval in ide_index_error and ide_prompt_error
- Remove unused `mock_ar_cls` variable in test_ide_callback.py
- Add security note documenting why @csrf_exempt is safe on internal endpoints
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Replace worker-ide-callback Dockerfile with worker-unified
The IDE callback worker should use the unified worker image
(worker-unified.Dockerfile) consistent with all other v2 workers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add celery_executor_agentic queue to executor worker
The executor worker only consumed from celery_executor_legacy, but the
agentic prompt studio dispatches tasks to celery_executor_agentic. This
caused agentic operations to sit in RabbitMQ with no consumer, resulting
in timeouts and stuck-in-progress states.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* FIxing email enforce type
* Removing line-item from select choices
* Update workers/shared/enums/worker_enums_base.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* Update backend/workflow_manager/workflow_v2/workflow_helper.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix false success logs and silent failures in ETL destination pipelines
- Widen except clause in structure_tool_task to catch FileOperationError
and log write paths for diagnostics
- Add diagnostic logging at all silent return-None points in destination
connector so missing INFILE/METADATA paths are visible in logs
- Raise RuntimeError instead of silently skipping when no tool execution
result is available for DB/FS destinations, preventing false success
- Remove dead retry config from execute_bin task (max_retries=0)
- Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Revert ETL destination pipeline changes — deferring to next cut
Reverts diagnostic logging and error-raising changes in
structure_tool_task.py and destination_connector.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix false success logs and missing data in ETL destination pipelines
The Celery-based structure tool execution never created the COPY_TO_FOLDER
directory that FS destination connectors expect, causing "file successfully
sent" logs with no data written. DB destinations silently skipped insertion
when tool_execution_result was None, also producing false success.
- Write structured output to COPY_TO_FOLDER for both regular and agentic paths
- Widen exception handling to catch FileOperationError from fs.json_dump
- Raise RuntimeError instead of silently skipping in FS/DB destination handlers
- Add warning logs at silent return-None points in result retrieval
- Surface METADATA.json write failures with error-level logging
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix missing context_retrieval metric for single pass extraction
The cloud single_pass_extraction plugin handles retrieval internally but
does not report context_retrieval timing in its metrics. This adds a
post-execution file-read measurement that injects the metric, matching
what RetrievalService.retrieve_complete_context provides for the normal
answer_prompt path. Also forces chunk-size=0 for single pass so the OSS
fallback path uses full-context retrieval instead of vector DB.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix Unstructured IO adapter PermissionError on remote storage
Rewrite UnstructuredHelper.process_document() to read file bytes via
fs.read() and wrap in BytesIO, removing the hardcoded local FileStorage
download/builtin open() pattern. The previous code assumed
input_file_path was a local filesystem path, which broke in the executor
worker container where files live in MinIO and the local staging path
isn't mounted. Now matches the pattern used by LLMWhisperer v2 and works
for both local and remote storage backends.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Defer subscription usage tracking to IDE callback workers
Move subscription usage commit from the synchronous Prompt Studio entry
points (index_document, prompt_responder) into the async IDE callback
tasks (ide_index_complete, ide_prompt_complete). With the new executor
flow, the sync entry points return before the actual execution finishes,
so the dashboard usage decorator was firing prematurely and missing real
usage data. Tracking now runs after the executor signals success via the
callback, ensuring dashboard usage metrics reflect completed runs only.
Errors from the subscription plugin are caught and logged so they never
fail the callback. Plugin is optional (OSS mode is a no-op).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix missing embedding metadata in API deployment with chunking
In the structure pipeline indexing path, _run_pipeline_index built the
INDEX executor_params without usage_kwargs. This caused EmbeddingCompat
to register its UsageHandler callback with empty kwargs, so audit DB
rows for embedding usage were stored without run_id/file_execution_id.
get_usage_by_model() in the API deployment then returned no embedding
data, leaving the embedding key absent from the response metadata.
Propagate usage_kwargs from extract_params through _run_pipeline_index
into the INDEX ExecutionContext so the embedding adapter's callback
records audit rows against the correct file_execution_id.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix email enforce type returning "NA" string and surface null in FE
The EMAIL branch in LegacyExecutor stored the literal string "NA" when
the LLM could not extract an address — first because the early-return
short-circuited before _convert_scalar_answer ran, and second because
_convert_scalar_answer never re-checked the second-pass LLM result.
The top-level NA → None sanitizer was also missing, so any "NA" that
slipped through type conversion reached the FE as a string.
Backend (workers/executor/executors/legacy_executor.py):
- _convert_scalar_answer now also returns None when the LLM
extraction itself yields "NA" (benefits both EMAIL and DATE)
- EMAIL branch in _convert_answer_by_type now delegates to
_convert_scalar_answer instead of inlining the if/else
- _sanitize_null_values re-adds the top-level scalar "NA" → None
branch as a defensive backstop, with .strip() symmetry across
dict, list, and nested helpers
Frontend (DisplayPromptResult.jsx + PromptCard.css):
- Split the null/undefined early-return: undefined still shows
"Yet to run", but null now renders an italic-grey "null" literal
so the user can distinguish "ran but produced no value" from
"never ran"
Tests:
- workers/tests/test_answer_prompt.py: TestNullSanitization NA
cases assert None instead of preserved strings; new
TestConvertScalarAnswer covers first-pass / second-pass /
success paths
- DisplayPromptResult.test.jsx: null test asserts the literal
"null" is rendered (and "Yet to run" is not)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Feat/line item executor plugin (#1899)
* [FIX] Executor Queue for Agentic extraction (#1893)
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fixing API depployment response mismatches
* Fix single-pass extraction showing only one prompt result in real-time
- Fix accumulation bug in usePromptOutput updateCoverage() where each
loop iteration spread the original promptOutputs instead of the
accumulating updatedPromptOutputs, causing only the last prompt to
render
- Improve index success toast to show document name
- Strip adapter names from index key configs for consistent hashing
- Update sdk1 uv.lock
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move summarize from sync Django plugin to executor worker for IDE index
When "Summarize" is enabled and a user indexes a new document in Prompt
Studio, the backend returned a 500 because the sync Django plugin tried
to read the extracted .txt file before extraction happened in the worker.
Fix: defer summarization to the executor worker's _handle_ide_index
(extract → summarize → index), build summarize_params in the payload,
and track the summarize index in the ide_index_complete callback.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR #1849 review comments: null guards, thread safety
- Fix null guard ordering in build_index_payload: add DefaultProfileError
check before calling validators on default_profile
- Fix null guard ordering in _fetch_single_pass_response: move check
before .llm.id access and validators (was dead code after dereference)
- Add threading.Lock to worker_celery singleton to prevent race under
concurrent gunicorn threads
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add documentation to ExecutionResponse DTO describing result structure
Documents the ExecutionResponse dataclass fields, especially the
result attribute's per-file dict structure (output, metadata, metrics,
error keys) as requested in PR review.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix PR review issues: IDOR, null guards, rollback, spinner, summarize, prompt lookup
- Strip result payload from task_status to prevent IDOR data leak
- Move null guard before validators in build_fetch_response_payload
- Roll back indexing flag on broker failure in index_document
- Use explicit IDs for spinner clearing instead of ORM serialization format
- Catch broad exceptions in summarize tracking to prevent false failures
- Guard prompt_id lookup in fetch_response with 400/404 responses
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix CI, tests, and add async prompt studio improvements
- Fix biome import ordering in frontend test files
- Fix 22 stale backend test assertions in test_tasks.py (patch targets,
return values, view dispatch pattern, remove TestCeleryConfig)
- Add ide_prompt_complete callback tests
- Add frontend vitest config and regression tests for null guards,
stale closures, and single-pass loading guard
- Add LLMCompat llama-index compatibility wrapper in SDK1
- Add litellm cohere embed timeout monkey-patch (v1.82.3)
- Improve S3 filesystem helper (region_name, empty credential handling)
- Add RetrieverLLM and lazy LLM creation for retrievers
- Improve worker cleanup: api_client.close() in finally blocks,
early return on setup failure in API deployment tasks
- Add workers conftest.py for test environment setup
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix pre-existing biome CI errors: import ordering and formatting
Auto-fix 18 pre-existing organizeImports and formatting errors across
17 frontend files that were blocking biome ci on the entire codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix ruff F821: add missing transaction import in prompt_studio_helper
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add input validation guards to bulk_fetch_response endpoint
Validate prompt_ids (non-empty), document_id (required), and handle
DoesNotExist for both prompts and document to return proper 400/404
instead of dispatching no-op tasks or raising unhandled 500 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* IDE Call backs
* Sonar issues fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix ruff errors: restore summary_profile variable, suppress TC001 in dispatcher
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update bun.lock to match package.json dependency ranges
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fix all biome lint warnings: empty blocks, missing braces, forEach returns, unused vars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Move ExecutionContext import into TYPE_CHECKING block
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix SonarQube issues: duplication, naming, nesting, unused var
- Extract _parse_json_body() helper and _ERR_INVALID_JSON constant to
deduplicate 5 identical JSON parsing blocks in internal_views.py
- Rename `User` local variable to `user_model` (naming convention)
- Merge _emit_result/_emit_error into unified _emit_event() in
ide_callback tasks to reduce code duplication
- Extract _get_task_error() helper to deduplicate AsyncResult error
retrieval in ide_index_error and ide_prompt_error
- Remove unused `mock_ar_cls` variable in test_ide_callback.py
- Add security note documenting why @csrf_exempt is safe on internal endpoints
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Replace worker-ide-callback Dockerfile with worker-unified
The IDE callback worker should use the unified worker image
(worker-unified.Dockerfile) consistent with all other v2 workers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add celery_executor_agentic queue to executor worker
The executor worker only consumed from celery_executor_legacy, but the
agentic prompt studio dispatches tasks to celery_executor_agentic. This
caused agentic operations to sit in RabbitMQ with no consumer, resulting
in timeouts and stuck-in-progress states.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* FIxing email enforce type
* Removing line-item from select choices
* Update workers/shared/enums/worker_enums_base.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* Update backend/workflow_manager/workflow_v2/workflow_helper.py
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix false success logs and silent failures in ETL destination pipelines
- Widen except clause in structure_tool_task to catch FileOperationError
and log write paths for diagnostics
- Add diagnostic logging at all silent return-None points in destination
connector so missing INFILE/METADATA paths are visible in logs
- Raise RuntimeError instead of silently skipping when no tool execution
result is available for DB/FS destinations, preventing false success
- Remove dead retry config from execute_bin task (max_retries=0)
- Fix duplicate EXECUTOR/IDE_CALLBACK enum entries in WorkerType
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Revert ETL destination pipeline changes — deferring to next cut
Reverts diagnostic logging and error-raising changes in
structure_tool_task.py and destination_connector.py.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
* [MISC] Fix blank LLM profile edit form and settings menu bold style (#1896)
- Replace form.resetFields() with form.setFieldsValue() to fix blank
edit form caused by Ant Design's initialValues being a one-time snapshot
- Remove Strict Mode-incompatible cleanup that cleared editLlmProfileId
- Clear editLlmProfileId in parent when adding new profile
- Remove unconditional bold on first settings popover menu item
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* [FEAT] Port LINE_ITEM extraction to pluggable executor architecture
LINE_ITEM enforce_type prompts previously raised an unconditional error
in the new workers executor. Wire the structure-tool path to delegate to
a `line_item` executor plugin (mirroring how TABLE delegates to the
`table_extractor` plugin), so API deployments / ETL pipelines can run
LINE_ITEM extraction once the cloud `line_item_extractor` plugin is
installed.
- legacy_executor.py: replace the LINE_ITEM error site in
`_execute_single_prompt` with a delegation call and add a new
`_run_line_item_extraction()` method structurally identical to
`_run_table_extraction()`.
- test_line_item_extraction.py: 5 new tests covering plugin missing,
success, sub-context construction, failure path, and end-to-end
Celery eager-mode delegation.
- test_sanity_phase6d.py: update the existing LINE_ITEM guard test
to match the new "install the line_item_extractor plugin" hint
and patch `ExecutorRegistry.get` to simulate the missing plugin.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com>
Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com>
Co-authored-by: Kirtiman Mishra <110175055+kirtimanmishrazipstack@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com>
* Guard against undefined connector type in PostHog event lookup
ConfigureDs called type.toUpperCase() unconditionally, which threw a
TypeError when type was undefined and broke the connector save flow.
Use optional chaining and only fire the PostHog event when the key
maps to a known connector.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Add worker-executor-v2 service to docker-compose under workers-v2 profile
Defines a second executor worker container that runs the unified
worker image with WORKER_TYPE=executor on port 8092. Gated behind
the workers-v2 compose profile so it is opt-in and does not affect
default deployments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Feat/line item executor plugin (#1900)
* [FIX] Executor Queue for Agentic extraction (#1893)
* Execution backend - revamp
* async flow
* Streaming progress to FE
* Removing multi hop in Prompt studio ide and structure tool
* UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
* Added executors for agentic prompt studio
* Added executors for agentic prompt studio
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* Removed redundant envs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed redundant envs
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Pluggable apps and plugins to fit the new async prompt execution architecture
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* adding worker for callbacks
* fix: write output files in agentic extraction pipeline
Agentic extraction returned early without writing INFILE (JSON) or
METADATA.json, causing destination connectors to read the original PDF
and fail with "Expected tool output type: TXT, got: application/pdf".
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update docs
* UN-3266 fix: remove dead code with undefined names in fetch_response
Remove unreachable code block after the async callback return in
fetch_response that still referenced output_count_before and response
from the old synchronous implementation, causing ruff F821 errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Un 3266 fix security hotspot tmp paths (#1851)
* UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests
Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with
pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants
in all affected test files to avoid world-writable directory vulnerabilities.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve ruff linting failures across multiple files
- B026: pass url positionally in worker_celery.py to avoid star-arg after keyword
- N803: rename MockAsyncResult to mock_async_result in test_tasks.py
- E501/I001: fix long line and import sort in llm_whisperer helper
- ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers
- F841: remove unused workflow_id and result assignments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
- S2259: guard against None after _discover_plugins() in loader.py
to satisfy static analysis on the dict[str,type]|None field type
- S1244: replace float equality checks with pytest.approx() in
test_answer_prompt.py and test_phase2h.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud code smells in PR #1849
- S5799: Merge all implicit string concatenations in log messages
(legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py,
registry.py, variable_replacement.py, structure_tool_task.py)
- S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in
dispatcher.py
- S1871: Merge identical elif/else branches in tasks.py and
test_sanity_phase6j.py
- S1186: Add comment to empty stub method in test_sanity_phase6a.py
- S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j
and test_phase5d.py
- S117: Rename PascalCase local variables to snake_case in
test_sanity_phase3/5/6i.py
- S5655: Broaden tool type annotation to StreamMixin in
IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config
- docker:S7031: Merge consecutive RUN instructions in
worker-unified.Dockerfile
- javascript:S1128: Remove unused pollForCompletion import in
usePromptRun.js
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: wrap long log message in dispatcher.py to fix E501
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud S117 naming violations
Rename PascalCase local variables to snake_case to comply with S117:
- legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results
(AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc,
VariableReplacementService→variable_replacement_svc, LLM→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and
update all downstream usages including _apply_type_conversion and
_handle_summarize
- test_phase1_log_streaming.py: rename Mock* local variables to
mock_* snake_case equivalents
- test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls
and MockShim→mock_shim_cls across all 10 test methods
- test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text
in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test;
fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls,
VectorDB→vector_db_cls in _mock_prompt_deps helper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
- test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase
local variables in _mock_prompt_deps/_mock_deps to snake_case
(RetrievalService→retrieval_svc, VariableReplacementService→
variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls,
EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls,
AnswerPromptService→answer_prompt_svc_cls) — fixes S117
- test_sanity_phase3.py: remove unused local variable "result" — fixes S1481
- structure_tool_task.py: remove redundant json.JSONDecodeError from
except clause (subclass of ValueError) — fixes S5713
- shared/workflow/execution/service.py: replace generic Exception with
RuntimeError for structure tool failure — fixes S112
- run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and
replace 10 literal "executor" occurrences — fixes S1192
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations
- Reduce cognitive complexity in answer_prompt.py:
- Extract _build_grammar_notes, _run_webhook_postprocess helpers
- _is_safe_public_url: extracted _resolve_host_addresses helper
- handle_json: early-return pattern eliminates nesting
- construct_prompt: delegates grammar loop to _build_grammar_notes
- Reduce cognitive complexity in legacy_executor.py:
- Extract _execute_single_prompt, _run_table_extraction helpers
- Extract _run_challenge_if_enabled, _run_evaluation_if_enabled
- Extract _inject_table_settings, _finalize_pipeline_result
- Extract _convert_number_answer, _convert_scalar_answer
- Extract _sanitize_dict_values helper
- _handle_answer_prompt CC reduced from 50 to ~7
- Reduce CC in structure_tool_task.py: guard-clause refactor
- Reduce CC in backend: dto.py, deployment_helper.py,
api_deployment_views.py, prompt_studio_helper.py
- Fix S117: rename PascalCase local vars in test_answer_prompt.py
- Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh
- Fix S1172: remove unused params from structure_tool_task.py
- Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py
- Fix S112/S5727 in test_execution.py
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: rename UsageHelper params to lowercase (N803)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192
- Add @staticmethod to _sanitize_null_values (fixes S2325 missing self)
- Reduce _execute_single_prompt params from 25 to 11 (S107)
by grouping services as deps tuple and extracting exec params
from context.executor_params
- Add NOSONAR suppression for raise exc in test helper (S112)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
execution_id, file_hash, log_events_id, custom_data are now extracted
inside _execute_single_prompt from context.executor_params.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: resolve Biome linting errors in frontend source files
Auto-fixed 48 lint errors across 56 files: import ordering, block
statements, unused variable prefixing, and formatting issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: replace dynamic import of SharePermission with static import in Workflows
Resolves vite build warning about SharePermission.jsx being both
dynamically and statically imported across the codebase.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: resolve SonarCloud warnings in frontend components
- Remove unnecessary try-catch around PostHog event calls
- Flip negated condition in PromptOutput.handleTable for clarity
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Address PR #1849 review comments: fix null guards, dead code, and test drift
- Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid)
- URL-encode DB_USER in worker_celery.py result backend connection string
- Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app
- Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist)
- Move profile_manager/default_profile null checks before first dereference
- Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py
- Handle ProfileManager.DoesNotExist as warning, not hard failure
- Wrap PostHog analytics in try/catch so failures don't block prompt execution
- Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status)
- Reset formData when metadata is missing in ConfigureDs.jsx
- Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only)
- Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix missing llm_usage_reason for summarize LLM usage tracking
Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so
summarization costs appear under summarize_llm in API response metadata.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
- Route _handle_structure_pipeline to _handle_single_pass_extraction when
is_single_pass=True (was always calling _handle_answer_prompt)
- Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry,
falling back to _handle_answer_prompt if plugin not installed
Co-Authored-By: Claude Opus 4.6 <norep…
The INFILE overwrite (fs.json_dump to input_file_path) appeared twice in _write_pipeline_outputs(), causing an unnecessary extra MinIO PUT per file execution. Remove the duplicate that was a copy-paste artifact from PR #1849. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…wered table extraction (#1914) * Execution backend - revamp * async flow * Streaming progress to FE * Removing multi hop in Prompt studio ide and structure tool * UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item * Added executors for agentic prompt studio * Added executors for agentic prompt studio * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * fix: write output files in agentic extraction pipeline Agentic extraction returned early without writing INFILE (JSON) or METADATA.json, causing destination connectors to read the original PDF and fail with "Expected tool output type: TXT, got: application/pdf". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update docs * UN-3266 fix: remove dead code with undefined names in fetch_response Remove unreachable code block after the async callback return in fetch_response that still referenced output_count_before and response from the old synchronous implementation, causing ruff F821 errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Un 3266 fix security hotspot tmp paths (#1851) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve ruff linting failures across multiple files - B026: pass url positionally in worker_celery.py to avoid star-arg after keyword - N803: rename MockAsyncResult to mock_async_result in test_tasks.py - E501/I001: fix long line and import sort in llm_whisperer helper - ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers - F841: remove unused workflow_id and result assignments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849 - S2259: guard against None after _discover_plugins() in loader.py to satisfy static analysis on the dict[str,type]|None field type - S1244: replace float equality checks with pytest.approx() in test_answer_prompt.py and test_phase2h.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud code smells in PR #1849 - S5799: Merge all implicit string concatenations in log messages (legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py, registry.py, variable_replacement.py, structure_tool_task.py) - S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in dispatcher.py - S1871: Merge identical elif/else branches in tasks.py and test_sanity_phase6j.py - S1186: Add comment to empty stub method in test_sanity_phase6a.py - S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j and test_phase5d.py - S117: Rename PascalCase local variables to snake_case in test_sanity_phase3/5/6i.py - S5655: Broaden tool type annotation to StreamMixin in IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config - docker:S7031: Merge consecutive RUN instructions in worker-unified.Dockerfile - javascript:S1128: Remove unused pollForCompletion import in usePromptRun.js Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: wrap long log message in dispatcher.py to fix E501 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud S117 naming violations Rename PascalCase local variables to snake_case to comply with S117: - legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results (AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc, VariableReplacementService→variable_replacement_svc, LLM→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and update all downstream usages including _apply_type_conversion and _handle_summarize - test_phase1_log_streaming.py: rename Mock* local variables to mock_* snake_case equivalents - test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls and MockShim→mock_shim_cls across all 10 test methods - test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test; fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls in _mock_prompt_deps helper Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849 - test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase local variables in _mock_prompt_deps/_mock_deps to snake_case (RetrievalService→retrieval_svc, VariableReplacementService→ variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls, AnswerPromptService→answer_prompt_svc_cls) — fixes S117 - test_sanity_phase3.py: remove unused local variable "result" — fixes S1481 - structure_tool_task.py: remove redundant json.JSONDecodeError from except clause (subclass of ValueError) — fixes S5713 - shared/workflow/execution/service.py: replace generic Exception with RuntimeError for structure tool failure — fixes S112 - run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and replace 10 literal "executor" occurrences — fixes S1192 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations - Reduce cognitive complexity in answer_prompt.py: - Extract _build_grammar_notes, _run_webhook_postprocess helpers - _is_safe_public_url: extracted _resolve_host_addresses helper - handle_json: early-return pattern eliminates nesting - construct_prompt: delegates grammar loop to _build_grammar_notes - Reduce cognitive complexity in legacy_executor.py: - Extract _execute_single_prompt, _run_table_extraction helpers - Extract _run_challenge_if_enabled, _run_evaluation_if_enabled - Extract _inject_table_settings, _finalize_pipeline_result - Extract _convert_number_answer, _convert_scalar_answer - Extract _sanitize_dict_values helper - _handle_answer_prompt CC reduced from 50 to ~7 - Reduce CC in structure_tool_task.py: guard-clause refactor - Reduce CC in backend: dto.py, deployment_helper.py, api_deployment_views.py, prompt_studio_helper.py - Fix S117: rename PascalCase local vars in test_answer_prompt.py - Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh - Fix S1172: remove unused params from structure_tool_task.py - Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py - Fix S112/S5727 in test_execution.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: rename UsageHelper params to lowercase (N803) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192 - Add @staticmethod to _sanitize_null_values (fixes S2325 missing self) - Reduce _execute_single_prompt params from 25 to 11 (S107) by grouping services as deps tuple and extracting exec params from context.executor_params - Add NOSONAR suppression for raise exc in test helper (S112) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: remove unused locals in _handle_answer_prompt (F841) execution_id, file_hash, log_events_id, custom_data are now extracted inside _execute_single_prompt from context.executor_params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve Biome linting errors in frontend source files Auto-fixed 48 lint errors across 56 files: import ordering, block statements, unused variable prefixing, and formatting issues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace dynamic import of SharePermission with static import in Workflows Resolves vite build warning about SharePermission.jsx being both dynamically and statically imported across the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve SonarCloud warnings in frontend components - Remove unnecessary try-catch around PostHog event calls - Flip negated condition in PromptOutput.handleTable for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address PR #1849 review comments: fix null guards, dead code, and test drift - Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid) - URL-encode DB_USER in worker_celery.py result backend connection string - Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app - Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist) - Move profile_manager/default_profile null checks before first dereference - Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py - Handle ProfileManager.DoesNotExist as warning, not hard failure - Wrap PostHog analytics in try/catch so failures don't block prompt execution - Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status) - Reset formData when metadata is missing in ConfigureDs.jsx - Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only) - Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix missing llm_usage_reason for summarize LLM usage tracking Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so summarization costs appear under summarize_llm in API response metadata. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor - Route _handle_structure_pipeline to _handle_single_pass_extraction when is_single_pass=True (was always calling _handle_answer_prompt) - Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry, falling back to _handle_answer_prompt if plugin not installed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing API depployment response mismatches * Add complete_vision() method to SDK1 LLM for multimodal completions Adds a new complete_vision() method alongside existing complete() that accepts pre-built multimodal messages (text + image_url) in OpenAI-style format. LiteLLM auto-translates for Anthropic/Bedrock/Vertex providers. This enables the agentic table extractor plugin to send page images alongside text prompts for VLM-based table detection and extraction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Gate Run button by agentic table readiness checklist - PromptCardItems loads AgenticTableChecklist plugin and owns the isAgenticTableReady state, rendering the checklist above the prompt text area and delegating the settings gear visibility to the plugin. - Header and PromptOutput disable their Run buttons when isAgenticTableReady is false (default true for non-agentic types). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [FIX] Use correct primary key field in prompt count subquery (#1905) ToolStudioPrompt uses prompt_id as its primary key, not id. Count("id") causes FieldError on the list endpoint (500). Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FIX] Add agentic_table as valid enforce_type choice The cloud build adds "agentic_table" to the prompt enforce_type dropdown, but the OSS ToolStudioPrompt model rejected it as an invalid choice. Add AGENTIC_TABLE to EnforceType and ship a matching migration so the value can be persisted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Wire agentic_table enforce_type to executor dispatch The single-prompt run flow had no branch for prompts with enforce_type=agentic_table, so clicking Run silently fell through to the legacy prompt-service path and never invoked the agentic_table executor. Adds an AGENTIC_TABLE constant to TSPKeys, includes it in the OperationNotSupported guard, and dispatches to PayloadModifier.execute_agentic_table when the plugin is available so the result still flows through _handle_response. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Add agentic_table queue to executor worker defaults The ExecutionDispatcher derives the queue name from the executor name (celery_executor_{name}), so dispatches to the agentic_table executor land on celery_executor_agentic_table. The local docker-compose default only listed celery_executor_legacy and celery_executor_agentic, so no worker consumed the new queue and dispatch hung for the full 1-hour result timeout. Adds the missing queue to the docker-compose default. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Dispatch agentic_table prompts to executor on IDE Run The IDE Run button was building a legacy answer_prompt payload for agentic_table prompts, so the agentic table executor was never invoked. Branch fetch_response on enforce_type so agentic_table prompts are built via the cloud payload_modifier plugin and dispatched directly to celery_executor_agentic_table. Add the enforce_type to the OSS dropdown choices and the JSON-dump set in OutputManagerHelper so the persisted output is parseable by the FE table renderer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Reshape agentic_table executor output in IDE callback The agentic_table executor returns {"output": {"tables": [...], "page_count": ..., "headers": [...], ...}}, but OutputManagerHelper.handle_prompt_output_update reads outputs[prompt.prompt_key] when persisting prompt output. Without a reshape the table list never lands under the prompt key and the FE sees an empty result. When cb_kwargs carries is_agentic_table=True and prompt_key (set by the cloud build_agentic_table_payload), reshape outputs to {prompt_key: tables} before calling update_prompt_output. The executor itself also shapes its envelope, so this is a defensive double-keying that keeps the legacy answer_prompt path untouched. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing timeout issues * API deployment fixes for Agentic table extractor * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixing syntax issues * Fix agentic_table executor reading INFILE after JSON overwrite Read from SOURCE instead of INFILE when dispatching to the agentic_table executor. INFILE gets overwritten with JSON output by the regular pipeline, causing PDFium parse errors when the agentic_table executor tries to process it as a PDF. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com>
…wered table extraction (#1914) * Execution backend - revamp * async flow * Streaming progress to FE * Removing multi hop in Prompt studio ide and structure tool * UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item * Added executors for agentic prompt studio * Added executors for agentic prompt studio * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * fix: write output files in agentic extraction pipeline Agentic extraction returned early without writing INFILE (JSON) or METADATA.json, causing destination connectors to read the original PDF and fail with "Expected tool output type: TXT, got: application/pdf". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update docs * UN-3266 fix: remove dead code with undefined names in fetch_response Remove unreachable code block after the async callback return in fetch_response that still referenced output_count_before and response from the old synchronous implementation, causing ruff F821 errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Un 3266 fix security hotspot tmp paths (#1851) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve ruff linting failures across multiple files - B026: pass url positionally in worker_celery.py to avoid star-arg after keyword - N803: rename MockAsyncResult to mock_async_result in test_tasks.py - E501/I001: fix long line and import sort in llm_whisperer helper - ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers - F841: remove unused workflow_id and result assignments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849 - S2259: guard against None after _discover_plugins() in loader.py to satisfy static analysis on the dict[str,type]|None field type - S1244: replace float equality checks with pytest.approx() in test_answer_prompt.py and test_phase2h.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud code smells in PR #1849 - S5799: Merge all implicit string concatenations in log messages (legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py, registry.py, variable_replacement.py, structure_tool_task.py) - S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in dispatcher.py - S1871: Merge identical elif/else branches in tasks.py and test_sanity_phase6j.py - S1186: Add comment to empty stub method in test_sanity_phase6a.py - S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j and test_phase5d.py - S117: Rename PascalCase local variables to snake_case in test_sanity_phase3/5/6i.py - S5655: Broaden tool type annotation to StreamMixin in IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config - docker:S7031: Merge consecutive RUN instructions in worker-unified.Dockerfile - javascript:S1128: Remove unused pollForCompletion import in usePromptRun.js Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: wrap long log message in dispatcher.py to fix E501 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud S117 naming violations Rename PascalCase local variables to snake_case to comply with S117: - legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results (AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc, VariableReplacementService→variable_replacement_svc, LLM→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and update all downstream usages including _apply_type_conversion and _handle_summarize - test_phase1_log_streaming.py: rename Mock* local variables to mock_* snake_case equivalents - test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls and MockShim→mock_shim_cls across all 10 test methods - test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test; fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls in _mock_prompt_deps helper Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849 - test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase local variables in _mock_prompt_deps/_mock_deps to snake_case (RetrievalService→retrieval_svc, VariableReplacementService→ variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls, AnswerPromptService→answer_prompt_svc_cls) — fixes S117 - test_sanity_phase3.py: remove unused local variable "result" — fixes S1481 - structure_tool_task.py: remove redundant json.JSONDecodeError from except clause (subclass of ValueError) — fixes S5713 - shared/workflow/execution/service.py: replace generic Exception with RuntimeError for structure tool failure — fixes S112 - run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and replace 10 literal "executor" occurrences — fixes S1192 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations - Reduce cognitive complexity in answer_prompt.py: - Extract _build_grammar_notes, _run_webhook_postprocess helpers - _is_safe_public_url: extracted _resolve_host_addresses helper - handle_json: early-return pattern eliminates nesting - construct_prompt: delegates grammar loop to _build_grammar_notes - Reduce cognitive complexity in legacy_executor.py: - Extract _execute_single_prompt, _run_table_extraction helpers - Extract _run_challenge_if_enabled, _run_evaluation_if_enabled - Extract _inject_table_settings, _finalize_pipeline_result - Extract _convert_number_answer, _convert_scalar_answer - Extract _sanitize_dict_values helper - _handle_answer_prompt CC reduced from 50 to ~7 - Reduce CC in structure_tool_task.py: guard-clause refactor - Reduce CC in backend: dto.py, deployment_helper.py, api_deployment_views.py, prompt_studio_helper.py - Fix S117: rename PascalCase local vars in test_answer_prompt.py - Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh - Fix S1172: remove unused params from structure_tool_task.py - Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py - Fix S112/S5727 in test_execution.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: rename UsageHelper params to lowercase (N803) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192 - Add @staticmethod to _sanitize_null_values (fixes S2325 missing self) - Reduce _execute_single_prompt params from 25 to 11 (S107) by grouping services as deps tuple and extracting exec params from context.executor_params - Add NOSONAR suppression for raise exc in test helper (S112) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: remove unused locals in _handle_answer_prompt (F841) execution_id, file_hash, log_events_id, custom_data are now extracted inside _execute_single_prompt from context.executor_params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve Biome linting errors in frontend source files Auto-fixed 48 lint errors across 56 files: import ordering, block statements, unused variable prefixing, and formatting issues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace dynamic import of SharePermission with static import in Workflows Resolves vite build warning about SharePermission.jsx being both dynamically and statically imported across the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve SonarCloud warnings in frontend components - Remove unnecessary try-catch around PostHog event calls - Flip negated condition in PromptOutput.handleTable for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address PR #1849 review comments: fix null guards, dead code, and test drift - Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid) - URL-encode DB_USER in worker_celery.py result backend connection string - Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app - Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist) - Move profile_manager/default_profile null checks before first dereference - Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py - Handle ProfileManager.DoesNotExist as warning, not hard failure - Wrap PostHog analytics in try/catch so failures don't block prompt execution - Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status) - Reset formData when metadata is missing in ConfigureDs.jsx - Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only) - Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix missing llm_usage_reason for summarize LLM usage tracking Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so summarization costs appear under summarize_llm in API response metadata. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor - Route _handle_structure_pipeline to _handle_single_pass_extraction when is_single_pass=True (was always calling _handle_answer_prompt) - Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry, falling back to _handle_answer_prompt if plugin not installed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing API depployment response mismatches * Add complete_vision() method to SDK1 LLM for multimodal completions Adds a new complete_vision() method alongside existing complete() that accepts pre-built multimodal messages (text + image_url) in OpenAI-style format. LiteLLM auto-translates for Anthropic/Bedrock/Vertex providers. This enables the agentic table extractor plugin to send page images alongside text prompts for VLM-based table detection and extraction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Gate Run button by agentic table readiness checklist - PromptCardItems loads AgenticTableChecklist plugin and owns the isAgenticTableReady state, rendering the checklist above the prompt text area and delegating the settings gear visibility to the plugin. - Header and PromptOutput disable their Run buttons when isAgenticTableReady is false (default true for non-agentic types). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [FIX] Use correct primary key field in prompt count subquery (#1905) ToolStudioPrompt uses prompt_id as its primary key, not id. Count("id") causes FieldError on the list endpoint (500). Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FIX] Add agentic_table as valid enforce_type choice The cloud build adds "agentic_table" to the prompt enforce_type dropdown, but the OSS ToolStudioPrompt model rejected it as an invalid choice. Add AGENTIC_TABLE to EnforceType and ship a matching migration so the value can be persisted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Wire agentic_table enforce_type to executor dispatch The single-prompt run flow had no branch for prompts with enforce_type=agentic_table, so clicking Run silently fell through to the legacy prompt-service path and never invoked the agentic_table executor. Adds an AGENTIC_TABLE constant to TSPKeys, includes it in the OperationNotSupported guard, and dispatches to PayloadModifier.execute_agentic_table when the plugin is available so the result still flows through _handle_response. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Add agentic_table queue to executor worker defaults The ExecutionDispatcher derives the queue name from the executor name (celery_executor_{name}), so dispatches to the agentic_table executor land on celery_executor_agentic_table. The local docker-compose default only listed celery_executor_legacy and celery_executor_agentic, so no worker consumed the new queue and dispatch hung for the full 1-hour result timeout. Adds the missing queue to the docker-compose default. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Dispatch agentic_table prompts to executor on IDE Run The IDE Run button was building a legacy answer_prompt payload for agentic_table prompts, so the agentic table executor was never invoked. Branch fetch_response on enforce_type so agentic_table prompts are built via the cloud payload_modifier plugin and dispatched directly to celery_executor_agentic_table. Add the enforce_type to the OSS dropdown choices and the JSON-dump set in OutputManagerHelper so the persisted output is parseable by the FE table renderer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Reshape agentic_table executor output in IDE callback The agentic_table executor returns {"output": {"tables": [...], "page_count": ..., "headers": [...], ...}}, but OutputManagerHelper.handle_prompt_output_update reads outputs[prompt.prompt_key] when persisting prompt output. Without a reshape the table list never lands under the prompt key and the FE sees an empty result. When cb_kwargs carries is_agentic_table=True and prompt_key (set by the cloud build_agentic_table_payload), reshape outputs to {prompt_key: tables} before calling update_prompt_output. The executor itself also shapes its envelope, so this is a defensive double-keying that keeps the legacy answer_prompt path untouched. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing timeout issues * API deployment fixes for Agentic table extractor * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixing syntax issues * Fix agentic_table executor reading INFILE after JSON overwrite Read from SOURCE instead of INFILE when dispatching to the agentic_table executor. INFILE gets overwritten with JSON output by the regular pipeline, causing PDFium parse errors when the agentic_table executor tries to process it as a PDF. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com>
… only" (#1936) * api deployment notification init * UN-3431 [FIX] Stream tool-run logs to workflow execution UI with markdown rendering (#1927) * [FIX] Make tool-run logs visible in workflow execution UI Two stacked gaps were keeping tool-level log lines (Processing prompt, Running LLM completion, lookup calls, etc.) out of the workflow execution logs UI and the execution_log DB table for API / workflow runs: 1. Empty log_events_id. structure_tool_task seeded LOG_EVENTS_ID in StateStore but never threaded it into pipeline_ctx / agentic_ctx. ExecutorToolShim.stream_log gated publishing on self.log_events_id, so every tool-level log was dropped before it ever reached the broker. 2. Wrong payload shape. Even with the channel threaded, stream_log used LogPublisher.log_progress(...) whose payload omits execution_id / organization_id / file_execution_id. get_validated_log_data (log_utils.py) requires those IDs and LogType == LOG to persist to execution_log, so tool-level messages were silently filtered at the Redis->DB drain step — orchestration logs persisted, tool logs did not. Fixes: - ExecutionContext gains execution_id + file_execution_id, populated in structure_tool_task for both the legacy pipeline and agentic contexts. - LegacyExecutor caches the three IDs on self during execute() and passes them into every ExecutorToolShim construction (~7 callsites). - ExecutorToolShim.stream_log now dual-emits: PROGRESS (unchanged, drives the IDE prompt-card live progress pane) plus LOG carrying the workflow IDs (feeds the workflow execution logs UI and persists to execution_log via the existing drain). LOG emission is gated on execution_id + organization_id being present, so bare IDE test runs without a workflow still behave as before. Rendering polish - The LogModal and pipeline LogsModal now pipe log text through the existing CustomMarkdown renderer, so backticked identifiers render as inline-code pills and embedded newlines break lines. This lets multi-line structured events (e.g. the lookup pre-call trio) surface as a single row with readable inner formatting. - Prompt-key mentions inside legacy_executor tool logs are wrapped in backticks for consistency with the rest of the log surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Wrap prompt_name in backticks in remaining stream_log calls Completes the consistency pass on tool-run log formatting: the table- and line-item-extraction success and error paths still emitted prompt names without backticks, so the markdown-rendered logs UI showed them as bare text instead of inline-code pills. Matches the pattern already applied to the other 9 stream_log calls in this file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Validate URL schemes in CustomMarkdown link renderer Workflow logs rendered via CustomMarkdown can contain tool-generated or user-derived content, so an untrusted \`[text](url)\` sequence could inject a \`javascript:\` or \`data:\` scheme and get clickable through antd \`Typography.Link\`. Allow-list the safe external schemes (http, https, mailto, tel) before rendering as a link; everything else falls back to plain text while still honouring the existing internal-path branch used for in-app navigation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Thread workflow IDs into remaining shim/context callsites Addresses CodeRabbit review gaps so the log-plumbing fix is consistent across every pre-dispatch and plugin-dispatch path: - `table_ctx` / `line_item_ctx` in `legacy_executor.py` now carry `log_events_id`, `execution_id`, `file_execution_id` from context so downstream table/line-item plugins that build their own `ExecutorToolShim` pass the `execution_id + organization_id` gate and emit workflow LOG payloads. - `structure_tool_task.py` threads the same IDs into the bare pre-dispatch shim, so `X2Text.process()` calls during agentic extraction reach the workflow logs UI. - `LogsModal.jsx` stores the raw log string in row data and lets the column renderer wrap it in `CustomMarkdown` — the previous map stored a `<CustomMarkdown />` element that was then passed back into `CustomMarkdown.text`, producing `[object Object]` for multi-row lookups. - Dropped `getattr(context, ...)` on `execution_id` / `file_execution_id` now that they are dataclass fields — matches the direct access used for `organization_id`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REFACTOR] Trim overly specific comments in log-plumbing changes Pass through the new comments added across this PR and either remove or tighten the ones that restate what the code already shows. Keep only the WHY lines that protect future readers from missing a non-obvious constraint (XSS guard in CustomMarkdown, dual PROGRESS/LOG emission in the shim, pre-dispatch shim needing workflow IDs so X2Text logs are not silently dropped). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REFACTOR] Extract isSafeExternalUrl into shared helpers module Moves the URL scheme allow-list check out of CustomMarkdown into helpers/urlSafety.js so any future component that renders links from user- or tool-derived content can reuse the same guard instead of re-implementing it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Tighten URL guard, split publish try/excepts, and extract shim builder Addresses the must-fix and worth-doing comments from the PR review: Security - CustomMarkdown: treat protocol-relative URLs (`//host/...`) as external, not internal, so they can no longer skip the scheme guard via the `startsWith("/")` branch. - `isSafeExternalUrl`: drop the `window.location.origin` base so bare strings ("javascript", "../foo") fail to parse instead of silently resolving to `https://<origin>/...` and passing the scheme check. Silent failure + comment accuracy - ExecutorToolShim.stream_log: split the PROGRESS and LOG publish paths into separate try/except blocks so a LogDataDTO validation failure on the LOG payload is no longer mis-attributed to "progress publish failed". Corrected the inline comments — the DB drop is driven by LogPublisher's `payload.type == 'LOG'` check, and only `execution_id` + `organization_id` are strictly required. Refactor - New `LegacyExecutor._build_shim()` helper — all seven ExecutorToolShim callsites now share one construction path so the workflow-ID plumbing can't drift out of sync across sites again. - Thread `execution_id` / `file_execution_id` into the seven self-dispatched sub-`ExecutionContext`s alongside `log_events_id`, matching the table/line-item sites and keeping the context consistent for any downstream consumer that reads the IDs from the context rather than from the executor instance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Address remaining type-design and silent-failure comments - ExecutionContext: drop the BE-coupled inline comment, document the new IDs in the Attributes block, and enforce the invariant that execution_id implies organization_id via __post_init__. - ExecutorToolShim: typed the three new IDs as str | None instead of str = "" so the signature matches the Optional semantics already enforced by the runtime guards. - LegacyExecutor: move per-request state to __init__ so _log_component is no longer a class-level mutable default shared across instances; stop silently coercing None IDs to ""; add a one-shot warning when a tool-sourced run lands without workflow IDs so the silent-no-persist case is visible in GKE logs. - structure_tool_task: emit the same warning when LOG_EVENTS_ID is absent from StateStore. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Surface first publish failure per shim at WARN Both PROGRESS and LOG publish paths previously swallowed every broker failure at DEBUG, so a misconfigured or down Redis broker meant every tool-level log silently vanished with no operator-visible signal. Track a per-shim _progress_publish_failed / _log_publish_failed flag and log the first failure at WARNING (with traceback), then downgrade subsequent failures on the same shim back to DEBUG. Preserves the non-fatal semantics of the publish path while making broker outages visible in GKE logs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3430 [FIX] Update modified_at field correctly for models (#1928) * [FIX] Auto-bump modified_at on QuerySet.update() and bulk_update() Django's auto_now=True only fires on Model.save(); QuerySet.update() and bulk_update() bypass save(), so BaseModel.modified_at silently stayed at the creation time for every bulk-path write. Audit trail drifted. Introduce BaseModelQuerySet that injects modified_at=timezone.now() into both paths, and expose it via BaseModelManager. Migrate all custom managers on BaseModel subclasses to compose BaseModelManager so their querysets inherit the overrides. Drop the ad-hoc modified_at=now() kwarg in FileHistoryHelper now that the queryset handles it. * [FIX] Materialize objs in BaseModelQuerySet.bulk_update to support generators Addresses PR review: if callers pass a non-rewindable iterable (generator, queryset iterator), the modified_at stamping loop would exhaust it before super().bulk_update() saw it, silently updating zero rows. list(objs) up front keeps generator callers working. Also drop the mock-based unit test — it needed django.setup() at module import which isn't viable without pytest-django, and proper DB-backed coverage is tracked separately. * [FIX] Auto-inject modified_at into BaseModel.save(update_fields=...) Django only runs auto_now for fields listed in update_fields, so every save(update_fields=["foo"]) on a BaseModel subclass silently drops the modified_at bump — same family of bug as QuerySet.update/bulk_update. Override BaseModel.save() to add modified_at to update_fields whenever the caller supplies a restricted list without it. Also drop two dead manual-assignment lines (execution.modified_at = timezone.now() before save()) that were redundant with auto_now on a full save(). * [FIX] Auto-bump modified_at on upsert bulk_create and drop workarounds QuerySet.bulk_create(update_conflicts=True, update_fields=[...]) runs an UPDATE on conflict with only the listed fields — same auto_now-bypass as save(update_fields=...) and QuerySet.update(). Patch BaseModelQuerySet's bulk_create to inject modified_at into update_fields on upsert. With that in place, the explicit "modified_at" entries in dashboard_metrics upsert callers are redundant. Drop them. * [REFACTOR] Tighten BaseModel auto-bump helpers and edge cases - Extract `_with_modified_at` helper; single source of truth for the "inject modified_at into a partial field list" rule across `bulk_update`, `bulk_create` and `BaseModel.save`. - Preserve Django's documented `save(update_fields=[])` no-op (signals-only save, no column writes) instead of rewriting it to `["modified_at"]`. Apply the same guard to `bulk_create(update_conflicts=True, update_fields=[])`. - Match Django's positional `save()` signature (`force_insert`, `force_update`, `using`, `update_fields`) so callers passing flags positionally still hit the auto-bump override. - Skip the per-obj `modified_at` stamp + `objs` materialization in `bulk_update` when the caller already listed `modified_at` — lets the opt-in path stay O(1) before the `super()` delegation. - Docstring corrections: "previous save() timestamp" (not just creation time); manager-level convention note; precise `auto_now` semantics (attribute still updates in-memory, just isn't persisted without `update_fields` inclusion). * UN-3403 [FEAT] Agentic table extractor plugin with multi-agent LLM-powered table extraction (#1914) * Execution backend - revamp * async flow * Streaming progress to FE * Removing multi hop in Prompt studio ide and structure tool * UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item * Added executors for agentic prompt studio * Added executors for agentic prompt studio * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * fix: write output files in agentic extraction pipeline Agentic extraction returned early without writing INFILE (JSON) or METADATA.json, causing destination connectors to read the original PDF and fail with "Expected tool output type: TXT, got: application/pdf". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update docs * UN-3266 fix: remove dead code with undefined names in fetch_response Remove unreachable code block after the async callback return in fetch_response that still referenced output_count_before and response from the old synchronous implementation, causing ruff F821 errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Un 3266 fix security hotspot tmp paths (#1851) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve ruff linting failures across multiple files - B026: pass url positionally in worker_celery.py to avoid star-arg after keyword - N803: rename MockAsyncResult to mock_async_result in test_tasks.py - E501/I001: fix long line and import sort in llm_whisperer helper - ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers - F841: remove unused workflow_id and result assignments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849 - S2259: guard against None after _discover_plugins() in loader.py to satisfy static analysis on the dict[str,type]|None field type - S1244: replace float equality checks with pytest.approx() in test_answer_prompt.py and test_phase2h.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud code smells in PR #1849 - S5799: Merge all implicit string concatenations in log messages (legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py, registry.py, variable_replacement.py, structure_tool_task.py) - S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in dispatcher.py - S1871: Merge identical elif/else branches in tasks.py and test_sanity_phase6j.py - S1186: Add comment to empty stub method in test_sanity_phase6a.py - S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j and test_phase5d.py - S117: Rename PascalCase local variables to snake_case in test_sanity_phase3/5/6i.py - S5655: Broaden tool type annotation to StreamMixin in IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config - docker:S7031: Merge consecutive RUN instructions in worker-unified.Dockerfile - javascript:S1128: Remove unused pollForCompletion import in usePromptRun.js Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: wrap long log message in dispatcher.py to fix E501 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud S117 naming violations Rename PascalCase local variables to snake_case to comply with S117: - legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results (AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc, VariableReplacementService→variable_replacement_svc, LLM→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and update all downstream usages including _apply_type_conversion and _handle_summarize - test_phase1_log_streaming.py: rename Mock* local variables to mock_* snake_case equivalents - test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls and MockShim→mock_shim_cls across all 10 test methods - test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test; fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls in _mock_prompt_deps helper Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849 - test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase local variables in _mock_prompt_deps/_mock_deps to snake_case (RetrievalService→retrieval_svc, VariableReplacementService→ variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls, AnswerPromptService→answer_prompt_svc_cls) — fixes S117 - test_sanity_phase3.py: remove unused local variable "result" — fixes S1481 - structure_tool_task.py: remove redundant json.JSONDecodeError from except clause (subclass of ValueError) — fixes S5713 - shared/workflow/execution/service.py: replace generic Exception with RuntimeError for structure tool failure — fixes S112 - run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and replace 10 literal "executor" occurrences — fixes S1192 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations - Reduce cognitive complexity in answer_prompt.py: - Extract _build_grammar_notes, _run_webhook_postprocess helpers - _is_safe_public_url: extracted _resolve_host_addresses helper - handle_json: early-return pattern eliminates nesting - construct_prompt: delegates grammar loop to _build_grammar_notes - Reduce cognitive complexity in legacy_executor.py: - Extract _execute_single_prompt, _run_table_extraction helpers - Extract _run_challenge_if_enabled, _run_evaluation_if_enabled - Extract _inject_table_settings, _finalize_pipeline_result - Extract _convert_number_answer, _convert_scalar_answer - Extract _sanitize_dict_values helper - _handle_answer_prompt CC reduced from 50 to ~7 - Reduce CC in structure_tool_task.py: guard-clause refactor - Reduce CC in backend: dto.py, deployment_helper.py, api_deployment_views.py, prompt_studio_helper.py - Fix S117: rename PascalCase local vars in test_answer_prompt.py - Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh - Fix S1172: remove unused params from structure_tool_task.py - Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py - Fix S112/S5727 in test_execution.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: rename UsageHelper params to lowercase (N803) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192 - Add @staticmethod to _sanitize_null_values (fixes S2325 missing self) - Reduce _execute_single_prompt params from 25 to 11 (S107) by grouping services as deps tuple and extracting exec params from context.executor_params - Add NOSONAR suppression for raise exc in test helper (S112) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: remove unused locals in _handle_answer_prompt (F841) execution_id, file_hash, log_events_id, custom_data are now extracted inside _execute_single_prompt from context.executor_params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve Biome linting errors in frontend source files Auto-fixed 48 lint errors across 56 files: import ordering, block statements, unused variable prefixing, and formatting issues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace dynamic import of SharePermission with static import in Workflows Resolves vite build warning about SharePermission.jsx being both dynamically and statically imported across the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve SonarCloud warnings in frontend components - Remove unnecessary try-catch around PostHog event calls - Flip negated condition in PromptOutput.handleTable for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address PR #1849 review comments: fix null guards, dead code, and test drift - Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid) - URL-encode DB_USER in worker_celery.py result backend connection string - Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app - Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist) - Move profile_manager/default_profile null checks before first dereference - Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py - Handle ProfileManager.DoesNotExist as warning, not hard failure - Wrap PostHog analytics in try/catch so failures don't block prompt execution - Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status) - Reset formData when metadata is missing in ConfigureDs.jsx - Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only) - Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix missing llm_usage_reason for summarize LLM usage tracking Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so summarization costs appear under summarize_llm in API response metadata. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor - Route _handle_structure_pipeline to _handle_single_pass_extraction when is_single_pass=True (was always calling _handle_answer_prompt) - Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry, falling back to _handle_answer_prompt if plugin not installed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing API depployment response mismatches * Add complete_vision() method to SDK1 LLM for multimodal completions Adds a new complete_vision() method alongside existing complete() that accepts pre-built multimodal messages (text + image_url) in OpenAI-style format. LiteLLM auto-translates for Anthropic/Bedrock/Vertex providers. This enables the agentic table extractor plugin to send page images alongside text prompts for VLM-based table detection and extraction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Gate Run button by agentic table readiness checklist - PromptCardItems loads AgenticTableChecklist plugin and owns the isAgenticTableReady state, rendering the checklist above the prompt text area and delegating the settings gear visibility to the plugin. - Header and PromptOutput disable their Run buttons when isAgenticTableReady is false (default true for non-agentic types). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [FIX] Use correct primary key field in prompt count subquery (#1905) ToolStudioPrompt uses prompt_id as its primary key, not id. Count("id") causes FieldError on the list endpoint (500). Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FIX] Add agentic_table as valid enforce_type choice The cloud build adds "agentic_table" to the prompt enforce_type dropdown, but the OSS ToolStudioPrompt model rejected it as an invalid choice. Add AGENTIC_TABLE to EnforceType and ship a matching migration so the value can be persisted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Wire agentic_table enforce_type to executor dispatch The single-prompt run flow had no branch for prompts with enforce_type=agentic_table, so clicking Run silently fell through to the legacy prompt-service path and never invoked the agentic_table executor. Adds an AGENTIC_TABLE constant to TSPKeys, includes it in the OperationNotSupported guard, and dispatches to PayloadModifier.execute_agentic_table when the plugin is available so the result still flows through _handle_response. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Add agentic_table queue to executor worker defaults The ExecutionDispatcher derives the queue name from the executor name (celery_executor_{name}), so dispatches to the agentic_table executor land on celery_executor_agentic_table. The local docker-compose default only listed celery_executor_legacy and celery_executor_agentic, so no worker consumed the new queue and dispatch hung for the full 1-hour result timeout. Adds the missing queue to the docker-compose default. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Dispatch agentic_table prompts to executor on IDE Run The IDE Run button was building a legacy answer_prompt payload for agentic_table prompts, so the agentic table executor was never invoked. Branch fetch_response on enforce_type so agentic_table prompts are built via the cloud payload_modifier plugin and dispatched directly to celery_executor_agentic_table. Add the enforce_type to the OSS dropdown choices and the JSON-dump set in OutputManagerHelper so the persisted output is parseable by the FE table renderer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Reshape agentic_table executor output in IDE callback The agentic_table executor returns {"output": {"tables": [...], "page_count": ..., "headers": [...], ...}}, but OutputManagerHelper.handle_prompt_output_update reads outputs[prompt.prompt_key] when persisting prompt output. Without a reshape the table list never lands under the prompt key and the FE sees an empty result. When cb_kwargs carries is_agentic_table=True and prompt_key (set by the cloud build_agentic_table_payload), reshape outputs to {prompt_key: tables} before calling update_prompt_output. The executor itself also shapes its envelope, so this is a defensive double-keying that keeps the legacy answer_prompt path untouched. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing timeout issues * API deployment fixes for Agentic table extractor * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixing syntax issues * Fix agentic_table executor reading INFILE after JSON overwrite Read from SOURCE instead of INFILE when dispatching to the agentic_table executor. INFILE gets overwritten with JSON output by the regular pipeline, causing PDFium parse errors when the agentic_table executor tries to process it as a PDF. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com> * UN-3358 [FIX] Drop cross-region S3 buckets from connector listing (#1931) * list bucket * greptile review * payload metadata in api deployment * slack webhook payload * Uns 611 clubbed notification dispatch (#1951) * UN-3439 [FIX] Accept wildcard subdomain origins in SocketIO and Django CORS (#1938) * UN-3439 [FIX] Accept wildcard subdomain origins in SocketIO and Django CORS Production socket connections were failing for `*.env.us-central.unstract.com` because python-socketio does exact-string comparison on `cors_allowed_origins`, so a literal `*` pattern silently rejected every real subdomain. - Add `CORS_ALLOWED_ORIGIN_REGEXES` derived from `WEB_APP_ORIGIN_URL_WITH_WILD_CARD`. - Wire SocketIO via `_RegexOrigin` whose `__eq__` does the regex match — single list entry covers all wildcard subdomains, no library subclass needed. - Normalize `WEB_APP_ORIGIN_URL` through `urlparse` so trailing slashes / paths in env are stripped (also fixes the `…com//oauth-status/` double-slash). - Add startup guard for malformed env values. Resolves item #1 of UN-3439. Items #2/#3 (decoupling indexing from Socket.io, fallback) are owned separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3439 [FIX] Address PR review: canonical origin, fullmatch, unhashable RegexOrigin, tests Addresses five review comments on #1938: 1. coderabbitai (Major) — RFC 6454 canonicalization. Browsers serialize `Origin` headers with a lowercase host and no explicit default ports; `parsed_url.netloc` preserved both, so `https://APP.EXAMPLE.COM:443` would silently fail to match the browser's `https://app.example.com`. Switch to `parsed_url.hostname` + drop default ports, and reject non-http(s) schemes at startup. 2. greptile (P2) — `re.fullmatch` instead of `re.match`. With `re.match` plus `$`, a candidate ending in `\n` matches because `$` is allowed before an optional trailing newline. `fullmatch` removes the ambiguity. 3. self — `_RegexOrigin.__hash__` violated `a == b ⇒ hash(a) == hash(b)` (one fixed pattern hash vs. many matching strings). Today this is masked because python-socketio uses linear `__eq__` on a list, but if the allow-list is ever wrapped in a set, every legitimate subdomain would silently be rejected — exactly the failure mode UN-3439 closes. Make instances unhashable so the contract can't be broken. 4. self — No regression tests. Add `backend/utils/tests/test_cors_origin.py` (33 cases) covering: regex match/no-match, lookalike spoofing, scheme mismatch, trailing-newline rejection, non-string equality protocol, unhashability, ReDoS bounds, URL normalization (case, default ports, trailing slash, paths, queries), startup-guard rejections (empty, no-scheme, non-browser-scheme, no-host), and end-to-end via the same `RegexOrigin` path SocketIO uses. 5. self — Over-clever wildcard-to-regex builder. The `split('*').join(re.escape, ...)` construction generalised to N wildcards but the input has exactly one; replace with a direct rf-string that's self-evident on review. Refactor for testability: extract `RegexOrigin` and `normalize_web_app_origin` into `backend/utils/cors_origin.py` (Django-free, importable from settings and tests). Settings now delegates to one helper call; `log_events.py` imports `RegexOrigin`. No behavioural change beyond what each comment fixes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3439 [FIX] Address SonarCloud quality gate The Sonar quality gate failed with C reliability + 5 security hotspots, all on the new test file: - S905 (Bug, Major) — `{ro}` flagged as no-side-effect statement (Sonar doesn't see the implicit `__hash__` call). Drove the C reliability rating. Fix: use `len({ro})` so the side effect is via an explicit function call; test still asserts the same `TypeError`. - S5727 (Code Smell, Critical) — `assert ro != None` is tautological and doesn't exercise `__eq__`. Switch to `(ro == None) is False` which directly tests that `NotImplemented` falls back to identity-equality. - S5332 × 5 (Hotspots) — `http://` and `ftp://` literals in test data. These are intentional inputs proving the rejection logic. Annotate with `# NOSONAR` and an explanatory comment so the hotspots can be marked reviewed. No production code changed; tests still 33/33 passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3439 [FIX] Remove last S5727 code smell — test __eq__ via dunder Sonar S5727 correctly inferred that ``ro == None`` is statically always False (NotImplemented falls back to identity), making the assertion look tautological. The intent is to lock the protocol contract: ``__eq__`` must return the ``NotImplemented`` sentinel for non-strings. Test that directly via ``ro.__eq__(None) is NotImplemented`` instead of going through ``==``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3439 [FIX] Address remaining CodeRabbit nits — port validation, ReDoS bound Two minor follow-ups from the second CodeRabbit pass: - `parsed.port` is a property that raises ValueError on malformed/out-of-range inputs (e.g. `:abc`, `:99999`). That bypassed our normalized config-error message and surfaced as a stack trace. Wrap the access and re-raise with the same actionable text. Adds two test cases (`https://example.com:abc`, `https://example.com:99999`) to lock the new behaviour. - The 50ms ReDoS timing bound is too tight for noisy CI runners. Loosen to 500ms — still orders of magnitude below what catastrophic backtracking would produce. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ReverseMerge: V0.161.4 hotfix (#1943) * Change csp to report only * [HOTFIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var (v0.161.4) (#1939) [HOTFIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var (#1937) [FIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var os.environ.get returns the raw string when the variable is set, so ENABLE_HIGHLIGHT_API_DEPLOYMENT="False" was truthy in Python (any non-empty string is truthy). Wrap in CommonUtils.str_to_bool so "False" / "false" / "0" actually evaluate to False. The setting is consumed by the cloud configuration plugin's spec default (ConfigSpec.default in plugins/configuration/cloud_config.py) on cloud and on-prem builds. With this fix, an admin who explicitly sets the env var to a falsy string sees highlight data stripped as expected. Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3448 [FIX] Remove vestigial `uv pip install` line in uv-lock-automation workflow (#1941) * UN-3448 [FIX] Add --system flag to uv pip install in uv-lock-automation workflow Modern uv requires uv pip install to run inside a virtual environment OR with the explicit --system flag. The workflow currently has neither, so it errors out: error: No virtual environment found for Python 3.12.9; run `uv venv` to create an environment, or pass `--system` to install into a non-virtual environment This breaks every PR that touches a pyproject.toml (the workflow's paths filter triggers on those). Last successful run was 2026-04-01, before a behaviour change in uv or astral-sh/setup-uv@v7. The --system flag is exactly what the error message suggests and is correct here — we install pip into the runner's system Python; the downstream uv-lock.sh script creates its own venvs as needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3448 [FIX] Remove vestigial `uv pip install` line per review Per @jaseemjaskp's review: the pre-step `uv pip install ... pip` does nothing useful for this workflow. The downstream uv-lock.sh script uses uv sync at line 74, which manages its own venvs internally and never invokes pip directly: $ grep -rn 'pip' docker/scripts/uv-lock-gen/ docker/scripts/uv-lock-gen/uv-lock.sh:2:set -o pipefail Only match is pipefail (shell option), no real pip references. Removing the line entirely is cleaner than papering over with --system. The line was likely copy-pasted from a sibling workflow that legitimately needed pip in the system Python. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ReverseMerge: V0.163.2 hotfix (#1946) * [HOTFIX] Use importlib.util.find_spec for pluggable worker discovery (#1918) * [FIX] Use importlib.util.find_spec for pluggable worker discovery _verify_pluggable_worker_exists() previously checked for the literal file `pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin has been compiled to a .so (Nuitka, Cython, or any C extension) — the module is perfectly importable but the pre-check rejects it because only the .py extension is considered. Replace the filesystem check with importlib.util.find_spec(), which is Python's standard way to ask "is this module resolvable by the import system?". It honors every registered finder — source .py, compiled .so, bytecode .pyc, namespace packages, zipimports — so the function now matches what its docstring claims: verifying the module can be loaded, not that a specific file extension is present. Behavior is preserved for existing deployments: - Images with no `pluggable_worker/<name>/` subpackage → find_spec raises ModuleNotFoundError (ImportError subclass) → returns False. - Images with source .py → find_spec resolves the .py → returns True. - Images with compiled .so → find_spec resolves the .so → returns True. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Handle ValueError from find_spec in pluggable worker verification Greptile-flagged edge case: importlib.util.find_spec() can raise ValueError (not just ImportError) when sys.modules has a partially initialised module entry with __spec__ = None from a prior failed import. Broaden the except to catch both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Resolve api-deployment worker directory from enum import path worker.py:452 did worker_type.value.replace("-", "_") to derive the on-disk dir name. All WorkerType enum values already use underscores, so the replace was a no-op; for API_DEPLOYMENT whose dir is "api-deployment" (hyphen), it resolved to "api_deployment" and the os.path.exists() check failed. Boot then logged a spurious "❌ Worker directory not found: /app/api_deployment" at ERROR level. The task registration path (builder + celery autodiscover via to_import_path) is unaffected, so this was purely log noise — but noise at ERROR level that masks real failures in log scans. Fix: derive the directory from the authoritative to_import_path() which already handles the hyphen case (api_deployment -> api-deployment). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter (#1944) * [FEAT] Allow Bedrock to fall through to boto3's default credential chain Match the S3/MinIO connector pattern: when AWS access keys are left blank on the Bedrock LLM and embedding adapter forms, drop them from the kwargs dict so boto3's default credential chain handles authentication. This unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on hosts that already have ambient AWS credentials (e.g. EKS workers with IRSA, EC2 with an instance profile). - llm1/static/bedrock.json: clarify access-key descriptions to mention IRSA and instance profile (already non-required at v0.163.2 base). - embedding1/static/bedrock.json: drop aws_access_key_id and aws_secret_access_key from top-level required; same description fix; expose aws_profile_name for parity with the LLM form. - base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters now strip empty access-key values from the validated kwargs before returning, so empty strings don't override boto3's default chain. AWSBedrockEmbeddingParameters fields gain explicit None defaults and an aws_profile_name field. Backward-compatible: existing adapters with access keys filled in continue to work unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FEAT] Add Authentication Type selector to Bedrock adapter form Add an explicit `auth_type` selector with two options, making the auth choice clear to users: - "Access Keys" (default): existing flow, keys required - "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on boto3's default credential chain (IRSA on EKS, task role on ECS, instance profile on EC2). Description on the selector explicitly notes this option is only for AWS-hosted Unstract deployments. The form-only auth_type field is stripped before LiteLLM validation in both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters. validate(). Empty access keys continue to be stripped so boto3 falls through to the default chain even when the access_keys arm is selected without values (matches the S3/MinIO connector pattern). Backward-compatible: legacy adapters without auth_type behave as "Access Keys" mode (the default), and existing keys are forwarded unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REVIEW] Address Bedrock auth_type review feedback Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on PR #1944. Behaviour fixes: - Stale-key leak in IAM Role mode: switching an existing adapter from Access Keys to IAM Role would carry truthy stored access keys through the strip-empty-only loop, so boto3 silently authenticated with the old long-lived credentials instead of falling through to the host's IRSA / instance-profile identity. Both LLM and embedding paths were affected. - Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or a malformed payload from a non-UI client passed through the dict comprehension untouched, with no enum guard. - Cross-field validation gap: explicit Access Keys mode with blank or whitespace-only values silently fell through to the default credential chain instead of surfacing the misconfiguration. Implementation: - Add a module-level _resolve_bedrock_aws_credentials helper used by both AWSBedrockLLMParameters.validate() and AWSBedrock EmbeddingParameters.validate(), so the auth-type contract is expressed once. - Validates auth_type against an allowlist (None | "access_keys" | "iam_role"); raises ValueError on anything else. - iam_role: unconditionally drops aws_access_key_id and aws_secret_access_key. - access_keys (explicit): requires non-blank values; raises ValueError if either is empty or whitespace-only. - Legacy (auth_type absent): retains the lenient strip behaviour so pre-PR adapter configurations continue to deserialise unchanged. - Restore aws_region_name as required (no `= None` default) on AWSBedrockEmbeddingParameters; only credentials may legitimately be absent. - Drop the orphan aws_profile_name field from embedding1/static/bedrock.json: it was added for parity with the LLM form but lives outside the auth_type oneOf and contradicts the selector's "no further input" semantics. The LLM form already had aws_profile_name pre-PR and is left alone for backwards compatibility. Tests: - New tests/test_bedrock_adapter.py covers 15 cases across LLM and embedding adapters: legacy-no-auth-type, explicit access_keys with valid/blank/whitespace keys, iam_role with stale/no keys, unknown auth_type rejection, cross-field validation, and preservation of unrelated params (model_id, aws_profile_name, region, thinking). Skipped (P2 nice-to-have): - Comment-scope clarification, MinIO reference rewording, validate-mutates-caller'\''s-dict, and the LLM form description nit about aws_profile_name visibility. These don'\''t change behaviour and can be addressed in a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> --------- Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com> * batch notification --------- Co-authored-by: ali <117142933+muhammad-ali-e@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com> Co-authored-by: Praveen Kumar <praveen@zipstack.com> Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com> * Uns 611 clubbed notification dispatch (#1959) * batch notification * notification slack * notification API * delivery mode batch by default * UI change * PR reviews * sonar issues * sonar issues * code rabbit refactor * greptile comments resolve * UN-3056 Scope enqueue execution_id exemption to INPROGRESS Keep execution_id in _ENQUEUE_REQUIRED_FIELDS as the canonical required set; carve out the INPROGRESS exemption at the validator instead of dropping it broadly. Non-INPROGRESS callers (COMPLETED / ERROR / STOPPED / PARTIAL_SUCCESS) once again get a loud 400 if they omit execution_id, addressing Greptile's silent-failure concern on e6534949d. Extends the comment above the tuple to also flag the consumer-side gap: INPROGRESS buffer rows ship with execution_id=null, so API receivers cannot correlate them with execution logs until the producer-reorder follow-up (UN-3056) lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * greptile comments resolve * UN-3056 Skip deactivated notifications in BATCHED flush _dispatch_group's lock query did not check notification.is_active, so PENDING NotificationBuffer rows tied to a deactivated source notification still dispatched on the next flush tick (up to one NOTIFICATION_CLUB_INTERVAL of stale traffic). IMMEDIATE deactivation is instant because the GET notifications endpoint filters by is_active=True; this restores the same expectation for BATCHED. Also adds select_related("notification") so the later rows[0].notification read is part of the same query rather than a per-group round-trip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * greptile comments resolve * remove immediate mode * add legacy code * add legacy code * greptile review * greptile review * UI as per new designs * UN-3056 [FIX] Make Inactive platform-key activation keyboard-accessible Adds role/tabIndex/aria-label/onKeyDown to the Inactive Tag so keyboard users can activate platform keys. Disabled state (no key id) is non-focusable via tabIndex=-1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3056 [FEAT] Harden batched notification dispatch + review cleanups Address human review feedback on the batched-notification PR: - Fix dead-letter path (Johny): send_webhook_notification now re-raises on retry exhaustion via raise_on_final_failure so the Celery link_error fires and buffer rows reach DEAD_LETTER instead of being silently lost. - Add crash-window reaper: new BufferStatus.SENDING claim state with success (mark_buffer_dispatched) / failure callbacks; _reclaim_stale_sending returns rows stuck past NOTIFICATION_DISPATCH_LEASE_SECONDS to PENDING. - Move FAILURE_STATUSES onto ExecutionStatus (failure_statuses/is_failure); drop the duplicated frozenset and update call sites (Chandru). - Remove dead delivery_mode column + DeliveryMode enum (product is batched-only); rename dispatch_with_delivery_mode -> dispatch_notifications. - Squash notification_v2 migrations 0002+0003 into 0002_notification_batching. - Scrub JIRA/mfbt references from docstrings; clarify NOTIFICATION_CLUB_INTERVAL is a per-org-overridable default. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FEAT] Self-review fixes: shared failure rule, observability, hardening Addresses the in-scope items from the self-review on PR #1936: - Single-source the failure-only rule via notification_v2.helper.is_failure_run, used by api_v2 / pipeline_v2 / internal_api_views (_apply_failure_filter); the pipeline path keeps a documented last_run_status backstop. Fixes the false "parity" docstring (#1). - Emit metric= counters at the notification drop sites (backend dispatch_notifications, worker _route_notification) and a row-id sample on the dead-letter log so a delivered-never event is observable (#4). - process_notification_buffer.py honors its "never raises" contract: wrap response.json() so a non-JSON 200 returns False instead of raising (#5). - Bind the flush cap to the renderer's MAX_BATCH_SIZE so rows and rendered events stay in lock-step by construction (#7). - status db_comment now documents the PENDING -> SENDING -> DISPATCHED/DEAD_LETTER lifecycle in both the model and migration 0002 (#8). - Scrub stale IMMEDIATE / worker-callback comments from the provider docstrings (#2, #10). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FEAT] Bound buffer redelivery + drop dead provider cluster - NotificationBuffer.dispatch_attempts + NOTIFICATION_MAX_DISPATCH_ATTEMPTS: _dispatch_group dead-letters rows past the cap and increments on each SENDING claim, bounding the reaper reclaim loop so a lost terminal callback can't redeliver forever (self-review #3). - Delete the orphaned synchronous notification_v2/provider/ cluster — zero callers after the batched dispatch_notifications path replaced it (#2). - Fold dispatch_attempts into 0002_notification_batching; refresh lifecycle db_comments + BufferStatus docstring. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FEAT] Portable timestamp render in clubbed notification _humanize_timestamp used the `%-d` strftime directive, a glibc/Linux extension that raises ValueError on macOS/Windows. The call sat outside the fromisoformat try/except, so the raise propagated through build_envelope -> render_clubbed_message and was swallowed by process_notification_buffer's outer except, silently skipping every due group on non-Linux dev/CI machines. Interpolate the day from dt.day (plain int, no leading zero) instead so the render is platform-portable; output is byte-identical to the old format. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FIX] Clear SonarCloud issues on PR #1936 - internal_serializers.py: validate() now has a single terminal return (S3516); the total_files branch is if/else instead of an early return. - internal_views.py update_status: guard-clause on invalid serializer + extract _truncate_error_message / _update_file_aggregates helpers to drop cognitive complexity below 15 (S3776). Behavior unchanged. - PlatformSettings.jsx: extract InactivePlatformKeyTag sibling component so the key-row map callback drops below cognitive complexity 15 (S3776); keyboard activation (Enter/Space) behavior preserved. - process_notification_buffer.py: logger.exception() in the HTTPError branch to capture the traceback (S8572). - scheduler.sh: explicit return statements (S7682) — run_task returns the task exit code; cleanup returns 0 with the exit moved into the trap. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FIX] Refund dispatch_attempts on broker-publish failure _dispatch_group increments dispatch_attempts atomically with the PENDING -> SENDING claim. When _send_clubbed fails to publish to the broker, the revert reset status/dispatched_at but left the increment in place, so a clean broker outage (no task queued, no webhook sent) still burned redelivery budget — N consecutive outages would dead-letter a never-delivered row. Decrement dispatch_attempts in the broker-failure revert so a publish that never reached the broker doesn't consume the cap. Crash / lost- callback paths never hit this except block, so they keep the increment and remain bounded by the reaper, which is the redelivery risk the cap exists for. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FIX] Single-source failure rule, webhook compat shape, decouple flush cadence Addresses review feedback on PR #1936: - Single-source the failure rule: add canonical is_failure_run to unstract.core (beside ExecutionStatus). The clubbed renderer derives its summary counts / emoji from it (drops the duplicate _SUCCESS_STATUSES / _is_success / _is_effective_success), and notification_v2.helper.is_failure_run delegates to it. Routing filter and rendered outcome can no longer drift. - Webhook backward compat: build_envelope spreads the legacy flat fields (type, pipeline_id, pipeline_name, status, execution_id?, error_message?) onto a single-event envelope alongside summary/events, so existing API webhook receivers parsing the pre-clubbing flat body keep working. Multi-event stays envelope-only; Slack path untouched. - Decouple the buffer-flush poll cadence from the log-history consumer: dedicated NOTIFICATION_BUFFER_POLL_INTERVAL (default 10s); scheduler.sh wakes at the min of the two intervals and fires each task on its own elapsed interval. Removes the misleading "5s" comment and the shared-knob coupling. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FIX] Single-event webhook compat covers worker legacy shape The single-event legacy superset in build_envelope only reproduced the backend dispatch DTO (PipelineStatusPayload.to_dict). The worker callback path's pre-clubbing body (NotificationPayload.to_webhook_payload) also emitted top-level `timestamp` and `additional_data`, so receivers reading those against the old flat shape broke even on single-event sends. Add both keys to _LEGACY_FLAT_KEYS; purely additive (the existing not-None guard keeps backend-origin events from gaining an empty additional_data). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FIX] Address review: fail-closed failure filter, doc accuracy, tests Addresses @jaseemjaskp's review on PR #1936. Correctness: - _apply_failure_filter now fails CLOSED when an execution_id was requested but the row is missing (replication lag / race): drop notify_on_failures rows + emit a metric, so a run we can't confirm as a failure never sends a success alert to a failure-only subscriber. Doc/comment accuracy: - tasks.py: rows transition from SENDING (not PENDING/DISPATCHED) to DEAD_LETTER. - NotificationBuffer.flush_after db_comment (model + migration, byte-identical): now() + org's effective club interval (per-org override, else default). - enqueue_notification_buffer: drop the non-existent "not BATCHED" gate claim. - api/slack webhook provider docstrings: frame pass-through by payload SHAPE. - Scrub leftover UNS-611 / UNS-611 v2 JIRA refs (scheduler.sh, PlatformSettings). Cleanup / resilience: - PlatformSettings: log the interval-load failure instead of swallowing it. - _dispatch_group returns a single int (was an always-identical (rows, rows)). - clubbed_renderer: drop build_envelope from __all__ (kept the internal import). Tests: - Rewrite the broken TestNotificationDispatchSite to characterise the new HTTP buffer-enqueue contract (_route_notification / _enqueue_to_buffer); the old suite imported the deleted send_notification_to_worker. - Add pure-function tests for is_failure_run and the clubbed renderer (envelope shape, single-event legacy keys incl. timestamp + additional_data, MAX_BATCH_SIZE cap, Slack overflow footer). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * UN-3056 [FIX] Self-review polish: docstring accuracy + test coverage Follow-up to f70a2a1b5 from a multi-agent self-review pass. - slack_webhook.py: "single-line" → "Slack mrkdwn body" (render_slack_text emits a multi-line body: header + divider + per-event lines). - internal_api_views._load_execution: comment that only DoesNotExist is caught (missing row → fail-closed; malformed id → 500) so the two paths aren't collapsed by a future widened except. - PlatformSettings.jsx: comment no longer says "silently" (the load failure is now logged in the catch). Test coverage added: - clubbed renderer: humanized timestamp in events[] (the dt.day dodge for the %-d glibc bug), unparseable timestamp → placeholder, error_message absent on success events, empty batch, file-count column collapse when additional_data has no totals. - is_failure_run: STOPPED + failed_files>0 (both predicates true). - worker dispatch site: notification with no notification_type key is skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * UN-3056 [FIX] Resolve SonarCloud findings (log-injection guard + bash [[ ) - internal_api_views: strip CR/LF from the request-supplied execution_id at both log sites before logging (SonarCloud pythonsecurity:S5145 log-forging). The id is UUID-validated upstream so this is defense-in-depth, but it also clears the New-Code Security Rating that was failing the quality gate. - scheduler.sh: use [[ ]] instead of [ ] for the four conditional tests (SonarCloud shelldre:S7688); the script is bash (#!/usr/bin/env bash). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * UN-3056 [FIX] Early-failure dispatch, robust buffer mark, authoritative is_failure Field fixes from local end-to-end testing plus a renderer-correctness follow-up: - Early-failure dispatch: the general worker now notifies failure subscribers on its terminal-error branch (notify_execution_failure) for runs that ERROR before the file-processing callback (build / tool-registry / source errors, total_files=0) — previously silent since that callback is the only ETL/Task dispatch site. Fired only after autoretry is exhausted; mutually exclusive with the callback. API deployments already covered via update_pipeline_status. - Robust buffer mark: the notification worker reports SENDING -> DISPATCHED / DEAD_LETTER over the backend internal API (buffer/mark/{dispatched,dead-letter}) instead of Celery link/link_error. Those callbacks routed to the `celery` queue, which the unified -A worker also drains without the backend tasks registered, so ~half were dropped as "unregistered task" and rows stuck in SENDING. Legacy mark tasks kept but deprecated to drain in-flight messages. - Authoritative is_failure verdict: dispatch sites carry the verdict the routing filter used on the payload; the clubbed renderer prefers it over re-deriving from `status` (vocabulary differs: PipelineStatus SUCCESS/FAILURE vs ExecutionStatus), falling back to is_failure_run for worker/legacy payloads so the rendered outcome can't disagree with why the alert fired. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FIX] Atomic status + file-count write in update_status Wrap update_execution() and _update_file_aggregates() in a single transaction.atomic() block. Previously they committed independently, so a failure between the two writes could leave WorkflowExecution at a terminal status (COMPLETED) with failed_files=None. The notify_on_failures filter reads failed_files straight from the DB via is_failure_run(), which scores (COMPLETED, None) as a success, silently dropping the failure alert for a partial-failure run. Committing both writes together closes that window. Addresses Greptile P1 on internal_views.py. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * Comments to self review * UN-3056 [FIX] Guard broker-failure revert on SENDING state _send_clubbed reverted rows by id only. If send_task raised after the message reached the broker, the worker still runs, delivers, and marks the rows DISPATCHED/DEAD_LETTER over the internal API; the unguarded revert then flips those terminal rows back to PENDING and refunds dispatch_attempts, resurrecting them so the next flush re-dispatches = duplicate delivery. Add a status=SENDING guard so only rows still in the claimed state are reverted, mirroring the source-state guards every other transition in this file already uses (_dispatch_group → PENDING, _mark_buffer_rows → SENDING). Addresses Jaseem's concurrency finding on internal_api_views.py. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * UN-3056 [FIX] Scope flush lock to buffer rows; bound un-renderable group Address review comments on PR #1936: - Add of=("self",) to the flush SELECT FOR UPDATE so SKIP LOCKED takes the row lock on the buffer rows only, not the join-related Notification row. An unrelated lock on that Notification (e.g. an admin edit) no longer silently skips the whole buffer group's dispatch. - Charge a dispatch attempt when render/prepare raises (poison payload) so the dispatch-attempt cap dead-letters an un-renderable group instead of re-rendering the same payload every flush tick forever. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply…
What
Introduces a pluggable executor system that replaces Docker-container-based tool execution with Celery worker tasks, and migrates the Prompt Studio IDE to an async execution model using Socket.IO for result delivery. Gated behind the
async_prompt_executionfeature flag for safe rollout.Why
The existing architecture has several limitations:
How
Backend (65 files)
index_document,fetch_response,single_pass_extractionnow return HTTP 202 (accepted) with atask_idinstead of blocking. Gated byasync_prompt_executionfeature flag — old sync path preserved as fallbackbackend/prompt_studio/prompt_studio_core_v2/tasks.py):ide_index_complete,ide_prompt_complete,ide_prompt_erroretc. run onprompt_studio_callbackqueue, perform ORM writes viaOutputManagerHelper, and emitprompt_studio_resultSocket.IO eventsbackend/backend/worker_celery.py): A second Celery app instance that coexists with Django's Celery app, configured to route tasks to executor workersprompt_studio_helper.pyrewrite: RemovedPromptToolHTTP calls entirely. Newbuild_index_payload(),build_fetch_response_payload(),build_single_pass_payload()methods constructExecutionContextobjects with all ORM data pre-loadedbackend/backend/workers/,file_execution_tasks.py,celery_task.py(old in-process workers)Workers (70 files, ~19,500 new lines)
workers/executor/): NewWorkerType.EXECUTORCelery worker withLegacyExecutorhandling all operations:extract,index,answer_prompt,single_pass_extraction,summarize,agentic_extraction,structure_pipelineBaseExecutor→ExecutorRegistry(class-decorator self-registration) →ExecutionOrchestrator→ExecutionDispatcher(Celerysend_task)ExecutorToolShim: Lightweight stand-in forBaseToolthat satisfies SDK1 adapter interfaces without Docker contextworkers/file_processing/structure_tool_task.py): Celery-native replacement for Docker-basedStructureTool.run()with profile overrides, smart table detection, and output file managementSDK1 (22 files)
unstract/sdk1/src/unstract/sdk1/execution/):ExecutionContext,ExecutionResult(serializable DTOs for Celery JSON transport),ExecutionDispatcher(dispatch()+dispatch_with_callback()),BaseExecutor,ExecutorRegistryFrontend (275 files)
usePromptStudioSockethook listens forprompt_studio_resultSocket.IO events.usePromptRunrewritten from polling to fire-and-forget.PromptRun.jsxconditionally renders async or sync path based on feature flagDocker / Infrastructure
worker-executor-v2,worker-prompt-studio-callback,worker-metricsworkers-v2services from opt-in (profiles: [workers-v2]) to defaultArchitecture Change
Can this PR break any existing features? If yes, please list possible items. If no, please explain why.
Yes, potential breaking changes — mitigated by feature flag:
Prompt Studio IDE async path — gated by
async_prompt_executionfeature flag. When flag is OFF (default), all 3 endpoints (index_document,fetch_response,single_pass_extraction) use the old sync path returning HTTP 200. No behavior change for existing users.Review Guidelines
This PR touches 441 files across backend, frontend, workers, and SDK1. Below is a structured review path to navigate it efficiently.
Code Structure Overview
Recommended Review Order
Review in dependency order — each layer builds on the previous:
execution/context.py,result.py,dispatcher.py,registry.pyto_dict()/from_dict()round-trips correct? Is theOperationenum complete? Queue naming (celery_executor_{name}).executor/tasks.py,executor/worker.pyexecute_extraction: retry policy, error handling, log correlation.executors/legacy_executor.py(focus on_OPERATION_MAP+execute())answer_prompt.py,index.py,retrieval.pyexecutor_paramsmatch whatbuild_*_payload()sends? Lazy import pattern (_get_prompt_deps(),_get_indexing_deps()).views.pylines 351–583dispatch_with_callbackusage with correct callback task names and queue.prompt_studio_helper.py(build_index_payload,build_fetch_response_payload,build_single_pass_payload)executor_params? Key compatibility with executor handlers.tasks.py(callback tasks)ide_prompt_complete: ORM writes viaOutputManagerHelper. Socket.IO emission shape. Error callback cleanup. State store setup/teardown.usePromptRun.js,usePromptStudioSocket.js,PromptRun.jsx_emit_result(). Timeout handling. Status cleanup on failure.docker/docker-compose.yamlworker-executor-v2,worker-prompt-studio-callback. Removed old workers. Queue bindings.workers/tests/test_sanity_phase*.pyData Flow (End-to-End)
Known Code Duplication
views.py— 3 view actionsbuild_payload → get_dispatcher → dispatch_with_callback → return 202tasks.py— callback taskside_index_completeandide_prompt_completefollow same structure: extract kwargs → setup state → check result → ORM work → emit → cleanuptasks.py— legacy tasksrun_index_document,run_fetch_response,run_single_pass_extractionkept alongside new callback tasksFiles Safe to Skim
workers/tests/— 24 test files, ~10,000 lines. Well-structured but high volume. Focus ontest_sanity_phase2.py(full Celery chain) andtest_sanity_phase4.py(IDE payload compatibility) as representative examples.workers/executor/executors/retrievers/— 7 retriever implementations. All follow the same pattern. Reviewing one (simple.py) covers the pattern.architecture-*.md,phase*.md) — Reference material, not code.Relevant Docs
architecture-executor-system.md,architecture-flow-diagram.md,architecture-sequence-diagrams.mdin repo rootarchitecture-migration-phases.mdrollout-plan.mdRelated Issues or PRs
Dependencies Versions / Env Variables
New env variables:
FLIPT_SERVICE_AVAILABLEfalseNotes on Testing
cd workers && uv run pytest -v— 490+ tests (444 inworkers/tests/+ extras)cd unstract/sdk1 && uv run pytest -v— 146+ testscd backend && python -m pytest prompt_studio/prompt_studio_core_v2/test_tasks.py -vasync_prompt_execution=true), trigger prompt runs in IDE, verify Socket.IO events deliver results via Network → WS → Messages tabScreenshots
N/A (primarily backend/worker architecture change; frontend UX unchanged when feature flag is off)
Checklist
I have read and understood the Contribution Guidelines.