taskprocessing:worker does not atomically claim tasks → duplicate processing with multiple workers #61052

@bygadd

Bug description

Running multiple occ taskprocessing:worker processes in parallel (as the AI admin docs recommend: "run the command 4 or more times") causes the same scheduled task to be processed by more than one worker. This happens in two ways:

  • concurrently — several workers pick up the same task in the same instant, and
  • sequentially — a second worker re-claims a task the first has already finished.

Each duplicate is a full extra provider/LLM invocation → wasted worker capacity and multiplied external API cost. On a busy instance the effective throughput collapses toward a single worker even when many are running.

Root cause (two parts)

1. OC\Core\Command\TaskProcessing\WorkerCommand::processNextTask() never calls lockTask().
It calls IManager::getNextScheduledTask() (a plain SELECT, no lock) and then processTask() directly. By contrast the OCS endpoint TaskProcessingApiController::getNextScheduledTask() does it correctly — it loops, calls lockTask(), and skips tasks it failed to claim. So concurrent CLI workers all SELECT the same scheduled row and all proceed to process it.
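The race in part 1 can be illustrated with a minimal simulation. This is a sketch in Python with an in-memory SQLite table, not Nextcloud code: the `tasks` table, the string status values, and the function name `get_next_scheduled_task` are assumptions chosen only to mirror the behavior described above (a plain SELECT with no claim step).

```python
import sqlite3

# Hypothetical stand-in for the task table; real schema and status values differ.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT)")
db.execute("INSERT INTO tasks (id, status) VALUES (1, 'scheduled')")
db.commit()


def get_next_scheduled_task(conn):
    """Plain SELECT with no lock: it only reads, it never claims the row."""
    row = conn.execute(
        "SELECT id FROM tasks WHERE status = 'scheduled' ORDER BY id LIMIT 1"
    ).fetchone()
    return row[0] if row else None


# Two workers poll before either one has changed the row's status.
seen_by_worker_a = get_next_scheduled_task(db)
seen_by_worker_b = get_next_scheduled_task(db)

# Both hold the same task id and, without a claim step, both would process it.
both_saw_same_task = seen_by_worker_a == seen_by_worker_b
```

Because nothing between the SELECT and the processing step changes the row, every worker that polls in that window ends up with the same task id.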

2. TaskMapper::lockTask() guards with status != STATUS_RUNNING instead of status = STATUS_SCHEDULED.
This lets a second worker re-claim a task that is in any non-running state (including STATUS_SUCCESSFUL) which it had SELECTed before the first worker finished → sequential re-processing of an already completed task.
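The difference between the two guards can be shown with a small sketch. Again this is Python with SQLite, not the actual TaskMapper code: the table, the string statuses, and the `lock_task` helper are hypothetical. The only point illustrated is that a conditional UPDATE guarded by "not running" succeeds on a finished task, while one guarded by "scheduled" does not.

```python
import sqlite3

# Guards are fixed constants, so interpolating them into the SQL is safe here.
LOOSE_GUARD = "status != 'running'"    # mirrors the current check
STRICT_GUARD = "status = 'scheduled'"  # mirrors the proposed check


def make_db(status):
    """Create a one-row task table (hypothetical schema) with the given status."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO tasks (id, status) VALUES (1, ?)", (status,))
    conn.commit()
    return conn


def lock_task(conn, task_id, guard):
    """Conditional UPDATE; the claim succeeds only if exactly one row changed."""
    cur = conn.execute(
        f"UPDATE tasks SET status = 'running' WHERE id = ? AND {guard}",
        (task_id,),
    )
    conn.commit()
    return cur.rowcount == 1


# A second worker holds a stale id for a task the first worker already finished.
loose_reclaims_finished = lock_task(make_db("successful"), 1, LOOSE_GUARD)
strict_reclaims_finished = lock_task(make_db("successful"), 1, STRICT_GUARD)

# Both guards still allow the legitimate first claim of a scheduled task.
loose_claims_scheduled = lock_task(make_db("scheduled"), 1, LOOSE_GUARD)
strict_claims_scheduled = lock_task(make_db("scheduled"), 1, STRICT_GUARD)
```

In this sketch the loose guard flips a successful task back to running, which corresponds to the sequential re-processing described above; the strict guard rejects the stale claim.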

Steps to reproduce

  1. Configure a synchronous TaskProcessing provider (e.g. integration_openai).
  2. Run 4 workers in parallel: occ taskprocessing:worker -v ×4.
  3. Schedule several core:text2text tasks.
  4. Watch the worker logs: the same task id appears in "Processing task N" / "Finished processing task N" lines from multiple PIDs.

Verified on 33.0.5 (MySQL): with 4 workers the same task id appears 4× within the same second. After applying only fix #1, a residual ~1/15 sequential re-processing of already-STATUS_SUCCESSFUL tasks remained — traced to root cause #2.

Expected behavior

Each scheduled task is processed exactly once, regardless of how many parallel workers run.

Proposed fix (verified — 8 workers, 0 duplication)

  1. In WorkerCommand::processNextTask(), after getNextScheduledTask() call lockTask() and, when it returns false, add the task id to an ignore list and re-fetch — mirroring TaskProcessingApiController.
  2. In TaskMapper::lockTask(), change the guard from ->neq('status', … STATUS_RUNNING) to ->eq('status', … STATUS_SCHEDULED).

With both changes, 8 concurrent workers processed every task exactly once (0 duplicates over repeated measurement windows). A PR follows.
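The claim loop from step 1, combined with the stricter guard from step 2, might look roughly like the following. It is a Python/SQLite sketch under stated assumptions, not the forthcoming PR: the schema, status strings, and function names are hypothetical, and the `on_fetched` parameter exists only so the example can simulate a rival worker claiming the task between the SELECT and the lock attempt.

```python
import sqlite3


def get_next_scheduled_task(conn, ignore_ids):
    """Return the oldest scheduled task id that is not on the ignore list."""
    sql = "SELECT id FROM tasks WHERE status = 'scheduled'"
    if ignore_ids:
        placeholders = ",".join("?" for _ in ignore_ids)
        sql += f" AND id NOT IN ({placeholders})"
    sql += " ORDER BY id LIMIT 1"
    row = conn.execute(sql, tuple(ignore_ids)).fetchone()
    return row[0] if row else None


def lock_task(conn, task_id):
    """Atomic claim: only a task that is still scheduled can be taken."""
    cur = conn.execute(
        "UPDATE tasks SET status = 'running' WHERE id = ? AND status = 'scheduled'",
        (task_id,),
    )
    conn.commit()
    return cur.rowcount == 1


def claim_next_task(conn, on_fetched=None):
    """Fetch, try to lock, and on failure ignore that id and re-fetch."""
    ignore_ids = []
    while True:
        task_id = get_next_scheduled_task(conn, ignore_ids)
        if task_id is None:
            return None  # nothing left to claim
        if on_fetched is not None:
            on_fetched(task_id)  # simulation hook only, not part of the fix
        if lock_task(conn, task_id):
            return task_id
        ignore_ids.append(task_id)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status TEXT)")
db.executemany("INSERT INTO tasks (id, status) VALUES (?, 'scheduled')", [(1,), (2,)])
db.commit()


def rival_claims_task_1(task_id):
    """Simulate another worker winning the race for task 1."""
    if task_id == 1:
        db.execute("UPDATE tasks SET status = 'running' WHERE id = 1")
        db.commit()


first_claim = claim_next_task(db, on_fetched=rival_claims_task_1)
second_claim = claim_next_task(db)
```

Here the worker loses the race for task 1, skips it, and claims task 2 instead; a later call finds nothing left to claim, so no task is handed to two workers.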

Affected versions

The missing lockTask() call and the != running guard are both present on 30.x–33.x and current master.
