taskprocessing:worker does not atomically claim tasks → duplicate processing with multiple workers #61052

Bug description

Running multiple occ taskprocessing:worker processes in parallel (as the AI admin docs recommend: "run the command 4 or more times") causes the same scheduled task to be processed by more than one worker. This happens in two ways:

- concurrently: several workers pick up the same task at the same instant, and
- sequentially: a second worker re-claims a task the first has already finished.

Each duplicate is a full extra provider/LLM invocation, which wastes worker capacity and multiplies external API cost. On a busy instance the effective throughput collapses toward that of a single worker even when many are running.

Root cause (two parts)

1. OC\Core\Command\TaskProcessing\WorkerCommand::processNextTask() never calls lockTask().
It calls IManager::getNextScheduledTask() (a plain SELECT, no lock) and then processTask() directly. By contrast the OCS endpoint TaskProcessingApiController::getNextScheduledTask() does it correctly — it loops, calls lockTask(), and skips tasks it failed to claim. So concurrent CLI workers all SELECT the same scheduled row and all proceed to process it.

2. TaskMapper::lockTask() guards with status != STATUS_RUNNING instead of status = STATUS_SCHEDULED.
This lets a second worker that SELECTed the task before the first worker finished re-claim it in any non-running state (including STATUS_SUCCESSFUL), so an already completed task is processed again sequentially.
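The effect of the two guards can be illustrated with a small simulation. This is only a sketch, not the Nextcloud code: it uses Python with an in-memory SQLite table, and the table layout, the numeric status codes, and the function names (`lock_not_running`, `lock_only_scheduled`, `stale_reclaim`) are assumptions made up for the example.

```python
import sqlite3

# Hypothetical status codes, chosen only for this illustration.
SCHEDULED, RUNNING, SUCCESSFUL = 1, 2, 3


def make_db():
    """Create an in-memory table holding a single scheduled task (id 1)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, status INTEGER NOT NULL)")
    db.execute("INSERT INTO tasks (id, status) VALUES (1, ?)", (SCHEDULED,))
    db.commit()
    return db


def lock_not_running(db, task_id):
    """Guard described in the report: claim unless the task is running."""
    cur = db.execute(
        "UPDATE tasks SET status = ? WHERE id = ? AND status != ?",
        (RUNNING, task_id, RUNNING),
    )
    db.commit()
    return cur.rowcount == 1


def lock_only_scheduled(db, task_id):
    """Proposed guard: claim only if the task is still scheduled."""
    cur = db.execute(
        "UPDATE tasks SET status = ? WHERE id = ? AND status = ?",
        (RUNNING, task_id, SCHEDULED),
    )
    db.commit()
    return cur.rowcount == 1


def stale_reclaim(lock):
    """Return True if a worker holding a stale SELECT result can re-claim a finished task."""
    db = make_db()
    # Worker B reads the task id while the task is still scheduled.
    stale_id = db.execute(
        "SELECT id FROM tasks WHERE status = ?", (SCHEDULED,)
    ).fetchone()[0]
    # Worker A claims the task, processes it, and marks it successful.
    assert lock(db, stale_id)
    db.execute("UPDATE tasks SET status = ? WHERE id = ?", (SUCCESSFUL, stale_id))
    db.commit()
    # Worker B now tries to lock the id it read earlier.
    return lock(db, stale_id)
```

In this sketch `stale_reclaim(lock_not_running)` succeeds in re-claiming the finished task, while `stale_reclaim(lock_only_scheduled)` does not, which matches the sequential re-processing described in part 2.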

Steps to reproduce

1. Configure a synchronous TaskProcessing provider (e.g. integration_openai).
2. Run 4 workers in parallel: occ taskprocessing:worker -v ×4.
3. Schedule several core:text2text tasks.
4. Watch the worker logs: the same task id is logged "Processing task N" / "Finished processing task N" by multiple PIDs.

Verified on 33.0.5 (MySQL): with 4 workers the same task id appears 4× within the same second. After applying only fix #1, a residual ~1/15 sequential re-processing of already-STATUS_SUCCESSFUL tasks remained — traced to root cause #2.

Expected behavior

Each scheduled task is processed exactly once, regardless of how many parallel workers run.

Proposed fix (verified — 8 workers, 0 duplication)

1. In WorkerCommand::processNextTask(), after getNextScheduledTask() call lockTask() and, when it returns false, add the task id to an ignore list and re-fetch, mirroring TaskProcessingApiController.
2. In TaskMapper::lockTask(), change the guard from ->neq('status', … STATUS_RUNNING) to ->eq('status', … STATUS_SCHEDULED).
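The claim loop from the two steps above can be sketched as follows. This is a minimal Python simulation under stated assumptions, not the actual PHP change: `TaskStore`, `claim_next`, and `run_workers` are hypothetical names, threads stand in for worker processes, and a mutex stands in for the atomicity of a single UPDATE statement in the database.

```python
import threading
from collections import Counter

SCHEDULED, RUNNING, SUCCESSFUL = "scheduled", "running", "successful"


class TaskStore:
    """In-memory stand-in for the task table (hypothetical, for illustration)."""

    def __init__(self, task_ids):
        self._status = {tid: SCHEDULED for tid in task_ids}
        # Plays the role of the database making each statement atomic.
        self._mutex = threading.Lock()

    def get_next_scheduled(self, ignore):
        # Plain read with no claim: several workers may get the same id.
        with self._mutex:
            for tid, status in self._status.items():
                if status == SCHEDULED and tid not in ignore:
                    return tid
        return None

    def lock(self, tid):
        # Compare-and-set: only a task that is still scheduled can be claimed.
        with self._mutex:
            if self._status[tid] != SCHEDULED:
                return False
            self._status[tid] = RUNNING
            return True

    def finish(self, tid):
        with self._mutex:
            self._status[tid] = SUCCESSFUL


def claim_next(store):
    """Fetch, try to lock, and on failure ignore that id and fetch again."""
    ignore = set()
    while True:
        tid = store.get_next_scheduled(ignore)
        if tid is None:
            return None  # nothing left to claim
        if store.lock(tid):
            return tid  # this worker owns the task
        ignore.add(tid)  # another worker won the race; skip it


def run_workers(store, n_workers):
    """Run n_workers threads and count how often each task id is processed."""
    processed = Counter()
    counter_mutex = threading.Lock()

    def worker():
        while (tid := claim_next(store)) is not None:
            with counter_mutex:
                processed[tid] += 1  # stands in for the provider invocation
            store.finish(tid)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

For example, `run_workers(TaskStore(range(50)), 8)` returns a counter in which every task id appears exactly once, because only the worker whose `lock()` call succeeds goes on to process the task.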

With both changes, 8 concurrent workers processed every task exactly once (0 duplicates over repeated measurement windows). A PR follows.

Affected versions

The missing lockTask() call and the != running guard are both present on 30.x–33.x and current master.
