📦[Feature] Support for async job offloading #323

syntaxsdev · Jul 14, 2025

Description

closes #320

This adds support for async jobs: you submit an audio file for processing and receive a job_id that you can use to check back later for the result.

Job data is stored in a SQLite DB
Transcription results are NOT stored in the SQLite DB; they are kept in a set of jobs
Jobs are processed in the background, in order, using a queue
Data is stored only temporarily, for a configurable period of time
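The storage and queueing design above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `JobRunner` class, the `jobs` table layout, and every status name except `failed` are hypothetical, and it assumes the "set of jobs" holding results is a plain in-process dictionary.

```python
import queue
import sqlite3
import threading
import uuid
from typing import Callable, Dict, Optional


class JobRunner:
    """Hypothetical sketch: job metadata in SQLite, results in memory, FIFO worker."""

    def __init__(self, process: Callable[[bytes], str]) -> None:
        self._process = process
        # One shared connection, so every access goes through self._lock.
        self._db = sqlite3.connect(":memory:", check_same_thread=False)
        self._db.execute("CREATE TABLE jobs (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
        self._lock = threading.Lock()
        self._queue = queue.Queue()          # holds (job_id, audio) tuples in FIFO order
        self.results: Dict[str, str] = {}    # results stay out of SQLite (assumed dict)
        threading.Thread(target=self._worker, daemon=True).start()

    def _set_status(self, job_id: str, status: str) -> None:
        with self._lock:
            self._db.execute(
                "INSERT OR REPLACE INTO jobs (id, status) VALUES (?, ?)",
                (job_id, status),
            )
            self._db.commit()

    def submit(self, audio: bytes) -> str:
        """Record the job, enqueue it, and return its id immediately."""
        job_id = uuid.uuid4().hex
        self._set_status(job_id, "queued")
        self._queue.put((job_id, audio))
        return job_id

    def status(self, job_id: str) -> Optional[str]:
        with self._lock:
            row = self._db.execute(
                "SELECT status FROM jobs WHERE id = ?", (job_id,)
            ).fetchone()
        return row[0] if row else None

    def join(self) -> None:
        """Block until every queued job has been handled."""
        self._queue.join()

    def _worker(self) -> None:
        # A single worker thread means jobs run strictly in submission order.
        while True:
            job_id, audio = self._queue.get()
            self._set_status(job_id, "processing")
            try:
                text = self._process(audio)
                with self._lock:
                    self.results[job_id] = text
                self._set_status(job_id, "completed")
            except Exception:
                self._set_status(job_id, "failed")
            finally:
                self._queue.task_done()
```

A single worker thread keeps processing in submission order, which matches the "in order using a queue" behaviour described above. Real transcription would replace the `process` callable.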

Jobs are cleaned up on two occasions:
https://github.com/syntaxsdev/whisper-asr-webservice/blob/3678d2aff95aff673fd4496e5aa8c2b11c1a6ae6/app/config.py#L56-L59

# How long to keep a batch process after its value has been read - Default is 30 minutes
JOB_CLEANUP_AFTER_READ = int(os.getenv("JOB_CLEANUP_AFTER_READ", 1800))
# How long to keep a batch process after it has been abandoned (value not read) - Default is 24 hours
JOB_CLEANUP_ABANDONED = int(os.getenv("JOB_CLEANUP_ABANDONED", 86400))

This means that once a job is processed, you have:

24 hours (default) to read the value; otherwise the job is considered abandoned and deleted
30 minutes (default) after you read the value before the job is deleted
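The two retention windows can be expressed as a single expiry check. The sketch below is illustrative only: `JobTimes`, its field names, and `is_expired` are hypothetical, and it assumes the after-read clock starts at the first read and the abandoned clock starts when processing finishes.

```python
import time
from dataclasses import dataclass
from typing import Optional

JOB_CLEANUP_AFTER_READ = 1800   # seconds (30 minutes), default from config.py
JOB_CLEANUP_ABANDONED = 86400   # seconds (24 hours), default from config.py


@dataclass
class JobTimes:
    # Hypothetical record; these field names are assumptions, not the PR's schema.
    completed_at: float               # when processing finished
    read_at: Optional[float] = None   # when the result was first read, if ever


def is_expired(job: JobTimes, now: Optional[float] = None) -> bool:
    """Return True once the job has outlived the window that applies to it."""
    now = time.time() if now is None else now
    if job.read_at is not None:
        # The value was read: keep it for JOB_CLEANUP_AFTER_READ seconds more.
        return now - job.read_at >= JOB_CLEANUP_AFTER_READ
    # Never read: treat it as abandoned after JOB_CLEANUP_ABANDONED seconds.
    return now - job.completed_at >= JOB_CLEANUP_ABANDONED
```

Passing `now` explicitly keeps the check easy to test; a periodic cleanup task could call it for each stored job.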

Usage

POST asr/ with the async_job param set to true
Example response:

To retrieve the result:
GET asr/{job_id} (new endpoint)
Example response:

In this particular example, I also used diarization with the WhisperX model.

If a job fails, its status is reported as failed, and the job is cleaned up after the JOB_CLEANUP_AFTER_READ period.
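On the client side, the submit-then-check-back flow amounts to polling GET asr/{job_id} until the job finishes or fails. The helper below is a hedged sketch: `wait_for_job` and `fetch_job` are hypothetical names, the `status` response key and the `completed` value are assumptions about the response shape, and only the `failed` status comes from the description above.

```python
import time
from typing import Any, Callable, Dict, Tuple


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    done_statuses: Tuple[str, ...] = ("completed",),  # assumed status name
    poll_seconds: float = 5.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_job(job_id) repeatedly until the job is done or has failed.

    fetch_job is expected to perform GET asr/{job_id} and return the decoded
    JSON body as a dict.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_job(job_id)
        status = job.get("status")  # "status" key is an assumption
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if status in done_statuses:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout_seconds}s")
        sleep(poll_seconds)
```

With an HTTP client such as `requests`, `fetch_job` could be something like `lambda job_id: requests.get(f"{base_url}/asr/{job_id}").json()`, where `base_url` is wherever the service is deployed. Keep the polling interval well under JOB_CLEANUP_AFTER_READ so a result is not cleaned up between reads.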

Other Notes

I've built this so that it can eventually be expanded to async batch jobs, where you upload multiple files at once (or add multiple files to a job) and then kick off the job. That is why the output is structured the way it is.

I think a separate PR would be warranted for that.

Testing

Tested both locally and containerized (CPU/GPU).
Verified to work in Kubernetes (OpenShift)
Works with a 921 MB (nearly 1 GB) audio file; tested on GPU, it took 2 minutes

Test containers:
docker.io/syntaxsdev/whisper-asr-webservice:latest
docker.io/syntaxsdev/whisper-asr-webservice:latest-gpu

syntaxsdev · Aug 11, 2025

bump @ahmetoner :)

syntaxsdev added 3 commits July 13, 2025 23:09

feat: support for async job offloading of audio files

3678d2a

refactor: cleanup unsued

cbd6773

feat: update makefile to support dev / local builds

f2f4063
