Top 23 Python speech-recognition Projects

transformers

1 225 154,507 10.0 Python

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Project mention: Detecting AI Slop: Techniques & Red Flags | dev.to | 2025-12-28

HuggingFace Transformers - Library for building custom detectors
faster-whisper

2 25 19,699 6.8 Python

Faster Whisper transcription with CTranslate2
whisperX

3 39 19,392 8.6 Python

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Project mention: A beginner's guide to the Whisperx-A40-Large model by Victor-Upmeet on Replicate | dev.to | 2026-01-04

The whisperx-a40-large model is an accelerated version of the popular Whisper automatic speech recognition (ASR) model. Developed by Victor Upmeet, it provides fast transcription with word-level timestamps and speaker diarization. This model builds upon the capabilities of Whisper, which was originally created by OpenAI, and incorporates optimizations from the WhisperX project for improved performance.
FunASR

4 4 14,261 8.2 Python

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Project mention: CosyVoice 2025 Complete Guide: The Ultimate Multi-lingual Text-to-Speech Solution | dev.to | 2025-12-15

FunASR - Automatic Speech Recognition
PaddleSpeech

5 6 12,477 8.4 Python

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
speechbrain

6 28 11,003 9.2 Python

A PyTorch-based Speech Toolkit

Project mention: 5 must know open-source repositories to build cool AI apps | dev.to | 2025-10-29

Star the Speech Brain repository ⭐
espnet

7 15 9,666 9.9 Python

End-to-End Speech Processing Toolkit
SpeechRecognition

8 16 8,919 8.3 Python

Speech recognition module for Python, supporting several engines and APIs, online and offline.
SenseVoice

9 2 7,280 4.9 Python

Multilingual Voice Understanding Model
voice-pro

10 13 5,220 7.1 Python

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

Project mention: Show HN: Likes/day as fake profile → built my own dating app in 100 days | news.ycombinator.com | 2025-12-16
wenet

11 6 4,983 7.8 Python

Production First and Production Ready End-to-End Speech Recognition Toolkit

Project mention: CosyVoice 2025 Complete Guide: The Ultimate Multi-lingual Text-to-Speech Solution | dev.to | 2025-12-15

WeNet - Speech Recognition Toolkit
Porcupine

12 32 4,573 8.5 Python

On-device wake word detection powered by deep learning

Project mention: Show HN: Shoggoth Mini – A weird tentacle robot powered by GPT-4o and RL | news.ycombinator.com | 2025-07-15

> also, "GPT-4o continuously listens to speech through the audio stream," is going to be problematic
Seems like openWakeWord or porcupine could be able to solve by adding a layer for wake word detection before sending the prompt off.
I wonder if latency would be any better with a local model cached in a 16GB or 24GB graphics card. It would have to be a quantized/distilled model, but maybe performance would still be acceptable.
https://github.com/dscripka/openWakeWord
https://github.com/Picovoice/porcupine
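
The gating idea in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the API of Porcupine or openWakeWord: `detect_wake_word` is a hypothetical stand-in for whichever on-device detector is used, `send_to_model` is a hypothetical stand-in for the remote call, and the fixed capture window is an arbitrary choice.

```python
from typing import Callable, Iterable, List

Frame = bytes  # one chunk of raw audio; the exact format is an assumption


def gate_audio(
    frames: Iterable[Frame],
    detect_wake_word: Callable[[Frame], bool],     # hypothetical local detector
    send_to_model: Callable[[List[Frame]], None],  # hypothetical remote call
    frames_after_wake: int = 3,                    # illustrative window, not tuned
) -> int:
    """Forward audio only after the local detector fires.

    Returns the number of utterances that were forwarded.
    """
    forwarded = 0
    buffer: List[Frame] = []
    capturing = False
    for frame in frames:
        if not capturing:
            # Nothing leaves the device until the wake word is heard.
            capturing = detect_wake_word(frame)
            continue
        buffer.append(frame)
        if len(buffer) == frames_after_wake:
            send_to_model(buffer)
            forwarded += 1
            buffer = []        # start a fresh buffer for the next utterance
            capturing = False  # go back to waiting for the wake word
    # A partially filled buffer at end of stream is dropped in this sketch.
    return forwarded
```

A real system would more likely end the capture on detected silence than on a fixed frame count; the fixed window just keeps the sketch short.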
ml-road

13 1 4,562 2.8 Python

Machine Learning and Agentic AI Resources, Practice and Research

Project mention: Neural Networks: Zero to Hero | news.ycombinator.com | 2026-01-04

Well, no ... For a start any "AI" course 20 years ago probably wouldn't have even mentioned neural nets, and certainly not as a mainstream technique.
A 20yr old "AI" curriculum would have looked more like the 3rd edition of Russell & Norvig's "Artificial Intelligence - A Modern Approach".
https://github.com/yanshengjia/ml-road/blob/master/resources...
Karpathy's videos aren't an AI (except in modern sense of AI=LLMs) course, or even a machine learning course, or even a neural network course for that matter (despite the title) - it's really just "From Zero to LLMs".
distil-whisper

14 10 4,010 7.6 Python

Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
whisper-asr-webservice

15 11 3,090 8.1 Python

OpenAI Whisper ASR Webservice API
lingvo

16 1 2,855 6.0 Python

Lingvo
whisper-standalone-win

17 7 2,768 5.0 Python

Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

Project mention: Show HN: Python Audio Transcription: Convert Speech to Text Locally | news.ycombinator.com | 2025-09-22

I like this version of Whisper which has diarization built in: https://github.com/Purfview/whisper-standalone-win
whisper-timestamped

18 2 2,713 6.1 Python

Multilingual Automatic Speech Recognition with word-level timestamps and confidence
lip-reading-deeplearning

19 1 1,892 10.0 Python

:unlock: Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
kalliope

20 4 1,753 0.0 Python

Kalliope is a framework that will help you to create your own personal assistant.
Dragonfire

21 2 1,398 0.0 Python

the open-source virtual assistant for Ubuntu based Linux distributions
SpeechT5

22 4 1,398 7.1 Python

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
SALMONN

23 2 1,373 7.3 Python

SALMONN family: A suite of advanced multi-modal LLMs

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python speech-recognition related posts

A beginner's guide to the Whisperx-A40-Large model by Victor-Upmeet on Replicate

1 project | dev.to | 4 Jan 2026
CosyVoice 2025 Complete Guide: The Ultimate Multi-lingual Text-to-Speech Solution

5 projects | dev.to | 15 Dec 2025
Video to Text AI: The [2025 Guide] to Unlocking Revenue from Content

1 project | dev.to | 10 Dec 2025
Making AI Models Faster, Cheaper, and Greener — Here’s How

5 projects | dev.to | 3 Nov 2025
5 must know open-source repositories to build cool AI apps

6 projects | dev.to | 29 Oct 2025
Kitten TTS: 25MB CPU-Only, Open-Source Voice Model

19 projects | news.ycombinator.com | 5 Aug 2025
Ask HN: What Speaker Diarization tools should I look into?

1 project | news.ycombinator.com | 23 Jul 2025

Index

What are some of the best open-source speech-recognition projects in Python? This list will help you:

#	Project	Stars
1	transformers	154,507
2	faster-whisper	19,699
3	whisperX	19,392
4	FunASR	14,261
5	PaddleSpeech	12,477
6	speechbrain	11,003
7	espnet	9,666
8	SpeechRecognition	8,919
9	SenseVoice	7,280
10	voice-pro	5,220
11	wenet	4,983
12	Porcupine	4,573
13	ml-road	4,562
14	distil-whisper	4,010
15	whisper-asr-webservice	3,090
16	lingvo	2,855
17	whisper-standalone-win	2,768
18	whisper-timestamped	2,713
19	lip-reading-deeplearning	1,892
20	kalliope	1,753
21	Dragonfire	1,398
22	SpeechT5	1,398
23	SALMONN	1,373

Did you know that Python is the 2nd most popular programming language based on number of references?
