edtimer/chaplin

Chaplin

A visual speech recognition (VSR) tool that reads your lips in real time and types whatever you silently mouth. It runs fully locally.

It relies on a model trained on the Lip Reading Sentences 3 (LRS3) dataset as part of the Auto-AVSR project.

Watch a demo of Chaplin here.

Setup

  1. Clone the repository, and cd into it:
    git clone https://github.com/amanvirparhar/chaplin
    cd chaplin
  2. Download the required model components: LRS3_V_WER19.1 and lm_en_subword.
  3. Unzip both folders, and place them in their respective directories:
    chaplin/
    ├── benchmarks/
    │   └── LRS3/
    │       ├── language_models/
    │       │   └── lm_en_subword/
    │       └── models/
    │           └── LRS3_V_WER19.1/
    └── ...
    
  4. Install and run Ollama, then pull the llama3.2 model.
  5. Install uv.
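
Before the first run, it can help to confirm that the two model components from step 3 ended up where the directory tree shows them. The snippet below is a minimal sketch of such a check, not part of Chaplin itself: the function name `missing_model_dirs` and the constant `REQUIRED_DIRS` are hypothetical, and it only tests that the folders exist, not that their contents are complete.

```python
from pathlib import Path

# Relative paths taken from the directory tree in step 3 of Setup.
REQUIRED_DIRS = (
    Path("benchmarks/LRS3/language_models/lm_en_subword"),
    Path("benchmarks/LRS3/models/LRS3_V_WER19.1"),
)


def missing_model_dirs(repo_root):
    """Return the required directories that do not exist under repo_root.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    root = Path(repo_root)
    return [rel for rel in REQUIRED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Run from the root of the cloned repository.
    missing = missing_model_dirs(".")
    if missing:
        for rel in missing:
            print(f"missing: {rel}")
    else:
        print("model components found")
```

Saved under any name in the repository root, the script can be run with plain `python`, since it uses only the standard library.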

Usage

  1. Run the following command:
    sudo uv run --with-requirements requirements.txt --python 3.12 main.py config_filename=./configs/LRS3_V_WER19.1.ini detector=mediapipe
  2. Once the camera feed is displayed, you can start "recording" by pressing the option key (Mac) or the alt key (Windows/Linux), and start mouthing words.
  3. To stop recording, press the option key (Mac) or the alt key (Windows/Linux) again. You should see some text being typed out wherever your cursor is.
  4. To exit gracefully, focus on the window displaying the camera feed and press q.
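
The start/stop behaviour described in steps 2 and 3 amounts to a toggle: one key press begins buffering camera frames, and the next press ends the clip and hands it off for recognition. The sketch below illustrates that idea only. The class `RecordingToggle` and its methods are hypothetical names, and this is an assumption about the general pattern rather than a description of how `main.py` is actually written.

```python
class RecordingToggle:
    """Tracks whether recording is active and buffers frames while it is.

    Hypothetical illustration of a press-to-start, press-to-stop hotkey;
    not taken from the Chaplin source.
    """

    def __init__(self):
        self.recording = False
        self._frames = []

    def on_hotkey(self):
        """Flip the state; return the buffered frames when recording stops, else None."""
        if not self.recording:
            # First press: start a fresh clip.
            self.recording = True
            self._frames = []
            return None
        # Second press: stop and hand back everything captured so far.
        self.recording = False
        frames, self._frames = self._frames, []
        return frames

    def on_frame(self, frame):
        """Keep a frame only while recording is active."""
        if self.recording:
            self._frames.append(frame)


if __name__ == "__main__":
    toggle = RecordingToggle()
    toggle.on_frame("ignored")   # dropped: recording has not started
    toggle.on_hotkey()           # start recording
    toggle.on_frame("f1")
    toggle.on_frame("f2")
    clip = toggle.on_hotkey()    # stop recording
    print(clip)
```

Keeping the toggle separate from the camera loop and the key listener makes the state easy to test without a webcam or keyboard hook.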

About

A real-time silent speech recognition tool that determines what a person is saying from their lip movements.

Languages

  • Python 100.0%