Hello, we're Minish!

About us

We're a two-person (pringled and stephantul) open-source lab, with a focus on Natural Language Processing.

We believe that if you make models fast enough, you unlock new possibilities.

Using our models and packages, you can:

  • Embed the entire English Wikipedia in 5 minutes
  • Classify tens of thousands of documents per second on a CPU
  • Approximately deduplicate extremely large datasets in minutes
  • Build the fastest RAG application in the world
  • Easily evaluate which ANN algorithm works best for your data

Our projects:

  • model2vec: tiny static embedding models with state-of-the-art performance.
  • potion: the best small models in the world. 100-500x faster than a sentence-transformer, and almost as good.
  • semble: the fastest and best code search library for your agent.
  • vicinity: consistent interfaces to many approximate nearest neighbor algorithms.
  • semhash: lightning-fast, highly accurate semantic deduplication and filtering for your text datasets.
  • model2vec-rs: a Rust port of model2vec.
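
To make the ideas behind "static embeddings" and "semantic deduplication" concrete, here is a minimal toy sketch in plain Python. It is not how model2vec or semhash are implemented and does not use their APIs. The vocabulary `TOY_VECTORS`, the helpers `embed`, `cosine`, and `deduplicate`, and the `0.95` threshold are all hypothetical and chosen only for illustration. The sketch assumes the usual meaning of a static embedding: each token has one fixed vector, and a text is represented by the average of its token vectors.

```python
import math

# Hypothetical toy vocabulary. A real static model would store one learned
# vector per token; these values are made up for illustration.
TOY_VECTORS = {
    "fast": [1.0, 0.0, 0.0],
    "quick": [0.9, 0.1, 0.0],
    "model": [0.0, 1.0, 0.0],
    "models": [0.0, 0.9, 0.1],
    "slow": [-1.0, 0.0, 0.0],
}


def embed(text):
    """Average the vectors of known tokens (a lookup, not a neural forward pass)."""
    vectors = [TOY_VECTORS[t] for t in text.lower().split() if t in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def deduplicate(texts, threshold=0.95):
    """Greedy filter: keep a text only if it is not too similar to any kept text."""
    kept = []
    for text in texts:
        vec = embed(text)
        if all(cosine(vec, embed(k)) < threshold for k in kept):
            kept.append(text)
    return kept


print(deduplicate(["fast model", "quick models", "slow model"]))
```

Because embedding is just a table lookup plus an average, it is cheap enough to run on a CPU. The greedy loop compares each text against everything already kept, so large datasets would in practice need an approximate nearest neighbor index instead of this quadratic scan.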


Pinned

  1. model2vec (Public)

    Fast State-of-the-Art Static Embeddings

    Python · 2.1k stars · 121 forks

  2. semble (Public)

    Fast and Accurate Code Search for Agents. Uses ~98% fewer tokens than grep+read

    Python · 4.9k stars · 205 forks

  3. semhash (Public)

    Fast Multimodal Semantic Deduplication & Filtering

    Python · 935 stars · 57 forks

  4. vicinity (Public)

    Lightweight Nearest Neighbors with Flexible Backends

    Python · 345 stars · 13 forks

  5. tokenlearn (Public)

    Pre-train Static Word Embeddings

    Python · 104 stars · 9 forks

  6. model2vec-rs (Public)

    Official Rust Implementation of Model2Vec

    Rust · 190 stars · 23 forks
