Python Machine Learning

Open-source Python projects categorized as Machine Learning

Top 23 Python Machine Learning Projects

Machine Learning
  1. transformers

    🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

    Project mention: Detecting AI Slop: Techniques & Red Flags | dev.to | 2025-12-28

    HuggingFace Transformers - Library for building custom detectors

  2. Stream

    Stream - Scalable APIs for Chat, Feeds, Moderation, & Video. Stream helps developers build engaging apps that scale to millions with performant and flexible Chat, Feeds, Moderation, and Video APIs and SDKs powered by a global edge network and enterprise-grade infrastructure.

  3. Pytorch

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Project mention: Avoid UUIDv4 Primary Keys | news.ycombinator.com | 2025-12-15

    > A running number also carries data. Before you know it, someone's relying on the ordering or counting on there not being gaps - or counting the gaps to figure out something they shouldn't.

    For example, if https://github.com/pytorch/pytorch/issues/111111 can be seen but https://github.com/pytorch/pytorch/issues/111110 can't, someone might infer the existence of a hidden issue relating to a critical security problem.

    Whereas if the URL was instead https://github.com/pytorch/pytorch/issues/761500e0-0070-4c0d... that risk would be avoided.
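
    The inference described in this comment can be illustrated with a minimal sketch. It assumes a hypothetical tracker in which issue 111110 is hidden and its neighbours are public; the helper name `missing_ids` is an illustration only and is not part of any real API. Sequential IDs let an outsider list the gaps, while random UUIDv4 values carry no ordering to exploit.

```python
import uuid


def missing_ids(visible_ids):
    """Return IDs inside the visible range that the caller cannot see."""
    lo, hi = min(visible_ids), max(visible_ids)
    return sorted(set(range(lo, hi + 1)) - set(visible_ids))


# Hypothetical example: two public issues surround one hidden issue.
visible = [111109, 111111]
hidden = missing_ids(visible)
print(hidden)  # the gap reveals that issue 111110 exists

# Random (version 4) UUIDs have no ordering, so there is no gap to count.
random_ids = [uuid.uuid4() for _ in range(3)]
print(random_ids)
```

    The trade-off raised in the linked discussion still applies: random keys avoid this leak but lose the locality of a running number.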

  4. nn

    ๐Ÿง‘โ€๐Ÿซ 60+ Implementations/tutorials of deep learning papers with side-by-side notes ๐Ÿ“; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), ๐ŸŽฎ reinforcement learning (ppo, dqn), capsnet, distillation, ... ๐Ÿง 

  5. scikit-learn

    scikit-learn: machine learning in Python

    Project mention: Data Analyst Guide: Mastering Random Forest vs XGBoost: Which Wins for Analytics? | dev.to | 2026-01-05

    scikit-learn documentation: https://scikit-learn.org/

  6. Keras

    Deep Learning for humans

    Project mention: PyTorch vs TensorFlow 2025: Which one wins after 72 hours? | dev.to | 2025-08-29

    Keras 3 multi-backend

  7. yolov5

    YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

    Project mention: Teaching AI to Read Emotions: Science, Challenges, and Innovation Behind Facial Emotion Detection with YOLOv11 on Raspberry Pi | dev.to | 2025-11-23

    Ultralytics YOLO Documentation

  8. OpenBB

    Financial data platform for analysts, quants and AI agents.

    Project mention: 📊 2026-01-04 - Daily Intelligence Recap - Top 5 Signals | dev.to | 2026-01-04

    OpenBB-finance / OpenBB

  9. InfluxDB

    InfluxDB – Built for High-Performance Time Series Workloads. InfluxDB 3 OSS is now GA. Transform, enrich, and act on time series data directly in the database. Automate critical tasks and eliminate the need to move data externally. Download now.

  10. Face Recognition

    The world's simplest facial recognition api for Python and the command line

    Project mention: Show HN: Real-time privacy protection for smart glasses | news.ycombinator.com | 2025-08-11

    Did you look at egoblur? It's a lot more effective at face detection than https://github.com/ageitgey/face_recognition. Granted, you'd have to do your own face matching to handle exceptions.

  11. faceswap

    Deepfakes Software For All

  12. ultralytics

    Ultralytics YOLO 🚀

    Project mention: Why DETRs are replacing YOLOs for real-time object detection | news.ycombinator.com | 2025-11-22

    > The YOLO series is developed and maintained by Ultralytics. All YOLO code and weights are released under the AGPL-3.0 license.

    The original author of YOLO and the Darknet framework [1] issued the code under pretty much every license you wish to use [2]. My preferred fork by AlexeyAB is under an equally permissive license [3].

    Ultralytics then created their own model under the AGPL-3.0 license [4], which probably would never stand up in a court as they have the model from the likes of YOLOv3 in their source [5].

    This entire article is flawed anyway, because they don't state which YOLOv11 model they are using or compare the accuracy. They appear to have just taken the pre-trained models and assumed it's apples-to-apples. They could have at least compared YOLO11n/s/m/l/x.

    [1] https://pjreddie.com/darknet/yolo/

    [2] https://github.com/pjreddie/darknet

    [3] https://github.com/AlexeyAB/darknet

    [4] https://github.com/ultralytics/ultralytics

    [5] https://github.com/ultralytics/ultralytics/tree/main/ultraly...

  13. Airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Project mention: Top Open-Source Data Engineering Tools- Unravelling the Best in 2026 | dev.to | 2025-12-10

    Airflow

  14. streamlit

    Streamlit - A faster way to build and share data apps.

    Project mention: Experimenting with Javelit - The Streamlit for Java | dev.to | 2025-12-20

    Javelit brings the power of rapid prototyping and interactive web app development to the Java ecosystem, much like Streamlit does for Python. With its simple, loop-based programming model, developers can quickly build data-driven applications without needing extensive frontend knowledge, leveraging familiar Java syntax and the rich JVM ecosystem. The live-reload feature enables instant experimentation and iteration, making it ideal for prototyping AI agents, data visualizations, and interactive tools. By integrating seamlessly with libraries like LangGraph4j combined with both Spring AI and LangChain4j, Javelit empowers Java developers to create engaging user interfaces effortlessly, bridging the gap between backend logic and user-facing applications. Check out the project, try it, and let me know your feedback and ... happy coding! 👋

  15. gradio

    Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

    Project mention: The Ultimate Guide to Building Stunning AI Apps For Beginners - Gradio | dev.to | 2025-11-14

    Why Gradio is the New Superpower for Every AI Learner in 2025

  16. DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Project mention: All Data and AI Weekly #193 - June 9, 2025 | dev.to | 2025-06-09

  17. Ray

    Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

    Project mention: Top Open-Source Data Engineering Tools- Unravelling the Best in 2026 | dev.to | 2025-12-10

    Ray

  18. MindsDB

    Query Engine for AI - The only MCP Server you'll ever need

    Project mention: MindsDB Supercharges Google's MCP Toolbox with Unstructured Data Support | dev.to | 2025-12-29

    We’re happy to announce that we’ve integrated MindsDB with Google's open-source project, MCP (Model Context Protocol) Toolbox. This will make your AI applications very, very smart. This enhancement expands the Toolbox's reach, especially for organizations grappling with lots of siloed data.

  19. Open-Assistant

    OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

  20. gym

    A toolkit for developing and comparing reinforcement learning algorithms.

  21. supervision

    We write your reusable computer vision tools. 💜

    Project mention: Show HN: Plug-and-play Python utils for any computer-vision pipeline | news.ycombinator.com | 2025-07-21

  22. paperless-ngx

    A community-supported supercharged document management system: scan, index and archive all your documents

    Project mention: Review for Synology DiskStation DS925+: A feature-packed NAS | dev.to | 2025-10-30

    Borg Backup - I use it to automatically back up my main hosted Docker services. I have publicly hosted instances of Immich, and Paperless-NGX using Docker containers. I periodically make a backup of their data folder using Borg and store it in a Borg repo. The advantage of storing the backups in a Borg repo is that it is a deduplicating archival program. So no matter how many backups you make, it will not take any extra space than the first backup, provided nothing has changed. If there is a change, only that changed chunk is backed up, just like git. Also, you can easily encrypt and/or compress while backing up. Restoring a backup is also as easy as running a single Borg command.
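
    The deduplication idea described in this mention can be sketched in a few lines. This is a simplified illustration and not Borg's actual implementation: it assumes fixed-size chunks and an in-memory dictionary, and the names `ChunkStore`, `backup`, and `restore` are hypothetical. Each chunk is stored under its SHA-256 digest, so an unchanged backup adds nothing and a changed backup adds only the chunks that differ.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the demo data splits into a few chunks


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def backup(self, data: bytes) -> list:
        """Store data chunk by chunk and return the list of chunk digests."""
        manifest = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            manifest.append(digest)
        return manifest

    def restore(self, manifest) -> bytes:
        """Rebuild the original bytes from a manifest of digests."""
        return b"".join(self.chunks[d] for d in manifest)


store = ChunkStore()
first = store.backup(b"aaaabbbbcccc")
count_after_first = len(store.chunks)
second = store.backup(b"aaaabbbbcccc")   # unchanged data: no new chunks
count_after_second = len(store.chunks)
third = store.backup(b"aaaabbbbdddd")    # one chunk changed: one new chunk
count_after_third = len(store.chunks)
print(count_after_first, count_after_second, count_after_third)
```

    Restoring is a lookup of each digest in the manifest, which mirrors the point that a restore is a single operation once the repository holds the chunks.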

  23. qlib

    Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD-Agent to automate R&D process.

    Project mention: Choosing the Right AI Model for Stock Prediction | dev.to | 2025-10-04

    After researching different AI models in Qlib (a quantitative finance platform), here's what I learned:

  24. spaCy

    💫 Industrial-strength Natural Language Processing (NLP) in Python

    Project mention: Solved: Is there a better way to test subject lines besides random A/B tools? | dev.to | 2025-12-29

    Open-Source NLP Libraries: Python libraries like spaCy, NLTK, and Hugging Face Transformers for building custom models.

  25. dspy

    DSPy: The framework for programming - not prompting - language models

    Project mention: Agent Optimization: Why Context Engineering Isn’t Enough | dev.to | 2025-10-02

    These methods improve efficiency, reduce hallucination, and enhance autonomy. Frameworks such as LangChain and DSPy could integrate many of these strategies, proving their practical value.

  26. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Machine Learning related posts

  • How to Evaluate Your Text-to-SQL Agent in Cortex Analyst Using TruLens

    1 project | dev.to | 5 Jan 2026
  • Data Analyst Guide: Mastering Random Forest vs XGBoost: Which Wins for Analytics?

    1 project | dev.to | 5 Jan 2026
  • SynthTS – Open-source CLI for generating privacy-safe synthetic time-series

    1 project | news.ycombinator.com | 5 Jan 2026
  • AWS Sagemaker Notebook Jobs for Accelerating Data Science Experimentation Workflows with Mlflow and Optuna

    3 projects | dev.to | 4 Jan 2026
  • 📊 2026-01-04 - Daily Intelligence Recap - Top 5 Signals

    1 project | dev.to | 4 Jan 2026
  • Build a Deep Learning Library

    3 projects | news.ycombinator.com | 1 Jan 2026
  • MindsDB Supercharges Google's MCP Toolbox with Unstructured Data Support

    2 projects | dev.to | 29 Dec 2025

Index

What are some of the best open-source Machine Learning projects in Python? This list will help you:

 #  Project           Stars
 1  transformers      154,507
 2  Pytorch            96,237
 3  nn                 65,107
 4  scikit-learn       64,474
 5  Keras              63,678
 6  yolov5             56,518
 7  OpenBB             56,002
 8  Face Recognition   55,756
 9  faceswap           54,846
10  ultralytics        50,555
11  Airflow            43,710
12  streamlit          42,959
13  gradio             41,151
14  DeepSpeed          41,145
15  Ray                40,583
16  MindsDB            38,177
17  Open-Assistant     37,492
18  gym                36,649
19  supervision        36,247
20  paperless-ngx      35,263
21  qlib               35,136
22  spaCy              33,030
23  dspy               31,171

Did you know that Python is the 2nd most popular programming language based on number of references?
