etl-pipelines

Here are 25 public repositories matching this topic...

yobix-ai / extractous

Fast and efficient unstructured data extraction. Written in Rust with bindings for many languages.

nlp rust pdf machine-learning natural-language-processing ocr etl tika extraction docx data-pipelines pdf-parser unstructured unstructured-data rag etl-pipelines llm

Updated Dec 21, 2024
Rust

Burla-Cloud / burla

The simplest way to run Python on lots of computers.

python data-pipelines batch-processing etl-pipelines

Updated Oct 17, 2025
TypeScript

patterns-app / patterns-devkit

Data pipelines from re-usable components

data-science sql etl pipelines immutability data-engineering functional-reactive-programming data-analysis data-pipelines data-pipeline etl-framework etl-pipeline etl-pipelines

Updated Mar 30, 2023
Python

level-vc / useful

The open-source Useful SDK. A single Python decorator from the Useful library provides full observability of Python functions within an ETL pipeline.

etl telemetry etl-pipelines python-observability

Updated Jan 11, 2024
Python

Chek0rrdn / DataEngineer_ETL

A project structure for doing and sharing data engineering work.

scraper etl cookiecutter python3 data-engineering data-extraction cookiecutter-template etl-pipeline etl-pipe etl-pipelines

Updated Feb 28, 2022
Python

datacompose / datacompose

Clean API primitives for data cleaning in PySpark. Inspired by PyJanitor, Dataprep.AI and Shadcn.

python data-science sql ai etl pyspark salesforce databricks etl-pipeline etl-pipelines

Updated Sep 15, 2025
Python

abrahamkoloboe27 / Airflow-Pipeline-Dashboard-Compagnie-Aerienne

Link to the application

python docker dockerfile airflow mongodb etl docker-compose makefile postgresql orchestration data-engineering atlas extract-transform-load mongodb-atlas etl-pipeline streamlit etl-pipelines streamlit-dashboard duckdb

Updated Dec 18, 2024
Python

angelxd84130 / Airflow-ETL

Build ETL pipelines on Airflow to load data from BigQuery and store it in MySQL.

mysql bigquery airflow etl apache-airflow etl-pipeline airflow-dags etl-pipelines

Updated Aug 1, 2022
Python

ragztigadi / BigData-ETL-Pipelines-Ecommerce

Big Data ETL pipeline for Brazilian e-commerce data. Implements data ingestion, transformation, and storage using Apache Spark, Hadoop, and SQL. Designed for scalable data processing and analytics.

mysql sql mongodb python3 powerbi azure-databricks azure-devops etl-pipelines

Updated Apr 1, 2025
HTML

EmmanuelEzenwere / DataSift

DataSift automatically applies a data pre-processing pipeline to data science projects.

data-science data-engineering data-preprocessing etl-pipelines

Updated May 28, 2024
Python

ChristianRCanlas / ChristianRCanlas.github.io

e-Portfolio showcasing my personal projects.

Updated Jan 13, 2025
Python

prneidhardt / Apache-Data-Pipeline

Sparkify project

python aws airflow-dags etl-pipelines

Updated Nov 11, 2024
Jupyter Notebook

Guilherme-B / baboon

JSON-driven ETL pipeline framework prototype

json dag bonobo etl-pipelines

Updated Mar 25, 2020
Python

siddarthaThentu / Disaster-Response-Pipeline

A deployed machine learning model that automatically classifies incoming disaster messages into 36 related categories. Developed as part of Udacity's Data Science Nanodegree program.

bootstrap flask machine-learning plotly python3 data-analytics hyperparameter-optimization feature-engineering ensemble-models ml-pipelines etl-pipelines

Updated Jun 10, 2021
Python

juniors90 / PymaciesArg

An extension that registers all pharmacies in Argentina.

python datascience argentina pharmacy etl-framework etl-pipeline etl-job pharmacies pypi-package etl-automation etl-pipelines

Updated Oct 16, 2022
Python

vossmoos / fleetfluid

FleetFluid is a Python library that simplifies data transformation by letting you use AI-powered functions without writing (and hosting) them from scratch.

etl ai-agents etl-pipelines etl-pipeline-automation

Updated Sep 28, 2025
Python

ccalobeto / analytics_metropolitano

Data pipeline that displays analytical indicators for Lima's Metropolitano service

gcp observablehq etl-pipelines

Updated Oct 15, 2025
Python

IMAbril / RENIS

Project in progress

data-validation data-wrangling data-modeling data-cleaning data-profiling portfolio-project data-governance technical-documentation database-normalization etl-pipelines data-quality-assessment relational-database-design data-qa data-remediation data-integrity-report dataset-optimization

Updated Feb 17, 2025
Jupyter Notebook

extralo / loom

Weaving together different threads (services like image/audio converse, ETL services, etc.) to enable the World Wide Flow

etl-framework etl-pipelines flow-architectures

Updated Dec 26, 2023
JavaScript

speedbits / LimitlessETL

A Python- and Spark-based ETL framework. It operates within speed limits (that is, its framework and standards) but offers boundless possibilities.

etl etl-framework etl-pipeline etl-job etl-pipelines

Updated Apr 1, 2024
Python
