Financial Data Engine

This project shows how to deploy a distributed web scraper for financial data, store the results in a relational database, and monitor the whole system.

Key Features

Distributed Systems: Use RabbitMQ and Celery to distribute scraping tasks across scalable workers.
Docker Deployment: Use Docker for streamlined setup and deployment, monitored with Portainer.
Database: Store and manage data in MySQL.
Monitoring: Use Prometheus and Grafana to monitor the system.
Dashboard: Build Grafana dashboards for data status monitoring and anomaly detection.
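To make the split between the message broker and the workers concrete, here is a minimal sketch of the producer/worker pattern. It uses Python's standard-library `queue` and `threading` as a stand-in for RabbitMQ and Celery, so it runs without any services. The `fetch` function and the source names are hypothetical and are not taken from this repo.

```python
import queue
import threading


def fetch(source: str) -> str:
    """Hypothetical stand-in for a scraping function; does no network access."""
    return f"rows from {source}"


def worker(tasks: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Pull source names off the queue until a None sentinel arrives."""
    while True:
        source = tasks.get()
        if source is None:
            break
        data = fetch(source)
        with lock:
            results.append(data)


def run(sources: list, n_workers: int = 2) -> list:
    """Queue one task per source, process them with n_workers threads."""
    tasks: queue.Queue = queue.Queue()
    results: list = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, lock))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for source in sources:
        tasks.put(source)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker so every thread exits
    for t in threads:
        t.join()
    return sorted(results)  # sorted because completion order is not fixed


if __name__ == "__main__":
    print(run(["twse", "taiwan_futures"]))
```

In the real system the in-memory queue is replaced by RabbitMQ and the threads by Celery worker processes, which is what lets workers run on separate machines.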

Quickstart

Follow these steps to set up and run the distributed web scraper:

1. Initial set-up

Clone the repo:

git clone https://github.com/whchien/financial-data-engine.git

Install the necessary dependencies:

make install-package

Initialize Docker Swarm:

make init-swarm

Create the Docker network for service communication:

make create-network
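The commands in this step rely on `git`, `make`, and `docker` being available. The following is a small optional sketch, not part of this repo, that checks whether those tools are on `PATH` before you start; the function name `missing_tools` is hypothetical.

```python
import shutil


def missing_tools(tools=("git", "make", "docker")) -> list:
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```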

2. Start Essential Services

Deploy RabbitMQ to handle message queuing:

make deploy-rabbitmq

Deploy the MySQL service for data storage:

make deploy-mysql

Set up the MySQL volume for data persistence:

make create-mysql-volume
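Workers can only connect once RabbitMQ and MySQL accept connections, so it can help to wait for both before moving on. The sketch below polls a TCP port until it opens. It assumes the services are published on `localhost` at their default ports (5672 for RabbitMQ's AMQP listener, 3306 for MySQL); check `rabbitmq.yml` and `mysql.yml` for the ports this project actually exposes. The helper `wait_for_port` is hypothetical and not part of the repo.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Assumed defaults; adjust to match the ports published by your stack.
    for name, port in [("RabbitMQ", 5672), ("MySQL", 3306)]:
        ok = wait_for_port("localhost", port, timeout=60)
        print(f"{name} ({port}):", "up" if ok else "not reachable")
```

An open port only shows that the service is listening; it does not confirm that credentials or schemas are set up.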

3. Start Celery Workers

For example, deploy the Celery worker that handles TWSE (Taiwan Stock Exchange) tasks:

make run-worker-twse

4. Fetch Financial Data

Send a task to fetch Taiwan futures daily data:

make send-taiwan-futures-daily-task
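Daily data is usually fetched one date at a time, so a backfill amounts to sending one task per trading day. As an illustration only, the sketch below builds one payload per weekday in a date range. The payload fields (`dataset`, `date`) and the dataset name are hypothetical, not the message format this repo uses, and exchange holidays are not filtered out.

```python
from datetime import date, timedelta


def daily_tasks(start: date, end: date,
                dataset: str = "taiwan_futures_daily") -> list:
    """Build one task payload per weekday from start to end, inclusive."""
    tasks = []
    day = start
    while day <= end:
        if day.weekday() < 5:  # Monday=0 ... Friday=4; skip weekends
            tasks.append({"dataset": dataset, "date": day.isoformat()})
        day += timedelta(days=1)
    return tasks


if __name__ == "__main__":
    # 2024-01-05 is a Friday and 2024-01-08 a Monday, so two payloads print.
    for task in daily_tasks(date(2024, 1, 5), date(2024, 1, 8)):
        print(task)
```

Each payload would then be handed to the broker as a separate task, so a failed date can be retried without repeating the rest.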

Following these steps sets up a distributed scraping system that collects financial data using RabbitMQ for task queuing, Celery for task execution, and MySQL for storage.

Credits

This project is inspired by this repo.

