This repository implements a Reinforcement Learning (RL) agent that solves the optimal execution problem in algorithmic trading using Deep Q-Learning (DQN). Following the paper it replicates, it rigorously explores value-based, deep Q-learning reinforcement learning for optimal trade execution: finding the optimal "path" for executing a stock order so as to minimize the price impact on the market and maximize the resulting revenue from the execution.
The results show that in a simple environment, where the optimal execution strategy is known to be TWAP, the RL agent closely and accurately approximates that solution. This serves as a proof of concept that, in a richer environment, the agent has the potential to learn further patterns relevant to the problem, such as stock price prediction or external predatory trading, which could allow it to outperform TWAP in practical scenarios. The project replicates and builds upon the paper:
Optimal Execution with Reinforcement Learning
Yadh Hafsi, Edoardo Vittori (2024) — arXiv:2411.06389
Traders executing large orders must minimize implementation shortfall, i.e., the total cost of executing trades across time, while accounting for market impact.
The RL agent learns an execution strategy that is evaluated against the following fixed baselines:
- TWAP (Time-Weighted Average Price)
- Aggressive buying
- Passive execution
- Random actions
git clone https://github.com/yourusername/optimal-execution-rl.git
cd optimal-execution-rl
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Training stabilizes after ~100 episodes.
- Moving average flattens, suggesting convergence.
| Strategy   | Mean Reward | Std Dev |
|------------|-------------|---------|
| TWAP       | -2400.00    | ≈ 0.00  |
| Passive    | -2403.36    | 2.74    |
| Aggressive | -2407.20    | ≈ 0.00  |
| Random     | -2402.64    | 2.68    |
| RL Agent   | -2402.42    | 0.00    |
✅ The RL agent delivers the most consistent performance, closely matching TWAP and slightly outperforming the passive, aggressive, and random baselines.
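The table above can be reproduced with an evaluation loop of roughly the following shape. This is a minimal sketch; the repository's actual evaluation and plotting logic lives in `src/utils.py`, and the `evaluate` function name here is illustrative.

```python
import numpy as np


def evaluate(env, policy, episodes=50):
    """Roll out a policy for several episodes; report mean and std of total reward."""
    totals = []
    for _ in range(episodes):
        state, done, total = env.reset(), False, 0.0
        while not done:
            # `policy` is any callable mapping a state to a discrete action.
            state, reward, done, _ = env.step(policy(state))
            total += reward
        totals.append(total)
    return np.mean(totals), np.std(totals)
```

Running the same loop for TWAP, passive, aggressive, random, and the trained DQN policy yields mean/std figures of the kind reported above.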
- % Time left
- % Inventory left
- Limit order book imbalance (top 5 levels)
- Best bid/ask
- Discrete: {0, Qmin, 2×Qmin, 3×Qmin, 4×Qmin}
- Penalizes market impact (execution price - bid)
- Penalizes depth (larger orders incur cost)
- Final penalty for unexecuted inventory (see the environment sketch below)
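To make the state, action, and reward definitions above concrete, here is a minimal, self-contained sketch of how such a Gym environment could look. The real environment lives in `src/environment.py`; the class name `OptimalExecutionEnv`, the toy order-book dynamics, and the penalty coefficients below are illustrative assumptions, not the repository's exact code.

```python
import numpy as np
import gym
from gym import spaces


class OptimalExecutionEnv(gym.Env):
    """Toy sketch of the execution environment; dynamics and coefficients are illustrative."""

    def __init__(self, total_inventory=1000, horizon=60, q_min=10,
                 depth_coef=0.001, terminal_coef=0.1):
        super().__init__()
        self.total_inventory = total_inventory
        self.horizon = horizon
        self.q_min = q_min
        self.depth_coef = depth_coef
        self.terminal_coef = terminal_coef
        # Actions: execute 0, Qmin, 2*Qmin, 3*Qmin or 4*Qmin shares this step.
        self.action_space = spaces.Discrete(5)
        # State: % time left, % inventory left, LOB imbalance, best bid, best ask.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(5,), dtype=np.float32)

    def reset(self):
        self.t = 0
        self.inventory = self.total_inventory
        self.mid = 100.0
        return self._observe()

    def _observe(self):
        imbalance = np.random.uniform(-1.0, 1.0)  # placeholder for top-5-level book imbalance
        best_bid, best_ask = self.mid - 0.01, self.mid + 0.01
        return np.array([1.0 - self.t / self.horizon,
                         self.inventory / self.total_inventory,
                         imbalance, best_bid, best_ask], dtype=np.float32)

    def step(self, action):
        qty = min(int(action) * self.q_min, self.inventory)
        best_bid = self.mid - 0.01
        exec_price = self.mid + 0.01 + 0.0001 * qty        # crude impact: price worsens with size
        reward = -qty * (exec_price - best_bid)            # market-impact penalty
        reward -= self.depth_coef * qty ** 2               # depth penalty: larger orders cost more
        self.inventory -= qty
        self.t += 1
        self.mid += np.random.normal(0.0, 0.02)            # random-walk mid price
        done = self.t >= self.horizon
        if done and self.inventory > 0:
            reward -= self.terminal_coef * self.inventory  # penalty for unexecuted inventory
        return self._observe(), reward, done, {}
```

In the actual environment the book features come from limit order book data rather than the random placeholders used here.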
- Deep Q-Network (DQN)
- ε-greedy policy (decaying)
- Experience replay + target network
- Trained for 200 episodes (see the training-loop sketch below)
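The agent's moving parts (decaying ε-greedy exploration, experience replay, and a periodically synced target network) fit together roughly as in the sketch below. The full implementation is in `src/agents.py`; the network architecture and hyperparameters shown here are illustrative assumptions.

```python
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn


class QNetwork(nn.Module):
    """Small MLP mapping the 5-dimensional state to Q-values for the 5 actions."""

    def __init__(self, state_dim=5, n_actions=5):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, x):
        return self.net(x)


def train(env, episodes=200, gamma=0.99, eps=1.0, eps_min=0.05, eps_decay=0.995,
          batch_size=64, lr=1e-3, target_sync=100):
    q_net, target_net = QNetwork(), QNetwork()
    target_net.load_state_dict(q_net.state_dict())
    optimizer = torch.optim.Adam(q_net.parameters(), lr=lr)
    buffer, step = deque(maxlen=50_000), 0

    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Decaying epsilon-greedy action selection.
            if random.random() < eps:
                action = env.action_space.sample()
            else:
                with torch.no_grad():
                    action = int(q_net(torch.as_tensor(state)).argmax())
            next_state, reward, done, _ = env.step(action)
            buffer.append((state, action, reward, next_state, done))
            state = next_state
            step += 1

            if len(buffer) >= batch_size:
                # Sample a minibatch from the experience replay buffer.
                s, a, r, s2, d = map(np.array, zip(*random.sample(buffer, batch_size)))
                s = torch.as_tensor(s, dtype=torch.float32)
                s2 = torch.as_tensor(s2, dtype=torch.float32)
                a = torch.as_tensor(a, dtype=torch.int64)
                r = torch.as_tensor(r, dtype=torch.float32)
                d = torch.as_tensor(d, dtype=torch.float32)
                q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
                with torch.no_grad():
                    target = r + gamma * (1.0 - d) * target_net(s2).max(1).values
                loss = nn.functional.mse_loss(q, target)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            if step % target_sync == 0:
                # Periodically sync the target network with the online network.
                target_net.load_state_dict(q_net.state_dict())
        eps = max(eps_min, eps * eps_decay)
    return q_net
```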
| Strategy   | Description |
|------------|-------------|
| TWAP       | Fixed trade size at each time interval |
| Passive    | 60% chance to do nothing, else a random action |
| Aggressive | Always executes 2 × Qmin per timestep |
| Random     | Chooses a random action from the action space |
| RL Agent   | Learns the execution policy from the environment |
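Over the same discrete action space, the baselines above reduce to very simple policies. The sketch below is illustrative; the actual scripts live in `baseline/`, and the function names, horizon, order size, and Qmin values are assumptions.

```python
import random

N_ACTIONS = 5  # actions 0..4 correspond to trading 0, Qmin, 2*Qmin, 3*Qmin, 4*Qmin shares


def twap_action(state, total_shares=1000, horizon=60, q_min=10):
    """TWAP: the same fixed slice of the order at every time interval."""
    return min(N_ACTIONS - 1, round(total_shares / (horizon * q_min)))


def passive_action(state):
    """Passive: 60% chance to do nothing, otherwise a random action."""
    return 0 if random.random() < 0.6 else random.randrange(N_ACTIONS)


def aggressive_action(state):
    """Aggressive: always execute 2 x Qmin per timestep."""
    return 2


def random_action(state):
    """Random: uniformly random action from the action space."""
    return random.randrange(N_ACTIONS)
```

Each function, like the trained DQN policy, maps a state to one of the five discrete actions, so all strategies can be scored with the same evaluation loop.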
hft-optimal-execution-dqn
|____notebooks/
| |__optimal_execution_rl_implementation.ipynb ← Full implementation in Jupyter Notebook
| |__optimal_execution_rl_implementation.py
| |__aggressive.ipynb
| |__Passive.ipynb
| |__random.ipynb
| |__TWAP.ipynb
|
|____baseline/
| |__aggressive.py
| |__Passive.py
| |__random.py
| |__TWAP.py
|
|____docs/
| |__hft_interview_prep.md
| |__paper_summary.md
|
|____src/
| |__environment.py ← Custom Gym environment for optimal execution
| |__agents.py ← Deep Q-Network model, training loop, replay buffer
| |__utils.py ← Evaluation logic and plotting functions
|
|____images/
| |__reward_distribution.png ← Output charts and visualizations
| |__strategy_comparison.png
|
|____README.md ← Detailed project documentation
|____requirements.txt ← List of Python dependencies
|____.gitignore ← Files and folders to be excluded from Git
|____LICENSE ← MIT license file
If you use this code, please cite:
@misc{hafsi2024optimal,
  title={Optimal Execution with Reinforcement Learning},
  author={Yadh Hafsi and Edoardo Vittori},
  year={2024},
  eprint={2411.06389},
  archivePrefix={arXiv},
  primaryClass={q-fin.TR}
}
This project is open-sourced under the MIT License. See LICENSE for details.
Contributions to this project are wholeheartedly welcomed and encouraged. Whether it's proposing new features, improving the code structure, enhancing documentation, or recommending improvements to the methodology, your input is highly valued.
Please ensure that any contributions align with the goals of the project and maintain the clarity, reproducibility, and research integrity that underpin this work. All contributions, regardless of size, help strengthen the project and improve the robustness of the implementation.
If you wish to contribute, we ask for thoughtful collaboration, respectful discussion, and a commitment to shared learning. Thank you for helping make this project better for everyone.
For questions, open an issue or reach out via ✉️alqama043@gmail.com