Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play

License: MIT Python 3.8+ PyTorch

A PyTorch implementation of Monte Carlo Counterfactual Regret Minimization (MCCFR) with deep neural networks for learning optimal strategies in imperfect information games.

🎯 Overview

This library implements state-of-the-art algorithms for solving imperfect information games using deep learning. It combines the theoretical foundations of Counterfactual Regret Minimization (CFR) with modern deep learning techniques to learn near-optimal strategies in complex game environments.

👉 This is an implementation of the paper: Robust Deep Monte Carlo Counterfactual Regret Minimization: Addressing Theoretical Risks in Neural Fictitious Self-Play.

Key Features

  • Multiple Neural Network Architectures: From simple feedforward networks to advanced transformer-based architectures
  • Robust Training: Includes importance weight clipping, target networks, and variance reduction techniques
  • Comprehensive Evaluation: Built-in exploitability calculation and strategy analysis tools
  • Modular Design: Easy to extend to new games and network architectures
  • Research-Ready: Includes experimental frameworks and diagnostic tools
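The importance weight clipping listed above is the standard variance-reduction trick for sampled regret updates: each update is scaled by the ratio of the target policy to the sampling policy, and that ratio is capped so a rarely sampled action cannot blow up the estimate. The helper below is an illustrative sketch of the idea, not the library's actual API:

```python
def clipped_importance_weight(target_prob: float,
                              sample_prob: float,
                              clip: float = 10.0) -> float:
    """Ratio of target to sampling probability, capped at `clip`.

    Capping bounds the variance of sampled regret estimates at the
    cost of a small bias. Illustrative sketch only; the library's
    real implementation may differ.
    """
    ratio = target_prob / max(sample_prob, 1e-12)
    return min(ratio, clip)
```

With a clip of 10.0, an action whose raw ratio is 50 contributes weight 10 instead of 50.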

Supported Games

  • Kuhn Poker: A simplified poker variant, ideal for testing and research
  • Leduc Poker: A poker variant more complex than Kuhn Poker, suited to testing scalability and robustness
  • Extensible Framework: Easy to add new imperfect information games
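Extending the framework to a new game amounts to implementing a small game interface. The sketch below is hypothetical (the names and methods are illustrative, not the library's actual base class), but it shows the operations any imperfect information game must expose to a CFR-style solver:

```python
from abc import ABC, abstractmethod

class ImperfectInfoGame(ABC):
    """Hypothetical minimal game interface (illustrative only;
    the library's actual base class may differ)."""

    @abstractmethod
    def initial_state(self):
        """Return the root state of the game tree."""

    @abstractmethod
    def legal_actions(self, state):
        """Return the actions available at `state`."""

    @abstractmethod
    def is_terminal(self, state) -> bool:
        """True if `state` is a leaf of the game tree."""

    @abstractmethod
    def utility(self, state, player: int) -> float:
        """Payoff for `player` at a terminal state."""

    @abstractmethod
    def info_set_key(self, state) -> str:
        """Key identifying the acting player's information set."""
```

An information set key must hide the opponent's private information (e.g. in Kuhn Poker, your own card plus the betting history, but not the opponent's card).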

🚀 Quick Start

Installation

From Source (Recommended)

git clone https://github.com/your-username/robust-deep-mccfr.git
cd robust-deep-mccfr
pip install -e .

Basic Usage

from deep_mccfr import DeepMCCFR, KuhnGame

# Initialize the algorithm
mccfr = DeepMCCFR(
    network_type='ultra_deep',
    learning_rate=0.00003,
    batch_size=384
)

# Train on Kuhn Poker
results = mccfr.train(num_iterations=10000)

print(f"Final exploitability: {results['final_exploitability']:.6f}")
print(f"Training time: {results['training_time']:.1f}s")

Advanced Usage with Robust Features

from deep_mccfr import RobustDeepMCCFR, RobustMCCFRConfig

# Configure robust training
config = RobustMCCFRConfig(
    network_type='mega_transformer',
    exploration_epsilon=0.1,
    importance_weight_clip=10.0,
    use_target_networks=True,
    prioritized_replay=True,
    num_iterations=20000
)

# Initialize robust MCCFR
robust_mccfr = RobustDeepMCCFR(config)

# Train with advanced features
results = robust_mccfr.train(config.num_iterations)
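The `exploration_epsilon` setting above corresponds to the standard trick of mixing the learned strategy with a uniform distribution when choosing which actions to sample, so no action's sampling probability collapses to zero (which would make importance weights explode). A framework-agnostic sketch of that mixing, with illustrative names:

```python
from typing import List

def exploratory_sampling_policy(strategy: List[float],
                                epsilon: float = 0.1) -> List[float]:
    """Mix a strategy with the uniform distribution:
        sigma'(a) = (1 - eps) * sigma(a) + eps / |A|
    Every action is then sampled with probability >= eps / |A|.
    Illustrative sketch of the exploration_epsilon idea only.
    """
    n = len(strategy)
    return [(1.0 - epsilon) * p + epsilon / n for p in strategy]
```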

🏗️ Architecture

Neural Network Architectures

The library includes several neural network architectures optimized for strategy learning:

  1. BaseNN: Simple feedforward network with dropout
  2. DeepResidualNN: Deep residual network with skip connections
  3. FeatureAttentionNN: Self-attention mechanism for feature interactions
  4. HybridAdvancedNN: Combines attention and residual processing
  5. MegaTransformerNN: Large-scale transformer architecture
  6. UltraDeepNN: Ultra-deep network with bottleneck residual blocks

Key Components

  • Feature Extraction: Sophisticated state representation for game states
  • Experience Replay: Prioritized sampling for stable learning
  • Risk Mitigation: Multiple techniques to ensure robust training
  • Diagnostic Tools: Comprehensive monitoring and analysis
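The prioritized sampling mentioned above can be sketched with the standard library alone: stored transitions are drawn with probability proportional to a priority (typically the magnitude of their last training error). This is an illustrative sketch, not the library's actual buffer:

```python
import random

class PrioritizedReplayBuffer:
    """Minimal prioritized replay sketch (illustrative only):
    items are sampled proportionally to their stored priority."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items, self.priorities = [], []

    def add(self, item, priority: float = 1.0):
        if len(self.items) >= self.capacity:  # evict the oldest entry
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(item)
        self.priorities.append(priority)

    def sample(self, k: int):
        # Draw k items with probability proportional to priority.
        return random.choices(self.items, weights=self.priorities, k=k)
```

Real implementations usually use a sum-tree for O(log n) updates and apply importance-sampling corrections to the loss; the sketch keeps only the core sampling rule.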

📊 Experimental Results

The library includes extensive experimental frameworks for comparing different approaches:

from deep_mccfr.experiments import ExperimentRunner, get_ablation_configs

# Run systematic ablation study
runner = ExperimentRunner()
configs = get_ablation_configs()

for config in configs:
    results = runner.run_experiment(config)
    
# Analyze results
runner.analyze_results()

Citation

If you use this library in your research, please cite:

@software{eljaafari2024dlmccfr,
  author = {El Jaafari, Zakaria},
  title = {Deep Learning Monte Carlo Counterfactual Regret Minimization},
  url = {https://github.com/nier2kirito/robust-deep-mccfr},
  version = {1.0.0},
  year = {2024}
}

🛠️ Development

Setting up Development Environment

# Clone the repository
git clone https://github.com/your-username/robust-deep-mccfr.git
cd robust-deep-mccfr

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format code
black src/

# Type checking
mypy src/

Project Structure

robust-deep-mccfr/
├── src/deep_mccfr/        # Main package
│   ├── games/             # Game implementations
│   ├── networks.py        # Neural network architectures
│   ├── mccfr.py          # Core MCCFR algorithms
│   ├── features.py       # Feature extraction
│   ├── utils.py          # Utility functions
│   └── __init__.py       # Package initialization
├── examples/             # Example scripts
├── tests/               # Unit tests
├── docs/                # Documentation
├── requirements.txt     # Dependencies
├── setup.py            # Package setup
└── README.md           # This file

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass
  5. Submit a pull request

🔧 Configuration

Network Types

# Available network architectures
NETWORK_TYPES = [
    'simple',           # Basic feedforward
    'deep_residual',    # Deep residual network
    'feature_attention', # Attention-based
    'hybrid_advanced',  # Hybrid architecture
    'mega_transformer', # Large transformer
    'ultra_deep'        # Ultra-deep network
]

Training Parameters

# Common training configurations
CONFIGS = {
    'fast': {
        'batch_size': 128,
        'learning_rate': 0.001,
        'train_every': 50
    },
    'stable': {
        'batch_size': 384,
        'learning_rate': 0.00003,
        'train_every': 25
    },
    'research': {
        'batch_size': 512,
        'learning_rate': 0.00001,
        'train_every': 10
    }
}

🐛 Troubleshooting

Common Issues

  1. CUDA Out of Memory: Reduce batch size or use a smaller network
  2. Slow Training: Enable GPU acceleration or use simpler architectures
  3. Numerical Instability: Adjust learning rate or enable gradient clipping
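Gradient clipping (suggested above for numerical instability) rescales the whole gradient vector when its global L2 norm exceeds a threshold; in PyTorch this is `torch.nn.utils.clip_grad_norm_`. A framework-agnostic sketch of the same operation:

```python
import math

def clip_by_global_norm(grads, max_norm: float):
    """Rescale a flat list of gradient values so their L2 norm is
    at most `max_norm` (the operation behind PyTorch's
    torch.nn.utils.clip_grad_norm_). Illustrative sketch."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]
```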

Performance Optimization

  • Use GPU acceleration for large networks
  • Adjust batch size based on available memory
  • Use mixed precision training for faster computation
  • Enable experience replay for sample efficiency

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Original MCCFR algorithm by Lanctot et al.
  • Deep CFR extensions by Brown et al.
  • PyTorch team for the excellent deep learning framework
  • Game theory research community

⭐ If you find this project helpful, please consider giving it a star on GitHub!
