AISmithLab/CoBRA


CoBRA Logo

CHI 2026 Best Paper Award | arXiv | License | Under Active Development

Toward Precise and Consistent Agent Behaviors across Models Anchored by Validated Social Science Knowledge

๐ŸŒ Project Page: cobra.clawder.ai ย |ย  ๐Ÿ“„ Paper: arXiv 2509.13588

If you find CoBRA useful, please star โญ this repo to help others discover it!

English ็ฎ€ไฝ“ไธญๆ–‡

Demo_Video.mp4

💡 What is Cognitive Bias?

Cognitive biases are systematic deviations from rational judgment in human cognition and decision-making. One example is the Framing Effect: "90% survival rate" and "10% mortality rate" are logically identical, yet people make different choices depending on how the information is framed.


Reproducibility and controllability are fundamental to scientific research. Yet implicit natural language descriptions, the dominant approach for specifying social agent behaviors in nearly all LLM-based social simulations, often fail to yield consistent behavior across models or to capture the nuances of the descriptions.

CoBRA (Cognitive Bias Regulator for Social Agents) is a novel toolkit that lets researchers explicitly specify desired nuances in LLM-based agents and obtain consistent behavior across models.

Through CoBRA, we show how to operationalize validated social science knowledge as reusable "gym" environments for AI, an approach that generalizes to richer social and affective simulations.

CoBRA Overview
The problem and our solution: from inconsistent agent behaviors under implicit specifications to explicit, quantitative control.


At the heart of CoBRA is a novel closed-loop system with two core components:

  • Cognitive Bias Index: measures the cognitive bias of a social agent by quantifying its reactions in validated classic social science experiments
  • Behavioral Regulation Engine: aligns the agent's behavior to exhibit controlled cognitive bias, via three control methods:
    • Prompt Engineering (input space control)
    • Representation Engineering (activation space control)
    • Fine-tuning (parameter space control)

CoBRA Workflow
Example: A researcher specifies a target bias level → CoBRA measures it via classic experiments → iteratively adjusts the agent until it reliably exhibits the desired bias.
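The measure-then-adjust loop described above can be illustrated with a minimal sketch. This is not CoBRA's implementation: `regulate`, `toy_agent`, the proportional update rule, and the `gain` and `tol` parameters are all hypothetical names and choices made for illustration. The only detail taken from this README is the 0-4 control range.

```python
from typing import Callable, Tuple


def regulate(
    measure: Callable[[float], float],
    target: float,
    tol: float = 0.05,
    gain: float = 0.5,
    max_iters: int = 50,
) -> Tuple[float, float]:
    """Adjust a control strength until the measured bias is near the target.

    `measure` stands in for a bias-index measurement: it takes the current
    control strength and returns the observed bias level (hypothetical).
    """
    if not 0.0 <= target <= 4.0:
        raise ValueError("target must lie on the 0-4 scale")

    strength = 0.0
    observed = measure(strength)
    for _ in range(max_iters):
        error = target - observed
        if abs(error) <= tol:
            break  # close enough to the requested bias level
        strength += gain * error  # simple proportional correction (assumption)
        observed = measure(strength)
    return strength, observed


def toy_agent(strength: float) -> float:
    """Toy stand-in for an agent: bias grows linearly with control strength."""
    return min(4.0, max(0.0, 1.0 + 0.8 * strength))


strength, observed = regulate(toy_agent, target=3.0)
print(f"control strength={strength:.2f}, observed bias={observed:.2f}")
```

In a real setup, `measure` would run the agent through the experiments and the update step would go through one of the three control methods. The toy agent only shows that the loop converges when the response is monotonic.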

Quick Start (3 Steps)

# 1. Install dependencies
pip install -r requirements.txt

# 2. Navigate to the unified bias control module
cd examples/unified_bias

# 3. Run a bias experiment
python pipelines.py --bias authority --method repe-linear --model Mistral-7B

That's it. The system will measure and control the agent's Authority Effect bias.
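To script several runs, the command from step 3 can be assembled programmatically. The sketch below uses only the three flags and the values shown above; `build_command` and `run` are hypothetical helper names, and nothing here assumes other flag values are accepted. It prints the command rather than executing it.

```python
import shlex
import subprocess
from typing import List


def build_command(bias: str, method: str, model: str) -> List[str]:
    """Build the argument list for the Quick Start command (step 3)."""
    return [
        "python", "pipelines.py",
        "--bias", bias,
        "--method", method,
        "--model", model,
    ]


def run(cmd: List[str]) -> None:
    """Execute the command from examples/unified_bias (not called here)."""
    subprocess.run(cmd, cwd="examples/unified_bias", check=True)


cmd = build_command("authority", "repe-linear", "Mistral-7B")
print(shlex.join(cmd))  # dry run: show the command instead of running it
```

Calling `run(cmd)` from the repository root would launch the experiment, assuming the dependencies from step 1 are installed.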

Repository Structure

CoBRA/
├── control/                    # Core bias control engine
├── examples/
│   ├── unified_bias/           # Main entry point (START HERE)
│   │   ├── pipelines.py        # Unified experiment runner
│   │   ├── run_pipelines.py    # CLI interface
│   │   ├── ablation/           # Ablation studies
│   │   └── README.md           # Full usage guide
│   ├── authority/              # Authority Effect utils
│   ├── bandwagon/              # Bandwagon Effect utils
│   ├── confirmation/           # Confirmation Bias utils
│   └── framing/                # Framing Effect utils
├── generator/                  # Data generation utilities
├── data_generated/             # Generated experimental data
├── webdemo/                    # Web demonstration interface
└── requirements.txt            # Python dependencies

Key Components

| Component | Description | Documentation |
|---|---|---|
| Cognitive Bias Index | Measures bias strength via classic experiments | data/data_README.md |
| Behavioral Regulation Engine | Three control methods (Prompt/RepE/Finetune) | control/control_README.md |
| Unified Pipeline | Run full experiments with one command | examples/unified_bias/README.md |
| Ablation Studies | Test model/persona/temperature sensitivity | examples/unified_bias/ablation/README.md |
| Data Generator | Create custom bias scenarios and responses | generator/README.md |

Supported Biases & Experiments

| Bias Type | Paradigms | Data Directory | Control Range |
|---|---|---|---|
| Authority Effect | Milgram Obedience, Stanford Prison | data/authority/ | 0-4 scale |
| Bandwagon Effect | Asch's Line, Hotel Towel | data/bandwagon/ | 0-4 scale |
| Confirmation Bias | Wason Selection, Biased Information | data/confirmation/ | 0-4 scale |
| Framing Effect | Asian Disease, Investment/Insurance | data/framing/ | 0-4 scale |
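For scripting, the table above can be mirrored as a small lookup structure. This is a convenience sketch, not part of the CoBRA codebase: `BiasSpec`, `SUPPORTED`, `check_target`, and the short dictionary keys are hypothetical, while the names, paradigms, directories, and 0-4 range are copied from the table.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class BiasSpec:
    name: str
    paradigms: Tuple[str, ...]
    data_dir: str
    scale: Tuple[int, int] = (0, 4)  # control range from the table


# Dictionary keys are illustrative short names (assumption).
SUPPORTED: Dict[str, BiasSpec] = {
    "authority": BiasSpec(
        "Authority Effect", ("Milgram Obedience", "Stanford Prison"), "data/authority/"
    ),
    "bandwagon": BiasSpec(
        "Bandwagon Effect", ("Asch's Line", "Hotel Towel"), "data/bandwagon/"
    ),
    "confirmation": BiasSpec(
        "Confirmation Bias", ("Wason Selection", "Biased Information"), "data/confirmation/"
    ),
    "framing": BiasSpec(
        "Framing Effect", ("Asian Disease", "Investment/Insurance"), "data/framing/"
    ),
}


def check_target(bias_key: str, level: float) -> BiasSpec:
    """Return the spec for a bias, rejecting levels outside its control range."""
    spec = SUPPORTED[bias_key]
    low, high = spec.scale
    if not low <= level <= high:
        raise ValueError(f"{spec.name}: level {level} is outside {low}-{high}")
    return spec


spec = check_target("framing", 2)
print(spec.name, spec.paradigms, spec.data_dir)
```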

Citation

If you use CoBRA in your research, please cite our paper:

@article{liu2025cobra,
  title={CoBRA: Programming Cognitive Bias in Social Agents Using Classic Social Science Experiments},
  author={Liu, Xuan and Shang, Haoyang and Jin, Haojian},
  journal={arXiv preprint arXiv:2509.13588},
  year={2025}
}

Paper Link: https://arxiv.org/abs/2509.13588

License

MIT License - see LICENSE for details

Contact

For questions, please contact the corresponding author Xuan Liu at xul049@ucsd.edu, or file a GitHub Issue to report bugs and request features.


Need help? Check examples/unified_bias/README.md for detailed walkthroughs. The finetuning code is in the finetuning branch.
