BioForge is a pure-Rust toolkit for automated preparation of biological macromolecules. It reads experimental structures (PDB/mmCIF), reconciles them with high-quality residue templates, repairs missing atoms, assigns hydrogens and termini, builds topologies, and optionally solvates the system with water and ions—all without leaving the Rust type system.
- Template-driven accuracy – curated TOML templates for standard amino acids, nucleotides, and water guarantee reproducible coordinates, charges, and bonding.
- Rich structure model – lightweight `Atom`, `Residue`, `Chain`, and `Structure` types backed by `nalgebra` make geometric operations trivial.
- Format interoperability – buffered readers/writers for PDB, mmCIF, and MOL2 plus error types that surface precise parsing diagnostics.
- Preparation pipeline – cleaning, repairing, protonating, solvation, coordinate transforms, and topology reconstruction share a common `ops::Error` so workflows compose cleanly.
- Rust-first ergonomics – no FFI, no global mutable state beyond the lazily-loaded template store, and edition 2024 guarantees modern language features.
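
To illustrate why a shared error type helps stages compose, here is a minimal std-only sketch. `PipelineError`, `ToyStructure`, `clean`, `repair`, and `prepare` are hypothetical stand-ins invented for this example; they are not BioForge's actual `ops::Error` or structure types, whose variants and fields may differ.

```rust
use std::fmt;

/// Hypothetical shared error for every stage (not BioForge's `ops::Error`).
#[derive(Debug, PartialEq)]
pub enum PipelineError {
    MissingTemplate(String),
    EmptyStructure,
}

impl fmt::Display for PipelineError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PipelineError::MissingTemplate(name) => write!(f, "no template for residue {name}"),
            PipelineError::EmptyStructure => write!(f, "structure has no residues left"),
        }
    }
}

impl std::error::Error for PipelineError {}

/// Toy structure: only a list of residue names.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ToyStructure {
    pub residues: Vec<String>,
}

/// Toy cleaning stage: drop waters, fail if nothing remains.
pub fn clean(s: &mut ToyStructure) -> Result<(), PipelineError> {
    s.residues.retain(|r| r != "HOH");
    if s.residues.is_empty() {
        return Err(PipelineError::EmptyStructure);
    }
    Ok(())
}

/// Toy repair stage: every residue must have a known template.
pub fn repair(s: &mut ToyStructure) -> Result<(), PipelineError> {
    const KNOWN: [&str; 3] = ["ALA", "GLY", "SER"];
    for r in &s.residues {
        if !KNOWN.iter().any(|k| *k == r.as_str()) {
            return Err(PipelineError::MissingTemplate(r.clone()));
        }
    }
    Ok(())
}

/// Because both stages return the same error type, `?` chains them
/// without any per-stage error conversion.
pub fn prepare(s: &mut ToyStructure) -> Result<(), PipelineError> {
    clean(s)?;
    repair(s)?;
    Ok(())
}
```

With one error enum, a caller can match on a single type regardless of which stage failed, which is the composition property the feature list describes.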
- Load – `io::read_pdb_structure` or `io::read_mmcif_structure` parses coordinates with help from `IoContext` alias resolution.
- Clean – `ops::clean_structure` removes waters, ions, hetero residues, or arbitrary residue names via `CleanConfig`.
- Repair – `ops::repair_structure` realigns residues to their templates and rebuilds missing heavy atoms (including OXT on C-termini and OP3 on 5'-phosphorylated nucleic acids).
- Hydrogenate – `ops::add_hydrogens` infers protonation states (configurable pH and histidine strategy) and reconstructs hydrogens from template anchors.
- Solvate/Ionize – `ops::solvate_structure` creates a periodic box, packs water on a configurable lattice, and swaps molecules for ions to satisfy a target charge.
- Topology – `ops::TopologyBuilder` replays template bond definitions, peptide-link detection, nucleic backbone connectivity, and disulfide heuristics to emit a `Topology` object.
- Write – `io::write_pdb_structure` / `io::write_mmcif_structure` serialize the processed structure; the corresponding `write_*_topology` helpers emit CONECT or `struct_conn` records.
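
The Solvate/Ionize step can be pictured with a small std-only sketch: place candidate water sites on a cubic lattice, drop those that clash with the solute, then swap waters for monovalent ions until the target charge is reached. Everything here (`Solvent`, `pack_waters`, `place_ions`, the cubic lattice, monovalent ions, and swapping the first sites rather than random ones) is an assumption made for illustration and is not taken from BioForge's implementation.

```rust
/// Hypothetical solvent site kind (not a BioForge type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Solvent {
    Water,
    Cation,
    Anion,
}

/// Squared Euclidean distance between two points.
fn dist2(a: &[f64; 3], b: &[f64; 3]) -> f64 {
    (0..3).map(|d| (a[d] - b[d]).powi(2)).sum()
}

/// Place water sites at cell centres of a cubic lattice inside a cubic box,
/// skipping any site closer than `min_dist` to a solute atom.
pub fn pack_waters(
    box_len: f64,
    spacing: f64,
    solute: &[[f64; 3]],
    min_dist: f64,
) -> Vec<[f64; 3]> {
    let n = (box_len / spacing).floor() as usize;
    let mut sites = Vec::new();
    for i in 0..n {
        for j in 0..n {
            for k in 0..n {
                let p = [
                    (i as f64 + 0.5) * spacing,
                    (j as f64 + 0.5) * spacing,
                    (k as f64 + 0.5) * spacing,
                ];
                let clash = solute.iter().any(|a| dist2(a, &p) < min_dist * min_dist);
                if !clash {
                    sites.push(p);
                }
            }
        }
    }
    sites
}

/// Swap waters for monovalent ions so that solute charge plus ion charge
/// equals `target_charge`. Returns `None` if there are too few waters.
pub fn place_ions(n_waters: usize, solute_charge: i32, target_charge: i32) -> Option<Vec<Solvent>> {
    let delta = target_charge - solute_charge; // net charge the ions must supply
    let needed = delta.unsigned_abs() as usize;
    if needed > n_waters {
        return None;
    }
    let ion = if delta > 0 { Solvent::Cation } else { Solvent::Anion };
    let mut out = vec![Solvent::Water; n_waters];
    // Simplification: replace the first `needed` sites; a real tool would
    // likely choose sites randomly or by electrostatic criteria.
    for slot in out.iter_mut().take(needed) {
        *slot = ion;
    }
    Some(out)
}
```

For example, a 9 Å box with 3 Å spacing gives a 3×3×3 grid, and a single solute atom at the box centre with a 2 Å exclusion radius removes only the central site.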
Install the latest BioForge CLI binary from the releases page or via cargo:

```sh
cargo install bio-forge
```

Once the `bioforge` binary is installed, you can repair a structure in a single step:

```sh
bioforge repair -i input.pdb -o repaired.pdb
```

Explore the complete preparation pipeline in the user manual and browse the examples directory for runnable walkthroughs.
BioForge is also available as a library crate. Add it to your Cargo.toml dependencies:
```toml
[dependencies]
bio-forge = "0.3.0"
```

```rust
use std::{fs::File, io::{BufReader, BufWriter}};

use bio_forge::{
    io::{
        read_pdb_structure,
        write_pdb_structure,
        write_pdb_topology,
        IoContext,
    },
    ops::{
        add_hydrogens, clean_structure, repair_structure, solvate_structure,
        CleanConfig, HydroConfig, SolvateConfig, TopologyBuilder,
    },
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the input structure.
    let ctx = IoContext::new_default();
    let input = BufReader::new(File::open("input.pdb")?);
    let mut structure = read_pdb_structure(input, &ctx)?;

    // Clean, repair, protonate, and solvate in place.
    clean_structure(&mut structure, &CleanConfig::water_only())?;
    repair_structure(&mut structure)?;
    add_hydrogens(&mut structure, &HydroConfig::default())?;
    solvate_structure(&mut structure, &SolvateConfig::default())?;

    // Build the topology, then write the structure and topology files.
    let topology = TopologyBuilder::new().build(structure.clone())?;
    write_pdb_structure(BufWriter::new(File::create("prepared.pdb")?), &structure)?;
    write_pdb_topology(BufWriter::new(File::create("prepared-topology.pdb")?), &topology)?;
    Ok(())
}
```

Prefer mmCIF? Swap in `read_mmcif_structure` / `write_mmcif_structure`. Need to process ligands? Parse them via `io::read_mol2_template` and feed the resulting `Template` into `TopologyBuilder::add_hetero_template`.
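
As a rough mental model of what supplying a ligand template to a topology builder could involve, the std-only sketch below replays a template's named bonds onto the atoms of a matching residue. `ToyTemplate`, `ToyAtom`, and `replay_bonds` are hypothetical names for this illustration; BioForge's `Template` and `TopologyBuilder` internals are not shown in this README and may work differently. The sketch also assumes at most one copy of the ligand residue.

```rust
use std::collections::HashMap;

/// Hypothetical ligand template: a residue name plus bonded atom-name pairs.
pub struct ToyTemplate {
    pub name: String,
    pub bonds: Vec<(String, String)>,
}

/// Hypothetical atom record: owning residue name and atom name.
pub struct ToyAtom {
    pub residue: String,
    pub name: String,
}

/// Map each template bond onto atom indices of the matching residue.
/// Bonds whose atoms are absent from the structure are skipped.
pub fn replay_bonds(atoms: &[ToyAtom], template: &ToyTemplate) -> Vec<(usize, usize)> {
    // Index atoms of the templated residue by atom name.
    let index: HashMap<&str, usize> = atoms
        .iter()
        .enumerate()
        .filter(|(_, a)| a.residue == template.name)
        .map(|(i, a)| (a.name.as_str(), i))
        .collect();

    template
        .bonds
        .iter()
        .filter_map(|(a, b)| {
            let i = *index.get(a.as_str())?;
            let j = *index.get(b.as_str())?;
            Some((i, j))
        })
        .collect()
}
```

Looking atoms up by name keeps the template independent of atom ordering in the input file, at the cost of silently skipping bonds to atoms that are missing.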
- CLI User Manual – detailed explanation of command-line usage and options.
- API Documentation – comprehensive reference for public types and functions.
- Architecture Overview – detailed explanation of the internal design and algorithms used in BioForge.
This project is licensed under the MIT License - see the LICENSE file for details.