Fast SRA downloader and FASTQ converter, written in pure Rust.
- Fast -- 4-11x faster than `fasterq-dump` on typical SRA files
- One command -- download, convert to FASTQ, and compress
- Batch input -- accessions, BioProjects (PRJNA), studies (SRP), or a file via `--accession-list`
- gzip or zstd output -- parallel compression, or plain FASTQ
- FASTA output -- `--fasta` drops quality scores
- SRA and SRA-lite -- full or simplified quality scores
- Split modes -- split-3, split-files, split-spot, interleaved
- Resumable downloads -- picks up where it left off
- Stdout streaming -- `-Z` pipes FASTQ straight into downstream tools
- Integrity checks -- MD5 verification on download and decode
- Platform support -- Illumina, BGISEQ/DNBSEQ, Element, Ultima, PacBio, Nanopore (legacy 454 and Ion Torrent are not supported)
- Single static binary -- no Python, no C dependencies
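
As a sketch of the batch-input workflow, the snippet below writes an accession list and shows the call that would consume it. It assumes the list is plain text with one accession per line and no header, which is the layout of the `SRR_Acc_List.txt` files exported by the NCBI Run Selector; this README does not spell out the format sracha expects, so treat that as an assumption. The three accessions are the ones used in the benchmarks in this README.

```shell
# Assumption: one accession per line, no header row.
printf '%s\n' SRR28588231 SRR2584863 ERR1018173 > SRR_Acc_List.txt

# Sanity-check the number of entries before starting a long download.
wc -l < SRR_Acc_List.txt

# Hand the file to sracha (commented out so the sketch runs without sracha installed):
# sracha get --accession-list SRR_Acc_List.txt
```
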
# Download, convert, and compress
sracha get SRR28588231
# Download all runs from a BioProject
sracha get PRJNA675068
# Batch download from an accession list
sracha get --accession-list SRR_Acc_List.txt
# Just download
sracha fetch SRR28588231
# Convert a local .sra file
sracha fastq SRR28588231.sra
# Show accession info
sracha info SRR28588231
# Validate a downloaded file
sracha validate SRR28588231.sra

Uncompressed output, measured with hyperfine.
| File | Size | sracha | fasterq-dump | fastq-dump | Speedup vs fasterq-dump |
|---|---|---|---|---|---|
| SRR28588231 | 23 MiB | 0.17 s | 1.86 s | 2.09 s | 10.9x |
| SRR2584863 | 288 MiB | 1.51 s | 5.80 s | 13.30 s | 3.8x |
| ERR1018173 | 1.94 GiB | 9.40 s | 34.35 s | -- | 3.7x |
sracha produces gzipped FASTQ by default (level 1, ~1.4× the
uncompressed time on small files thanks to parallel block compression),
so the integrated pipeline (sracha get) writes ready-to-use .fastq.gz
without a separate gzip step.
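
The "Speedup vs fasterq-dump" column in the summary table is the fasterq-dump time divided by the sracha time. A minimal way to recompute it from the rounded timings in that table is sketched below; `speedup` is a hypothetical helper written for this example and is not part of sracha.

```shell
# speedup BASELINE_SECONDS SRACHA_SECONDS -> ratio rounded to one decimal place.
# Hypothetical helper for illustration only.
speedup() {
    awk -v base="$1" -v ours="$2" 'BEGIN { printf "%.1f\n", base / ours }'
}

speedup 1.86 0.17    # SRR28588231
speedup 5.80 1.51    # SRR2584863
speedup 34.35 9.40   # ERR1018173
```

With the table's values this prints 10.9, 3.8, and 3.7, matching the speedup column.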
Full hyperfine output
SRR28588231 (23 MiB, 66K spots, Illumina paired)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `sracha` | 170.9 ± 1.8 | 168.2 | 175.4 | 1.00 |
| `fasterq-dump` | 1856.4 ± 14.2 | 1838.3 | 1871.6 | 10.86 ± 0.14 |
| `fastq-dump` | 2090.5 ± 33.3 | 2052.5 | 2125.0 | 12.23 ± 0.23 |
SRR2584863 (288 MiB, Illumina paired)
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
| `sracha` | 1.512 ± 0.018 | 1.496 | 1.532 | 1.00 |
| `fasterq-dump` | 5.799 ± 0.130 | 5.667 | 5.927 | 3.83 ± 0.10 |
| `fastq-dump` | 13.297 ± 0.157 | 13.192 | 13.478 | 8.79 ± 0.15 |
ERR1018173 (1.94 GiB, 15.6M spots, Illumina paired, single run)
| Command | Time [s] |
|---|---|
| `sracha` | 9.40 |
| `fasterq-dump` | 34.35 |
sracha gzip overhead (SRR28588231, default --gzip-level 1)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `sracha` (no compression) | 172.1 ± 5.6 | 165.1 | 185.6 | 1.00 |
| `sracha` (gzip) | 239.5 ± 5.9 | 230.9 | 249.4 | 1.39 ± 0.06 |
Benchmarks run with sracha v0.3.5, sra-tools v3.4.1, on Linux
(8 CPUs). Install the reference toolkit with pixi run install-sratools
and reproduce with validation/benchmark.sh.
Install via Bioconda:
pixi add --channel bioconda sracha

Or download pre-built binaries from the releases page, or install from source:

cargo install --git https://github.com/rnabioco/sracha-rs sracha

Full CLI reference and usage guide: https://rnabioco.github.io/sracha-rs/
sracha builds on the Sequence Read Archive, maintained by the National Center for Biotechnology Information at the National Library of Medicine. The SRA and its toolchain are public-domain software developed by U.S. government employees — our tax dollars at work. Special thanks to Kenneth Durbrow (@durbrow) and the SRA Toolkit team for building and maintaining the infrastructure that makes projects like this possible.
This project wouldn't exist without NCBI's open infrastructure: the VDB/KAR format, the SDL locate API, EUtils, and public S3 hosting of sequencing data. sracha aims to make it easier for the community to build on that foundation.
MIT
