sra-toolkit
- Version: 3.0.0
- Category: bio
- Cluster: Loki
Description
The SRA Toolkit is a collection of command-line utilities and libraries for accessing the NCBI Sequence Read Archive (SRA). It lets users download sequencing data from the SRA, convert it to formats such as FASTQ and SAM/BAM, and process local SRA files.
Version 3.0.0 introduces improvements in:
- Compatibility with modern secure HTTPS endpoints
- Memory-efficient streaming of large datasets
- Accession resolution and configuration handling
- Support for the fasterq-dump and vdb-dump tools
Documentation
Common commands:
----------------
prefetch SRRXXXXXXX                   Download SRA data by accession
fasterq-dump SRRXXXXXXX               Convert SRA to FASTQ (faster than fastq-dump)
fastq-dump --split-files SRRXXXXXXX   Legacy FASTQ conversion tool
vdb-dump SRRXXXXXXX                   Dump database table contents to stdout
vdb-validate SRRXXXXXXX               Validate local SRA file integrity
Configuration:
--------------
vdb-config --interactive              Launch interactive configuration tool
Help:
-----
$ fasterq-dump --help
$ vdb-config --help
$ prefetch --help
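Before using these commands in a batch job, it can be useful to confirm that they resolve on PATH once the module is loaded. The following is a minimal sketch in POSIX shell; `missing_tools` is a hypothetical helper name chosen for illustration and is not part of the SRA Toolkit.

```shell
#!/bin/sh
# Sketch: report which of the given commands cannot be found on PATH.
# missing_tools is a hypothetical helper, not something shipped with the toolkit.
missing_tools() {
    status=0
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a command name resolves
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    # Non-zero return if at least one tool was not found
    return "$status"
}

# Example call (run it after loading the module):
# missing_tools prefetch fasterq-dump fastq-dump vdb-dump vdb-validate vdb-config
```

The function prints one line per unresolved command and returns non-zero if any are missing, so a job script can stop early with `missing_tools prefetch fasterq-dump || exit 1`.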
Examples/Usage
Load the module:
$ module load bio/SRA-Toolkit/3.0.0
Download an SRA accession:
$ prefetch SRR12345678
Convert to FASTQ:
$ fasterq-dump SRR12345678 -O ./fastq/
View contents of an SRA file:
$ vdb-dump SRR12345678 | less
Validate a file:
$ vdb-validate SRR12345678
Configure settings:
$ vdb-config --interactive
Unload the module:
$ module unload bio/SRA-Toolkit/3.0.0
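The download, validate, and convert steps above can be chained for several accessions in one script. The sketch below is one possible arrangement, not a site-provided workflow: `OUTDIR`, `THREADS`, and `DRY_RUN` are hypothetical variables introduced here, `run` and `process_accession` are hypothetical helper names, and the script assumes that `vdb-validate` exits non-zero when validation fails. It uses `--threads`, a fasterq-dump option not shown in the examples above, to set the thread count.

```shell
#!/bin/sh
# Sketch: fetch, validate, and convert each accession given on the command line.
# Assumes the module is already loaded. Set DRY_RUN=1 to print the commands
# instead of running them (DRY_RUN, OUTDIR, THREADS are illustrative names).
set -eu

OUTDIR="${OUTDIR:-./fastq}"   # where FASTQ files are written
THREADS="${THREADS:-4}"       # thread count passed to fasterq-dump

# run: execute a command, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# process_accession: the steps from the examples above, for one accession
process_accession() {
    acc="$1"
    run mkdir -p "$OUTDIR"
    run prefetch "$acc"        # download the accession
    run vdb-validate "$acc"    # with set -e, a failed validation stops the script
    run fasterq-dump "$acc" -O "$OUTDIR" --threads "$THREADS"
}

for acc in "$@"; do
    process_accession "$acc"
done
```

Saved under a name such as `sra_fetch.sh` (hypothetical), it could be previewed with `DRY_RUN=1 sh sra_fetch.sh SRR12345678` before running it for real.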
Installation
Source code is obtained from NCBI SRA Toolkit