Kraken¶
Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs.
Kraken aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm.
Policy¶
Kraken is available under the MIT license.
Citations
If you use Kraken in your research, please cite the Kraken paper.
If you use KrakenUniq in your research, please cite the KrakenUniq publications:
- KrakenUniq: confident and fast metagenomics classification using unique k-mer counts. Breitwieser FP, Baker DN, Salzberg SL. Genome Biology, Dec 2018. https://doi.org/10.1186/s13059-018-1568-0
- Metagenomic classification with KrakenUniq on low-memory computers. Pockrandt C, Zimin AV, Salzberg SL. The Journal of Open Source Software, Dec 2022. https://doi.org/10.21105/joss.04908
Overview¶
In its fastest mode of operation, for a simulated metagenome of 100 bp reads, Kraken processed over 4 million reads per minute on a single core, over 900 times faster than Megablast and over 11 times faster than the abundance estimation program MetaPhlAn. Kraken’s accuracy is comparable to that of Megablast, with slightly lower sensitivity and very high precision.
Kraken is written in C++ and Perl, and is designed for use with the Linux operating system. It has also been successfully compiled and run under macOS.
As of September 29, 2022, Kraken 1 is no longer supported.
For guidance on which software version to choose, see Choosing a Metagenomics Classification Tool.
- Kraken 1 remains available via the Kraken 1 GitHub page.
- KrakenUniq is an improved version of Kraken 1 with the same ultra-low false-positive (FP) rate. It adds features described in a newer paper and on the KrakenUniq GitHub page.
- Kraken 2 is a newer implementation of Kraken that uses much less memory but has a higher FP rate than Kraken 1 and KrakenUniq. Kraken 2 now also includes the k-mer counting features of KrakenUniq (see the Kraken 2 webpage for additional details).
Kraken at HPC2N¶
At HPC2N, Kraken 1, Kraken 2, and KrakenUniq are available as modules on Kebnekaise. To see the available versions, log in to Kebnekaise and run ml spider Kraken.
Usage at HPC2N¶
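To use Kraken, load the Kraken, Kraken2, or KrakenUniq module to add it to your environment. To see the available versions, run ml spider followed by the module name, for example ml spider Kraken. To see how to load a specific version, including its prerequisites, run ml spider with the module name and version, for example ml spider Kraken2/<version>, where <version> is one of the versions listed by the first command.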
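As a rough sketch of these lookups, the snippet below prints the ml spider command for each of the three modules, first without and then with a version. The helper name spider_cmd is hypothetical, and <version> is a placeholder rather than a real version string; the module names are taken from the text above. The snippet only prints the commands, so it can be tried on any machine, including one without the module system.

```shell
# Print the Lmod lookup command for a module name and an optional version.
# spider_cmd is an illustrative helper, not part of Lmod or Kraken.
spider_cmd() {
    if [ -n "${2:-}" ]; then
        printf 'ml spider %s/%s\n' "$1" "$2"
    else
        printf 'ml spider %s\n' "$1"
    fi
}

for name in Kraken Kraken2 KrakenUniq; do
    spider_cmd "$name"               # command that lists the available versions
    spider_cmd "$name" '<version>'   # command that shows how to load one version
done
```

On Kebnekaise, run the printed commands at the prompt, replacing <version> with one of the versions reported by the first lookup.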
For more information on running Kraken, see the Kraken Manual.
For more information about KrakenUniq, run krakenuniq --help after loading the KrakenUniq module.
Additional info¶
- The Kraken homepage
- The Kraken Manual
- The readme for KrakenUniq (scroll down a bit).