DIAMOND
DIAMOND is software for protein sequence alignment and translated DNA searches.
Policy
DIAMOND is available under the GNU General Public License.
Citations
If you use DIAMOND in your research, please cite the recommended papers listed in the GitHub repository.
DIAMOND at HPC2N
On HPC2N we have several versions of DIAMOND available as modules.
Usage at HPC2N
To use DIAMOND, load its module to add it to your environment. To see the available versions, run:
module spider DIAMOND
and to see how to load a specific version, including its prerequisites, run:
module spider DIAMOND/<version>
Submit file examples
The following script (name it job.sh, for instance):
#!/bin/bash
# Change to your actual project id number (of the form: hpc2nXXXX-YYY, SNICXXX-YY-ZZ, or NAISSXXXX-YY-ZZ)
#SBATCH -A hpc2nXXXX-YYY
# Asking for 30 hours walltime
#SBATCH -t 30:00:00
# All cores will be on 1 node
#SBATCH -N 1
# Use 1 core per task
#SBATCH -c 1
# It is always best to run ml purge before loading modules in a submit file
ml purge > /dev/null 2>&1
# Load prerequisites for DIAMOND and DIAMOND version
ml GCC/13.3.0 DIAMOND/2.1.11
# Download the sequences for the database; the flag for not checking certificates is needed
wget --no-check-certificate https://scop.berkeley.edu/downloads/scopeseq-2.07/astral-scopedom-seqres-gd-sel-gs-bib-40-2.07.fa
# Build the DIAMOND database. The -p flag sets the number of threads, taken here from
# $SLURM_CPUS_PER_TASK, which Slurm sets to the number of cores requested with -c
diamond makedb --in astral-scopedom-seqres-gd-sel-gs-bib-40-2.07.fa -d astral40 -p $SLURM_CPUS_PER_TASK
# Download the sequences to cluster; the flag for not checking certificates is needed
wget --no-check-certificate https://scop.berkeley.edu/downloads/scopeseq-2.07/astral-scopedom-seqres-gd-sel-gs-bib-95-2.07.fa
# Run a clustering analysis. The -p flag sets the number of threads, taken here from
# $SLURM_CPUS_PER_TASK, which Slurm sets to the number of cores requested with -c
diamond cluster -d astral-scopedom-seqres-gd-sel-gs-bib-95-2.07.fa -p $SLURM_CPUS_PER_TASK -o clusters.tsv --approx-id 40 -M 64G --header
This script can be submitted to the queue with sbatch job.sh. It runs the job on 1 core (-c 1). The number of cores
can be increased for heavier jobs; because DIAMOND reads its thread count from $SLURM_CPUS_PER_TASK, changing the -c value is enough.
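Once the job has finished, the clustering result in clusters.tsv can be summarized from the command line. The following is a minimal sketch, not part of the HPC2N example. It assumes that the file has two tab-separated columns (cluster representative, then cluster member) and that the --header flag produced a single header line at the top. The function name cluster_sizes is hypothetical and only used for illustration.

```bash
#!/bin/bash
# Summarize cluster sizes from a DIAMOND clustering result.
# Assumption: two tab-separated columns (representative, member),
# with one header line first because --header was used.

cluster_sizes() {
    # $1: path to the clusters file
    # Count members per representative, skipping the assumed header line,
    # then sort by size (largest first) and by name for ties.
    awk -F'\t' 'NR > 1 && NF >= 2 { n[$1]++ }
                END { for (r in n) printf "%s\t%d\n", r, n[r] }' "$1" \
        | LC_ALL=C sort -t$'\t' -k2,2nr -k1,1
}

# Print the ten largest clusters if the job output is present
if [ -f clusters.tsv ]; then
    cluster_sizes clusters.tsv | head -n 10
fi
```

Each output line holds a representative sequence and the number of sequences in its cluster. If your DIAMOND version writes no header line, or a different one, adjust the NR > 1 condition accordingly.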