Software - Genomics

BEDOPS

BEDOPS is a suite of tools to address common questions raised in genomic studies - mostly with regard to overlap and proximity relationships between data sets. It aims to be scalable and flexible, facilitating the efficient and accurate analysis and management of large-scale genomic data.

Software Homepage: https://github.com/bedops/bedops
Available on: Cheetah Cluster


bedtools

The bedtools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using bedtools, one can develop sophisticated pipelines that answer complicated research questions by "streaming" several bedtools together.

Software Homepage: https://github.com/arq5x/bedtools2
Available on: Cheetah Cluster
Location: /share/apps/BEDTools-Version-2.16.2
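The "feature overlap" task that bedtools and BEDOPS address can be illustrated with a minimal Python sketch. This is not how either tool is implemented (both use far more efficient sorted or indexed approaches); the function name `intersect` and the sample coordinates are hypothetical, and intervals are assumed to follow the BED convention of 0-based, half-open coordinates.

```python
from typing import List, Tuple

# (chrom, start, end), assumed BED-style: 0-based start, end not included
Interval = Tuple[str, int, int]


def intersect(features_a: List[Interval],
              features_b: List[Interval]) -> List[Tuple[Interval, Interval, int]]:
    """Return every (a, b, overlap_bp) pair that shares at least one base.

    Brute-force O(len(a) * len(b)) comparison, for illustration only.
    """
    hits = []
    for a in features_a:
        for b in features_b:
            if a[0] != b[0]:
                continue  # features on different chromosomes never overlap
            overlap_bp = min(a[2], b[2]) - max(a[1], b[1])
            if overlap_bp > 0:  # adjacent (book-ended) intervals give 0
                hits.append((a, b, overlap_bp))
    return hits


# Hypothetical example data
genes = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200)]
peaks = [("chr1", 150, 350), ("chr2", 200, 250)]

for gene, peak, bp in intersect(genes, peaks):
    print(gene, peak, bp)
```

With half-open coordinates, the chr2 gene ending at 200 and the chr2 peak starting at 200 touch but do not overlap, so only the two chr1 genes are reported.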


Bowtie2

Bowtie2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie2 supports gapped, local, and paired-end alignment modes.

Software Homepage: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Available on: Cheetah Cluster
Location: /share/apps/bowtie2-2.0.0-beta5
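The FM Index mentioned above is built on the Burrows-Wheeler transform, which is also the idea behind BWA. As a rough illustration only, the following Python sketch builds the transform naively and counts exact occurrences of a pattern by backward search. It is a toy under simplifying assumptions: real aligners use compressed, sampled data structures and handle mismatches and gaps, and the function names here are hypothetical.

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler transform of text, with '$' appended as end marker.

    Naive construction by sorting all rotations; fine for short strings only.
    """
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_str: str, pattern: str) -> int:
    """Count exact matches of pattern in the original text by backward search."""
    # first_col[c] = number of characters in the text that sort before c
    first_col = {}
    total = 0
    for c in sorted(set(bwt_str)):
        first_col[c] = total
        total += bwt_str.count(c)

    def occ(c: str, i: int) -> int:
        # Occurrences of c in bwt_str[:i]; recomputed each call for clarity
        return bwt_str[:i].count(c)

    lo, hi = 0, len(bwt_str)  # current range of matching sorted rotations
    for c in reversed(pattern):
        if c not in first_col:
            return 0
        lo = first_col[c] + occ(c, lo)
        hi = first_col[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


reference = "GATTACAGATTACA"  # hypothetical miniature "genome"
index = bwt(reference)
print(count_occurrences(index, "ATTA"))  # 2
print(count_occurrences(index, "TTT"))   # 0
```

The search touches only the transformed string and a small table of character counts, which hints at why an index of this kind can keep the memory footprint small.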


BWA

BWA (Burrows-Wheeler Aligner) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW.

Software Homepage: http://bio-bwa.sourceforge.net/
Available on: Cheetah Cluster
Location: /share/apps/bwa-0.6.1


CLC Genomics Workbench

CLC Genomics Workbench is a package for analyzing and visualizing Next Generation Sequencing data. It incorporates cutting-edge technology and algorithms, while also supporting and integrating with the rest of a typical NGS workflow.

Software Homepage: http://www.clcbio.com/products/clc-genomics-workbench/
Available on: Windows workstation CBIWS25


Cufflinks

Cufflinks assembles transcripts, estimates their abundances, and tests for differential expression and regulation in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. It then estimates the relative abundances of these transcripts based on how many reads support each one, taking into account biases in library preparation protocols.

Software Homepage: http://cole-trapnell-lab.github.io/cufflinks/
Available on: Cheetah Cluster
Location: /share/apps/cufflinks-2.1.1


GATK

GATK (Genome Analysis Toolkit) is, first, a structured software library that simplifies writing efficient analysis tools for next-generation sequencing data. Second, it is a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth of coverage analyzers, a quality score recalibrator, a SNP/indel caller, and a local realigner.

Software Homepage: http://www.broadinstitute.org/gsa/wiki/index.php/Home_Page
Available on: Cheetah Cluster
Location: /share/apps/gatk-1.5.32


GeneSpring

GeneSpring provides powerful, accessible statistical tools for fast visualization and analysis of expression data.

Software Homepage: http://www.agilent.com/search/?Ntt=GeneSpring
Available on: Windows workstation CBIWS10


GenomeStudio

GenomeStudio is a framework for analyzing data gathered from Illumina sequencing and array platforms. The following modules are available:

  • Genotyping
    • Analyze SNP and CNV data across 2.5 million markers and probes
    • Detect sample outliers
  • Gene Expression
    • Analyze differentially expressed genes across different genomes
    • Profile miRNA expression
  • Methylation
    • Detect cytosine methylation at single-base resolution
    • Identify methylation signatures across the entire genome

Note: There is a bug that renders the default application shortcut unusable. To launch the program, navigate to its directory and launch the program there.

Software Homepage: http://www.illumina.com/techniques/microarrays/array-data-analysis-experimental-design/genomestudio.html
Available on: Windows workstation CBIWS16


igvtools

The igvtools utility provides a set of tools for pre-processing data files. To use igvtools, load its module file first: module load igvtools.

Software Homepage: http://www.broadinstitute.org/igv/igvtools
Available on: Cheetah Cluster
Command: igvtools


MACS

MACS (Model-based Analysis of ChIP-Seq) is an algorithm for identifying transcription factor binding sites, addressing the lack of powerful ChIP-Seq analysis methods. MACS captures the influence of genome complexity to evaluate the significance of enriched ChIP regions, and it improves the spatial resolution of binding sites by combining information on both sequencing tag position and orientation. It can be used with ChIP-Seq data alone, or with a control sample to increase specificity. To use MACS, load its module file first: module load macs2.

Software Homepage: https://github.com/taoliu/MACS/
Available on: Cheetah Cluster
Command: macs2


MAQ

MAQ (Mapping and Assembly with Quality) builds assemblies by mapping short reads to reference sequences.

Software Homepage: http://maq.sourceforge.net/
Available on: Cheetah Cluster
Location: /share/apps/maq


MEME Systems

Click here for information on running MEME.

Software Homepage: n/a
Available on: Bishop server; Cheetah Cluster


Mugsy

Mugsy is a multiple whole genome aligner. Mugsy uses Nucmer for pairwise alignment, a custom graph based segmentation procedure for identifying collinear regions, and the segment-based progressive multiple alignment strategy from Seqan::TCoffee. Mugsy accepts draft genomes in the form of multi-FASTA files and does not require a reference genome. To use Mugsy, load its module file first: module load mugsy.

Software Homepage: http://mugsy.sourceforge.net/
Available on:
Command: mugsy
Location: module load mugsy


Picard

Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files. Both SAM text format and SAM binary (BAM) format are supported.

Software Homepage: http://broadinstitute.github.io/picard
Available on: Cheetah Cluster
Location: /share/apps/picard-tools-1.67


Prodigal

Prodigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program developed at Oak Ridge National Laboratory and the University of Tennessee.

Software Homepage: http://prodigal.ornl.gov/
Available on: Cheetah Cluster
Location: /share/apps/prodigal-2.60


SAM

SAM (Significance Analysis of Microarrays) is a statistical technique for finding significant genes in a set of microarray experiments.

Software Homepage: http://www-stat.stanford.edu/~tibs/SAM/
Available on: Windows workstation CBIWS12


SAMtools

SAMtools provides various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing, and generating alignments in a per-position format.

Software Homepage: http://samtools.sourceforge.net/
Available on: Cheetah Cluster
Location: /share/apps/samtools-0.1.18


TopHat

TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons.

Software Homepage: http://ccb.jhu.edu/software/tophat/index.shtml
Available on: Cheetah Cluster
Command: tophat2
Location: /share/apps/tophat-2.0.5/


USEARCH

USEARCH is a high-throughput sequence analysis tool. It is distributed as a single binary program that implements a suite of algorithms comparable to BLASTN, BLASTP, BLASTX, BLASTCLUST, CD-HIT, CD-HIT-EST, CD-HIT-2D, CD-HIT-EST-2D, CD-HIT-OTU, CD-HIT-454, ChimeraSlayer, Perseus, RAPsearch and more. It supports a rich set of sequence matching options, including E-values, identity, coverage (fraction of query or target sequence covered by the alignment) and maximum gap length, and a range of output file formats including FASTA, BLAST-like, user-defined tabbed text and a native format designed for clustering applications. Supported alignment styles include local (gapped and ungapped), like BLAST, and global, which is most often used in clustering applications. User-settable parameters allow tuning of substitution scores, gap penalties and Karlin-Altschul statistics.

Software Homepage: http://www.drive5.com/usearch/
Available on: Cheetah Cluster
Location: /share/apps/usearch-6.0.307
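To make the identity and coverage matching options concrete, here is a small Python sketch that computes both from an already-aligned pair of sequences. It is an illustration, not USEARCH's code: identity is assumed here to mean matching columns divided by total alignment columns (several other conventions exist), coverage follows the definition given above for the query sequence, and the function name and sample alignment are hypothetical.

```python
from typing import Tuple


def alignment_stats(aligned_query: str,
                    aligned_target: str,
                    query_length: int) -> Tuple[float, float]:
    """Return (identity, query_coverage) for a gapped pairwise alignment.

    aligned_query / aligned_target: equal-length rows, '-' marks a gap.
    query_length: length of the full, unaligned query sequence.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned rows must have the same length")
    if not aligned_query or query_length <= 0:
        raise ValueError("alignment and query must be non-empty")

    columns = len(aligned_query)
    # A column counts as a match only if both rows hold the same residue
    matches = sum(1 for q, t in zip(aligned_query, aligned_target)
                  if q == t and q != "-")
    # Query residues that take part in the alignment (gaps excluded)
    query_bases = sum(1 for q in aligned_query if q != "-")

    identity = matches / columns          # assumed convention, see lead-in
    coverage = query_bases / query_length
    return identity, coverage


# Hypothetical local alignment of a 10-base query
identity, coverage = alignment_stats("ACGT-ACG", "ACGTTACC", query_length=10)
print(f"identity={identity:.2f} coverage={coverage:.2f}")
```

A hit could then be accepted or rejected by comparing these two values against user-chosen thresholds.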


VCFtools

VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide methods for working with VCF files: validating, merging, comparing, and calculating some basic population genetic statistics.

Software Homepage: https://vcftools.github.io/index.html
Available on: Cheetah Cluster
Location: /share/apps/vcftools-0.1.8
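As a rough idea of the kind of basic population genetic statistic involved, the Python sketch below computes the non-reference allele frequency for a single VCF data line. It is not VCFtools code: the function name and the sample record are hypothetical, the line is assumed to be tab-separated with a FORMAT column containing GT, and all non-reference alleles are pooled together.

```python
def alt_allele_frequency(vcf_line: str) -> float:
    """Fraction of called alleles that are non-reference in one VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("record has no sample columns")

    # Fixed columns are CHROM..INFO (0-7), FORMAT is 8, samples start at 9
    format_keys = fields[8].split(":")
    gt_index = format_keys.index("GT")  # raises ValueError if GT is absent

    alt_count = 0
    called = 0
    for sample in fields[9:]:
        genotype = sample.split(":")[gt_index]
        # Treat phased ('|') and unphased ('/') genotypes alike
        for allele in genotype.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing call, not counted
            called += 1
            if allele != "0":
                alt_count += 1

    if called == 0:
        raise ValueError("no called genotypes in record")
    return alt_count / called


# Hypothetical record with four samples, one of them uncalled
record = "\t".join([
    "1", "1000", "rs1", "A", "G", "50", "PASS", ".", "GT:DP",
    "0/1:12", "1|1:8", "./.:0", "0/0:20",
])
print(alt_allele_frequency(record))  # 0.5
```

Missing genotypes are excluded from the denominator, so the result reflects only the alleles that were actually called.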