All Available Modules

Below are all available modules in the current release of Omics Pipe in alphabetical order. When creating a custom pipeline, you can choose from these modules or create your own.
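
Each entry below is headed by a fully qualified signature of the form package.module.function(arguments). As a minimal sketch, assuming only that layout, such a heading can be split into the module path, the function name, and the argument names. The helper name parse_signature is hypothetical and is not part of Omics Pipe.

```python
import re

# Matches headings such as
# "omics_pipe.modules.bowtie.bowtie(sample, bowtie_flag)[source]".
SIGNATURE = re.compile(
    r"^(?P<module>[\w.]+)\.(?P<func>\w+)\((?P<args>[^)]*)\)(?:\[source\])?$"
)


def parse_signature(heading):
    """Split a signature heading into (module path, function name, argument names)."""
    match = SIGNATURE.match(heading.strip())
    if match is None:
        raise ValueError("not a signature heading: %r" % heading)
    args = [arg.strip() for arg in match.group("args").split(",") if arg.strip()]
    return match.group("module"), match.group("func"), args


print(parse_signature("omics_pipe.modules.bwa.bwa_mem(sample, bwa_mem_flag)[source]"))
```

The module path is the part to import, and the function name is what a custom pipeline would refer to.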

omics_pipe.modules.annotate_peaks.annotate_peaks(step, annotate_peaks_flag)[source]

Runs HOMER to annotate peak track from ChIPseq data.

input:
.tag input file
output:
.txt file
citation:
Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
link:
http://homer.salk.edu/homer/
parameters from parameters file:

PAIR_LIST:

HOMER_RESULTS:

HOMER_VERSION:

TEMP_DIR:

HOMER_GENOME:

HOMER_ANNOTATE_OPTIONS:

omics_pipe.modules.annotate_variants.annotate_variants(sample, annotate_variants_flag)[source]

Annotates variants with ANNOVAR variant annotator. Follows VarCall.

input:
.vcf
output:
.vcf
citation:
Wang K, Li M, Hakonarson H. ANNOVAR: Functional annotation of genetic variants from next-generation sequencing data. Nucleic Acids Research, 38:e164, 2010
link:
http://www.openbioinformatics.org/annovar/
parameters from parameters file:

VARIANT_RESULTS:

ANNOVARDB:

ANNOVAR_OPTIONS:

ANNOVAR_OPTIONS2:

TEMP_DIR:

ANNOVAR_VERSION:

VCFTOOLS_VERSION:

omics_pipe.modules.bowtie.bowtie(sample, bowtie_flag)[source]

Runs Bowtie to align .fastq files.

input:
.fastq file
output:
sample.bam
citation:
Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10:R25
link:
http://bowtie-bio.sourceforge.net/index.shtml
parameters from parameters file:

ENDS:

TRIMMED_DATA_PATH:

BOWTIE_OPTIONS:

BOWTIE_INDEX:

BOWTIE_RESULTS:

BOWTIE_VERSION:

SAMTOOLS_VERSION:

BEDTOOLS_VERSION:

TEMP_DIR:
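
Because each module reads a fixed set of keys from the parameters file, it can help to check that they are all present before running a step. The following is a minimal sketch using the keys listed above for bowtie. The function name missing_keys and the example dictionary are hypothetical, the values are placeholders, and how Omics Pipe itself loads or validates parameters is not shown here.

```python
# Keys listed above for the bowtie module.
BOWTIE_KEYS = [
    "ENDS", "TRIMMED_DATA_PATH", "BOWTIE_OPTIONS", "BOWTIE_INDEX",
    "BOWTIE_RESULTS", "BOWTIE_VERSION", "SAMTOOLS_VERSION",
    "BEDTOOLS_VERSION", "TEMP_DIR",
]


def missing_keys(params, required):
    """Return the required keys that are absent from params, in listing order."""
    return [key for key in required if key not in params]


# Hypothetical, deliberately incomplete parameter set with placeholder values.
example_params = {
    "ENDS": "<value>",
    "BOWTIE_INDEX": "<value>",
    "TEMP_DIR": "<value>",
}

for key in missing_keys(example_params, BOWTIE_KEYS):
    print("missing parameter:", key)
```

The same check applies to any other entry by swapping in its key list.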

omics_pipe.modules.BreastCancer_RNA_report.BreastCancer_RNA_report(sample, BreastCancer_RNA_report_flag)[source]

Runs R script with knitr to produce report from RNAseq pipeline.

input:
results from other steps in RNAseq pipelines
output:
html report
citation:
T. Meissner
parameters from parameter file:

WORKING_DIR:

R_VERSION:

REPORT_RESULTS:

PARAMS_FILE:

TABIX_VERSION:

TUMOR_TYPE:

GENELIST:

COSMIC:

CLINVAR:

PHARMGKB_rsID:

PHARMGKB_Allele:

DRUGBANK:

CADD:

omics_pipe.modules.bwa.bwa1(sample, bwa1_flag)[source]

BWA aligner for read1 of paired_end reads.

input:
.fastq
output:
.sam
citation:
Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
link:
http://bio-bwa.sourceforge.net/bwa.shtml
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BWA_VERSION:

BWA_INDEX:

RAW_DATA_DIR:

GATK_READ_GROUP_INFO:

COMPRESSION:

omics_pipe.modules.bwa.bwa2(sample, bwa2_flag)[source]

BWA aligner for read2 of paired_end reads.

input:
.fastq
output:
.sam
citation:
Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
link:
http://bio-bwa.sourceforge.net/bwa.shtml
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BWA_VERSION:

BWA_INDEX:

RAW_DATA_DIR:

GATK_READ_GROUP_INFO:

COMPRESSION:

omics_pipe.modules.bwa.bwa_RNA(sample, bwa_flag)[source]

BWA aligner for single end reads.

input:
.fastq
output:
.sam
citation:
Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
link:
http://bio-bwa.sourceforge.net/bwa.shtml
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BWA_VERSION:

BWA_INDEX:

RAW_DATA_DIR:

GATK_READ_GROUP_INFO:

COMPRESSION:

omics_pipe.modules.bwa.bwa_mem(sample, bwa_mem_flag)[source]

BWA aligner with BWA-MEM algorithm.

input:
.fastq
output:
.sam
citation:
Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
link:
http://bio-bwa.sourceforge.net/bwa.shtml
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BWA_VERSION:

GENOME:

RAW_DATA_DIR:

BWA_OPTIONS:

COMPRESSION:

omics_pipe.modules.bwa.bwa_mem_pipe(sample, bwa_mem_pipe_flag)[source]

BWA aligner with BWA-MEM algorithm.

input:
.fastq
output:
.sam
citation:
Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
link:
http://bio-bwa.sourceforge.net/bwa.shtml
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BWA_VERSION:

GENOME:

RAW_DATA_DIR:

BWA_OPTIONS:

COMPRESSION:

SAMBAMBA_VERSION:

SAMBLASTER_VERSION:

SAMBAMBA_OPTIONS:

omics_pipe.modules.call_variants.call_variants(sample, call_variants_flag)[source]

Calls variants from alignment .bam files using Varcall.

input:
Aligned.out.sort.bam or accepted_hits.bam
output:
.vcf file
citation:
Erik Aronesty (2011). ea-utils: “Command-line tools for processing biological sequencing data”.
link:
https://code.google.com/p/ea-utils/wiki/Varcall
parameters from parameters file:

STAR_RESULTS:

GENOME:

VARSCAN_PATH:

VARSCAN_OPTIONS:

VARIANT_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

ANNOVAR_VERSION:

VCFTOOLS_VERSION:

VARSCAN_VERSION:

SAMTOOLS_OPTIONS:

omics_pipe.modules.ChIP_trim.ChIP_trim(sample, ChIP_trim_flag)[source]

Runs Homer Tools to trim adapters from .fastq files.

input:
.fastq file
output:
.fastq file
citation:
Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
link:
http://homer.salk.edu/homer/
parameters from parameters file:

ENDS:

RAW_DATA_DIR:

HOMER_TRIM_OPTIONS:

TRIMMED_DATA_PATH:

HOMER_VERSION:

omics_pipe.modules.cuffdiff.cuffdiff(step, cuffdiff_flag)[source]

Runs Cuffdiff to perform differential expression. Runs after Cufflinks. Part of Tuxedo Suite.

input:
.bam files
output:
differential expression results
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

CUFFDIFF_RESULTS:

GENOME:

CUFFDIFF_OPTIONS:

CUFFMERGE_RESULTS:

CUFFDIFF_INPUT_LIST_COND1:

CUFFDIFF_INPUT_LIST_COND2:

CUFFLINKS_VERSION:

omics_pipe.modules.cuffdiff_miRNA.cuffdiff_miRNA(step, cuffdiff_miRNA_flag)[source]

Runs Cuffdiff to perform differential expression. Runs after Cufflinks. Part of Tuxedo Suite.

input:
.bam files
output:
differential expression results
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

CUFFDIFF_RESULTS:

GENOME:

CUFFDIFF_OPTIONS:

CUFFMERGE_RESULTS:

CUFFDIFF_INPUT_LIST_COND1:

CUFFDIFF_INPUT_LIST_COND2:

CUFFLINKS_VERSION:

Runs cufflinks to assemble .bam files from TopHat.

input:
accepted_hits.bam
output:
transcripts.gtf
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

TOPHAT_RESULTS:

CUFFLINKS_RESULTS:

REF_GENES:

GENOME:

CUFFLINKS_OPTIONS:

CUFFLINKS_VERSION:

Runs cufflinks to assemble .bam files from TopHat. Takes parameter miRNA_GTF.

input:
accepted_hits.bam
output:
transcripts.gtf
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

TOPHAT_RESULTS:

CUFFLINKS_RESULTS:

miRNA_GTF:

GENOME:

CUFFLINKS_OPTIONS:

CUFFLINKS_VERSION:

Runs cufflinks to assemble .bam files from TopHat. Takes parameters LNCRNA_GTF and NONCODE_FASTA.

input:
accepted_hits.bam
output:
transcripts.gtf
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

TOPHAT_RESULTS:

CUFFLINKS_RESULTS:

LNCRNA_GTF:

NONCODE_FASTA:

CUFFLINKS_OPTIONS:

CUFFLINKS_VERSION:

omics_pipe.modules.cuffmerge.cuffmerge(step, cuffmerge_flag)[source]

Runs cuffmerge to merge .gtf files from Cufflinks.

input:
assembly_GTF_list.txt
output:
merged.gtf
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

CUFFMERGE_RESULTS:

REF_GENES:

GENOME:

CUFFMERGE_OPTIONS:

CUFFLINKS_VERSION:

omics_pipe.modules.cuffmerge_miRNA.cuffmerge_miRNA(step, cuffmerge_miRNA_flag)[source]

Runs cuffmerge to merge .gtf files from Cufflinks.

input:
assembly_GTF_list.txt
output:
merged.gtf
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

CUFFMERGE_RESULTS:

miRNA_GTF:

GENOME:

CUFFMERGE_OPTIONS:

CUFFLINKS_VERSION:

omics_pipe.modules.cuffmergetocompare.cuffmergetocompare(step, cuffmergetocompare_flag)[source]

Runs cuffcompare to annotate merged .gtf files from Cufflinks.

input:
assembly_GTF_list.txt
output:
merged.gtf
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

CUFFMERGE_RESULTS:

REF_GENES:

GENOME:

CUFFMERGETOCOMPARE_OPTIONS:

CUFFLINKS_VERSION:

omics_pipe.modules.cuffmergetocompare_miRNA.cuffmergetocompare_miRNA(step, cuffmergetocompare_miRNA_flag)[source]

Runs cuffcompare to annotate merged .gtf files from Cufflinks.

input:
assembly_GTF_list.txt
output:
merged.gtf
citation:
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. doi:10.1038/nbt.1621
link:
http://cufflinks.cbcb.umd.edu/
parameters from parameters file:

CUFFMERGE_RESULTS:

miRNA_GTF:

GENOME:

CUFFMERGETOCOMPARE_OPTIONS:

CUFFLINKS_VERSION:

omics_pipe.modules.custom_R_report.custom_R_report(sample, custom_R_report_flag)[source]

Runs R script with knitr to produce report from omics pipeline.

input:
results from other steps in RNAseq pipelines
output:
html report
citation:
T. Meissner
parameters from parameter file:

REPORT_SCRIPT:

R_VERSION:

REPORT_RESULTS:

R_MARKUP_FILE:

DPS_VERSION:

PARAMS_FILE:

omics_pipe.modules.cutadapt_miRNA.cutadapt_miRNA(sample, cutadapt_miRNA_flag)[source]

Runs Cutadapt to trim adapters from reads.

input:
.fastq
output:
.fastq
citation:
Martin 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17: 10-12.
link:
https://code.google.com/p/cutadapt/
parameters from parameters file:

RAW_DATA_DIR:

ADAPTER:

TRIMMED_DATA_PATH:

PYTHON_VERSION:
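
To illustrate what adapter trimming with the ADAPTER parameter amounts to, here is a minimal sketch that cuts a read at the first exact occurrence of the adapter. It is a simplification and not Cutadapt's algorithm: Cutadapt also handles mismatches and partial adapter matches, which this sketch does not. The function name trim_adapter and the example read are made up for illustration.

```python
def trim_adapter(sequence, quality, adapter):
    """Cut the read at the first exact occurrence of the adapter.

    Returns the trimmed sequence and the matching slice of the quality string.
    Reads without the adapter are returned unchanged.
    """
    index = sequence.find(adapter)
    if index == -1:
        return sequence, quality
    return sequence[:index], quality[:index]


# Made-up 18-base read and adapter, for illustration only.
seq, qual = trim_adapter("ACGTACGTTGGAATTCTC", "I" * 18, "TGGAATTC")
print(seq, qual)
```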

omics_pipe.modules.fastq_length_filter_miRNA.fastq_length_filter_miRNA(sample, fastq_length_filter_miRNA_flag)[source]

Runs custom Python script to filter miRNA reads by length.

input:
.fastq
output:
.fastq
parameters from parameter file:

TRIMMED_DATA_PATH:

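
The length filter can be pictured as follows. This is a minimal sketch and not the script shipped with Omics Pipe: it assumes plain four-line FASTQ records, and the function name filter_fastq_by_length and the length bounds are hypothetical placeholders.

```python
import io


def filter_fastq_by_length(in_handle, out_handle, min_len, max_len):
    """Copy four-line FASTQ records whose sequence length is in [min_len, max_len].

    Returns the number of records written.
    """
    kept = 0
    while True:
        record = [in_handle.readline() for _ in range(4)]
        if not record[0]:  # end of input
            break
        if min_len <= len(record[1].rstrip("\r\n")) <= max_len:
            out_handle.writelines(record)
            kept += 1
    return kept


# Two made-up reads: one of 20 bases and one of 4 bases.
reads = io.StringIO(
    "@r1\n" + "ACGT" * 5 + "\n+\n" + "I" * 20 + "\n"
    "@r2\nACGT\n+\nIIII\n"
)
filtered = io.StringIO()
kept = filter_fastq_by_length(reads, filtered, 18, 30)  # placeholder bounds
print(kept)
print(filtered.getvalue(), end="")
```

Passing file handles rather than paths keeps the function usable with ordinary files opened by the caller as well as with in-memory text.
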
omics_pipe.modules.fastqc.fastqc(sample, fastqc_flag)[source]

QC check of raw .fastq files using FASTQC.

input:
.fastq file
output:
folder and zipped folder containing html, txt and image files
citation:
Babraham Bioinformatics
link:
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
parameters from parameters file:

RAW_DATA_DIR:

QC_PATH:

FASTQC_VERSION:

COMPRESSION:

omics_pipe.modules.fastqc_miRNA.fastqc_miRNA(sample, fastqc_miRNA_flag)[source]

QC check of raw .fastq files using FASTQC.

input:
.fastq file
output:
folder and zipped folder containing html, txt and image files
citation:
Babraham Bioinformatics
link:
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
parameters from parameters file:

RAW_DATA_DIR:

QC_PATH:

FASTQC_VERSION:

omics_pipe.modules.filter_variants.filter_variants(sample, filter_variants_flag)[source]

Filters variants to remove common variants.

input:
.bam or .sam file
output:
.vcf file
citation:
Piskol et al. 2013. Reliable identification of genomic variants from RNA-seq data. The American Journal of Human Genetics 93: 641-651.
link:
http://lilab.stanford.edu/SNPiR/
parameters from parameters file:

VARIANT_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BWA_VERSION:

PICARD_VERSION:

GATK_VERSION:

BEDTOOLS_VERSION:

UCSC_TOOLS_VERSION:

GENOME:

REPEAT_MASKER:

SNPIR_ANNOTATION:

RNA_EDIT:

DBSNP:

MILLS:

G1000:

WORKING_DIR:

BWA_RESULTS:

SNPIR_VERSION:

SNPIR_CONFIG:

SNPIR_DIR:

SNPEFF_VERSION:

dbNSFP:

VCFTOOLS_VERSION:

SNP_FILTER_OUT_REF:

omics_pipe.modules.find_motifs.find_motifs(step, find_motifs_flag)[source]

Runs HOMER to find motifs from ChIPseq data.

input:
.txt peak file from Homer
output:
.txt file
citation:
Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
link:
http://homer.salk.edu/homer/
parameters from parameters file:

PAIR_LIST:

HOMER_RESULTS:

HOMER_VERSION:

TEMP_DIR:

HOMER_GENOME:

HOMER_MOTIFS_OPTIONS:

omics_pipe.modules.fusion_catcher.fusion_catcher(sample, fusion_catcher_flag)[source]

Detects fusion genes in paired-end RNAseq data.

input:
paired end .fastq files
output:
list of candidate fusion genes
citation:
S. Kangaspeska, S. Hultsch, H. Edgren, D. Nicorici, A. Murumägi, O.P. Kallioniemi, Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms, PLOS One, Oct. 2012. http://dx.plos.org/10.1371/journal.pone.0048745
link:
https://code.google.com/p/fusioncatcher
parameters from parameters file:

ENDS:

RAW_DATA_DIR:

FUSION_RESULTS:

FUSIONCATCHERBUILD_DIR:

TEMP_DIR:

SAMTOOLS_VERSION:

FUSIONCATCHER_VERSION:

FUSIONCATCHER_OPTIONS:

TISSUE:

PYTHON_VERSION:

omics_pipe.modules.GATK_preprocessing_WES.GATK_preprocessing_WES(sample, GATK_preprocessing_WES_flag)[source]

GATK preprocessing steps for whole exome sequencing.

input:
sorted.rg.md.bam
output:
.ready.bam
citation:
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
link:
http://www.broadinstitute.org/gatk/
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

GATK_VERSION:

GENOME:

DBSNP:

MILLS:

G1000:

CAPTURE_KIT_BED:

SAMTOOLS_VERSION:

omics_pipe.modules.GATK_preprocessing_WGS.GATK_preprocessing_WGS(sample, GATK_preprocessing_WGS_flag)[source]

GATK preprocessing steps for whole genome sequencing.

input:
sorted.rg.md.bam
output:
.ready.bam
citation:
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
link:
http://www.broadinstitute.org/gatk/
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

GATK_VERSION:

GENOME:

DBSNP:

MILLS:

G1000:

SAMTOOLS_VERSION:

omics_pipe.modules.GATK_variant_discovery.GATK_variant_discovery(sample, GATK_variant_discovery_flag)[source]

Runs GATK variant discovery.

input:
sorted.rg.md.bam
output:
.ready.bam
citation:
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
link:
http://www.broadinstitute.org/gatk/
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

GATK_VERSION:

GENOME:

DBSNP:

VARIANT_RESULTS:

omics_pipe.modules.GATK_variant_filtering.GATK_variant_filtering(sample, GATK_variant_filtering_flag)[source]

Runs GATK variant filtering.

input:
sorted.rg.md.bam
output:
.ready.bam
citation:
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
link:
http://www.broadinstitute.org/gatk/
parameters from parameters file:

VARIANT_RESULTS:

TEMP_DIR:

GATK_VERSION:

GENOME:

DBSNP:

MILLS:

OMNI:

HAPMAP:

R_VERSION:

G1000_SNPs:

G1000_Indels:

omics_pipe.modules.GATK_variant_filtering.GATK_variant_filtering_group(sample, GATK_variant_filtering_group_flag)[source]

Runs GATK variant filtering.

input:
sorted.rg.md.bam
output:
.ready.bam
citation:
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
link:
http://www.broadinstitute.org/gatk/

parameters from parameters file:

VARIANT_RESULTS:

TEMP_DIR:

GATK_VERSION:

GENOME:

DBSNP:

MILLS_G1000:

OMNI:

HAPMAP:

R_VERSION:

G1000:

omics_pipe.modules.homer_peaks.homer_peaks(step, homer_peaks_flag)[source]

Runs HOMER to call peaks from ChIPseq data.

input:
.tag input file
output:
.txt file
citation:
Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
link:
http://homer.salk.edu/homer/
parameters from parameters file:

PAIR_LIST:

HOMER_RESULTS:

HOMER_PEAKS_OPTIONS:

HOMER_VERSION:

TEMP_DIR:

omics_pipe.modules.htseq.htseq(sample, htseq_flag)[source]

Runs htseq-count to get raw count data from alignments.

input:
Aligned.out.sort.bam
output:
counts.txt
citation:
Simon Anders, EMBL
link:
http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
parameters from parameters file:

STAR_RESULTS:

HTSEQ_OPTIONS:

REF_GENES:

HTSEQ_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BAM_FILE_NAME:

PYTHON_VERSION:

omics_pipe.modules.htseq_gencode.htseq_gencode(sample, htseq_flag)[source]

Runs htseq-count to get raw count data from alignments.

input:
Aligned.out.sort.bam
output:
counts.txt
citation:
Simon Anders, EMBL
link:
http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
parameters from parameters file:

STAR_RESULTS:

HTSEQ_OPTIONS:

REF_GENES_GENCODE:

HTSEQ_GENCODE_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BAM_FILE_NAME:

omics_pipe.modules.htseq_miRNA.htseq_miRNA(sample, htseq_miRNA_flag)[source]

Runs htseq-count to get raw count data from alignments.

input:
Aligned.out.sort.bam
output:
counts.txt
citation:
Simon Anders, EMBL
link:
http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
parameters from parameters file:

TOPHAT_RESULTS:

HTSEQ_OPTIONS:

miRNA_GFF:

HTSEQ_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BAM_FILE_NAME:

omics_pipe.modules.intogen.intogen(sample, intogen_flag)[source]

Runs Intogen to rank mutations and their implications for the cancer phenotype. Follows variant calling.

input:
.vcf
output:
variant list
citation:
Gonzalez-Perez et al. 2013. Intogen mutations identifies cancer drivers across tumor types. Nature Methods 10, 1081-1082.
link:
http://www.intogen.org/
parameters from parameter file:

VCF_FILE:

INTOGEN_OPTIONS:

INTOGEN_RESULTS:

INTOGEN_VERSION:

USERNAME:

WORKING_DIR:

TEMP_DIR:

SCHEDULER:

VARIANT_RESULTS:

omics_pipe.modules.macs.macs(step, macs_flag)[source]

Runs MACS to call peaks from ChIPseq data.

input:
.fastq file
output:
peaks and .bed file
citation:
Zhang et al. Model-based Analysis of ChIP-Seq (MACS). Genome Biol (2008) vol. 9 (9) pp. R137
link:
http://liulab.dfci.harvard.edu/MACS/
parameters from parameters file:

PAIR_LIST:

BOWTIE_RESULTS:

CHROM_SIZES:

MACS_RESULTS:

MACS_VERSION:

TEMP_DIR:

BEDTOOLS_VERSION:

PYTHON_VERSION:

omics_pipe.modules.mutect.mutect(sample, mutect_flag)[source]

Runs MuTect on paired tumor/normal samples to detect somatic point mutations in cancer genomes.

input:
.bam
output:
call_stats.txt
citation:
Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnology (2013). doi:10.1038/nbt.2514
link:
http://www.broadinstitute.org/cancer/cga/mutect
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

GATK_VERSION:

GENOME:

DBSNP:

MILLS:

G1000:

CAPTURE_KIT_BED:

omics_pipe.modules.peak_track.peak_track(step, peak_track_flag)[source]

Runs HOMER to create peak track from ChIPseq data.

input:
.tag input file
output:
.txt file
citation:
Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
link:
http://homer.salk.edu/homer/
parameters from parameters file:

PAIR_LIST:

HOMER_RESULTS:

HOMER_VERSION:

TEMP_DIR:

omics_pipe.modules.picard_mark_duplicates.picard_mark_duplicates(sample, picard_mark_duplicates_flag)[source]

Picard tools Mark Duplicates.

input:
sorted.bam
output:
_sorted.rg.md.bam
citation:
http://picard.sourceforge.net/
link:
http://picard.sourceforge.net/
parameters from parameters file:

BWA_RESULTS:

TEMP_DIR:

PICARD_VERSION:

SAMTOOLS_VERSION:

omics_pipe.modules.read_density.read_density(sample, read_density_flag)[source]

Runs HOMER to visualize read density from ChIPseq data.

input:
.bam file
output:
.txt file
citation:
Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
link:
http://homer.salk.edu/homer/
parameters from parameters file:

BOWTIE_RESULTS:

CHROM_SIZES:

HOMER_RESULTS:

HOMER_VERSION:

TEMP_DIR:

omics_pipe.modules.RNAseq_QC.RNAseq_QC(sample, RNAseq_QC_flag)[source]

Runs rseqc to determine insert size as QC for alignment.

input:
.bam
output:
pdf plot
link:
http://rseqc.sourceforge.net/
parameters from parameters file:

STAR_RESULTS:

QC_PATH:

BAM_FILE_NAME:

RSEQC_REF:

TEMP_DIR:

PICARD_VERSION:

R_VERSION:

omics_pipe.modules.RNAseq_report.RNAseq_report(sample, RNAseq_report_flag)[source]

Runs R script with knitr to produce report from RNAseq pipeline.

input:
results from other steps in RNAseq pipelines
output:
html report
citation:
T. Meissner
parameters from parameter file:

REPORT_SCRIPT:

R_VERSION:

REPORT_RESULTS:

R_MARKUP_FILE:

DPS_VERSION:

PARAMS_FILE:

omics_pipe.modules.RNAseq_report_counts.RNAseq_report_counts(sample, RNAseq_report_counts_flag)[source]

Runs R script with knitr to produce report from RNAseq pipeline.

input:
results from other steps in RNAseq pipelines
output:
html report
citation:
T. Meissner
parameters from parameter file:

WORKING_DIR:

R_VERSION:

REPORT_RESULTS:

PARAMS_FILE:

omics_pipe.modules.RNAseq_report_tuxedo.RNAseq_report_tuxedo(sample, RNAseq_report_tuxedo_flag)[source]

Runs R script with knitr to produce report from RNAseq pipeline.

input:
results from other steps in RNAseq pipelines
output:
html report
citation:
T. Meissner
parameters from parameter file:

WORKING_DIR:

R_VERSION:

REPORT_RESULTS:

DPS_VERSION:

PARAMS_FILE:

omics_pipe.modules.rseqc.rseqc(sample, rseqc_flag)[source]

Runs rseqc to determine insert size as QC for alignment.

input:
.bam
output:
pdf plot
link:
http://rseqc.sourceforge.net/
parameters from parameters file:

STAR_RESULTS:

QC_PATH:

BAM_FILE_NAME:

RSEQC_REF:

RSEQC_VERSION:

TEMP_DIR:

omics_pipe.modules.snpir_variants.snpir_variants(sample, snpir_variants_flag)[source]

Calls variants using SNPIR pipeline.

input:
Aligned.out.sort.bam or accepted_hits.bam
output:
final_variants.vcf file
citation:
Piskol, R., et al. (2013). “Reliable Identification of Genomic Variants from RNA-Seq Data.” The American Journal of Human Genetics 93(4): 641-651.
link:
http://lilab.stanford.edu/SNPiR/
parameters from parameters file:

VARIANT_RESULTS:

TEMP_DIR:

SAMTOOLS_VERSION:

BWA_VERSION:

PICARD_VERSION:

GATK_VERSION:

BEDTOOLS_VERSION:

UCSC_TOOLS_VERSION:

GENOME:

REPEAT_MASKER:

SNPIR_ANNOTATION:

RNA_EDIT:

DBSNP:

MILLS:

G1000:

WORKING_DIR:

BWA_RESULTS:

SNPIR_VERSION:

SNPIR_CONFIG:

SNPIR_DIR:

ENCODING:

omics_pipe.modules.star.star(sample, star_flag)[source]

Runs STAR to align .fastq files.

input:
.fastq file
output:
Aligned.out.bam
citation:
A. Dobin et al, Bioinformatics 2012; doi: 10.1093/bioinformatics/bts635 “STAR: ultrafast universal RNA-seq aligner”
link:
https://code.google.com/p/rna-star/
parameters from parameters file:

ENDS:

RAW_DATA_DIR:

STAR_INDEX:

STAR_OPTIONS:

STAR_RESULTS:

SAMTOOLS_VERSION:

STAR_VERSION:

COMPRESSION:

REF_GENES:

omics_pipe.modules.star_piRNA.star_piRNA(sample, star_flag)[source]

Runs STAR to align .fastq files.

input:
.fastq file
output:
Aligned.out.bam
citation:
A. Dobin et al, Bioinformatics 2012; doi: 10.1093/bioinformatics/bts635 “STAR: ultrafast universal RNA-seq aligner”
link:
https://code.google.com/p/rna-star/
parameters from parameters file:

ENDS:

RAW_DATA_DIR:

STAR_INDEX:

STAR_OPTIONS:

STAR_RESULTS:

SAMTOOLS_VERSION:

STAR_VERSION:

omics_pipe.modules.TCGA_download.TCGA_download(sample, TCGA_download_flag)[source]

Downloads and unzips TCGA data from Manifest.xml downloaded from CGHub.

input:
TCGA XML file
output:
downloaded files from TCGA
citation:
The Cancer Genome Atlas
link:
https://cghub.ucsc.edu/software/downloads.html
parameters from parameters file:

TCGA_XML_FILE:

TCGA_KEY:

TCGA_OUTPUT_PATH:

CGATOOLS_VERSION:

omics_pipe.modules.tophat.tophat(sample, tophat_flag)[source]

Runs TopHat to align .fastq files.

input:
.fastq file
output:
accepted_hits.bam
citation:
Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics doi:10.1093/bioinformatics/btp120
link:
http://tophat.cbcb.umd.edu/
parameters from parameters file:

RAW_DATA_DIR:

REF_GENES:

TOPHAT_RESULTS:

BOWTIE_INDEX:

TOPHAT_VERSION:

TOPHAT_OPTIONS:

BOWTIE_VERSION:

SAMTOOLS_VERSION:

omics_pipe.modules.tophat_miRNA.tophat_miRNA(sample, tophat_miRNA_flag)[source]

Runs TopHat to align .fastq files.

input:
.fastq file
output:
accepted_hits.bam
citation:
Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics doi:10.1093/bioinformatics/btp120
link:
http://tophat.cbcb.umd.edu/
parameters from parameters file:

RAW_DATA_DIR:

miRNA_GTF:

TOPHAT_RESULTS:

miRNA_BOWTIE_INDEX:

TOPHAT_VERSION:

TOPHAT_OPTIONS:

BOWTIE_VERSION:

SAMTOOLS_VERSION:

omics_pipe.modules.tophat_ncRNA.tophat_ncRNA(sample, tophat_ncRNA_flag)[source]

Runs TopHat to align .fastq files.

input:
.fastq file
output:
accepted_hits.bam
citation:
Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics doi:10.1093/bioinformatics/btp120
link:
http://tophat.cbcb.umd.edu/
parameters from parameters file:

RAW_DATA_DIR:

REF_GENES:

TOPHAT_RESULTS:

NONCODE_BOWTIE_INDEX:

TOPHAT_VERSION:

TOPHAT_OPTIONS:

BOWTIE_VERSION:

SAMTOOLS_VERSION:
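
The three TopHat entries above share most of their parameter keys and differ only in the annotation and index keys. As a minimal sketch, the difference can be computed with set operations on the key lists copied from those entries. The helper name split_keys is hypothetical and is not part of Omics Pipe.

```python
# Parameter keys copied from the three TopHat entries above.
TOPHAT_KEYS = {
    "tophat": {
        "RAW_DATA_DIR", "REF_GENES", "TOPHAT_RESULTS", "BOWTIE_INDEX",
        "TOPHAT_VERSION", "TOPHAT_OPTIONS", "BOWTIE_VERSION", "SAMTOOLS_VERSION",
    },
    "tophat_miRNA": {
        "RAW_DATA_DIR", "miRNA_GTF", "TOPHAT_RESULTS", "miRNA_BOWTIE_INDEX",
        "TOPHAT_VERSION", "TOPHAT_OPTIONS", "BOWTIE_VERSION", "SAMTOOLS_VERSION",
    },
    "tophat_ncRNA": {
        "RAW_DATA_DIR", "REF_GENES", "TOPHAT_RESULTS", "NONCODE_BOWTIE_INDEX",
        "TOPHAT_VERSION", "TOPHAT_OPTIONS", "BOWTIE_VERSION", "SAMTOOLS_VERSION",
    },
}


def split_keys(key_sets):
    """Return the keys shared by every entry and, per entry, the keys not shared by all."""
    shared = set.intersection(*key_sets.values())
    specific = {name: sorted(keys - shared) for name, keys in key_sets.items()}
    return sorted(shared), specific


shared, specific = split_keys(TOPHAT_KEYS)
print("shared:", shared)
for name, keys in specific.items():
    print(name, "only:", keys)
```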