All Available Modules
The modules available in the current release of Omics Pipe are listed below in alphabetical order. When creating a custom pipeline, you can choose from these modules or create your own.
- omics_pipe.modules.annotate_peaks.annotate_peaks(step, annotate_peaks_flag)
Runs HOMER to annotate peak track from ChIPseq data.
- input:
- .tag input file
- output:
- .txt file
- citation:
- Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
- link:
- http://homer.salk.edu/homer/
- parameters from parameters file:
PAIR_LIST:
HOMER_RESULTS:
HOMER_VERSION:
TEMP_DIR:
HOMER_GENOME:
HOMER_ANNOTATE_OPTIONS:
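Each module entry lists the keys it reads from the parameters file. A minimal sketch of checking a parameter set against such a list before a step is launched, assuming the parameters have already been loaded into a plain dict; the `missing_parameters` helper and the example values are hypothetical and not part of Omics Pipe:

```python
# Keys listed above for annotate_peaks.
ANNOTATE_PEAKS_KEYS = [
    "PAIR_LIST", "HOMER_RESULTS", "HOMER_VERSION",
    "TEMP_DIR", "HOMER_GENOME", "HOMER_ANNOTATE_OPTIONS",
]

def missing_parameters(params, required):
    """Return the required keys that are absent or set to None."""
    return [key for key in required if params.get(key) is None]

# Hypothetical, deliberately incomplete parameter set (placeholder values).
example_params = {
    "PAIR_LIST": "placeholder",
    "HOMER_RESULTS": "/path/to/homer_results",
    "TEMP_DIR": "/tmp",
}
print(missing_parameters(example_params, ANNOTATE_PEAKS_KEYS))
```

The same helper can be reused for any other module by swapping in that module's key list.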
- omics_pipe.modules.annotate_variants.annotate_variants(sample, annotate_variants_flag)
Annotates variants with the ANNOVAR variant annotator. Runs after Varcall.
- input:
- .vcf
- output:
- .vcf
- citation:
- Wang K, Li M, Hakonarson H. ANNOVAR: Functional annotation of genetic variants from next-generation sequencing data Nucleic Acids Research, 38:e164, 2010
- link:
- http://www.openbioinformatics.org/annovar/
- parameters from parameters file:
VARIANT_RESULTS:
ANNOVARDB:
ANNOVAR_OPTIONS:
ANNOVAR_OPTIONS2:
TEMP_DIR:
ANNOVAR_VERSION:
VCFTOOLS_VERSION:
- omics_pipe.modules.bowtie.bowtie(sample, bowtie_flag)
Runs Bowtie to align .fastq files.
- input:
- .fastq file
- output:
- sample.bam
- citation:
- Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10:R25
- link:
- http://bowtie-bio.sourceforge.net/index.shtml
- parameters from parameters file:
ENDS:
TRIMMED_DATA_PATH:
BOWTIE_OPTIONS:
BOWTIE_INDEX:
BOWTIE_RESULTS:
BOWTIE_VERSION:
SAMTOOLS_VERSION:
BEDTOOLS_VERSION:
TEMP_DIR:
- omics_pipe.modules.BreastCancer_RNA_report.BreastCancer_RNA_report(sample, BreastCancer_RNA_report_flag)
Runs R script with knitr to produce report from RNAseq pipeline.
- input:
- results from other steps in RNAseq pipelines
- output:
- html report
- citation:
- Meissner
- parameters from parameter file:
WORKING_DIR:
R_VERSION:
REPORT_RESULTS:
PARAMS_FILE:
TABIX_VERSION:
TUMOR_TYPE:
GENELIST:
COSMIC:
CLINVAR:
PHARMGKB_rsID:
PHARMGKB_Allele:
DRUGBANK:
CADD:
- omics_pipe.modules.bwa.bwa1(sample, bwa1_flag)
BWA aligner for read 1 of paired-end reads.
- input:
- .fastq
- output:
- .sam
- citation:
- Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
- link:
- http://bio-bwa.sourceforge.net/bwa.shtml
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BWA_VERSION:
BWA_INDEX:
RAW_DATA_DIR:
GATK_READ_GROUP_INFO:
COMPRESSION:
- omics_pipe.modules.bwa.bwa2(sample, bwa2_flag)
BWA aligner for read 2 of paired-end reads.
- input:
- .fastq
- output:
- .sam
- citation:
- Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
- link:
- http://bio-bwa.sourceforge.net/bwa.shtml
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BWA_VERSION:
BWA_INDEX:
RAW_DATA_DIR:
GATK_READ_GROUP_INFO:
COMPRESSION:
- omics_pipe.modules.bwa.bwa_RNA(sample, bwa_flag)
BWA aligner for single-end reads.
- input:
- .fastq
- output:
- .sam
- citation:
- Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
- link:
- http://bio-bwa.sourceforge.net/bwa.shtml
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BWA_VERSION:
BWA_INDEX:
RAW_DATA_DIR:
GATK_READ_GROUP_INFO:
COMPRESSION:
- omics_pipe.modules.bwa.bwa_mem(sample, bwa_mem_flag)
BWA aligner with BWA-MEM algorithm.
- input:
- .fastq
- output:
- .sam
- citation:
- Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
- link:
- http://bio-bwa.sourceforge.net/bwa.shtml
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BWA_VERSION:
GENOME:
RAW_DATA_DIR:
BWA_OPTIONS:
COMPRESSION:
- omics_pipe.modules.bwa.bwa_mem_pipe(sample, bwa_mem_pipe_flag)
BWA aligner with BWA-MEM algorithm.
- input:
- .fastq
- output:
- .sam
- citation:
- Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]
- link:
- http://bio-bwa.sourceforge.net/bwa.shtml
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BWA_VERSION:
GENOME:
RAW_DATA_DIR:
BWA_OPTIONS:
COMPRESSION:
SAMBAMBA_VERSION:
SAMBLASTER_VERSION:
SAMBAMBA_OPTIONS:
- omics_pipe.modules.call_variants.call_variants(sample, call_variants_flag)
Calls variants from alignment .bam files using Varcall.
- input:
- Aligned.out.sort.bam or accepted_hits.bam
- output:
- .vcf file
- citation:
- Erik Aronesty (2011). ea-utils: “Command-line tools for processing biological sequencing data”.
- link:
- https://code.google.com/p/ea-utils/wiki/Varcall
- parameters from parameters file:
STAR_RESULTS:
GENOME:
VARSCAN_PATH:
VARSCAN_OPTIONS:
VARIANT_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
ANNOVAR_VERSION:
VCFTOOLS_VERSION:
VARSCAN_VERSION:
SAMTOOLS_OPTIONS:
- omics_pipe.modules.ChIP_trim.ChIP_trim(sample, ChIP_trim_flag)
Runs HOMER tools to trim adapters from .fastq files.
- input:
- .fastq file
- output:
- .fastq file
- citation:
- Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
- link:
- http://homer.salk.edu/homer/
- parameters from parameters file:
ENDS:
RAW_DATA_DIR:
HOMER_TRIM_OPTIONS:
TRIMMED_DATA_PATH:
HOMER_VERSION:
- omics_pipe.modules.cuffdiff.cuffdiff(step, cuffdiff_flag)
Runs Cuffdiff to perform differential expression. Runs after Cufflinks. Part of Tuxedo Suite.
- input:
- .bam files
- output:
- differential expression results
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
CUFFDIFF_RESULTS:
GENOME:
CUFFDIFF_OPTIONS:
CUFFMERGE_RESULTS:
CUFFDIFF_INPUT_LIST_COND1:
CUFFDIFF_INPUT_LIST_COND2:
CUFFLINKS_VERSION:
- omics_pipe.modules.cuffdiff_miRNA.cuffdiff_miRNA(step, cuffdiff_miRNA_flag)
Runs Cuffdiff to perform differential expression. Runs after Cufflinks. Part of Tuxedo Suite.
- input:
- .bam files
- output:
- differential expression results
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
CUFFDIFF_RESULTS:
GENOME:
CUFFDIFF_OPTIONS:
CUFFMERGE_RESULTS:
CUFFDIFF_INPUT_LIST_COND1:
CUFFDIFF_INPUT_LIST_COND2:
CUFFLINKS_VERSION:
- omics_pipe.modules.cufflinks.cufflinks(sample, cufflinks_flag)
Runs cufflinks to assemble .bam files from TopHat.
- input:
- accepted_hits.bam
- output:
- transcripts.gtf
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
TOPHAT_RESULTS:
CUFFLINKS_RESULTS:
REF_GENES:
GENOME:
CUFFLINKS_OPTIONS:
CUFFLINKS_VERSION:
- omics_pipe.modules.cufflinks_miRNA.cufflinks_miRNA(sample, cufflinks_miRNA_flag)
Runs cufflinks to assemble .bam files from TopHat. Takes parameter miRNA_GTF.
- input:
- accepted_hits.bam
- output:
- transcripts.gtf
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
TOPHAT_RESULTS:
CUFFLINKS_RESULTS:
miRNA_GTF:
GENOME:
CUFFLINKS_OPTIONS:
CUFFLINKS_VERSION:
- omics_pipe.modules.cufflinks_ncRNA.cufflinks_ncRNA(sample, cufflinks_ncRNA_flag)
Runs cufflinks to assemble .bam files from TopHat. Takes parameters LNCRNA_GTF and NONCODE_FASTA.
- input:
- accepted_hits.bam
- output:
- transcripts.gtf
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
TOPHAT_RESULTS:
CUFFLINKS_RESULTS:
LNCRNA_GTF:
NONCODE_FASTA:
CUFFLINKS_OPTIONS:
CUFFLINKS_VERSION:
- omics_pipe.modules.cuffmerge.cuffmerge(step, cuffmerge_flag)
Runs cuffmerge to merge .gtf files from Cufflinks.
- input:
- assembly_GTF_list.txt
- output:
- merged.gtf
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
CUFFMERGE_RESULTS:
REF_GENES:
GENOME:
CUFFMERGE_OPTIONS:
CUFFLINKS_VERSION:
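The cuffmerge input, assembly_GTF_list.txt, is a plain text file naming one Cufflinks GTF per line. A minimal sketch of producing such a list, assuming hypothetical per-sample paths; whether Omics Pipe writes this file itself is not stated here, and `write_assembly_list` is an illustrative helper only:

```python
import os
import tempfile

def write_assembly_list(gtf_paths, list_path):
    """Write one GTF path per line, the layout cuffmerge reads as its list file."""
    with open(list_path, "w") as handle:
        for path in gtf_paths:
            handle.write(path + "\n")
    return list_path

# Hypothetical per-sample Cufflinks outputs.
gtfs = ["sample1/transcripts.gtf", "sample2/transcripts.gtf"]
out = write_assembly_list(
    gtfs, os.path.join(tempfile.mkdtemp(), "assembly_GTF_list.txt")
)
with open(out) as handle:
    print(handle.read())
```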
- omics_pipe.modules.cuffmerge_miRNA.cuffmerge_miRNA(step, cuffmerge_miRNA_flag)
Runs cuffmerge to merge .gtf files from Cufflinks.
- input:
- assembly_GTF_list.txt
- output:
- merged.gtf
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
CUFFMERGE_RESULTS:
miRNA_GTF:
GENOME:
CUFFMERGE_OPTIONS:
CUFFLINKS_VERSION:
- omics_pipe.modules.cuffmergetocompare.cuffmergetocompare(step, cuffmergetocompare_flag)
Runs cuffcompare to annotate merged .gtf files from Cufflinks.
- input:
- assembly_GTF_list.txt
- output:
- merged.gtf
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
CUFFMERGE_RESULTS:
REF_GENES:
GENOME:
CUFFMERGETOCOMPARE_OPTIONS:
CUFFLINKS_VERSION:
- omics_pipe.modules.cuffmergetocompare_miRNA.cuffmergetocompare_miRNA(step, cuffmergetocompare_miRNA_flag)
Runs cuffcompare to annotate merged .gtf files from Cufflinks.
- input:
- assembly_GTF_list.txt
- output:
- merged.gtf
- citation:
- Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation Nature Biotechnology doi:10.1038/nbt.1621
- link:
- http://cufflinks.cbcb.umd.edu/
- parameters from parameters file:
CUFFMERGE_RESULTS:
miRNA_GTF:
GENOME:
CUFFMERGETOCOMPARE_OPTIONS:
CUFFLINKS_VERSION:
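The Tuxedo-suite entries imply a run order: Cufflinks assembles the accepted_hits.bam from TopHat, cuffmerge merges the Cufflinks .gtf files, and Cuffdiff reads CUFFMERGE_RESULTS. A small sketch of deriving that order from the stated dependencies, using Python's standard `graphlib` (Python 3.9+); the `depends_on` mapping is an illustration read off the descriptions above, not Omics Pipe's own scheduler:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each step maps to the steps whose output it consumes.
depends_on = {
    "tophat": set(),
    "cufflinks": {"tophat"},     # assembles accepted_hits.bam from TopHat
    "cuffmerge": {"cufflinks"},  # merges transcripts.gtf files
    "cuffdiff": {"cuffmerge"},   # reads CUFFMERGE_RESULTS
}

order = list(TopologicalSorter(depends_on).static_order())
print(order)
```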
- omics_pipe.modules.custom_R_report.custom_R_report(sample, custom_R_report_flag)
Runs R script with knitr to produce report from omics pipeline.
- input:
- results from other steps in RNAseq pipelines
- output:
- html report
- citation:
- Meissner
- parameters from parameter file:
REPORT_SCRIPT:
R_VERSION:
REPORT_RESULTS:
R_MARKUP_FILE:
DPS_VERSION:
PARAMS_FILE:
- omics_pipe.modules.cutadapt_miRNA.cutadapt_miRNA(sample, cutadapt_miRNA_flag)
Runs Cutadapt to trim adapters from reads.
- input:
- .fastq
- output:
- .fastq
- citation:
- Martin 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17: 10-12.
- link:
- https://code.google.com/p/cutadapt/
- parameters from parameters file:
RAW_DATA_DIR:
ADAPTER:
TRIMMED_DATA_PATH:
PYTHON_VERSION:
- omics_pipe.modules.fastq_length_filter_miRNA.fastq_length_filter_miRNA(sample, fastq_length_filter_miRNA_flag)
Runs custom Python script to filter miRNA reads by length.
- input:
- .fastq
- output:
- .fastq
- parameters from parameters file:
TRIMMED_DATA_PATH:
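The filtering script itself is not shown on this page. As a rough sketch of what length filtering of FASTQ reads involves, assuming 4-line FASTQ records and hypothetical cutoffs of 18 and 30 nt (the real script's thresholds and I/O are not documented here):

```python
def parse_fastq(lines):
    """Group an iterable of FASTQ lines into (header, sequence, quality) records."""
    lines = [line.rstrip("\n") for line in lines]
    for i in range(0, len(lines) - 3, 4):
        # lines[i + 2] is the '+' separator line and is not kept.
        yield lines[i], lines[i + 1], lines[i + 3]

def filter_fastq_by_length(records, min_len, max_len):
    """Yield only the records whose read length lies within [min_len, max_len]."""
    for header, seq, qual in records:
        if min_len <= len(seq) <= max_len:
            yield header, seq, qual

# Two made-up reads: one 22 nt long, one 4 nt long.
seq1 = "TGAGGTAGTAGGTTGTATAGTT"
fastq_lines = [
    "@read1", seq1, "+", "I" * len(seq1),
    "@read2", "ACGT", "+", "IIII",
]

# Hypothetical cutoffs, chosen only for illustration.
kept = list(filter_fastq_by_length(parse_fastq(fastq_lines), 18, 30))
print([header for header, _, _ in kept])
```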
- omics_pipe.modules.fastqc.fastqc(sample, fastqc_flag)
QC check of raw .fastq files using FASTQC.
- input:
- .fastq file
- output:
- folder and zipped folder containing html, txt and image files
- citation:
- Babraham Bioinformatics
- link:
- http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- parameters from parameters file:
RAW_DATA_DIR:
QC_PATH:
FASTQC_VERSION:
COMPRESSION:
- omics_pipe.modules.fastqc_miRNA.fastqc_miRNA(sample, fastqc_miRNA_flag)
QC check of raw .fastq files using FASTQC.
- input:
- .fastq file
- output:
- folder and zipped folder containing html, txt and image files
- citation:
- Babraham Bioinformatics
- link:
- http://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- parameters from parameters file:
RAW_DATA_DIR:
QC_PATH:
FASTQC_VERSION:
- omics_pipe.modules.filter_variants.filter_variants(sample, filter_variants_flag)
Filters variants to remove common variants.
- input:
- .bam or .sam file
- output:
- .vcf file
- citation:
- Piskol et al. 2013. Reliable identification of genomic variants from RNA-seq data. The American Journal of Human Genetics 93: 641-651.
- link:
- http://lilab.stanford.edu/SNPiR/
- parameters from parameters file:
VARIANT_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BWA_VERSION:
PICARD_VERSION:
GATK_VERSION:
BEDTOOLS_VERSION:
UCSC_TOOLS_VERSION:
GENOME:
REPEAT_MASKER:
SNPIR_ANNOTATION:
RNA_EDIT:
DBSNP:
MILLS:
G1000:
WORKING_DIR:
BWA_RESULTS:
SNPIR_VERSION:
SNPIR_CONFIG:
SNPIR_DIR:
SNPEFF_VERSION:
dbNSFP:
VCFTOOLS_VERSION:
SNP_FILTER_OUT_REF:
- omics_pipe.modules.find_motifs.find_motifs(step, find_motifs_flag)
Runs HOMER to find motifs from ChIPseq data.
- input:
- .txt peak file from Homer
- output:
- .txt file
- citation:
- Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
- link:
- http://homer.salk.edu/homer/
- parameters from parameters file:
PAIR_LIST:
HOMER_RESULTS:
HOMER_VERSION:
TEMP_DIR:
HOMER_GENOME:
HOMER_MOTIFS_OPTIONS:
- omics_pipe.modules.fusion_catcher.fusion_catcher(sample, fusion_catcher_flag)
Detects fusion genes in paired-end RNAseq data.
- input:
- paired end .fastq files
- output:
- list of candidate fusion genes
- citation:
- S. Kangaspeska, S. Hultsch, H. Edgren, D. Nicorici, A. Murumägi, O.P. Kallioniemi, Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms, PLOS One, Oct. 2012. http://dx.plos.org/10.1371/journal.pone.0048745
- link:
- https://code.google.com/p/fusioncatcher
- parameters from parameters file:
ENDS:
RAW_DATA_DIR:
FUSION_RESULTS:
FUSIONCATCHERBUILD_DIR:
TEMP_DIR:
SAMTOOLS_VERSION:
FUSIONCATCHER_VERSION:
FUSIONCATCHER_OPTIONS:
TISSUE:
PYTHON_VERSION:
- omics_pipe.modules.GATK_preprocessing_WES.GATK_preprocessing_WES(sample, GATK_preprocessing_WES_flag)
GATK preprocessing steps for whole exome sequencing.
- input:
- sorted.rg.md.bam
- output:
- .ready.bam
- citation:
- McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
- link:
- http://www.broadinstitute.org/gatk/
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
GATK_VERSION:
GENOME:
DBSNP:
MILLS:
G1000:
CAPTURE_KIT_BED:
SAMTOOLS_VERSION:
- omics_pipe.modules.GATK_preprocessing_WGS.GATK_preprocessing_WGS(sample, GATK_preprocessing_WGS_flag)
GATK preprocessing steps for whole genome sequencing.
- input:
- sorted.rg.md.bam
- output:
- .ready.bam
- citation:
- McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
- link:
- http://www.broadinstitute.org/gatk/
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
GATK_VERSION:
GENOME:
DBSNP:
MILLS:
G1000:
SAMTOOLS_VERSION:
- omics_pipe.modules.GATK_variant_discovery.GATK_variant_discovery(sample, GATK_variant_discovery_flag)
Runs GATK variant discovery.
- input:
- sorted.rg.md.bam
- output:
- .ready.bam
- citation:
- McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
- link:
- http://www.broadinstitute.org/gatk/
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
GATK_VERSION:
GENOME:
DBSNP:
VARIANT_RESULTS:
- omics_pipe.modules.GATK_variant_filtering.GATK_variant_filtering(sample, GATK_variant_filtering_flag)
Runs GATK variant filtering.
- input:
- sorted.rg.md.bam
- output:
- .ready.bam
- citation:
- McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
- link:
- http://www.broadinstitute.org/gatk/
- parameters from parameters file:
VARIANT_RESULTS:
TEMP_DIR:
GATK_VERSION:
GENOME:
DBSNP:
MILLS:
OMNI:
HAPMAP:
R_VERSION:
G1000_SNPs:
G1000_Indels:
- omics_pipe.modules.GATK_variant_filtering.GATK_variant_filtering_group(sample, GATK_variant_filtering_group_flag)
Runs GATK variant filtering.
- input:
- sorted.rg.md.bam
- output:
- .ready.bam
- citation:
- McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297-303.
- link:
- http://www.broadinstitute.org/gatk/
- parameters from parameters file:
VARIANT_RESULTS:
TEMP_DIR:
GATK_VERSION:
GENOME:
DBSNP:
MILLS_G1000:
OMNI:
HAPMAP:
R_VERSION:
G1000:
- omics_pipe.modules.homer_peaks.homer_peaks(step, homer_peaks_flag)
Runs HOMER to call peaks from ChIPseq data.
- input:
- .tag input file
- output:
- .txt file
- citation:
- Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
- link:
- http://homer.salk.edu/homer/
- parameters from parameters file:
PAIR_LIST:
HOMER_RESULTS:
HOMER_PEAKS_OPTIONS:
HOMER_VERSION:
TEMP_DIR:
- omics_pipe.modules.htseq.htseq(sample, htseq_flag)
Runs htseq-count to get raw count data from alignments.
- input:
- Aligned.out.sort.bam
- output:
- counts.txt
- citation:
- Simon Anders, EMBL
- link:
- http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
- parameters from parameters file:
STAR_RESULTS:
HTSEQ_OPTIONS:
REF_GENES:
HTSEQ_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BAM_FILE_NAME:
PYTHON_VERSION:
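htseq-count writes tab-separated feature/count pairs, and recent versions append summary counters whose names begin with `__` (for example `__no_feature`). A minimal sketch of reading such output, assuming counts.txt is the unmodified htseq-count output; the gene names and numbers below are made up and `read_htseq_counts` is a hypothetical helper:

```python
def read_htseq_counts(lines):
    """Split htseq-count output lines into gene counts and summary counters."""
    genes, summary = {}, {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        feature, count = line.split("\t")
        # Counters such as __no_feature go to the summary, everything else to genes.
        target = summary if feature.startswith("__") else genes
        target[feature] = int(count)
    return genes, summary

# Made-up example content for illustration.
example = ["GENE_A\t120", "GENE_B\t0", "__no_feature\t35", "__ambiguous\t4"]
genes, summary = read_htseq_counts(example)
print(genes)
print(summary)
```

Keeping the summary counters separate avoids treating them as genes in downstream differential expression steps.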
- omics_pipe.modules.htseq_gencode.htseq_gencode(sample, htseq_flag)
Runs htseq-count to get raw count data from alignments.
- input:
- Aligned.out.sort.bam
- output:
- counts.txt
- citation:
- Simon Anders, EMBL
- link:
- http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
- parameters from parameters file:
STAR_RESULTS:
HTSEQ_OPTIONS:
REF_GENES_GENCODE:
HTSEQ_GENCODE_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BAM_FILE_NAME:
- omics_pipe.modules.htseq_miRNA.htseq_miRNA(sample, htseq_miRNA_flag)
Runs htseq-count to get raw count data from alignments.
- input:
- Aligned.out.sort.bam
- output:
- counts.txt
- citation:
- Simon Anders, EMBL
- link:
- http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html
- parameters from parameters file:
TOPHAT_RESULTS:
HTSEQ_OPTIONS:
miRNA_GFF:
HTSEQ_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BAM_FILE_NAME:
- omics_pipe.modules.intogen.intogen(sample, intogen_flag)
Runs Intogen to rank mutations and their implications for cancer phenotype. Follows variant calling.
- input:
- .vcf
- output:
- variant list
- citation:
- Gonzalez-Perez et al. 2013. Intogen mutations identifies cancer drivers across tumor types. Nature Methods 10, 1081-1082.
- link:
- http://www.intogen.org/
- parameters from parameter file:
VCF_FILE:
INTOGEN_OPTIONS:
INTOGEN_RESULTS:
INTOGEN_VERSION:
USERNAME:
WORKING_DIR:
TEMP_DIR:
SCHEDULER:
VARIANT_RESULTS:
- omics_pipe.modules.macs.macs(step, macs_flag)
Runs MACS to call peaks from ChIPseq data.
- input:
- .fastq file
- output:
- peaks and .bed file
- citation:
- Zhang et al. Model-based Analysis of ChIP-Seq (MACS). Genome Biol (2008) vol. 9 (9) pp. R137
- link:
- http://liulab.dfci.harvard.edu/MACS/
- parameters from parameters file:
PAIR_LIST:
BOWTIE_RESULTS:
CHROM_SIZES:
MACS_RESULTS:
MACS_VERSION:
TEMP_DIR:
BEDTOOLS_VERSION:
PYTHON_VERSION:
- omics_pipe.modules.mutect.mutect(sample, mutect_flag)
Runs MuTect on paired tumor/normal samples to detect somatic point mutations in cancer genomes.
- input:
- .bam
- output:
- call_stats.txt
- citation:
- Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnology (2013).doi:10.1038/nbt.2514
- link:
- http://www.broadinstitute.org/cancer/cga/mutect
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
GATK_VERSION:
GENOME:
DBSNP:
MILLS:
G1000:
CAPTURE_KIT_BED:
- omics_pipe.modules.peak_track.peak_track(step, peak_track_flag)
Runs HOMER to create peak track from ChIPseq data.
- input:
- .tag input file
- output:
- .txt file
- citation:
- Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
- link:
- http://homer.salk.edu/homer/
- parameters from parameters file:
PAIR_LIST:
HOMER_RESULTS:
HOMER_VERSION:
TEMP_DIR:
- omics_pipe.modules.picard_mark_duplicates.picard_mark_duplicates(sample, picard_mark_duplicates_flag)
Picard tools Mark Duplicates.
- input:
- sorted.bam
- output:
- _sorted.rg.md.bam
- citation:
- http://picard.sourceforge.net/
- link:
- http://picard.sourceforge.net/
- parameters from parameters file:
BWA_RESULTS:
TEMP_DIR:
PICARD_VERSION:
SAMTOOLS_VERSION:
- omics_pipe.modules.read_density.read_density(sample, read_density_flag)
Runs HOMER to visualize read density from ChIPseq data.
- input:
- .bam file
- output:
- .txt file
- citation:
- Heinz S, Benner C, Spann N, Bertolino E et al. Simple Combinations of Lineage-Determining Transcription Factors Prime cis-Regulatory Elements Required for Macrophage and B Cell Identities. Mol Cell 2010 May 28;38(4):576-589. PMID: 20513432
- link:
- http://homer.salk.edu/homer/
- parameters from parameters file:
BOWTIE_RESULTS:
CHROM_SIZES:
HOMER_RESULTS:
HOMER_VERSION:
TEMP_DIR:
- omics_pipe.modules.RNAseq_QC.RNAseq_QC(sample, RNAseq_QC_flag)
Runs rseqc to determine insert size as QC for alignment.
- input:
- .bam
- output:
- pdf plot
- link:
- http://rseqc.sourceforge.net/
- parameters from parameters file:
STAR_RESULTS:
QC_PATH:
BAM_FILE_NAME:
RSEQC_REF:
TEMP_DIR:
PICARD_VERSION:
R_VERSION:
- omics_pipe.modules.RNAseq_report.RNAseq_report(sample, RNAseq_report_flag)
Runs R script with knitr to produce report from RNAseq pipeline.
- input:
- results from other steps in RNAseq pipelines
- output:
- html report
- citation:
- Meissner
- parameters from parameter file:
REPORT_SCRIPT:
R_VERSION:
REPORT_RESULTS:
R_MARKUP_FILE:
DPS_VERSION:
PARAMS_FILE:
- omics_pipe.modules.RNAseq_report_counts.RNAseq_report_counts(sample, RNAseq_report_counts_flag)
Runs R script with knitr to produce report from RNAseq pipeline.
- input:
- results from other steps in RNAseq pipelines
- output:
- html report
- citation:
- Meissner
- parameters from parameter file:
WORKING_DIR:
R_VERSION:
REPORT_RESULTS:
PARAMS_FILE:
- omics_pipe.modules.RNAseq_report_tuxedo.RNAseq_report_tuxedo(sample, RNAseq_report_tuxedo_flag)
Runs R script with knitr to produce report from RNAseq pipeline.
- input:
- results from other steps in RNAseq pipelines
- output:
- html report
- citation:
- Meissner
- parameters from parameter file:
WORKING_DIR:
R_VERSION:
REPORT_RESULTS:
DPS_VERSION:
PARAMS_FILE:
- omics_pipe.modules.rseqc.rseqc(sample, rseqc_flag)
Runs rseqc to determine insert size as QC for alignment.
- input:
- .bam
- output:
- pdf plot
- link:
- http://rseqc.sourceforge.net/
- parameters from parameters file:
STAR_RESULTS:
QC_PATH:
BAM_FILE_NAME:
RSEQC_REF:
RSEQC_VERSION:
TEMP_DIR:
- omics_pipe.modules.snpir_variants.snpir_variants(sample, snpir_variants_flag)
Calls variants using SNPIR pipeline.
- input:
- Aligned.out.sort.bam or accepted_hits.bam
- output:
- final_variants.vcf file
- citation:
- Piskol, R., et al. (2013). “Reliable Identification of Genomic Variants from RNA-Seq Data.” The American Journal of Human Genetics 93(4): 641-651.
- link:
- http://lilab.stanford.edu/SNPiR/
- parameters from parameters file:
VARIANT_RESULTS:
TEMP_DIR:
SAMTOOLS_VERSION:
BWA_VERSION:
PICARD_VERSION:
GATK_VERSION:
BEDTOOLS_VERSION:
UCSC_TOOLS_VERSION:
GENOME:
REPEAT_MASKER:
SNPIR_ANNOTATION:
RNA_EDIT:
DBSNP:
MILLS:
G1000:
WORKING_DIR:
BWA_RESULTS:
SNPIR_VERSION:
SNPIR_CONFIG:
SNPIR_DIR:
ENCODING:
- omics_pipe.modules.star.star(sample, star_flag)
Runs STAR to align .fastq files.
- input:
- .fastq file
- output:
- Aligned.out.bam
- citation:
- Dobin et al, Bioinformatics 2012; doi: 10.1093/bioinformatics/bts635 “STAR: ultrafast universal RNA-seq aligner”
- link:
- https://code.google.com/p/rna-star/
- parameters from parameters file:
ENDS:
RAW_DATA_DIR:
STAR_INDEX:
STAR_OPTIONS:
STAR_RESULTS:
SAMTOOLS_VERSION:
STAR_VERSION:
COMPRESSION:
REF_GENES:
- omics_pipe.modules.star_piRNA.star_piRNA(sample, star_flag)
Runs STAR to align .fastq files.
- input:
- .fastq file
- output:
- Aligned.out.bam
- citation:
- Dobin et al, Bioinformatics 2012; doi: 10.1093/bioinformatics/bts635 “STAR: ultrafast universal RNA-seq aligner”
- link:
- https://code.google.com/p/rna-star/
- parameters from parameters file:
ENDS:
RAW_DATA_DIR:
STAR_INDEX:
STAR_OPTIONS:
STAR_RESULTS:
SAMTOOLS_VERSION:
STAR_VERSION:
- omics_pipe.modules.TCGA_download.TCGA_download(sample, TCGA_download_flag)
Downloads and unzips TCGA data from Manifest.xml downloaded from CGHub.
- input:
- TCGA XML file
- output:
- downloaded files from TCGA
- citation:
- The Cancer Genome Atlas
- link:
- https://cghub.ucsc.edu/software/downloads.html
- parameters from parameters file:
TCGA_XML_FILE:
TCGA_KEY:
TCGA_OUTPUT_PATH:
CGATOOLS_VERSION:
- omics_pipe.modules.tophat.tophat(sample, tophat_flag)
Runs TopHat to align .fastq files.
- input:
- .fastq file
- output:
- accepted_hits.bam
- citation:
- Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics doi:10.1093/bioinformatics/btp120
- link:
- http://tophat.cbcb.umd.edu/
- parameters from parameters file:
RAW_DATA_DIR:
REF_GENES:
TOPHAT_RESULTS:
BOWTIE_INDEX:
TOPHAT_VERSION:
TOPHAT_OPTIONS:
BOWTIE_VERSION:
SAMTOOLS_VERSION:
- omics_pipe.modules.tophat_miRNA.tophat_miRNA(sample, tophat_miRNA_flag)
Runs TopHat to align .fastq files.
- input:
- .fastq file
- output:
- accepted_hits.bam
- citation:
- Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics doi:10.1093/bioinformatics/btp120
- link:
- http://tophat.cbcb.umd.edu/
- parameters from parameters file:
RAW_DATA_DIR:
miRNA_GTF:
TOPHAT_RESULTS:
miRNA_BOWTIE_INDEX:
TOPHAT_VERSION:
TOPHAT_OPTIONS:
BOWTIE_VERSION:
SAMTOOLS_VERSION:
- omics_pipe.modules.tophat_ncRNA.tophat_ncRNA(sample, tophat_ncRNA_flag)
Runs TopHat to align .fastq files.
- input:
- .fastq file
- output:
- accepted_hits.bam
- citation:
- Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics doi:10.1093/bioinformatics/btp120
- link:
- http://tophat.cbcb.umd.edu/
- parameters from parameters file:
RAW_DATA_DIR:
REF_GENES:
TOPHAT_RESULTS:
NONCODE_BOWTIE_INDEX:
TOPHAT_VERSION:
TOPHAT_OPTIONS:
BOWTIE_VERSION:
SAMTOOLS_VERSION: