Overview

Broad-Enrich tests sets of broad genomic regions (e.g., from ChIP-seq data for histone modifications or copy number variations) for enriched biological pathways, Gene Ontology terms, or other gene sets. The pre-defined gene sets are the same as used in LRpath, and can be browsed here. Using an input .BED file, Broad-Enrich determines the proportion of each gene locus covered by a peak, using a chosen "gene locus definition". The "locus" of a gene is the region from which the gene is predicted to be regulated. Broad-Enrich uses a logistic regression model to test for association between the proportion of each gene locus covered by a peak and gene set membership. It empirically adjusts for the bias due to locus length using a binomial cubic smoothing spline within the logistic model. Detailed methods are provided here. Output includes summary plots, peak to gene assignments, and enrichment (and depletion) results including odds ratio, p-value, and FDR for each gene set.
Broad-Enrich is also available as part of the Chip-Enrich R package: Broad-Enrich.zip
Vignette: pdf
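
The "proportion of each gene locus covered by a peak" can be illustrated with a small sketch. This is not the Broad-Enrich implementation. The function name covered_fraction is hypothetical, and the code assumes half-open, BED-style coordinates on a single chromosome. Overlapping peaks are merged so that no base is counted twice.

```python
def covered_fraction(locus_start, locus_end, peaks):
    """Return the fraction of [locus_start, locus_end) covered by peaks.

    peaks is an iterable of (start, end) pairs on the same chromosome.
    Hypothetical helper for illustration only.
    """
    # Keep only peaks that overlap the locus, clipped to its boundaries.
    clipped = sorted(
        (max(start, locus_start), min(end, locus_end))
        for start, end in peaks
        if start < locus_end and end > locus_start
    )

    covered = 0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # A new, disjoint block begins: bank the previous block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent peak: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start

    return covered / (locus_end - locus_start)


# Three peaks, two of which overlap and one of which extends past the locus.
print(covered_fraction(1000, 2000, [(900, 1200), (1100, 1300), (1800, 2500)]))  # 0.5
```

In the model described above, a per-gene value of this kind would be the predictor tested against gene set membership.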

Run Analysis

Select input file

The following formats are fully supported via their file extensions: .bed, .broadPeak, .narrowPeak, .gff3, .gff2, .gff, and .bedGraph or .bdg. BED3 through BED6 files are supported under the .bed extension. Files without these extensions are supported provided that the first 3 columns correspond to 'chr', 'start', and 'end' and that there is either no header line or the header line is commented out.
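
As a rough check of the condition for files without a recognized extension, the sketch below reads the first three columns of each line as 'chr', 'start', and 'end'. It is not the parser used by the tool. The name read_regions is hypothetical, and treating a leading "#" as the comment marker and splitting on whitespace are assumptions.

```python
def read_regions(lines):
    """Parse (chrom, start, end) tuples from an iterable of text lines.

    Blank lines and lines starting with '#' (assumed comment marker)
    are skipped. Any columns after the third are ignored.
    """
    regions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"expected at least 3 columns, got: {line!r}")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        regions.append((chrom, start, end))
    return regions


example = [
    "#chr\tstart\tend\tname",
    "chr1\t1000\t5000\tpeak_1",
    "chr2\t200\t900\tpeak_2",
]
print(read_regions(example))
```

An uncommented header line such as "chr start end" would fail the integer conversion here, which is the kind of input the condition above rules out.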

Analysis Name
Please provide a meaningful name for this analysis (used to name output files).

Email
Please provide your email address if you wish to be notified when the analysis has been completed.

Supported Genomes



Annotation Databases
Selecting multiple annotation databases, or a single large one, may require several minutes of computation time. For approximate Chip-Enrich running times against different databases, view this table.

Locus Definitions
  • < 1kb
    (only use peaks within 1kb of a transcription start site)
  • < 5kb
    (only use peaks within 5kb of a transcription start site)
  • > 5kb upstream
    (only use peaks greater than 5kb upstream of a transcription start site)
  • < 10kb
    (only use peaks within 10kb of a transcription start site)
  • > 10kb upstream
    (only use peaks greater than 10kb upstream of a transcription start site)
  • Exon
    (only use peaks that fall within an annotated exon)
  • Intron
    (only use peaks that fall within an annotated intron)
  • Nearest Gene
    (use all peaks; assign peaks to the nearest gene defined by transcription start and end sites)
  • Nearest TSS
    (use all peaks; assign peaks to the gene with the closest TSS)
  • User Defined
    (user can input their own locus definition)
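
To make the "Nearest TSS" definition concrete, here is a minimal sketch of assigning a genomic position to the gene with the closest transcription start site. It is illustrative only and not the tool's code. The function name nearest_tss_gene and the input layout are hypothetical, and using a single position (for example a peak midpoint) to represent a region is an assumption.

```python
import bisect


def nearest_tss_gene(position, tss_sorted):
    """Return the gene whose TSS is closest to position.

    tss_sorted is a non-empty list of (tss_position, gene_id) pairs for
    one chromosome, sorted by tss_position. Ties go to the upstream entry.
    """
    positions = [tss for tss, _ in tss_sorted]
    i = bisect.bisect_left(positions, position)
    # Only the TSS just before and just after the position can be nearest.
    candidates = tss_sorted[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda entry: abs(entry[0] - position))[1]


tss = [(1000, "GENE_A"), (5000, "GENE_B"), (9000, "GENE_C")]
print(nearest_tss_gene(2900, tss))  # GENE_A
print(nearest_tss_gene(3100, tss))  # GENE_B
```

The distance-based definitions (for example "< 5kb") could be sketched in the same way by additionally discarding positions whose distance to the nearest TSS exceeds the chosen threshold.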
Filter: Only test gene sets with fewer than the following number of genes:
Filter value should be numeric and greater than 30. It can be used to remove large, vague gene sets such as "binding".
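
The effect of this filter can be sketched as follows. The function name filter_gene_sets and the dictionary layout are hypothetical, and the example gene sets are made up for illustration. Only the "fewer than" comparison and the "greater than 30" rule come from the text above.

```python
def filter_gene_sets(gene_sets, max_genes):
    """Keep only gene sets with fewer than max_genes genes.

    gene_sets maps a gene set name to a collection of gene identifiers.
    """
    if max_genes <= 30:
        raise ValueError("filter value must be greater than 30")
    return {
        name: genes
        for name, genes in gene_sets.items()
        if len(genes) < max_genes
    }


# Made-up gene sets: one focused set and one very large, vague set.
gene_sets = {
    "focused_pathway": set(range(50)),
    "binding": set(range(5000)),
}
print(sorted(filter_gene_sets(gene_sets, 2000)))  # ['focused_pathway']
```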

Adjust for the mappability of the gene locus regions
  • True
  • False
 


Please reference the following publication when citing Broad-Enrich:
  1. Cavalcante RG, Lee C, Welch RP, Patil S, Weymouth T, Scott LJ and Sartor MA. "Broad-Enrich: Functional interpretation of large sets of broad genomic regions".
An older version of Chip-Enrich is available here.