Broad-Enrich: gene set enrichment testing for sets of broad genomic regions
Broad-Enrich tests sets of broad genomic regions (e.g., from ChIP-seq data for histone modifications or copy number variations) for enriched biological pathways, Gene Ontology terms, or other gene sets. The pre-defined gene sets are the same as those used in LRpath, and can be browsed here. Using an input .bed, .narrowPeak, or .broadPeak file, Broad-Enrich determines the proportion of each gene locus covered by a peak, using a chosen "gene locus definition". The "locus" of a gene is the region from which the gene is predicted to be regulated. Broad-Enrich uses a logistic regression model to test for association between the proportion of each gene locus covered by a peak and gene set membership. It empirically adjusts for the bias due to locus length using a binomial cubic smoothing spline within the logistic model. Detailed methods are provided here. Output includes summary plots, peak-to-gene assignments, and enrichment (and depletion) results, including the odds ratio, p-value, and FDR for each gene set.
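To make the "proportion of each gene locus covered by a peak" concrete, here is a minimal Python sketch of that calculation for a single locus. It is an illustration only, not the Broad-Enrich implementation: the function name `covered_fraction`, the half-open coordinate convention, and the choice to count overlapping peaks once are all assumptions.

```python
def covered_fraction(locus_start, locus_end, peaks):
    """Fraction of the half-open locus [locus_start, locus_end) covered by peaks.

    `peaks` is an iterable of (start, end) pairs on the same chromosome.
    Overlapping peaks are counted once (an assumption of this sketch).
    """
    locus_len = locus_end - locus_start
    if locus_len <= 0:
        raise ValueError("locus must have positive length")

    # Clip each peak to the locus and drop peaks that miss it entirely.
    clipped = sorted(
        (max(start, locus_start), min(end, locus_end))
        for start, end in peaks
        if start < locus_end and end > locus_start
    )

    covered = 0
    covered_up_to = locus_start  # right edge of the bases counted so far
    for start, end in clipped:
        if end <= covered_up_to:
            continue  # peak lies fully inside an already counted stretch
        covered += end - max(start, covered_up_to)
        covered_up_to = end

    return covered / locus_len


# Hypothetical locus and peaks, for illustration only.
fraction = covered_fraction(100, 200, [(90, 120), (110, 130), (180, 250)])
```

In this sketch the resulting per-gene fraction would play the role of the quantity that the logistic model relates to gene set membership; the spline adjustment for locus length is not shown.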

Broad-Enrich is also available as part of the ChIP-Enrich R package:
Vignette: pdf

Select input file

The input file should be a standard .bed, .narrowPeak, or .broadPeak file containing at least three columns: (1) chromosome (of the form "chr3"), (2) start position, and (3) end position. Additional columns will be ignored.
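As a rough illustration of the expected input, the following Python sketch reads the first three columns of each line and ignores the rest. The function name `parse_peaks` is hypothetical, and skipping blank, comment, `track`, and `browser` lines is an assumption about header handling rather than documented Broad-Enrich behavior.

```python
def parse_peaks(lines):
    """Return (chromosome, start, end) tuples from BED-like lines.

    Only the first three whitespace-separated columns are used;
    any additional columns are ignored.
    """
    peaks = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        # Assumed header handling: skip blanks, comments, and UCSC header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"line {line_number}: expected at least 3 columns")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        peaks.append((chrom, start, end))
    return peaks


# Hypothetical file contents, for illustration only.
example = [
    "track name=example",
    "chr3\t100\t200\tpeak1\t5",
    "",
    "chr1 5 10",
]
peaks = parse_peaks(example)
```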

Analysis Name
Please provide a meaningful name for this analysis (used to name output files).

Please provide your email address to be notified when the analysis is complete.

Supported Genomes

Annotation Databases
Selecting multiple annotation databases, or a single large one, may require several minutes of computation time. For approximate Broad-Enrich running times against different databases, view this table.

Filter
Only test gene sets with fewer than the following number of genes:
The filter value should be numeric and greater than 30. It can be used to remove large, vague gene sets such as "binding".
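The effect of this filter can be sketched in a few lines of Python. The function name `filter_gene_sets` and the dictionary layout (gene set name mapped to a set of gene identifiers) are assumptions made for illustration; they are not part of Broad-Enrich.

```python
def filter_gene_sets(gene_sets, max_genes):
    """Keep only gene sets with fewer than `max_genes` genes."""
    if max_genes <= 30:
        raise ValueError("filter value must be greater than 30")
    return {
        name: genes
        for name, genes in gene_sets.items()
        if len(genes) < max_genes
    }


# Hypothetical gene sets, for illustration only.
gene_sets = {
    "binding": {f"gene{i}" for i in range(5000)},
    "small pathway": {"geneA", "geneB", "geneC"},
}
kept = filter_gene_sets(gene_sets, max_genes=2000)
```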

Locus Definition
  • 1kb
    (only use peaks within 1kb of a transcription start site)
  • 5kb
    (only use peaks within 5kb of a transcription start site)
  • Exon
    (only use peaks that fall within an annotated exon)
  • Nearest Gene
    (use all peaks; assign peaks to the nearest gene defined by transcription start and end sites)
  • Nearest TSS
    (use all peaks; assign peaks to the gene with the closest TSS)
  • User Defined
    (user can input their own locus definition)
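To illustrate how a locus definition turns gene annotations into regions, here is a minimal Python sketch for a "Nearest TSS"-style definition on one chromosome. It assumes each gene's locus extends to the midpoints between its TSS and the neighboring TSSs, so that every position falls in the locus of the gene with the closest TSS. The function name `nearest_tss_loci`, the single-TSS-per-gene input, and the chromosome-end handling are assumptions of this sketch, not the tool's documented behavior.

```python
def nearest_tss_loci(tss_by_gene, chrom_length):
    """Map each gene to a (start, end) locus on a single chromosome.

    Locus boundaries are placed at the midpoints between adjacent TSSs;
    the first and last loci extend to the chromosome ends.
    """
    genes = sorted(tss_by_gene, key=tss_by_gene.get)
    loci = {}
    for i, gene in enumerate(genes):
        tss = tss_by_gene[gene]
        if i == 0:
            start = 0
        else:
            start = (tss_by_gene[genes[i - 1]] + tss) // 2
        if i == len(genes) - 1:
            end = chrom_length
        else:
            end = (tss + tss_by_gene[genes[i + 1]]) // 2
        loci[gene] = (start, end)
    return loci


# Hypothetical TSS positions, for illustration only.
loci = nearest_tss_loci({"A": 100, "B": 300, "C": 1000}, chrom_length=2000)
```

Under this sketch, the narrower definitions (1kb, 5kb, Exon) would simply produce smaller regions per gene, and peaks outside those regions would not contribute.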
Adjust for the mappability of the gene locus regions
  • True
  • False

Please reference the following publication when citing Broad-Enrich:

1. Cavalcante RG, Lee C, Welch RP, Patil S, Weymouth T, Scott LJ, and Sartor MA. "Broad-Enrich: Functional interpretation of large sets of broad genomic regions." Bioinformatics. 2014; (accepted).

