TF Binding and Perturbation

Welcome to the TF Binding and Perturbation Explorer

This application provides an interactive interface for exploring datasets of transcription factor (TF) binding and gene expression responses following TF perturbation in yeast.

How to Use This App

Navigate through the tabs above to interact with different visualizations:

  • Binding: View TF binding profiles in multiple datasets and compare the datasets to each other.
  • Perturbation Response: View transcriptional responses to TF perturbations (gene deletion, gene overexpression, and TF degradation) in multiple datasets and compare the datasets to each other.
  • Compare binding datasets to perturbation response datasets: This tab focuses on global statistics for many TFs.
  • Compare binding profiles to perturbation response profiles: This tab focuses on individual TFs.

Getting Started

Begin by selecting a tab to load a dataset and explore visual summaries of binding and expression response relationships.

This page displays the source selection summary and correlation matrix for TF binding datasets.

This page includes binding data from multiple experimental sources. Each technique provides genome-wide measurements of transcription factor (TF) binding events, but the techniques differ in resolution, noise profile, and protocol. The current binding datasets are:

  • ChIP-chip: Chromatin immunoprecipitation followed by microarray hybridization. This data is from the Young lab and is publicly available at The Young Lab.
    Harbison CT, Gordon DB, Lee TI, Rinaldi NJ, Macisaac KD, Danford TW, Hannett NM, Tagne JB, Reynolds DB, Yoo J, et al. 2004. Transcriptional regulatory code of a eukaryotic genome. Nature 431: 99–104. doi:10.1038/nature02800
  • ChIP-exo: Chromatin immunoprecipitation followed by exonuclease digestion and sequencing. This protocol yields high-resolution footprints of bound TFs with base-pair precision and reduced background noise compared to ChIP-chip or ChIP-seq. This dataset is produced by the Pugh lab and is publicly available at yeastepigenome.org.
    Rossi, Matthew J et al. 'A high-resolution protein architecture of the budding yeast genome.' Nature vol. 592,7853 (2021): 309–314. doi:10.1038/s41586-021-03314-8
  • Calling Cards: An in vivo, transposon-based method for mapping TF binding. A transposase is tagged to a TF of interest, enabling insertion events of a known transposon sequence near TF binding sites. This data was produced in both the Brent and Mitra labs at Washington University. Most of it is not yet publicly available.

More information on how this data was parsed and processed for the tfbindingandperturbation database can be found here.

Source Selection Summary
Binding Correlation Matrix

This page displays the source selection summary and correlation matrix for TF perturbation response datasets. The current datasets include data derived from gene deletions and overexpression methods.

Each dataset captures the effect on gene expression of perturbing a regulator. These experimental approaches differ in their perturbation strategies and noise profiles.

  • Overexpression: This data is from the McIsaac lab. The TF is overexpressed from a strong promoter via estradiol induction of an artificial TF. Gene expression is measured via microarray at various time points. We are currently displaying results for the 15-minute time point. The data is publicly available from Calico labs.
    Hackett, Sean R et al. 'Learning causal networks using inducible transcription factors and transcriptome-wide time series.' Molecular systems biology vol. 16,3 (2020): e9174. doi:10.15252/msb.20199174
  • 2014 Transcription Factor Knock Out (TFKO): Deletion of a transcription factor's coding region. Gene expression is measured via microarray. The data is publicly available from the Holstege Lab.
    Kemmeren P, Sameith K, van de Pasch LA, Benschop JJ, Lenstra TL, Margaritis T, O'Duibhir E, Apweiler E, van Wageningen S, Ko CW, et al. 2014. Large-scale genetic perturbations reveal regulatory networks and an abundance of gene-specific repressors. Cell 157: 740–752. doi:10.1016/j.cell.2014.02.054
  • 2007 TFKO: This is also a deletion data set, with gene expression measured via microarray. This is a re-analysis of data produced in the Hu lab. The data is provided in the Supplement of the following paper:
    Reimand, Jüri et al. 'Comprehensive reanalysis of transcription factor knockout expression data in Saccharomyces cerevisiae reveals many new targets.' Nucleic acids research vol. 38,14 (2010): 4768-77. doi:10.1093/nar/gkq232

More information on how this data was parsed and processed for the tfbindingandperturbation database can be found here.

Perturbation Response Source Selection Summary
Perturbation Response Correlation Matrix

This page displays distribution plots for Rank Response, Dual Threshold Optimization (DTO) empirical p-value, and Univariate p-value. Use the sidebar to select binding and perturbation response data sources, and optionally restrict the view to regulators shared across all selected datasets.

  • Rank Response: Target genes are ranked by binding strength, and perturbation response is binarized into responsive/non-responsive. The distribution shows the proportion of genes labeled as responsive among the 25 most strongly bound genes.
  • DTO empirical p-value: DTO compares two ranked lists (typically binding and response) to find the pair of thresholds that minimizes the hypergeometric p-value of their overlap. The empirical p-value measures how extreme that overlap is relative to a null distribution generated by permutation. See the original method described in Kang et al., 2020.
  • Univariate p-value: The p-value from an ordinary least squares (OLS) regression model that predicts perturbation response based on the binding score of a regulator.
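
As a rough illustration of the DTO idea described above, the sketch below scans a small grid of rank cutoffs on both lists and keeps the pair that minimizes the hypergeometric tail p-value of the overlap. The function names, the cutoff grid, and the toy gene lists are assumptions made for this example. It is not the implementation from Kang et al., 2020, and it omits the permutation step that produces the empirical p-value.

```python
from math import comb


def hypergeom_tail(overlap, n_bound, n_responsive, n_genes):
    """P(X >= overlap) when n_bound genes are drawn at random from n_genes,
    of which n_responsive are responsive (hypergeometric upper tail)."""
    upper = min(n_bound, n_responsive)
    favorable = sum(
        comb(n_responsive, k) * comb(n_genes - n_responsive, n_bound - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / comb(n_genes, n_bound)


def dual_threshold_scan(binding_ranked, response_ranked, cutoffs):
    """Try every (binding cutoff, response cutoff) pair from a hypothetical grid
    and return (p_value, binding_cutoff, response_cutoff, overlap) for the best pair."""
    n_genes = len(binding_ranked)
    best = None
    for b in cutoffs:
        bound = set(binding_ranked[:b])
        for r in cutoffs:
            responsive = set(response_ranked[:r])
            overlap = len(bound & responsive)
            p = hypergeom_tail(overlap, len(bound), len(responsive), n_genes)
            if best is None or p < best[0]:
                best = (p, b, r, overlap)
    return best


# Toy data (made up): genes ordered from strongest to weakest signal.
binding_ranked = ["g0", "g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8", "g9"]
response_ranked = ["g1", "g0", "g2", "g9", "g8", "g7", "g6", "g5", "g4", "g3"]

best = dual_threshold_scan(binding_ranked, response_ranked, cutoffs=[2, 3, 5])
print(best)
```

In this toy case the top three genes of both lists coincide, so the scan selects a cutoff of 3 on each list. An empirical p-value would additionally require repeating the scan on permuted rankings and comparing the observed minimum against that null distribution.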

This page shows comparisons between binding locations and perturbation responses for individual TFs. Use the sidebar to type in the name of a TF or select it from a drop-down menu. Results are shown in the rank response plots and in summarized binding-perturbation comparisons, each of which is explained below.

Overview:
Each solid line on a rank response plot shows a comparison of one binding dataset to one perturbation dataset. The genes are first ranked according to the strength of the perturbed TF's binding signal in their regulatory DNA.

Plot Axes:
The vertical axis shows the fraction of most strongly bound genes that are responsive to the perturbation. Responsiveness is determined using a fixed threshold on the differential expression p-value and/or log fold change. The horizontal axis indicates the number of most strongly bound genes considered. For example, 20 on the horizontal axis indicates the 20 most strongly bound genes. There is no fixed threshold on binding strength.
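
The axes described above can be sketched in a few lines: genes are sorted by binding signal, and for each rank k the fraction of the k most strongly bound genes that are responsive is recorded. The function names, the thresholds (p-value at most 0.05 and absolute log fold change of at least 0.5), and the toy data are assumptions for illustration only; the app's actual thresholds are not stated here.

```python
def is_responsive(p_value, log_fold_change, p_max=0.05, min_abs_lfc=0.5):
    """Binarize response with fixed (hypothetical) thresholds."""
    return p_value <= p_max and abs(log_fold_change) >= min_abs_lfc


def rank_response_curve(binding_scores, responsive, max_rank=None):
    """Return [(k, fraction responsive among the k most strongly bound genes), ...]."""
    # Rank genes from strongest to weakest binding signal.
    ranked = sorted(binding_scores, key=binding_scores.get, reverse=True)
    if max_rank is None:
        max_rank = len(ranked)
    curve = []
    n_responsive = 0
    for k, gene in enumerate(ranked[:max_rank], start=1):
        if responsive[gene]:
            n_responsive += 1
        curve.append((k, n_responsive / k))  # x = rank k, y = fraction responsive
    return curve


# Toy data (made up): binding score per gene, and (p-value, log fold change) per gene.
binding_scores = {"A": 9.0, "B": 7.5, "C": 6.0, "D": 2.0, "E": 1.0}
expression = {
    "A": (0.001, 1.2),
    "B": (0.20, 0.1),
    "C": (0.01, -0.9),
    "D": (0.50, 0.0),
    "E": (0.03, 0.2),
}
responsive = {gene: is_responsive(p, lfc) for gene, (p, lfc) in expression.items()}

curve = rank_response_curve(binding_scores, responsive)
for k, fraction in curve:
    print(k, round(fraction, 3))
```

Note that only the response is thresholded; the binding scores are used solely to order the genes, which matches the statement that there is no fixed threshold on binding strength.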

Reference Lines:
The dashed horizontal line shows the random expectation – the fraction of all genes that are responsive. For example, a dashed line at 0.1 means that 10% of all genes are responsive to perturbation of this TF. The gray area shows a 95% confidence interval for the null hypothesis that the bound genes are no more responsive than the random expectation.
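
One way to picture the gray band is with a binomial null model: if each of the k most strongly bound genes were responsive with probability equal to the random expectation, the responsive fraction would fall inside a central 95% interval most of the time. The sketch below assumes that binomial model and a hypothetical function name; the app may compute its interval differently (for example, with a hypergeometric model).

```python
from math import comb


def binomial_null_band(k, p0, level=0.95):
    """Central interval for the responsive fraction among k genes,
    assuming each gene is independently responsive with probability p0."""
    alpha = 1.0 - level
    cumulative = 0.0
    lower = upper = None
    for i in range(k + 1):
        # Binomial probability of exactly i responsive genes out of k.
        cumulative += comb(k, i) * p0**i * (1 - p0) ** (k - i)
        if lower is None and cumulative >= alpha / 2:
            lower = i
        if cumulative >= 1 - alpha / 2:
            upper = i
            break
    return lower / k, upper / k


# Example: random expectation of 10% responsive, 20 most strongly bound genes.
band = binomial_null_band(k=20, p0=0.1)
print(band)
```

Under this assumed model, a rank response curve that rises above the upper edge of the band at rank 20 would indicate more responsive genes than expected by chance among the 20 most strongly bound.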


How to Use:
Clicking on rows in the Replicate Selection Table controls which binding datasets are plotted. Tabs at the top show plots for different perturbation datasets. The sidebar allows control over which columns are displayed in this table.
Rank Response Plots
Replicate Selection Table

Overview:
Each row of this table shows summary statistics for comparisons of one binding dataset (or replicate) to one perturbation-response dataset.

Navigation:
The tabs at the top show tables for different perturbation datasets. The sidebar allows control over which columns are displayed in this table.

Analysis Methods:
The statistics are derived from three methods of comparison:

  1. Fraction responsive among the 25 or 50 most strongly bound genes.
  2. A linear model fit to predict the response strength from the binding strength (details here).
  3. Dual Threshold Optimization (details here).
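
For the linear-model comparison (method 2 above), a minimal sketch of a univariate ordinary least squares fit is shown below, with response strength regressed on binding strength. The function name and the toy values are assumptions. To stay within the standard library, the two-sided p-value uses a normal approximation to the t distribution, which is only reasonable when many genes are included; this is not necessarily how the app computes its reported p-value.

```python
from math import erfc, sqrt


def univariate_ols(binding, response):
    """Fit response = intercept + slope * binding by least squares.
    Returns (slope, intercept, t_statistic, approximate two-sided p-value)."""
    n = len(binding)
    mean_x = sum(binding) / n
    mean_y = sum(response) / n
    sxx = sum((x - mean_x) ** 2 for x in binding)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(binding, response))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(binding, response)
    )
    # Standard error of the slope, using n - 2 residual degrees of freedom.
    slope_se = sqrt(residual_ss / (n - 2) / sxx)
    t_stat = slope / slope_se
    # Normal approximation to the t distribution (an assumption for this sketch).
    p_approx = erfc(abs(t_stat) / sqrt(2))
    return slope, intercept, t_stat, p_approx


# Toy data (made up): binding score and response strength for six genes.
binding = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
response = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]

slope, intercept, t_stat, p_approx = univariate_ols(binding, response)
print(slope, intercept, t_stat, p_approx)
```

A small p-value for the slope suggests that more strongly bound genes tend to respond more strongly, which is the relationship this comparison is meant to summarize.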

Summarized Binding-Perturbation Comparison Table