Journal of Computational Biology
Finding Motifs in Promoter Regions

To cite this article:
Libi Hertzberg, Or Zuk, Gad Getz, Eytan Domany. Journal of Computational Biology. April 2005, 12(3): 314-330. doi:10.1089/cmb.2005.12.314.

Published in Volume: 12 Issue 3: April 21, 2005


Libi Hertzberg, Or Zuk, Gad Getz, Eytan Domany
Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel.

A central issue in molecular biology is understanding the regulatory mechanisms that control gene expression. The availability of whole genome sequences opens the way for computational methods to search for the key elements in transcription regulation. These include methods for discovering the binding sites of DNA-binding proteins, such as transcription factors. A common representation of transcription factor binding sites is a position-specific score matrix (PSSM). We developed a probabilistic approach for searching for putative binding sites. Given a promoter sequence and a PSSM, we scan the promoter and find the position with the maximal score. We then calculate the probability of obtaining such a maximal score or higher on a random promoter; this is the p-value of the putative binding site. In this way, we searched for putative binding sites in the upstream sequences of Saccharomyces cerevisiae, for which some binding sites are known (according to the Saccharomyces cerevisiae Promoters Database, SCPD). Our method produces either exact p-values or better estimates of them than other methods, which improves the results of the search. For each gene, we found its statistically significant putative binding sites. We measured the rate of true positives by comparison with the known binding sites, and also compared our results to those of MatInspector, a commercially available software package that looks for putative binding sites in DNA sequences according to PSSMs. Our results were significantly better. Unlike our method, MatInspector does not calculate the exact statistical significance of its results.
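The scan-and-score procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes integer PSSM scores, an i.i.d. background model, and independence between overlapping windows, so the p-value of the maximum is only an approximation. The function names (`scan_max`, `single_site_distribution`, `max_score_pvalue`) and the toy matrix are hypothetical and chosen for illustration.

```python
from collections import defaultdict

BASES = "ACGT"


def scan_max(seq, pssm):
    """Return (best_score, best_pos) over all windows of width len(pssm)."""
    w = len(pssm)
    best_score, best_pos = None, None
    for i in range(len(seq) - w + 1):
        s = sum(pssm[j][seq[i + j]] for j in range(w))
        if best_score is None or s > best_score:
            best_score, best_pos = s, i
    return best_score, best_pos


def single_site_distribution(pssm, background):
    """Exact score distribution of ONE window under an i.i.d. background.

    Built column by column (a convolution over integer scores).
    """
    dist = {0: 1.0}
    for col in pssm:
        nxt = defaultdict(float)
        for s, p in dist.items():
            for b in BASES:
                nxt[s + col[b]] += p * background[b]
        dist = dict(nxt)
    return dist


def max_score_pvalue(score, pssm, background, n_windows):
    """Approximate P(max over n_windows >= score) on a random promoter.

    ASSUMPTION: windows are treated as independent, which ignores the
    correlation between overlapping windows.
    """
    dist = single_site_distribution(pssm, background)
    p_single = sum(p for s, p in dist.items() if s >= score)
    return 1.0 - (1.0 - p_single) ** n_windows


# Toy example (hypothetical matrix rewarding the motif "ACG").
toy_pssm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]
uniform = {b: 0.25 for b in BASES}
promoter = "TTACGTT"

score, pos = scan_max(promoter, toy_pssm)
n_windows = len(promoter) - len(toy_pssm) + 1
p_value = max_score_pvalue(score, toy_pssm, uniform, n_windows)
print(score, pos, p_value)
```

A small p-value means that a maximal score this high would rarely occur on a random promoter of the same length. Handling the dependence between overlapping windows, which this sketch ignores, is one place where an exact treatment would differ.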

This paper was cited by:

The Condition-Dependent Transcriptional Network in Escherichia coli
Karen Lemmens, Tijl De Bie, Thomas Dhollander, Pieter Monsieurs, Bart De Moor, Julio Collado-Vides, Kristof Engelen, Kathleen Marchal
Annals of the New York Academy of Sciences. Apr 2009, Vol. 1158, No. 1: 29-35
Positional distribution of human transcription factor binding sites
M. Koudritsky, E. Domany
Nucleic Acids Research. Nov 2008, Vol. 36, No. 21: 6795-6805
Compound Poisson Approximation of the Number of Occurrences of a Position Frequency Matrix (PFM) on Both Strands
Utz J. Pape, Sven Rahmann, Fengzhu Sun, Martin Vingron
Journal of Computational Biology. Jul 2008, Vol. 15, No. 6: 547-564