Hello. Sign in to personalize your visit. New user? Register now.  
Journal of Computational Biology
1001 Optimal PDB Structure Alignments: Integer Programming Methods for Finding the Maximum Contact Map Overlap

To cite this article:
Alberto Caprara, Robert Carr, Sorin Istrail, Giuseppe Lancia, Brian Walenz. Journal of Computational Biology. January 2004, 11(1): 27-52. doi:10.1089/106652704773416876.

Published in Volume: 11 Issue 1: July 5, 2004

Full Text: • PDF for printing (581.4 KB) • PDF w/ links (564 KB)


Alberto Caprara
D.E.I.S., Università di Bologna, Viale Risorgimento, 2 40136 Bologna, Italy
Robert Carr
P.O. Box 5800, MS 1110, Sandia National Laboratories, Albuquerque, NM, 87185
Sorin Istrail
Informatics Research, Celera/Applied Biosystems, 45 W. Gude Drive, Rockville, MD, 20850
Giuseppe Lancia
D.I.M.I., Università di Udine, Viale delle Scienze 206, 33100 Udine, Italy
Brian Walenz
Informatics Research, Celera/Applied Biosystems, 45 W. Gude Drive, Rockville, MD, 20850

Protein structure comparison is a fundamental problem for structural genomics, with applications to drug design, fold prediction, protein clustering, and evolutionary studies. Despite its importance, there are very few rigorous methods and widely accepted similarity measures known for this problem. In this paper we describe the last few years of developments on the study of an emerging measure, the contact map overlap (CMO), for protein structure comparison. A contact map is a list of pairs of residues which lie in three-dimensional proximity in the protein's native fold. Although this measure is in principle computationally hard to optimize, we show how it can in fact be computed with great accuracy for related proteins by integer linear programming techniques. These methods have the advantage of providing certificates of near-optimality by means of upper bounds to the optimal alignment value. We also illustrate effective heuristics, such as local search and genetic algorithms. We were able to obtain for the first time optimal alignments for large similar proteins (about 1,000 residues and 2,000 contacts) and used the CMO measure to cluster proteins in families. The clusters obtained were compared to SCOP classification in order to validate the measure. Extensive computational experiments showed that alignments which are off by at most 10% from the optimal value can be computed in a short time. Further experiments showed how this measure reacts to the choice of the threshold defining a contact and how to choose this threshold in a sensible way.

Free first page

This paper was cited by:

A Reduction-Based Exact Algorithm for the Contact Map Overlap Problem
Wei Xie, Nikolaos V. Sahinidis
Journal of Computational Biology. Jun 2007, Vol. 14, No. 5: 637-654
Abstract | Full Text PDF | Reprints & Permissions
A Parameterized Algorithm for Protein Structure Alignment
Jinbo Xu, Feng Jiao, Bonnie Berger
Journal of Computational Biology. Jun 2007, Vol. 14, No. 5: 564-577
Abstract | Full Text PDF | Reprints & Permissions
MUSTANG: A multiple structural alignment algorithm
Arun S. Konagurthu, James C. Whisstock, Peter J. Stuckey, Arthur M. Lesk
Proteins: Structure, Function, and Bioinformatics. Sep 2006, Vol. 64, No. 3: 559-574
CrossRef
Evolution of Structural Shape in Bacterial Globin-Related Proteins
Lorraine Marsh
Journal of Molecular Evolution. Jun 2006, Vol. 62, No. 5: 575-587
CrossRef
Automatic 3D Protein Structure Classification without Structural Alignment
Zeyar Aung, Kian-Lee Tan
Journal of Computational Biology. Nov 2005, Vol. 12, No. 9: 1221-1241
Abstract | Full Text PDF | Reprints & Permissions
SCUD: Fast structure clustering of decoys using reference state to remove overall rotation
Hongzhi Li, Yaoqi Zhou
Journal of Computational Chemistry. Sep 2005, Vol. 26, No. 11: 1189-1192
CrossRef
Optimal Protein Structure Alignment Using Maximum Cliques
Dawn M. Strickland, Earl Barnes, Joel S. Sokol
Operations Research. Jun 2005, Vol. 53, No. 3: 389-402
CrossRef
Comments on ?The Lagrangian Relaxation Method for Solving Integer Programming Problems?
Marshall L. Fisher
Management Science. Jan 2005, Vol. 50, No. 12 Supplement: 1872-1874
CrossRef
All articles
Previous Next