|
Journal of Computational Biology
Towards Optimally Multiplexed Applications of Universal Arrays
To cite this article:
Amir Ben-Dor, Tzvika Hartman, Richard M. Karp, Benno Schwikowski, Roded Sharan, Zohar Yakhini.
Journal of Computational Biology.
March 2004,
11(2-3): 476-492.
doi:10.1089/1066527041410373.
Published in Volume: 11 Issue 2-3: July 28, 2004
Amir Ben-Dor Agilent Laboratories,395 Page Mill Rd., Palo Alto, CA 94303 Tzvika Hartman Department of Computer Science and Applied Mathematics, Weizmann Institute, Rehovot, Israel Richard M. Karp International Computer Science Institute,1947 Center Street, Suite 600, Berkeley, CA 94704 Benno Schwikowski Institute for Systems Biology,1441 N. 34th Street, Seattle, WA 98103 Roded Sharan International Computer Science Institute,1947 Center Street, Suite 600, Berkeley, CA 94704 Zohar Yakhini Agilent Laboratories and Computer Science Dept., Technion, Haifa, Israel We study a design and optimization problem that occurs, for example, when single nucleotide polymorphisms (SNPs) are to be genotyped using a universal DNA tag array. The problem of optimizing the universal array to avoid disruptive cross-hybridization between universal components of the system was addressed in previous work. Cross-hybridization can, however, also occur assay specifically, due to unwanted complementarity involving assay-specific components. Here we examine the problem of identifying the most economic experimental configuration of the assay-specific components that avoids cross-hybridization. Our formalization translates this problem into the problem of covering the vertices of one side of a bipartite graph by a minimum number of balanced subgraphs of maximum degree 1. We show that the general problem is NP-complete. However, in the real biological setting, the vertices that need to be covered have degrees bounded by d. We exploit this restriction and develop an O(d)-approximation algorithm for the problem. We also give an O(d)-approximation for a variant of the problem in which the covering subgraphs are required to be vertex disjoint. In addition, we propose a stochastic model for the input data and use it to prove a lower bound on the cover size. We complement our theoretical analysis by implementing two heuristic approaches and testing their performance on synthetic data as well as on simulated SNP data. 
|
|