Journal of Computational Biology
Picking Alignments from (Steiner) Trees
To cite this article:
Fumei Lam, Marina Alexandersson, Lior Pachter.
Journal of Computational Biology.
June 2003,
10(3-4): 509-520.
doi:10.1089/10665270360688156.
Published in Volume: 10 Issue 3-4: July 5, 2004
Fumei Lam, Department of Mathematics, M.I.T., Cambridge, MA 02139
Marina Alexandersson, FCC, Fraunhofer-Chalmers Research Centre, Göteborg, Sweden
Lior Pachter, Department of Mathematics, University of California Berkeley, Berkeley, CA 94720

The application of Needleman–Wunsch alignment techniques to biological sequences is complicated by two serious problems when the sequences are long: the running time, which scales as the product of the lengths of the sequences, and the difficulty in obtaining suitable parameters that produce meaningful alignments. The running time problem is often corrected by reducing the search space, using techniques such as banding or chaining of high-scoring pairs. The parameter problem is more difficult to fix, partly because the probabilistic model to which Needleman–Wunsch is equivalent does not capture a key feature of biological sequence alignments, namely the alternation of conserved blocks and seemingly unrelated nonconserved segments. We present a solution to the problem of designing efficient search spaces for pair hidden Markov models (PHMMs) that align biological sequences by taking advantage of their associated features. Our approach leads to an optimization problem for which we obtain a 2-approximation algorithm based on the construction of Manhattan networks, which are close relatives of Steiner trees. We describe the underlying theory and show how our methods can be applied to alignment of DNA sequences in practice, successfully reducing the Viterbi algorithm search space of alignment PHMMs by three orders of magnitude.

This paper was cited by:
A Pairwise Alignment Algorithm Which Favors Clusters of Blocks
Elodie Nédélec, Thomas Moncion, Elisabeth Gassiat, Bruno Bossard, Guillemette Duchateau-Nguyen, Alain Denise, Michel Termier
Journal of Computational Biology. Feb 2005, Vol. 12, No. 1: 33-47