Dipartimento di Ingegneria dell' Informazione, Università di Padova, Padova, Italy, and Department of Computer Sciences, Purdue University, West Lafayette, IN
Laxmi Parida
IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598
We examine the problem of extracting maximal irredundant motifs from a string. A combinatorial argument poses a linear bound on the total number of such motifs, thereby opening the way to the quest for the fastest and most efficient methods of extraction. The basic paradigm explored here is that of iterated updates of the set of irredundant motifs in a string under consecutive unit symbol extensions of the string itself. This approach exposes novel characterizations for the base set of motifs in a string, hinged on notions of partial order. Such properties support the design of ad hoc data structures and constructs, and lead to develop an O(n3) time incremental discovery algorithm.
This paper was cited by:
ARCS-Motif: discovering correlated motifs from unaligned biological sequences