Speaker:
Sing-Hoi Sze, Computer Science Department, Texas A&M University
Title:
Integrating sample-driven and pattern-driven approaches in motif
finding
Abstract:
The problem of finding conserved motifs given a set of DNA sequences
is among the most fundamental problems in computational biology,
with important applications in locating regulatory sites from co-expressed
genes. Traditionally, two classes of approaches are used to address the
problem: sample-driven approaches focus on finding the locations of the
motif instances directly, while pattern-driven approaches focus on finding
a consensus string or a profile directly to model the motif. We propose an
integrated approach by formulating the motif finding problem as the
problem of finding large cliques in k-partite graphs, with the additional
requirement that there exists a string s (which may not appear in the
given sample) that is close to every motif instance included
in such a clique. In this formulation, each clique represents the locations of
the motif instances, while the existence of string s ensures that
these instances are derived from a common motif pattern. The combined approach
provides a better formulation to model motifs than using cliques alone, and the
use of k-partite graphs allows the development of a fast and exact
divide-and-conquer approach to handle the cases when almost every sequence
contains a motif instance. This is joint work with Songjian Lu and Jianer
Chen.
|