Math 664, Seminar in Applied Mathematics

Math 664, Seminar in Applied Mathematics, will be offered in the second summer session, MTWRF 10:00-11:35, in BLOC 126. The general theme will be the application of mathematical techniques to analyze the structure of the genome. Students will be introduced to the basic ideas of molecular biology and of information theory. We will then form groups to read and present the results of selected journal articles in this area of mathematical biology.

Grading

Grading will be based on class participation and group projects, some of which may involve computer analysis of genomic data sets available on the Web. The course will be self-contained, and mathematical and biological prerequisites are minimal.
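To give a sense of the kind of computer analysis a project might involve, here is a minimal sketch in Python. It is an illustration only, not part of the course materials. It estimates the Shannon entropy, in bits per base, of the base frequencies in a DNA string, one of the basic quantities of information theory. The function name base_entropy and the two toy sequences are made up for this example; a real project would read a sequence from one of the public databases instead.

```python
from collections import Counter
from math import log2


def base_entropy(sequence: str) -> float:
    """Shannon entropy (bits per base) of the A/C/G/T frequencies in a DNA string.

    Hypothetical helper for illustration; characters other than A, C, G, T
    (for example N or whitespace) are ignored.
    """
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    # H = -sum over bases of p * log2(p), where p is the observed frequency
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Made-up toy sequences, not real genomic data.
uniform = "ACGTACGTACGTACGT"   # all four bases equally frequent
skewed = "AAAAAAAATTTTTTTT"    # only two bases, equally frequent

print(base_entropy(uniform))   # 2 bits per base, the maximum for four symbols
print(base_entropy(skewed))    # 1 bit per base
```

A sequence that uses all four bases equally often reaches the maximum of 2 bits per base; any bias in base composition lowers the value. This single-base count ignores dependence between neighboring bases, which is where models such as Markov chains come in.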

The Problem

Fast accurate methods for determining DNA sequences were invented in the late 1970's, soon after DNA techniques were developed. In principle, there is now no limit to the length of DNA that can be sequenced. The sequences of DNA segments many hundreds - even thousands - of nucleotides long are determined routinely. In the hands of an able person, a DNA sequence several thousand base pairs long can be determined in a week's time and with an accuracy of more than 99.9 percent (one misidentified base in a thousand). Furthermore, automated instruments capable of producing sequence information are now available.

Even with present methods, about 89,000,000 base pairs of DNA from many different organisms have already been determined (as of May 1992). The data are stored in centralized computer banks in the US and in Europe.

The data are useless, however, unless the biologically significant features of the sequences can be recognized.

As the amount of sequence data expands, the need for increasingly sophisticated computer technology is stimulating collaboration between mathematicians, computer scientists, and biologists.

Paul Berg,
Director, Beckman Center for Molecular and Genetic Medicine,
Stanford University School of Medicine



Molecular Biology Notes
Markov Chains and Stationary Distributions
Bacteriophage T7 Genome

References

Genome Sequencing Center, Washington University, St. Louis
Tom Schneider of the National Cancer Institute
Bioinformatics and Computational Biology Ph.D. at George Mason
Pasteur Institute - Information Theory, Mathematics, Statistics
Primer on Molecular Genetics (Department of Energy)
Biological Information Theory and Chowder Society Newsgroup
Human Genome Databases on the Internet
Characterization of Prokaryotic and Eukaryotic Promoters Using Hidden Markov Models

Dealing with Genes: The Language of Heredity
by Paul Berg and Maxine Singer
University Science Books, Mill Valley, CA
1992
ISBN 0-7167-2361-1

a shortened version of
Genes and Genomes
by the same authors
ISBN 0-935702-17-2

Introduction to Genetic Engineering
by William H. Sofer
Butterworth-Heinemann, Boston
1991
ISBN 0-7506-9114-X

Recombinant DNA, Second Edition
by James D. Watson, Michael Gilman, Jan Witkowski, Mark Zoller
Scientific American Books (W. H. Freeman Co.), New York
1992
ISBN 0-7167-2282-8 (pbk.)

Visualizing Biological Information
Clifford A. Pickover, Editor
World Scientific, New Jersey
1995
ISBN 9810214278

Information Theory and Molecular Biology
Hubert P. Yockey
Cambridge University Press, Cambridge
1992
ISBN 0-521-35005-0

An Introduction to Information Theory: Symbols, Signals, and Noise, 2nd Edition
J. R. Pierce
Dover Publications, New York
1980