Abstract/Syllabus:
Gifford, David, and Tommi Jaakkola, 7.90J Computational Functional Genomics, Spring 2005. (Massachusetts Institute of Technology: MIT OpenCourseWare), http://ocw.mit.edu (Accessed 08 Jul, 2010). License: Creative Commons BY-NC-SA
Three steps in the transcription of protein-coding genes. (Image by Prof. David Gifford.)
Course Highlights
This course features a complete set of lecture notes.
Course Description
The course focuses on casting contemporary problems in systems biology and functional genomics in computational terms and providing appropriate tools and methods to solve them. Topics include genome structure and function, transcriptional regulation, and stem cell biology in particular; measurement technologies such as microarrays (expression, protein-DNA interactions, chromatin structure); statistical data analysis, predictive and causal inference, and experiment design. The emphasis is on coupling problem structures (biological questions) with appropriate computational approaches.
Technical Requirements
Any number of biological sequence comparison software tools can be used to import the FASTA formatted sequence (.fa) files found on this course site. MATLAB® software is required to view and run the .m and .mat files found on this course site. Postscript viewer software, such as Ghostscript/Ghostview, can be used to view the .ps files found on this course site. File decompression software, such as Winzip® or StuffIt®, is required to open the .zip files found on this course site.
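For readers without a dedicated sequence tool at hand, FASTA files can also be read with a few lines of code. The following is a minimal sketch in Python, not part of the course materials; the function name `parse_fasta` and the example records are hypothetical, and it assumes plain-text input in which each record starts with a `>` header line followed by one or more sequence lines.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} dictionary."""
    records: Dict[str, List[str]] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The record id is the first whitespace-delimited token after '>'.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = []
        elif current_id is not None:
            # Sequence data may be wrapped across several lines.
            records[current_id].append(line)
    return {rid: "".join(parts) for rid, parts in records.items()}


# Hypothetical two-record example, used only for illustration.
example = """>seq1 hypothetical example record
ACGTAC
GTTA
>seq2
GGCC
""".splitlines()

sequences = parse_fasta(example)
for record_id, sequence in sequences.items():
    print(record_id, len(sequence))
```

Because the function accepts any iterable of lines, an open file handle can be passed in place of the example list. Records that share an id overwrite one another in this sketch, so it is only suitable for files with unique identifiers.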
Syllabus
During the spring of 2005, Computational Functional Genomics will be taught around an extended case study showing how we can use new high-resolution genomic and proteomic data to discover the underlying biological mechanisms that govern transcriptional regulatory programs in yeast and humans.
When possible, our case study will focus on the development of stem cells. Contemporary literature and data as well as new directions for research will be discussed. We will explore the principles of analysis at sufficient depth so that students are able to develop new methodologies that are well founded for new biological problems.
Course Outline
Our case study exploration will be grouped into the following areas:
How can we use DNA sequence to explain mechanism?
In this module we will examine how to analyze genome sequences to discover properties that are evident in a single genome (CpG islands), properties that are conserved between genomes (genome structure), and DNA sequence elements that implement combinatorial control of gene expression (motif discovery). Lectures 1-4.
How can we observe the mechanism of transcriptional regulation?
In this module we will examine the application of DNA microarrays for the analysis of gene expression, protein-DNA binding, chromatin structure, chromatin modifying complexes, and RNA polymerase occupancy. Error models and data normalization techniques for high-resolution array technologies will be presented. Using the processed data we will discuss the basis for clustering genes into sets and discovering gene set features that can be used for diagnostic purposes. We will discuss the importance of chromatin structure in contemporary modeling, and review recent research results on the relationship between chromatin structure and transcriptional regulation. Lectures 5-12.
How can we build predictive network models of transcriptional regulation?
In this module we will build predictive models of transcriptional regulatory networks using probabilistic modeling techniques. We will examine how graphical models can be used to describe key regulatory mechanisms, and use both direct data (molecular interactions) and functional data (expression, phenotype) to constrain the models we learn. We will begin with yeast, and finish this module by examining human regulatory networks that are linked to specific diseases. Lectures 12-22.
Team Project
An integral part of the course is a student project based on our case study theme of understanding biological mechanism. We encourage interdisciplinary groups of students to work together to develop novel analysis methodologies to examine recent data. Topics will be chosen by the teams in consultation with us. Each project will have an intermediate (10-minute) and a final (20-minute) presentation in class.
Assignments
Four problem sets will be assigned during the term.
Quizzes
There will be one final quiz at the end of the term.
Calendar
Lec # | TOPICS | LECTURER | KEY DATES |
Part 1: Using DNA Sequence to Explain Mechanism |
1 | Course Introduction | David Gifford | |
2 | Pairwise Alignment | David Gifford | |
3 | Finding Regulatory Sequences in DNA: Motif Discovery | Tommi Jaakkola | |
4 | Finding Regulatory Sequences in DNA: Motif Discovery (cont.) | Tommi Jaakkola | Problem set 1 due |
Part 2: Observing the Mechanism of Transcriptional Regulation |
5 | Microarray Technology | David Gifford | |
6 | Expression Arrays, Normalization, and Error Models | Tommi Jaakkola | |
7 | Expression Profiles, Clustering, and Latent Processes | Tommi Jaakkola | Problem set 2 due |
8 | Computational Functional Genomics | David Gifford | |
9 | Stem Cells and Transcriptional Regulation | David Gifford | |
10 | Part One: An Example of Clustering Expression Data; Part Two: Computational Functional Genomics (cont.) | David Gifford | Problem set 3 due |
11 | Project Group Meetings | | |
12 | Project Group Initial Presentations | Students | |
13 | Computational Discovery of Regulatory Networks | Georg Gerber (Guest Lecturer) | |
14 | RNA Silencing | David Bartel (Guest Lecturer) | |
Part 3: Building Predictive Network Models of Transcriptional Regulation |
15 | Computational Functional Genomics (cont.) | David Gifford | |
16 | Human Regulatory Networks | David Gifford | |
17 | Protein Networks | David Gifford | |
18 | Causal Models | Tommi Jaakkola | |
19 | Causal Bayesian Networks, Active Learning | Tommi Jaakkola | |
20 | From Biological Data to Biological Insight | Nir Friedman (Guest Lecturer) | |
21 | Modeling Transcriptional Regulation | Tommi Jaakkola | |
22 | Dynamics | David Gifford | Problem set 4 due |
Further Reading:
Readings
Lec # |
TOPICS |
READINGS |
Part 1: Using DNA Sequence to Explain Mechanism |
1 |
Course Introduction |
|
2 |
Pairwise Alignment |
Durbin, Richard, Sean R. Eddy, Anders Krogh, and Graeme Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge, UK: Cambridge University Press, 1999. ISBN: 0521629713. |
3 |
Finding Regulatory Sequences in DNA: Motif Discovery |
Hughes, J. D., P. W. Estep, S. Tavazoie, and G. M. Church. "Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae." Journal of Molecular Biology 296, no. 5 (March 10, 2000): 1205-14. |
4 |
Finding Regulatory Sequences in DNA: Motif Discovery (cont.) |
|
Part 2: Observing the Mechanism of Transcriptional Regulation |
5 |
Microarray Technology |
Microarray Core Facility - Laboratory Services
Pan, Qun, et al. "Revealing Global Regulatory Features of Mammalian Alternative Splicing Using a Quantitative Microarray Platform." Molecular Cell 16, no. 6 (December 22, 2004): 929-941.
Zhang, Wen, et al. "The Functional Landscape of Mouse Gene Expression." Journal of Biology 3, no. 5 (2004): 21.
Lee, Tong Ihn, et al. "Transcriptional Regulatory Networks in Saccharomyces cerevisiae." Science 298, no. 5594 (October 25, 2002): 799-804. |
6 |
Expression Arrays, Normalization, and Error Models |
Yang, Yee Hwa, et al. "Normalization for cDNA Microarray Data: A Robust Composite Method for Addressing Single and Multiple Slide Systematic Variation." Nucleic Acids Research 30, no. 4 (2002).
Newton, M. A., et al. "On Differential Variability of Expression Ratios: Improving Statistical Inference about Gene Expression Changes from Microarray Data." Journal of Computational Biology 8, no. 1 (2001): 37-52. |
7 |
Expression Profiles, Clustering, and Latent Processes |
Eisen, M. B., et al. "Cluster Analysis and Display of Genome-wide Expression Patterns." PNAS 95, no. 25 (December 8, 1998): 14863-8.
Liao, J. C., et al. "Network Component Analysis: Reconstruction of Regulatory Signals in Biological Systems." PNAS 100, no. 26 (December 23, 2003): 15522-7.
Dueck and Frey. "Probabilistic Sparse Matrix Factorization." Technical Report, University of Toronto, 2004. |
8 |
Computational Functional Genomics |
Introduction to SVD |
9 |
Stem Cells and Transcriptional Regulation |
Dor, Y., J. Brown, O. Martinez, and D. A. Melton. "Adult pancreatic beta-cells are formed by self-duplication rather than stem-cell differentiation." Nature 429, no. 6987 (2004): 41-6. |
10 |
Part One: An Example of Clustering Expression Data
Part Two: Computational Functional Genomics (cont.) |
Alizadeh, A. A., et al. "Distinct Types of Diffuse Large B-cell Lymphoma Identified by Gene Expression Profiling." Nature 403, no. 6769 (February 3, 2000): 503-11.
Ramalho-Santos, M., et al. "'Stemness': Transcriptional Profiling of Embryonic and Adult Stem Cells." Science 298, no. 5593 (October 18, 2002): 597-600.
Ivanova, N. B., et al. "A Stem Cell Molecular Signature." Science 298, no. 5593 (October 18, 2002): 601-4.
Fortunel, et al. "Comment on '"Stemness": transcriptional profiling of embryonic and adult stem cells' and 'a stem cell molecular signature.'" Science 302, no. 5644 (2003): 393.
Evsikov, et al. "Comment on '"Stemness": transcriptional profiling of embryonic and adult stem cells' and 'a stem cell molecular signature.'" Science 302, no. 5644 (2003): 393.
Ivanova, N. B., et al. "Response to comments on '"Stemness": transcriptional profiling of embryonic and adult stem cells' and 'a stem cell molecular signature.'" Science 302, no. 5644 (2003): 393.
Vogel, G. "'Stemness' Genes Still Elusive." Science 302, no. 5644 (October 17, 2003): 371. |
11 |
Project Group Meetings |
|
12 |
Project Group Initial Presentations |
|
13 |
Computational Discovery of Regulatory Networks |
|
14 |
RNA Silencing |
|
Part 3: Building Predictive Network Models of Transcriptional Regulation |
15 |
Computational Functional Genomics (cont.) |
Heckerman, David. "A Tutorial on Learning with Bayesian Networks." Microsoft Technical Report MSR-TR-95-06, 1996.
Hartemink, Alex. "Principled Computational Methods for the Validation and Discovery of Genetic Regulatory Networks." MIT Ph.D. Thesis (2001). (PDF - 9.0 MB) |
16 |
Human Regulatory Networks |
Cooper, G. F., and E. Herskovits. "A Bayesian Method for the Induction of Probabilistic Networks from Data." KSL-91-02 (Knowledge Systems Laboratory, Stanford University), November 1993. |
17 |
Protein Networks |
Li, S., et al. "A Map of the Interactome Network of the Metazoan C. elegans." Science 303, no. 5657 (January 23, 2004): 540-3.
Giot, L., et al. "A Protein Interaction Map of Drosophila melanogaster." Science 302, no. 5651 (December 5, 2003): 1727-36.
Gavin, A. C., et al. "Functional organization of the yeast proteome by systematic analysis of protein complexes." Nature 415, no. 6868 (2002): 141-7.
Ho, Y., et al. "Systematic Identification of Protein Complexes in Saccharomyces cerevisiae by Mass Spectrometry." Nature 415, no. 6868 (January 10, 2002): 180-3.
Tong, A. H., et al. "Global mapping of the yeast genetic interaction network." Science 303, no. 5659 (2004): 808-13.
Phizicky, E., P. I. Bastiaens, H. Zhu, M. Snyder, and S. Fields. "Protein Analysis on a Proteomic Scale." Nature 422, no. 6928 (March 13, 2003): 208-15.
von Mering, C., et al. "Comparative Assessment of Large-scale Data Sets of Protein-protein Interactions." Nature 417, no. 6887 (May 23, 2002): 399-403. |
18 |
Causal Models |
Heckerman, David. "A Tutorial on Learning with Bayesian Networks." Microsoft Technical Report MSR-TR-95-06, 1996.
Yeang, C. H., et al. "Physical Network Models." Journal of Computational Biology 11, no. 2-3 (2004): 243-263. |
19 |
Causal Bayesian Networks, Active Learning |
Segal, E., et al. "Module Networks: Identifying Regulatory Modules and their Condition Specific Regulators from Gene Expression Data." Nature Genetics 34, no. 2 (2003): 166-76.
McCuine, S., C. Workman, T. Jaakkola, T. Ideker, C-H. Yeang, and H. Mak. "Validation and refinement of gene-regulatory pathways on a network of physical interactions." Genome Biology 6, no. 7 (2005): R62.
Oliver, S. G., et al. "Functional genomic hypothesis generation and experimentation by a robot scientist." Nature 427 (2004): 247-252. |
20 |
From Biological Data to Biological Insight |
|
21 |
Modeling Transcriptional Regulation |
Segal, E., et al. "Module Networks: Identifying Regulatory Modules and their Condition Specific Regulators from Gene Expression Data." Nature Genetics 34, no. 2 (2003): 166-76.
Said, M., A. Oppenheim, and D. Lauffenburger. "Modeling Cellular Signal Processing Using Interacting Markov Chains." In Proceedings of the International Conference on Acoustics, Speech, Signal Processing, 2003. |
22 |
Dynamics |
Arkin, Ross, and McAdams. "Stochastic Kinetic Analysis of Developmental Pathway Bifurcation in Phage Lambda-Infected Escherichia coli Cells." Genetics 149 (August 1998): 1633-1648.
Gilman, Arkin. "Genetic 'Code': Representations and Dynamic Models of Genetic Components and Networks." Annu. Rev. Genomics Hum. Genet. 3 (2002): 341-69.
McAdams, Arkin. "It's A Noisy Business!: Gene Regulation at the Nanomolar Scale." Trends in Genetics 15 (1999): 65-69.
———. "Simulation of Genetic Circuits." Annu. Rev. Biophys. Biomol. Struct. 27 (1998): 199-224.
Gillespie. "Exact Stochastic Simulation of Coupled Chemical Reactions." Journal of Physical Chemistry 81, no. 25 (1977): 2340-2361. |
Related Resources
Related Links
UCSC Genome Browser
NCBI
Repeat Sequences
NIH Stem Cell Information Home Page
RepeatMasker server
A Practical Introduction to MATLAB®, Mark S. Gockenbach, Michigan Tech
MATLAB® Primer, Kermit Sigmon, UF
MEME/MAST Online Tools