Abstract/Syllabus:
Dewey, C. Forbes, Hanry Yu, and Sourav Saha Bhowmick, 20.453J Biomedical Information Technology, Fall 2008. (Massachusetts Institute of Technology: MIT OpenCourseWare), http://ocw.mit.edu (Accessed 09 Jul, 2010). License: Creative Commons BY-NC-SA
Biomedical Information Technology (BE.453J)
Spring 2005
Biomedical technologies like this gene array are fueling rapid growth in sophisticated information systems for medical applications. (Courtesy of the National Institute of Standards and Technology).
Course Highlights
This course features a complete set of lecture slides.
Course Description
The objective of this subject is to teach the design of contemporary information systems for biological and medical data. These data are growing at a prodigious rate, and new information systems are required. This subject will use examples from biology and medicine to illustrate complete life-cycle information systems, beginning with data acquisition, continuing to data storage, and ending with retrieval and analysis. Design of appropriate databases, client-server strategies, data interchange protocols, and computational modeling architectures will be covered. Students are expected to have some familiarity with scientific application software and a basic understanding of at least one contemporary programming language (C, C++, Java®, Lisp, Perl, Python, etc.). A major term project is required of all students. Reading is assigned from the contemporary literature, and there is occasional homework.
Technical Requirements
Microsoft® PowerPoint® software is recommended for viewing the .ppt files found on this course site. The free Microsoft® PowerPoint® viewer can also be used to view the .ppt files.
Syllabus
Course Overview
The objective of this subject is to teach the design of contemporary information systems for biological and medical data. These data are growing at a prodigious rate, and new information systems are required. This subject will use examples from biology and medicine to illustrate complete life-cycle information systems, beginning with data acquisition, continuing to data storage, and ending with retrieval and analysis. Design of appropriate databases, client-server strategies, data interchange protocols, and computational modeling architectures will be covered. Students are expected to have some familiarity with scientific application software and a basic understanding of at least one contemporary programming language (C, C++, Java®, Lisp, Perl, Python, etc.).
This H-level graduate course is also open to motivated seniors with a strong interest in biomedical engineering and information system design and the ability to carry out a significant independent project.
Prerequisites
1.00, 6.001, or experience with web-based computing.
Readings
There is no course textbook. Readings will be provided from the contemporary literature for each class session. Readings are also incorporated into some of the homework assignments.
Term Project
Each student in the course is required to present a term project that illustrates the use of the course material in a real information technology case in biology or medicine. The actual content of the case can vary depending upon the student's interests and existing skills. Projects can range from general studies of a class of problems and the recommendation of a solution to detailed implementations in running software.
Homework
Prior to the term project, students will complete four homework assignments.
Calendar
LEC # |
TOPICS |
KEY DATES |
1 |
Introduction
Objectives and Methodology
The Information-driven Scientific Method
Ontology and Semantics for Biomedical Information
Term Paper Instructions |
|
2 |
The Life Cycle of Scientific Data
Hypothesis to be Tested
Data Acquisition: Design
Data Acquisition: Measurement and Storage
Analysis and Modeling
Accommodating the Unknown - By Design |
|
Part I: Basic Technologies (3 Weeks) |
3 |
General Principles of Client-Server Architectures
Several contemporary papers on client-server architecture will be assigned for reading and discussed under the following topics:
The Parts: Client, Server, and "Glue"
Dividing the Tasks
Re-usable Code
Projections of Hardware and Software Trends |
Homework 1 due |
4 |
Database Technology I
Object and Relational Database Technology
The Database Schema
Accessing Databases: SQL, ODBC, and JDBC
Size, Performance, and Other Issues |
|
5 |
Database Technology II
Designing Database Schema
Stored Procedures and Similar Mechanisms
Local vs. Global Databases
Database Federation |
Homework 2 due |
6 |
Client Technology
The Classic Tradeoff: Server-side vs. Client-side Functionality
Handling Data Returned From the Server
Choice of Client Software: Java or ??
Graphics Capability |
|
7 |
The Umbilical Cord And Alphabet Soup
Java Contributions
Network Technologies: TCP/IP, Sockets, Threads
Encapsulation Layer: XML®, RDF
Use of CORBA |
Homework 3 due |
8 |
Metadata And The Support of Data Analysis
Getting the Data
Creating Storable Results
Storage Options: Keeping the Object Connection (UIDs)
Querying and Manipulating the Metadata
Implementing Database Federation and Complex Queries |
Homework 4 due |
9 |
Putting it All Together: Complete Architectures
Component Definition and Documentation
Connecting the Components
Robustness and Maintenance
Performance and Security
Interoperability
Standards for Data Interchange |
|
Part II: Selected Examples (4 Weeks) |
10 |
Medical Imaging Information I
The DICOM International Standard
The Patient-study-series-image Hierarchy
Design of an Object-oriented Database for DICOM Images |
|
11 |
Medical Imaging Information II
Integration of Metadata into the Images
Compression
Integration of Images into the Healthcare Environment: IHE
Beyond Databases: Structured Reporting |
|
12 |
No Formal Class Meeting
Use this week to complete a three-page proposal for the term paper. Schedule individual guidance sessions with the instructor. A critique will be returned by email prior to Lecture 13. |
Term paper proposal due after 5 days |
13 |
Micro Array Information I
Raw Data and Experiment Information
Existing Information Standards: MAGE-OM, MIAME
Existing Database Schema: Array Express, MIAMExpress
Integration of Micro Array and Gel Electrophoresis Schema |
|
14 |
Micro Array Information II
Methods of Analysis
Use Cases for Analysis
Storage and Query of Metadata
An Example: Pacific Northwest National Laboratory |
|
15 |
Gel Electrophoresis
Statement of the Experimental Problem
Defining Experimental Information Objects
The Case for Keeping Raw Image Data
Creating Individual Tables and Functions
Design of a Database Schema for Gel Electrophoresis
Interacting with External Analysis Programs
Storing and Retrieving Metadata
Generalization of Results to Other Experimental Methods |
|
16 |
Swan: Semantic Web Applications in Neuromedicine
Guest lecture by Tim Clark, Massachusetts General Hospital and Harvard Medical School |
|
17 |
Firespout ETL: THE Extract/Transform/Load Engine for Stored Data
Guest Lecture by Ngon Dao, Formerly CEO of Firespout |
|
18 |
Firespout: The Launching of an Information Technology Company
Guest Lecture by Ngon Dao, Formerly CEO of Firespout |
|
19 |
Biological Image Information
SEM, TEM, and Cryo-EM
Fluorescent Images
Analysis Requirements for Different Imaging Modalities
Compression and Other Strategies for Minimizing Storage
Similarities with Other Experimental Data Types
OME - The Open Microscopy Environment |
|
Part III: Data Integration and Analysis (1 Week) |
20 |
Data Integration and Analysis I
Integration in the Hospital Environment
- Imaging and Information Flow
- Use of the DICOM Standard
- The Personal Healthcare Record (EMR)
- HL-7: The Hospital Standard for Information Exchange
- IHE: Integrating the Healthcare Enterprise
The Importance of Use Cases: York Hospital
Adding Metadata to Images and Other Records
- DICOM Structured Reporting (SR)
Diagnostic Coding Systems: SNOMED |
|
21 |
Data Integration and Analysis II
Integration in the Biological Environment
- New Standards are Required
- The Role of the W3C
- XML® and RDF as the "Medium and the Message"
(i) XML®/RDF for Schema Representation
(ii) XML®/RDF for Neutral Transport
(iii) RDF/OWL for Semantic Packages
Database Considerations
- Strengths of Relational Databases
- Weaknesses of the Relational Model
Solutions
- Database Federation
Adding Metadata to Images and Other Records |
|
Part IV: Student Presentations and Summary (2 Weeks) |
22 |
Student Paper Presentations I |
|
23 |
Student Paper Presentations II |
|
24 |
Student Paper Presentations III |
|
25 |
Capstone Roundtable Discussion
Session with leading IT professionals from the pharmaceutical and medical community in Boston. Discussion will include the current state of IT for dealing with large medical and biological data sources, future challenges, and future opportunities. |
|
26 |
Last Class (Part 2)
Open Discussion in "Relaxed Atmosphere" |
|
Further Reading:
Readings
Course Readings
|
Lec #
|
Topics
|
Readings
|
1
|
Introduction
Objectives and Methodology
The Information-driven Scientific Method
Ontology and Semantics for Biomedical Information
Term Paper Instructions
|
|
2
|
The Life Cycle of Scientific Data
Hypothesis to be Tested
Data Acquisition: Design
Data Acquisition: Measurement and Storage
Analysis and Modeling
Accommodating the Unknown - by Design
|
Neumann, Eric K., Eric Miller, and John Wilbanks. "What the semantic web could do for the life sciences." Drug Discovery Today 2, no. 6 (2004): 228-236.
Advisory Committee to the Director, NIH Working Group on Biomedical Computing. "The Biomedical Information Science and Technology Initiative." Bethesda, MD: National Institutes of Health, U.S. Department of Health and Human Services, 1999.
Release 1.0 21, no. 1 (2003): 1-36. (Entire periodical.)
|
Part I: Basic Technologies (3 Weeks)
|
3
|
General Principles of Client-Server Architectures
Several contemporary papers on client-server architecture will be assigned for reading and discussed under the following topics:
The Parts: Client, Server, and "Glue"
Dividing the Tasks
Re-usable Code
Projections of Hardware and Software Trends
|
Wong, Stephen T. C., and H. K. Huang. "Design Methods and Architectural Issues of Integrated Medical Image Data Base Systems." Computerized Medical Imaging and Graphics 20, no. 4 (1996): 285-299.
|
4
|
Database Technology I
Object and Relational Database Technology
The Database Schema
Accessing Databases: SQL, ODBC, and JDBC
Size, Performance, and Other Issues
|
Covitz, Peter A., et al. "caCORE: A common infrastructure for cancer informatics." Bioinformatics 19, no. 18 (2003): 2404-2412.
Gupta, Amarnath, Bertram Ludäscher, and Maryann E. Martone. "Knowledge-Based Integration of Neuroscience Data Sources." 12th Intl. Conference on Scientific and Statistical Database Management (SSDBM), Berlin, Germany, IEEE Computer Society, July, 2000.
Hollander, Dave, and C. M. Sperberg-McQueen. "Happy Birthday, XML®!"
The World Wide Web Consortium Issues XML® 1.0 as a W3C Recommendation.
Development History.
|
5
|
Database Technology II
Designing Database Schema
Stored Procedures and Similar Mechanisms
Local vs. Global Databases
Database Federation
|
Kemp, Graham J. L., Nicos Angelopoulos, and Peter M. D. Gray. "A Schema-based Approach to Building a Bioinformatics Database Federation." Presented at the Proc. IEEE Int. Symp. Bio-Informatics Biomed. Eng. (BIBE), Washington, DC, November 2000.
Stonebraker, Michael, and Paul Brown. Object-Relational DBMSs: Tracking the Next Great Wave. 2nd ed. San Francisco, CA: Morgan Kaufmann Publishers, Inc., 1999, pp. 31-34. ISBN: 9781558604520.
|
6
|
Client Technology
The Classic Tradeoff: Server-side vs. Client-side Functionality
Handling Data Returned From the Server
Choice of Client Software: Java or ??
Graphics Capability
|
Seshadri, Govind. "Understanding JavaServer Pages Model 2 Architecture." JavaWorld, December 29, 1999.
Hall, Marty. "Java Integrated Development Environments (IDEs) and Editors."
Dichter, Carl, and Chris Tynes. "Which Java visual development environment is best for you?" JavaWorld, June 1, 1997.
"Oracle JDeveloper 10g Overview." An Oracle White Paper. March 2004.
JCreator - Java IDE.
Borland Software. "Borland JBuilder 2005 Technical Overview." (PDF)
Gibbs, Charles. "Setting Up a Java Development Environment for Linux." Linux Gazette 45 (September 1999).
The Middleware Company Research Team. "Assisted SOA Development: Productivity Comparison of SOA Development Tools." November 2004. (Notice: company discontinued.)
|
7
|
The Umbilical Cord And Alphabet Soup
Java Contributions
Network Technologies: TCP/IP, Sockets, Threads
Encapsulation Layer: XML®, RDF
Use of CORBA
|
Sun Microsystems, Inc. "Java Technology. Java 2 Platform, Enterprise Edition (J2EE) Overview." Sun Developer Network.
|
8
|
Metadata And The Support of Data Analysis
Getting the Data
Creating Storable Results
Storage Options: Keeping the Object Connection (UIDs)
Querying and Manipulating the Metadata
Implementing Database Federation and Complex Queries
|
McGuinness, Deborah L., and Frank van Harmelen, eds. "OWL Web Ontology Language Overview." February 10, 2004.
|
9
|
Putting it All Together: Complete Architectures
Component Definition and Documentation
Connecting the Components
Robustness and Maintenance
Performance and Security
Interoperability
Standards for Data Interchange
|
W3C. "Session V: Life Science Identifiers - Use Cases, Future Directions." Presented to the Broad Institute, 2004.
Quan, Dennis. "BioHaystack: Gateway to the Biological Semantic Web." IBM, 2004. (PPT)
Liefeld, Ted. "Session V: Life Science Identifiers – Use Cases, Future Directions." Presented at the W3C 2004 conference. Broad Institute, Cambridge, MA, 2004. (PDF) (Courtesy of Ted Liefeld. Used with permission.)
|
Part II: Selected Examples (4 Weeks)
|
10
|
Medical Imaging Information I
The DICOM International Standard
The Patient-study-series-image Hierarchy
Design of an Object-oriented Database for DICOM Images
|
Bray, Tim, et al, eds. "Namespaces in XML® 1.1." December 18, 2002.
W3C Communications Team. "XML® in 10 points." November 13, 2001.
"Life Science Identifier (LSID) Resolution Protocol Project."
|
11
|
Medical Imaging Information II
Integration of Metadata into the Images
Compression
Integration of Images into the Healthcare Environment: IHE
Beyond Databases: Structured Reporting
|
Skodras, A. N., C. A. Christopoulos, and T. Ebrahimi. "JPEG2000: The Upcoming Still Image Compression Standard." Proceedings of the 11th Portuguese Conference on Pattern Recognition (RECPA00D). Porto, Portugal, May 11-12, 2000, pp. 359-366.
Cimino, J. J. "Review Paper: Coding Systems in Health Care." Methods of Information in Medicine 35 (1996): 273-84.
Clunie, David A. "DICOM Structured Reporting: An object model as an implementation boundary." SPIE Medical Imaging (2001). (PDF)
Bidgood Jr., W. D., et al. "The Role of Digital Imaging and Communications in Medicine in an Evolving Healthcare Computing Environment: The Model is The Message." Journal of Digital Imaging 11, no. 1 (1998): 1-9.
Clunie, David A. "Structured Reporting Concepts." Chapter 1 in DICOM Structured Reporting. Bangor, PA: PixelMed Publishing, 2000. ISBN: 9780970136909. (Out of print.)
Maojo, Victor, and Casimir A. Kulikowski. "Bioinformatics and Medical Informatics: Collaborations on the Road to Genomic Medicine?" Journal of the American Medical Informatics Association 10, no. 6 (2003): 515-520.
|
12
|
No Formal Class Meeting
Use this week to complete a three-page proposal for the term paper. Schedule individual guidance sessions with the instructor. A critique will be returned by email prior to Lecture 13.
|
|
13
|
Micro Array Information I
Raw Data and Experiment Information
Existing Information Standards: MAGE-OM, MIAME
Existing Database Schema: Array Express, MIAMExpress
Integration of Micro Array and Gel Electrophoresis Schema
|
Stoeckert Jr., Christian J., Helen C. Causton, and Catherine A. Ball. "Microarray databases: standards and ontologies." Nature Genetics Supplement 32 (2002): 469-473.
Namprempre, Chanathip. "A Proposed DICOM Format for Electrocardiograms." International Consortium for Medical Imaging Technology (ICMIT).
Booch, Grady. "Software Architecture." IBM, 2004. (PPT - 4 MB)
|
14
|
Micro Array Information II
Methods of Analysis
Use Cases for Analysis
Storage and Query of Metadata
An Example: Pacific Northwest National Laboratory
|
Saal, Lao H., Carl Troein, Johan Vallon-Christersson, Sofia Gruvberger, Åke Borg and Carsten Peterson. "BioArray Software Environment: A Platform for Comprehensive Management and Analysis of Microarray Data." Genome Biology 3, no. 8 (2002): software0003.1-0003.6.
|
15
|
Gel Electrophoresis
Statement of the Experimental Problem
Defining Experimental Information Objects
The Case for Keeping Raw Image Data
Creating Individual Tables and Functions
Design of a Database Schema for Gel Electrophoresis
Interacting with External Analysis Programs
Storing and Retrieving Metadata
Generalization of Results to Other Experimental Methods
|
Brazma, Alvis, et al. "Minimum information about a microarray experiment (MIAME)-toward standards for microarray data." Nature Genetics 29 (2001): 365-366.
Mitchell, Peter. "A perspective on protein microarrays." Nature Biotechnology 20 (2002): 225-229.
Taylor, Chris F., et al. "A systematic approach to modeling, capturing, and disseminating proteomics experimental data." Nature Biotechnology 21 (2003): 247-254.
|
16
|
Swan: Semantic Web Applications in Neuromedicine
Guest lecture by Tim Clark, Massachusetts General Hospital and Harvard Medical School
|
Berners-Lee, Tim, James Hendler, and Ora Lassila. "The Semantic Web." Scientific American (May 2001): 36-43.
Hausser, Roland. "The four basic ontologies of semantic interpretation."
Keller, Richard M., et al. "SemanticOrganizer: A Customizable Semantic Repository for Distributed NASA Project Teams." International Semantic Web Conference (ISWC) 2004, LNCS 3298, 2004, pp. 767-781.
Clark, Tim, Sean Martin, and Ted Liefeld. "Globally distributed object identification for biological knowledgebases." Briefings in Bioinformatics 5, no. 1 (2004): 59-70.
Hendler, James. "Science and the Semantic Web." Science 299 (2003): 520-521.
Li, Gangmin, et al. "ClaiMaker: Weaving a Semantic Web of Research Papers." Research paper at the International Semantic Web Conference (ISWC) 2002, June 9-12, 2002, Sardinia, Italy.
|
17
|
Firespout ETL: THE Extract/Transform/Load Engine for Stored Data
Guest Lecture by Ngon Dao, Formerly CEO of Firespout
|
|
18
|
Firespout: The Launching of an Information Technology Company
Guest Lecture by Ngon Dao, Formerly CEO of Firespout
|
|
19
|
Biological Image Information
SEM, TEM, and Cryo-EM
Fluorescent Images
Analysis Requirements for Different Imaging Modalities
Compression and Other Strategies for Minimizing Storage
Similarities with Other Experimental Data Types
OME - The Open Microscopy Environment
|
Swedlow, Jason R. "Informatics and Quantitative Analysis in Biological Imaging." Science 300 (2003): 100-102.
DICOM Standards Committee, Working Group 13. "Supplement 15: Visible Light Image for Endoscopy, Microscopy, and Photography." Digital Imaging and Communications in Medicine (DICOM) (July 2, 1999).
|
Part III: Data Integration and Analysis (1 Week)
|
20
|
Data Integration and Analysis I
Integration in the Hospital Environment
- Imaging and Information Flow
- Use of the DICOM Standard
- The Personal Healthcare Record (EMR)
- HL-7: The Hospital Standard for Information Exchange
- IHE: Integrating the Healthcare Enterprise
The Importance of Use Cases: York Hospital
Adding Metadata to Images and Other Records
- DICOM Structured Reporting (SR)
Diagnostic Coding Systems: SNOMED
|
|
21
|
Data Integration and Analysis II
Integration in the Biological Environment
- New Standards are Required
- The Role of the W3C
- XML® and RDF as the "Medium and the Message"
(i) XML®/RDF for Schema Representation
(ii) XML®/RDF for Neutral Transport
(iii) RDF/OWL for Semantic Packages
Database Considerations
- Strengths of Relational Databases
- Weaknesses of the Relational Model
Solutions
- Database Federation
Adding Metadata to Images and Other Records
|
"Standards Man." Health IT World.
"Ontology Management System." IBM, October 30, 2003.
|
Part IV: Student Presentations and Summary (2 Weeks)
|
22
|
Student Paper Presentations I
|
|
23
|
Student Paper Presentations II
|
|
24
|
Student Paper Presentations III
|
|
25
|
Capstone Roundtable Discussion
Session with leading IT professionals from the pharmaceutical and medical community in Boston. Discussion will include the current state of IT for dealing with large medical and biological data sources, future challenges, and future opportunities.
|
|
26
|
Last Class (Part 2)
Open Discussion in "Relaxed Atmosphere"
|
|