Essentials of Probability and Statistical Inference, posted by member7_php on 3/8/2009

Abstract/Syllabus:

## Essentials of Probability and Statistical Inference IV: Algorithmic and Nonparametric Approaches

### Spring 2006

Course

Rafael Irizarry

Biostatistics

#### Description

Introduces the theory and application of modern, computationally based methods for exploring and drawing inferences from data. Covers re-sampling methods, non-parametric regression, prediction, dimension reduction, and clustering. Specific topics include Monte Carlo simulation, the bootstrap, cross-validation, splines, locally weighted regression, CART, random forests, neural networks, support vector machines, and hierarchical clustering. De-emphasizes proofs, replacing them with extended discussion of how to interpret results, illustrated through simulation and data analysis.

## Syllabus

#### Course Objectives

After completing this course, a student will be able to understand the theoretical basis for the current methods used in statistical analysis.

#### Prerequisites

140.646-648 or 140.611-12 or 140.621-24 or 140.651-54 or 140.671-74; working knowledge of calculus

#### Readings

• Hastie, T., Tibshirani, R., and Friedman, J. H. (2001) The Elements of Statistical Learning. Springer-Verlag: New York.
• Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S-Plus. Springer-Verlag: New York.
• Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge University Press.

#### Course Requirements

Students are evaluated on homework, quizzes, and a final project.

## Schedule

| Session # | Topic | Activities |
| --- | --- | --- |
| N/A | Review | Lecture: Stuff you should know: basics of probability, the central limit theorem, and inference. |
| 1 | Introduction to Regression and Prediction | Lecture: We will describe linear regression in the context of a prediction problem. |
| 2 | Overview of Supervised Learning | Lecture: Regression for predicting bivariate data, K nearest neighbors (KNN), bin smoothers, and an introduction to the bias/variance trade-off. |
| 3 | Linear Methods for Regression | Lecture: Subset selection and ridge regression. We will use singular value decomposition (SVD) and principal component analysis (PCA) to understand these methods. |
| 4 | Linear Methods for Regression | Lecture: Subset selection and ridge regression. We will use singular value decomposition (SVD) and principal component analysis (PCA) to understand these methods. |
| 5 | Linear Methods for Classification | Lecture: Linear regression, linear discriminant analysis (LDA), and logistic regression. |
| 6 | Kernel Methods | Lecture: Kernel smoothers, including loess. We will briefly describe two-dimensional smoothers. We will also define degrees of freedom in the context of smoothing and learn about density estimators. |
| 7 | Model Assessment and Selection | Lecture: We revisit the bias/variance trade-off. We describe how Monte Carlo simulations can be used to assess bias and variance. We then introduce cross-validation, AIC, and BIC. |
| 8 | The Bootstrap | Lecture: We give a short introduction to the bootstrap and demonstrate its utility in smoothing problems. |
| 9 | Splines, Wavelets, and Friends | Lecture: We give intuitive and mathematical descriptions of splines and wavelets. We use the SVD to understand these better and see connections with signal processing methods. |
| 10 | Splines, Wavelets, and Friends | Lecture: We give intuitive and mathematical descriptions of splines and wavelets. We use the SVD to understand these better and see connections with signal processing methods. |
| 11 | Additive Models, GAM and Neural Networks | Lecture: We move back to cases with many covariates. We introduce projection pursuit, additive models, and generalized additive models. We briefly describe neural networks and explain the connection to projection pursuit. |
| 12 | Additive Models, GAM and Neural Networks | Lecture: We move back to cases with many covariates. We introduce projection pursuit, additive models, and generalized additive models. We briefly describe neural networks and explain the connection to projection pursuit. |
| 13 | Model Averaging | Lecture: Bayesian statistics, boosting, and bagging. |
| 14 | CART, Boosting and Additive Trees | Lecture: We introduce classification and regression trees (CART) as well as more modern versions such as random forests. |
| 15 | CART, Boosting and Additive Trees | Lecture: We introduce classification and regression trees (CART) as well as more modern versions such as random forests. |
| 16 | Clustering Algorithms | Lecture |
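Session 2 lists K nearest neighbors (KNN) and the bias/variance trade-off. As a rough illustration that is not taken from the course materials, a one-dimensional KNN smoother might be sketched in plain Python as follows. The function name `knn_predict` and the toy data (a noisy parabola) are assumptions made for this example only.

```python
import random

def knn_predict(x0, xs, ys, k):
    """Predict y at x0 by averaging the y-values of the k nearest x's."""
    # Rank training indices by distance to x0 and keep the k closest.
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / len(nearest)

# Hypothetical toy data: y = x^2 plus Gaussian noise.
rng = random.Random(0)
xs = [i / 50 for i in range(51)]              # 51 points on [0, 1]
ys = [x * x + rng.gauss(0, 0.05) for x in xs]

# Small k follows the data closely (low bias, high variance);
# large k averages over a wide window (high bias, low variance).
for k in (1, 5, 51):
    print(k, round(knn_predict(0.5, xs, ys, k), 3))
```

With `k` equal to the sample size the prediction collapses to the overall mean of `ys`, which is the extreme high-bias end of the trade-off.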
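Session 7 mentions using Monte Carlo simulation to assess bias and variance. One way this could look, under assumptions chosen purely for illustration, is to simulate many samples from a known distribution and compare an estimator's average with the true value. Here the estimator is the variance with divisor n (Python's `statistics.pvariance`), applied to standard normal samples of size 5. The helper name `mc_bias_variance` is hypothetical.

```python
import random
import statistics

def mc_bias_variance(estimator, truth, n, reps, seed=0):
    """Monte Carlo estimate of an estimator's bias and variance.

    Draws `reps` samples of size `n` from N(0, 1), applies `estimator`
    to each, and compares the results with the known `truth`.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        estimates.append(estimator(sample))
    bias = statistics.fmean(estimates) - truth
    variance = statistics.variance(estimates)
    return bias, variance

# The divisor-n variance estimator is biased downward: for N(0, 1) data its
# expectation is (n - 1) / n, so the bias should be near -0.2 when n = 5.
bias, var = mc_bias_variance(statistics.pvariance, truth=1.0, n=5, reps=20000)
print(f"bias ~ {bias:.3f}, variance ~ {var:.3f}")
```

The same loop applies to any estimator for which the truth is known by construction, including a smoother evaluated at a fixed point.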
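Session 7 also introduces cross-validation. A minimal sketch of K-fold cross-validation, assuming a simple KNN smoother as the model being assessed, is given below. The names `knn_predict` and `cv_mse`, the fold count, and the simulated sine-curve data are all assumptions for this example rather than anything specified in the syllabus.

```python
import math
import random

def knn_predict(x0, xs, ys, k):
    """Average the y-values of the k training points nearest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / len(nearest)

def cv_mse(xs, ys, k, n_folds=5, seed=0):
    """Estimate the prediction error of KNN by n_folds-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]   # disjoint folds
    sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        # Fit on everything except the held-out fold.
        train_x = [xs[i] for i in idx if i not in held]
        train_y = [ys[i] for i in idx if i not in held]
        for i in held_out:
            sq_err += (ys[i] - knn_predict(xs[i], train_x, train_y, k)) ** 2
    return sq_err / len(xs)

# Hypothetical data: one period of a sine curve plus Gaussian noise.
rng = random.Random(1)
xs = [i / 100 for i in range(100)]
ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]

# Compare candidate neighborhood sizes by their cross-validated error.
for k in (1, 5, 15, 45):
    print(k, round(cv_mse(xs, ys, k), 3))
```

Each point is predicted only by a model that never saw it, so the resulting error estimate is not flattered by overfitting the way the training error is.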
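Session 8 gives a short introduction to the bootstrap. The basic nonparametric version can be sketched as follows: resample the observed data with replacement, recompute the statistic on each resample, and use the spread of those replicates as a standard error. The function name `bootstrap_se`, the number of resamples, and the ten data values are made up for illustration.

```python
import random
import statistics

def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Bootstrap standard error of `statistic` computed on `data`."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Resample n observations with replacement from the original data.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return statistics.stdev(replicates)

# Hypothetical measurements.
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 3.0]

print("SE of median:", round(bootstrap_se(data, statistics.median), 3))
print("SE of mean:  ", round(bootstrap_se(data, statistics.fmean), 3))
```

For the mean, the bootstrap answer should land close to the familiar formula (standard deviation over the square root of n), which is a useful sanity check. The median has no such simple formula, which is where the bootstrap becomes convenient.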
