Instructor
Brian Caffo
Offered By
Biostatistics
Description
Presents fundamental concepts in applied probability, exploratory data analysis, and statistical inference, focusing on probability and analysis of one and two samples. Topics include discrete and continuous probability models; expectation and variance; central limit theorem; inference, including hypothesis testing and confidence intervals for means, proportions, and counts; maximum likelihood estimation; sample size determinations; elementary non-parametric methods; graphical displays; and data transformations.
Learning Objectives
The goal of this course is to equip biostatistics and quantitative scientists with core applied statistical concepts and methods:
1) The course will refresh the mathematical, computational, statistical and probability background that students will need for the course.
2) The course will introduce students to the display and communication of statistical data. This will include graphical and exploratory data analysis using tools like scatterplots, boxplots and the display of multivariate data. In this objective, students will be required to write extensively.
3) Students will learn the distinctions between the fundamental paradigms underlying statistical methodology.
4) Students will learn the basics of maximum likelihood.
5) Students will learn the basics of frequentist methods: hypothesis testing, confidence intervals.
6) Students will learn basic Bayesian techniques, interpretation and prior specification.
7) Students will learn the creation and interpretation of P values.
8) Students will learn estimation, testing and interpretation for single group summaries such as means, medians, variances, correlations and rates.
9) Students will learn estimation, testing and interpretation for two group comparisons such as odds ratios, relative risks and risk differences.
10) Students will learn the basic concepts of ANOVA.
Syllabus
Prerequisites
Calculus, linear algebra and a moderate level of mathematical literacy are prerequisites for this class. Note that simply having the prerequisites for this class does not necessarily mean that it is the correct class for you. For example, a student with a PhD in theoretical mathematics who would like a broad overview of biostatistics and immediately applicable techniques would be better off in the 620 series.
Readings
Mathematical Statistics and Data Analysis, 2nd Edition by John A. Rice. Duxbury Press.
Schedule
For lectures 1-14, please visit Methods in Biostatistics I (140.651)
|
15 |
Hypothesis Testing
1. Introduce hypothesis testing
2. Cover hypothesis testing for a single mean
3. Z and t tests for a single mean
4. Confidence interval equivalences
5. P-values
|
Hypothesis Testing Graphs
Read Rosner 7.1-7.7
|
16 |
Power and sample size and two group tests
1. Power
2. Power for a one sided normal test
3. Power for t-test
|
Hypothesis Testing Review
Read Rosner 8.1-8.8 and 8.10-8.12
|
17 |
Power and sample size and two group tests
1. Paired difference hypothesis tests
2. Independent group differences hypothesis tests
|
Hypothesis Testing Review
Read Rosner 8.1-8.8 and 8.10-8.12
|
18 |
Tests for binomial proportions
1. Tests for a binomial proportion
2. Score test versus Wald
3. Exact binomial test
4. Tests for differences in binomial proportions
5. Intervals for differences in binomial proportions
|
Read Rosner 7.10
|
19 |
Two sample binomial tests, delta method
1. Define relative risk
2. Odds ratio
3. Confidence intervals
|
Read Rosner 10.1, 10.2, 13.1-13.3
|
20 |
Two sample binomial tests, delta method
1. Review two sample binomial results
2. Delta method
|
Read Rosner 10.1, 10.2, 13.1-13.3
|
21 |
Fisher's exact tests, Chi-squared tests
1. Introduce Fisher's exact test
2. Illustrate Monte Carlo version of test
|
Read Rosner 10.2, 10.3, 10.6-10.9
|
22 |
Fisher's exact tests, Chi-squared tests
1. Chi-squared tests for equivalence of two binomial proportions
2. Chi-squared tests for independence, 2 x 2 tables
3. Chi-squared tests for multiple binomial proportions
4. Chi-squared tests for independence, r x c tables
5. Chi-squared tests for goodness of fit
|
Multinomial Distribution Notes
Read Rosner 10.2, 10.3, 10.6-10.9
|
23 |
Simpson's paradox, confounding
1. Simpson's paradox
2. Weighting
3. CMH estimate
4. CMH test
|
Read Rosner 13.4 & 13.5
|
24 |
Retrospective case-control studies, exact inference for the odds ratio
1. Odds ratios for retrospective studies
2. Odds ratios approximating the prospective RR
3. Exact inference for the odds ratio
|
|
25 |
Methods for matched pairs, McNemar's, conditional versus marginal odds ratios
1. Hypothesis tests of marginal homogeneity
2. Estimating marginal risk differences
3. Estimating marginal odds ratios
4. A brief note on the distinction between conditional and marginal odds ratios
|
Read Rosner 10.4
|
26 |
Non-parametric tests, permutation tests
1. Distribution-free tests
2. Sign test
3. Signed rank test
4. Rank sum test
5. Discussion of non-parametric tests
|
|
27 |
Inference for Poisson counts
1. Poisson distribution
2. Tests of hypothesis for a single Poisson mean
3. Comparing multiple Poisson means
4. Likelihood equivalence with exponential model
|
|
28 |
Multiplicity
1. Familywise error rates
2. Bonferroni procedure
3. Performance of Bonferroni with multiple independent tests
4. False discovery rate procedure
|
|
|