
Course Catalog

(number of credits in parentheses)
Big Data in Medicine (3) HBDS 5020

Course Director: Samprit Banerjee, Ph.D., MStat 

There has been an explosion of big data in medicine and healthcare. There are four main sources of such big data: 1) administrative databases in healthcare, such as electronic health records and health insurance claims; 2) biomedical imaging (e.g., MRI, CT scan, X-ray); 3) sensors in smartphones, wearables, and implantable devices; and 4) genetics and genomics. It is difficult to navigate and critically assess the statistical methods and analytic tools that are needed to conduct analytics and research with such big biomedical data. This course will introduce these four sources of big data in medical studies, discuss the nuances and intricacies of how such data are generated, and introduce tools to navigate, visualize, and describe such databases.

Biostatistics I with R Lab (4) HBDS 5005

Course Director: Karla Ballman, Ph.D.

This course provides an introduction to important topics in biostatistical concepts and reasoning. Specific topics include tools for describing central tendency and variability in data, probability distributions, sampling distributions, estimation, and hypothesis testing. Assignments will involve computation using the R programming language.

Categorical and Censored Data Analysis (2) HBDS 5016

Course Director: Oleksandr Savenkov, Ph.D.

The course will describe methods for categorical data analysis and basic concepts for censored data, including Kaplan-Meier estimation. Students will learn how to select appropriate methods and how to interpret the results from categorical data analysis and Kaplan-Meier analysis.

Data Science I (R and Python) (3) HBDS 5018

Course Director: Elizabeth Sweeney, Ph.D.

This course provides an introduction to data science using both the R and Python programming languages. In this course, students will gain experience working directly with data to pose and answer questions. The course will be divided into two parts: the first part will be taught with the programming language R and the second with Python. Topics covered include reproducible research, exploratory data analysis, data manipulation, data visualization techniques, simulation design, and unsupervised learning methods.

Hierarchical Modeling and Longitudinal Data Analysis (3) HBDS 5010

Course Director: Arindam RoyChoudhury, Ph.D.

An independent biostatistician often encounters data collected on patients over a length of time, or data that are otherwise clustered. This course will give students the tools necessary to analyze such data, while building on the core biostatistics material they have learned in other courses. Specifically, students will learn to use mixed-effect models, mixed-effect ANOVA, generalized linear mixed models (GLMM), mixed-effect Cox regression, Bayesian hierarchical models, and repeated-measures and longitudinal data analysis with appropriate covariance structures.

Modern Methods for Causal Inference (3) HBDS 5017

Course Director: Ivan Diaz, Ph.D.

The goal of this course is to introduce a core set of modern statistical concepts and techniques to the students, and to demonstrate how to use them to answer complex research questions in healthcare. The students will acquire knowledge on causal inference methods using machine learning, including directed acyclic graphs, non-parametric structural equation models, inverse probability weighting, g-computation, survival analysis, marginal structural models, longitudinal data, mediation analyses, effect modification, and precision medicine. This course will use the free software R to perform all statistical analysis.

Pharmaceutical Statistics (3) HBDS 5019

Course Director: Arindam RoyChoudhury, Ph.D.

Pharmaceutical studies use many statistical methods that are not routinely taught as part of conventional biostatistics courses. In this course, the students will learn the statistical methods specifically used in pharmaceutical studies.

The course is divided into three modules.

The first module: “Statistical Aspects of Phase I Clinical Trial” will include the 3+3 design, accelerated titration, up-and-down designs, the continual reassessment method (CRM), modified CRM, TITE-CRM, the Bayesian logistic regression model (BLRM), escalation with overdose control (EWOC), toxicity probability interval (TPI), and modified TPI (mTPI).

The second module: “Statistical Aspects of Phase II Clinical Trial” will include design and analysis for one-stage and Simon’s two-stage designs, and multi-arm Phase II designs.

The third module: “Statistical Aspects of Phase III Clinical Trial” will include randomization, and design and analysis for parallel, crossover, factorial, seamless Phase II/III, adaptive, and SMART designs.

Statistical Programming with SAS (3) HBDS 5011

Course Director: Zhengming Chen, Ph.D., MPH, M.S.

This course provides an introduction to the statistical software SAS. Students will receive hands-on exposure to data management and report generation with one of the most popular statistical software packages.

Study Design (2) HBDS 5015

Course Director: Linda Gerber, Ph.D.

The course will describe and apply measures of disease incidence and prevalence, and measures of effect; explain the basic principles underlying different study designs, including descriptive, ecological, cross-sectional, cohort, case-control and intervention studies; assess strengths and limitations of different study designs; identify problems interpreting epidemiological data: chance, bias, confounding and effect modification; address validity, intra-rater reliability and inter-rater reliability.