# Courses for Statistics

**STAT W 3107x and y Undergraduate Research**

This course provides a mechanism for students who undertake research with a
faculty member from the Department of Statistics to receive academic credit.
Students seeking research opportunities should be proactive and
entrepreneurial: identify congenial faculty whose research is appealing, and let
them know of your interest, background, and skills. - B. Baydil, R.
Neath

*Prerequisites: the project mentor's permission. BC: Fulfillment of
General Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

*May be repeated for credit.*

## Introductory Courses

The Department of Statistics offers three introductory courses, STAT W1001, W1111, and W1211. All three may be taken without preparation in statistics. All three cover roughly the same concepts, but differ substantially in the mathematical maturity that is assumed and in the sophistication of the examples.

STAT W1001 is for students who have no more than the most basic algebra, and may be of interest to students in non-mathematical disciplines seeking to satisfy the Quantitative and Deductive Reasoning requirement. STAT W1111 is for students who have mastered basic algebra; practice is emphasized over mathematical theory. STAT W1211 is for students with competence in differential and integral calculus and emphasizes theory over practice.

STAT W1211 or W1111 may be substituted for ECON BC2411 in satisfaction of the major requirements in Economics. STAT W1211 is required for the major in Mathematics-Statistics, Economics-Statistics, and Statistics, and for the concentration in Statistics. STAT W1001 and W1111 may be applied to the major requirement in Political Science-Statistics. Students who declared their major in Psychology prior to the 2008-2009 academic year may satisfy their major requirements with STAT W1111 or W1211 in lieu of PSYC BC1101.

STAT W2110 follows on the material of the three introductory courses, and is designed for students interested in developing practical skills. Applications of statistics to current issues in the sciences and social sciences are emphasized.

**STAT W 1001x and y Introduction to Statistical
Reasoning**

A friendly introduction to statistical concepts and reasoning with emphasis
on developing statistical intuition rather than on mathematical rigor. Topics
include design of experiments, descriptive statistics, correlation and
regression, probability, chance variability, sampling, chance models, and
tests of significance.

*BC: Fulfillment of General Education Requirement: Quantitative and
Deductive Reasoning (QUA).*

*3 points*

**STAT W 1101x and y Introduction to Statistics**

Designed for students in fields that emphasize quantitative methods.
Graphical and numerical summaries, probability, theory of sampling
distributions, linear regression, analysis of variance, confidence intervals
and hypothesis testing. Quantitative reasoning and data analysis. Practical
experience with statistical software. Illustrations are taken from a variety
of fields. Data-collection/analysis project with emphasis on study designs is
part of the coursework requirement.

*Prerequisites: intermediate high school algebra. BC: Fulfillment of
General Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

*3 points*

**STAT W 1201x and y Calculus-Based Introduction to
Statistics**

Designed for students who desire a strong grounding in statistical concepts
with a greater degree of mathematical rigor than in *STAT W1111*. Random variables, probability
distributions, pdf, cdf, mean, variance, correlation, conditional
distribution, conditional mean and conditional variance, law of iterated
expectations, normal, chi-square, F and t distributions, law of large
numbers, central limit theorem, parameter estimation, unbiasedness,
consistency, efficiency, hypothesis testing, p-value, confidence intervals,
maximum likelihood estimation. Serves as the pre-requisite for *ECON W3412*.

*Prerequisites: one semester of calculus. BC: Fulfillment of General
Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

*3 points*

**STAT W 1202x Undergraduate Seminar**

Prepared with undergraduates majoring in quantitative disciplines in mind,
the presentations in this colloquium focus on the interface between data
analysis, computation, and theory in interdisciplinary research. Meetings are
open to all undergraduates, whether registered or not. Presenters are drawn
from the faculty of departments in Arts and Sciences, Engineering, Public
Health and Medicine. - B. Baydil, R. Neath

*Prerequisites: Previous or concurrent enrollment in a course in
statistics would make the talks more accessible.*

*1 point*

**STAT W 2102y Applied Statistical Computing**

This course is an introduction to R programming. After learning basic
programming components, such as defining variables and vectors, and learning
different data structures in R, students will, via project-based assignments,
study more advanced topics, such as recursion, conditionals, modular
programming, and data visualization. Students will also learn the
fundamental concepts in computational complexity, and will practice writing
reports based on their statistical analyses.

*Corequisites: An introductory course in statistics (STAT UN1101 is recommended).*

**STAT W 2103x Applied Linear Regression Analysis**

Develops critical thinking and data analysis skills for regression analysis
in science and policy settings. Simple and multiple linear regression,
non-linear and logistic models, random-effects models. Implementation in a
statistical package. Emphasis on real-world examples and on planning,
proposing, implementing, and reporting.

*Prerequisites: An introductory course in statistics (STAT UN1101 is recommended). Students without programming
experience in R might find STAT UN2102 very helpful.*

*3 points*

**STAT W 2104y Applied Categorical Data Analysis**

This course covers statistical models and methods for analyzing and drawing
inferences for problems involving categorical data. The goals are
familiarity with and understanding of a substantial and integrated body of
statistical methods that are used for such problems, experience in analyzing
data using these methods, proficiency in communicating the results of
such methods, and the ability to critically evaluate the use of such methods.
Topics include binomial proportions, two-way and three-way contingency
tables, logistic regression, log-linear models for large multi-way
contingency tables, graphical methods. The statistical package R will be
used. - J. Landwehr

*Prerequisites: STAT UN2103 is strongly recommended. Students without
programming experience in R might find STAT UN2102 very helpful.*

*3 points*

**STAT W 3105x Applied Statistical Methods**

This course is intended to give students practical experience with
statistical methods beyond linear regression and categorical data analysis.
The focus will be on understanding the uses and limitations of models, not
the mathematical foundations for the methods. Topics that may be covered
include random and mixed-effects models, classical non-parametric techniques,
the statistical theory of causality, sample survey design, multi-level models,
generalized linear regression, generalized estimating equations and
over-dispersion, survival analysis including the Kaplan-Meier estimator,
log-rank statistics, and the Cox proportional hazards regression model.
Power calculations and proposal and report writing will be discussed.

*Prerequisites: At least one, and preferably both, of STAT UN2103 and UN2104 are strongly recommended. Students without
programming experience in R might find STAT UN2102 very helpful.*

*3 points*

**STAT W 3106y Applied Data Mining**

Data Mining is a dynamic and fast growing field at the interface of
Statistics and Computer Science. The emergence of massive datasets containing
millions or even billions of observations provides the primary impetus for
the field. Such datasets arise, for instance, in large-scale retailing,
telecommunications, and astronomy, and pose computational and statistical challenges. This
course will provide an overview of current practice in data mining. Specific
topics covered will include databases and data warehousing, exploratory data
analysis and visualization, descriptive modeling, predictive modeling,
pattern and rule discovery, text mining, Bayesian data mining, and causal
inference. The use of statistical software will be emphasized. - B.
Emir

*Prerequisites: STAT UN2103. Students without programming experience in R
might find STAT UN2102 very helpful.*

*3 points*

**STAT W 4001x and y Introduction to Probability and
Statistics**

A calculus-based tour of the fundamentals of probability theory and
statistical inference. Probability models, random variables, useful
distributions, conditioning, expectations, law of large numbers, central
limit theorem, point and confidence interval estimation, hypothesis tests,
linear regression. This course replaces SIEO 4150. - L. Wright, I.
Hueter

*Prerequisites: Calculus through multiple integration and infinite sums.
BC: Fulfillment of General Education Requirement: Quantitative and Deductive
Reasoning (QUA).*

*3 points*

## Foundation Courses

The Department offers STAT W3105, W3107, and W4315 as a sequence. W3105 covers probability theory and is a prerequisite for W3107. W3107 covers statistical theory, and is a prerequisite for STAT W4315. STAT W4315 covers linear regression models, and provides an introduction to practical issues in data analysis. Students who have difficulty scheduling STAT W3105 or W3107 may substitute, respectively, STAT W4105 and W4107, or substitute, for the pair, the combined course STAT W4109. The sequence is a prerequisite for the advanced undergraduate offerings in the Department (except W4604 and W4835, which have only W3105 as a prerequisite, and W4204, which has only STAT W3105 and W3107 as co-requisites). STAT W4150 is an abridged version of W3105 and W3107 designed especially for SEAS students.

**STAT W 4203x Probability Theory**

A calculus-based introduction to probability theory. A quick review of
multivariate calculus is provided. Topics covered include random variables,
conditional probability, expectation, independence, Bayes' rule, important
distributions, joint distributions, moment generating functions, central
limit theorem, laws of large numbers and Markov's inequality.

*Prerequisites: At least one semester, and preferably two, of calculus. An
introductory course (STAT UN2101, preferably) is strongly recommended. BC:
Fulfillment of General Education Requirement: Quantitative and Deductive
Reasoning (QUA).*

*3 points*

**STAT W 4204y Statistical Inference**

Calculus-based introduction to the theory of statistics. Useful
distributions, law of large numbers and central limit theorem, point
estimation, hypothesis testing, confidence intervals, maximum likelihood,
likelihood ratio tests, nonparametric procedures, theory of least squares and
analysis of variance.

*Prerequisites: STAT GU4203. At least one semester of calculus is required;
two or three semesters are strongly recommended. BC: Fulfillment of General
Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

*3 points*

**STAT W 4205x Linear Regression Models**

Theory and practice of regression analysis. Simple and multiple regression,
testing, estimation, prediction, and confidence procedures, modeling,
regression diagnostics and plots, polynomial regression, collinearity and
confounding, model selection, geometry of least squares. Extensive use of the
computer to analyse data.

*Prerequisites: STAT GU4204 or the equivalent, and a course in linear
algebra.*

*3 points*

**STAT W 4206x Statistical Computing and Introduction to Data
Science**

Introduction to programming in the R statistical package: functions, objects,
data structures, flow control, input and output, debugging, logical design,
and abstraction. Writing code for numerical and graphical statistical
analyses. Writing maintainable code and testing, stochastic simulations,
parallelizing data analyses, and working with large data sets. Examples from
data science will be used for demonstration.

*Prerequisites: STAT GU4204 and GU4205 or the equivalent.*

*3 points*

**STAT W 4207x and y Elementary Stochastic Processes**

Review of elements of probability theory. Poisson processes. Renewal theory.
Wald's equation. Introduction to discrete and continuous time Markov chains.
Applications to queueing theory, inventory models, branching processes. - M.
Brown

*Prerequisites: STAT GU4203 and two, preferably three, semesters of
calculus. BC: Fulfillment of General Education Requirement: Quantitative and
Deductive Reasoning (QUA).*

*3 points*

## Advanced Courses

**STAT W 3201x or y Math Finance in Continuous Time**

This follows *MATH V3050*. Basic concepts in probability theory, and
then advanced concepts, including Brownian motion, stochastic calculus,
expectation, Radon-Nikodym theorem, Girsanov's theorem, stochastic
differential equations (including Black-Merton-Scholes), options and hedging,
stochastic interest rates, forwards and futures. Formal proofs will be
eschewed in favor of understanding concepts.

*Prerequisites: MATH V3050. Not offered in 2016-2017.*

*3 points*

**STAT W 4221x and y Time Series Analysis**

Least squares smoothing and prediction, linear systems, Fourier analysis, and
spectral estimation. Impulse response and transfer function. Fourier series,
the fast Fourier transform, autocorrelation function, and spectral density.
Univariate Box-Jenkins modeling and forecasting. Emphasis on applications.
Examples from the physical sciences, social sciences, and business. Computing
is an integral part of the course.

*Prerequisites: STAT GU4205 or the equivalent. BC: Fulfillment of General
Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

*3 points*

**STAT W 4222y Nonparametric Statistics**

Statistical inference without parametric model assumptions. Hypothesis testing
using ranks, permutations, and order statistics. Nonparametric analogs of
analysis of variance. Non-parametric regression, smoothing and model
selection. - B. Sen

*Prerequisites: STAT GU4204 or the equivalent. BC: Fulfillment of General
Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

*3 points*

**STAT W 4223y Multivariate Statistical Inference**

Multivariate normal distribution, multivariate regression and classification;
canonical correlation; graphical models and Bayesian networks; principal
components and other models for factor analysis; SVD; discriminant analysis;
cluster analysis.

*Prerequisites: STAT GU4205 or the equivalent.*

**STAT W 4224y Bayesian Statistics**

Bayesian vs. frequentist, prior and posterior distributions, conjugate priors,
informative and non-informative priors, subjective and objective Bayes, one- and
two-sample problems, models for normal data, models for binary data,
multivariate normal shrinkage, Bayesian linear models, Bayesian computation
(start early), MCMC algorithms, the Gibbs sampler, hierarchical models,
empirical Bayes, hypothesis testing, Bayes factors, model selection,
software: R and WinBUGS.

*Prerequisites: STAT GU4204 or the equivalent.*

*3 points*

**STAT W 4231y Survival Analysis**

Survival distributions, types of censored data, estimation for various
survival models, nonparametric estimation of survival distributions, the
proportional hazard and accelerated lifetime models for regression analysis
with failure-time data. Extensive use of the computer. - M. Shnaidman

*Prerequisites: STAT GU4205 or the equivalent. BC: Fulfillment of General
Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

**STAT W 4232y Generalized Linear Models**

Statistical methods for rates and proportions, ordered and nominal
categorical responses, contingency tables, odds-ratios, exact inference,
logistic regression, Poisson regression, generalized linear models. - M.
Sobel

*Prerequisites: STAT GU4205 or the equivalent. BC: Fulfillment of General
Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

*3 points*

**STAT W 4233x Multilevel Models**

Theory and practice, including model-checking, for random and mixed-effects
models (also called hierarchical, multi-level models). Extensive use of the
computer to analyse data.

*Prerequisites: STAT GU4205 or the equivalent. BC: Fulfillment of General
Education Requirement: Quantitative and Deductive Reasoning (QUA). Not
offered in 2016-2017.*

**STAT W 4234y Sample Surveys**

Introductory course on the design and analysis of sample surveys. How sample
surveys are conducted, why the designs are used, how to analyze survey
results, and how to derive from first principles the standard results and
their generalizations. Examples from public health, social work, opinion
polling, and other topics of interest. - M. Sobel

*Prerequisites: STAT GU4204 or the equivalent. BC: Fulfillment of General
Education Requirement: Quantitative and Deductive Reasoning
(QUA).*

*3 points*

**STAT W 4241y Statistical Machine Learning**

The course will provide an introduction to Machine Learning and its core
models and algorithms. The aim of the course is to provide students of
statistics with detailed knowledge of how Machine Learning methods work and
how statistical models can be brought to bear in computer systems - not only
to analyze large data sets, but to let computers perform tasks that
traditional methods of computer science are unable to address. Examples range
from speech recognition and text analysis through bioinformatics and medical
diagnosis. This course provides a first introduction to the statistical
methods and mathematical concepts which make such technologies possible. -
Peter Orbanz

*Prerequisites: STAT GU4206.*

**STAT Q 4242x Advanced Machine Learning**

This course covers some advanced topics in machine learning and has an
emphasis on applications to real world data. A major part of this course is
a course project which consists of an in-class presentation and a written
project report.

*Prerequisites: STAT GU4241. Not offered in 2016-2017.*

*3 points*

**STAT W 4243y Applied Data Science**

This course will incorporate knowledge and skills covered in a statistical
curriculum with topics and projects in data science. Programming will be covered
using existing tools in R. Computing best practices will be taught using
test-driven development, version control, and collaboration. Students finish
the class with a portfolio on GitHub, and a deeper understanding of several
core statistical/machine-learning algorithms. Bi-weekly project cycles
throughout the semester provide students extensive hands-on experience with
various data-driven applications. - Tian Zheng

*Prerequisites: STAT GU4206 or the equivalent.*

*3 points*

**STAT W 4261x and y Statistical Methods in Finance**

A fast-paced introduction to statistical methods used in quantitative
finance. Financial applications and statistical methodologies are intertwined
in all lectures. Topics include regression analysis and applications to the
Capital Asset Pricing Model and multifactor pricing models, principal
components and multivariate analysis, smoothing techniques and estimation of
yield curves, statistical methods for financial time series, value at risk,
term structure models and fixed income research, and estimation and modeling
of volatilities. Hands-on experience with financial data.

*3 points*

**STAT W 4262y Stochastic Processes for Finance**

A careful review of the concept of stochastic process as a model of random
phenomena evolving through time and of conditional expectation, basic Markov
process theory, and the exponential distribution. Marked point processes and
their compensators, beginning with Poisson processes, and proceeding through
general marked point processes. The use of compensators will be justified by
the Doob-Meyer decomposition theorem, and as such it will connect the theory
to martingales. Markov processes will enter to provide a description of
sufficient conditions for the compensators to have absolutely continuous
paths (and as such, have "hazard rates"). Applications to survival analysis
and, especially, to mathematical finance, including default and bankruptcy
models. Cox process construction. This is a core course in the MS program
in mathematical finance.

*Prerequisites: STAT GU4203. STAT GU4207 is recommended.*

*3 points*

**STAT W 4263y Statistical Inference and Time-Series
Modelling**

Modeling and inference for random processes, from natural sciences to finance
and economics. ARMA, ARCH, GARCH and nonlinear models, parameter estimation,
prediction and filtering. This is a core course in the MS program in
mathematical finance.

*Prerequisites: STAT GU4204 or the equivalent. STAT GU4205 is recommended.*

*3 points*

**STAT G 4264x and y Stochastic Processes and
Applications**

Basics of continuous-time stochastic processes. Wiener processes. Stochastic
integrals. Ito's formula, stochastic calculus. Stochastic exponentials and
Girsanov's theorem. Gaussian processes. Stochastic differential equations.
Additional topics as time permits.

*Prerequisites: STAT GU4203. STAT GU4207 is recommended.*

*3 points*

**STAT W 4265x and y Stochastic Methods in Finance**

Mathematical theory and probabilistic tools for modeling and analyzing
security markets are developed. Pricing options in complete and incomplete
markets, equivalent martingale measures, utility maximization, term structure
of interest rates. This is a core course in the MS program in mathematical
finance.

*Prerequisites: STAT GU4264.*

*3 points*

**STAT G 4266 Stochastic Control and Applications in
Finance**

The course provides an introduction to the optimal control of stochastic
systems in continuous time. The topics are centered around controlled
diffusions and optimal stopping, and illustrated by applications in Finance
such as Merton's portfolio allocation problem, quadratic hedging, optimal
liquidation, or the pricing of American options. The theory of dynamic
programming is developed together with the associated partial differential
equations (Hamilton-Jacobi-Bellman equations) and boundary value problems, and
complemented by the convex duality method. - M. Nutz

*Prerequisites: STAT GU4265. Not offered in 2016-2017.*

**STAT W 4291x and y Advanced Data Analysis**

This is a course on getting the most out of data. The emphasis will be on
hands-on experience, involving case studies with real data and using common
statistical packages. The course covers, at a very high level, exploratory
data analysis, model formulation, goodness of fit testing, and other standard
and non-standard statistical procedures, including linear regression,
analysis of variance, nonlinear regression, generalized linear models,
survival analysis, time series analysis, and modern regression methods.
Students will be expected to propose a data set of their choice for use as
case study material. - Demissie Alemayehu

*Prerequisites: STAT GU4205 and at least one statistics course numbered
between GU4221 and GU4261. BC: Fulfillment of General Education Requirement:
Quantitative and Deductive Reasoning (QUA).*

*3 points*

**STAT W 4702y Exploratory Data Analysis and
Visualization**

This course covers the following topics: fundamentals of data
visualization, layered grammar of graphics, perception of discrete and
continuous variables, introduction to Mondrian, mosaic plots, parallel
coordinate plots, introduction to ggobi, linked plots, brushing, dynamic
graphics, model visualization, clustering and classification.

*Prerequisites: A course in computer programming.*

*3 points*

## Actuarial Science Courses

**STAT W 4281x Theory of Interest**

Introduction to the mathematical theory of interest as well as the elements
of economic and financial theory of interest. Topics include rates of
interest and discount; simple, compound, real, nominal, effective, dollar
(time)-weighted; present, current, future value; discount function;
annuities; stocks and other instruments; definitions of key terms of modern
financial analysis; yield curves; spot (forward) rates; duration;
immunization; and short sales. The course will cover determining equivalent
measures of interest; discounting; accumulating; determining yield rates; and
amortization.

*Prerequisites: At least one semester of calculus.*

*3 points*

**STAT W 4282x Linear Regression and Time Series Methods**

A one semester course covering: simple and multiple regression, including
testing, estimation, and confidence procedures, modeling, regression
diagnostics and plots, polynomial regression, collinearity and confounding,
model selection, geometry of least squares. Linear time series models.
Auto-regressive, moving average and ARIMA models. Estimation and forecasting
with time series models. Confidence intervals and prediction error. Students
may not receive credit for more than two of *STAT W4315*, *W4437*, and *W4440*. Satisfies the SOA VEE requirements in
regression and in time-series.

*Prerequisites: STAT GU4204 or the equivalent.*

*3 points*