IEMS 304: Statistical Learning for Data Analysis
Prerequisites
IEMS 303 or equivalent; CS 150 or equivalent
Description
Predictive modeling of data using modern regression and classification methods. Multiple linear regression; logistic regression; pitfalls and diagnostics; nonparametric and nonlinear regression and classification such as trees, nearest neighbors, neural networks, and ensemble methods.
- This course counts as an IE/OR elective for Industrial Engineering.
LEARNING OBJECTIVES
- Understand common data structures in modern predictive and explanatory modeling problems in business, engineering, and the sciences, and how to formulate the most appropriate solutions
- Learn R statistical software basics and how to use it for regression and classification problems
- Develop ability to fit appropriate linear and logistic regression models, including model selection and diagnostics
- Develop ability to interpret fitted linear and logistic regression models for explanatory and predictive purposes
- Learn fundamental concepts in nonlinear regression and classification, including maximum likelihood estimation, cross-validation, and ridge and lasso shrinkage
- Learn how to fit and interpret popular supervised learning models including trees, smoothers, nearest neighbors, random forests, and boosted trees
TOPICS
- Multiple linear regression basics: model fitting, statistical inference, prediction
- Multiple linear regression: influence, residual diagnostics, multicollinearity, interactions, categorical predictors, variable selection, model evaluation criteria, ridge and lasso regression
- Logistic regression: model fitting and interpretation, statistical inference, diagnostics
- Nonlinear regression basics: maximum likelihood estimation, nonlinear least squares, cross-validation, bootstrapping
- Classification and regression trees
- Nearest neighbors for classification and regression
- Boosted trees and random forests
- R statistical software throughout the course
MATERIALS
Required: An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, Springer. ISBN 978-1-4614-7138-7. Electronic version available free.
Recommended: IEMS 304 Reference Guide