IEMS 395-2: Applied Statistical Learning and Decision Making
Prerequisites
A prior course in statistics at the level of IEMS 304; a course in matrix analysis; proficiency in programming, as coding will be a significant part of the class.
Description
This course presents a modern treatment of statistical learning in a variety of decision-making settings. Students explore supervised and unsupervised learning, data preprocessing, model selection, and evaluation strategies. Practical case studies demonstrate how a data-driven approach can inform decisions in areas ranging from business strategy to public policy. The course counts as an IEOR elective for IE majors and as an elective for the MLDS/DSE minors.
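To make the supervised-learning workflow concrete, here is a minimal sketch of the fit-then-evaluate cycle the description refers to: split the data, fit a model on the training portion, and assess it on held-out observations. The data here are synthetic and the choice of scikit-learn is illustrative, not a statement of what the course assigns.

```python
# Minimal supervised-learning sketch: train/test split, model fit,
# and held-out evaluation (synthetic data, illustrative only).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # three synthetic predictors
true_beta = np.array([1.5, -2.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.3, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
mse = mean_squared_error(y_test, model.predict(X_test))
print(f"held-out MSE: {mse:.3f}")                  # near the noise variance, 0.09
```

Evaluating on data the model never saw is what distinguishes an honest estimate of predictive performance from in-sample fit, a distinction that recurs throughout the topics below.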
LEARNING OBJECTIVES
- Achieve proficiency in coding and data analysis, understanding every line of example code from lectures
- Develop a deep understanding of statistical methods (e.g., regression and classification) rather than relying on superficial application
- Learn to interpret quantitative and visual results effectively, ensuring that methods are applied appropriately in various decision-making contexts
- Engage actively with “Discussion Points and Questions” to solidify comprehension of complex topics
TOPICS
- Module 1 (Linear Regression): Introduction to scatterplots, correlation, linear regression formulation and estimation, plus confidence intervals, hypothesis testing, and prediction
- Module 2: Testing for normality (including Q–Q plots), Poisson regression for count data, and an introduction to variable selection
- Module 3 (Penalized Regression): Ridge regression, lasso (with extended discussion), and cross validation for penalty selection
- Module 4 (Missing Data): Importance of handling missing data, the EM Algorithm, and time series visualization through line graphs
- Module 5 (Unit Root Test): Discussion of drift and trend in time series, unit root testing, and an introduction to classification
- Module 6 (Classification): In-depth coverage of classification trees, tree pruning techniques, and random forests
- Module 7 (Unsupervised Learning): Nearest neighbor methods, followed by an introduction to unsupervised learning, clustering, and k-means clustering
- Module 8 (Deep Learning Introduction): An introduction to PyTorch, backpropagation, and techniques such as dropout and adjusting network width and depth
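Module 3's pairing of the lasso with cross validation can be sketched as follows. This is a hypothetical illustration on synthetic data, not course material: `LassoCV` searches a grid of penalty values and keeps the one that minimizes cross-validated error, shrinking irrelevant coefficients toward zero.

```python
# Sketch of lasso penalty selection by cross validation (cf. Module 3):
# 20 predictors, of which only the first three truly matter.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, p = 200, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]                 # the only relevant predictors
y = X @ beta + rng.normal(scale=0.5, size=n)

# 5-fold cross validation over a grid of penalties
model = LassoCV(cv=5).fit(X, y)
print("chosen penalty:", model.alpha_)
print("large coefficients:", np.flatnonzero(np.abs(model.coef_) > 0.5))
```

The relevant coefficients survive the penalty nearly intact, while the seventeen noise coefficients are shrunk close to zero, which is the variable-selection behavior that motivates the lasso over ridge regression.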
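Module 7's k-means clustering admits an equally short sketch. Again the data are synthetic and the example is illustrative: two well-separated groups of points, with k-means asked to recover their centers without seeing any labels.

```python
# Sketch of k-means clustering (cf. Module 7) on two synthetic blobs.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))   # blob near (0, 0)
b = rng.normal(loc=(3.0, 3.0), scale=0.3, size=(50, 2))   # blob near (3, 3)
X = np.vstack([a, b])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("recovered centers:\n", km.cluster_centers_)
```

Because clustering is unsupervised, the two recovered centers come back in arbitrary order; any downstream use has to match them to groups by position, not by label.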
MATERIALS
- Course materials are provided on Canvas (including slides, handouts, homework assignments, datasets, and announcements)
- Required Text: “An Introduction to Statistical Learning with Applications in Python” by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (with a companion version in R)
- Required Software: An interactive Python tutorial available on Kaggle, along with companion tutorials covering topics such as programming basics, machine learning, intermediate techniques, pandas, data visualization, and feature engineering
- Additional References: “The Elements of Statistical Learning” by Hastie, Tibshirani, and Friedman, and “The Matrix Cookbook” by Petersen and Pedersen provide further depth in statistical methods and matrix analysis