IEMS 395-2: Applied Statistical Learning and Decision Making
Prerequisites
A prior course in statistics at the level of IEMS 304; a course in matrix analysis; proficiency in programming, as coding will be a significant part of the class.
Description
This course presents a modern treatment of statistical learning in a variety of decision-making settings. Students explore supervised and unsupervised learning, data preprocessing, model selection, and evaluation strategies. Practical case studies demonstrate how a data-driven approach can inform decisions in areas ranging from business strategy to public policy. The course counts as an IEOR elective for IE majors and as an elective for the MLDS/DSE minors.
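To make the supervised-learning workflow concrete, here is a minimal sketch of the fit-then-evaluate cycle the description refers to: split the data, fit a model on the training portion, and assess it on held-out observations. The data here are synthetic and the choice of scikit-learn is illustrative, not a statement of what the course assigns.

```python
# Minimal supervised-learning sketch: train/test split, model fit,
# and held-out evaluation (synthetic data, illustrative only).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # three synthetic predictors
true_beta = np.array([1.5, -2.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.3, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LinearRegression().fit(X_train, y_train)
mse = mean_squared_error(y_test, model.predict(X_test))
print(f"held-out MSE: {mse:.3f}")                  # near the noise variance, 0.09
```

Evaluating on data the model never saw is what distinguishes an honest estimate of predictive performance from in-sample fit, a distinction that recurs throughout the topics below.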
LEARNING OBJECTIVES
- Achieve proficiency in coding and data analysis, understanding every line of example code from lectures
- Develop a deep understanding of statistical methods (e.g., regression and classification) rather than relying on superficial application
- Learn to interpret quantitative and visual results effectively, ensuring that methods are applied appropriately in various decision-making contexts
- Engage actively with “Discussion Points and Questions” to solidify comprehension of complex topics
TOPICS
- Module 1 (Linear Regression): Introduction to scatterplots, correlation, linear regression formulation and estimation, plus confidence intervals, hypothesis testing, and prediction
- Module 2: Testing for normality (including Q–Q plots), Poisson regression for count data, and an introduction to variable selection
- Module 3 (Penalized Regression): Ridge regression, lasso (with extended discussion), and cross validation for penalty selection
- Module 4 (Missing Data): Importance of handling missing data, the EM Algorithm, and time series visualization through line graphs
- Module 5 (Unit Root Test): Discussion of drift and trend in time series, unit root testing, and an introduction to classification
- Module 6 (Classification): In-depth coverage of classification trees, tree pruning techniques, and random forests
- Module 7 (Unsupervised Learning): Nearest neighbor methods, followed by an introduction to unsupervised learning, clustering, and k-means clustering
- Module 8 (Deep Learning Introduction): An introduction to PyTorch, backpropagation, and techniques such as dropout and adjusting network width and depth
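Module 3's pairing of the lasso with cross validation can be sketched as follows. This is a hypothetical illustration on synthetic data, not course material: `LassoCV` searches a grid of penalty values and keeps the one that minimizes cross-validated error, shrinking irrelevant coefficients toward zero.

```python
# Sketch of lasso penalty selection by cross validation (cf. Module 3):
# 20 predictors, of which only the first three truly matter.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, p = 200, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]                 # the only relevant predictors
y = X @ beta + rng.normal(scale=0.5, size=n)

# 5-fold cross validation over a grid of penalties
model = LassoCV(cv=5).fit(X, y)
print("chosen penalty:", model.alpha_)
print("large coefficients:", np.flatnonzero(np.abs(model.coef_) > 0.5))
```

The relevant coefficients survive the penalty nearly intact, while the seventeen noise coefficients are shrunk close to zero, which is the variable-selection behavior that motivates the lasso over ridge regression.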
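Module 7's k-means clustering admits an equally short sketch. Again the data are synthetic and the example is illustrative: two well-separated groups of points, with k-means asked to recover their centers without seeing any labels.

```python
# Sketch of k-means clustering (cf. Module 7) on two synthetic blobs.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))   # blob near (0, 0)
b = rng.normal(loc=(3.0, 3.0), scale=0.3, size=(50, 2))   # blob near (3, 3)
X = np.vstack([a, b])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("recovered centers:\n", km.cluster_centers_)
```

Because clustering is unsupervised, the two recovered centers come back in arbitrary order; any downstream use has to match them to groups by position, not by label.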
MATERIALS
- Course materials are provided on Canvas (including slides, handouts, homework assignments, datasets, and announcements)
- Required Text: “An Introduction to Statistical Learning with Applications in Python” by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (with a companion version in R)
- Required Software: An interactive Python tutorial available on Kaggle, along with companion tutorials covering topics such as programming basics, machine learning, intermediate techniques, pandas, data visualization, and feature engineering
- Additional References: “The Elements of Statistical Learning” by Hastie, Tibshirani, and Friedman, and “The Matrix Cookbook” by Petersen and Pedersen provide further depth in statistical methods and matrix analysis