ELEC_ENG 395, 495: Optimization Techniques for Machine Learning and Deep Learning
This course is not currently offered.
Prerequisites
A thorough understanding of linear algebra and vector calculus, and strong familiarity with the Python programming language (e.g., basic data manipulation libraries, how to construct functions and classes). Python will be used for all coding assignments; no other language may be used to complete them.
Description
Fundamental machine learning algorithms are now baked into almost every moment of our digital lives. Machine learning is there when we search the web with Google, when we upload photos to Facebook and our friends are automatically tagged, and when Amazon recommends products to us. Machine learning is also the linchpin of many technologies that lie just over the horizon: self-driving cars, chatbots and fully formed digital assistants, and fully automated factories.
This course offers a holistic, hands-on introduction to the fundamentals of mathematical optimization for machine learning and deep learning. Using a range of real datasets and basic Python libraries for data manipulation, vector/matrix algebra, and automatic differentiation, students will code up, from scratch, fundamental optimization algorithms for popular machine learning / deep learning models, including least squares linear and nonlinear regression, logistic regression, support vector machines, feedforward neural networks, convolutional networks, gradient boosting, K-means clustering, principal component analysis, and recommender systems.
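To give a flavor of the "from scratch" workflow described above, here is a minimal sketch of a gradient descent loop for least squares linear regression, assuming the autograd library for automatic differentiation; the function names (least_squares, gradient_descent) and the step length / iteration settings are illustrative, not taken from the course materials.

```python
import autograd.numpy as np   # autograd's thin wrapper around NumPy
from autograd import grad     # reverse-mode automatic differentiation

# Least squares cost for a linear model y ~ X @ w (illustrative helper)
def least_squares(w, X, y):
    return np.mean((np.dot(X, w) - y) ** 2)

# Plain gradient descent loop, the basic pattern the course builds on
def gradient_descent(cost, w0, X, y, alpha=0.1, iters=100):
    gradient = grad(cost)               # gradient w.r.t. the first argument, w
    w = w0
    for _ in range(iters):
        w = w - alpha * gradient(w, X, y)
    return w

# Toy usage: recover w_true = [1.0, -2.0] from noiseless synthetic data
X = np.random.randn(50, 2)
y = np.dot(X, np.array([1.0, -2.0]))
w_fit = gradient_descent(least_squares, np.zeros(2), X, y)
```

The same loop structure carries over to the other models listed above by swapping in a different cost function.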
- This course fulfills the Technical Elective requirement.
INSTRUCTORS: Dr. Reza Borhani and Dr. Jeremy Watt
COURSE OUTLINE:
1. Machine learning / deep learning overview in the context of mathematical optimization
- a. Supervised learning and unconstrained optimization: deep nets, trees, and kernels
- b. The tools in the war chest: first and second order methods
- c. Deep learning and engineering-focused first order methods
- d. Dimension reduction, clustering, and constrained optimization
- e. Dealing with discontinuities via heuristic methods
2. Overview of fundamental mathematical and computational tools
- a. Review of Linear Algebra, probability and statistics
- b. Calculus of multivariate functions, Taylor series and local function approximation
- c. Optimality conditions for unconstrained / constrained optimization
- d. Numerical methods for computing derivatives, automatic differentiation (see the sketch after this section)
- e. Numerical computation of eigenvalues / singular values
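As an aside on topic d above, the following sketch contrasts a central finite-difference derivative estimate with the exact gradient from automatic differentiation; the helper finite_difference_grad, the test function, and the use of the autograd library are illustrative assumptions.

```python
import autograd.numpy as np
from autograd import grad

def finite_difference_grad(f, w, h=1e-6):
    # Central differences: df/dw_i ~ (f(w + h*e_i) - f(w - h*e_i)) / (2h)
    g = np.zeros(w.size)
    for i in range(w.size):
        e = np.zeros(w.size)
        e[i] = h
        g[i] = (f(w + e) - f(w - e)) / (2.0 * h)
    return g

# Compare the numerical estimate against automatic differentiation
f = lambda w: np.sum(w ** 2) + np.sin(w[0])
w = np.array([0.3, -1.2])
print(finite_difference_grad(f, w))   # finite-difference estimate
print(grad(f)(w))                     # exact gradient via autograd
```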
3. Unconstrained optimization via local search
- a. Local search as a general optimization paradigm
- b. Random local search and the curse of dimensionality (sketched after this section)
- c. Steplength rules and intuition
- d. Formal principles of random local search
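A minimal sketch of the random local search idea from topic b, assuming NumPy; the function name random_local_search and its default parameters are illustrative only. At each iteration the method samples a handful of random unit directions and moves to the best candidate that lowers the cost.

```python
import numpy as np

def random_local_search(cost, w0, alpha=0.5, iters=200, num_dirs=50):
    w = w0
    best = cost(w)
    for _ in range(iters):
        # Sample candidate directions uniformly on the unit sphere
        D = np.random.randn(num_dirs, w.size)
        D = D / np.linalg.norm(D, axis=1, keepdims=True)
        candidates = w + alpha * D                     # one candidate per row
        values = np.array([cost(c) for c in candidates])
        if values.min() < best:                        # move only if we improve
            best = values.min()
            w = candidates[np.argmin(values)]
    return w

# Toy usage: minimize a simple quadratic starting from [3, 4]
w_star = random_local_search(lambda w: np.sum(w ** 2), np.array([3.0, 4.0]))
```

The curse of dimensionality named in topic b shows up here directly: as the dimension of w grows, the chance that a random direction points meaningfully downhill shrinks rapidly, so ever more candidate directions are needed per step.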
4. First order methods: gradient descent and extensions
- a. Gradient descent - normalized and unnormalized versions
- b. Step size selection and convergence
- c. Conservative theoretical guarantees and linesearch
- d. Formal principles of gradient descent
- e. Steepest descent - norm generalization
- f. Batch / Stochastic gradient descent
- g. Variations on normalized stochastic gradient descent (e.g., RMSprop, Adagrad; see the sketch following this section)
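As an illustration of topic g, here is a minimal sketch of an RMSprop-style update, assuming NumPy and a gradient oracle supplied by the caller; the function name rmsprop and its default hyperparameters are illustrative, not prescriptive.

```python
import numpy as np

def rmsprop(gradient, w0, alpha=0.01, gamma=0.9, eps=1e-8, iters=500):
    w = w0
    avg_sq = np.zeros_like(w)        # running average of squared gradients
    for _ in range(iters):
        g = gradient(w)
        avg_sq = gamma * avg_sq + (1 - gamma) * g ** 2
        # Per-coordinate normalized step; eps avoids division by zero
        w = w - alpha * g / (np.sqrt(avg_sq) + eps)
    return w

# Toy usage on a poorly scaled quadratic, where normalization helps
grad_f = lambda w: np.array([200.0 * w[0], 2.0 * w[1]])
w_star = rmsprop(grad_f, np.array([1.0, 1.0]))
```

Dividing component-wise by a running average of squared gradients equalizes the step length across coordinates, which is the common thread among the normalized methods listed above.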
5. Second order methods: Newton and quasi-Newton methods
- a. Newton’s method basics - normalized and unnormalized versions (see the sketch after this section)
- b. Newton’s method, stepsize, and backtracking linesearch
- c. Honest adjustments for non-convex functions
- d. Scaling issues with Newton’s method
- e. Newton and secant method as zero-finding algorithms
- f. Quasi-Newton Methods
- g. BFGS and L-BFGS algorithms
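A minimal sketch of a regularized (damped) Newton step touching topics a, c, and d above, assuming autograd for the gradient and Hessian; the regularization constant eps and the function name newtons_method are illustrative.

```python
import autograd.numpy as np
from autograd import grad, hessian

def newtons_method(cost, w0, iters=20, eps=1e-7):
    g = grad(cost)       # first derivatives via automatic differentiation
    H = hessian(cost)    # second derivative matrix via automatic differentiation
    w = w0
    for _ in range(iters):
        # Solve (H(w) + eps*I) step = g(w); eps guards against singular Hessians
        step = np.linalg.solve(H(w) + eps * np.eye(w.size), g(w))
        w = w - step
    return w

# Toy usage: minimize a smooth non-quadratic function starting from the origin
w_star = newtons_method(lambda w: np.dot(w, w) + np.sin(w[0]), np.zeros(2))
```

Forming and solving with the full Hessian costs far more per step than a gradient method, which is the scaling issue of topic d and the motivation for the quasi-Newton methods of topics f and g.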
6. Methods of constrained optimization
- a. Projected / proximal gradient algorithms (see the sketch after this list)
- b. Barrier and penalty methods
- c. Interior point methods
- d. Primal-dual methods
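Finally, a minimal sketch of projected gradient descent from topic a, assuming NumPy and using Euclidean projection onto the unit ball as the example constraint set; all names are illustrative.

```python
import numpy as np

def project_unit_ball(w):
    # Euclidean projection onto the constraint set {w : ||w||_2 <= 1}
    norm = np.linalg.norm(w)
    return w if norm <= 1.0 else w / norm

def projected_gradient(gradient, w0, alpha=0.1, iters=100):
    w = project_unit_ball(w0)
    for _ in range(iters):
        # Take an unconstrained gradient step, then project back onto the set
        w = project_unit_ball(w - alpha * gradient(w))
    return w

# Toy usage: minimize ||w - c||^2 over the unit ball, with c outside the ball
c = np.array([3.0, 0.0])
w_star = projected_gradient(lambda w: 2.0 * (w - c), np.zeros(2))  # -> approx [1, 0]
```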
COURSE HAND-OUTS: This course has no required textbook. Handouts authored by the instructors will be made freely available to students as lecture notes. In addition, a small number of relevant sections from the instructors' textbook Machine Learning Refined (Cambridge University Press) will be made freely available to students.
PROBLEM SETS: 6–7 problem sets will be assigned and graded.
COURSE PROJECT: There will be one course project worth 25% of the final grade.
COURSE GRADE: Final grades are based on homework (75%) and the course project (25%).