THE LECTURES
- Taught by Feynman Prize winner Professor Yaser Abu-Mostafa.
- The fundamental concepts and techniques are explained in detail. The focus of the lectures is real understanding, not just "knowing."
- Lectures use incremental viewgraphs (2853 in total) to simulate the pace of blackboard teaching.
- The 18 lectures (below) are available on different platforms in the US and abroad.
Here is the playlist on YouTube
- Lecture 1 (The Learning Problem)
  Lecture (some audio drops, sorry!) - Q&A - Slides
- Lecture 2 (Is Learning Feasible?)
  Review - Lecture - Q&A - Slides
- Lecture 3 (The Linear Model I)
  Review - Lecture - Q&A - Slides
- Lecture 4 (Error and Noise)
  Review - Lecture - Q&A - Slides
- Lecture 5 (Training versus Testing)
  Review - Lecture - Q&A - Slides
- Lecture 6 (Theory of Generalization)
  Review - Lecture - Q&A - Slides
- Lecture 7 (The VC Dimension)
  Review - Lecture - Q&A - Slides
- Lecture 8 (Bias-Variance Tradeoff)
  Review - Lecture - Q&A - Slides
- Lecture 9 (The Linear Model II)
  Review - Lecture - Q&A - Slides
- Lecture 10 (Neural Networks)
  Review - Lecture - Q&A - Slides
- Lecture 11 (Overfitting)
  Review - Lecture - Q&A - Slides
- Lecture 12 (Regularization)
  Review - Lecture - Q&A - Slides
- Lecture 13 (Validation)
  Review - Lecture - Q&A - Slides
- Lecture 14 (Support Vector Machines)
  Review - Lecture - Q&A - Slides
- Lecture 15 (Kernel Methods)
  Review - Lecture - Q&A - Slides
- Lecture 16 (Radial Basis Functions)
  Review - Lecture - Q&A - Slides
- Lecture 17 (Three Learning Principles)
  Review - Lecture - Q&A - Slides
- Lecture 18 (Epilogue)
  Review - Lecture - Acknowledgment - Slides
The Learning Problem - Introduction; supervised, unsupervised, and reinforcement learning. Components of the learning problem.
Is Learning Feasible? - Can we generalize from a limited sample to the entire space? Relationship between in-sample and out-of-sample.
The Linear Model I - Linear classification and linear regression. Extending linear models through nonlinear transforms.
Error and Noise - The principled choice of error measures. What happens when the target we want to learn is noisy.
Training versus Testing - The difference between training and testing in mathematical terms. What makes a learning model able to generalize?
Theory of Generalization - How an infinite model can learn from a finite sample. The most important theoretical result in machine learning.
The VC Dimension - A measure of what it takes a model to learn. Relationship to the number of parameters and degrees of freedom.
Bias-Variance Tradeoff - Breaking down the learning performance into competing quantities. The learning curves.
The Linear Model II - More about linear models. Logistic regression, maximum likelihood, and gradient descent.
Neural Networks - A biologically inspired model. The efficient backpropagation learning algorithm. Hidden layers.
Overfitting - Fitting the data too well; fitting the noise. Deterministic noise versus stochastic noise.
Regularization - Putting the brakes on fitting the noise. Hard and soft constraints. Augmented error and weight decay.
Validation - Taking a peek out of sample. Model selection and data contamination. Cross validation.
Support Vector Machines - One of the most successful learning algorithms; getting a complex model at the price of a simple one.
Kernel Methods - Extending SVM to infinite-dimensional spaces using the kernel trick, and to non-separable data using soft margins.
Radial Basis Functions - An important learning model that connects several machine learning models and techniques.
Three Learning Principles - Major pitfalls for machine learning practitioners; Occam's razor, sampling bias, and data snooping.
Epilogue - The map of machine learning. Brief views of Bayesian learning and aggregation methods.