THE LECTURES
- Taught by Feynman Prize winner Professor Yaser Abu-Mostafa.
- The fundamental concepts and techniques are explained in detail. The focus of the lectures is real understanding, not just "knowing."
- Lectures use incremental viewgraphs (2853 in total) to simulate the pace of blackboard teaching.
- The 18 lectures (below) are available on different platforms in the US and abroad.
- The full lecture playlist is available on YouTube.
-  Lecture 1 (The Learning Problem) 
 Lecture (some audio drops, sorry!) - Q&A - Slides
-  Lecture 2 (Is Learning Feasible?) 
 Review - Lecture - Q&A - Slides
-  Lecture 3 (The Linear Model I) 
 Review - Lecture - Q&A - Slides
-  Lecture 4 (Error and Noise) 
 Review - Lecture - Q&A - Slides
-  Lecture 5 (Training versus Testing) 
 Review - Lecture - Q&A - Slides
-  Lecture 6 (Theory of Generalization) 
 Review - Lecture - Q&A - Slides
-  Lecture 7 (The VC Dimension) 
 Review - Lecture - Q&A - Slides
-  Lecture 8 (Bias-Variance Tradeoff) 
 Review - Lecture - Q&A - Slides
-  Lecture 9 (The Linear Model II) 
 Review - Lecture - Q&A - Slides
-  Lecture 10 (Neural Networks) 
 Review - Lecture - Q&A - Slides
-  Lecture 11 (Overfitting) 
 Review - Lecture - Q&A - Slides
-  Lecture 12 (Regularization) 
 Review - Lecture - Q&A - Slides
-  Lecture 13 (Validation) 
 Review - Lecture - Q&A - Slides
-  Lecture 14 (Support Vector Machines) 
 Review - Lecture - Q&A - Slides
-  Lecture 15 (Kernel Methods) 
 Review - Lecture - Q&A - Slides
-  Lecture 16 (Radial Basis Functions) 
 Review - Lecture - Q&A - Slides
-  Lecture 17 (Three Learning Principles) 
 Review - Lecture - Q&A - Slides
-  Lecture 18 (Epilogue) 
 Review - Lecture - Acknowledgment - Slides
The Learning Problem - Introduction; supervised, unsupervised, and reinforcement learning. Components of the learning problem.
Is Learning Feasible? - Can we generalize from a limited sample to the entire space? Relationship between in-sample and out-of-sample (the Hoeffding bound is sketched after this list).
The Linear Model I - Linear classification and linear regression. Extending linear models through nonlinear transforms.
Error and Noise - The principled choice of error measures. What happens when the target we want to learn is noisy.
Training versus Testing - The difference between training and testing in mathematical terms. What makes a learning model able to generalize?
Theory of Generalization - How an infinite model can learn from a finite sample. The most important theoretical result in machine learning.
The VC Dimension - A measure of what it takes a model to learn. Relationship to the number of parameters and degrees of freedom.
Bias-Variance Tradeoff - Breaking down the learning performance into competing quantities. The learning curves. (The decomposition is written out after this list.)
The Linear Model II - More about linear models. Logistic regression, maximum likelihood, and gradient descent (a gradient-descent sketch follows this list).
Neural Networks - A biologically inspired model. The efficient backpropagation learning algorithm. Hidden layers. (A backpropagation sketch follows this list.)
Overfitting - Fitting the data too well; fitting the noise. Deterministic noise versus stochastic noise.
Regularization - Putting the brakes on fitting the noise. Hard and soft constraints. Augmented error and weight decay (a weight-decay sketch follows this list).
Validation - Taking a peek out of sample. Model selection and data contamination. Cross validation (a cross-validation sketch follows this list).
Support Vector Machines - One of the most successful learning algorithms; getting a complex model at the price of a simple one.
Kernel Methods - Extending SVM to infinite-dimensional spaces using the kernel trick, and to non-separable data using soft margins (the RBF kernel is written out after this list).
Radial Basis Functions - An important learning model that connects several machine learning models and techniques.
Three Learning Principles - Major pitfalls for machine learning practitioners; Occam's razor, sampling bias, and data snooping.
Epilogue - The map of machine learning. Brief views of Bayesian learning and aggregation methods.
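For Lecture 2, the relationship between in-sample and out-of-sample error can be summarized, for a single fixed hypothesis h and a sample of size N, by Hoeffding's inequality; roughly,

\[
\mathbb{P}\big[\,\lvert E_{\text{in}}(h) - E_{\text{out}}(h)\rvert > \epsilon\,\big] \;\le\; 2\,e^{-2\epsilon^{2} N}.
\]

With a finite hypothesis set of M candidates, a union bound loosens this to 2M e^{-2\epsilon^{2} N}, which is the starting point for the generalization theory of Lectures 5-7.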
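The decomposition behind Lecture 8, stated here for squared error and a noiseless target f, splits the expected out-of-sample error of the learned hypothesis g^(D) over data sets D into two competing terms:

\[
\mathbb{E}_{\mathcal{D}}\big[E_{\text{out}}\big(g^{(\mathcal{D})}\big)\big]
= \underbrace{\mathbb{E}_{\mathbf{x}}\big[(\bar g(\mathbf{x}) - f(\mathbf{x}))^{2}\big]}_{\text{bias}}
\;+\;
\underbrace{\mathbb{E}_{\mathbf{x}}\,\mathbb{E}_{\mathcal{D}}\big[(g^{(\mathcal{D})}(\mathbf{x}) - \bar g(\mathbf{x}))^{2}\big]}_{\text{variance}},
\qquad
\bar g(\mathbf{x}) = \mathbb{E}_{\mathcal{D}}\big[g^{(\mathcal{D})}(\mathbf{x})\big].
\]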
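For Lecture 9, a minimal sketch of logistic regression trained by batch gradient descent on the cross-entropy error; the function name, learning rate, and epoch count below are illustrative assumptions, not prescribed by the course.

```python
import numpy as np

def logistic_regression_gd(X, y, lr=0.1, epochs=1000):
    """Batch gradient descent for logistic regression.

    X : (N, d) inputs, assumed to already include a constant bias column.
    y : (N,) labels in {-1, +1}.
    Minimizes E_in(w) = mean(ln(1 + exp(-y_n w.x_n))) and returns w.
    """
    N, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        s = y * (X @ w)                                            # signals y_n * w.x_n
        grad = -(y[:, None] * X / (1.0 + np.exp(s))[:, None]).mean(axis=0)
        w -= lr * grad                                             # fixed-step descent
    return w
```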
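For Lecture 10, a minimal backpropagation sketch for a one-hidden-layer network with tanh units and squared error; bias terms are omitted for brevity, and the layer size, learning rate, and epoch count are illustrative assumptions.

```python
import numpy as np

def train_mlp(X, y, hidden=10, lr=0.05, epochs=2000, seed=0):
    """One-hidden-layer tanh network trained by backpropagation on MSE.

    X : (N, d) inputs, y : (N,) targets (e.g. labels in {-1, +1}).
    """
    rng = np.random.default_rng(seed)
    N, d = X.shape
    W1 = rng.normal(scale=0.1, size=(d, hidden))
    W2 = rng.normal(scale=0.1, size=(hidden, 1))
    Y = y.reshape(-1, 1)
    for _ in range(epochs):
        # Forward pass
        A1 = np.tanh(X @ W1)                            # hidden activations
        out = np.tanh(A1 @ W2)                          # network output
        # Backward pass: propagate error signals (deltas) layer by layer
        delta2 = 2.0 * (out - Y) * (1.0 - out ** 2)     # dE/d(output signal)
        delta1 = (delta2 @ W2.T) * (1.0 - A1 ** 2)      # dE/d(hidden signal)
        # Gradient-descent updates, averaged over the N examples
        W2 -= lr * (A1.T @ delta2) / N
        W1 -= lr * (X.T @ delta1) / N
    return W1, W2
```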
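For Lecture 12, weight decay for linear regression has a closed form: minimizing the augmented error E_in(w) + (λ/N) wᵀw under squared error gives w_reg = (ZᵀZ + λI)⁻¹Zᵀy. A minimal sketch (the function name and default λ are illustrative):

```python
import numpy as np

def linear_regression_weight_decay(Z, y, lam=0.1):
    """Linear regression with weight-decay (L2) regularization.

    Z : (N, d) matrix of (possibly nonlinearly transformed) inputs.
    y : (N,) targets.
    Returns w_reg = (Z^T Z + lam * I)^{-1} Z^T y.
    """
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)
```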
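For Lecture 13, a generic k-fold cross-validation sketch; train_fn and error_fn stand in for whatever learning algorithm and error measure are being evaluated, and all names and defaults here are illustrative assumptions.

```python
import numpy as np

def k_fold_cv_error(X, y, train_fn, error_fn, k=10, seed=0):
    """Estimate out-of-sample error by k-fold cross validation.

    train_fn(X_train, y_train) -> model
    error_fn(model, X_val, y_val) -> scalar validation error
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[train], y[train])              # train on k-1 folds
        errors.append(error_fn(model, X[val], y[val]))    # validate on the held-out fold
    return float(np.mean(errors))                         # cross-validation estimate
```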
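For Lectures 14-15, one common kernel choice is the Gaussian (RBF) kernel, and the kernelized SVM hypothesis is built from the support vectors (those with α_n > 0); roughly,

\[
K(\mathbf{x}, \mathbf{x}') = \exp\!\big(-\gamma \lVert \mathbf{x} - \mathbf{x}' \rVert^{2}\big),
\qquad
g(\mathbf{x}) = \operatorname{sign}\!\Big(\sum_{\alpha_n > 0} \alpha_n y_n K(\mathbf{x}_n, \mathbf{x}) + b\Big),
\]

where the α_n come from solving the (soft-margin) dual quadratic program.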