California Institute of Technology


A real Caltech course, not a watered-down version

8 Million Views on YouTube & other servers

  • Free, introductory Machine Learning online course (MOOC)
  • Taught by Caltech Professor Yaser Abu-Mostafa
  • Lectures recorded from a live broadcast, including Q&A
  • Prerequisites: Basic probability, matrices, and calculus
  • 8 homework sets and a final exam
  • Topic-by-topic video library for easy review

Take the course at your own pace

Lectures -- Homework


This is an introductory course in machine learning (ML) that covers the basic theory, algorithms, and applications. ML is a key technology in Big Data, and in many financial, medical, commercial, and scientific applications. It enables computational systems to adaptively improve their performance with experience accumulated from the observed data. ML has become one of the hottest fields of study today, taken up by undergraduate and graduate students from more than 20 different majors at Caltech. This course balances theory and practice, and covers the mathematical as well as the heuristic aspects. The lectures below follow each other in a story-like fashion:

  • What is learning?
  • Can a machine learn?
  • How to do it?
  • How to do it well?
  • Take-home lessons.

The 18 lectures are about 60 minutes each plus Q&A. The content of each lecture is color coded:

theory; mathematical
technique; practical
analysis; conceptual

The Learning Problem - Introduction; supervised, unsupervised, and reinforcement learning. Components of the learning problem.
Is Learning Feasible? - Can we generalize from a limited sample to the entire space? Relationship between in-sample and out-of-sample.
The Linear Model I - Linear classification and linear regression. Extending linear models through nonlinear transforms.
Error and Noise - The principled choice of error measures. What happens when the target we want to learn is noisy.
Training versus Testing - The difference between training and testing in mathematical terms. What makes a learning model able to generalize?
Theory of Generalization - How an infinite model can learn from a finite sample. The most important theoretical result in machine learning.
The VC Dimension - A measure of what it takes a model to learn. Relationship to the number of parameters and degrees of freedom.
Bias-Variance Tradeoff - Breaking down the learning performance into competing quantities. The learning curves.
The Linear Model II - More about linear models. Logistic regression, maximum likelihood, and gradient descent.
Neural Networks - A biologically inspired model. The efficient backpropagation learning algorithm. Hidden layers.
Overfitting - Fitting the data too well; fitting the noise. Deterministic noise versus stochastic noise.
Regularization - Putting the brakes on fitting the noise. Hard and soft constraints. Augmented error and weight decay.
Validation - Taking a peek out of sample. Model selection and data contamination. Cross validation.
Support Vector Machines - One of the most successful learning algorithms; getting a complex model at the price of a simple one.
Kernel Methods - Extending SVM to infinite-dimensional spaces using the kernel trick, and to non-separable data using soft margins.
Radial Basis Functions - An important learning model that connects several machine learning models and techniques.
Three Learning Principles - Major pitfalls for machine learning practitioners; Occam's razor, sampling bias, and data snooping.
Epilogue - The map of machine learning. Brief views of Bayesian learning and aggregation methods.

You can also look for a particular topic within the lectures in the Machine Learning Video Library.

Live Lectures

This course was broadcast live from the lecture hall at Caltech in April and May 2012. There was no 'Take 2' for the recorded videos. The lectures included live Q&A sessions with online audience participation.