
A public review for NUSMods.

PC5252, taken in AY2023/2024 Sem 1, taught by Prof. Alvin Chua.

An interesting supplementary course for those somewhat familiar with machine learning who want to know more about the theoretical basis behind certain algorithms and optimizations. Course content is similar to treatments in standard texts that cover the same Bayesian statistics + machine learning focus.

Content is split into three parts, with broad topics:

  • Bayesian statistics introduction (priors, single parameter, multiparameter, hierarchical models, approximations) and model validation (e.g. information criterion, cross validation, Bayes' factor)
  • Sampling algorithms (Monte Carlo, Metropolis-Hastings, Gibbs, Hamiltonian) and sampler performance (mixing/PSRF, efficiency/ESS); see the sketch after this list
  • Machine learning introduction (classification, regression, kernel methods, neural networks) and Bayesian-motivated objective functions
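
As a taste of the sampling part, here is a minimal sketch of a random-walk Metropolis-Hastings sampler in the numpy/scipy stack the course assumes. The target (a standard normal log-density), step size, and chain length are my own illustrative choices, not course material:

```python
import numpy as np
from scipy import stats

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings: propose x' ~ N(x, step^2),
    accept with probability min(1, p(x') / p(x))."""
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    x, logp = x0, log_target(x0)
    n_accept = 0
    for i in range(n_samples):
        x_prop = x + step * rng.standard_normal()
        logp_prop = log_target(x_prop)
        # Work in log space so tiny densities don't underflow.
        if np.log(rng.uniform()) < logp_prop - logp:
            x, logp = x_prop, logp_prop
            n_accept += 1
        samples[i] = x
    return samples, n_accept / n_samples

# Illustrative target: the (unnormalized) standard normal log-density.
samples, acc = metropolis_hastings(stats.norm.logpdf, x0=0.0, n_samples=10_000)
print(f"mean={samples.mean():.3f}  std={samples.std():.3f}  accept={acc:.2f}")
```

The acceptance rate printed at the end is the quantity you tune the step size against: too-small steps accept almost everything but mix slowly, too-large steps get rejected constantly.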

The course itself is not particularly difficult to get through, but I had to attend the 2+1 hours of lectures consistently in order to not fall behind, since many important derivations are covered during the lectures. Not too many preliminaries are required, but a couple will help: (1) familiarity with probability and the various common distributions, (2) working knowledge of Python + Jupyter, numpy for array arithmetic, scipy.stats for sampling, and sklearn for machine learning. I personally did not use Mathematica, nor sympy for symbolic arithmetic.
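
To calibrate what that stack knowledge means in practice, something at the level of the following snippet should suffice; the toy data and logistic-regression fit are my own illustration, not course material:

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# numpy: array arithmetic on a toy design matrix.
X = rng.normal(size=(200, 2))

# scipy.stats: draw noise from an explicit distribution object.
noise = stats.norm(loc=0.0, scale=0.5).rvs(size=200, random_state=rng)

# Labels from a noisy linear rule (hypothetical weights).
y = (X @ np.array([1.5, -2.0]) + noise > 0).astype(int)

# sklearn: fit and score a basic classifier.
clf = LogisticRegression().fit(X, y)
print(f"training accuracy: {clf.score(X, y):.2f}")
```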

Not very math-heavy, except perhaps during the section on probability distributions; this is coming from someone with an engineering bachelor's :) Things clicked for me when I realized that normalization constants are already known for distributions with standard forms, so one just throws $\propto$ everywhere.
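
A concrete instance of that trick, using the standard Beta-Binomial conjugate pair (my example, not lifted from the notes): with a $\mathrm{Beta}(\alpha,\beta)$ prior on $\theta$ and $k$ successes in $n$ Bernoulli trials,

$$
p(\theta \mid k) \propto p(k \mid \theta)\,p(\theta) \propto \theta^{k}(1-\theta)^{n-k}\,\theta^{\alpha-1}(1-\theta)^{\beta-1} = \theta^{k+\alpha-1}(1-\theta)^{n-k+\beta-1},
$$

which is recognizably the kernel of $\mathrm{Beta}(k+\alpha,\,n-k+\beta)$, so the normalization constant is read off rather than integrated for.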

The lecture notes are very much self-contained, so no textbooks are required. I feel it would be enlightening to follow the classic "Information Theory, Inference and Learning Algorithms" by MacKay for additional explanation, and the narrative "The Theory That Would Not Die" by McGrayne for the history and motivation behind Bayesian statistics.

My opinion of the course is pretty positive: the content covered was broad and detailed, yet digestible. The lectures have great pacing, and the lecturer is very open to entertaining questions after lectures and during office hours. Recommended if you'd like to gain a deeper appreciation of the conceptual links between Bayesian statistics and machine learning techniques; knowledge of Bayesian methods also helped me understand physics papers that use Bayesian likelihoods for parameter estimation and optimization. Not recommended if you're mainly looking to learn bleeding-edge machine learning techniques, which this course is absolutely not about.

Your mileage may vary on the four problem sets, each spaced roughly three weeks apart: the lecturer says at most 10 hours, but I had to sink in considerably more time because my fundamentals aren't great. The graduate students running the tutorials aren't particularly well-spoken (I usually skipped the tutorials, sorry), but they do put in effort to prepare, are open to questions, and are fairly liberal with hints.
