Reading Guideline
The purpose of this guideline is to help students focus on the content of each topic that is most important for this course.
- [Bishop 2006] Pattern Recognition and Machine Learning.
- Available online
- [MacKay 2003] Information Theory, Inference, and Learning Algorithms.
- Available online
- [Murphy 2012] Machine Learning: A Probabilistic Perspective.
- Selected chapters available on Canvas
- [Murphy 2022] Probabilistic Machine Learning: An Introduction.
- Available online
- [Murphy 2023] Probabilistic Machine Learning: Advanced Topics.
- Available online
Lecture 01: Introduction
Reading assignments:
- [Murphy 2012] Chap 01
Comments:
- This reading gives a quick review of basic machine learning.
- While reading this chapter, pay attention to the following terms if you are not already familiar with them:
- generalization and generalization error
- MAP estimate
- latent factors
- parametric models vs. non-parametric models
- memory-based/instance-based learning
- curse of dimensionality
- inductive bias
- overfitting
- the no free lunch theorem
Lecture 02: Generative Modeling
Reading assignments:
- [Murphy 2012] Chap 03
Comments:
- It’s okay to skip some of the technical details in sections 3.3 and 3.4, as the basic idea of these two sections is to provide concrete examples of the content explained in section 3.2.
- That said, it is definitely beneficial to work out one of the examples (either section 3.3 or 3.4) by hand; a small sketch is included after this list. It will help you better understand
- concepts such as the posterior predictive distribution, which is the foundation of Bayesian inference
- the advantage of using conjugate priors
- For section 3.5, focusing on 3.5.1 and 3.5.2 is sufficient for this course
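As a minimal sketch of the kind of example meant above (assuming the Beta-Bernoulli model, which is close to what section 3.3 covers; the exact example in the book may differ), the following Python snippet works out the posterior and the posterior predictive distribution in a few lines:

```python
# Beta-Bernoulli sketch: prior Beta(a, b), data = N1 ones and N0 zeros.
# Conjugacy means the posterior is again a Beta, just with updated counts.

a, b = 2.0, 2.0          # prior pseudo-counts (illustrative choice)
N1, N0 = 3, 7            # observed data: 3 ones, 7 zeros

# Posterior: Beta(a + N1, b + N0) -- no integration needed, thanks to conjugacy.
a_post, b_post = a + N1, b + N0

# Posterior predictive P(x_new = 1 | D) = E[theta | D] = a_post / (a_post + b_post).
p_pred = a_post / (a_post + b_post)

# Compare with the plug-in MLE N1 / (N1 + N0), which ignores the prior.
p_mle = N1 / (N1 + N0)

print(f"posterior: Beta({a_post}, {b_post})")
print(f"posterior predictive P(x=1|D) = {p_pred:.3f}, MLE plug-in = {p_mle:.3f}")
```

The point of the comparison is that the posterior predictive averages over the posterior of the parameter instead of committing to a single point estimate, which is exactly the distinction the chapter builds on.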
Lecture 03: Bayesian Statistics
Reading assignments:
- [Murphy 2012] Chap 05
It’s sufficient to focus only on the following subsections (and their subsubsections):
- Section 5.1
- Section 5.2.1, 5.2.3
- Section 5.3.1
- Section 5.4.1
- Section 5.5
Optional:
- Grant et al., 2018 on meta learning is a good example of hierarchical Bayes.
Lecture 04: Probabilistic Graphical Models
Reading assignments:
- [Murphy 2012] Sec 10.1 – 10.4
It’s okay to skip the following sections
- section 10.2.4
- section 10.2.5
Additional comments: if you are interested in this topic, there are some additional reading materials
- [Bishop 2006] Chapter 08 Graphical Models: it provides a great introduction to probabilistic graphical models
- More relevant materials can be found in another class, Statistical Learning and Graphical Models
- Similarly, graphical models are taught as a standalone class at other universities, for example CS 228 Probabilistic Graphical Models at Stanford and 10-708 Probabilistic Graphical Models at CMU
Lecture 05: Probabilistic Graphical Models II
Reading assignment:
- [Murphy 2012] Sec 19.1 – 19.4
Additional comments:
- It’s okay to skip section 19.4.4
- Section 19.2 may require some understanding of section 10.5, which is not in our reading assignment, so feel free to skim that section if you like. We will also talk about the related concepts in the class lecture.
Lecture 06: Information Theory Basics
Reading assignment:
- [Murphy 2022] Chapter 06
Additional comments:
- It’s okay to skip sections 6.1.6, 6.3.9, and 6.3.10
- Optional:
- If you are interested in information theory itself (in addition to preparing for this and the following lectures), you may also want to check out chapters 01 and 02 of [MacKay 2003]
- An example course on machine learning and information theory
Lecture 07: Variational Inference
Reading assignment:
- [Murphy 2012] Sec 21.1 – 21.3, 21.5
Comments:
- In addition to the assigned readings, [Bishop 2006] Sec 10.1 also gives an excellent explanation of the basic idea behind variational inference.
- [Murphy 2012] section 21.2 is mostly based on our discussion in lecture 06.
- The Ising model used in [Murphy 2012] section 21.3 was discussed in lecture 05, as an example of undirected graphical models.
- [Murphy 2012] section 21.5 provides two examples of variational Bayes; feel free to pick one of them for your reading.
- In addition, Blei et al., 2016 offers another introduction to variational inference, which largely overlaps with our discussion in this lecture.
Lecture 08: Variational Inference II
For the reading assignments, we will use [Murphy 2023] and focus on the following subsections
- Sec 10.2.5 – 10.2.6
- Sec 10.3.1, 10.3.6
Comments:
- [Murphy 2023] basically uses the same notation as [Murphy 2012], which saves us some time on interpreting the notation.
- To better understand sections 10.2.5 and 10.2.6, you may need to quickly review the first four subsections 10.2.1 – 10.2.4
- The subsections in sec 10.3 can be read independently. Some of them may require a good understanding of other topics, such as REINFORCE in section 10.3.2. The subsections not listed above will not be the focus of our discussion, but feel free to read more if you have time.
Lecture 09: Monte Carlo Inference
Reading assignment:
- [Murphy 2023] Sec 11.1 – 11.5
Comments:
- It’s fine to skip the following subsections
- section 11.4.3 – 11.4.4
- section 11.5.2 – 11.5.4
- This chapter focuses on the basic ideas of Monte Carlo sampling and provides useful algorithms, particularly for sampling in low-dimensional spaces (see the short sketch after this list). The challenges and algorithms of sampling in high-dimensional spaces will be discussed in the next lecture.
- Section 11.6 talks about some practical issues with Monte Carlo sampling. Feel free to read it if you have some time.
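To make the basic idea mentioned above concrete, here is a minimal Monte Carlo integration sketch (a toy example of my own choosing, not taken from the textbook): it estimates an expectation by averaging a function over samples drawn from the distribution.

```python
import random

# Simple Monte Carlo sketch: estimate E[X^2] for X ~ N(0, 1) (exact value = 1)
# by averaging f(x) = x^2 over samples drawn from the distribution.
random.seed(0)

n_samples = 100_000
samples = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
estimate = sum(x * x for x in samples) / n_samples

print(f"Monte Carlo estimate of E[X^2]: {estimate:.4f} (exact value: 1.0)")

# The error shrinks like O(1/sqrt(n_samples)); the hard part in practice is
# drawing the samples themselves when the target distribution is complex or
# high-dimensional, which is what the next lecture (MCMC) addresses.
```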
Lecture 10: Monte Carlo Inference II
Reading assignment:
- [Murphy 2023] Sec 12.1 – 12.3, 12.6
Comments:
- This chapter is dense in many ways, so let’s focus on the basic parts. It’s okay to skip the following subsections
- 12.2.3
- 12.3.5
- 12.3.6
- 12.6.3
- 12.6.5
- As mentioned in the textbook, this website provides some interesting demos of MCMC methods. Check it out; it may give you some intuition about how these methods work.
Lecture 11: Variational Autoencoders
Reading assignment:
- [Murphy 2023] Sec 21.1, 21.2, 21.3.6, 21.4
Comments:
- VAEs are one of those fruitful research topics with many, many model variants. Despite the huge number of variants, I think the first two papers on this topic are still worth reading:
- Kingma and Welling. Auto-Encoding Variational Bayes. 2014
- Rezende et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models. 2014
Lecture 12: Generative Adversarial Networks
Reading assignment:
- [Murphy 2023] Section 26.1 - 26.5
Comments:
- It’s okay to skip the following subsections
- Sec 26.2.3 - 26.2.6
- Similar to the suggestion for the previous topic, it is always a good idea to check out the original GAN paper
- Goodfellow et al. Generative Adversarial Networks. 2014
Lecture 13: Diffusion Models
Reading assignment:
- [Murphy 2023] Chapter 25
Having already discussed many of the relevant components, let’s read the whole chapter this time. Since diffusion models are a rapidly evolving research field, for our purposes, getting the main ideas of the method is probably more important than mastering the details. Unless, of course, this is your research topic :)
Lecture 14: Beyond the IID Assumption
Reading assignment:
- [Murphy 2023] Section 19.1 – 19.4
Comments:
- Learning beyond the IID assumption has always been a challenging and interesting topic. For students who want to learn more about this topic, I recommend the classic monograph Dataset Shift in Machine Learning, published in 2009.
Lecture 15: No Class
Lecture 16: Beyond the IID Assumption II
- TODO