4F10: Deep Learning
2022, Jun 01
A collection of topics and materials for 4F10: Deep Learning and Structured Data in 2022.
Key to what’s expected (hopefully…):
- 🔴 important topic, mathematical detail
- 🟡 understanding of concepts
- 🟢 awareness
- Lecture 1: “Introduction”
- Nothing of use presented in this lecture
- Lecture 2: “Probability of Error & Decision Boundaries”
- 🔴 Much better explained in this very clear tutorial on Linear and Quadratic Discriminant Analysis: arXiv
- 🔴 Bayes decision rule: slides 4-6
- Lecture 3: “Graphical Models and Conditional Independence”, and Undirected GMs (Markov Networks)
- 🔴 Conditional independences (CIs): slides 4-6
- 🟡 Markov networks: Murphy ch19, section 19.3.1.
- Lectures 4, 5, 6: “Latent Variable and Sequence Models”
- 🟡 Factor Analysis: sklearn
- 🔴 EM for learning GMM - Murphy 11.4.2
- HMMs:
- 🔴 General: see 3F8 notes
- 🟡 Inference: Discrete Kalman Filter & Viterbi algorithm: Murphy section 17.4, see also 3F8
- 🟡 Learning: EM (not really covered)
- 🟢 Conditional Random Fields - model description, motivation, and learning: Murphy section 19.6
- Lectures 7, 8: “Deep Learning”
- A very poor intro to DL where everything is just in the wrong order - see instead chapters in the tutorials on d2l.ai:
- 🔴 Initialisation and Xavier initialisation: deeplearning.ai
- Lectures 9, 10: “Deep Learning for Sequence Data”
- 🔴 RNNs (slides 4-7, seq to target): Blog and d2l.ai
- 🟢 Elman vs Jordan networks: slide 8
- 🟢 Bi-directional RNNs: d2l.ai
- 🟡 LSTMs and GRUs: Blog
- Seq2seq/encoder-decoder model with attention:
- 🟡 Overview with application to Neural Machine Translation: Jalammar blog
- 🟡 Focus on seq2seq model: d2l.ai
- 🔴 Focus on attention types: d2l.ai
- 🟢 Target to sequence: slides 37-40.
- 🟢 Word2vec: Overview blog, More thorough treatment
- 🟢 Transformers: Jalammar blog
- 🟢 Introduction to BERT: Jalammar blog
- Lecture 11: “Ensemble Methods”
- 🔴 Lecture 12: “Support Vector Machines” - ok, this is actually quite good
- 🔴 Lecture 13: “Support Vector Machines: Advanced Topics” (kernel SVM) - good too
- 🟢 Lecture 14: “Kernels for Structured Data” - good enough