4F10: Deep Learning

A collection of topics and materials for 4F10: Deep Learning and Structured Data in 2022.

Key to what’s expected (hopefully…):

🔴 important topic, mathematical detail
🟡 understanding of concepts
🟢 awareness
Lecture 1: “Introduction”
- Nothing of use presented in this lecture
Lecture 2: “Probability of Error & Decision Boundaries”
- 🔴 Much better explained in this very clear tutorial on Linear and Quadratic Discriminant Analysis: arXiv
- 🔴 Bayes decision rule: slides 4-6
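
As a rough illustration of the Bayes decision rule listed above, here is a minimal sketch for one-dimensional Gaussian class-conditionals: pick the class with the largest prior times likelihood. The function names (`gaussian_pdf`, `bayes_decide`) and all the numbers are made up for this example and are not taken from the lecture slides.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def bayes_decide(x, priors, means, variances):
    """Return the index of the class maximising prior * likelihood.

    This is proportional to the posterior, so it is the minimum
    probability-of-error decision for these (assumed) class models.
    """
    scores = [p * gaussian_pdf(x, m, v)
              for p, m, v in zip(priors, means, variances)]
    return max(range(len(scores)), key=scores.__getitem__)

# Illustrative two-class problem (assumed values): equal priors and equal
# variances, so the decision boundary sits midway between the means, at x = 1.
priors = [0.5, 0.5]
means = [0.0, 2.0]
variances = [1.0, 1.0]

labels = [bayes_decide(x, priors, means, variances) for x in (-1.0, 0.9, 1.1, 3.0)]
```

With unequal priors or variances the boundary shifts (and can become quadratic in `x`), which is the LDA versus QDA distinction covered in the tutorial.
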
Lecture 3: “Graphical Models and Conditional Independence”, and Undirected GMs (Markov Networks)
- 🔴 Conditional independences (CIs): slides 4-6
- 🟡 Markov networks: Murphy ch19, section 19.3.1.
Lectures 4, 5 and 6: “Latent Variable and Sequence Models”
- 🟡 Factor Analysis: sklearn
- 🔴 EM for learning GMMs: Murphy section 11.4.2
- HMMs:
  - 🔴 General: see 3F8 notes
  - 🟡 Inference: Discrete Kalman Filter (i.e. the forward algorithm) & Viterbi algorithm: Murphy section 17.4, see also 3F8
  - 🟡 Learning: EM (not really covered)
- 🟢 Conditional Random Fields - model description, motivation, and learning: Murphy section 19.6
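
For the HMM inference item above, a minimal sketch of the Viterbi algorithm (most likely hidden state sequence) may help fix ideas. It works in the log domain and assumes all probabilities are strictly positive. The function name `viterbi`, the integer encoding of states and observations, and the toy two-state model are illustrative assumptions, not material from the course or from Murphy.

```python
import math

def viterbi(obs, start, trans, emit):
    """Most likely state path for an HMM, computed in the log domain.

    obs:   list of observation indices
    start: start[s]    = P(z_1 = s)
    trans: trans[r][s] = P(z_t = s | z_{t-1} = r)
    emit:  emit[s][o]  = P(x_t = o | z_t = s)
    Assumes every probability used is > 0 (log(0) would raise an error).
    """
    n_states = len(start)
    # delta[s]: best log-probability of any path ending in state s
    delta = [math.log(start[s]) + math.log(emit[s][obs[0]])
             for s in range(n_states)]
    backptrs = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: delta[r] + math.log(trans[r][s]))
            ptr.append(best_prev)
            new_delta.append(delta[best_prev] + math.log(trans[best_prev][s])
                             + math.log(emit[s][o]))
        delta = new_delta
        backptrs.append(ptr)
    # Trace back from the best final state
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Toy model (assumed numbers): 2 hidden states, 3 observation symbols.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path = viterbi([0, 1, 2], start, trans, emit)
```

Replacing the `max` over previous states with a sum (log-sum-exp in the log domain) turns the same recursion into the forward algorithm.
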
Lectures 7 and 8: “Deep Learning”
- A very poor intro to DL where everything is just in the wrong order - see instead the relevant chapters of the d2l.ai tutorials:
  - 🔴 Linear network intro: d2l.ai
  - 🔴 MLPs: d2l.ai
    - Network configuration: MLM
  - 🔴 Network training: d2l.ai
    - Batch normalisation: d2l.ai
    - Regularisation: TDS
  - 🔴 CNNs: d2l.ai
    - Pooling: d2l.ai
  - 🔴 ResNets: d2l.ai
    - Highway layers: PWC
  - 🟡 Advanced optimisation algorithms: d2l.ai
- 🔴 Initialisation and Xavier initialisation: deeplearning.ai
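
To make the Xavier initialisation item above concrete, here is a minimal sketch of the uniform variant: weights are drawn from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), which gives a weight variance of 2 / (fan_in + fan_out). The helper name `xavier_uniform`, the layer sizes, and the seed are assumptions chosen for illustration; deep learning frameworks ship their own versions of this.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit).

    limit = sqrt(6 / (fan_in + fan_out)), so Var[w] = limit^2 / 3
          = 2 / (fan_in + fan_out).
    """
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

# Illustrative layer sizes (assumed): 256 inputs, 128 outputs.
rng = random.Random(0)
fan_in, fan_out = 256, 128
W = xavier_uniform(fan_in, fan_out, rng)

# Compare the empirical variance of the weights with the target value.
flat = [w for row in W for w in row]
empirical_var = sum(w * w for w in flat) / len(flat)
target_var = 2.0 / (fan_in + fan_out)
```

The point of the scaling is to keep the variance of activations and gradients roughly constant from layer to layer, which is what the linked article works through.
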
Lectures 9 and 10: “Deep Learning for Sequence Data”
- 🔴 RNNs (slides 4-7, seq to target): Blog and d2l.ai
- 🟢 Elman vs Jordan networks: slide 8
- 🟢 Bi-directional RNNs: d2l.ai
- 🟡 LSTMs and GRUs: Blog
- Seq2seq/encoder-decoder model with attention:
  - 🟡 Overview with application to Neural Machine Translation: Jalammar blog
  - 🟡 Focus on seq2seq model: d2l.ai
  - 🔴 Focus on attention types: d2l.ai
- 🟢 Target to sequence: slides 37-40.
- 🟢 Word2vec: overview blog, more thorough treatment
- 🟢 Transformers: Jalammar blog
- 🟢 Introduction to BERT: Jalammar blog
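
For the attention items above, a minimal sketch of scaled dot-product attention for a single query is given below. It is only one of the attention types discussed in the linked material (additive attention is another). The function names and the toy vectors are assumptions for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Scores are q . k / sqrt(d); the context vector is the
    softmax-weighted sum of the value vectors.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim_v = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(dim_v)]
    return context, weights

# Toy example (assumed values): the query matches the first key best,
# so the first value vector receives the largest weight.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
context, weights = attention(query, keys, values)
```

In a seq2seq model the query would typically come from the decoder state and the keys and values from the encoder outputs; Transformers apply the same operation to many queries at once in matrix form.
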
Lecture 11: “Ensemble Methods”
- 🟡 Dropout: Medium
- 🟡 Bagging: TDS
- 🟢 Model compression: Medium
🔴 Lecture 12: “Support Vector Machines” - ok, this is actually quite good
🔴 Lecture 13: “Support Vector Machines: Advanced Topics” (kernel SVM) - good too
🟢 Lecture 14: “Kernels for Structured Data” - good enough