
Reading Group

I help organize a reading group on Deep Learning Theory.

Archived Schedule (we have graduated to a spreadsheet)

| Date | Paper | Resources |
|------------|-------|-----------|
| 06/07/2022 | Principles of Deep Learning Theory, Chapters 1 & 2 | Chapter 1, Chapter 2 |
| 20/07/2022 | PoDLT Chapter 3 | Chapter 3, Chapter 4 |
| 27/07/2022 | We decided to change our approach and read papers instead | |
| 03/08/2022 | Neural Tangent Kernel (https://rajatvd.github.io/NTK/) | Notes |
| 17/08/2022 | Multilayer Feedforward Networks are Universal Approximators | Notes |
| 31/08/2022 | Explaining Neural Scaling Laws | |
| 22/09/2022 | Git Re-Basin | |
| 05/10/2022 | Monte Carlo Gradient Estimation in Machine Learning, Section 4 | Notes |
| 12/10/2022 | Monte Carlo Gradient Estimation in Machine Learning, Section 5 | Notes |
| 19/10/2022 | Monte Carlo Gradient Estimation in Machine Learning, Section 7… | Notes |
| 26/10/2022 | Toy Models of Superposition, Sections 1, 2 & 3 | Notes |
| 16/11/2022 | Toy Models of Superposition, Sections 4, 5 & 6 | |
| 23/11/2022 | Toy Models of Superposition, Sections 7, 8, 9 & 10 | |
| 30/11/2022 | Gradient Estimation with Discrete Stein Operators | |
| 07/12/2022 | Exact learning dynamics of deep linear networks with prior knowledge | |
| 04/01/2023 | Neural networks and physical systems with emergent collective computational abilities (Hopfield Networks) | Notes |
| 11/01/2023 | CSC2541 Winter 2022 Topics in Machine Learning: Neural Net Training Dynamics, Lecture 1 | |
| 18/01/2023 | CSC2541 Winter 2022 Topics in Machine Learning: Neural Net Training Dynamics, Lecture 2 | |
| 11/01/2023 | CSC2541 Winter 2022 Topics in Machine Learning: Neural Net Training Dynamics, Lecture 3 | |
| 25/01/2023 | CSC2541 Winter 2022 Topics in Machine Learning: Neural Net Training Dynamics, Lecture 4 | |
| 01/02/2023 | CSC2541 Winter 2022 Topics in Machine Learning: Neural Net Training Dynamics, Lecture 5 | |
| 15/03/2023 | Understanding the Diffusion Objective as a Weighted Integral of ELBOs | |
| 22/03/2023 | Laplace Redux – Effortless Bayesian Deep Learning | |
| 29/03/2023 | A Theory on Adam Instability in Large-Scale Machine Learning | |
| 12/04/2023 | Sigma-Reparam: Stable Transformer Training with Spectral Reparametrization | |
| 24/05/2023 | Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning | |
| 31/05/2023 | Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent | |
| 07/06/2023 | Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training | |
| 14/06/2023 | QLoRA: Efficient Finetuning of Quantized LLMs | |
| 09/08/2023 | Limitations of the Empirical Fisher Approximation for Natural Gradient Descent | |
| 23/08/2023 | The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning | |
| 20/09/2023 | Transformers as Support Vector Machines | |
| 27/09/2023 | Flat Minima | |
| 04/10/2023 | Language Modeling Is Compression | |
| 11/10/2023 | Efficient Streaming Language Models with Attention Sinks | |
| 25/10/2023 | SGPT: GPT Sentence Embeddings for Semantic Search | |
| 01/11/2023 | Adam through a Second-Order Lens | |
| 08/11/2023 | Maximum a Posteriori Policy Optimisation | |
| 22/11/2023 | Simplifying Transformer Blocks | |

© 2024 thomas. Last updated: 12 Sep 2024