3. Deep Learning

Machine learning with multi-layer neural networks, the foundation of most modern AI systems. Its children fall into two groups: training mechanics (layers, activations, gradients, backpropagation, loss functions, optimizers) and architecture families (CNN, RNN, Transformer, diffusion, autoencoders, GANs). The mechanics are shared across architectures; the architectures are design families suited to different kinds of data or tasks: CNN→spatial, RNN→sequential, Transformer→relational, diffusion→generative.
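To make the mechanics concrete, here is a minimal sketch in plain Python of how layers, an activation, a loss function, backpropagation, and an optimizer fit together in one training loop. Everything specific in it is an assumption made for illustration: the XOR toy dataset, the hidden size `H`, the tanh activation, mean squared error, the learning rate, and the helper names (`forward`, `loss`, `gradients`, `gd_step`). Real projects would normally use a framework with automatic differentiation rather than hand-written gradients.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs start from the same weights

# Toy XOR dataset, chosen only for illustration.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0.0, 1.0, 1.0, 0.0]
H = 4  # number of hidden units (arbitrary choice)

# Layers: a hidden layer (W1, b1) and a linear output layer (W2, b2).
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = [0.0]  # one-element list so it can be updated in place


def forward(x):
    """Return the hidden activations and the prediction for one input."""
    # Activation: tanh applied to each hidden unit's weighted sum.
    h = [math.tanh(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(H)]
    y_hat = sum(W2[j] * h[j] for j in range(H)) + b2[0]
    return h, y_hat


def loss():
    """Loss function: mean squared error over the whole dataset."""
    return sum((forward(x)[1] - y) ** 2 for x, y in zip(X, Y)) / len(X)


def gradients():
    """Backpropagation: chain rule from the loss back to every parameter."""
    gW1 = [[0.0, 0.0] for _ in range(H)]
    gb1 = [0.0] * H
    gW2 = [0.0] * H
    gb2 = 0.0
    for x, y in zip(X, Y):
        h, y_hat = forward(x)
        d_out = 2.0 * (y_hat - y) / len(X)  # dLoss/dy_hat
        gb2 += d_out
        for j in range(H):
            gW2[j] += d_out * h[j]
            # tanh'(z) = 1 - tanh(z)^2, and h[j] already holds tanh(z)
            d_pre = d_out * W2[j] * (1.0 - h[j] ** 2)
            gW1[j][0] += d_pre * x[0]
            gW1[j][1] += d_pre * x[1]
            gb1[j] += d_pre
    return gW1, gb1, gW2, gb2


def gd_step(lr):
    """Optimizer: full-batch gradient descent, parameter -= lr * gradient."""
    gW1, gb1, gW2, gb2 = gradients()
    for j in range(H):
        W1[j][0] -= lr * gW1[j][0]
        W1[j][1] -= lr * gW1[j][1]
        b1[j] -= lr * gb1[j]
        W2[j] -= lr * gW2[j]
    b2[0] -= lr * gb2


initial_loss = loss()
for _ in range(5000):
    gd_step(lr=0.05)
final_loss = loss()
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

The same five pieces appear in every architecture listed on this page; only the body of `forward` (and therefore of `gradients`) changes from one family to another.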

Children

neural networks — the general function approximator
layers · activations · gradients · backpropagation · loss functions · optimizers — the universal training mechanics
CNN — convolutional, for spatial/grid data
RNN / LSTM / GRU — recurrent, for sequences
Transformer — attention-based; see Transformer Architecture
diffusion models — iterative denoising generators
autoencoders — learned compression / representation
GANs — generator-vs-discriminator adversarial training
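The difference between the first three architecture families can be sketched by how each one mixes information across positions of the same input. The toy functions below are deliberately simplified assumptions, not reference implementations: they work on a list of scalars, the RNN uses two hypothetical scalar weights `w_in` and `w_rec`, and the attention function uses identity query/key/value projections with no scaling, whereas real Transformers learn those projections over vectors.

```python
import math


def conv1d(xs, kernel):
    """CNN-style mixing: each output sees only a local window, and the
    same kernel weights are reused at every position."""
    k = len(kernel)
    return [
        sum(kernel[i] * xs[t + i] for i in range(k))
        for t in range(len(xs) - k + 1)
    ]


def rnn(xs, w_in, w_rec):
    """RNN-style mixing: a hidden state is carried forward step by step,
    so each output depends on the current input and everything before it."""
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def self_attention(xs):
    """Attention-style mixing: each position scores itself against every
    position, then takes a softmax-weighted average of all values."""
    out = []
    for q in xs:
        scores = [q * k for k in xs]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w / total * v for w, v in zip(weights, xs)))
    return out


signal = [1.0, 2.0, 3.0, 4.0]
print(conv1d(signal, [1.0, 0.0, -1.0]))  # local windows of width 3
print(rnn(signal, 0.5, 0.5))             # left-to-right running state
print(self_attention(signal))            # every position sees all positions
```

Note that `conv1d` produces fewer outputs than inputs because no padding is applied, while `rnn` and `self_attention` return one output per input position.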

Related

Transformer Architecture — the architecture behind modern LLMs
Math & Data Representation — the tensors and embeddings these networks operate on
Model Internals — what a trained network looks like at rest and at run time
