4. Mathematical and Data Representation Layer

The numeric objects everything else runs on: the "physics layer." The children form a dimensional ladder (scalar → vector → matrix → tensor), followed by the learned representations built on it (embedding, latent space) and the operations that compare them (similarity metrics). Embeddings place semantically related items close together in a high-dimensional space, which is why semantic search and meaning-as-distance work; similarity metrics are the bridge to RAG, where retrieval ranks stored embeddings against a query.
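
As a rough illustration of the dimensional ladder, the sketch below builds one object per rung from plain Python nested lists and reports its rank (number of axes) and shape. The helper `shape` and the sample values are hypothetical, chosen only for illustration; a real system would typically use an array library instead of nested lists.

```python
def shape(x):
    """Return the shape of a regularly nested, non-empty list; () for a scalar."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]  # assumption: every sub-list at this level has the same length
    return tuple(dims)


scalar = 3.0                                # rank 0: a single number
vector = [1.0, 2.0, 3.0]                    # rank 1: an ordered list of numbers
matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]                  # rank 2: rows x columns
tensor = [matrix, matrix, matrix, matrix]   # rank 3: a stack of matrices

for name, obj in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("tensor", tensor)]:
    s = shape(obj)
    print(f"{name}: rank={len(s)}, shape={s}")
```

Each rung adds one axis: the scalar has shape `()`, the vector `(3,)`, the matrix `(2, 3)`, and the tensor `(4, 2, 3)`.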

Children

  • scalar → vector → matrix → tensor — the dimensional ladder
  • embedding — a learned vector that encodes meaning
  • latent space — the geometry embeddings live in
  • probability distribution — model outputs are distributions
  • similarity metrics — how vectors are compared:
      • cosine similarity
      • dot product
      • Euclidean distance
  • high-dimensional geometry — why distance encodes semantics
  • RAG — semantic search built on embeddings + similarity
  • Foundation Models — embedding models that produce these vectors
  • Model Internals — logits and probabilities at the output
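
The three similarity metrics listed above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a production implementation: the 3-dimensional "embeddings", the names `query` and `docs`, and the helper functions are all hypothetical, and real embeddings usually have hundreds or thousands of dimensions.

```python
import math


def dot(a, b):
    """Dot product: sum of element-wise products (sensitive to vector length)."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores length. Assumes non-zero vectors."""
    return dot(a, b) / (norm(a) * norm(b))


def euclidean_distance(a, b):
    """Straight-line distance between a and b; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy embeddings, for illustration only.
query = [1.0, 0.0, 1.0]
docs = {
    "doc_a": [2.0, 0.0, 2.0],  # same direction as the query, but longer
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "doc_c": [1.0, 1.0, 1.0],  # partly aligned with the query
}

# Rank documents by cosine similarity, highest first (as a retrieval step might).
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]),
                reverse=True)

for name in ranked:
    v = docs[name]
    print(name,
          round(cosine_similarity(query, v), 3),
          round(dot(query, v), 3),
          round(euclidean_distance(query, v), 3))
```

With these toy vectors the metrics disagree: cosine similarity ranks `doc_a` first because it points in the same direction as the query, while Euclidean distance puts `doc_c` nearest because `doc_a` is longer. Cosine similarity and dot product are similarities (higher means closer), whereas Euclidean distance is a distance (lower means closer), so the choice of metric can change which result is retrieved.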