An embedding converts text into a point on a high-dimensional map. Things with similar meaning land near each other; unrelated things land far apart. That's the whole trick — and it's what makes semantic search possible.
~ related meanings cluster together; unrelated meanings spread apart ~
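To make the map concrete, here's a minimal sketch. It uses the open-source sentence-transformers library purely as a stand-in embedding model (the model name "all-MiniLM-L6-v2" is just an example, not something this guide prescribes): embed three sentences, then compare them with cosine similarity.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # stand-in; any embedding model works

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model, not a recommendation

texts = [
    "what food is good for puppies?",
    "young dogs thrive on a protein-rich diet",
    "how to rotate your car's tires",
]
vecs = model.encode(texts)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vecs[0], vecs[1]))  # high: related meanings land near each other
print(cosine(vecs[0], vecs[2]))  # low: unrelated meanings land far apart
```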
What this mental model unlocks
Why semantic search works
When you embed "what food is good for puppies?", the query lands near "dog" and "kitten" on the map, not because of keyword overlap ("puppies" appears in neither word), but because the meaning is close. The vector store simply returns the nearest points. No keywords required. That's the whole magic.
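A toy retrieval loop, under the same assumptions as above (sentence-transformers as a placeholder model, a three-chunk in-memory "vector store"), shows the mechanics: embed the chunks once up front, embed the query, return the nearest points.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # placeholder embedding model

model = SentenceTransformer("all-MiniLM-L6-v2")

# A tiny in-memory "vector store": every chunk is embedded once, up front.
chunks = [
    "Young dogs thrive on a protein-rich diet in their first year.",
    "Kittens should be fed small meals several times a day.",
    "How to rotate your car's tires at home.",
]
chunk_vecs = model.encode(chunks, normalize_embeddings=True)

def search(query: str, top_k: int = 2):
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q                     # cosine similarity (vectors are unit length)
    best = np.argsort(scores)[::-1][:top_k]     # nearest points first
    return [(chunks[i], float(scores[i])) for i in best]

# The query shares no keywords with the dog chunk ("puppies" and "food" appear
# nowhere in it), yet that chunk should rank at or near the top purely on meaning.
print(search("what food is good for puppies?"))
```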
Why hybrid search exists
Pure vector search is amazing for meaning, but blind to specifics. A query for "error ECONNREFUSED" might surface vaguely similar "connection problem" documents and miss the one that literally contains the error code. BM25 keyword search fills that gap, which is why Pattern 2 (Advanced RAG) runs both in parallel.
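One common way to merge the two result lists is reciprocal rank fusion (RRF). Whether Pattern 2 uses RRF or a different merge strategy, the sketch below shows the idea: each ranked list votes for its documents, and anything that scores well in either list rises to the top.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc ids; k dampens the weight of lower ranks."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits  = ["doc_7", "doc_2", "doc_9"]   # semantic neighbours of the query
keyword_hits = ["doc_4", "doc_7", "doc_1"]   # literal "ECONNREFUSED" matches from BM25
print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
# doc_7 wins: it appears in both lists, so meaning and keywords reinforce each other.
```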
Why you can't mix embedding models
Every embedding model builds its own map. A query embedded by Titan lives in one coordinate system. A chunk embedded by Cohere lives in a different one. Distances between them are meaningless — same numbers, different maps. You must use the same embedding model for queries and chunks. If you change models, you re-embed everything.
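A cheap guard against this failure mode is to record which model built the index and refuse queries embedded with anything else. The metadata layout and model ids below are purely illustrative assumptions, not the API of any particular vector store.

```python
# Hypothetical index metadata: record which model drew the "map".
INDEX_METADATA = {
    "embedding_model": "amazon.titan-embed-text-v2:0",  # example id, assumption
    "dimensions": 1024,
}

def assert_same_model(query_model: str) -> None:
    """Reject queries embedded on a different map than the indexed chunks."""
    indexed = INDEX_METADATA["embedding_model"]
    if query_model != indexed:
        raise ValueError(
            f"Query embedded with {query_model!r}, but the corpus was indexed with "
            f"{indexed!r}. Re-embed the corpus or switch the query model."
        )

assert_same_model("amazon.titan-embed-text-v2:0")  # same map: fine
assert_same_model("cohere.embed-english-v3")       # different map: raises ValueError
```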
Exam angle
Watch for questions about why retrieval quality dropped after a change. If the answer says "we updated the embedding model but didn't re-index the corpus," that's the bug — old chunks live on map A, new queries land on map B. When a stem mentions "embedding mismatch" or "model version drift," that's the concept being tested.
The cosine similarity bit
"Near" on the map isn't always measured as straight-line (Euclidean) distance. Most vector databases default to cosine similarity: the angle between vectors rather than the raw distance between their endpoints. Two points can sit far apart in raw coordinates yet point in the same direction, and that shared direction is what marks them as semantically similar. Most of the time you don't need to care; it's just how the math gets done under the hood.
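Here's the distinction in a few lines of numpy: two vectors pointing the same way have cosine similarity 1.0 even though the straight-line distance between them is large.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = a * 10                      # same direction, ten times farther from the origin

euclidean = np.linalg.norm(a - b)                                # large raw distance
cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))  # 1.0: identical angle

print(euclidean, cosine)        # ~33.67 vs 1.0
```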