Pattern 2 of 10 · When basic RAG retrieval falls short
Advanced RAG · Hybrid Search + Reranking
When Pattern 1 isn't retrieving the right chunks, you level up: run keyword and vector search in parallel, fuse their scores, then rerank the top candidates. Each layer fixes a different retrieval failure mode.
Architecture diagram
[Diagram: Advanced RAG · parallel retrieval + reranking]
How data flows
The query fans out to two parallel retrieval systems. The top path runs BM25 keyword matching, which is strong on exact terms like SKUs, function names, and error codes. The bottom path embeds the query and runs k-NN vector search, which is strong on semantic meaning. The two result sets converge in a fusion step (typically reciprocal rank fusion, which combines rank positions rather than raw scores, since BM25 and vector scores sit on different scales). A reranker model then re-scores each of the top fused candidates against the query. Only the top-k reranked chunks go to the FM.
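The fusion step can be sketched in a few lines. This is a minimal illustration of reciprocal rank fusion, assuming each retriever returns an ordered list of chunk IDs; the function name, the document IDs, and the constant k=60 (a commonly used default) are assumptions for the example, not values taken from this pattern.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Only rank positions are used, so BM25 and
    vector scores never need to be on the same scale.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from the two retrieval paths.
bm25_hits = ["sku-sheet", "returns-faq", "shipping"]          # keyword path
vector_hits = ["refund-policy", "sku-sheet", "returns-faq"]   # vector path

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Chunks that appear in both lists accumulate two contributions, so they tend to rise above chunks that only one retriever found.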
The payoff: you get keyword precision and semantic breadth, and the reranker catches cases where the best chunks weren't ranked first by either method alone.
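To make the reranking stage concrete, here is a small sketch of re-scoring fused candidates and keeping only the best few. The scorer below is a toy word-overlap function standing in for a real reranker model; it is an assumption for illustration only, and the chunk IDs and texts are made up.

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Re-score (doc_id, text) candidates against the query; keep the best top_k."""
    scored = [(score_fn(query, text), doc_id) for doc_id, text in candidates]
    scored.sort(key=lambda pair: -pair[0])  # stable sort: ties keep fused order
    return [doc_id for _, doc_id in scored[:top_k]]

def toy_overlap_score(query, text):
    """Stand-in scorer (NOT a real reranker): share of query terms found in text."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms)

# Hypothetical fused candidates, in fused order (best first).
candidates = [
    ("shipping", "shipping times for sku ab-447"),
    ("returns-faq", "general policy on returns"),
    ("refund-ab447", "refund policy for sku ab-447 within 30 days"),
]

top = rerank("refund policy for SKU AB-447", candidates, toy_overlap_score, top_k=2)
print(top)
```

The chunk that fusion placed third ends up first after reranking, which is the case the payoff describes. In a real deployment `score_fn` would call a reranker model instead of counting shared words.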
AWS services used
Amazon OpenSearch: Hosts both the BM25 keyword index and the k-NN vector index. Supports true hybrid search natively (neural + lexical in one query).
Bedrock Reranker: Bedrock offers Cohere Rerank and other reranker models. Takes the top N candidates and re-scores them for relevance.
Amazon Titan Embeddings: Embeds the query into the same vector space as the indexed chunks.
Bedrock FM: Generates the final answer from the top reranked chunks. Fewer chunks, higher quality = better grounding.
Lambda / Bedrock KB: Orchestrates the parallel search, fusion, and reranking. Knowledge Bases can handle this natively with hybrid search + rerank configured.
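As a rough sketch of what the OpenSearch side can look like, the helpers below build the two request bodies involved in a hybrid query: a search pipeline that normalizes and combines the sub-query scores, and the hybrid query itself. The field names (`chunk_text`, `chunk_embedding`), the weights, and the helper names are assumptions for illustration. Note that this sketch combines scores with min-max normalization and a weighted mean rather than reciprocal rank fusion; check the OpenSearch documentation for the options your version supports.

```python
def normalization_pipeline(keyword_weight=0.3, vector_weight=0.7):
    """Request body for a search pipeline that normalizes and combines scores.

    Weights are listed in the same order as the sub-queries in hybrid_query().
    """
    return {
        "description": "Hybrid search score normalization (illustrative)",
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {
                        "technique": "arithmetic_mean",
                        "parameters": {"weights": [keyword_weight, vector_weight]},
                    },
                }
            }
        ],
    }

def hybrid_query(text, query_vector, k=10,
                 text_field="chunk_text", vector_field="chunk_embedding"):
    """Request body combining a BM25 match clause and a k-NN clause."""
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {text_field: {"query": text}}},                 # lexical
                    {"knn": {vector_field: {"vector": query_vector, "k": k}}},  # vector
                ]
            }
        },
    }

pipeline_body = normalization_pipeline()
search_body = hybrid_query("refund policy for SKU AB-447", [0.1, 0.2, 0.3], k=5)
```

The pipeline body would be registered once with `PUT /_search/pipeline/<name>`, and the search body sent with `GET /<index>/_search?search_pipeline=<name>`. The three-element vector is a placeholder; a real query vector comes from the embedding model and must match the index dimension.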
When to use this pattern
✓ Use Advanced RAG when…
Basic RAG retrieval quality is not good enough: Users report missing relevant results, or irrelevant results crowding out good ones. Adding hybrid + rerank typically lifts recall and precision together.
Queries mix exact terms and conceptual language: In "Refund policy for SKU AB-447", you need to match both "AB-447" (exact) and "refund policy" (semantic). Pure vector search can miss the SKU.
Your corpus contains codes, names, IDs, function signatures, or technical jargon: Technical docs, API references, legal documents, medical records: anywhere exact terminology matters and semantics alone aren't enough.
You need higher precision before sending to the FM: Fewer, higher-quality chunks mean less context to process, cheaper calls, and better grounded answers.
You're okay with slightly higher latency per query: Hybrid + rerank adds ~100-300ms. Acceptable for most chat apps; prohibitive only for very tight SLAs.
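One way to check whether your traffic really mixes exact terms with conceptual language is to sample queries and flag tokens that look like identifiers. The heuristic below is purely hypothetical (digits, underscores, or camelCase mark a token as "exact") and is meant as a quick audit aid, not as part of the retrieval pipeline.

```python
import re

# Hypothetical heuristic: a token "looks exact" if it contains a digit,
# an underscore, or a lowercase letter followed by an uppercase one (camelCase).
_EXACT_TOKEN = re.compile(r"\d|_|[a-z][A-Z]")

def exact_terms(query):
    """Return the tokens in a query that look like codes, IDs, or identifiers."""
    tokens = [t.strip(".,;:!?\"'()") for t in query.split()]
    return [t for t in tokens if t and _EXACT_TOKEN.search(t)]

print(exact_terms("Refund policy for SKU AB-447"))
print(exact_terms("why does parse_config fail?"))
print(exact_terms("what is our returns policy"))
```

If a large share of sampled queries return a non-empty list, the keyword path is likely to earn its place; if almost none do, pure vector search may already be enough.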
✗ Do NOT use Advanced RAG when…
Basic RAG is already producing good results: Don't over-engineer. If Pattern 1 is working, the added complexity of hybrid + rerank is wasted cost and latency.
Ultra-low latency is a hard requirement: Think real-time voice assistants or autocomplete. The extra hops (fusion + reranker call) can push you past your latency budget.
Your corpus is purely conversational / narrative text: Natural-language FAQs, customer support transcripts, marketing copy: pure semantic search already covers these well.
You haven't yet tuned chunking or embedding: If chunks are too large or the embedding model mismatches your domain, hybrid + rerank won't save you. Fix the fundamentals first.
You need transactional operations, not retrieval: "Cancel my order" isn't a retrieval problem. Use Pattern 3: Agent.
Exam angle
Pattern-match shortcuts
When a stem mentions "retrieval is returning irrelevant results", "missing exact matches on codes/IDs/names", or "improve retrieval precision", the answer is usually hybrid search + reranking. On AWS, this maps to OpenSearch hybrid search and Bedrock rerank models.
The "throw more chunks at it" trap
A distractor: "increase top-k from 3 to 20." More chunks means more noise, not better answers. The FM gets confused by irrelevant content, context fills up, and grounding degrades. Reranking fixes this by making your small top-k count — quality over quantity.
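The arithmetic behind the trap can be shown with precision@k, the share of the top k chunks that are actually relevant. The ranked lists below are made-up data for illustration: three relevant chunks (r1 to r3) among seventeen irrelevant ones.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant (0.0 if nothing is returned)."""
    top = ranked_ids[:k]
    if not top:
        return 0.0
    return sum(1 for d in top if d in relevant_ids) / len(top)

relevant = {"r1", "r2", "r3"}

# Hypothetical ranking before reranking: relevant chunks at positions 1, 2, and 5.
before = ["r1", "r2", "n1", "n2", "r3"] + [f"n{i}" for i in range(3, 18)]
# Hypothetical ranking after reranking: all three relevant chunks first.
after = ["r1", "r2", "r3"] + [d for d in before if d not in relevant]

p3_before = precision_at_k(before, relevant, 3)    # 2 of 3 relevant
p20_before = precision_at_k(before, relevant, 20)  # 3 of 20 relevant
p3_after = precision_at_k(after, relevant, 3)      # 3 of 3 relevant
print(p3_before, p20_before, p3_after)
```

In this toy case, widening to top-20 does pick up the third relevant chunk, but 17 of the 20 chunks sent to the FM are noise. Reranking and keeping top-3 delivers the same three relevant chunks with none of the noise.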
Keywords that point here
hybrid search · BM25 + vector · reranker · exact term matching · retrieval precision · irrelevant results · product SKUs · function signatures