Format differences from SAP-C02

This exam introduces two new question types that SAP-C02 doesn't have. You must get the entire answer correct — no partial credit.

Ordering questions

Given 3-5 steps, arrange them in the correct sequence. Example: "Put these RAG pipeline steps in order: embed documents, chunk documents, store in vector database, retrieve relevant chunks, generate response."

You must understand entire workflows end-to-end, not just individual steps.
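The correct sequence is chunk → embed → store → retrieve → generate. A toy end-to-end sketch of that order (the chunking, embedding, and "database" below are placeholder stubs, not any real library's API):

```python
def chunk(documents, size=40):
    # 1. Chunk documents into smaller passages.
    return [d[i:i + size] for d in documents for i in range(0, len(d), size)]

def embed(chunks):
    # 2. Embed each chunk (toy embedding: length and space count).
    return [[len(c), c.count(" ")] for c in chunks]

def store(chunks, vectors):
    # 3. Store chunk/vector pairs in a "vector database" (here: a list).
    return list(zip(chunks, vectors))

def retrieve(index, query_vector, k=2):
    # 4. Retrieve the k nearest chunks (toy distance: squared L2).
    dist = lambda v: sum((a - b) ** 2 for a, b in zip(v, query_vector))
    return [c for c, v in sorted(index, key=lambda cv: dist(cv[1]))[:k]]

def generate(query, context_chunks):
    # 5. Generate a response grounded in the retrieved context
    #    (a real system would call a foundation model here).
    return f"Answer to {query!r} using {len(context_chunks)} chunks"

docs = ["Amazon Bedrock is a managed service for foundation models."]
chunks = chunk(docs)
index = store(chunks, embed(chunks))
answer = generate("What is Bedrock?", retrieve(index, [40, 6]))
```

If a question scrambles these five steps, mentally rebuild the data flow: nothing can be embedded before it's chunked, and nothing can be retrieved before it's stored.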

Matching questions

You're given 3-7 pairs to match. Example: "Match each AWS service to its purpose in a GenAI architecture" or "Match each symptom to its root cause."


Use process of elimination. If you're sure of 2 matches, the third may be forced by what's left.


Common exam traps

Trap 1 — Confusing Fine-Tuning with RAG

Fine-tuning = modifying the model itself with new training data. Expensive, slow, but good for changing model behavior fundamentally.
RAG = providing context at inference time from external documents. Cheaper, dynamic, good for grounding in current data.
Exam trap: Dynamic/frequently-changing data? RAG. Domain-specific style/tone/format the model can't do well? Fine-tuning. Most of the time RAG is the right answer because it's cheaper and more maintainable.
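On the RAG side, Bedrock Knowledge Bases wraps retrieval and generation in a single RetrieveAndGenerate call. A hedged boto3 sketch; the knowledge base ID and model ARN are placeholders, and the request shape should be verified against the current boto3 docs:

```python
import json

def build_rag_request(question, kb_id, model_arn):
    # Request shape for bedrock-agent-runtime RetrieveAndGenerate,
    # as of recent boto3 versions -- verify against current docs.
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,   # placeholder ID
                "modelArn": model_arn,      # placeholder ARN
            },
        },
    }

def ask_knowledge_base(question, kb_id, model_arn):
    import boto3  # imported here so the sketch loads without AWS deps
    client = boto3.client("bedrock-agent-runtime")
    resp = client.retrieve_and_generate(
        **build_rag_request(question, kb_id, model_arn))
    return resp["output"]["text"]

request = build_rag_request("What is our refund policy?", "KB123EXAMPLE",
                            "arn:aws:bedrock:us-east-1::foundation-model/example")
print(json.dumps(request, indent=2))
```

Note how little there is: no training job, no model artifacts. That's why "frequently changing data" points here rather than to fine-tuning.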

Trap 2 — Managed vs. Self-Managed

AWS exams almost always prefer managed services over self-managed, unless the question explicitly asks for customization that managed services can't provide.
• Bedrock Knowledge Bases > building your own RAG pipeline with Lambda + OpenSearch
• Bedrock Agents > custom Step Functions agent orchestration

Trap 3 — Cost Optimization Red Herrings

Questions may present an expensive architecture and ask how to reduce cost. Common correct patterns: model cascading, semantic caching, prompt compression, Provisioned Throughput for consistent workloads, on-demand for variable workloads.
Trap: choosing fine-tuning to "reduce tokens" when prompt engineering achieves the same result more cheaply.
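Model cascading, the first pattern above, fits in a few lines: answer with a cheap model and escalate only when its confidence is low. All names and the confidence heuristic here are invented for illustration:

```python
CHEAP, EXPENSIVE = "small-model", "large-model"

def invoke(model, prompt):
    # Stand-in for a real FM call; returns (answer, confidence).
    # A real router might use logprobs or a verifier model instead.
    if model == CHEAP:
        return f"{model}: {prompt}", 0.55 if "complex" in prompt else 0.95
    return f"{model}: {prompt}", 0.99

def cascade(prompt, threshold=0.8):
    answer, confidence = invoke(CHEAP, prompt)
    if confidence >= threshold:
        return answer                        # cheap model was good enough
    return invoke(EXPENSIVE, prompt)[0]      # escalate only when needed

print(cascade("summarize this memo"))     # handled by the cheap model
print(cascade("complex legal analysis"))  # escalated to the large model
```

The cost win comes from the traffic mix: if most prompts clear the threshold, the expensive model only sees the hard tail.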

Trap 4 — Security Over-Engineering

Your CISSP instinct to add every security control may lead you astray. The exam wants the right level of security for the scenario, not maximum security.
• VPC endpoints + IAM + encryption at rest/in transit covers most scenarios
• Don't add Lambda@Edge content filtering if Bedrock Guardrails handles it natively
• Don't add a KMS customer-managed key if AWS-managed keys meet the stated requirements

Trap 5 — Confusing Bedrock Services

Three different services, often conflated:
Guardrails = content safety filters (input/output)
Agents = autonomous tool-using AI systems
Knowledge Bases = managed RAG (document retrieval)
These combine but serve different purposes. Read the question carefully to identify which one(s) apply.

Trap 6 — Streaming vs. Synchronous

Streaming = tokens delivered incrementally. Better UX for chat.
Synchronous = wait for the full response. Simpler, good for batch processing.
Trap: Using synchronous when the question describes a chat interface (streaming is the right answer every time).
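A hedged boto3 sketch of the streaming path (the model ID, request-body schema, and stream-event shape are model-specific assumptions; check the current Bedrock Runtime docs):

```python
import json

def build_body(prompt, max_tokens=256):
    # Anthropic-messages-style request body -- the exact schema is
    # model-specific; treat this as an illustrative assumption.
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

def stream_chat(prompt, model_id="anthropic.claude-3-haiku-20240307-v1:0"):
    import boto3  # imported here so the sketch loads without AWS deps
    client = boto3.client("bedrock-runtime")
    resp = client.invoke_model_with_response_stream(
        modelId=model_id, body=build_body(prompt))
    for event in resp["body"]:               # tokens arrive incrementally
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk.get("type") == "content_block_delta":
            yield chunk["delta"].get("text", "")
```

The synchronous equivalent is InvokeModel: same body, but you block until the full response arrives, which is why chat-interface questions point at the streaming variant.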

Trap 7 — Vector Database Selection

OpenSearch = best for pure vector search at scale with advanced filtering
Aurora pgvector = best when you already have relational data and want to add vector search
Bedrock Knowledge Bases = best for managed RAG with minimal setup (uses OpenSearch Serverless by default; other vector stores are supported)
DynamoDB = not a vector database; used for metadata alongside a vector store

AWS exam reasoning framework

Apply this to every question, in this order:

Step 1 — Eliminate wrong answers: out of scope · over-engineered · ignores constraints
Step 2 — Prefer managed: the AWS answer favors AWS-managed services
Step 3 — Match constraints: cost? latency? compliance?
Step 4 — Multi-constraint? The right answer satisfies ALL stated constraints
Step 5 — Read every word: a single "existing" or "real-time" changes the answer

Service decision matrix

When you see a phrase like one of these, the right service jumps out:

"I need to..." → AWS service(s)
Add AI to my existing app → Bedrock API + API Gateway + Lambda
Answer questions from company documents → Bedrock Knowledge Bases (managed RAG)
Build an AI that can take actions → Bedrock Agents or Strands Agents
Filter harmful content → Bedrock Guardrails
Manage prompts across teams → Bedrock Prompt Management
Chain multiple AI steps (no-code) → Bedrock Prompt Flows
Orchestrate complex logic with non-Bedrock services → AWS Step Functions
Deploy a custom or fine-tuned model → SageMaker AI endpoints
Enterprise search over internal data → Amazon Q Business or Kendra
Detect PII before sending to FM → Comprehend PII detection + Guardrails
Track model performance over time → CloudWatch + Bedrock Model Invocation Logs
Ensure model availability across regions → Bedrock Cross-Region Inference
Reduce FM costs → Model cascading + caching + prompt compression + right-sized models
Build a chat interface → API Gateway WebSocket + InvokeModelWithResponseStream
Persistent-connection MCP tool → ECS (not Lambda)
Approval before high-value action → Step Functions callback pattern (human-in-the-loop)
Document processing at scale → Step Functions + Textract + Comprehend + Bedrock
Bias/fairness evaluation → SageMaker Clarify or LLM-as-a-Judge
Explain how an agent reasoned → Bedrock Agent Tracing
Keep Bedrock off public internet → VPC Endpoints (PrivateLink)

Red-flag words in question stems

These words shift the correct answer. Highlight them mentally when you see them.

"Existing"
They already have something — don't replace it unnecessarily. Choose the option that augments, not rebuilds.
"Minimize operational overhead"
Managed services win. Bedrock > SageMaker > EC2 self-hosted. Bedrock Knowledge Bases > custom RAG pipeline.
"Lowest cost" / "most cost-effective"
Model cascading, semantic/prompt caching, right-sized models, batch inference (~50% discount).
"Real-time" / "low latency"
Streaming APIs, WebSocket, latency-optimized models, edge caching.
"Compliance" / "audit"
CloudTrail + Bedrock Model Invocation Logs + Model Cards + VPC endpoints.
"Current data" / "frequently changing"
RAG, not fine-tuning. Fine-tuning is static; RAG is dynamic.
"Domain-specific language/tone/style"
Fine-tuning or continued pre-training. This is where RAG alone falls short.
"Must not leave AWS network"
VPC endpoints (PrivateLink).
"Global users" / "low latency worldwide"
Bedrock Cross-Region Inference + CloudFront edge caching.
"No downtime" / "zero-downtime"
Blue/green deployment, canary, parallel Knowledge Bases.
"Across teams" / "enterprise-wide"
GenAI Gateway pattern. Centralized API Gateway + Cognito + Guardrails + routing.
"Hallucination" / "grounded"
RAG + Guardrails contextual grounding + JSON Schema.

Confusing term pairs (study these carefully)

Fine-Tuning vs. LoRA

Both modify the model. Fine-tuning updates all parameters (expensive). LoRA updates a small subset (cheap, fast). LoRA is a type of parameter-efficient fine-tuning.
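The savings are easy to quantify: for a d×k weight matrix, LoRA trains factors B (d×r) and A (r×k) instead of all d·k weights, so the trainable count falls from d·k to r·(d+k). A quick sketch with illustrative dimensions:

```python
def trainable_params(d, k, r=None):
    # Full fine-tuning updates the whole d x k matrix; LoRA updates
    # only the low-rank factors B (d x r) and A (r x k).
    return d * k if r is None else r * (d + k)

d, k, r = 4096, 4096, 8           # typical transformer layer, small rank
full = trainable_params(d, k)     # 16,777,216 parameters
lora = trainable_params(d, k, r)  #     65,536 parameters
print(f"LoRA trains {lora / full:.2%} of the full parameter count")
```

At rank 8 that's well under 1% of the layer's parameters, which is why LoRA is the cheap, fast option.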

Guardrails vs. Safety Classifiers

Guardrails = AWS-managed content filtering (a product). Safety classifiers = models trained to detect harmful content (a technique). Guardrails may use safety classifiers under the hood.

Prompt Flows vs. Step Functions

Both orchestrate multi-step workflows. Prompt Flows = no-code, Bedrock-native. Step Functions = code-based, any AWS service. Use Prompt Flows for simple prompt chains; Step Functions for complex logic with non-Bedrock services.

Bedrock Agents vs. Strands Agents

Bedrock Agents = fully managed, AWS-native. Strands Agents = open-source, more customizable, run on your compute. Exam prefers managed → Bedrock Agents unless customization demanded.

Semantic vs. Hybrid Search

Semantic = vector similarity only. Hybrid = vector + keyword (BM25) combined. Hybrid usually performs better on real data, especially for exact-term queries.
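A common way to fuse the two rankings is reciprocal rank fusion (RRF): each document scores the sum of 1/(k + rank) over the lists it appears in. A minimal sketch (k=60 is the conventional constant; the document IDs are made up):

```python
def rrf(ranked_lists, k=60):
    # Reciprocal rank fusion: sum 1/(k + rank) over every list a
    # document appears in, then sort by the fused score.
    scores = {}
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["doc_a", "doc_c", "doc_b"]   # vector-similarity ranking
keyword  = ["doc_b", "doc_a", "doc_d"]   # BM25 keyword ranking
print(rrf([semantic, keyword]))
```

Documents that rank well in both lists float to the top, which is exactly the property that makes hybrid search robust on exact-term queries.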

On-Demand vs. Provisioned Throughput

On-demand = pay per token, no commitment, subject to throttling. Provisioned = reserved capacity, consistent performance, minimum commitment. Use provisioned for predictable high-volume workloads.
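The breakeven is simple arithmetic: provisioned wins once monthly token volume exceeds the flat cost divided by the per-token rate. The prices below are hypothetical placeholders, not real Bedrock rates:

```python
def breakeven_tokens(on_demand_price_per_1k, provisioned_monthly_cost):
    # Monthly token volume above which provisioned capacity is cheaper
    # than pay-per-token. Inputs are hypothetical, not real rates.
    return provisioned_monthly_cost / on_demand_price_per_1k * 1000

tokens = breakeven_tokens(on_demand_price_per_1k=0.003,
                          provisioned_monthly_cost=15_000)
print(f"Provisioned wins above {tokens:,.0f} tokens/month")
```

If the question's stated volume sits far below the breakeven, or the workload is spiky, on-demand is the answer despite throttling risk.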

Context Window Overflow vs. Truncation

Overflow = input exceeds maximum, request fails. Truncation = system silently cuts off content to fit. Both lose information; different failure modes.
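A toy sketch of the two failure modes, using an invented 8-token "window" (real services count model-specific tokens, not whitespace splits):

```python
CONTEXT_WINDOW = 8

def strict_fit(tokens):
    # Overflow behavior: reject the request outright.
    if len(tokens) > CONTEXT_WINDOW:
        raise ValueError(f"input of {len(tokens)} tokens exceeds window")
    return tokens

def truncate_fit(tokens):
    # Truncation behavior: silently drop the overflowing tail.
    return tokens[:CONTEXT_WINDOW]

tokens = "one two three four five six seven eight nine ten".split()
print(truncate_fit(tokens))   # keeps only the first 8 tokens
try:
    strict_fit(tokens)
except ValueError as err:
    print(err)                # the overflow path fails loudly
```

Overflow is the easier bug to catch (you get an error); truncation is the dangerous one, because the model answers from a silently incomplete prompt.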

Model Card vs. Data Lineage

Model card = docs about the model itself (purpose, limits, perf). Data lineage = tracking where data came from and how it was transformed. Both serve governance at different levels.

Bedrock Agents vs. Bedrock AgentCore

Agents = the agent service itself (action groups, KBs, reasoning). AgentCore = infrastructure layer for deploying and scaling agents at production scale.

Amazon Q Business vs. Q Developer

Q Business = enterprise assistant over internal data (S3, SharePoint, Salesforce). Q Developer = coding assistant (code gen, refactor, debug).

Apply this now

Read this page once today, then again one day before the exam. On exam day, the reasoning framework (5 steps) and the red-flag-words table are the highest-leverage concepts to have top-of-mind. Everything else is factual knowledge that either sticks or doesn't.