Task 1.1 — Analyze requirements and design GenAI solutions
Architectural design for GenAI
Align business needs with technical constraints. For every design question, weigh: model fit for the use case (text / code / multimodal / reasoning), latency tolerance, cost ceiling, context window needs, and compliance/data residency requirements.
Proof-of-concept (PoC) implementations
Build PoCs in Amazon Bedrock before committing to full deployment — it lets you test multiple FMs without infrastructure setup. Validate performance characteristics, business value, and cost projections early.
Well-Architected Framework — Generative AI Lens
AWS WA Tool includes a Generative AI Lens with standardized best practices across the six WAF pillars (operational excellence, security, reliability, performance efficiency, cost optimization, sustainability) specifically for FM-based applications. If a question says "standardized components" or "consistent implementation across deployments," the answer is usually the WA Tool GenAI Lens.
Task 1.2 — Select and configure FMs
Model selection factors
Task fit
- Summarization vs. code gen vs. reasoning vs. chat
- Multimodal (images, audio)
- Language support
Context window
- Long documents → need large context
- RAG lets smaller context work
- Larger = more expensive per call
Latency
- Interactive chat → streaming + fast model
- Batch → latency tolerant
- Latency-optimized Bedrock models
Cost per token
- Small models for simple tasks
- Large models only when needed
- Model cascading pattern (sketch below)
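A minimal cascading sketch: try the cheap model first and escalate only when it signals low confidence. The `UNSURE` sentinel is a naive placeholder for a real grader model or heuristic; model IDs are real Bedrock IDs.

```python
# Model cascading: small model handles simple queries, large model is the
# fallback. The UNSURE sentinel is an illustrative confidence check only.
import boto3

bedrock = boto3.client("bedrock-runtime")

SMALL = "anthropic.claude-3-haiku-20240307-v1:0"
LARGE = "anthropic.claude-3-sonnet-20240229-v1:0"

def ask(model_id: str, prompt: str) -> str:
    resp = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
    )
    return resp["output"]["message"]["content"][0]["text"]

def cascade(prompt: str) -> str:
    answer = ask(SMALL, prompt + "\nIf you are not sure, reply exactly: UNSURE")
    return ask(LARGE, prompt) if "UNSURE" in answer else answer
```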
Dynamic model selection architecture
Build flexible architectures that allow model switching without code changes. The canonical pattern: store the active model ID in AWS AppConfig and have Lambda fetch it at runtime, so swapping models is a config deployment rather than a code change.
This enables A/B testing, gradual rollouts, and instant rollback to a previous model without redeploying Lambda.
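A minimal Lambda-side sketch, assuming the model ID lives in an AppConfig profile as JSON like `{"modelId": "..."}`; the application/environment/profile names are placeholders.

```python
# Sketch: Lambda reads the active model ID from AppConfig at runtime, so a
# model switch is a config deployment, not a code deployment.
import json
import boto3

appconfig = boto3.client("appconfigdata")
bedrock = boto3.client("bedrock-runtime")

# Start a configuration session once, during Lambda init
token = appconfig.start_configuration_session(
    ApplicationIdentifier="genai-app",          # placeholder names
    EnvironmentIdentifier="prod",
    ConfigurationProfileIdentifier="model-config",
)["InitialConfigurationToken"]

model_id = "anthropic.claude-3-haiku-20240307-v1:0"  # fallback default

def handler(event, context):
    global token, model_id
    resp = appconfig.get_latest_configuration(ConfigurationToken=token)
    token = resp["NextPollConfigurationToken"]
    payload = resp["Configuration"].read()
    if payload:  # empty when the config hasn't changed since the last poll
        model_id = json.loads(payload)["modelId"]

    result = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": event["prompt"]}]}],
    )
    return result["output"]["message"]["content"][0]["text"]
```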
Resilient AI systems — surviving disruptions
Cross-Region Inference absorbs regional capacity constraints by routing requests across Regions; Step Functions wraps inference calls with retries, timeouts, and fallback paths.
FM customization & lifecycle management
SageMaker AI with Model Registry covers versioning and approval of customized models; LoRA and other adapter methods are the efficient adaptation route.
Task 1.3 — Data validation & processing pipelines for FM consumption
Data validation workflows
Multimodal data processing
Input formatting for FM inference
- JSON formatting — Bedrock API requests use strict JSON with model-specific keys (`messages`, `system`, `max_tokens`).
- Conversation formatting — dialog apps use an alternating `user`/`assistant` message structure with an optional `system` message.
- Structured preparation — SageMaker endpoints need input shaped to the container's expected format (often CSV, JSON Lines, or NumPy).
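For example, an Anthropic Messages request on Bedrock (the prompt is illustrative; other providers expect different body keys):

```python
# invoke_model with the Anthropic Messages schema on Bedrock.
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "system": "You are a concise assistant.",
    "messages": [
        {"role": "user", "content": "Summarize the attached report in 3 bullets."},
    ],
}

resp = bedrock.invoke_model(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    body=json.dumps(body),
)
print(json.loads(resp["body"].read())["content"][0]["text"])
```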
Data enhancement
- Reformat messy text — use Bedrock itself to clean and restructure input before the actual inference call.
- Amazon Comprehend — extract entities (people, places, orgs) from unstructured text.
- Lambda normalization — dates, currencies, units to consistent formats.
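A minimal Comprehend call for the entity-extraction step (sample text is illustrative):

```python
# Extract entities from unstructured text, e.g. to tag chunks before embedding.
import boto3

comprehend = boto3.client("comprehend")

resp = comprehend.detect_entities(
    Text="Amazon opened its Arlington HQ2 offices in June 2023.",
    LanguageCode="en",
)
for entity in resp["Entities"]:
    print(entity["Type"], entity["Text"], round(entity["Score"], 2))
```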
Task 1.4 — Design & implement vector store solutions
The core idea
A vector database is optimized for storing and querying high-dimensional vectors (embeddings). Instead of exact keyword matching, it finds items that are semantically similar. An embedding is a numerical representation of data — similar content produces similar vectors.
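A quick sketch of the idea with Titan Text Embeddings V2: embed two paraphrases and compare cosine similarity (plain NumPy math, real model ID):

```python
# "Similar content produces similar vectors" in practice.
import json
import boto3
import numpy as np

bedrock = boto3.client("bedrock-runtime")

def embed(text: str) -> np.ndarray:
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return np.array(json.loads(resp["body"].read())["embedding"])

a = embed("How do I reset my password?")
b = embed("Steps to recover a lost password")
print(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))  # close to 1.0
```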
Vector store options on AWS
Bedrock Knowledge Bases
- Managed end-to-end RAG service
- Handles chunking, embedding, storage, retrieval
- Supports hierarchical organization
- Pluggable backing store (OpenSearch Serverless, Aurora pgvector, Pinecone)
OpenSearch Service
- k-NN vector search via the k-NN and Neural Search plugins
- Native Bedrock integration
- Sharding for parallelism
- Topic-based segmentation
- Hybrid search (BM25 + vector)
Aurora (pgvector)
- PostgreSQL + pgvector extension
- SQL-based vector search (sketch after these options)
- Good when mixing relational + vector
- Familiar ops model
DynamoDB
- Often paired with a vector DB
- Stores metadata, document IDs
- Real-time change detection via Streams
RDS + S3
- Document repositories in S3
- RDS for structured metadata
- Pointers from RDS to S3 objects
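A pgvector lookup sketch, assuming a `document_chunks` table with an `embedding vector(1024)` column; all names, credentials, and the zero query vector are placeholders.

```python
# Similarity query on Aurora PostgreSQL with the pgvector extension enabled.
import psycopg2

conn = psycopg2.connect(
    host="my-cluster.cluster-xyz.us-east-1.rds.amazonaws.com",
    dbname="docs", user="app", password="***",
)

query_embedding = str([0.0] * 1024)  # normally comes from a Titan embedding call

with conn.cursor() as cur:
    cur.execute(
        """
        SELECT doc_id, title, embedding <=> %s::vector AS distance  -- <=> is cosine distance
        FROM document_chunks
        ORDER BY distance
        LIMIT 5
        """,
        (query_embedding,),
    )
    for doc_id, title, distance in cur.fetchall():
        print(doc_id, title, round(distance, 4))
```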
Metadata frameworks — the unsung hero of retrieval precision
Good metadata narrows vector search results before semantic scoring. "Only documents from Q1 2025" is a metadata filter, not a vector filter.
- S3 object metadata — document timestamps, source system, classification
- Custom attributes — authorship, department, sensitivity
- Tagging systems — domain classification for multi-tenant RAG
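Sketch of a metadata-filtered Knowledge Base retrieval; the `quarter` attribute and the KB ID are assumed:

```python
# The metadata filter narrows candidates BEFORE semantic scoring.
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

resp = agent_runtime.retrieve(
    knowledgeBaseId="KB123EXAMPLE",  # placeholder
    retrievalQuery={"text": "revenue drivers"},
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 5,
            "filter": {"equals": {"key": "quarter", "value": "Q1-2025"}},
        }
    },
)
for r in resp["retrievalResults"]:
    print(r.get("metadata", {}), r["content"]["text"][:60])
```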
High-performance vector architectures
Keeping vector stores current — data maintenance systems
Task 1.5 — Design retrieval mechanisms for FM augmentation (RAG)
The canonical RAG pipeline
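Ingest side: chunk → embed → store. Query side: embed the query → retrieve top-k chunks → inject them into the prompt → generate. With Bedrock Knowledge Bases the whole query side is one call; a sketch with placeholder ID and ARN:

```python
# One-call RAG: retrieval, prompt augmentation, and generation handled
# by the service.
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

resp = agent_runtime.retrieve_and_generate(
    input={"text": "What changed in the Q1 security policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB123EXAMPLE",
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
print(resp["output"]["text"])
```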
Chunking strategies — side-by-side
Fixed-size
- ✅ Simple, predictable
- ✅ Uniform vector dimensions
- ❌ Breaks mid-sentence
- ❌ Splits semantic units
Hierarchical
- ✅ Preserves structure
- ✅ Respects document hierarchy
- ✅ Good for technical docs
- ❌ Variable chunk sizes
Semantic
- ✅ Preserves meaning
- ✅ Best retrieval quality
- ❌ More expensive to compute
- ❌ Harder to tune
Bedrock managed
- ✅ All strategies built-in
- ✅ No custom code
- ✅ Default is fixed-size
- ✅ Best for most cases
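For reference, a sketch of the fixed-size chunking config passed when creating a Knowledge Base data source; the IDs and bucket ARN are placeholders.

```python
# vectorIngestionConfiguration controls the managed chunking strategy.
import boto3

bedrock_agent = boto3.client("bedrock-agent")

bedrock_agent.create_data_source(
    knowledgeBaseId="KB123EXAMPLE",
    name="policies-s3",
    dataSourceConfiguration={
        "type": "S3",
        "s3Configuration": {"bucketArn": "arn:aws:s3:::my-docs-bucket"},
    },
    vectorIngestionConfiguration={
        "chunkingConfiguration": {
            "chunkingStrategy": "FIXED_SIZE",  # also: HIERARCHICAL, SEMANTIC, NONE
            "fixedSizeChunkingConfiguration": {
                "maxTokens": 300,          # service default
                "overlapPercentage": 20,
            },
        }
    },
)
```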
Embedding solutions
Advanced search architectures
Hybrid search — the "better than semantic alone" pattern
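Hybrid search combines BM25 keyword scoring with vector similarity, which helps on short or keyword-heavy queries. With a Knowledge Base backed by OpenSearch you can request it per query; a sketch with a placeholder KB ID:

```python
# overrideSearchType asks the backing store to blend lexical and vector scores.
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

resp = agent_runtime.retrieve(
    knowledgeBaseId="KB123EXAMPLE",
    retrievalQuery={"text": "error 0x80070005 access denied fix"},  # keyword-heavy
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 5,
            "overrideSearchType": "HYBRID",  # alternative: "SEMANTIC"
        }
    },
)
for r in resp["retrievalResults"]:
    print(round(r["score"], 3), r["content"]["text"][:80])
```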
Query handling systems
Query expansion
- Use Bedrock to enrich the query (sketch after this section)
- Catches near-miss matches
- Good for sparse corpora
Query decomposition
- Lambda splits multi-part questions
- Retrieve for each sub-query
- Combine context before final FM call
Query transformation
- Step Functions orchestrates
- Rewrite → expand → decompose
- Advanced RAG pattern
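A query-expansion sketch using a small, fast model before retrieval; the rewrite prompt is illustrative:

```python
# Ask a cheap model to rewrite the user query into variants, retrieve for
# each variant, then dedupe/merge chunks before the final FM call.
import boto3

bedrock = boto3.client("bedrock-runtime")

def expand_query(query: str) -> list[str]:
    resp = bedrock.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=[{
            "role": "user",
            "content": [{"text": "Rewrite this search query three different "
                                 f"ways, one per line, no numbering:\n{query}"}],
        }],
    )
    text = resp["output"]["message"]["content"][0]["text"]
    return [query] + [line.strip() for line in text.splitlines() if line.strip()]

print(expand_query("reset MFA device"))
```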
Consistent access mechanisms
- Function calling — the FM emits a structured call to a well-defined function (e.g., vector search) that your application executes.
- MCP (Model Context Protocol) clients — standardized protocol for tools and data; agents consume MCP servers exposing vector queries. You already know this from Claude Code.
- Standardized API patterns — consistent interfaces for retrieval augmentation regardless of the backend (KB, OpenSearch, Aurora).
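A function-calling sketch via the Converse API `toolConfig`; the `search_documents` tool name and schema are assumptions for illustration:

```python
# The model decides when to call the tool; your code runs the vector search
# and sends the result back in a follow-up message.
import boto3

bedrock = boto3.client("bedrock-runtime")

tool_config = {
    "tools": [{
        "toolSpec": {
            "name": "search_documents",  # hypothetical tool
            "description": "Semantic search over the policy vector store.",
            "inputSchema": {"json": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            }},
        }
    }]
}

resp = bedrock.converse(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": [{"text": "What is our travel policy?"}]}],
    toolConfig=tool_config,
)
print(resp["stopReason"])  # "tool_use" means: run the search, return the result
```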
Task 1.6 — Prompt engineering strategies & governance
Model instruction frameworks
Interactive AI systems
Prompt management & governance — the audit story
Prompt QA systems
- Lambda verification — verify expected output format/content after each inference.
- Step Functions test orchestration — systematically test edge cases across prompt versions.
- CloudWatch regression detection — catch performance degradation over time on a golden dataset.
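A Lambda-style verification sketch; the required keys are an assumed output contract:

```python
# Verify the model returned parseable JSON with the expected keys before
# passing it downstream; raise so the caller can retry or re-prompt.
import json

REQUIRED_KEYS = {"answer", "confidence"}  # assumed contract

def verify_output(raw: str) -> dict:
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        raise ValueError("Model output is not valid JSON; retry or re-prompt")
    missing = REQUIRED_KEYS - parsed.keys()
    if missing:
        raise ValueError(f"Model output missing keys: {missing}")
    return parsed
```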
Advanced prompt engineering techniques
Chain-of-thought
- Instruct reasoning before answering
- Better for math, logic, multi-step
- Slower and more tokens
Structured input
- XML-style tags: `<context>`, `<question>`
- Clear separation of parts
- Improves instruction following (combined sketch after these blocks)
Output specifications
- JSON schema
- Response shape constraints
- Reduces parsing errors
Feedback loops
- Grade first output
- Refine prompt or retry
- Self-correcting systems
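A combined sketch of the techniques above: structured input tags, a chain-of-thought instruction, and a JSON output spec. The tag names and schema are conventions, not an API requirement.

```python
# Structured prompt template with an explicit output contract.
import boto3

bedrock = boto3.client("bedrock-runtime")

prompt = """<context>
{context}
</context>

<question>
{question}
</question>

Think step by step inside <thinking> tags, then answer ONLY with JSON:
{{"answer": "<string>", "confidence": "high|medium|low"}}"""

resp = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    messages=[{"role": "user", "content": [{"text": prompt.format(
        context="Refund window is 30 days.",
        question="Can I return an item after 6 weeks?",
    )}]}],
)
print(resp["output"]["message"]["content"][0]["text"])
```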
Complex prompt systems with Bedrock Prompt Flows
Bedrock Prompt Flows is the visual workflow builder for sequential prompt chains. It handles:
- Sequential chains — output of prompt A feeds prompt B
- Conditional branching — route to different prompts based on model responses
- Reusable components — modular prompt pieces composed across flows
- Integrated pre/post-processing — transform input before prompting, output after
Domain 1 summary — what to remember
The service map
- Core: Amazon Bedrock (everything starts here)
- RAG: Bedrock Knowledge Bases (managed RAG)
- Vector: OpenSearch / Aurora pgvector / DynamoDB
- Embeddings: Amazon Titan Embeddings
- Prompts: Prompt Management + Prompt Flows
- Resilience: Cross-Region Inference + Step Functions
- Customization: SageMaker AI + Model Registry + LoRA
- Config: AppConfig for dynamic model selection
The mental shortcuts
- New project? Bedrock PoC first.
- Efficient adaptation? LoRA / adapters.
- Short / keyword-heavy queries? Hybrid search.
- Keep KB current? EventBridge + incremental.
- Multi-step prompts with branching? Prompt Flows.
- Version & audit prompts? Prompt Management.
- Model switch without redeploy? AppConfig.
- Regional capacity issues? Cross-Region Inference.