Task 3.1 — Input & output safety controls
The six Bedrock Guardrails filter categories
Memorize these. Every Guardrails question maps to one or more:
1 · Denied topics
Custom categories you forbid
- e.g., "legal advice," "medical diagnosis"
- Natural-language topic definitions
2 · Content filters
Hate, insults, sexual, violence, misconduct
- Threshold per category (none/low/medium/high)
- Applied independently to input & output
3 · Word filters
Profanity / custom blocklists
- Exact-word blocking
- Domain-specific blocklists
4 · Sensitive info
PII detection & redaction
- SSN, credit card, email, phone, etc.
- Block or redact (mask)
5 · Contextual grounding
Catches hallucinations in RAG
- Checks if output is supported by retrieved context
- Flags/blocks ungrounded responses
6 · Prompt attack filter
Jailbreak / injection detection
- Detects override attempts
- Blocks at input layer
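The six categories map directly onto the sections of a `CreateGuardrail` request. A minimal sketch as a Python dict, assuming boto3 (names, thresholds, and the blocked word are illustrative):

```python
# Sketch of a Bedrock CreateGuardrail request body covering the six filter
# categories. In practice you'd pass it to the boto3 "bedrock" client:
#   bedrock.create_guardrail(**guardrail_config)
guardrail_config = {
    "name": "support-bot-guardrail",                    # hypothetical name
    "blockedInputMessaging": "Sorry, I can't help with that.",
    "blockedOutputsMessaging": "Sorry, I can't help with that.",
    # 1 · Denied topics: natural-language topic definitions
    "topicPolicyConfig": {"topicsConfig": [
        {"name": "LegalAdvice",
         "definition": "Requests for legal advice or interpretation of law.",
         "type": "DENY"},
    ]},
    # 2 · Content filters (and 6 · prompt attacks): per-category strength
    "contentPolicyConfig": {"filtersConfig": [
        {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
        {"type": "PROMPT_ATTACK", "inputStrength": "HIGH",
         "outputStrength": "NONE"},   # prompt attacks are input-only
    ]},
    # 3 · Word filters: exact-word blocklist
    "wordPolicyConfig": {"wordsConfig": [{"text": "internal-codename"}]},
    # 4 · Sensitive info: block or mask (ANONYMIZE) PII entity types
    "sensitiveInformationPolicyConfig": {"piiEntitiesConfig": [
        {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},
        {"type": "EMAIL", "action": "ANONYMIZE"},
    ]},
    # 5 · Contextual grounding: flag RAG output below a grounding score
    "contextualGroundingPolicyConfig": {"filtersConfig": [
        {"type": "GROUNDING", "threshold": 0.75},
    ]},
}
```

Note that one guardrail bundles all six policies; you attach it at invocation time rather than invoking each filter separately.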
Defense-in-depth for GenAI — the seven layers
Applied in order from outside to inside. This is Pattern 10 in the Architecture Patterns reference.
1 · Network
VPC endpoints (PrivateLink) for Bedrock — keep FM traffic off the public internet.
2 · Identity
IAM policies scoped to specific models and actions; Cognito for end-user auth; identity federation to enterprise IdP.
3 · Pre-processing
Amazon Comprehend PII detection + Lambda sanitization before the FM sees input.
4 · Model-level
Bedrock Guardrails — topic denial, content filters, PII, grounding, prompt-attack.
5 · Post-processing
Lambda validates output format, accuracy, and safety after the FM responds.
6 · API
API Gateway rate limiting; WAF against abuse.
7 · Audit
CloudTrail + Bedrock Model Invocation Logs for forensic traceability.
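Layer 2 is worth seeing concretely. A sketch of an identity-based IAM policy scoped to a single model and the two invoke actions (model ID and region are placeholders):

```python
import json

# Illustrative least-privilege IAM policy: allow invoking exactly one
# foundation model, nothing else. Foundation-model ARNs have no account ID.
iam_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "InvokeOneModelOnly",
        "Effect": "Allow",
        "Action": [
            "bedrock:InvokeModel",
            "bedrock:InvokeModelWithResponseStream",
        ],
        "Resource": ("arn:aws:bedrock:us-east-1::foundation-model/"
                     "anthropic.claude-3-haiku-20240307-v1:0"),
    }],
}
policy_json = json.dumps(iam_policy, indent=2)
```

Scoping `Resource` to specific model ARNs is what "IAM policies scoped to specific models and actions" means in practice.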
Hallucination reduction techniques
RAG grounding
Force FM to use sources
- Bedrock Knowledge Bases
- FM cites retrieved chunks
- Primary defense
Confidence scoring
Flag low-confidence outputs
- Uncertainty signals
- Human review for low scores
Semantic similarity
Verify claims vs. sources
- Check FM statements against docs
- Catches fabricated facts
JSON Schema
Structured outputs
- Force exact response shape
- Fields populate from sources
- Less free-form hallucination
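The JSON Schema technique can be sketched with a hypothetical response shape and a cheap required-field check (a real system would use a proper JSON Schema validator):

```python
# Hypothetical schema: the FM must return an answer plus the source chunks
# it drew from. Constraining the shape leaves less room for free-form
# hallucination, and the citations field forces grounding to be explicit.
answer_schema = {
    "type": "object",
    "required": ["answer", "citations"],
    "properties": {
        "answer": {"type": "string"},
        "citations": {"type": "array", "items": {"type": "string"}},
    },
}

def has_required_fields(response: dict, schema: dict) -> bool:
    """Cheap check: every required field is present and non-empty."""
    return all(response.get(field) for field in schema["required"])
```

For example, `has_required_fields({"answer": "42", "citations": ["doc-1"]}, answer_schema)` passes, while a response with no citations fails and can be routed to human review.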
Adversarial threat detection
Prompt injection
Attacker embeds instructions that override the system prompt ("Ignore previous instructions. Instead, do X."). Detect with Guardrails' prompt attack filter, input sanitization, safety classifiers.
Jailbreak
Attempt to bypass safety guardrails through clever framing ("You are now DAN," role-play exploits). Same defenses plus adversarial testing.
Input sanitization
Strip or escape potentially malicious content before prompt assembly.
Safety classifiers
Dedicated models that classify input risk before the primary FM runs.
Adversarial testing
Red-team your FM with automated attack workflows; add failed cases to Guardrails.
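A toy version of a safety pre-filter, using regexes over common override phrasings. This is illustrative only; it complements, and does not replace, Guardrails' prompt attack filter:

```python
import re

# Heuristic injection detector run before prompt assembly. Patterns are
# examples, not an exhaustive list; real attackers paraphrase freely,
# which is why a trained classifier or Guardrails sits behind this.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior|above) instructions",
    r"you are now \w+",      # role-play exploits like "You are now DAN"
    r"disregard (your|the) (system prompt|guardrails|rules)",
]

def looks_like_prompt_attack(user_input: str) -> bool:
    text = user_input.lower()
    return any(re.search(pattern, text) for pattern in INJECTION_PATTERNS)
```

Inputs that trip the filter can be blocked outright or logged and escalated, feeding the adversarial-testing loop described above.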
Trap — security over-engineering
Your CISSP instinct may want to add every possible control. The exam rewards the right level of security for the scenario. VPC endpoints + IAM + encryption handle most scenarios. Don't add Lambda@Edge content filtering when Bedrock Guardrails does it natively.
Task 3.2 — Data security & privacy controls
Protected AI environments
Isolation
VPC endpoints for bedrock-runtime and bedrock; the invoking compute runs inside a VPC, so all traffic stays private.
Access control
IAM policies enforce least-privilege access to models and data.
Fine-grained data
AWS Lake Formation — column/row-level access control on data lakes feeding FMs.
Monitoring
CloudWatch monitors all data access patterns; alarm on anomalies.
Privacy-preserving systems — the PII flow
Discover
Amazon Macie
Scan S3 for sensitive data
Detect
Comprehend PII
Entity recognition in input
Filter
Guardrails sensitive info
Block or redact
Model
Bedrock FM
Processes clean data
Retain
S3 Lifecycle
Auto-delete after retention period
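The Detect → Filter steps hinge on offset-based redaction. A sketch in the shape of Comprehend's `detect_pii_entities` output (the entity list is hard-coded here; a real flow would call `comprehend.detect_pii_entities(Text=text, LanguageCode="en")` and use its `Entities`):

```python
# Replace each detected span with its entity type. Working right-to-left
# keeps earlier BeginOffset values valid as the string shrinks or grows.
def redact(text: str, entities: list[dict]) -> str:
    for e in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[:e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"]:]
    return text

text = "Email jane@example.com about order 1234."
# Hard-coded stand-in for a Comprehend response entity
entities = [{"Type": "EMAIL", "BeginOffset": 6, "EndOffset": 22}]
clean = redact(text, entities)   # "Email [EMAIL] about order 1234."
```

Only `clean` reaches the FM in step 4, which is the point of the flow: the model never sees raw PII.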
Bedrock data privacy by default
AWS does not use your Bedrock data to train base models. Your data stays in your account. Encrypted at rest (KMS) and in transit (TLS 1.2+). This is a frequent exam fact.
Anonymization strategies
Data masking
Replace with realistic fake data
- "John Smith" → "Alex Johnson"
- Preserves data shape for testing
Comprehend PII detection
Find & tag entities
- SSN, CC, email, phone
- Pre-built PII entity types
Anonymization
Irreversible removal
- Strip identifiers permanently
- For data that doesn't need re-linking
Pseudonymization
Reversible with key
- Replace IDs with tokens
- Mapping table controls re-identification
- GDPR-friendly
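The difference between anonymization and pseudonymization is the mapping table. A minimal pseudonymizer sketch (token format is arbitrary):

```python
import uuid

# Pseudonymization: identifiers become random tokens, and a mapping table,
# kept under separate access control, allows re-identification. Deleting
# the table converts this to anonymization (irreversible).
class Pseudonymizer:
    def __init__(self):
        self._forward: dict[str, str] = {}   # identifier -> token
        self._reverse: dict[str, str] = {}   # token -> identifier

    def tokenize(self, identifier: str) -> str:
        if identifier not in self._forward:
            token = f"user-{uuid.uuid4().hex[:8]}"
            self._forward[identifier] = token
            self._reverse[token] = identifier
        return self._forward[identifier]

    def reidentify(self, token: str) -> str:
        return self._reverse[token]
```

The same identifier always maps to the same token, so pseudonymized records stay joinable, which is what makes this GDPR-friendly for analytics.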
Task 3.3 — Governance & compliance mechanisms
Compliance frameworks
SageMaker Model Cards
Programmatic documentation of model purpose, limitations, performance metrics, intended use. Required for governance and audit trails.
Glue Data Lineage
Automatically track where data came from, how it was transformed, where it went. Essential for provenance questions.
Metadata tagging
Systematic source attribution in FM-generated content. Tag outputs with which source documents informed them.
CloudWatch Logs
Comprehensive decision logs for audit. Query with Logs Insights.
Data source tracking for traceability
Register
Glue Data Catalog
Central metadata
Tag
Source attribution
Which doc informed output
Log
CloudTrail
Who · what · when
Invoke logs
Model Invocation Logs
Full request/response
Investigate
Logs Insights
Query the audit trail
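The Investigate step might look like this Logs Insights query (shown as a Python string; field names such as `modelId` and `identity.arn` reflect the invocation-log JSON but should be verified against your own log group):

```python
# CloudWatch Logs Insights query over Bedrock Model Invocation Logs:
# count invocations of Claude models per caller identity.
INVOCATION_AUDIT_QUERY = """
fields @timestamp, modelId, identity.arn
| filter modelId like /claude/
| stats count(*) as invocations by identity.arn
| sort invocations desc
""".strip()
```

Run against the invocation-log group, this answers the "who · what · when" question in one query.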
Continuous monitoring & advanced governance
Misuse detection
Automated anomaly detection
- Unusual usage patterns
- Policy violations
Drift monitoring
Model behavior changes
- Output distribution shifts
- Quality degradation alerts
Bias drift
Fairness over time
- Track demographic disparities
- Alert on widening gaps
Token redaction
Log-level PII protection
- Redact sensitive fields before logging
- Auditable but privacy-safe
Task 3.4 — Responsible AI principles
Transparency
Reasoning
Reasoning displays — show users how the AI arrived at its answer.
Confidence
CloudWatch confidence metrics — quantify and display uncertainty.
Sources
Evidence presentation — citations linking claims to source documents (built into Knowledge Bases).
Traces
Bedrock Agent Tracing — reasoning traces showing the agent's thought process, tool calls, decision points. Essential for agent debugging and explainability.
Fairness evaluations
CloudWatch fairness metrics
Pre-defined metrics track model performance across demographic groups.
Systematic A/B testing
Bedrock Prompt Management + Prompt Flows for comparing outputs across groups; identify disparate impact.
LLM-as-a-Judge
Use a second FM to evaluate the primary FM's outputs for bias. Bedrock supports this via Model Evaluations automated evaluation jobs.
SageMaker Clarify
Bias detection and model explainability (disparate impact, demographic parity, SHAP).
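One of the disparate-impact style metrics Clarify reports, demographic parity difference, is simple enough to compute by hand (group names and predictions below are illustrative):

```python
# Demographic parity difference: the gap between the highest and lowest
# positive-outcome rates across demographic groups. 0.0 means parity;
# a widening value over time is the "bias drift" alert condition.
def demographic_parity_difference(outcomes: dict[str, list[int]]) -> float:
    """outcomes maps group name -> list of 0/1 model predictions."""
    rates = [sum(preds) / len(preds) for preds in outcomes.values()]
    return max(rates) - min(rates)
```

For example, groups with positive rates of 0.50 and 0.25 yield a difference of 0.25, which a monitoring alarm could flag for review.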
Policy-compliant AI systems
- Bedrock Guardrails configured to policy — denied topics match policy rules, word filters match forbidden language
- Model cards document limitations — what the FM should and shouldn't be used for
- Lambda compliance checks — automated verification against policy rules; flag or block violations
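A sketch of the third item, a Lambda compliance check; the event shape and the substring-based policy rules are illustrative stand-ins for real policy logic:

```python
# Post-processing Lambda: verify FM output against policy rules before it
# reaches the user. Real rules would be richer than substring bans.
BANNED_PHRASES = ["guaranteed returns", "medical diagnosis"]

def lambda_handler(event, context):
    output = event["model_output"]          # assumed event field
    text = output.lower()
    violations = [p for p in BANNED_PHRASES if p in text]
    return {"allowed": not violations, "violations": violations}
```

Violations can be blocked, or merely flagged and logged, depending on how strict the policy is.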
Exam angle
When a question asks about "explainability" or "showing how the agent reached a conclusion," the answer is Bedrock Agent Tracing. When it's "bias detection" or "demographic fairness," the answer includes SageMaker Clarify or LLM-as-a-Judge via Bedrock Model Evaluations.
Domain 3 summary — what to remember
The service map
- Guardrails — Bedrock Guardrails (6 filter types)
- PII — Comprehend detection + Macie discovery
- Network — VPC endpoints / PrivateLink for Bedrock
- Identity — IAM (least privilege) + Cognito
- Data — Lake Formation + KMS encryption
- Audit — CloudTrail + Model Invocation Logs
- Docs — SageMaker Model Cards (programmatic)
- Lineage — Glue Data Lineage + Data Catalog
- Bias — SageMaker Clarify · LLM-as-a-Judge
- Traces — Bedrock Agent Tracing
The mental shortcuts
- PII in user input? Comprehend + Guardrails sensitive info.
- Audit of model invocations? Model Invocation Logs.
- Off public internet? VPC endpoints (PrivateLink).
- Grounded RAG responses? Guardrails contextual grounding.
- Prompt injection defense? Guardrails prompt attack filter.
- Bias detection? SageMaker Clarify.
- Agent explainability? Bedrock Agent Tracing.
- Model documentation? SageMaker Model Cards.
Next up
Continue to Domain 4 — Operational Efficiency & Optimization (12%). Or see the Defense-in-Depth pattern fully diagrammed.