Task 3.1 — Input & output safety controls

The six Bedrock Guardrails filter categories

Memorize these. Every Guardrails question maps to one or more:

1 · Denied topics

Custom categories you forbid
  • e.g., "legal advice," "medical diagnosis"
  • Natural-language topic definitions

2 · Content filters

Hate, insults, sexual, violence, misconduct
  • Threshold per category (low/med/high)
  • Applied independently to input & output

3 · Word filters

Profanity / custom blocklists
  • Exact-word blocking
  • Domain-specific blacklists

4 · Sensitive info

PII detection & redaction
  • SSN, credit card, email, phone, etc.
  • Block or redact (mask)

5 · Contextual grounding

Catches hallucinations in RAG
  • Checks if output is supported by retrieved context
  • Flags/blocks ungrounded responses

6 · Prompt attack filter

Jailbreak / injection detection
  • Detects override attempts
  • Blocks at input layer

Defense-in-depth for GenAI — the seven layers

Applied in order from outside to inside. This is Pattern 10 in the Architecture Patterns reference.

1 · Network VPC endpoints (PrivateLink) for Bedrock — keep FM traffic off the public internet.
2 · Identity IAM policies scoped to specific models and actions; Cognito for end-user auth; identity federation to enterprise IdP.
3 · Pre-processing Amazon Comprehend PII detection + Lambda sanitization before the FM sees input.
4 · Model-level Bedrock Guardrails — topic denial, content filters, PII, grounding, prompt-attack.
5 · Post-processing Lambda validates output format, accuracy, safety after FM responds.
6 · API API Gateway rate limiting; WAF against abuse.
7 · Audit CloudTrail + Bedrock Model Invocation Logs for forensic traceability.

Hallucination reduction techniques

RAG grounding

Force FM to use sources
  • Bedrock Knowledge Bases
  • FM cites retrieved chunks
  • Primary defense

Confidence scoring

Flag low-confidence
  • Uncertainty signals
  • Human review for low scores

Semantic similarity

Verify claims vs. sources
  • Check FM statements against docs
  • Catches fabricated facts

JSON Schema

Structured outputs
  • Force exact response shape
  • Fields populate from sources
  • Less free-form hallucination

Adversarial threat detection

Prompt injection
Attacker embeds instructions that override the system prompt ("Ignore previous instructions. Instead, do X."). Detect with Guardrails' prompt attack filter, input sanitization, safety classifiers.
Jailbreak
Attempt to bypass safety guardrails through clever framing ("You are now DAN," role-play exploits). Same defenses plus adversarial testing.
Input sanitization
Strip or escape potentially malicious content before prompt assembly.
Safety classifiers
Dedicated models that classify input risk before the primary FM runs.
Adversarial testing
Red-team your FM with automated attack workflows; add failed cases to Guardrails.
Trap — security over-engineering Your CISSP instinct may want to add every possible control. The exam rewards the right level of security for the scenario. VPC endpoints + IAM + encryption handle most scenarios. Don't add Lambda@Edge content filtering when Bedrock Guardrails does it natively.

Task 3.2 — Data security & privacy controls

Protected AI environments

Isolation VPC endpoints for bedrock-runtime and bedrock. Invoking compute in a VPC; all traffic stays private.
Access control IAM policies enforce least-privilege access to models and data.
Fine-grained data AWS Lake Formation — column/row-level access control on data lakes feeding FMs.
Monitoring CloudWatch monitors all data access patterns; alarm on anomalies.

Privacy-preserving systems — the PII flow

Discover
Amazon Macie
Scan S3 for sensitive data
Detect
Comprehend PII
Entity recognition in input
Filter
Guardrails sensitive info
Block or redact
Model
Bedrock FM
Processes clean data
Retain
S3 Lifecycle
Auto-delete after retention period
Bedrock data privacy by default AWS does not use your Bedrock data to train base models. Your data stays in your account. Encrypted at rest (KMS) and in transit (TLS 1.2+). This is a frequent exam fact.

Anonymization strategies

Data masking

Replace with realistic fake data
  • "John Smith" → "Alex Johnson"
  • Preserves data shape for testing

Comprehend PII detection

Find & tag entities
  • SSN, CC, email, phone
  • Pre-built PII entity types

Anonymization

Irreversible removal
  • Strip identifiers permanently
  • For data that doesn't need re-linking

Pseudonymization

Reversible with key
  • Replace IDs with tokens
  • Mapping table controls re-identification
  • GDPR-friendly

Task 3.3 — Governance & compliance mechanisms

Compliance frameworks

SageMaker Model Cards
Programmatic documentation of model purpose, limitations, performance metrics, intended use. Required for governance and audit trails.
Glue Data Lineage
Automatically track where data came from, how it was transformed, where it went. Essential for provenance questions.
Metadata tagging
Systematic source attribution in FM-generated content. Tag outputs with which source documents informed them.
CloudWatch Logs
Comprehensive decision logs for audit. Query with Logs Insights.

Data source tracking for traceability

Register
Glue Data Catalog
Central metadata
Tag
Source attribution
Which doc informed output
Log
CloudTrail
Who · what · when
Invoke logs
Model Invocation Logs
Full request/response
Investigate
Logs Insights
Query the audit trail

Continuous monitoring & advanced governance

Misuse detection

Automated anomaly detection
  • Unusual usage patterns
  • Policy violations

Drift monitoring

Model behavior changes
  • Output distribution shifts
  • Quality degradation alerts

Bias drift

Fairness over time
  • Track demographic disparities
  • Alert on widening gaps

Token redaction

Log-level PII protection
  • Redact sensitive fields before logging
  • Auditable but privacy-safe

Task 3.4 — Responsible AI principles

Transparency

Reasoning Reasoning displays — show users how the AI arrived at its answer.
Confidence CloudWatch confidence metrics — quantify and display uncertainty.
Sources Evidence presentation — citations linking claims to source documents (built into Knowledge Bases).
Traces Bedrock Agent Tracing — reasoning traces showing the agent's thought process, tool calls, decision points. Essential for agent debugging and explainability.

Fairness evaluations

CloudWatch fairness metrics
Pre-defined metrics track model performance across demographic groups.
Systematic A/B testing
Bedrock Prompt Management + Prompt Flows for comparing outputs across groups; identify disparate impact.
LLM-as-a-Judge
Use a second FM to evaluate the primary FM's outputs for bias. Bedrock supports this via Model Evaluations automated evaluation jobs.
SageMaker Clarify
Bias detection and model explainability (disparate impact, demographic parity, SHAP).

Policy-compliant AI systems

  • Bedrock Guardrails configured to policy — denied topics match policy rules, word filters match forbidden language
  • Model cards document limitations — what the FM should and shouldn't be used for
  • Lambda compliance checks — automated verification against policy rules; flag or block violations
Exam angle When a question asks about "explainability" or "showing how the agent reached a conclusion," the answer is Bedrock Agent Tracing. When it's "bias detection" or "demographic fairness," the answer includes SageMaker Clarify or LLM-as-a-Judge via Bedrock Model Evaluations.

Domain 3 summary — what to remember

The service map

  • Guardrails Bedrock Guardrails (6 filter types)
  • PII Comprehend detection + Macie discovery
  • Network VPC endpoints / PrivateLink for Bedrock
  • Identity IAM (least privilege) + Cognito
  • Data Lake Formation + KMS encryption
  • Audit CloudTrail + Model Invocation Logs
  • Docs SageMaker Model Cards (programmatic)
  • Lineage Glue Data Lineage + Data Catalog
  • Bias SageMaker Clarify · LLM-as-a-Judge
  • Traces Bedrock Agent Tracing

The mental shortcuts

  • PII in user input? Comprehend + Guardrails sensitive info.
  • Audit of model invocations? Model Invocation Logs.
  • Off public internet? VPC endpoints (PrivateLink).
  • Grounded RAG responses? Guardrails contextual grounding.
  • Prompt injection defense? Guardrails prompt attack filter.
  • Bias detection? SageMaker Clarify.
  • Agent explainability? Bedrock Agent Tracing.
  • Model documentation? SageMaker Model Cards.
Next up Continue to Domain 4 — Operational Efficiency & Optimization (12%). Or see the Defense-in-Depth pattern fully diagrammed.