Architecture diagram

GenAI Gateway · central control plane for enterprise FM access

Apps: Team A (Support Chatbot) · Team B (Internal Search) · Team C (Analytics Q&A)
GenAI Gateway layers:
  • 1 · AUTH: API Gateway + Cognito
  • 2 · SAFETY: Guardrails + PII filter
  • 3 · THROTTLE: Per-team rate limits
  • 4 · ROUTE: Lambda model selector
  • Plus CloudTrail · CloudWatch · cost tags
FM pools: Claude (Haiku/Sonnet) · Amazon Nova · Llama · Mistral · Titan
Cross-cutting observability + governance: CloudTrail (who called what) · CloudWatch (latency / tokens / errors) · Cost Explorer tags (per-team attribution) · Model Invocation Logs (payloads to S3)
One central place for audit, cost attribution, and compliance across all teams.

How data flows

Every team's application in the enterprise calls the same gateway API instead of calling Bedrock directly. The gateway handles auth, safety filtering, per-team throttling, and model routing. It also logs everything to CloudTrail and CloudWatch, and tags each call by requesting team for cost attribution.
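
The four gateway stages described above can be sketched in plain Python. This is a minimal illustration, not the actual implementation: the token table, the email-only PII regex, the per-minute limits, and the model names are all hypothetical stand-ins for Cognito, Bedrock Guardrails, API Gateway throttling, and the Lambda router.

```python
import re
import time
from dataclasses import dataclass, field

# Hypothetical stand-ins for the managed services named in the text.
TEAM_TOKENS = {"token-a": "team-a", "token-b": "team-b"}      # auth lookup
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")             # toy PII filter
RATE_LIMIT_PER_MINUTE = {"team-a": 2, "team-b": 5}             # per-team limits
TEAM_MODEL = {"team-a": "small-model", "team-b": "large-model"}  # routing table


class GatewayError(Exception):
    """Carries an HTTP-style status code for rejected requests."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status


@dataclass
class Gateway:
    calls: dict = field(default_factory=dict)      # team -> recent call timestamps
    audit_log: list = field(default_factory=list)  # one entry per accepted call

    def handle(self, token, prompt, now=None):
        now = time.time() if now is None else now

        # 1 · AUTH: map the caller's token to a team identity.
        team = TEAM_TOKENS.get(token)
        if team is None:
            raise GatewayError(401, "unknown token")

        # 2 · SAFETY: redact email addresses before the prompt leaves the gateway.
        safe_prompt = EMAIL_RE.sub("[REDACTED_EMAIL]", prompt)

        # 3 · THROTTLE: sliding 60-second window per team.
        recent = [t for t in self.calls.get(team, []) if now - t < 60]
        if len(recent) >= RATE_LIMIT_PER_MINUTE[team]:
            raise GatewayError(429, "rate limit exceeded")
        recent.append(now)
        self.calls[team] = recent

        # 4 · ROUTE: pick the model and record who called what.
        model = TEAM_MODEL[team]
        self.audit_log.append({"team": team, "model": model, "ts": now})
        return {"team": team, "model": model, "prompt": safe_prompt}
```

Because every request passes through the same `handle` method, the audit log and team tag are produced in one place regardless of which application made the call.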

This is the platform team's answer to governing AI adoption across many development teams. Teams don't each need to reimplement guardrails, PII filtering, and rate limiting; the gateway gives them a consistent, compliant path to FMs with one API to learn.

AWS services used

Amazon API Gateway: The public-facing entry point. Provides REST or WebSocket APIs, request validation, throttling, and API keys per team.
Amazon Cognito: User identity. Teams authenticate via Cognito; tokens carry team identity into the gateway.
Bedrock Guardrails: Content filtering, denied topics, and PII redaction, applied centrally at the gateway so every team inherits the policies.
Amazon Comprehend: Optional pre-processing for PII detection beyond what Guardrails catch natively.
AWS Lambda (router): Dynamically selects which FM to invoke based on the request, team policy, cost constraints, or model availability.
AWS AppConfig: Feature flags so platform engineers can change model routing without redeploying the gateway.
CloudTrail + CloudWatch: Audit and operational observability across all calls through the gateway.
Bedrock Model Invocation Logs: Full payload logging to S3 for compliance, debugging, and future fine-tuning datasets.
Cost Allocation Tags: Team tags on every request flow to Cost Explorer, enabling per-team cost attribution and chargeback.
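
As a rough sketch of how the router and a configuration document might interact, the example below selects a model from a per-team preference list and falls back when a model is unavailable. The JSON shape, the team names, and the short model labels are assumptions made for illustration; they are not a real AppConfig schema or real Bedrock model IDs, and no AWS API is called.

```python
import json

# Hypothetical routing document, in a shape a platform team might store as
# configuration. Changing it changes routing without touching router code.
ROUTING_CONFIG_JSON = """
{
  "default": ["nova", "claude-haiku"],
  "teams": {
    "team-a": ["claude-haiku", "nova"],
    "team-c": ["claude-sonnet", "claude-haiku"]
  }
}
"""


def select_model(config, team, available):
    """Return the team's first preferred model that is currently available."""
    preferences = config["teams"].get(team, config["default"])
    for model in preferences:
        if model in available:
            return model
    raise LookupError(f"no available model for {team}")


config = json.loads(ROUTING_CONFIG_JSON)
```

A team without an explicit entry falls through to the `default` list, so onboarding a new team does not require a configuration change.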

When to use this pattern

Use GenAI Gateway when…

  • Multiple teams / apps need FM access: Central governance beats every team reimplementing their own guardrails and rate limits.
  • You need consistent policy enforcement: "No PII in any LLM call across the company" is enforceable with a gateway, not with individual teams.
  • Cost attribution per team / app matters: Chargeback models require per-team tags. The gateway is where that tagging happens cleanly.
  • You need to swap models centrally without code changes: Platform team flips AppConfig → all teams instantly move to a new model (or get cost savings).
  • Compliance requires central audit log: Regulated industries (finance, healthcare) need one place that proves who called what, when, and with what data.
  • You have a platform / CCoE team: Someone has to own and maintain the gateway. Not for 1-2 engineer startups.
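
To make the chargeback point concrete, here is a small sketch that rolls tagged usage records up into a per-team cost. The record fields, model names, and prices are hypothetical (the prices are not real Bedrock pricing); the only assumption carried over from the text is that every call is tagged with the requesting team.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical prices in USD per 1,000 tokens, for illustration only.
PRICE_PER_1K = {
    "small-model": {"input": Decimal("0.001"), "output": Decimal("0.002")},
    "large-model": {"input": Decimal("0.01"), "output": Decimal("0.03")},
}


def chargeback(records):
    """Sum the cost of each tagged usage record by team."""
    totals = defaultdict(Decimal)  # Decimal() is zero, so new teams start at 0
    for record in records:
        price = PRICE_PER_1K[record["model"]]
        cost = (
            Decimal(record["input_tokens"]) * price["input"]
            + Decimal(record["output_tokens"]) * price["output"]
        ) / 1000
        totals[record["team"]] += cost
    return dict(totals)


usage = [
    {"team": "team-a", "model": "small-model", "input_tokens": 2000, "output_tokens": 1000},
    {"team": "team-b", "model": "large-model", "input_tokens": 1000, "output_tokens": 500},
    {"team": "team-a", "model": "small-model", "input_tokens": 1000, "output_tokens": 0},
]
```

`Decimal` is used rather than floats so that many small per-call costs add up without rounding drift.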

Do NOT use Gateway when…

  • Only one team / one app: Overkill. Direct Bedrock calls with per-app guardrails are simpler and cheaper.
  • Startup / MVP phase: Build the product first. Add a gateway when usage patterns and cost centers actually require it.
  • Teams need very different features: If one team needs streaming and another needs agents and another needs batch, a single gateway becomes a bottleneck of feature requests.
  • Ultra-low-latency workloads: Every gateway hop adds ~20-50 ms. For voice assistants or real-time autocomplete, direct FM access wins.
  • No platform team exists to own it: Unmaintained gateways become brittle bottlenecks. If no one's maintaining it, don't build it.
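
The latency point can be checked with simple arithmetic. The sketch below uses the ~20-50 ms per-hop range quoted above; the budgets and model latencies used with it are hypothetical examples, not measurements.

```python
# Per-hop gateway overhead range quoted in the text, in milliseconds.
GATEWAY_OVERHEAD_MS = (20, 50)


def fits_budget(budget_ms, model_latency_ms, hops=1):
    """Report best-case and worst-case latency and whether the worst case fits."""
    best = model_latency_ms + hops * GATEWAY_OVERHEAD_MS[0]
    worst = model_latency_ms + hops * GATEWAY_OVERHEAD_MS[1]
    return {"best_ms": best, "worst_ms": worst, "fits": worst <= budget_ms}
```

For a hypothetical autocomplete call with a 200 ms budget and a 170 ms model response, the worst case is 220 ms and the gateway hop alone breaks the budget. For a chatbot with a 2,000 ms budget, the same overhead is negligible.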

Exam angle

Pattern-match shortcuts: When a stem mentions "multiple teams," "centralized governance," "consistent policy," "per-team cost attribution," or "enterprise-wide AI," GenAI Gateway is the answer. Expect API Gateway + Cognito + Lambda + Bedrock in the correct option.
The "each team builds their own" trap: Distractor: "have each team implement Guardrails and PII filtering themselves." Sounds fine on paper but leads to inconsistent policy, duplicated effort, and compliance gaps. The gateway is the right answer when the stem mentions consistency, policy enforcement, or central audit.

Keywords that point here

enterprise-wide · multiple teams · centralized · cost attribution · per-team · consistent policy · governance · audit · chargeback

Related patterns

The gateway embeds a mini Pattern 10: Defense-in-Depth at its safety layer.
Routing at Layer 4 can implement Pattern 5: Cascading for cost.
For CI/CD of the gateway itself, see Pattern 9: CI/CD for GenAI.
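
One way the routing layer could implement cascading is sketched below, assuming cascading means trying the cheapest model first and escalating only when its answer looks weak. The confidence score, the 0.7 threshold, and the two stub models are hypothetical; a real router would call actual FMs and would need its own quality signal.

```python
def cascade(prompt, tiers, threshold=0.7):
    """Try tiers cheapest-first; return (tier_name, answer) from the first
    tier whose confidence meets the threshold, else from the last tier.

    Each tier is (name, call), where call(prompt) returns (answer, confidence).
    """
    for name, call in tiers[:-1]:
        answer, confidence = call(prompt)
        if confidence >= threshold:
            return name, answer
    # The most capable tier is the final fallback and is always accepted.
    name, call = tiers[-1]
    answer, _ = call(prompt)
    return name, answer


# Stub models for illustration: the cheap one is only confident on short prompts.
def cheap_model(prompt):
    confidence = 0.9 if len(prompt.split()) <= 5 else 0.4
    return f"cheap:{prompt}", confidence


def strong_model(prompt):
    return f"strong:{prompt}", 0.95


TIERS = [("cheap", cheap_model), ("strong", strong_model)]
```

Short prompts stay on the cheap tier, so the expensive model is only paid for when the cheap one reports low confidence.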