
How Authentication & Security Works on UniversalAPI: A Deep Dive

Updated April 2, 2026 · Architecture · 15 min read

Want the quick version?

This is the full technical deep dive. For a concise 5-minute overview of our security architecture, read the Security Architecture Overview.


When you run an MCP server or AI agent on a shared platform, you're trusting that platform with your API keys, your users' data, and the execution of arbitrary code. That's a lot of trust.

We take that seriously. Every request to UniversalAPI passes through a centralized authorizer before it ever reaches an MCP server or agent runtime. Your third-party API keys are stored encrypted and injected server-side — the client never sees them. All user-authored code runs in hardened sandboxes. And for both MCP servers and agents, each user gets their own dedicated Firecracker microVM via AWS Lambda's tenant isolation.

This post is a deep dive into how all of that works. No hand-waving — we'll walk through the actual architecture, from the moment a request hits our API Gateway to the moment your MCP server code reads process.env.SERPAPI_KEY.

The Big Picture

Here's the request flow at a high level:

Client (Browser / API / MCP Client)
  │  Authorization: Bearer uapi_ut_abc123...
  ▼
API Gateway (api.universalapi.co)
  ▼
PublicAuthorizer Lambda
  │  ✓ Validate token (hash, expiry, revocation)
  │  ✓ Check credits
  │  ✓ Resolve user's API keys from KeysTable
  │  ✓ Build auth context
  ▼
Runtime Lambda (MCP or Agent)
  │  ✓ Inject credentials as env vars
  │  ✓ Execute code in sandbox
  │  ✓ Clean up env vars after execution
  ▼
Response → Client
The key principle: the client only presents a token. Everything else — validation, scope enforcement, credential injection, sandboxing — happens server-side.

Layer 1: The Centralized Authorizer

Every API request — whether it's a browser session, a programmatic API call, or an MCP client connecting via Streamable HTTP — hits the PublicAuthorizer Lambda before reaching any backend handler. This is an API Gateway Lambda authorizer that runs as a gatekeeper.

The authorizer checks tokens in a strict priority order:

Token Types (Priority Order)

1. User Tokens (uapi_ut_*) — Programmatic access as yourself

When you create a user token on the Credentials page, you get a token like uapi_ut_abc123def456.... This token acts as you — it consumes your credits and has access to all your stored API keys.

2. Role Tokens (uapi_rt_*) — Scoped access with specific keys

Role tokens carry their own set of API keys rather than inheriting the user's full key set. This is how MCP server authors can provide a "keys-included" experience — they attach a role token with their own SerpAPI key, OpenAI key, etc., and users of that server don't need to bring their own.

3. Cognito JWTs — Browser sessions

When you log into the UniversalAPI web app, your browser gets a Cognito ID token (a standard JWT). The authorizer validates it server-side using JWKS — checking the RS256 signature, issuer, audience, expiration, and token_use claim. The browser's only job is to present the token.

4. Anonymous — No token at all

Requests without any token are allowed through with IP-based rate limiting. This enables public endpoints like the resource catalog and search.

How Token Validation Works

Let's trace what happens when you send a request with a user token:

bash
curl -H "Authorization: Bearer uapi_ut_abc123def456..." \
     https://mcp.api.universalapi.co/mcp/s/snowtimber/doc-hound

  1. Extract the prefix. The authorizer reads the first 8 characters (uapi_ut_) to determine the token type.

  2. Hash the token. The full token is SHA-256 hashed. We never store tokens in plaintext — only their hashes live in DynamoDB. This means even if someone gained read access to our database, they couldn't extract usable tokens.

  3. Look up by prefix. The first 12 characters of the token are used to query a Global Secondary Index (tokenPrefixIndex) on the AccessTokensTable. This narrows the search efficiently.

  4. Verify the hash. The stored hash is compared against the hash of the presented token. If they don't match, the request is rejected.

  5. Check status. Is the token revoked? Expired? If so, reject.

  6. Check credits. Has the user exceeded their credit limit? If the token has a creditLimit and the user's consumed credits exceed it, reject.

  7. Resolve API keys. For user tokens, the authorizer queries the KeysTable to fetch all of the user's stored third-party API keys (SerpAPI, OpenAI, AWS, GitHub, etc.). For role tokens, only the keys attached to that specific token are resolved.

  8. Refresh expiring OAuth tokens. If any of the user's keys are OAuth tokens (Google, Microsoft, GitHub) that are about to expire, the authorizer refreshes them inline — before the request even reaches the backend.

  9. Build the auth context. The authorizer returns an IAM Allow policy along with a rich context object containing userId, credits, keys (as JSON), authMethod, and verified status. API Gateway passes this context to the backend Lambda via headers.

All of this happens in a single Lambda invocation, typically in under 100ms. The backend handler never validates tokens itself — it just reads the pre-validated context from API Gateway headers.
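Conceptually, steps 1 through 5 boil down to a prefix lookup plus a hash-and-compare. Here is a simplified Python sketch; the Lambda plumbing, DynamoDB query, and expiry handling are elided, and the use of a constant-time comparison is an assumption rather than something the post states:

```python
import hashlib
import hmac

def validate_token(presented: str, stored_hash: str, revoked: bool) -> bool:
    """Steps 2-5: hash the presented token, compare against the stored
    hash in constant time, then check revocation status."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    if not hmac.compare_digest(digest, stored_hash):
        return False      # hash mismatch -> reject
    return not revoked    # revoked (or expired) -> reject

# Steps 1 and 3: the first 8 characters select the token type, and the
# first 12 characters key the tokenPrefixIndex GSI lookup that returns
# the candidate record's stored_hash.
token = "uapi_ut_abc123def456"
token_type_prefix, gsi_prefix = token[:8], token[:12]
```

Because only the SHA-256 digest is ever compared, the database side of this check never needs the plaintext token at all.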

Layer 2: Credential Injection

This is where things get interesting. UniversalAPI lets you store third-party API keys (SerpAPI, OpenAI, AWS credentials, Google OAuth tokens, etc.) and have them automatically injected into MCP server and agent runtimes. The client never sees or handles these credentials.

How Keys Get Into Your MCP Server

When your MCP server code runs, it can simply read process.env.SERPAPI_KEY — no configuration needed. Here's how that works:

  1. The authorizer resolves keys. As described above, the PublicAuthorizer fetches the user's keys from KeysTable and includes them in the auth context as a JSON blob.

  2. The runtime handler injects them. The mcpRuntimeHandler (for MCP servers) or agentRuntimeHandler (for agents) reads the keys from the auth context and sets them as environment variables:

javascript
// Simplified from mcpRuntimeHandler.js
for (const key of userKeys) {
    process.env[key.keyName] = key.keyValue;
    injectedKeyNames.add(key.keyName);
}

  3. Author keys override user keys. If the MCP server has an authorRoleToken (meaning the author provides their own API keys), those keys are resolved separately and injected after user keys — overriding any keys with the same name. This enables the "keys-included" model where server authors can provide their own SerpAPI key so users don't have to.

  4. Keys are cleaned up after execution. After the MCP tool call completes, all injected keys are removed from process.env:

javascript
// Cleanup after execution
for (const keyName of injectedKeyNames) {
    delete process.env[keyName];
}

This means keys from one user's request never leak into another user's request, even on a warm Lambda instance.

The Key Priority Chain

The credential injection follows a clear priority:

| Priority    | Source                   | When Used                                        |
|-------------|--------------------------|--------------------------------------------------|
| 1 (highest) | Author's role token keys | MCP server has authorRoleToken set               |
| 2           | User's stored keys       | User has keys in their account                   |
| 3           | None                     | No keys available — server must work without them |

This priority chain is resolved entirely server-side. The client has no influence over which keys get injected.
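In effect, the resolution behaves like a map merge in which the author's keys are applied last. A minimal Python sketch (the real runtime is JavaScript, and it also records injected names for post-execution cleanup):

```python
def resolve_keys(user_keys: dict, author_keys: dict) -> dict:
    """Priority 2 first, then priority 1: the later merge wins, so the
    author's role-token keys override same-named user keys."""
    return {**user_keys, **author_keys}

# Illustrative values only
resolved = resolve_keys(
    {"SERPAPI_KEY": "user-serp-key", "OPENAI_API_KEY": "user-openai-key"},
    {"SERPAPI_KEY": "author-serp-key"},  # author's "keys-included" key
)
```

The user's OPENAI_API_KEY survives untouched, while SERPAPI_KEY resolves to the author's value.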

OAuth Token Management

For OAuth-based services (Google, Microsoft, GitHub), the flow is:

  1. User initiates OAuth via GET /oauth/authorize/{provider} — this redirects to the provider's consent screen.
  2. Callback hits the server. The OAuth handler (oauth_handler.py) exchanges the authorization code for access and refresh tokens.
  3. Tokens are stored encrypted in KeysTable with the provider name, scopes, and expiration.
  4. Automatic refresh. The PublicAuthorizer checks token expiration on every request. If a token is within 5 minutes of expiring, it refreshes it inline using the stored refresh token — via the token_refresh_handler. The user never has to re-authenticate.

The client never touches OAuth tokens directly. It just redirects the user to start the flow — everything else is server-side.
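The refresh trigger in step 4 reduces to a window check like this (a sketch; the provider-specific refresh call performed by token_refresh_handler is omitted):

```python
from datetime import datetime, timedelta, timezone

REFRESH_WINDOW = timedelta(minutes=5)

def needs_refresh(expires_at: datetime, now: datetime) -> bool:
    """True when the stored OAuth access token expires within the window."""
    return expires_at - now <= REFRESH_WINDOW

now = datetime(2026, 4, 2, 12, 0, tzinfo=timezone.utc)
assert needs_refresh(now + timedelta(minutes=3), now)       # refresh inline
assert not needs_refresh(now + timedelta(minutes=30), now)  # still fresh
```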

Layer 3: Runtime Sandboxing

MCP servers and agents run user-authored code. That code could be anything — a helpful web search tool, or an attempt to read platform secrets from the environment. Sandboxing ensures that user code can only do what it's supposed to do.

MCP Server Sandbox (Node.js)

MCP servers are written in JavaScript and executed in a sandboxed Node.js environment. The sandbox enforces several restrictions:

Module blocklist. The runtime intercepts require() calls and blocks dangerous modules:

child_process, vm, worker_threads, cluster,
net, dgram, tls, http2,
fs (direct access), os (direct access)

MCP server code cannot spawn processes, open network sockets, or access the filesystem directly. If your tool needs to make HTTP requests, it uses the allowed fetch API or axios — which are permitted and logged.

Environment variable sanitization. Before executing MCP server code, the runtime strips platform-internal environment variables from process.env:

AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN,
TABLE_NAME, KEYS_TABLE_NAME, ACCESS_TOKENS_TABLE_NAME,
AWS_LAMBDA_FUNCTION_NAME, _HANDLER, ...

Your MCP server code sees its injected third-party keys (like SERPAPI_KEY) but not the platform's AWS credentials or internal configuration. After execution, even the injected keys are cleaned up.

No persistent state. Each MCP tool call gets a fresh execution context. There's no shared state between requests, and no way for one user's execution to affect another's.

Agent Sandbox (Python)

Agents are written in Python and run in an even more restrictive sandbox, since they execute with access to Amazon Bedrock (LLM calls) and can make autonomous decisions.

Restricted builtins. The sandbox replaces Python's builtins with a curated allowlist of ~70 safe functions. Notably blocked:

python
# These are NOT available in the agent sandbox
eval()       # No arbitrary code execution
exec()       # No arbitrary code execution
compile()    # No code compilation
breakpoint() # No debugger access
__import__() # Replaced with restricted version

Restricted imports. The import statement is intercepted and checked against an allowlist of ~50 module families. Allowed modules include strands, boto3, httpx, json, datetime, and standard library utilities. Blocked modules include:

python
# These imports will raise ImportError in the sandbox
import subprocess  # No shell access
import socket      # No raw network access
import ctypes      # No C-level access
import importlib   # No dynamic import bypass
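The interception can be pictured as a wrapper around __import__ that consults the allowlist before delegating. This is illustrative only; the module names below are a small made-up subset of the real ~50-family allowlist:

```python
import builtins

# Illustrative subset -- the real allowlist covers ~50 module families
ALLOWED_MODULES = {"json", "datetime", "math", "re", "httpx", "boto3", "strands"}

_real_import = builtins.__import__

def restricted_import(name, *args, **kwargs):
    """Reject any import whose top-level package is not allowlisted."""
    if name.split(".")[0] not in ALLOWED_MODULES:
        raise ImportError(f"module '{name}' is blocked in the agent sandbox")
    return _real_import(name, *args, **kwargs)

# The sandbox would install this as builtins.__import__ before running agent code.
```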

Restricted os module. When agent code does import os, it gets a RestrictedOS object instead of the real os module. This object:

  • Blocks os.system(), os.popen(), os.exec*() — no shell execution
  • Restricts os.makedirs() and os.chdir() to /tmp only
  • Blocks access to os.environ directly (replaced with a filtered view)

Restricted file I/O. The open() builtin is wrapped to:

  • Allow writes only to /tmp
  • Block reads of sensitive paths (/etc/shadow, /proc/self/environ, /var/runtime/, etc.)
  • Prevent path traversal attacks
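A sketch of such a wrapper follows; the path lists and mode handling are illustrative, and the real sandbox blocks more than this:

```python
import os

BLOCKED_READ_PREFIXES = ("/etc/shadow", "/proc/self/environ", "/var/runtime/")
WRITE_MODES = ("w", "a", "x", "+")

def restricted_open(path, mode="r", *args, **kwargs):
    """open() wrapper: no sensitive reads, writes only under /tmp, no traversal."""
    real = os.path.realpath(path)  # collapses ../ segments before checking
    if real.startswith(BLOCKED_READ_PREFIXES):
        raise PermissionError(f"reading {real} is blocked")
    if any(m in mode for m in WRITE_MODES) and not real.startswith("/tmp/"):
        raise PermissionError("writes are restricted to /tmp")
    return open(real, mode, *args, **kwargs)
```

Resolving the path with os.path.realpath before any check is what defeats traversal: /tmp/../etc/shadow normalizes to /etc/shadow and is rejected like any other direct access.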

Filtered environment variables. Agent code sees a _FilteredEnviron proxy instead of the real os.environ. This proxy hides 30+ platform environment variables (AWS credentials, table names, handler paths) while exposing the user's injected third-party keys.
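A minimal version of that proxy (the hidden-prefix list is an illustrative subset, and the class name is simplified from the post's _FilteredEnviron):

```python
import os
from collections.abc import Mapping

# Illustrative subset of the 30+ hidden platform variables
HIDDEN_PREFIXES = ("AWS_", "_HANDLER", "LAMBDA_", "TABLE_")

class FilteredEnviron(Mapping):
    """Read-only os.environ view that hides platform-internal variables."""

    def __init__(self, env=None):
        self._env = {k: v for k, v in dict(env if env is not None else os.environ).items()
                     if not k.startswith(HIDDEN_PREFIXES)}

    def __getitem__(self, key):
        return self._env[key]

    def __iter__(self):
        return iter(self._env)

    def __len__(self):
        return len(self._env)
```

Because the class implements only the read-side Mapping protocol, agent code cannot mutate the underlying environment through it either.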

Layer 4: Lambda Tenant Isolation (Hardware-Level)

Sandboxing is software-level isolation. But for both MCP servers and agents — which execute untrusted user-authored code — we go a step further with hardware-level isolation.

UniversalAPI uses AWS Lambda's Tenant Isolation Mode (PER_TENANT tenancy) for both the MCP runtime and the agent runtime. Here's what that means:

How It Works

  1. The authorizer sets a tenant ID. When the PublicAuthorizer validates a token, it returns the token value as the tenantId in the authorizer context.

  2. API Gateway maps it to a header. The tenant ID is passed to Lambda via the X-Amz-Tenant-Id header.

  3. Lambda routes to a dedicated microVM. AWS Lambda uses the tenant ID to route the request to a Firecracker microVM dedicated to that tenant. Different users' requests run on physically separate virtual machines.
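Putting steps 1 and 2 together, the authorizer's return value looks roughly like this. The post states only that the tenantId lives in the authorizer context; the surrounding field layout is the standard API Gateway Lambda authorizer response shape, sketched with assumed values:

```python
def build_authorizer_response(token: str, user_id: str, keys_json: str) -> dict:
    """Sketch of an API Gateway Lambda authorizer result carrying a tenant ID."""
    return {
        "principalId": user_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Action": "execute-api:Invoke", "Effect": "Allow", "Resource": "*"}
            ],
        },
        "context": {
            "userId": user_id,
            "keys": keys_json,   # resolved third-party keys, as JSON
            "tenantId": token,   # forwarded as X-Amz-Tenant-Id to Lambda,
        },                       # which routes to a per-tenant microVM
    }
```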

What This Prevents

  • Side-channel attacks. Even if an attacker found a way to escape the Python sandbox, they'd be in a microVM that only contains their own data. No other user's keys, tokens, or execution state are present.
  • Timing attacks. Each tenant's microVM has its own CPU allocation, preventing timing-based information leakage between tenants.
  • Memory inspection. The Firecracker hypervisor provides memory isolation between microVMs. One tenant cannot read another tenant's memory.

Why Both Runtimes?

MCP servers execute untrusted user-authored JavaScript code. Even though MCP execution is stateless and keys are cleaned up after each request, tenant isolation provides defense-in-depth — preventing cross-tenant data leakage via /tmp, global variables, or in-memory caches that could survive between invocations on a warm Lambda container.

For agents, the case is even stronger: agents have access to Bedrock (LLM calls), can make autonomous multi-step decisions, and run more complex code. But both runtimes benefit from the same hardware-level guarantee: your execution environment is never shared with another tenant.

The tradeoff is additional cold starts (each tenant gets their own Firecracker microVM), but we've found this acceptable given the security benefits.

Layer 5: Bedrock Metering

Agents on UniversalAPI can call Amazon Bedrock models (Claude, Nova, Titan, etc.) as part of their execution. Since these calls cost real money, we need to meter them accurately and bill the right user.

The Metered Bedrock Client

When agent code creates a boto3.client('bedrock-runtime'), it doesn't get the real client. Instead, the agent runtime monkey-patches boto3.client() to return a MeteredBedrockClient — a transparent proxy that wraps every Bedrock API call.

Here's what the proxy does:

  1. Intercepts the call. When agent code calls client.invoke_model(...) or client.converse(...), the proxy intercepts it.

  2. Forwards to Bedrock. The actual API call goes through to Bedrock normally.

  3. Extracts token counts. From the response, the proxy extracts inputTokens and outputTokens.

  4. Calculates cost. Using a built-in pricing table (per-model, per-region), it calculates the exact cost in credits.

  5. Applies infrastructure fee. A 1.20x multiplier is applied to cover platform infrastructure costs.

  6. Records the usage. The metered usage is attached to the request log, which feeds into the billing pipeline.

This is completely transparent to the agent code. The agent author writes normal boto3 calls, and the platform handles metering behind the scenes. The user sees itemized Bedrock costs in their usage dashboard.
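Steps 1 through 6 can be condensed into a stripped-down proxy like the one below, exercised with a fake client. The 1.20x fee and the use of token counts come from the post; the pricing-table format, attribute names, and usage log are assumptions:

```python
class MeteredBedrockClient:
    """Transparent metering proxy around a bedrock-runtime client (sketch)."""
    INFRA_FEE = 1.20  # platform infrastructure multiplier

    def __init__(self, real_client, pricing, usage_log):
        self._client = real_client
        self._pricing = pricing   # {model_id: (input_rate_per_1k, output_rate_per_1k)}
        self._usage = usage_log   # attached to the request log for billing

    def converse(self, modelId, **kwargs):
        if modelId not in self._pricing:        # model allowlist check
            raise ValueError(f"model '{modelId}' is not in the pricing table")
        response = self._client.converse(modelId=modelId, **kwargs)
        usage = response["usage"]               # Converse API token counts
        in_rate, out_rate = self._pricing[modelId]
        cost = (usage["inputTokens"] / 1000 * in_rate
                + usage["outputTokens"] / 1000 * out_rate) * self.INFRA_FEE
        self._usage.append({"model": modelId, "cost": cost})
        return response
```

With a stub client reporting 1000 input and 500 output tokens at rates (0.003, 0.015) per 1K, the metered cost works out to (0.003 + 0.0075) * 1.20 = 0.0126 credits.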

Model Allowlist

The metering layer also enforces a model allowlist. Agent code can only call models that are in the platform's MODEL_PRICING table. Attempts to call unsupported models are blocked — preventing unexpected costs and ensuring every call can be accurately billed.

Layer 6: Log Sanitization

Users can view full request logs via the API and the web dashboard for debugging. But logs can inadvertently contain secrets — API keys that appeared in request/response bodies, Bearer tokens in headers, or AWS credentials from error messages. These must never be visible.

UniversalAPI implements two-layer defense-in-depth log sanitization:

Layer 1: Write-Time Redaction (Primary Defense)

Before any log record is written to DynamoDB, the log_request.py helper calls sanitize_log_record() with a list of known secret values — the actual API keys, tokens, and passwords that were injected into the runtime for that request.

python
# Simplified from log_request.py
known_secrets = collect_secret_values(user_keys, author_keys, platform_keys)
sanitized_record = sanitize_log_record(record, known_secret_values=known_secrets)
table.put_item(Item=sanitized_record)

This layer performs exact-match replacement: every occurrence of a known secret value in any field of the log record is replaced with [REDACTED]. Because it has access to the actual secret values, it catches secrets even when they appear in unexpected places — inside JSON payloads, error messages, or concatenated strings.

Layer 2: Read-Time Redaction (Defense-in-Depth)

When logs are read back via the GET /logs/{requestId} API, a second sanitization pass runs using pattern-based detection. This layer doesn't know the actual secret values — instead, it uses regex patterns to identify anything that looks like a secret:

  • API key patterns (sk-..., uapi_..., AKIA...)
  • Bearer tokens in Authorization headers
  • AWS session tokens and secret access keys
  • Common password field patterns

This catches secrets that might have been missed at write time — for example, if a new secret format wasn't in the known-secrets list, or if a secret was logged by a code path that didn't pass through the primary sanitization.
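A cut-down sketch of pattern-based redaction (the patterns shown are illustrative, not the platform's actual list):

```python
import re

# Illustrative patterns; the real read-time pass covers more secret formats
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{8,}"),          # OpenAI-style API keys
    re.compile(r"uapi_[a-z]{2}_[A-Za-z0-9]+"),  # UniversalAPI tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),            # AWS access key IDs
    re.compile(r"(?i)(Authorization:\s*Bearer\s+)\S+"),
]

def redact(text: str) -> str:
    """Replace anything that looks like a secret with [REDACTED]."""
    for pat in SECRET_PATTERNS:
        if pat.groups:  # preserve the "Authorization: Bearer " prefix
            text = pat.sub(r"\1[REDACTED]", text)
        else:
            text = pat.sub("[REDACTED]", text)
    return text
```

Unlike the write-time layer, this pass needs no knowledge of the actual secret values, which is exactly what makes it a useful backstop.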

Both Layers Are Automatic

Neither MCP server authors nor users need to configure anything. The platform collects all known secrets before execution, sanitizes at write time, and re-sanitizes at read time. Secrets are stripped before they ever reach the user's screen.

Layer 7: Data Privacy and Account Deletion

UniversalAPI follows data minimization principles and provides full GDPR/CCPA-compliant account deletion.

What We Store

We store only what's needed to operate the platform:

  • User account: userId, email, display name, subscription status, credit balance
  • Credentials: Third-party API keys (encrypted in DynamoDB), OAuth tokens (encrypted with refresh tokens), access tokens (SHA-256 hashed — never stored in plaintext)
  • Resources: MCP server definitions, agent definitions, conversation history
  • Usage: Request logs with requestId tracing, aggregated usage metrics
  • Files: User-uploaded knowledge files in S3

We do not collect analytics beyond what's needed for billing and debugging. We do not sell or share user data.

Full Account Deletion

The DELETE /user/account endpoint (or POST /user/account with {"confirm": true}) performs a hard delete of all user data:

  1. User record — Deleted from UniUserTable
  2. API keys — All entries deleted from KeysTable
  3. Access tokens — All entries deleted from AccessTokensTable
  4. Agents — All agent definitions deleted from AgentTable
  5. MCP servers — All server definitions deleted from McpServerTable
  6. Conversation history — All conversations deleted from AgentConversationTable
  7. Request logs — All logs deleted from RequestTable
  8. Cron jobs — All scheduled jobs deleted, EventBridge schedules removed
  9. Channels — All messaging channel bindings deleted
  10. Knowledge files — All files deleted from S3
  11. Search vectors — All semantic search vectors deleted from S3 Vectors
  12. User alias — Alias mapping deleted from AliasTable

This is a hard delete, not a soft disable. After deletion, no user data remains in any table or storage system. The deletion is immediate and irreversible.

How This Compares to the MCP Auth Spec

The MCP specification is evolving its own authentication story, based on OAuth 2.1. The idea is that each MCP server would implement its own OAuth flow, and clients would authenticate directly with each server.

We take a different approach: a single, centralized authorizer in front of all MCP servers.

Why We Diverge from the Spec

| Aspect                | MCP Spec Approach                 | UniversalAPI Approach                    |
|-----------------------|-----------------------------------|------------------------------------------|
| Auth enforcement      | Each MCP server implements OAuth  | Single PublicAuthorizer Lambda           |
| Token validation      | Server-side, per-server           | Server-side, centralized                 |
| Credential management | Client manages per-server tokens  | Platform manages all credentials         |
| Key injection         | Not specified                     | Automatic, server-side                   |
| Tenant isolation      | Not specified                     | Hardware-level (MCP servers and agents)  |

The centralized approach has a significant security advantage: there's one place to get auth right, not dozens. If we relied on each MCP server to implement OAuth correctly, a single misconfigured server could compromise user credentials. With our approach, the MCP server code never even sees the authentication layer — it just receives pre-validated context and injected keys.

The tradeoff is interoperability. If you're using a standard MCP client that expects the MCP OAuth 2.1 flow, it won't work with our auth model out of the box. In practice, this hasn't been an issue because most MCP clients (Cline, Claude Desktop, Strands, etc.) support custom headers, which is all you need:

json
{
  "headers": {
    "Authorization": "Bearer uapi_ut_your_token_here"
  }
}

Security Summary

Here's the full security stack, from outermost to innermost:

| Layer                | What It Does                                                                | Applies To                  |
|----------------------|------------------------------------------------------------------------------|-----------------------------|
| API Gateway          | TLS termination, rate limiting, request routing                              | All requests                |
| PublicAuthorizer     | Token validation, credit checks, key resolution, OAuth refresh               | All authenticated requests  |
| Credential Injection | Server-side key injection with author/user priority, post-execution cleanup  | MCP servers, Agents         |
| Runtime Sandbox      | Module blocklist (Node.js), restricted builtins/imports (Python), env filtering | MCP servers, Agents      |
| Tenant Isolation     | Dedicated Firecracker microVM per user token                                 | MCP servers, Agents         |
| Bedrock Metering     | Transparent usage tracking, model allowlist, cost calculation                | Agents only                 |
| Log Sanitization     | Two-layer secret redaction (write-time exact match + read-time pattern match) | All logged requests        |
| Data Privacy         | Full hard-delete account deletion, encrypted credential storage, data minimization | All user data         |

Every layer is server-side. The client's only responsibility is to obtain a token and present it in the Authorization header. Everything else — validation, scope enforcement, credential injection, sandboxing, isolation, and billing — happens on our infrastructure.

What This Means for You

If you're using MCP servers or agents on UniversalAPI:

  • Your API keys are stored encrypted and never exposed to client code
  • Your keys are automatically injected into runtimes and cleaned up after each request
  • OAuth tokens are refreshed automatically — you never have to re-authenticate
  • Every MCP server and agent request runs in an isolated Firecracker microVM

If you're building MCP servers or agents on UniversalAPI:

  • You don't need to implement authentication — the platform handles it
  • Your code reads API keys from environment variables — no key management logic needed
  • Your code runs in a sandbox — you can't accidentally (or intentionally) access other users' data
  • Bedrock usage is metered transparently — just write normal boto3 calls

If you're evaluating UniversalAPI for enterprise use:

  • All auth enforcement is server-side, aligned with zero-trust principles
  • Hardware-level tenant isolation (Firecracker microVMs) for all code execution — MCP servers and agents
  • Tokens are SHA-256 hashed at rest — no plaintext credential storage
  • Centralized auth avoids the pitfalls of the evolving MCP OAuth spec
  • Full audit trail via request logging with requestId tracing
  • Two-layer log sanitization ensures secrets never appear in debugging logs
  • Full GDPR/CCPA-compliant account deletion — hard delete of all user data across all tables and storage

Security is never "done" — it's an ongoing practice. We're continuing to evolve our security posture, including work on scoped IAM roles for even finer-grained permission boundaries within the sandbox. If you have questions about our security architecture, reach out on GitHub.

Want to see it in action? Sign up for UniversalAPI — it's free to start, and your first 100 credits are on us. Browse the MCP server catalog or create your first agent.

Universal API — The agentic entry point to the universe of APIs