SSRF in the Cloud Era: How One Misuse of HttpClient Stole 100M Records
In July 2019, a former Amazon Web Services engineer named Paige Thompson was arrested and later indicted by the U.S. Department of Justice for one of the largest financial-sector breaches on record: roughly 106 million Capital One credit card applicants and customers across the United States and Canada. The technical hook was not a zero-day or a sophisticated kernel exploit. It was a misconfigured ModSecurity-based web application firewall that could be coerced into making outbound HTTP requests on the attacker's behalf. From inside the EC2 instance hosting the WAF, those requests reached http://169.254.169.254/latest/meta-data/iam/security-credentials/, the AWS Instance Metadata Service version 1, which happily returned temporary IAM credentials for the role attached to the instance. That role had S3 list and read permissions far broader than the WAF needed. Thompson used the stolen credentials to enumerate buckets and exfiltrate the customer data to an external server. No CVE was assigned because no single library was at fault. The DOJ press release and the subsequent congressional report attributed the incident to a chain: Server-Side Request Forgery, the legacy IMDSv1 protocol, and an over-privileged IAM role. Each link in that chain was a default that nobody questioned until 100 million records were sitting on someone else's disk.
The Mechanic in One Paragraph
Server-Side Request Forgery happens when an application takes a URL or hostname from an attacker and uses it as the destination of an outbound HTTP, FTP, gopher, or file request. The vulnerable component is the server, and that distinction is the entire point. The server sits inside your network, behind the perimeter firewall, with routable access to internal services the public internet cannot reach: cloud metadata endpoints, admin panels, databases on private subnets, Kubernetes API servers, Redis on localhost, AWS APIs through VPC endpoints. A successful SSRF turns the server into a confused deputy that fetches whatever the attacker wants, then often returns the response body verbatim. The attacker did not bypass the firewall. The application bypassed it for them.
A Python Webhook That Hands Over Your Cloud
The most common shape of SSRF is an endpoint that fetches a URL on behalf of a user. Image proxies, link unfurlers, webhooks, OAuth callbacks, PDF generators, and "import from URL" features all fit the pattern. Here is a Flask handler that takes a webhook target from the request body and forwards a payload to it:
    import requests
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/webhook/dispatch", methods=["POST"])
    def dispatch_webhook():
        # VULNERABLE: the destination comes straight from the request body.
        target = request.json["url"]
        payload = request.json["payload"]
        response = requests.post(target, json=payload, timeout=5)
        return jsonify({"status": response.status_code, "body": response.text})

Send {"url": "http://169.254.169.254/latest/meta-data/iam/security-credentials/", "payload": {}} and the response body contains the IAM role name. One more request to the same path with the role name appended returns the temporary access key, secret key, and session token. The fix is not "validate the URL" with a regex. String matching is the wrong tool because DNS rebinding, alternative IPv6 representations, decimal-encoded IPs, and URL parser disagreements all defeat it. The correct shape is parse, resolve, then check the resolved address:
    import ipaddress
    import socket
    from urllib.parse import urlparse

    ALLOWED_HOSTS = {"hooks.example-partner.com", "api.trusted-vendor.io"}

    def safe_dispatch_url(raw_url: str) -> str:
        parsed = urlparse(raw_url)
        if parsed.scheme != "https":
            raise ValueError("only https allowed")
        if parsed.hostname not in ALLOWED_HOSTS:
            raise ValueError("destination not in allowlist")
        # Resolve and verify every returned address
        for _family, _, _, _, sockaddr in socket.getaddrinfo(parsed.hostname, None):
            ip = ipaddress.ip_address(sockaddr[0])
            if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
                raise ValueError(f"resolved to non-routable address {ip}")
        return raw_url

The allowlist is the primary defense. The IP-range check is the safety net for when allowlists are infeasible (user-controlled link previews, for example). Both layers matter because DNS records are mutable: an attacker who controls evil.example.com can point its A record at 169.254.169.254 between your check and the actual request. Returning raw_url, as this function does, leaves that window open because the HTTP client resolves the name a second time. The serious mitigation for that race is to resolve once, then make the HTTP request to the resolved IP with the original Host header set explicitly and, for HTTPS, with the TLS server name still set to the original hostname so the certificate is verified against it.
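The resolve-once-then-connect idea can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not a drop-in client: pick_public_ip, PinnedHTTPSConnection, and post_json_pinned are hypothetical names, port 443 is assumed, and the is_global test is a stricter stand-in for the individual range checks. The TCP connection goes to the vetted IP, while SNI, certificate verification, and the Host header all keep using the original hostname.

```python
import http.client
import ipaddress
import socket
import ssl


def pick_public_ip(hostname: str) -> str:
    """Resolve once and return one address, refusing if any answer is non-public."""
    infos = socket.getaddrinfo(hostname, 443, type=socket.SOCK_STREAM)
    for *_, sockaddr in infos:
        ip = ipaddress.ip_address(sockaddr[0])
        # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
        if ip.version == 6 and ip.ipv4_mapped is not None:
            ip = ip.ipv4_mapped
        if not ip.is_global:
            raise ValueError(f"{hostname} resolved to non-public address {ip}")
    return infos[0][4][0]


class PinnedHTTPSConnection(http.client.HTTPSConnection):
    """HTTPS connection that dials a pre-validated IP instead of re-resolving."""

    def __init__(self, host: str, pinned_ip: str, timeout: float = 5.0):
        self._pin_context = ssl.create_default_context()
        super().__init__(host, 443, timeout=timeout, context=self._pin_context)
        self._pinned_ip = pinned_ip

    def connect(self) -> None:
        # TCP goes to the vetted IP; SNI and certificate checks use self.host.
        sock = socket.create_connection((self._pinned_ip, self.port), self.timeout)
        self.sock = self._pin_context.wrap_socket(sock, server_hostname=self.host)


def post_json_pinned(hostname: str, path: str, body: bytes) -> int:
    """POST a JSON body to hostname over a connection pinned to a vetted IP."""
    conn = PinnedHTTPSConnection(hostname, pick_public_ip(hostname))
    try:
        # http.client derives the Host header from the hostname, not the IP.
        conn.request("POST", path, body=body,
                     headers={"Content-Type": "application/json"})
        return conn.getresponse().status
    finally:
        conn.close()
```

Because the address that was checked is the address that gets dialed, a DNS answer that changes after the check has no effect on where the request lands.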
A Node.js Link Preview That Reaches the Kube API
Link unfurling is the second most common SSRF vector. A user pastes a URL into a chat message, the server fetches the page, parses the OpenGraph tags, and renders a preview card. The naive implementation uses whatever URL the user typed:
    app.post("/preview", async (req, res) => {
      const { url } = req.body;
      // VULNERABLE: fetches any user-supplied URL and follows redirects.
      const response = await fetch(url, { redirect: "follow" });
      const html = await response.text();
      res.json({ title: extractTitle(html), description: extractDescription(html) });
    });

On a Kubernetes cluster this endpoint can hit the kube-apiserver at https://kubernetes.default.svc, internal Redis at http://10.0.0.42:6379/, or the instance metadata service of the underlying node (on EKS, the EC2 one). Following redirects makes it worse: the attacker returns a 302 to http://169.254.169.254/latest/meta-data/ from a domain that would pass any hostname check. The hardened version resolves first, refuses redirects, and connects only to the address it validated:
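    import dns from "node:dns/promises";
    import https from "node:https";
    import net from "node:net";

    const PRIVATE_RANGES = [
      "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
      "127.0.0.0/8", "169.254.0.0/16", "::1/128", "fc00::/7", "fe80::/10"
    ];

    const blocked = new net.BlockList();
    for (const cidr of PRIVATE_RANGES) {
      const [prefix, bits] = cidr.split("/");
      blocked.addSubnet(prefix, Number(bits), net.isIPv6(prefix) ? "ipv6" : "ipv4");
    }

    async function resolveSafe(hostname) {
      const records = await dns.lookup(hostname, { all: true });
      for (const { address, family } of records) {
        if (blocked.check(address, family === 6 ? "ipv6" : "ipv4")) {
          throw new Error(`blocked private address ${address} for ${hostname}`);
        }
      }
      return records[0].address;
    }

    // Connect to the vetted IP; servername keeps SNI and certificate checks on the hostname.
    function getPinned(parsed, ip) {
      return new Promise((resolve, reject) => {
        const options = {
          host: ip, port: parsed.port || 443, servername: parsed.hostname,
          path: parsed.pathname + parsed.search, headers: { Host: parsed.host },
        };
        https.get(options, (response) => {
          if (response.statusCode !== 200) {
            response.resume(); // redirects and errors are rejected, never followed
            return reject(new Error(`unexpected status ${response.statusCode}`));
          }
          let body = "";
          response.setEncoding("utf8");
          response.on("data", (chunk) => { body += chunk; });
          response.on("end", () => resolve(body));
        }).on("error", reject);
      });
    }

    app.post("/preview", async (req, res) => {
      try {
        const parsed = new URL(req.body.url);
        if (parsed.protocol !== "https:") return res.status(400).end();
        const ip = await resolveSafe(parsed.hostname);
        const html = await getPinned(parsed, ip);
        res.json({ title: extractTitle(html), description: extractDescription(html) });
      } catch {
        res.status(400).end();
      }
    });

The pattern is intentional. Resolve the hostname yourself, reject anything that points inward (net.BlockList does the CIDR matching), then connect directly to that IP while keeping the Host header and the TLS server name set to the original hostname, so certificate validation and virtual hosting still work. Putting the bare IP into an https:// URL would not be enough, because the certificate would then be checked against the IP rather than the hostname. Refuse to follow redirects automatically; if you need them, repeat the whole resolve-and-check cycle on each hop.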
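Repeating the resolve-and-check cycle on every hop might look like the following Python sketch. The function name and both callables are hypothetical: validate stands for whatever destination check the service already uses and must raise on a disallowed URL, and fetch_once stands for a single request made with redirects disabled that returns the status, a header mapping keyed as "Location", and the body.

```python
from typing import Callable, Mapping, Tuple
from urllib.parse import urljoin

# (status, headers, body) for one request made WITHOUT following redirects.
Response = Tuple[int, Mapping[str, str], bytes]
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def fetch_with_checked_redirects(
    url: str,
    validate: Callable[[str], None],
    fetch_once: Callable[[str], Response],
    max_hops: int = 3,
) -> Response:
    """Follow redirects by hand, re-validating the destination on every hop."""
    for _ in range(max_hops + 1):
        validate(url)  # raises before any request is sent to a bad destination
        status, headers, body = fetch_once(url)
        if status not in REDIRECT_STATUSES:
            return status, headers, body
        location = headers.get("Location")
        if not location:
            raise ValueError("redirect without a Location header")
        url = urljoin(url, location)  # handles relative redirect targets
    raise ValueError("too many redirects")
```

The hop limit matters as much as the validation: without it, a redirect loop ties up a worker for as long as the attacker likes.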
Why Cloud Made SSRF a 10/10
Before cloud, SSRF was usually rated medium. The worst case was reading an internal status page or pivoting to an unauthenticated admin panel; serious damage required chaining with another bug. The metadata service changed that math overnight. On AWS, every EC2 instance can ask http://169.254.169.254/latest/meta-data/iam/security-credentials/<role> over plain HTTP, with no authentication, and receive a JSON document containing valid temporary STS credentials for the role attached to the instance. The role typically has the permissions the workload needs, which means SSRF plus an over-privileged role equals direct access to whatever S3 buckets, DynamoDB tables, RDS snapshots, or KMS keys the role can touch. Capital One demonstrated the entire chain in production.
The same address answers on Azure, where IMDS lives at http://169.254.169.254/metadata/instance behind only a Metadata: true header. GCP serves http://metadata.google.internal/computeMetadata/v1/ behind Metadata-Flavor: Google. Both header requirements are trivially defeated when the SSRF primitive controls headers. Inside Kubernetes the problem multiplies: the kube-apiserver sits at a well-known service IP, the metadata service is reachable from pods unless network policy blocks it, and any service in the mesh is addressable from any pod that can be coerced into a request. The blast radius for one SSRF in a cloud-native deployment is the union of every credential and every service the workload can reach.
IMDSv2 Is the Right Floor, Not a Ceiling
AWS introduced IMDSv2 in November 2019, four months after the Capital One disclosure. The protocol requires the caller to first issue a PUT request to /latest/api/token with an X-aws-ec2-metadata-token-ttl-seconds header, receive a session token, and then send that token in an X-aws-ec2-metadata-token header on subsequent GET requests. Most SSRF primitives can issue a GET with an attacker-controlled URL but cannot issue a PUT with custom headers, so IMDSv2 breaks the simple version of the attack. By default the token response is also sent with an IP hop limit of 1, so it cannot travel past the instance, and token requests that carry an X-Forwarded-For header are rejected, which closes a few clever variants that go through misconfigured proxies. None of this is bulletproof. Instances launched from older AMIs still accept IMDSv1 by default for backward compatibility, applications that proxy arbitrary HTTP verbs and headers (a sufficiently flexible SSRF) can still issue the PUT, and many organizations have not run aws ec2 modify-instance-metadata-options --http-tokens required across their fleet. IMDSv2 is the floor; revoking metadata access entirely from workloads that do not need credentials, and pairing every workload with a least-privilege role, is what closes the actual gap.
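The two-step IMDSv2 exchange can be written out with the Python standard library to make the extra requirements visible. This is a sketch for illustration: the function names are hypothetical, the 21600-second TTL is simply the longest lifetime the service accepts, and read_metadata only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT with a TTL header asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: every GET carries the session token in a header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/latest/meta-data/{path.lstrip('/')}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def read_metadata(path: str, timeout: float = 2.0) -> str:
    """Run both steps in order; only meaningful from inside an EC2 instance."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(path, token), timeout=timeout) as resp:
        return resp.read().decode()
```

An SSRF primitive that can only choose the URL of a GET has no way to perform step 1, and without the token from step 1 the request in step 2 is refused.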
Detection That Works
SSRF is fundamentally a data flow problem: a value that originated in an HTTP request body, query string, or header reaches an HTTP client sink as the destination URL, possibly after passing through string concatenation, JSON parsing, or a redirect handler. Static analysis tools that model taint propagation can flag this pattern at commit time. The source set is the same as for SQL injection (request inputs); the sink set is different (requests.get, fetch, HttpClient.send, URL.openStream, libcurl); the missing element is a sanitizer that proves the input was either allowlisted or resolved-and-validated. Data flow analysis catches the entire class because the sanitizer is part of the model, not an afterthought.
Runtime detection complements the source-level work. DAST scanners detect blind SSRF by injecting URLs that point at an out-of-band collaborator — Burp Collaborator, the open-source interactsh, or a self-hosted DNS canary — and waiting for the callback. The application makes a request to abc123.collaborator.example, the canary records the hit, and the scanner correlates the inbound DNS or HTTP event with the request that triggered it. The defensive runtime layer is an explicit egress firewall that denies all outbound traffic by default and allows only the destinations each workload actually needs, plus per-workload IAM roles scoped to the specific resources the workload touches. Both layers assume SSRF will eventually slip through and limit what an attacker can do once it has.
Prevention Checklist
The defenses below are ordered from cheapest to most operationally involved. Apply as many as the use case allows; each closes a category of bypass that the others do not.
- Parse with a real URL parser. Use urllib.parse, new URL(), or java.net.URI; never substring-match. Reject anything that fails to parse cleanly.
- Resolve the hostname yourself, then check the IP. Reject the request if any A or AAAA record falls in 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8, 169.254.0.0/16, ::1/128, fc00::/7, or fe80::/10.
- Allowlist destination domains where the use case allows. Webhooks, OAuth callbacks, and integrations almost always have a finite set of valid destinations.
- Disable automatic redirects, or revalidate every hop. A 302 from an allowlisted domain to 169.254.169.254 is the standard bypass.
- Route outbound HTTP through a dedicated egress proxy with policy. This centralizes the allowlist, makes auditing tractable, and removes the per-service correctness burden.
- Enforce IMDSv2 fleet-wide and remove metadata access from workloads that do not need credentials. Run aws ec2 modify-instance-metadata-options --instance-id <id> --http-tokens required --http-put-response-hop-limit 1 on each existing instance, and set the same metadata options in launch templates so new instances start that way.
- Scope IAM roles per workload to the smallest action and resource set that the workload actually uses. The Capital One blast radius came from a role that had read access to hundreds of buckets it never legitimately touched.
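Auditing existing instances for the IMDSv2 item can start from the JSON that aws ec2 describe-instances prints. The sketch below assumes that output shape (Reservations, Instances, and a MetadataOptions object per instance); the function name is hypothetical and the instance IDs in the usage note are made up.

```python
import json
from typing import Iterator


def instances_allowing_imdsv1(describe_instances_json: str) -> Iterator[str]:
    """Yield instance IDs whose metadata endpoint is on and still accepts IMDSv1."""
    data = json.loads(describe_instances_json)
    for reservation in data.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            # Treat a missing field as the permissive case so nothing is skipped.
            endpoint_on = options.get("HttpEndpoint", "enabled") == "enabled"
            tokens_required = options.get("HttpTokens") == "required"
            if endpoint_on and not tokens_required:
                yield instance["InstanceId"]
```

Feeding it a document in which i-0aaa has HttpTokens set to "optional" and i-0bbb has it set to "required" yields only i-0aaa, which is the instance that still needs the modify-instance-metadata-options call.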
Where GraphNode SAST Fits
GraphNode SAST traces user-controlled URLs from HTTP sources to outbound client sinks across 13+ languages — Python requests, Node.js fetch, Java and .NET HttpClient, Go net/http. The data flow engine treats allowlist checks and IP-range validation as sanitizers. Rules map to OWASP A10:2021 and CWE-918; details in the A10 SSRF guide. Findings surface in the pull request with the full source-to-sink path.
What the Capital One Story Should Teach
The breach was not bad code. It was three reasonable-looking defaults — a permissive WAF, a metadata service that returned credentials over unauthenticated HTTP, and an IAM role broader than the workload required — chained into something none of the individual reviewers had a reason to flag. Defending against the next version means assuming each link fails. Validate URLs at the source, lock down the metadata service at the host, scope IAM at the provider, route egress through a policy you can read in one screen. SSRF is the default behavior of every HTTP client in every language, applied to an attacker-controlled URL, with a metadata endpoint one routable hop away. Take that hop away and the class collapses.