This document is a STRIDE-based threat model for the aiohttp library. It is a living document intended to (a) make explicit the implicit security assumptions baked into the codebase, (b) catalogue known classes of threat against each subsystem, and (c) record the existing and recommended mitigations.
aiohttp is an asyncio-based HTTP client/server framework for Python. It
provides:
- An HTTP/1.1 server (
aiohttp.web) including routing, middleware, WebSocket support, static-file serving, and a Gunicorn worker. - An HTTP/1.1 client (
aiohttp.ClientSession) including connection pooling, TLS, proxy support, redirects, cookie handling, and WebSockets. - Shared wire-protocol code: HTTP/1 parser (vendored llhttp wrapped in Cython, with a pure Python fallback), HTTP writer, WebSocket framing, multipart, and compression.
Key public APIs (non-exhaustive):
| Surface | Entry points |
|---|---|
| Server | aiohttp.web.Application, web.RouteTableDef, web.run_app, web.AppRunner, web.WebSocketResponse, web.FileResponse |
| Client | aiohttp.ClientSession, aiohttp.TCPConnector, aiohttp.ClientResponse, aiohttp.WSMessage, aiohttp.BasicAuth |
| Shared | aiohttp.MultipartReader/MultipartWriter, aiohttp.CookieJar, aiohttp.TraceConfig, aiohttp.resolver.AsyncResolver |
We use STRIDE:
- Spoofing — impersonating identity (host, user, peer, dependency).
- Tampering — modifying data or code in flight or at rest.
- Repudiation — denying that an action occurred.
- Information Disclosure — leaking confidential data.
- Denial of Service — exhausting CPU, memory, sockets, file descriptors.
- Elevation of Privilege — gaining unintended access.
Risk is ranked High / Medium / Low based on a rough product of likelihood and impact, as judged by maintainers. Mitigations are split into existing (already implemented in the codebase) and recommended (not yet implemented or only partially implemented).
These cross-cutting assets apply across most sections; individual sections only list assets unique to that section.
- Integrity of public-API behavior — functions return what callers expect and don't introduce protocol corruption (request smuggling, response splitting, framing desync).
- Confidentiality of data in transit — TLS handling, header values, cookies, request/response bodies are not leaked between connections, sessions, or to log sinks.
- Availability of host application — aiohttp does not crash, deadlock, or exhaust CPU/memory/FDs in the host process under hostile or malformed input.
- Security of host application — aiohttp does not become a vector for attacks on the embedding application (SSRF, file disclosure, code execution, privilege escalation through deserialisation, etc.).
- Reputation & supply-chain integrity — the released artifacts on PyPI are what maintainers built and signed; the source on GitHub matches the artifacts; the vendored llhttp matches upstream; CI/CD secrets are not exposed.
flowchart LR
Untrusted([Untrusted Internet])
Caller([Caller / host application])
Upstream([External HTTP servers])
subgraph Server[Server side]
direction TB
SP[web_protocol<br/>connection lifecycle]
PARS[HTTP parser<br/>_http_parser.pyx + llhttp]
REQ[web_request.Request]
DISP[web_urldispatcher + middleware]
HND{user handler}
RESP[web_response.Response<br/>FileResponse / WebSocketResponse]
WR[http_writer]
SP --> PARS --> REQ --> DISP --> HND --> RESP --> WR
end
subgraph Client[Client side]
direction TB
CS[ClientSession]
CONN[TCPConnector<br/>+ TLS, proxy, pooling]
RES[resolver]
CP[client_proto]
CR[client_reqrep.ClientResponse]
CS --> CONN --> RES
CONN --> CP --> CR --> CS
end
subgraph Shared[Shared wire-protocol code]
direction TB
PARS
WR
WS[http_websocket + _websocket/]
MP[multipart]
COMP[compression_utils]
PARS -.-> WS
WR -.-> WS
end
Untrusted -- HTTP/1, WS --> SP
WR -- HTTP/1, WS --> Untrusted
Caller --> CS
CS --> Caller
CONN -- HTTP/1, WS --> Upstream
Upstream --> CP
CJ[(CookieJar)] -. client only .-> CS
TR[TraceConfig] -. signals .-> CS
The threat surface is broken down into 19 sections. Each is modeled in its own subsection below.
- HTTP/1 parser
- HTTP/1 writer
- WebSocket framing & per-message deflate
- Multipart parsing & encoding
- Compression codecs
- Streams & payloads
- Server connection lifecycle
- Server routing & middleware
- Server request/response objects
- Server static file serving
- Server-side WebSocket handler
- Client API & request lifecycle
- Connector / TLS / proxy / pooling
- Client-side WebSocket
- Client auth middlewares
- Cookie handling
- DNS resolution
- Tracing & URL/header helpers
- Build & release supply chain
Scope. Parsing of HTTP/1.0 and HTTP/1.1 request and response messages — request/status line, header block, chunked transfer-encoding, content-length framing, trailers — and the surface where parsed values flow into the rest of the library. Out of scope here: WebSocket framing (§5.3), multipart bodies (§5.4), compression (§5.5), HTTP-writer-side framing (§5.2).
Components covered.
aiohttp/_http_parser.pyx— Cython wrapper over vendored llhttp, default in CPython builds.aiohttp/_cparser.pxd— Cython declarations for llhttp.aiohttp/http_parser.py— pure-PythonHttpRequestParser/HttpResponseParserused as a fallback (and as the canonical implementation whenAIOHTTP_NO_EXTENSIONS=1).aiohttp/_find_header.pxd/aiohttp/_find_header.h— header-name interning.aiohttp/http_exceptions.py—BadHttpMessage,BadHttpMethod,BadStatusLine,LineTooLong,InvalidHeader,TransferEncodingError,ContentLengthError.vendor/llhttp/— vendored upstream parser, version9.3.1(seevendor/llhttp/package.json). Generated viamake generate-llhttp.
Selection. A conditional re-import at the bottom of
aiohttp/http_parser.py re-binds the public names to the Cython parser when
_http_parser imports successfully and AIOHTTP_NO_EXTENSIONS is unset. There is no hybrid mode — both request and
response parsers come from the same backend, so an inconsistent
request-Cython/response-pure-Python configuration cannot occur in supported
builds.
Trust boundaries & data flow.
flowchart LR
Wire([Untrusted bytes]) --> Feed[parser.feed_data]
Feed --> Llhttp[llhttp / Python state machine]
Llhttp -->|RawRequestMessage<br/>RawResponseMessage| Caller[web_protocol / client_proto]
Llhttp -->|StreamReader feed| Body[(Request/response body)]
Caller --> ReqResp[Request / ClientResponse]
ReqResp --> User([User handler / caller])
The parser is invoked on every byte that arrives from a socket, before any
authentication. Everything fed into feed_data is attacker-controlled on
the server side and upstream-controlled on the client side (proxies,
upstream services, malicious origins reached via client). The output
(RawRequestMessage / RawResponseMessage, raw header tuples, body chunks
into StreamReader) is then handed to web_protocol.RequestHandler and
client_proto.ResponseHandler respectively.
Trust assumptions about parser output:
- Header names are validated against a token regex; values are not normalised
beyond
lstrip/rstripand CR/LF/NUL rejection. - Header values are decoded
utf-8withsurrogateescape, so non-UTF-8 bytes are preserved and can round-trip back to the wire if downstream code re-emits them. Any sanitisation downstream of the parser is the responsibility of consumers (logging, header reflection, proxying). - Methods are accepted as any RFC 7230 token; the parser does not canonicalise case.
- Versions are accepted by the regex
HTTP/(\d)\.(\d)— i.e.HTTP/0.9,HTTP/2.0, etc. all parse without rejection, even though they cannot be served correctly.
Assets at risk.
- Framing integrity — that one wire message corresponds to one parsed message; nothing the parser accepts can cause a desync between aiohttp and an upstream/downstream peer (request smuggling).
- Allocator safety — that a malicious peer cannot drive memory or CPU usage to denial of service through parser-controlled allocations.
- Bytewise transparency — that bytes accepted by the parser cannot inject new framing or new header semantics downstream (CRLF injection, NUL smuggling).
Threats (STRIDE).
| # | Component / Vector | STRIDE | Threat | Risk |
|---|---|---|---|---|
| 1.1 | Request line / status line | T | Smuggling via duplicate / conflicting framing headers (Content-Length × N, Content-Length + Transfer-Encoding, obfuscated Transfer-Encoding). |
High |
| 1.2 | Header block, line endings | T | Smuggling via bare-LF, obs-fold, optional CR-before-LF on the request parser. Request parser is strict; lenient flags apply only to the response parser. | Medium |
| 1.3 | Header values, CR/LF/NUL | T / I | CRLF injection enabling response splitting / header injection if downstream re-emits values verbatim. Historically CVE-2023-37276. | High |
| 1.4 | Header values, surrogateescape decode | I / T | Non-UTF-8 bytes round-trip through Headers and may be reflected by user code / proxies / logs into untrusted contexts. |
Medium |
| 1.5 | HTTP version regex | T | HTTP/0.9 and HTTP/2.0 accepted on the wire, opening a small surface for protocol-confusion against intermediaries that handle these specially. |
Low |
| 1.6 | Method token | I / T | Methods are not case-canonicalised; arbitrary tokens up to max_line_size accepted. May confuse downstream method-based authorisation if user code compares case-sensitively. |
Low |
| 1.7 | Content-Length parsing |
T | Negative or non-decimal CL handling, multiple comma-separated CLs, CL with leading +/whitespace. |
Medium |
| 1.8 | Transfer-Encoding: chunked parsing |
T | Lenient acceptance (xchunked, chunked, identity, doubled chunked) leading to smuggling against a non-aiohttp peer that interprets differently. |
Medium |
| 1.9 | Chunk size parsing | D | No upper bound on chunk-size value (Python unbounded int); huge chunk size could drive allocator before client_max_size rejects body. Mitigated by §5.7 / client_max_size. |
Low–Med |
| 1.10 | Chunk extensions | D / T | Unbounded chunk-extension consumption per chunk; weak validation of extension syntax. | Low |
| 1.11 | Parser error reporting | I | Exception messages may include up to ~100 bytes of malformed input, which can be surfaced in 4xx error bodies, logs, or DEBUG=True traces. |
Low |
| 1.12 | Cython ⇄ pure-Python divergence | T / S | Behaviour differences between llhttp and the Python fallback may produce parser-confusion if a deployment unintentionally switches backends (e.g. a user installs without compiled extensions). | Med |
| 1.13 | Vendored llhttp version drift | S / T | An upstream llhttp CVE not picked up by aiohttp's vendoring cadence remains exploitable until make generate-llhttp is re-run and released. |
Medium |
| 1.14 | Build/regen of llhttp (make generate-llhttp) |
S / T | Local tampering or supply-chain compromise of the npm llhttp package gets baked into the vendored C. Covered in §5.19 but originates here. |
Medium |
Mitigations.
| # | Threat | Existing | Recommended |
|---|---|---|---|
| 1.1 | Smuggling via duplicate framing headers | llhttp rejects conflicting Content-Length. http_parser.py:HttpRequestParserPy.parse_headers rejects coexistence of CL + Transfer-Encoding: chunked. The full SINGLETON_HEADERS set (CL, CT, Host, TE, ETag, etc.) is duplicate-rejected by the request parser (strict mode); #12302 disabled this check on the response parser (lax mode), since real-world servers commonly send duplicate Content-Type / Server. |
If new singleton-sensitive headers emerge in HTTP/1.1 RFC errata, add to SINGLETON_HEADERS. |
| 1.2 | Lenient response parsing | Lenient flags (llhttp_set_lenient_headers, llhttp_set_lenient_optional_cr_before_lf, llhttp_set_lenient_spaces_after_chunk_size) are only enabled on the response parser and only when DEBUG is False (set in HttpResponseParser.__init__). The request parser is strict. |
Documented design decision: keep lenient response parsing for real-world server interop |
| 1.3 | CRLF / NUL in header values | Bytes \r, \n, \x00 rejected in header values (_http_parser.pyx callbacks; http_parser.py:HeadersParser.parse_headers). |
Keep regression tests in tests/test_http_parser.py covering each forbidden byte both in name and value, and across both Cython and pure-Python parsers. |
| 1.4 | Non-UTF-8 round-trip | None at parser layer (intentional — preserving original bytes is required for some use cases). | Document in user-facing docs that header values are bytes-preserving; warn against reflecting headers verbatim into responses, logs, or sub-requests without re-validation. |
| 1.5 | HTTP version regex accepts 0.9 / 2.0 | None (regex is permissive). | Tighten VERSRE (and llhttp configuration if possible) to reject anything outside HTTP/1.0 and HTTP/1.1. |
| 1.6 | Method-case round-trip | Method token validated by regex; not canonicalised. | Document that user route handlers / authorization checks should compare methods case-sensitively to the canonical RFC tokens, or use the framework's web.RouteTableDef decorators which already match canonical methods. |
| 1.7 | Content-Length parsing |
llhttp validates CL is decimal and non-negative; pure-Python parser validates via DIGITS.fullmatch(r"\d+") before int(...), rejecting +/-/non-ASCII-digit forms (test_bad_headers, test_headers_content_length_err_* cover these). |
None. Cross-backend parity is covered by the shared parser tests. |
| 1.8 | Transfer-Encoding lenience |
_is_chunked_te requires chunked to be the last value; duplicate chunked rejected (#10611). Request parser strict. |
None. |
| 1.9 | Chunk-size DoS | The parser doesn't cap chunk size, but server-side body length is bounded by client_max_size (default 1 MiB) in web_request.py:BaseRequest.read. Client-side responses are bounded by user-supplied max_body_size / streaming reads. |
None. If a cap is ever needed at the parser level, plumb it through HttpPayloadParser. |
| 1.10 | Chunk-extension DoS | Chunk-extension content is bounded by the same wire-level size constraints (it shares the chunk-size line with max_line_size). |
Add an explicit test that chunk-extension flooding cannot blow past max_line_size. |
| 1.11 | Parser error reflection | http_parser.py truncates to [:100] for line errors. Server-side error path renders 4xx with the exception message; tracebacks only when DEBUG=True. |
Audit any path where BadHttpMessage content is reflected to the client unsanitised (especially in custom web_log configurations). |
| 1.12 | Cython ⇄ pure-Python divergence | tests/test_http_parser.py parameterises tests over REQUEST_PARSERS / RESPONSE_PARSERS (pure-Python always; Cython when the extension imports). The high-leverage attack vectors are already covered under both backends: CL+TE (test_content_length_transfer_encoding), CL×N (test_duplicate_singleton_header_rejected), obs-fold (test_reject_obsolete_line_folding, test_http_response_parser_obs_line_folding*), CR/LF/NUL (test_bad_headers, test_http_response_parser_null_byte_in_header_value, test_http_response_parser_bad_crlf), version regex (test_http_request_parser_bad_version*, test_http_response_parser_bad_version*). |
None. When new attack vectors emerge, add them to the parameterised tests. |
| 1.13 | llhttp version drift | Manual upgrade via make generate-llhttp; vendor pinned in vendor/llhttp/package.json. |
Track upstream releases (e.g. via Dependabot rule for vendor/llhttp/package.json), bump on every llhttp release, regenerate in CI. |
| 1.14 | npm-side compromise of llhttp |
The vendored output is checked into git, so a compromise during a future regen would be detectable in PR review. See §5.19. | Make the llhttp build reproducible: pin Node.js version, commit the npm lockfile, and on every bump verify the regenerated C against upstream's release tarballs before committing. |
Past advisories / hardening (recap).
- GHSA-xx9p-xxvh-7g8j (CVE-2023-47641) (3.8.0) — CL-vs-TE divergence between the Cython and pure-Python parsers, allowing request smuggling against deployments that switched backends.
- CVE-2023-37276 / GHSA-45c4-8wx5-qw6w (3.8.5) — HTTP request smuggling via CR/LF/NUL in header values. Both parsers reject these bytes at the byte level.
- GHSA-pjjw-qhg8-p2p9 (3.8.6) — smuggling pair in vendored llhttp 8.1.1; fixed by bumping llhttp to 9.
- GHSA-gfw2-4jvh-wgfg / GHSA-8qpw-xqxj-h4r2 (3.8.6 / 3.9.2) — pure-Python parser accepted lenient separators / weak RFC validation that llhttp rejected.
- GHSA-8495-4g3g-x7pr (CVE-2024-52304) (3.10.11) — chunk-extension newline smuggling in the pure-Python parser.
- GHSA-9548-qrrj-x5pj (CVE-2025-53643) (3.12.14) — request smuggling via the chunked-trailer section in the pure-Python parser.
- GHSA-69f9-5gxw-wvc2 (CVE-2025-69224) (3.13.3) — Unicode codepoints
matched by
\din the pure-Python parser's regexes were treated as digits. - GHSA-g84x-mcqj-x9qq (3.13.3) — CPU-DoS on
request.read()when the body arrives as a very large number of small chunks. - PR #12137 (3.13.4) — precautionary hardening: pure-Python parser
now explicitly rejects duplicate
Transfer-Encoding: chunkedon the request parser. - GHSA-c427-h43c-vf67 (3.13.4) — duplicate
Hostheader accepted in request parser, bypassingApplication.add_domain()host-based routing / authorisation. Fixed by addingHostto the strict request-parser singleton rejection set. - GHSA-63hf-3vf5-4wqf (CVE-2026-34520) (3.13.4) — llhttp accepted NUL / control bytes in response header values, leaving the response parser weaker than the request parser. Fixed by tightening the response-side byte check.
- GHSA-w2fm-2cpv-w7v5 (CVE-2026-22815) (3.13.4) — uncapped memory
growth on long header / trailer blocks. Fixed by enforcing
max_field_size/max_headerson the trailer block too. - PR #12302 (3.13.5) — duplicate-singleton-header rejection
was breaking real-world response parsing (servers like Google APIs /
Werkzeug emit duplicate
Content-Type/Server); fix disables the check on the response parser (lax mode) while keeping it on the request parser (strict).
These are all currently in place; this section assumes no regression.