feat: experimental traffic analysis #2848
🦋 Changeset detected
Latest commit: 0e8c763
The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages.
|
|
||
| - Starts a reverse proxy that forwards every request to an upstream `--target` | ||
| and records each request/response exchange into a HAR file. | ||
| - The HAR file is written incrementally (after each exchange) and flushed on |
As mentioned during the demo: do we want to keep this behavior for the first release? It means the proxy will be usable only for short-lived processes such as e2e tests. In the future we can introduce chunking (and give the `drift` command an option to read a whole folder of traffic files). Eventually we can switch to NDJSON format and append line by line.
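To illustrate the NDJSON idea from the comment above, here is a minimal sketch of appending one line per exchange instead of rewriting the whole capture. The `Exchange` shape and the helper names `appendExchange` and `readExchanges` are hypothetical and not taken from this PR; only Node's built-in `fs`, `os`, and `path` modules are used.

```typescript
import { appendFileSync, mkdtempSync, readFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical, simplified exchange shape; the PR's real types may differ.
interface Exchange {
  method: string;
  url: string;
  status: number;
}

// Append one exchange as a single JSON line. The cost per write stays
// constant, unlike re-serializing every captured exchange each time.
function appendExchange(file: string, exchange: Exchange): void {
  appendFileSync(file, JSON.stringify(exchange) + '\n', 'utf8');
}

// Read a capture back, skipping blank lines.
function readExchanges(file: string): Exchange[] {
  return readFileSync(file, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as Exchange);
}

// Usage: write two exchanges to a temporary capture file and read them back.
const captureFile = join(mkdtempSync(join(tmpdir(), 'capture-')), 'traffic.ndjson');
appendExchange(captureFile, { method: 'GET', url: '/pets', status: 200 });
appendExchange(captureFile, { method: 'POST', url: '/pets', status: 201 });
const captured = readExchanges(captureFile);
```

With this layout an interrupted process loses at most the line being written, and memory use no longer grows with the number of captured exchanges.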
| - `--api <path>`: OpenAPI file or folder to validate against. Omit to generate. | ||
| - `--traffic-format <auto|har|kong|nginx-json|apache-json|ndjson>` (default: `auto`) | ||
| - `--format <pretty|json|csv|sarif>` (default: `pretty`) | ||
| - `--match-mode <strict-host|basepath>` (default: `strict-host`): how requests are located |
Since we are adding the `--server` option, which seems more flexible, maybe we should get rid of the `--match-mode` param altogether? That way we will always use `strict-host`, so we validate against the OAD, but the user may also override the server.
| Mutually exclusive with `--match-mode`: use `--match-mode` when the traffic URLs align | ||
| with the description `servers`, use `--server` to declare the actual server when they | ||
| do not. | ||
|
|
As presented during the demo: shall we include an ignore option for the first release? It would avoid messages about unknown headers like `x-caddy-authtoken`.
| 'packages/**/__tests__/**/*', | ||
| 'packages/cli/src/index.ts', | ||
| 'packages/cli/src/utils/assert-node-version.ts', | ||
| 'packages/cli/src/commands/drift/**', |
Is it fine if we ignore coverage for experimental commands? I would expect the source code there to change significantly over time after we get feedback on usage.
| - missing required parameters/body, | ||
| - request/response schema mismatches, | ||
| - baseline security issues (opt-in OWASP API risk heuristics). | ||
| - When no spec is provided, infers an OpenAPI 3.1 description from the traffic. |
Do we need to improve the OAD generation option for the first release? Maybe we should explicitly require a path to the folder where the OAD will be saved and implement more sophisticated schema generation?
On the other hand, we may wait for feedback and do this in the next iteration.
As discussed in Slack, we will remove OAD generation in this PR.
| maxFindings?: number; | ||
| } | ||
|
|
||
| const ANSI = { |
we have colorette as a dep in this repo, let's use it
| import { handleDrift, type DriftArgv } from './commands/drift/index.js'; | ||
| import type { FindingSeverity, MatchMode, TrafficFormat } from './commands/drift/types/index.js'; |
| yargs | ||
| .env('REDOCLY_CLI_DRIFT') | ||
| .positional('traffic', { | ||
| describe: 'Path to a traffic log file or folder (HAR, Kong, Nginx/Apache JSON, NDJSON).', |
does it support those other formats?
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit 0e8c763.
|
|
||
| > Experimental: the command, flags, and output are subject to change. | ||
|
|
||
| ## What it does |
| ## What it does | |
| The `drift` command: |
| - Streams traffic logs (HAR, Kong, Nginx/Apache JSON, NDJSON). | ||
| - Matches each request/response exchange to a documented operation. | ||
| - Reports discrepancies: | ||
| - undocumented endpoints, | ||
| - undocumented request params/headers, | ||
| - missing required parameters/body, | ||
| - request/response schema mismatches, | ||
| - baseline security issues (opt-in OWASP API risk heuristics). |
| - Streams traffic logs (HAR, Kong, Nginx/Apache JSON, NDJSON). | |
| - Matches each request/response exchange to a documented operation. | |
| - Reports discrepancies: | |
| - undocumented endpoints, | |
| - undocumented request params/headers, | |
| - missing required parameters/body, | |
| - request/response schema mismatches, | |
| - baseline security issues (opt-in OWASP API risk heuristics). | |
| - Streams traffic logs (HAR, Kong, Nginx/Apache JSON, NDJSON). | |
| - Matches each request/response exchange to a documented operation. | |
| - Reports discrepancies: | |
| - undocumented endpoints | |
| - undocumented request params/headers | |
| - missing required parameters/body | |
| - request/response schema mismatches | |
| - baseline security issues (opt-in OWASP API risk heuristics) |
| It has **no extra runtime dependencies** beyond what `@redocly/cli` already ships: | ||
| spec loading reuses `@redocly/openapi-core` and schema validation reuses the bundled | ||
| `@redocly/ajv`. |
| It has **no extra runtime dependencies** beyond what `@redocly/cli` already ships: | |
| spec loading reuses `@redocly/openapi-core` and schema validation reuses the bundled | |
| `@redocly/ajv`. | |
| The `drift` command has **no extra runtime dependencies** beyond what `@redocly/cli` already ships: spec loading reuses `@redocly/openapi-core` and schema validation reuses the bundled `@redocly/ajv`. |
| - `--match-mode <strict-host|basepath>` (default: `strict-host`): how requests are located | ||
| via the description `servers` (`strict-host` also requires the host to match, `basepath` | ||
| only the base path). Mutually exclusive with `--server`. |
| - `--match-mode <strict-host|basepath>` (default: `strict-host`): how requests are located | |
| via the description `servers` (`strict-host` also requires the host to match, `basepath` | |
| only the base path). Mutually exclusive with `--server`. | |
| - `--match-mode <strict-host|basepath>` (default: `strict-host`): how requests are located using the description's `servers` (`strict-host` also requires the host to match, `basepath` | |
| only the base path). Mutually exclusive with `--server`. |
| - `--output, -o <path>`: write the drift report (in the format selected with `--format`) | ||
| to a file instead of stdout | ||
| - `--server <url>`: server URL the traffic was captured against (host, host + base path, | ||
| or a path-only prefix like `/api`). Only requests under it are considered, and the rest | ||
| of their URL is treated as the API path. It replaces the description `servers` and the | ||
| remainder is matched against the description paths directly - useful when the captured | ||
| traffic does not carry the documented host or base path (e.g. `--server localhost:9000` | ||
| for traffic captured behind a gateway that adds `/api`). Mutually exclusive with | ||
| `--match-mode`: use `--match-mode` when the traffic URLs align with the description | ||
| `servers`, use `--server` to declare the actual server when they do not. |
| - `--output, -o <path>`: write the drift report (in the format selected with `--format`) | |
| to a file instead of stdout | |
| - `--server <url>`: server URL the traffic was captured against (host, host + base path, | |
| or a path-only prefix like `/api`). Only requests under it are considered, and the rest | |
| of their URL is treated as the API path. It replaces the description `servers` and the | |
| remainder is matched against the description paths directly - useful when the captured | |
| traffic does not carry the documented host or base path (e.g. `--server localhost:9000` | |
| for traffic captured behind a gateway that adds `/api`). Mutually exclusive with | |
| `--match-mode`: use `--match-mode` when the traffic URLs align with the description | |
| `servers`, use `--server` to declare the actual server when they do not. | |
| - `--output, -o <path>`: write the drift report (in the format selected with `--format`) to a file instead of stdout | |
| - `--server <url>`: server URL the traffic was captured against (host, host + base path, or a path-only prefix like `/api`). | |
| Only requests under it are considered, and the rest of their URL is treated as the API path. | |
| `--server` replaces the description's `servers` and the remainder is matched against the description paths directly. | |
| Useful when the captured traffic does not carry the documented host or base path (e.g. `--server localhost:9000` for traffic captured behind a gateway that adds `/api`). | |
| Mutually exclusive with `--match-mode`. | |
| Use `--match-mode` when the traffic URLs align with the description | |
| `servers`. | |
| Use `--server` to declare the actual server when they do not. |
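The `--server` behavior described above (only requests under the server are considered, and the rest of the URL is treated as the API path) can be sketched roughly as follows. The function name `resolveApiPath` is hypothetical, and this is a simplified illustration under stated assumptions, not the PR's actual matching code.

```typescript
// Returns the API path for a request URL given a `--server`-style value
// (host, host + base path, or a path-only prefix such as `/api`),
// or null when the request does not fall under that server.
function resolveApiPath(requestUrl: string, server: string): string | null {
  const request = new URL(requestUrl);
  let host: string | null = null;
  let basePath: string;

  if (server.startsWith('/')) {
    // Path-only prefix: any host is accepted.
    basePath = server;
  } else {
    // Assumption: a value without a scheme (e.g. `localhost:9000`) is a host.
    const parsed = new URL(server.includes('://') ? server : `http://${server}`);
    host = parsed.host;
    basePath = parsed.pathname;
  }

  // Normalize trailing slashes: '/' becomes '', '/api/' becomes '/api'.
  basePath = basePath.replace(/\/+$/, '');

  if (host !== null && request.host !== host) return null;
  if (basePath === '') return request.pathname;
  if (request.pathname === basePath) return '/';
  if (request.pathname.startsWith(basePath + '/')) {
    return request.pathname.slice(basePath.length);
  }
  return null;
}
```

For example, under these assumptions `resolveApiPath('http://gateway.example/api/pets', '/api')` yields `/pets`, while a request to `/apiv2/pets` is rejected because the prefix must end at a path segment boundary.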
|
|
||
| > Experimental: the command, flags, and output are subject to change. | ||
|
|
||
| ## What it does |
| ## What it does | |
| The `proxy` command: |
| It reuses `@redocly/openapi-core` for spec loading and the bundled `@redocly/ajv` | ||
| for schema validation, plus `undici` (already shipped) for the upstream client — | ||
| no extra runtime dependencies. | ||
|
|
| It reuses `@redocly/openapi-core` for spec loading and the bundled `@redocly/ajv` | |
| for schema validation, plus `undici` (already shipped) for the upstream client — | |
| no extra runtime dependencies. | |
| The `proxy` command reuses `@redocly/openapi-core` for spec loading and the bundled `@redocly/ajv` for schema validation, plus `undici` (already shipped) for the upstream client. | |
| There are no additional runtime dependencies. | |
| - Reverse proxy only: clients must target the proxy directly; there is no | ||
| forward/`CONNECT` mode and no inbound TLS termination. | ||
| - `accept-encoding` is stripped from forwarded requests so captured bodies are | ||
| stored decoded; binary response bodies are stored base64-encoded in the HAR. | ||
| - Captured exchanges are held in memory and the HAR is rewritten in full on each | ||
| exchange, which suits development-rate traffic rather than high-volume capture. |
| - Reverse proxy only: clients must target the proxy directly; there is no | |
| forward/`CONNECT` mode and no inbound TLS termination. | |
| - `accept-encoding` is stripped from forwarded requests so captured bodies are | |
| stored decoded; binary response bodies are stored base64-encoded in the HAR. | |
| - Captured exchanges are held in memory and the HAR is rewritten in full on each | |
| exchange, which suits development-rate traffic rather than high-volume capture. | |
| - Reverse proxy only: clients must target the proxy directly. | |
| There is no forward/`CONNECT` mode and no inbound TLS termination. | |
| - `accept-encoding` is stripped from forwarded requests so captured bodies are stored decoded. | |
| Binary response bodies are stored base64-encoded in the HAR. | |
| - Captured exchanges are held in memory and the HAR is rewritten in full on each | |
| exchange. | |
| This strategy suits development-rate traffic rather than high-volume capture. |
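As a rough illustration of the two capture behaviors listed above, the sketch below strips `accept-encoding` from forwarded headers and stores a response body either as text or base64-encoded. The `text` and `encoding` fields follow the HAR 1.2 `content` object; the helper names and the text-detection heuristic are assumptions, not the PR's implementation.

```typescript
import { Buffer } from 'node:buffer';

// Drop `accept-encoding` (case-insensitively) so the upstream replies with
// an uncompressed body that can be stored decoded.
function stripAcceptEncoding(headers: Record<string, string>): Record<string, string> {
  const result: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() !== 'accept-encoding') result[name] = value;
  }
  return result;
}

// Subset of the HAR 1.2 `content` object.
interface HarContent {
  size: number;
  mimeType: string;
  text: string;
  encoding?: 'base64';
}

// Assumption: a simple MIME-type heuristic decides what counts as text.
function toHarContent(body: Buffer, mimeType: string): HarContent {
  const isText = /^text\/|json|xml|x-www-form-urlencoded/i.test(mimeType);
  return isText
    ? { size: body.length, mimeType, text: body.toString('utf8') }
    : { size: body.length, mimeType, text: body.toString('base64'), encoding: 'base64' };
}
```

A reader of the HAR can then recover binary bodies with `Buffer.from(content.text, 'base64')` whenever `content.encoding` is `'base64'`.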
| - Starts a reverse proxy that forwards every request to an upstream `--target` | ||
| and records each request/response exchange into a HAR file. | ||
| - The HAR file is written incrementally (after each exchange) and flushed on | ||
| shutdown, so it stays durable if the process is interrupted. | ||
| - When `--api` is provided, each captured exchange is validated live against the | ||
| spec using the same engine as [`drift`](../drift/README.md); findings are | ||
| printed as they happen and a full report is rendered on shutdown. | ||
| - The resulting HAR file can be replayed through `drift` later: | ||
| `redocly drift ./capture.har --api ./openapi.yaml`. |
| - Starts a reverse proxy that forwards every request to an upstream `--target` | |
| and records each request/response exchange into a HAR file. | |
| - The HAR file is written incrementally (after each exchange) and flushed on | |
| shutdown, so it stays durable if the process is interrupted. | |
| - When `--api` is provided, each captured exchange is validated live against the | |
| spec using the same engine as [`drift`](../drift/README.md); findings are | |
| printed as they happen and a full report is rendered on shutdown. | |
| - The resulting HAR file can be replayed through `drift` later: | |
| `redocly drift ./capture.har --api ./openapi.yaml`. | |
| - Starts a reverse proxy that forwards every request to an upstream `--target` and records each request/response exchange into a HAR file. | |
| - The HAR file is written incrementally (after each exchange) and flushed on shutdown. | |
| The file stays durable if the process is interrupted. | |
| - When `--api` is provided, each captured exchange is validated live against the spec using the same engine as [`drift`](../drift/README.md). | |
| Findings are printed as they happen and a full report is rendered on shutdown. | |
| - The resulting HAR file can be replayed through `drift` later: `redocly drift ./capture.har --api ./openapi.yaml`. |
What/Why/How?
Added two new (experimental) commands, `drift` and `proxy`, for collecting traffic data and validating it against the provided OpenAPI definition.
Reference
Testing
Screenshots (optional)
Check yourself
Security
Note
Medium Risk
Large new surface area including security heuristics, dynamic plugin imports, and a forwarding proxy; mitigated as experimental with local defaults and no changes to existing product auth flows.
Overview
Adds two experimental CLI commands for comparing real HTTP traffic to OpenAPI: `drift` (offline logs) and `proxy` (live capture).
`drift` ingests traffic from HAR, Kong, Nginx/Apache JSON, or NDJSON (auto-detect or explicit format), normalizes each exchange, matches it to bundled OpenAPI operations (`strict-host`/`basepath` or `--server` override), and runs pluggable rules. Default rules cover undocumented routes, schema/parameter consistency (AJV with readOnly/writeOnly handling), and security baselines; optional `owasp-api-top10` adds heuristic checks. Reports ship as pretty, JSON, CSV, or SARIF, with grouped problems, `--min-severity`, and exit code 1 on error-level findings. External `--plugin` and `--traffic-plugin` modules can extend rules and parsers.
`proxy` starts a local reverse proxy (`undici` upstream), writes a growing HAR file, and optionally reuses the same `ValidationSession` as `drift` for per-request findings plus a shutdown report. Captured HARs are intended for replay via `drift`.
Wiring includes lazy-loaded yargs commands, argv types, changesets/docs, a small `lint` tweak to skip config-lint when `--format` is `csv`/`sarif`, and broad e2e coverage for formats, rules, and exit behavior.
Reviewed by Cursor Bugbot for commit 0e8c763. Bugbot is set up for automated code reviews on this repo.
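The reporting behavior summarized above (`--min-severity` filtering and exit code 1 on error-level findings) could look roughly like this. The severity levels, the `Finding` shape, and both function names are assumptions made for illustration; the PR's `FindingSeverity` type may define different values.

```typescript
// Assumed severity levels, ordered from least to most severe.
type Severity = 'info' | 'warning' | 'error';

// Hypothetical, simplified finding shape.
interface Finding {
  rule: string;
  severity: Severity;
  message: string;
}

const SEVERITY_RANK: Record<Severity, number> = { info: 0, warning: 1, error: 2 };

// Keep only findings at or above the requested minimum severity.
function filterBySeverity(findings: Finding[], min: Severity): Finding[] {
  return findings.filter((f) => SEVERITY_RANK[f.severity] >= SEVERITY_RANK[min]);
}

// Exit with 1 when at least one error-level finding is present, else 0.
function exitCodeFor(findings: Finding[]): number {
  return findings.some((f) => f.severity === 'error') ? 1 : 0;
}

// Usage with made-up findings.
const sampleFindings: Finding[] = [
  { rule: 'undocumented-header', severity: 'info', message: 'x-request-id is not documented' },
  { rule: 'undocumented-endpoint', severity: 'warning', message: 'GET /internal is not documented' },
  { rule: 'schema-mismatch', severity: 'error', message: 'response body does not match schema' },
];
const reported = filterBySeverity(sampleFindings, 'warning');
const exitCode = exitCodeFor(reported);
```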