feat: experimental traffic analysis #2848

Open
sobanieca-redocly wants to merge 20 commits into main from feat/traffic-analysis
Conversation

@sobanieca-redocly

@sobanieca-redocly sobanieca-redocly commented Jun 3, 2026

What/Why/How?

Added two new experimental commands, `drift` and `proxy`, for collecting traffic data and validating it against a provided OpenAPI definition.

Reference

Testing

Screenshots (optional)

Check yourself

  • This PR follows the contributing guide
  • All new/updated code is covered by tests
  • Core code changed? - Tested with other Redocly products (internal contributions only)
  • New package installed? - Tested in different environments (browser/node)
  • Documentation update has been considered

Security

  • The security impact of the change has been considered
  • Code follows company security practices and guidelines

Note

Medium Risk
Large new surface area including security heuristics, dynamic plugin imports, and a forwarding proxy; mitigated as experimental with local defaults and no changes to existing product auth flows.

Overview
Adds two experimental CLI commands for comparing real HTTP traffic to OpenAPI: drift (offline logs) and proxy (live capture).

drift ingests traffic from HAR, Kong, Nginx/Apache JSON, or NDJSON (auto-detect or explicit format), normalizes each exchange, matches it to bundled OpenAPI operations (strict-host / basepath or --server override), and runs pluggable rules. Default rules cover undocumented routes, schema/parameter consistency (AJV with readOnly/writeOnly handling), and security baselines; optional owasp-api-top10 adds heuristic checks. Reports ship as pretty, JSON, CSV, or SARIF, with grouped problems, --min-severity, and exit code 1 on error-level findings. External --plugin and --traffic-plugin modules can extend rules and parsers.
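
A minimal sketch of how `--min-severity` filtering and the exit code described above could fit together. The severity names, the `Finding` shape, and both function names are assumptions made for illustration; they are not taken from the PR. The sketch also assumes the exit code is derived from the unfiltered findings, which the summary does not state.

```typescript
// Hypothetical severity levels; the PR's real FindingSeverity type may differ.
type Severity = 'info' | 'warn' | 'error';

// Hypothetical finding shape used only for this sketch.
interface Finding {
  ruleId: string;
  severity: Severity;
  message: string;
}

const SEVERITY_RANK: Record<Severity, number> = { info: 0, warn: 1, error: 2 };

// Keep only findings at or above the requested minimum severity.
function filterBySeverity(findings: Finding[], min: Severity): Finding[] {
  return findings.filter((f) => SEVERITY_RANK[f.severity] >= SEVERITY_RANK[min]);
}

// Exit code 1 when at least one error-level finding is present, otherwise 0.
function exitCodeFor(findings: Finding[]): 0 | 1 {
  return findings.some((f) => f.severity === 'error') ? 1 : 0;
}
```

Keeping the exit code independent of the display filter means a CI job still fails on errors even when the report is narrowed for readability.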

proxy starts a local reverse proxy (undici upstream), writes a growing HAR file, and optionally reuses the same ValidationSession as drift for per-request findings plus a shutdown report. Captured HARs are intended for replay via drift.

Wiring includes lazy-loaded yargs commands, argv types, changesets/docs, a small lint tweak to skip config-lint when --format is csv/sarif, and broad e2e coverage for formats, rules, and exit behavior.

Reviewed by Cursor Bugbot for commit 0e8c763. Bugbot is set up for automated code reviews on this repo.

@changeset-bot

changeset-bot Bot commented Jun 3, 2026

🦋 Changeset detected

Latest commit: 0e8c763

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages

| Name | Type |
| --- | --- |
| @redocly/cli | Minor |
| @redocly/openapi-core | Minor |
| @redocly/respect-core | Minor |

@sobanieca-redocly sobanieca-redocly changed the title from "feat: traffic analysis poc" to "feat: experimental traffic analysis" on Jun 8, 2026
@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 81.39% (🎯 81%) | 7406 / 9099 |
| 🔵 | Statements | 80.75% (🎯 80%) | 7699 / 9534 |
| 🔵 | Functions | 84.58% (🎯 84%) | 1476 / 1745 |
| 🔵 | Branches | 73.1% (🎯 73%) | 5013 / 6857 |

File Coverage (Changed Files)

| File | Stmts | Branches | Functions | Lines | Uncovered Lines |
| --- | --- | --- | --- | --- | --- |
| packages/cli/src/types.ts | 100% | 100% | 100% | 100% | |
| packages/cli/src/commands/lint.ts | 94.54% | 86.04% | 100% | 94.54% | 91-93, 158, 192 |

Generated in workflow #10356 for commit 0e8c763 by the Vitest Coverage Report Action

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Performance Benchmark (Lower is Faster)

| CLI Version | Bundle | Lint | Check Config |
| --- | --- | --- | --- |
| cli-latest | ▓ 1.00x (Fastest) | ▓ 1.00x (Fastest) | ▓ 1.00x (Fastest) |
| cli-next | ▓ 1.01x ± 0.01 | ▓ 1.00x ± 0.01 | ▓ 1.01x ± 0.01 |

@sobanieca-redocly

Author

@cursor review

Comment thread packages/cli/src/commands/proxy/server.ts Outdated
Comment thread packages/cli/src/commands/drift/utils/http.ts
Comment thread packages/cli/src/commands/proxy/server.ts
Comment thread packages/cli/src/commands/drift/engine/runner.ts
Comment thread packages/cli/src/commands/drift/rules/builtins/security.ts
@sobanieca-redocly

Author

@cursor review

Comment thread packages/cli/src/commands/proxy/server.ts
Comment thread packages/cli/src/commands/proxy/server.ts Outdated
Comment thread packages/cli/src/commands/proxy/index.ts Outdated
Comment thread packages/cli/src/commands/drift/openapi/matcher.ts Outdated
@sobanieca-redocly

Author

@cursor review

Comment thread packages/cli/src/commands/proxy/server.ts Outdated
Comment thread packages/cli/src/commands/drift/rules/builtins/schema.ts
Comment thread packages/cli/src/commands/drift/rules/builtins/schema.ts
Comment thread packages/cli/src/commands/drift/openapi/loader.ts
@sobanieca-redocly

Author

@cursor review

Comment thread packages/cli/src/commands/drift/index.ts
Comment thread packages/cli/src/commands/drift/log-formats/ndjson.ts Outdated
Comment thread packages/cli/src/commands/drift/engine/schema-validator.ts
Comment thread packages/cli/src/commands/drift/openapi/generator.ts Outdated
@sobanieca-redocly

Author

@cursor review

Comment thread packages/cli/src/commands/drift/log-formats/helpers.ts Outdated
@sobanieca-redocly

Author

@cursor review

Comment thread packages/cli/src/commands/proxy/index.ts
Comment thread packages/cli/src/commands/drift/log-formats/har.ts
Comment thread packages/cli/src/commands/drift/rules/builtins/schema.ts
@sobanieca-redocly

Author

@cursor review

Comment thread packages/cli/src/commands/drift/openapi/matcher.ts

- Starts a reverse proxy that forwards every request to an upstream `--target`
and records each request/response exchange into a HAR file.
- The HAR file is written incrementally (after each exchange) and flushed on

Author

As mentioned during the demo: do we want to keep this behavior for the first release? It means `proxy` will be usable only for short-lived processes such as e2e tests. In the future we can introduce chunking and give the `drift` command an option to read a whole folder of traffic files. Eventually we can switch to the NDJSON format and append line by line.
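
To make the trade-off in this comment concrete, here is a rough sketch contrasting a full HAR rewrite with an NDJSON append. The `Exchange` shape, the function names, and the file names are hypothetical and much simpler than real HAR entries; only Node's built-in `fs`, `os`, and `path` modules are used.

```typescript
import { appendFileSync, mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical, simplified exchange record; real HAR entries carry many more fields.
interface Exchange {
  method: string;
  url: string;
  status: number;
}

// Full-rewrite approach: every exchange stays in memory and the whole file is rewritten.
function rewriteHar(path: string, all: Exchange[]): void {
  const har = { log: { version: '1.2', creator: { name: 'sketch', version: '0' }, entries: all } };
  writeFileSync(path, JSON.stringify(har));
}

// NDJSON approach: append one JSON document per line; no in-memory history is needed.
function appendNdjson(path: string, exchange: Exchange): void {
  appendFileSync(path, JSON.stringify(exchange) + '\n');
}

// Read an NDJSON capture back, skipping blank lines.
function readNdjson(path: string): Exchange[] {
  return readFileSync(path, 'utf8')
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as Exchange);
}

// Usage: two exchanges captured with each strategy in a temporary directory.
const dir = mkdtempSync(join(tmpdir(), 'capture-'));
const harPath = join(dir, 'capture.har');
const ndjsonPath = join(dir, 'capture.ndjson');
const first: Exchange = { method: 'GET', url: 'http://localhost:9000/pets', status: 200 };
const second: Exchange = { method: 'POST', url: 'http://localhost:9000/pets', status: 201 };
rewriteHar(harPath, [first]);
rewriteHar(harPath, [first, second]); // cost grows with the number of captured exchanges
appendNdjson(ndjsonPath, first);
appendNdjson(ndjsonPath, second); // cost stays constant per exchange
```

The append variant keeps memory flat and each write small, at the price of producing a file that is not a single valid HAR document.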

- `--api <path>`: OpenAPI file or folder to validate against. Omit to generate.
- `--traffic-format <auto|har|kong|nginx-json|apache-json|ndjson>` (default: `auto`)
- `--format <pretty|json|csv|sarif>` (default: `pretty`)
- `--match-mode <strict-host|basepath>` (default: `strict-host`): how requests are located

Author

Since we are adding the `--server` option, which seems more flexible, maybe we should drop the `--match-mode` parameter altogether? That way we would always use `strict-host`, so we validate against the OAD, but the user can still override the server.

Mutually exclusive with `--match-mode`: use `--match-mode` when the traffic URLs align
with the description `servers`, use `--server` to declare the actual server when they
do not.

@sobanieca-redocly sobanieca-redocly Jun 12, 2026

Author

As presented during the demo: shall we include an ignore option in the first release, to avoid messages about unknown headers such as `x-caddy-authtoken`?
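
One possible shape for such an ignore option, sketched under stated assumptions: the option does not exist in the PR, and the `*` wildcard syntax, the function names, and the case-insensitive matching are all invented here for illustration.

```typescript
// Turn a hypothetical ignore pattern (exact name or `*` wildcard) into a case-insensitive matcher.
function toMatcher(pattern: string): RegExp {
  const escaped = pattern
    .toLowerCase()
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters except `*`
    .replace(/\*/g, '.*'); // `*` matches any run of characters
  return new RegExp(`^${escaped}$`);
}

// Return the observed header names that are neither documented nor ignored.
function filterUndocumentedHeaders(
  seen: string[],
  documented: string[],
  ignore: string[]
): string[] {
  const documentedSet = new Set(documented.map((h) => h.toLowerCase()));
  const matchers = ignore.map(toMatcher);
  return seen
    .map((h) => h.toLowerCase())
    .filter((h) => !documentedSet.has(h) && !matchers.some((m) => m.test(h)));
}
```

For example, ignoring `x-caddy-*` would silence gateway-injected headers while still reporting other undocumented ones.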

Comment thread vitest.config.ts
'packages/**/__tests__/**/*',
'packages/cli/src/index.ts',
'packages/cli/src/utils/assert-node-version.ts',
'packages/cli/src/commands/drift/**',

Author

Is it fine to exclude experimental commands from coverage? I expect the source code there to change significantly once we get feedback on usage.

  - missing required parameters/body,
  - request/response schema mismatches,
  - baseline security issues (opt-in OWASP API risk heuristics).
- When no spec is provided, infers an OpenAPI 3.1 description from the traffic.

Author

Do we need to improve the OAD generation option for the first release? Maybe we should explicitly require a path to the folder where the OAD will be saved, and implement more sophisticated schema generation?

On the other hand, we could wait for feedback and do this in the next iteration.

Author

As discussed in Slack, we will remove OAD generation in this PR.

maxFindings?: number;
}

const ANSI = {

Member

we have colorette as a dep in this repo, let's use it

Comment thread packages/cli/src/index.ts Outdated
Comment on lines +15 to +16
import { handleDrift, type DriftArgv } from './commands/drift/index.js';
import type { FindingSeverity, MatchMode, TrafficFormat } from './commands/drift/types/index.js';

Member

let's use dynamic import
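
A small sketch of what the requested change could look like: a generic wrapper that defers loading a command module until the command is actually invoked. `lazyHandler` is a hypothetical helper name, not something from the PR or from yargs; the commented usage line reuses the `./commands/drift/index.js` path and `handleDrift` export shown in the diff above.

```typescript
// Wrap a module loader so the command module is only loaded when the handler runs.
function lazyHandler<Argv>(
  load: () => Promise<(argv: Argv) => Promise<void> | void>
): (argv: Argv) => Promise<void> {
  return async (argv) => {
    // With a real `import()`, repeated calls hit the module cache after the first load.
    const handler = await load();
    await handler(argv);
  };
}

// Assumed usage inside a yargs command definition (not executed in this sketch):
// handler: lazyHandler(async () => (await import('./commands/drift/index.js')).handleDrift)
```

Type-only imports such as `import type { DriftArgv }` are erased at compile time, so they can stay static without pulling the module in at startup.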

Comment thread packages/cli/src/index.ts
yargs
.env('REDOCLY_CLI_DRIFT')
.positional('traffic', {
describe: 'Path to a traffic log file or folder (HAR, Kong, Nginx/Apache JSON, NDJSON).',

Member

does it support those other formats?

Member

Looks like it does 👍

@sobanieca-redocly

Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 0e8c763.

@sobanieca-redocly sobanieca-redocly marked this pull request as ready for review June 17, 2026 12:58
@sobanieca-redocly sobanieca-redocly requested review from a team as code owners June 17, 2026 12:58

> Experimental: the command, flags, and output are subject to change.

## What it does

Contributor

Suggested change
## What it does
The `drift` command:

Comment on lines +9 to +16
- Streams traffic logs (HAR, Kong, Nginx/Apache JSON, NDJSON).
- Matches each request/response exchange to a documented operation.
- Reports discrepancies:
  - undocumented endpoints,
  - undocumented request params/headers,
  - missing required parameters/body,
  - request/response schema mismatches,
  - baseline security issues (opt-in OWASP API risk heuristics).

@JLekawa JLekawa Jun 17, 2026

Contributor

Suggested change
- Streams traffic logs (HAR, Kong, Nginx/Apache JSON, NDJSON).
- Matches each request/response exchange to a documented operation.
- Reports discrepancies:
  - undocumented endpoints
  - undocumented request params/headers
  - missing required parameters/body
  - request/response schema mismatches
  - baseline security issues (opt-in OWASP API risk heuristics)

Comment on lines +18 to +20
It has **no extra runtime dependencies** beyond what `@redocly/cli` already ships:
spec loading reuses `@redocly/openapi-core` and schema validation reuses the bundled
`@redocly/ajv`.

Contributor

Suggested change
The `drift` command has **no extra runtime dependencies** beyond what `@redocly/cli` already ships: spec loading reuses `@redocly/openapi-core` and schema validation reuses the bundled `@redocly/ajv`.

Comment on lines +38 to +40
- `--match-mode <strict-host|basepath>` (default: `strict-host`): how requests are located
via the description `servers` (`strict-host` also requires the host to match, `basepath`
only the base path). Mutually exclusive with `--server`.

Contributor

Suggested change
- `--match-mode <strict-host|basepath>` (default: `strict-host`): how requests are located using the description's `servers` (`strict-host` also requires the host to match, `basepath`
only the base path). Mutually exclusive with `--server`.

Comment on lines +49 to +58
- `--output, -o <path>`: write the drift report (in the format selected with `--format`)
to a file instead of stdout
- `--server <url>`: server URL the traffic was captured against (host, host + base path,
or a path-only prefix like `/api`). Only requests under it are considered, and the rest
of their URL is treated as the API path. It replaces the description `servers` and the
remainder is matched against the description paths directly - useful when the captured
traffic does not carry the documented host or base path (e.g. `--server localhost:9000`
for traffic captured behind a gateway that adds `/api`). Mutually exclusive with
`--match-mode`: use `--match-mode` when the traffic URLs align with the description
`servers`, use `--server` to declare the actual server when they do not.

Contributor

Suggested change
- `--output, -o <path>`: write the drift report (in the format selected with `--format`) to a file instead of stdout
- `--server <url>`: server URL the traffic was captured against (host, host + base path, or a path-only prefix like `/api`).
Only requests under it are considered, and the rest of their URL is treated as the API path.
`--server` replaces the description's `servers` and the remainder is matched against the description paths directly.
Useful when the captured traffic does not carry the documented host or base path (e.g. `--server localhost:9000` for traffic captured behind a gateway that adds `/api`).
Mutually exclusive with `--match-mode`.
Use `--match-mode` when the traffic URLs align with the description `servers`.
Use `--server` to declare the actual server when they do not.
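
The difference between `--match-mode` and `--server` described in this suggestion can be sketched as follows. This is an illustrative approximation, not the PR's matcher: the function names are hypothetical, and the sketch assumes absolute, non-templated `servers` URLs and ignores the URL scheme when comparing hosts.

```typescript
type MatchMode = 'strict-host' | 'basepath';

// Return the API path relative to a base path, or null when the request is not under it.
function stripPrefix(pathname: string, basePath: string): string | null {
  const base = basePath.replace(/\/+$/, ''); // '/api/' -> '/api', '/' -> ''
  if (base === '') return pathname;
  if (pathname === base) return '/';
  return pathname.startsWith(base + '/') ? pathname.slice(base.length) : null;
}

// Locate the request via the description `servers`, honoring the match mode.
function resolveWithServers(requestUrl: string, servers: string[], mode: MatchMode): string | null {
  const req = new URL(requestUrl);
  for (const server of servers) {
    const s = new URL(server);
    // strict-host additionally requires the host (including port) to match.
    if (mode === 'strict-host' && s.host !== req.host) continue;
    const path = stripPrefix(req.pathname, s.pathname);
    if (path !== null) return path;
  }
  return null;
}

// Locate the request via an explicit server: host, host + base path, or a path-only prefix.
function resolveWithServerOverride(requestUrl: string, server: string): string | null {
  const req = new URL(requestUrl);
  if (server.startsWith('/')) return stripPrefix(req.pathname, server);
  const s = new URL(server.includes('://') ? server : `http://${server}`);
  if (s.host !== req.host) return null;
  return stripPrefix(req.pathname, s.pathname);
}
```

With a description server of `https://api.example.com/v1`, a request captured at `http://localhost:9000/v1/pets` is rejected in `strict-host` mode, resolves to `/pets` in `basepath` mode, and resolves to `/v1/pets` with a server override of `localhost:9000`.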


> Experimental: the command, flags, and output are subject to change.

## What it does

Contributor

Suggested change
## What it does
The `proxy` command:

Comment on lines +20 to +23
It reuses `@redocly/openapi-core` for spec loading and the bundled `@redocly/ajv`
for schema validation, plus `undici` (already shipped) for the upstream client —
no extra runtime dependencies.

Contributor

Suggested change
The `proxy` command reuses `@redocly/openapi-core` for spec loading and the bundled `@redocly/ajv` for schema validation, plus `undici` (already shipped) for the upstream client.
There are no additional runtime dependencies.

Comment on lines +67 to +72
- Reverse proxy only: clients must target the proxy directly; there is no
forward/`CONNECT` mode and no inbound TLS termination.
- `accept-encoding` is stripped from forwarded requests so captured bodies are
stored decoded; binary response bodies are stored base64-encoded in the HAR.
- Captured exchanges are held in memory and the HAR is rewritten in full on each
exchange, which suits development-rate traffic rather than high-volume capture.

Contributor

Suggested change
- Reverse proxy only: clients must target the proxy directly.
There is no forward/`CONNECT` mode and no inbound TLS termination.
- `accept-encoding` is stripped from forwarded requests so captured bodies are stored decoded.
Binary response bodies are stored base64-encoded in the HAR.
- Captured exchanges are held in memory and the HAR is rewritten in full on each
exchange.
This strategy suits development-rate traffic rather than high-volume capture.
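
The `accept-encoding` and base64 behavior listed above can be illustrated with a short sketch. The `content` fields follow HAR 1.2 (`text` plus an optional `encoding: 'base64'`), but the MIME-type heuristic in `isTextual` and all function names are assumptions, not the PR's implementation.

```typescript
// Subset of the HAR 1.2 response `content` object.
interface HarContent {
  size: number;
  mimeType: string;
  text: string;
  encoding?: 'base64';
}

// Assumed heuristic: which MIME types are stored as plain text.
function isTextual(mimeType: string): boolean {
  const type = (mimeType.split(';')[0] ?? '').trim().toLowerCase();
  return (
    type.startsWith('text/') ||
    type === 'application/json' ||
    type.endsWith('+json') ||
    type === 'application/xml' ||
    type.endsWith('+xml') ||
    type === 'application/x-www-form-urlencoded'
  );
}

// Store textual bodies as UTF-8 text and everything else base64-encoded.
function toHarContent(body: Uint8Array, mimeType: string): HarContent {
  const buf = Buffer.from(body);
  return isTextual(mimeType)
    ? { size: buf.length, mimeType, text: buf.toString('utf8') }
    : { size: buf.length, mimeType, text: buf.toString('base64'), encoding: 'base64' };
}

// Drop accept-encoding (case-insensitive) so the upstream replies with an uncompressed body.
function stripAcceptEncoding(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() !== 'accept-encoding') out[name] = value;
  }
  return out;
}
```

Stripping the header on the way upstream avoids having to decompress gzip or brotli bodies before storing them.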

Comment on lines +10 to +18
- Starts a reverse proxy that forwards every request to an upstream `--target`
and records each request/response exchange into a HAR file.
- The HAR file is written incrementally (after each exchange) and flushed on
shutdown, so it stays durable if the process is interrupted.
- When `--api` is provided, each captured exchange is validated live against the
spec using the same engine as [`drift`](../drift/README.md); findings are
printed as they happen and a full report is rendered on shutdown.
- The resulting HAR file can be replayed through `drift` later:
`redocly drift ./capture.har --api ./openapi.yaml`.

Contributor

Suggested change
- Starts a reverse proxy that forwards every request to an upstream `--target` and records each request/response exchange into a HAR file.
- The HAR file is written incrementally (after each exchange) and flushed on shutdown.
The file stays durable if the process is interrupted.
- When `--api` is provided, each captured exchange is validated live against the spec using the same engine as [`drift`](../drift/README.md).
Findings are printed as they happen and a full report is rendered on shutdown.
- The resulting HAR file can be replayed through `drift` later: `redocly drift ./capture.har --api ./openapi.yaml`.

3 participants