Add a README for every example and standardize on the Estuary name #39

Open
danthelion wants to merge 2 commits into main from add-example-readmes

danthelion (Collaborator) commented on Jun 26, 2026

Summary

Adds or rewrites a technical README.md for every example in the repo and rebuilds the root README.md as a categorized index. Also standardizes the product name as Estuary throughout (no "Estuary Flow" / bare "Flow").

Two commits:

  1. docs: add and refresh README for every example — documentation only.
  2. fix: correct latent bugs in example projects — small code fixes surfaced while writing the docs, plus README sync.

READMEs

  • Covered all 29 top-level examples + the 4 Python-derivation subprojects, and rebuilt the root index (Database CDC Captures / Materializations & Destinations / Derivations & Transformations / Real-Time RAG & AI / Streaming, Lakehouse & Stream Processing / Demos, Workshops & Webinars). All index links verified to resolve.
  • 8 brand-new READMEs: dekaf-python, estuary-demo-movies, mongodb-tinybird-clickstream, pyiceberg-aws-glue, python-derivations, and shipments-{ai,joins,stateless}.
  • Replaced placeholder stubs (the "??? / Profit!" and bare blog-link READMEs); preserved existing strong content (e.g. sqlserver-cdc-capture, oracle-capture, kafka-capture, shipments-datagen, the shipments-stateful Mermaid diagram, the hands-on-lab steps).
  • Each README documents the data flow in Estuary terms (capture → collection → materialization/derivation), what's included, prerequisites, copy-pasteable setup, and how to configure the capture/materialization via the dashboard or flowctl, with links to the relevant connector docs. Values (ports, users, tables, connector names) were verified against the actual docker-compose, init.sql, flow.yaml, and datagen files.
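
For reviewers who want to re-check the "all index links resolve" claim locally, here is a minimal sketch. It is not the method used in this PR (the description does not say how links were verified). The function name `broken_relative_links` and the regex are illustrative assumptions, and it only checks relative inline Markdown links, not external URLs.

```python
import re
from pathlib import Path

# Matches inline Markdown links of the form [text](target).
_LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:.
_SCHEME = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(readme_path):
    """Return relative link targets in a README that do not exist on disk.

    Hypothetical helper: absolute URLs and in-page anchors are skipped.
    """
    readme = Path(readme_path)
    broken = []
    for target in _LINK.findall(readme.read_text(encoding="utf-8")):
        if _SCHEME.match(target) or target.startswith("#"):
            continue
        # Drop any "#section" fragment before checking the filesystem.
        path_part = target.split("#", 1)[0]
        if not (readme.parent / path_part).exists():
            broken.append(target)
    return broken
```

Running `broken_relative_links("README.md")` from the repo root should return an empty list if every relative index entry points at an existing file or directory.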

Naming

All platform references now say Estuary. flowctl, flow.yaml / .flow.py, and flow_capture / flow_publication / flow_watermarks identifiers are left untouched.
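
To illustrate why the identifiers survive the rename, here is a rough sketch of a case-sensitive replacement. It is an assumption about how such a pass could work, not the process used in this PR; `standardize_name` is a hypothetical helper. A real pass would still need manual review, since it would also rewrite functional values such as a connection name literally set to 'Estuary Flow'.

```python
import re

# Case-sensitive on purpose: "Estuary Flow" and the bare word "Flow" match,
# while lowercase identifiers (flowctl, flow.yaml, flow_capture) do not.
_PRODUCT_NAME = re.compile(r"\bEstuary Flow\b|\bFlow\b")


def standardize_name(text):
    """Replace 'Estuary Flow' and bare 'Flow' with 'Estuary' (hypothetical helper)."""
    return _PRODUCT_NAME.sub("Estuary", text)
```

The word boundary `\b` does not fire between a letter and an underscore, so identifiers like `flow_publication` and `flow_watermarks` are left alone even apart from the case difference.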

Fixes (commit 2)

| Example | Fix |
| --- | --- |
| dekaf-kcat | consume.sh bootstrap host dekaf.fly.dev:9092 (retired) → dekaf.estuary-data.com:9092 |
| postgres-cloudsql-simple-capture | datagen/requirements.txt missing SQLAlchemy + python-dotenv (imported by datagen.py) |
| streaming-lakehouse-iceberg-duckdb | requirements.txt missing python-dotenv (imported by main.py) |
| python-derivations/shipments-stateful | derivation used 'Out for Delivery'; generator emits 'Out For Delivery' (status never matched) |
| shipments_eta | docker-compose.yml hardcoded dead localhost/mongo values → pass MongoDB env vars through |
| shipments_eta | eta.sql arrayJoin(s.delays_reason, …) → arrayStringConcat(s.delays__reason, …) (matches the Data Source column) |

Left as-is (flag for review)

shipments_eta/datagen/datagen.py and tinybird/shipments.datasource still contain personal-looking values (dani2, a specific Atlas cluster host, KAFKA_CONNECTION_NAME 'Estuary Flow', the Dani/shipments-demo prefix). Left untouched since they're functional config; happy to scrub them in a follow-up if desired.

danthelion and others added 2 commits June 26, 2026 10:36
Write or rewrite a technical, SEO-oriented README.md for all 29 example
projects plus the four Python derivation subprojects, and rebuild the root
README as a categorized index linking every example. Standardize the
product name as "Estuary" throughout (no "Estuary Flow"/"Flow").

Each README documents the architecture in Estuary terms (capture ->
collection -> materialization/derivation), what's included, prerequisites,
copy-pasteable setup, and how to configure the capture/materialization via
the dashboard or flowctl, with links to the relevant connector docs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

- dekaf-kcat: point consume.sh at the live Dekaf host (dekaf.fly.dev was retired)
- postgres-cloudsql: add sqlalchemy and python-dotenv (imported by datagen.py)
- streaming-lakehouse: add python-dotenv (imported by main.py)
- shipments-stateful derivation: match the generator's "Out For Delivery" status casing
- shipments_eta: pass MongoDB env vars through docker-compose instead of dead localhost/mongo defaults
- shipments_eta: eta.sql uses arrayStringConcat on the flattened delays__reason column

Sync the affected READMEs to the corrected behavior.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
danthelion requested a review from aeluce on June 26, 2026 14:00