Spec: bytesized.co - OpenAI Image Generation using SwiftWASM + Hummingbird

1. Objective

Implement a web app where:

  • The page loads in the browser as a SwiftWASM app.
  • The app automatically requests an image on the home page, article pages, and paginated archive pages.
  • A same-session revisit of the same page reuses that page's last returned image from client-side session storage when available.
  • The backend persists a stable per-page image key so repeat requests for the same page and request country reuse the existing image instead of generating a new one.
  • When the daily generation budget is exhausted, the backend returns a random previously generated image instead of requesting a new one.
  • The backend waits for image generation to finish before replying.
  • The final image is rendered from a public S3 HTTPS URL.
  • The image is generated by the OpenAI image generation API using the fixed model gpt-image-1.5.
  • The generation prompt should use the request's origin country when the server can resolve it from the client IP through country.is, and otherwise fall back to a generic worldwide prompt.
  • The backend performs country.is lookups and OpenAI image-generation requests with AsyncHTTPClient.
  • The backend uses a long-lived Hummingbird server.

2. System Overview

2.1 Components

  • Frontend: SwiftWASM app using Parcel for typed HTTP JSON requests in the browser.
  • Backend: Hummingbird server exposing a single synchronous POST endpoint.
  • Storage/delivery: A dedicated public S3 bucket exposing generated PNG objects under a dedicated prefix.

2.2 High-Level Flow

  1. The browser loads the SwiftWASM bundle and page HTML.
  2. The app reads the page context from the mount element.
  3. If session storage already contains an image URL for the same page path and page type, the app reuses that URL and skips the API call.
  4. Otherwise, the app calls POST <API_URL> with page context.
  5. The server derives the client IP from proxy forwarding headers, preferring X-Real-IP when present and otherwise falling back to X-Forwarded-For, then looks up the origin country with country.is.
  6. The server validates input and checks for a stable page-cache key derived from page context and resolved country before considering a new generation.
  7. If the page-cache key already exists, the server returns that image immediately.
  8. Otherwise, the server counts generated PNG objects already present under the current UTC day prefix in S3 to decide whether its soft daily generation budget has remaining capacity.
  9. If budget remains, the server creates a fresh unique dated image key, builds a country-aware prompt when country lookup succeeded, calls OpenAI, uploads the PNG to S3, writes the same image to the stable page-cache key, and returns 200 OK.
  10. If the daily budget is exhausted, the server selects a random existing generated PNG from S3, copies it to the stable page-cache key, and returns 200 OK with that page-cache image instead.
  11. The app swaps the placeholder image source to the returned or cached URL.
  12. On successful API responses, the app stores the returned image URL in session storage for future visits to the same page in the current browser session.
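
Steps 6 through 10 reduce to one small decision, which can be sketched as a pure function. This is an illustration only: the names GenerateAction and decideAction are hypothetical, and the daily budget is passed in as a parameter because this spec does not fix a value.

```swift
import Foundation

/// Possible outcomes of one generate request, mirroring steps 6-10 above.
/// All names in this sketch are hypothetical.
enum GenerateAction: Equatable {
    case returnCached        // the stable page-cache object already exists
    case generateFresh       // budget remains: call OpenAI, upload, write page cache
    case copyFallback        // budget exhausted: copy a random existing image
    case serviceUnavailable  // budget exhausted and nothing to fall back to (503)
}

/// Chooses the action for a request from facts the server has already gathered.
/// `dailyBudget` is an assumed configuration value, not one defined by the spec.
func decideAction(
    pageCacheExists: Bool,
    generatedTodayCount: Int,
    dailyBudget: Int,
    fallbackAvailable: Bool
) -> GenerateAction {
    // A page-cache hit always wins, before the budget is even considered.
    if pageCacheExists { return .returnCached }
    // The budget is "soft": it is a count of today's PNG objects, not a lock.
    if generatedTodayCount < dailyBudget { return .generateFresh }
    return fallbackAvailable ? .copyFallback : .serviceUnavailable
}
```

Keeping the decision free of I/O means each branch can be unit tested without touching S3 or OpenAI.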

2.3 Published SwiftWASM Assets

  • The BytesizedCafe SwiftWASM package is built into the repo-root bytesized-cafe-app/ directory.
  • The site generator publishes that directory at /bytesized-cafe-app/.
  • Published asset paths must preserve the generated nested package layout, including /bytesized-cafe-app/platforms/browser.js.

3. S3 Design

3.1 Canonical Public Origin

The backend derives the public image origin from GENERATED_IMAGES_BUCKET and AWS_REGION as: https://<generated-images-bucket>.s3.<region>.amazonaws.com
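
As a minimal sketch of that derivation, the two helpers below build the origin and a full object URL. The function names are hypothetical, and the sketch assumes object keys contain only URL-safe characters, so no percent-encoding is applied.

```swift
import Foundation

/// Builds the public origin from the bucket name and region,
/// following the virtual-hosted-style format given above.
func publicOrigin(bucket: String, region: String) -> String {
    "https://\(bucket).s3.\(region).amazonaws.com"
}

/// Joins the origin and an object key into a public URL.
/// Assumption: the key needs no percent-encoding.
func publicURL(bucket: String, region: String, key: String) -> String {
    "\(publicOrigin(bucket: bucket, region: region))/\(key)"
}
```

For example, publicURL(bucket: "example-bucket", region: "us-east-1", key: "generated/v2/a.png") yields https://example-bucket.s3.us-east-1.amazonaws.com/generated/v2/a.png.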

3.2 Bucket Access Model

  • Keep S3 Object Ownership set to Bucket owner enforced.
  • Keep object ACLs disabled.
  • Public read comes from bucket policy, not object ACLs.
  • Store generated images in a dedicated public S3 bucket separate from the static site bucket.
  • Keep generated images under IMAGE_GEN_PREFIX.
  • Grant anonymous s3:GetObject on arn:aws:s3:::<generated-images-bucket>/<IMAGE_GEN_PREFIX>/*.
  • Do not upload with public-read ACLs.

3.3 Object Key Format

  • Freshly generated image:
    • {IMAGE_GEN_PREFIX}/{YYYY}/{MM}/{DD}/{UUID}-{country-slug}.png when the request country is known
    • {IMAGE_GEN_PREFIX}/{YYYY}/{MM}/{DD}/{UUID}.png when the request country is not known
  • Stable page-cache image:
    • {IMAGE_GEN_PREFIX}/page-cache/{pageType}/{normalized-page-path}-{country-slug}.png when the request country is known
    • {IMAGE_GEN_PREFIX}/page-cache/{pageType}/{normalized-page-path}-anywhere.png when the request country is not known
  • Random fallback image:
    • Prefer an existing PNG under IMAGE_GEN_PREFIX/ whose key ends in the current request's -{country-slug}.png
    • Fall back to any existing PNG under IMAGE_GEN_PREFIX/ when no country-matching image is available

Rules:

  • Fresh generation keys must not be derived from page context.
  • Stable page-cache keys must be derived from page context and resolved country.
  • API responses should prefer the stable page-cache key whenever one exists or is created during the request.
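
The key formats and rules above can be sketched as pure functions. The spec does not define how the country slug or the normalized page path are computed, so the rules used here are assumptions, and all function names are hypothetical. The slug is the lowercased country name with non-alphanumeric runs collapsed to hyphens. The page path has its leading and trailing slashes trimmed, and an empty path becomes index.

```swift
import Foundation

/// Assumed slug rule: lowercase, non-alphanumeric runs become single hyphens.
func countrySlug(_ name: String) -> String {
    name.lowercased()
        .components(separatedBy: CharacterSet.alphanumerics.inverted)
        .filter { !$0.isEmpty }
        .joined(separator: "-")
}

/// Assumed normalization: trim slashes; the root path maps to "index".
func normalizedPagePath(_ path: String) -> String {
    let trimmed = path.trimmingCharacters(in: CharacterSet(charactersIn: "/"))
    return trimmed.isEmpty ? "index" : trimmed
}

/// Stable page-cache key, derived only from page context and resolved country.
func pageCacheKey(prefix: String, pageType: String, pagePath: String, countryName: String?) -> String {
    let suffix = countryName.map(countrySlug) ?? "anywhere"
    return "\(prefix)/page-cache/\(pageType)/\(normalizedPagePath(pagePath))-\(suffix).png"
}

/// Left-pads a non-negative number with zeros to the given width.
func zeroPadded(_ value: Int, width: Int) -> String {
    let digits = String(value)
    return String(repeating: "0", count: max(0, width - digits.count)) + digits
}

/// Fresh generation key: dated by UTC day and never derived from page context.
func freshKey(prefix: String, date: Date, uuid: UUID, countryName: String?) -> String {
    var utc = Calendar(identifier: .gregorian)
    utc.timeZone = TimeZone(secondsFromGMT: 0)!
    let parts = utc.dateComponents([.year, .month, .day], from: date)
    let day = [
        zeroPadded(parts.year ?? 0, width: 4),
        zeroPadded(parts.month ?? 0, width: 2),
        zeroPadded(parts.day ?? 0, width: 2),
    ].joined(separator: "/")
    let suffix = countryName.map { "-" + countrySlug($0) } ?? ""
    return "\(prefix)/\(day)/\(uuid.uuidString)\(suffix).png"
}
```

Under these assumptions, a prefix of generated/v2, page type article, path /posts/example-article, and country France produce generated/v2/page-cache/article/posts/example-article-france.png.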

3.4 Object Metadata

When uploading a freshly generated image:

  • Content-Type: image/png
  • Cache-Control: public, max-age=31536000, immutable

3.5 Lifecycle Policy

Configure S3 Lifecycle expiration to prevent unbounded storage growth:

  • Expire objects under IMAGE_GEN_PREFIX/ after 30 days, provided that regenerating an image on a later cache miss is acceptable.

4. API Design

4.1 Routing

This API uses a single action endpoint:

  • POST /api/cafe/generate triggers generation or fallback selection.
  • OPTIONS /api/cafe/generate is handled by Hummingbird CORS middleware.

4.2 CORS

Enable CORS on the server endpoint for browser access:

  • Allowed methods: POST, OPTIONS
  • Allowed headers: Content-Type
  • Allowed origins: site origin(s) used to host the SwiftWASM app

5. API Contract

5.1 POST <API_URL>

Request JSON:

{
  "context": {
    "pagePath": "/posts/example-article",
    "pageType": "article"
  }
}

pageType must be one of:

  • index
  • article
  • archive

Response:

  • Status: 200 OK
  • Body:
{
  "url": "https://<public-base-domain>/generated/v2/page-cache/article/posts/example-article-france.png"
}

Rules:

  • url is the final public image URL and must use the generated-images bucket public origin.
  • url should point at the stable page-cache object whenever one already exists or is created during the request.
  • Return 200 only after the image has been uploaded successfully or a random fallback image has been selected successfully.
  • Invalid input returns 4xx.
  • If the daily budget is exhausted and no fallback image exists, return 503.
  • Terminal upstream failures return 5xx.
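
The request and response shapes above map naturally onto Codable types. The sketch below is one possible modeling, not the project's actual types. All type and function names are hypothetical, and the check that pagePath starts with a slash is an added assumption.

```swift
import Foundation

/// The three page types accepted by the endpoint.
enum PageType: String, Codable {
    case index, article, archive
}

struct PageContext: Codable, Equatable {
    let pagePath: String
    let pageType: PageType
}

struct GenerateRequest: Codable, Equatable {
    let context: PageContext
}

struct GenerateResponse: Codable, Equatable {
    let url: String
}

/// Decodes and validates a request body.
/// Returns nil for malformed JSON, a missing field, or an unknown pageType,
/// which the server would map to a 4xx response.
func parseRequest(_ body: Data) -> GenerateRequest? {
    guard let request = try? JSONDecoder().decode(GenerateRequest.self, from: body) else {
        return nil
    }
    // Assumption: page paths are absolute site paths.
    guard request.context.pagePath.hasPrefix("/") else { return nil }
    return request
}
```

Because PageType is a raw-value enum, an unrecognized pageType string fails decoding on its own, so no separate allow-list check is needed.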

6. Server Behavior

6.1 Hummingbird Server Responsibilities

  • Parse and validate input JSON.
  • Encapsulate S3 operations behind one S3ImageStore client object that owns the bucket configuration and AWS client lifecycle for image upload and lookup operations.
  • Resolve the client IP address by preferring X-Real-IP when present and otherwise falling back to X-Forwarded-For.
  • Look up the request origin country with https://api.country.is/{ip} and convert the returned region code into an English country name when available.
  • Derive a stable page-cache key from page context and resolved country, and return it immediately when that object already exists in S3.
  • Check the soft daily generation budget by counting PNG objects already present under the current UTC date prefix in S3.
  • Build the public url.
  • When budget remains:
    • Generate a fresh unique image key.
    • Build a prompt that asks for a single random dish popular in the request country.
    • Include a normalized -{country-slug} suffix in the generated key when country lookup succeeds.
    • Instruct the model to prefer specific, visually distinct local dishes over generic national defaults, and to avoid repeatedly defaulting to globally common fast food unless it is genuinely the random choice.
    • When the client IP or country cannot be resolved, use the same prompt structure but ask for a dish from somewhere in the world.
    • Call the OpenAI image generation API with model gpt-image-1.5.
    • Upload the PNG to the generated image key used for the dated generation pool.
    • Upload the same PNG to the stable page-cache key.
    • Return the page-cache url.
  • When budget is exhausted:
    • Prefer a random existing generated PNG key from S3 whose key suffix matches the current request country.
    • Fall back to a random existing generated PNG key from S3 when no country-matching key is available.
    • Copy the selected fallback image to the stable page-cache key without calling OpenAI.
    • Return the page-cache url.
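
Two of the responsibilities above, client IP resolution and prompt construction, are easy to isolate as pure functions. The sketch below rests on several assumptions: the function names are hypothetical, the first X-Forwarded-For entry is treated as the client address, and the prompt wording is invented for illustration, since the spec describes the prompt's intent but not its text.

```swift
import Foundation

/// Resolves the client IP, preferring X-Real-IP and otherwise using X-Forwarded-For.
/// Assumption: when X-Forwarded-For lists several addresses, the first is the client.
func resolveClientIP(headers: [String: String]) -> String? {
    // HTTP header names are case-insensitive, so compare lowercased names.
    var normalized: [String: String] = [:]
    for (name, value) in headers { normalized[name.lowercased()] = value }

    if let realIP = normalized["x-real-ip"]?.trimmingCharacters(in: .whitespaces),
       !realIP.isEmpty {
        return realIP
    }
    if let forwarded = normalized["x-forwarded-for"],
       let first = forwarded.split(separator: ",").first {
        let candidate = String(first).trimmingCharacters(in: .whitespaces)
        if !candidate.isEmpty { return candidate }
    }
    return nil
}

/// Converts a region code such as "FR" into an English country name,
/// when Foundation's locale data knows the code.
func englishCountryName(forRegionCode code: String) -> String? {
    Locale(identifier: "en_US").localizedString(forRegionCode: code.uppercased())
}

/// Builds the image prompt. The wording is illustrative only.
func buildPrompt(countryName: String?) -> String {
    let place = countryName.map { "in \($0)" } ?? "somewhere in the world"
    return "A photo of a single random dish popular \(place). "
        + "Prefer a specific, visually distinct local dish over a generic national default, "
        + "and avoid globally common fast food unless it is genuinely the random choice."
}
```

X-Forwarded-For can be set by the client, so the first entry is only trustworthy when the fronting proxy overwrites or sanitizes the header. Preferring X-Real-IP, as the spec does, reduces that exposure.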

7. Frontend Behavior

  • Show a loading placeholder immediately.
  • Read page context from the mount element.
  • If session storage contains a URL for the same page path and page type, reuse that URL and skip the API call.
  • Otherwise, start a single POST request to the configured API URL.
  • When the request succeeds, swap the placeholder image source to the returned url.
  • Persist the returned image URL in session storage keyed to the current page so the next same-session visit of that page can reuse it.
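
The reuse-then-fetch behavior above can be sketched independently of the browser. In the real app the store would wrap the browser's sessionStorage. Here a hypothetical SessionStore protocol with an in-memory implementation stands in for it, the storage key format is an assumption, and the fetch closure is synchronous to keep the example short.

```swift
import Foundation

/// Minimal stand-in for the browser's sessionStorage (hypothetical abstraction).
protocol SessionStore: AnyObject {
    func value(forKey key: String) -> String?
    func set(_ value: String, forKey key: String)
}

/// In-memory store, useful for tests and for this sketch.
final class InMemorySessionStore: SessionStore {
    private var values: [String: String] = [:]
    func value(forKey key: String) -> String? { values[key] }
    func set(_ value: String, forKey key: String) { values[key] = value }
}

/// Assumed storage key: combines page type and page path so both must match.
func sessionKey(pageType: String, pagePath: String) -> String {
    "bytesized-cafe:\(pageType):\(pagePath)"
}

/// Returns the cached URL when present; otherwise calls `fetch` once
/// and persists the result for later same-session visits.
func resolveImageURL<Store: SessionStore>(
    pageType: String,
    pagePath: String,
    store: Store,
    fetch: () throws -> String
) rethrows -> String {
    let key = sessionKey(pageType: pageType, pagePath: pagePath)
    if let cached = store.value(forKey: key) { return cached }
    let url = try fetch()
    // Only successful responses reach this line, so failures are never cached.
    store.set(url, forKey: key)
    return url
}
```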

8. Environment Variables

8.1 Hummingbird Server

  • GENERATED_IMAGES_BUCKET
  • OPENAI_API_KEY
  • OPENAI_IMAGE_MODEL
  • IMAGE_GEN_PREFIX
  • AWS_REGION
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • HOST
  • PORT

Local repo tooling may provide BACKEND_HOST and BACKEND_PORT as aliases for the backend runtime HOST and PORT values.
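
One way to read these variables at startup is a loader that fails fast when a required value is missing. This is a sketch under assumptions: ServerConfig and loadConfig are hypothetical names, the HOST and PORT defaults are illustrative, and the AWS credential variables are left out on the assumption that the AWS client reads them itself.

```swift
import Foundation

/// Hypothetical container for the backend's runtime settings.
struct ServerConfig: Equatable {
    let bucket: String
    let openAIKey: String
    let imageModel: String
    let prefix: String
    let region: String
    let host: String
    let port: Int
}

enum ConfigError: Error, Equatable {
    case missing(String)
    case invalidPort(String)
}

/// Loads settings from an environment dictionary, failing on the first missing value.
func loadConfig(from env: [String: String]) throws -> ServerConfig {
    func required(_ name: String) throws -> String {
        guard let value = env[name], !value.isEmpty else { throw ConfigError.missing(name) }
        return value
    }
    // Assumed defaults; the spec lists HOST and PORT but not their fallbacks.
    let portText = env["PORT"] ?? "8080"
    guard let port = Int(portText) else { throw ConfigError.invalidPort(portText) }
    return ServerConfig(
        bucket: try required("GENERATED_IMAGES_BUCKET"),
        openAIKey: try required("OPENAI_API_KEY"),
        imageModel: try required("OPENAI_IMAGE_MODEL"),
        prefix: try required("IMAGE_GEN_PREFIX"),
        region: try required("AWS_REGION"),
        host: env["HOST"] ?? "0.0.0.0",
        port: port
    )
}
```

At startup this would typically be called as loadConfig(from: ProcessInfo.processInfo.environment). Taking the dictionary as a parameter keeps the loader testable.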

8.2 Site Build

  • BYTESIZED_CAFE_API_URL

8.3 Site Deploy

  • AWS_S3_BUCKET
  • CLOUDFRONT_DISTRIBUTION_ID

9. Validation

The implementation is considered complete when:

  • A same-session revisit of the same page reuses the last returned image URL from session storage without making a new backend request.
  • A backend request for a page that already has a stable page-cache object returns that existing image URL without making a new OpenAI request.
  • The backend returns 200 only after a fresh image upload succeeds or a random fallback image has been selected.
  • When the daily budget is exhausted, the backend returns a random existing generated image instead of making a new OpenAI request.
  • Fresh generations use the request origin country in the prompt when the server can resolve it from the client IP, and otherwise fall back to the generic worldwide prompt.
  • Fresh generations include a country slug suffix in the image key when the request country is known.
  • When the daily budget is exhausted, fallback selection prefers existing images whose keys match the current request country and otherwise falls back to any existing image.
  • The backend persists deterministic per-page cache keys separately from the dated generation pool.
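
The fallback-preference criterion is easiest to verify against a pure selection function. The sketch below assumes the server has already listed the existing object keys. The function name is hypothetical, and the random source is injected so tests can control it.

```swift
import Foundation

/// Picks a fallback key, preferring PNGs whose key ends in "-{country-slug}.png"
/// and otherwise choosing any existing PNG. Returns nil when no PNG exists,
/// which corresponds to the 503 case.
func pickFallbackKey<Generator: RandomNumberGenerator>(
    from existingKeys: [String],
    countrySlug: String?,
    using generator: inout Generator
) -> String? {
    let pngKeys = existingKeys.filter { $0.hasSuffix(".png") }
    if let slug = countrySlug {
        let matching = pngKeys.filter { $0.hasSuffix("-\(slug).png") }
        if let pick = matching.randomElement(using: &generator) {
            return pick
        }
    }
    // No country match (or no country): any existing PNG will do.
    return pngKeys.randomElement(using: &generator)
}
```

Plain suffix matching has a caveat: a slug such as guinea also matches keys ending in -papua-new-guinea.png. A stricter implementation might compare the full slug that follows the UUID.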

10. Deployment

10.1 Container Build

  • The backend container image is built from Backend/ using the checked-in Backend/Dockerfile.
  • The backend container build and runtime stages pin the official swift:6.3.0-bookworm and swift:6.3.0-bookworm-slim images.
  • The checked-in Backend/railway.toml codifies the Railway deploy settings that should live in source control, currently the Dockerfile builder and /health healthcheck.
  • The deployable product is the Server executable.
  • Railway builds and runs the production image from GitHub pushes, targeting the backend service with railway up Backend --ci --path-as-root.
  • Deployment config changes are validated with just validate-deployment, which delegates to ./Scripts/validate-deployment-config.sh to build the Docker image and validate workflow YAML parsing.

10.2 Railway Infrastructure

  • Railway hosts the public backend service, injects runtime environment variables, and exposes a healthchecked HTTPS endpoint for the Server container.
  • The Railway service should define a stable custom domain so the static site can build against a fixed BYTESIZED_CAFE_API_URL.
  • The GitHub Actions workflow under .github/workflows/deploy.yml is the production deployment path and is intended to run on pushes to the primary deployment branch.
  • The backend deploy job authenticates with a Railway project token, synchronizes the backend runtime variables into Railway, and deploys the Backend/ directory directly to the configured Railway project, environment, and service.
  • Railway service-level GitHub autodeploy should be disabled when the GitHub Actions workflow is the active deployment path, to avoid duplicate backend deployments from the same push.
  • GitHub Actions repository variables and secrets are the source of truth for the backend runtime variables GENERATED_IMAGES_BUCKET, OPENAI_API_KEY, OPENAI_IMAGE_MODEL, IMAGE_GEN_PREFIX, AWS_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY.
  • The backend deploy workflow sets HOST=0.0.0.0 and PORT=8080 in Railway by default, unless the deploy job overrides RAILWAY_RUNTIME_HOST or RAILWAY_RUNTIME_PORT.
  • The site deploy job continues to sync Output/ to S3 using a fixed BYTESIZED_CAFE_API_URL.
  • Paginated archive links use the literal deployed object paths under /page/<n>/index.html because the production S3 and CloudFront setup does not rewrite clean directory URLs to nested index.html objects.
  • After the S3 sync completes, the site deploy job invalidates the production CloudFront distribution with CLOUDFRONT_DISTRIBUTION_ID for /, /index.html, /page/*, /posts/*, /feed.rss, /bytesized-cafe-app/*, /css/*, /images/*, and /fonts/*.

11. Local Development

  • A repo-root justfile provides the primary entry point for common local tasks such as just wasm, just site, just site-local, just backend, and just local, along with deployment-oriented recipes like just site-release, just site-deploy, and just validate-deployment.
  • The repo's Swift package manifests target Swift tools version 6.3, the macOS GitHub Actions job installs Swift 6.3.0, and the SwiftWasm site build uses the compatible swift-6.3-RELEASE SDK tag.
  • Scripts/run-local.sh provides a one-command local stack for development and opens the local site in the default browser after the backend and static site server are ready.
  • The script rebuilds the BytesizedCafe SwiftWASM bundle, regenerates the site with BYTESIZED_CAFE_API_URL pointed at a localhost backend, prebuilds the backend to avoid counting SwiftPM compilation against the startup timeout, starts the Hummingbird server, and serves Output/ over a local static HTTP server.