WIP: US Python backend (latency spike — do not merge)#54
Open
SakshiKekre wants to merge 2 commits into
- USPolicyEnginePythonBackend in model_backends.py: mirrors the UK Python backend, swaps to policyengine_us, capabilities() lists US variables and parameter roots, prompt notes the state_code requirement.
- Add policyengine_us to backend/requirements.txt.
- Frontend label map: UK (Compiled) / UK (Python) / US (Python).

Known gaps deferred:
- reference.md is still UK-compiled-only, so Claude sees UK API docs on the US backend. Acceptable for the initial latency smoke test.
- System prompt still says "British English" and the title-generation route still calls itself "a UK tax and benefit policy assistant".
- Modal region is "eu", so US response latency will reflect the transatlantic hop, not a US-optimised deploy.
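To make the state_code requirement concrete, here is a minimal sketch of the kind of household input a US simulation needs. It is not taken from the diff: the helper names `build_us_situation` and `federal_income_tax`, the entity labels, and the age of 40 are all hypothetical, and the situation layout is assumed to follow the usual policyengine_us household format.

```python
from typing import Any, Dict


def build_us_situation(wages: float, state_code: str, year: int = 2024) -> Dict[str, Any]:
    """Build a single-adult household (hypothetical helper, not from the PR).

    Entity labels such as "adult" and "household" are arbitrary names.
    """
    period = str(year)
    return {
        "people": {
            "adult": {
                "age": {period: 40},  # assumed age, for illustration only
                "employment_income": {period: wages},
            }
        },
        "families": {"family": {"members": ["adult"]}},
        "marital_units": {"marital_unit": {"members": ["adult"]}},
        "spm_units": {"spm_unit": {"members": ["adult"]}},
        "tax_units": {"tax_unit": {"members": ["adult"]}},
        "households": {
            "household": {
                "members": ["adult"],
                # The US model needs a state; the UK model has no equivalent input.
                "state_code": {period: state_code},
            }
        },
    }


def federal_income_tax(situation: Dict[str, Any], year: int = 2024) -> float:
    """Run the situation through policyengine_us (assumed to be installed)."""
    # Imported lazily so the builder above works without the package.
    from policyengine_us import Simulation

    sim = Simulation(situation=situation)
    return float(sim.calculate("income_tax", str(year))[0])
```

Calling `federal_income_tax(build_us_situation(30_000, "CA"))` would run one such household. No result is quoted here because the sketch was not run against the preview deploy.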
Smoke test findings
Tested on preview: https://policyengine-uk-chat-git-feat-us-backend-policy-engine.vercel.app
What works ✅
Verification — tight prompt:
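For reference, the hand calculation used for this check can be written out as a short sketch. The $30,000 gross income and single filing status are assumptions inferred from the quoted figures ($15,400 taxable plus the $14,600 standard deduction). The $47,150 top of the 12% bracket is the 2024 single-filer threshold and is not quoted in this comment. The function name is hypothetical.

```python
STANDARD_DEDUCTION = 14_600  # figure quoted in the comment (2024, single filer assumed)

# (upper bound of bracket, marginal rate). Only the two brackets the comment uses.
BRACKETS = [(11_600, 0.10), (47_150, 0.12)]


def hand_calc_income_tax(gross_income: float) -> float:
    """Federal income tax by hand for low incomes (hypothetical helper)."""
    taxable = max(gross_income - STANDARD_DEDUCTION, 0)
    if taxable > BRACKETS[-1][0]:
        raise ValueError("sketch only covers the 10% and 12% brackets")
    tax = 0.0
    lower = 0
    for upper, rate in BRACKETS:
        if taxable <= lower:
            break
        # Tax only the slice of taxable income that falls inside this bracket.
        tax += (min(taxable, upper) - lower) * rate
        lower = upper
    return round(tax, 2)


print(hand_calc_income_tax(30_000))  # 1616.0
```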
Returned $1,616.00, which matches the hand calculation exactly (standard deduction $14,600 → taxable income $15,400 → 10% × $11,600 + 12% × $3,800).
What doesn't work yet
Purpose
Spike to measure how long US simulations take in the chat interface. Not for merge — exists to deploy a preview where we can run timed prompts against the US backend.
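One way to collect such timings is sketched below. This is an illustration, not the harness used for the spike: `time_calls` is a hypothetical helper, and the callable passed to it stands in for whatever sends a prompt to the preview and waits for the full response.

```python
import statistics
import time
from typing import Callable, Dict


def time_calls(fn: Callable[[], object], runs: int = 3) -> Dict[str, float]:
    """Call fn `runs` times and summarise wall-clock latency in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic clock, suited to short intervals
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "min_s": min(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


# Stand-in workload; replace the lambda with a real request to the preview.
print(time_calls(lambda: time.sleep(0.01)))
```

Reporting the median alongside min and max keeps a single cold-start outlier from dominating the comparison between backends.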
Branches off PR #51 (feat/model-backend-selector), so it already includes the backend-selector and scenario_context plumbing.
What's in the diff
- USPolicyEnginePythonBackend in backend/model_backends.py, mirrors the UK Python backend
- policyengine_us added to backend/requirements.txt
Known gaps deferred for the latency test
- reference.md is UK-compiled-only. Claude sees UK API docs when the US backend is selected, so it will write some wrong code on the first attempt. That is part of what we want to measure (recovery latency).
- Modal region is eu, so US response latency will reflect the transatlantic hop, not a US-optimised deploy.
Latency numbers to capture
- uk_python
What this PR does NOT do