v0.6.36: new chunkers, sockets state machine, google sheets/drive/calendar triggers, docs updates, integrations/models pages improvements #4106

Merged: waleedlatif1 merged 11 commits into main from staging on Apr 11, 2026

@waleedlatif1
Collaborator

waleedlatif1 and others added 11 commits April 9, 2026 23:43
#4081)

* feat(trigger): add Google Sheets, Drive, and Calendar polling triggers

Add polling triggers for Google Sheets (new rows), Google Drive (file
changes via changes.list API), and Google Calendar (event updates via
updatedMin). Each includes OAuth credential support, configurable
filters (event type, MIME type, folder, search term, render options),
idempotency, and first-poll seeding. Wire triggers into block configs
and regenerate integrations.json. Update add-trigger skill with polling
instructions and versioned block wiring guidance.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(polling): address PR review feedback for Google polling triggers

- Fix Drive cursor stall: use nextPageToken as resume point when
  breaking early from pagination instead of re-using the original token
- Eliminate redundant Drive API call in Sheets poller by returning
  modifiedTime from the pre-check function
- Add 403/429 rate-limit handling to Sheets API calls matching the
  Calendar handler pattern
- Remove unused changeType field from DriveChangeEntry interface
- Rename triggers/google_drive to triggers/google-drive for consistency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(polling): fix Drive pre-check never activating in Sheets poller

isDriveFileUnchanged short-circuited when lastModifiedTime was
undefined, never calling the Drive API — so currentModifiedTime
was never populated, creating a permanent chicken-and-egg loop.
Now always calls the Drive API and returns the modifiedTime
regardless of whether there's a previous value to compare against.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore(lint): fix import ordering in triggers registry

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(polling): address PR review feedback for Google polling handlers

- Fix fetchHeaderRow to throw on 403/429 rate limits instead of silently
  returning empty headers (prevents rows from being processed without
  headers and lastKnownRowCount from advancing past them permanently)
- Fix Drive pagination to avoid advancing resume cursor past sliced
  changes (prevents permanent change loss when allChanges > maxFiles)
- Remove unused logger import from Google Drive trigger config

* fix(polling): prevent data loss on partial row failures and harden idempotency key

- Sheets: only advance lastKnownRowCount by processedCount when there
  are failures, so failed rows are retried on the next poll cycle
  (idempotency deduplicates already-processed rows on re-fetch)
- Drive: add fallback for change.time in idempotency key to prevent
  key collisions if the field is ever absent from the API response
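
The Sheets half of this fix can be sketched roughly as follows. This is a minimal illustration, not the handler's actual code: `nextRowCount` and its parameters are hypothetical names, and it assumes `processedCount` counts rows that were handled successfully before the first failure.

```typescript
// Hypothetical helper: decide how far to advance the stored row count.
// Assumption: processedCount is the number of rows handled successfully.
function nextRowCount(
  lastKnownRowCount: number,
  fetchedRowCount: number,
  processedCount: number,
  failedCount: number
): number {
  if (failedCount > 0) {
    // Advance only past the rows that succeeded so the failed rows are
    // fetched again on the next poll; idempotency dedupes any repeats.
    return lastKnownRowCount + processedCount
  }
  // No failures: every fetched row is accounted for.
  return lastKnownRowCount + fetchedRowCount
}
```

With 10 known rows and 5 new rows, a clean poll moves the count to 15, while a poll with 3 successes and 2 failures moves it only to 13.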

* fix(polling): remove unused variable and preserve lastModifiedTime on Drive API failure

- Remove unused `now` variable from Google Drive polling handler
- Preserve stored lastModifiedTime when Drive API pre-check fails
  (previously wrote undefined, disabling the optimization until the
  next successful Drive API call)
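
Taken together with the earlier pre-check fix, the intended behavior can be sketched as below. The function name, result shape, and the synchronous `fetchModifiedTime` callback are assumptions made for illustration; the real handler presumably awaits a Drive API request instead.

```typescript
// Hypothetical result shape for the Drive pre-check.
interface PreCheckResult {
  unchanged: boolean
  // Value to persist as lastModifiedTime for the next poll.
  nextModifiedTime: string | undefined
}

// fetchModifiedTime stands in for the Drive metadata call and may throw.
function preCheckDriveFile(
  fetchModifiedTime: () => string,
  lastModifiedTime: string | undefined
): PreCheckResult {
  let current: string
  try {
    // Always call the API, even when there is no stored value yet,
    // so the first poll can seed lastModifiedTime.
    current = fetchModifiedTime()
  } catch {
    // On failure keep the stored value instead of overwriting it with undefined.
    return { unchanged: false, nextModifiedTime: lastModifiedTime }
  }
  return {
    unchanged: lastModifiedTime !== undefined && current === lastModifiedTime,
    nextModifiedTime: current,
  }
}
```

Treating a failed pre-check as "changed" means the poller falls through to the full fetch rather than skipping work it cannot prove is unnecessary.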

* fix(polling): don't advance state when all events fail across sheets, calendar, drive handlers

* fix(polling): retry failed idempotency keys, fix drive cursor overshoot, fix calendar inclusive updatedMin

* fix(polling): revert calendar timestamp on any failure, not just all-fail

* fix(polling): revert drive cursor on any failure, not just all-fail

* feat(triggers): add canonical selector toggle to google polling triggers

- Add 'trigger-advanced' mode to SubBlockConfig so canonical pairs work in trigger mode
- Fix buildCanonicalIndex: trigger-mode subblocks don't overwrite non-trigger basicId, deduplicate advancedIds from block spreads
- Update editor, subblock layout, and trigger config aggregation to include trigger-advanced subblocks
- Replace dropdown+fetchOptions in Calendar/Sheets/Drive pollers with file-selector (basic) + short-input (advanced) canonical pairs
- Add canonicalParamId: 'oauthCredential' to triggerCredentials for selector context resolution
- Update polling handlers to read canonical fallbacks (calendarId||manualCalendarId, etc.)
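
The canonical fallback in the last bullet amounts to reading the basic (selector) value first and the advanced (manual) value second. A minimal sketch, assuming a config shape with just these two fields; `resolveCalendarId` is a hypothetical helper name:

```typescript
// Assumed config shape: selector value (basic) and manual input (advanced).
interface CalendarTriggerConfig {
  calendarId?: string
  manualCalendarId?: string
}

// Prefer the selector value; fall back to the manually entered one.
// `||` is used on purpose so an empty string counts as "not set".
function resolveCalendarId(config: CalendarTriggerConfig): string | undefined {
  return config.calendarId || config.manualCalendarId || undefined
}
```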

* test(blocks): handle trigger-advanced mode in canonical validation tests

* fix(triggers): handle trigger-advanced mode in deploy, preview, params, and copilot

* fix(polling): use position-only idempotency key for sheets rows

* fix(polling): don't advance calendar timestamp to client clock on empty poll

* fix(polling): remove extraneous comment from calendar poller

* fix(polling): drive cursor stall on full page, calendar latestUpdated past filtered events

* fix(polling): advance calendar cursor past fully-filtered event batches

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(ui): allow multiselect in resource tabs

* Fix bugs with deselection

* Try catch resource tab deletion independently

* Fix chat switch selection

* Default to null active id

---------

Co-authored-by: Theodore Li <theo@sim.ai>
…sheet selectors (#4097)

* fix(trigger): show selector display names on canvas for trigger file/sheet selectors

* fix(trigger): use isNonEmptyValue in canonical member scan to match visibility contract
The Forms API has a different base URL for OAuth vs Basic Auth.
Per Atlassian support, OAuth requires the /ex/jira/{cloudId}/forms
pattern, not /jira/forms/cloud/{cloudId} which only works with
Basic Auth. This was causing 401 Unauthorized errors.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
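
For illustration, the two URL patterns could be selected like this. The helper name and the `https://api.atlassian.com` host are assumptions; only the two path patterns come from the commit message.

```typescript
type JiraAuthMode = 'oauth' | 'basic'

// Assumption: both patterns are served from the Atlassian API gateway host.
const ATLASSIAN_API_HOST = 'https://api.atlassian.com'

// Hypothetical helper that picks the Forms API base URL by auth mode.
function jiraFormsBaseUrl(auth: JiraAuthMode, cloudId: string): string {
  return auth === 'oauth'
    ? `${ATLASSIAN_API_HOST}/ex/jira/${cloudId}/forms`
    : `${ATLASSIAN_API_HOST}/jira/forms/cloud/${cloudId}`
}
```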
…e dropdowns (#4096)

* fix(ui): support Tab key to select items in tag, env-var, and resource dropdowns

* fix(ui): support Tab key to select items in tag, env-var, and resource dropdowns

* fix(ui): guard Tab selection against Shift+Tab and undefined index
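
The guard described in the last commit might look roughly like the sketch below. `shouldSelectOnTab` and the `KeyLike` shape are hypothetical; the real dropdowns presumably receive a React keyboard event.

```typescript
// Minimal stand-in for the parts of a keyboard event the guard needs.
interface KeyLike {
  key: string
  shiftKey: boolean
}

// Returns true only when Tab should commit the highlighted item.
function shouldSelectOnTab(
  event: KeyLike,
  selectedIndex: number | undefined,
  itemCount: number
): boolean {
  // Shift+Tab keeps its normal "focus previous element" meaning.
  if (event.key !== 'Tab' || event.shiftKey) return false
  // Nothing highlighted, or a stale index: let Tab behave normally.
  return selectedIndex !== undefined && selectedIndex >= 0 && selectedIndex < itemCount
}
```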
* fix(doc): Update byok docs section

* Update cost page with new byok providers

* Add translated sections

---------

Co-authored-by: Theodore Li <theo@sim.ai>
…kew, and stale config clearing (#4101)

* fix(trigger): fix polling trigger config defaults, row count, clock-skew, and stale config clearing

* fix(deploy): track first-pass fills to prevent stale baseConfig bypassing required-field validation

Use a dedicated `filledSubBlockIds` Set populated during the first pass so the second-pass skip guard is based solely on live `getConfigValue` results, not on stale entries spread from `baseConfig` (`triggerConfig`).

* fix(trigger): prevent calendar cursor regression when all events are filtered client-side
* improvement(sockets): workflow switching state machine

* address comments
* improvement(integrations, models): ui/ux

* fix(models, integrations): dedup ChevronArrow/provider colors, fix UTC date rendering

- Extract PROVIDER_COLORS and getProviderColor to model-colors.ts to eliminate
  identical definitions in model-comparison-charts and model-timeline-chart
- Remove duplicate private ChevronArrow from integration-card; import the
  exported one from model-primitives instead
- Add timeZone: 'UTC' to formatShortDate so ISO date-only strings (parsed as
  UTC midnight) render the correct calendar day in all timezones
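
The UTC fix can be illustrated with a small sketch. The `en-US` locale and the exact format options are assumptions; the point is only that a date-only ISO string is parsed as UTC midnight, so it must also be formatted in UTC to avoid slipping to the previous day in timezones west of UTC.

```typescript
// Sketch of a short-date formatter; locale and options are assumed.
function formatShortDate(isoDate: string): string {
  // new Date('2026-04-11') is 2026-04-11T00:00:00Z. Formatting it in the
  // viewer's local zone could render Apr 10; pinning UTC keeps the day stable.
  return new Date(isoDate).toLocaleDateString('en-US', {
    month: 'short',
    day: 'numeric',
    year: 'numeric',
    timeZone: 'UTC',
  })
}
```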

* refactor(models): rename model-colors.ts to consts.ts

* improvement(models): derive provider colors/resellers from definitions, reorient FAQs to agent builder

Dynamic data:
- Add `color` and `isReseller` fields to ProviderDefinition interface
- Move brand colors for all 10 providers into their definitions
- Mark 6 reseller providers (Azure, Bedrock, Vertex, OpenRouter, Fireworks)
- consts.ts now derives color map from MODEL_CATALOG_PROVIDERS
- model-comparison-charts derives RESELLER_PROVIDERS from catalog
- Fix deepseek name: Deepseek → DeepSeek; remove now-redundant
  PROVIDER_NAME_OVERRIDES and getProviderDisplayName from utils
- Add color/isReseller fields to CatalogProvider; clean up duplicate
  providerDisplayName in searchText array
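
As a rough sketch of the "derive from definitions" idea: the field names `color` and `isReseller` come from the commit message, while the helper names, the `id` and `name` fields, and the sample entries with their color values are placeholders rather than the catalog's real data.

```typescript
// Assumed subset of the provider definition shape.
interface ProviderDefinition {
  id: string
  name: string
  color: string
  isReseller?: boolean
}

// Build an id -> color map from the definitions instead of hard-coding it.
function buildProviderColorMap(providers: ProviderDefinition[]): Record<string, string> {
  const colors: Record<string, string> = {}
  for (const provider of providers) colors[provider.id] = provider.color
  return colors
}

// Collect the ids of providers flagged as resellers.
function getResellerIds(providers: ProviderDefinition[]): string[] {
  return providers.filter((p) => p.isReseller === true).map((p) => p.id)
}

// Placeholder entries for illustration only; colors are not the real brand values.
const sampleProviders: ProviderDefinition[] = [
  { id: 'deepseek', name: 'DeepSeek', color: '#111111' },
  { id: 'openrouter', name: 'OpenRouter', color: '#222222', isReseller: true },
]
```

Keeping the color and reseller flag on each definition means adding a provider touches one place, and the charts pick it up without a second lookup table.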

FAQs:
- Replace all 4 main-page FAQs with 5 agent-builder-oriented ones
  covering model selection, context windows, pricing, tool use, and
  how to use models in a Sim agent workflow
- buildProviderFaqs: add conditional tool use FAQ per provider
- buildModelFaqs: add bestFor FAQ (conditional on field presence);
  improve context window answer to explain agent implications;
  tighten capabilities answer wording

* chore(models): remove model-colors.ts (superseded by consts.ts)

* update footer

---------

Co-authored-by: waleed <walif6@gmail.com>
…4102)

* feat(knowledge): add token, sentence, recursive, and regex chunkers

* fix(chunkers): standardize token estimation and use emcn dropdown

- Refactor all existing chunkers (Text, JsonYaml, StructuredData, Docs) to use shared utils
- Fix inconsistent token estimation (JsonYaml used tiktoken, StructuredData used /3 ratio)
- Fix DocsChunker operator precedence bug and hard-coded 300-token limit
- Fix JsonYamlChunker isStructuredData false positive on plain strings
- Add MAX_DEPTH recursion guard to JsonYamlChunker
- Replace @/components/ui/select with emcn DropdownMenu in strategy selector

* fix(chunkers): address research audit findings

- Expand RecursiveChunker recipes: markdown adds horizontal rules, code
  fences, blockquotes; code adds const/let/var/if/for/while/switch/return
- RecursiveChunker fallback uses splitAtWordBoundaries instead of char slicing
- RegexChunker ReDoS test uses adversarial strings (repeated chars, spaces)
- SentenceChunker abbreviation list adds St/Rev/Gen/No/Fig/Vol/months
  and single-capital-letter lookbehind
- Add overlap < maxSize validation in Zod schema and UI form
- Add pattern max length (500) validation in Zod schema
- Fix StructuredDataChunker footer grammar
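
The two validation rules in this list (overlap below max size, pattern at most 500 characters) are applied in the PR as Zod schema rules. The sketch below is a dependency-free stand-in that expresses the same checks; the field names and error strings are assumptions.

```typescript
// Assumed config shape for the chunking form.
interface ChunkingConfig {
  maxSize: number
  overlap: number
  pattern?: string
}

const MAX_PATTERN_LENGTH = 500

// Returns a list of human-readable problems; an empty list means valid.
function validateChunkingConfig(config: ChunkingConfig): string[] {
  const errors: string[] = []
  if (config.overlap >= config.maxSize) {
    errors.push('overlap must be smaller than maxSize')
  }
  if (config.pattern !== undefined && config.pattern.length > MAX_PATTERN_LENGTH) {
    errors.push(`pattern must be at most ${MAX_PATTERN_LENGTH} characters`)
  }
  return errors
}
```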

* fix(chunkers): fix remaining audit issues across all chunkers

- DocsChunker: extract headers from cleaned content (not raw markdown)
  to fix position mismatch between header positions and chunk positions
- DocsChunker: strip export statements and JSX expressions in cleanContent
- DocsChunker: fix table merge dedup using equality instead of includes
- JsonYamlChunker: preserve path breadcrumbs when nested value fits in
  one chunk, matching LangChain RecursiveJsonSplitter behavior
- StructuredDataChunker: detect 2-column CSV (lowered threshold from >2
  to >=1) and use 20% relative tolerance instead of absolute +/-2
- TokenChunker: use sliding window overlap (matching LangChain/Chonkie)
  where chunks stay within chunkSize instead of exceeding it
- utils: splitAtWordBoundaries accepts optional stepChars for sliding
  window overlap; addOverlap uses newline join instead of space
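
The sliding-window overlap described for TokenChunker can be sketched on a plain token array. This is an illustration of the technique, not the chunker's code: `slidingWindowChunks` is a hypothetical name, and real tokens would come from a tokenizer or an estimate rather than an array of items.

```typescript
// Each window holds at most chunkSize tokens and starts (chunkSize - overlap)
// tokens after the previous one, so overlap never pushes a chunk past chunkSize.
function slidingWindowChunks<T>(tokens: T[], chunkSize: number, overlap: number): T[][] {
  if (chunkSize < 1) throw new Error('chunkSize must be at least 1')
  // Guard against a zero or negative step, which would never make progress.
  const step = Math.max(1, chunkSize - overlap)
  const chunks: T[][] = []
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize))
    // Stop once a window has reached the end of the input.
    if (start + chunkSize >= tokens.length) break
  }
  return chunks
}
```

For ten tokens with `chunkSize` 4 and `overlap` 1, the windows start at positions 0, 3, and 6, and adjacent windows share exactly one token.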

* chore(chunkers): lint formatting

* updated styling

* fix(chunkers): audit fixes and comprehensive tests

- Fix SentenceChunker regex: lookbehinds now include the period to correctly handle abbreviations (Mr., Dr., etc.), initials (J.K.), and decimals
- Fix RegexChunker ReDoS: reset lastIndex between adversarial test iterations, add poisoned-suffix test strings
- Fix DocsChunker: skip code blocks during table boundary detection to prevent false positives from pipe characters
- Fix JsonYamlChunker: oversized primitive leaf values now fall back to text chunking instead of emitting a single chunk
- Fix TokenChunker: pass 0 to buildChunks for overlap metadata since sliding window handles overlap inherently
- Add defensive guard in splitAtWordBoundaries to prevent infinite loops if step is 0
- Add tests for utils, TokenChunker, SentenceChunker, RecursiveChunker, RegexChunker (236 total tests, 0 failures)
- Fix existing test expectations for updated footer format and isStructuredData behavior

* chore(chunkers): remove unnecessary comments and dead code

Strip 445 lines of redundant TSDoc, math calculation comments,
implementation rationale notes, and assertion-restating comments
across all chunker source and test files.

* fix(chunkers): address PR review comments

- Fix regex fallback path: use sliding window for overlap instead of
  passing chunkOverlap to buildChunks without prepended overlap text
- Fix misleading strategy label: "Text (hierarchical splitting)" →
  "Text (word boundary splitting)"

* fix(chunkers): use consistent overlap pattern in regex fallback

Use addOverlap + buildChunks(chunks, overlap) in the regex fallback
path to match the main path and all other chunkers (TextChunker,
RecursiveChunker). The sliding window approach was inconsistent.

* fix(chunkers): prevent content loss in word boundary splitting

When splitAtWordBoundaries snaps end back to a word boundary, advance
pos from end (not pos + step) in non-overlapping mode. The step-based
advancement is preserved for the sliding window case (TokenChunker).
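
A simplified sketch of the non-overlapping case shows why the position has to advance from `end`. It measures size in characters and splits on spaces only, both of which are simplifying assumptions; the real `splitAtWordBoundaries` also supports the step-based sliding window.

```typescript
// Split text into pieces of at most maxChars, preferring to break at a space.
function splitAtWordBoundaries(text: string, maxChars: number): string[] {
  if (maxChars < 1) throw new Error('maxChars must be at least 1')
  const parts: string[] = []
  let pos = 0
  while (pos < text.length) {
    let end = Math.min(pos + maxChars, text.length)
    if (end < text.length) {
      // Snap back to the last space inside the window, if there is one.
      const lastSpace = text.lastIndexOf(' ', end)
      if (lastSpace > pos) end = lastSpace
    }
    const part = text.slice(pos, end).trim()
    if (part.length > 0) parts.push(part)
    // Advance from the snapped end. Advancing by pos + maxChars instead
    // would skip the text between the snapped end and the window edge.
    pos = end
  }
  return parts
}
```

Because every piece starts exactly where the previous one stopped, joining the pieces with single spaces reproduces the original words in order.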

* fix(chunkers): restore structured data token ratio and overlap joiner

- Restore /3 token estimation for StructuredDataChunker (structured data
  is denser than prose, ~3 chars/token vs ~4)
- Change addOverlap joiner from \n to space to match original TextChunker
  behavior
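
The two ratios mentioned above give noticeably different estimates for the same input. A minimal sketch, assuming a simple character-count heuristic; the function names are hypothetical:

```typescript
// Character-count heuristic: roughly 4 chars/token for prose by default.
function estimateTokens(text: string, charsPerToken: number = 4): number {
  return Math.ceil(text.length / charsPerToken)
}

// Structured data is denser, so use roughly 3 chars/token.
function estimateStructuredTokens(text: string): number {
  return estimateTokens(text, 3)
}
```

For a 12-character string, the prose estimate is 3 tokens and the structured estimate is 4, so structured chunks fill up sooner for the same token budget.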

* lint

* fix(chunkers): fall back to character-level overlap in sentence chunker

When no complete sentence fits within the overlap budget,
fall back to character-level word-boundary overlap from the
previous group's text. This ensures buildChunks metadata is
always correct.

* fix(chunkers): fix log message and add missing month abbreviations

- Fix regex fallback log: "character splitting" → "word-boundary splitting"
- Add Jun and Jul to sentence chunker abbreviation list

* lint

* fix(chunkers): restore structured data detection threshold to > 2

avgCount >= 1 was too permissive — prose with consistent comma usage
would be misclassified as CSV. Restore original > 2 threshold while
keeping the improved proportional tolerance.

* fix(chunkers): pass chunkOverlap to buildChunks in TokenChunker

* fix(chunkers): restore separator-as-joiner pattern in splitRecursively

Separator was unconditionally prepended to parts after the first,
leaving leading punctuation on chunks after a boundary reset.

* feat(knowledge): add JSONL file support for knowledge base uploads

Parses JSON Lines files by splitting on newlines and converting to a
JSON array, which then flows through the existing JsonYamlChunker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
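
The JSONL conversion described above can be sketched in a few lines. `parseJsonl` is a hypothetical name, and skipping blank lines is an assumption about how trailing newlines are handled.

```typescript
// Convert JSON Lines content into an array of parsed values.
function parseJsonl(content: string): unknown[] {
  return content
    .split(/\r?\n/) // one JSON value per line, tolerate CRLF
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // skip blank lines (assumed behavior)
    .map((line) => JSON.parse(line) as unknown) // throws on a malformed line
}
```

The resulting array can then be handed to whatever already handles a regular JSON array, which is how the commit describes routing it through JsonYamlChunker.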

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
@vercel

vercel bot commented Apr 11, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | Apr 11, 2026 4:34am |

@cursor

cursor bot commented Apr 11, 2026

PR Summary

Medium Risk
Mostly documentation and landing-page UI refactors, but the integrations detail/list pages were substantially restructured (layout + filtering by integrationTypes), which could introduce navigation/SEO or rendering regressions.

Overview
Adds explicit polling trigger guidance to the internal add-trigger playbooks (Claude/Cursor), including required handler/trigger config patterns and registration steps.

Updates docs to expand BYOK provider support and add per-operation hosted tool pricing tables across locales, plus minor copy/punctuation tweaks.

Refreshes landing UI: adjusts blog typography, updates footer links, adds an AnimatePresence-animated FAQ accordion, introduces a new navbar ProductDropdown, and significantly redesigns integrations pages (new row/card variants, new templates/triggers/tools section layouts, and category filtering now based on integrationTypes arrays).

Reviewed by Cursor Bugbot for commit 1acafe8. Configure here.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@waleedlatif1 waleedlatif1 merged commit cbfab1c into main Apr 11, 2026
27 checks passed
4 participants