[Web] Add support for OPFS synchronous access handles and committed records #19673

Open
akaashrp wants to merge 3 commits into apache:main from akaashrp:opfssync

Contributor

akaashrp commented Jun 4, 2026

Add support for synchronous access handles in OPFS (https://developer.mozilla.org/en-US/docs/Web/API/FileSystemFileHandle/createSyncAccessHandle). Sync mode uses FileSystemSyncAccessHandle where available. Replace the optional metadata file with a committed OPFS record that is written only after the payload. Each record stores the URL, the payload byte count, and the content type. Because the record is written last, an interrupted or partial write leaves either no record or a payload whose size does not match the recorded byte count, and both cases are treated as cache misses. Stale OPFS directory handles are cleared on InvalidStateError.
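
The commit ordering described above can be sketched with an in-memory stand-in for OPFS. This is a minimal illustration, not the PR's implementation: `CommittedStore`, `CommittedRecord`, and the `interruptBeforeCommit` flag are hypothetical names, and dropping the previous record before rewriting the payload is an assumption of this sketch.

```typescript
// Hypothetical record shape, following the fields listed in the description.
interface CommittedRecord {
  url: string;
  nbytes: number;
  contentType: string;
}

// In-memory stand-in for an OPFS directory (assumption: the real store
// keeps payloads and records in separate files behind file handles).
class CommittedStore {
  private payloads = new Map<string, Uint8Array>();
  private records = new Map<string, CommittedRecord>();

  write(
    url: string,
    payload: Uint8Array,
    contentType: string,
    interruptBeforeCommit = false, // test hook that simulates a crash mid-write
  ): void {
    // Drop any previous record first so a failed overwrite cannot pair
    // an old record with a new, partial payload.
    this.records.delete(url);
    if (interruptBeforeCommit) {
      const half = Math.floor(payload.byteLength / 2);
      this.payloads.set(url, payload.slice(0, half));
      throw new Error("simulated interruption before commit");
    }
    this.payloads.set(url, payload.slice());
    // The record is written last; its presence marks the entry as committed.
    this.records.set(url, { url, nbytes: payload.byteLength, contentType });
  }

  read(url: string): { payload: Uint8Array; contentType: string } | undefined {
    const record = this.records.get(url);
    if (record === undefined) {
      return undefined; // never committed: cache miss
    }
    const payload = this.payloads.get(url);
    if (payload === undefined || payload.byteLength !== record.nbytes) {
      return undefined; // payload missing or truncated: cache miss
    }
    return { payload, contentType: record.contentType };
  }
}
```

In this sketch a reader never needs to know why a write failed. A missing record or a size mismatch is enough to fall back to the network.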

Contributor

gemini-code-assist Bot left a comment

Code Review

This pull request introduces synchronous OPFS access mode support in OPFSStore and ArtifactOPFSCache for dedicated worker contexts, allowing for faster read and write operations. The feedback identifies a critical memory regression in async mode where files are loaded entirely into memory as ArrayBuffers instead of using lazy Blobs, which could cause OOM crashes. Additionally, the reviewer suggests adding a try-catch fallback for concurrent createSyncAccessHandle calls to prevent lock failures, and wrapping ArrayBuffers in Uint8Arrays to maintain compatibility with older browser implementations of the OPFS API.
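
The two compatibility suggestions in this summary could look roughly like the sketch below. It is an illustration under stated assumptions, not code from the PR: `SyncHandleLike`, `FileHandleLike`, `openPreferSync`, and `asByteView` are hypothetical names, and the interfaces only mirror the subset of the OPFS API the sketch needs.

```typescript
// Structural stand-ins for the OPFS types (assumptions for this sketch).
interface SyncHandleLike {
  write(buffer: Uint8Array, options?: { at?: number }): number;
  close(): void;
}

interface FileHandleLike<F> {
  createSyncAccessHandle?: () => Promise<SyncHandleLike>;
  getFile(): Promise<F>;
}

type Opened<F> =
  | { kind: "sync"; handle: SyncHandleLike }
  | { kind: "file"; file: F };

// Try the sync access handle first. If it cannot be created (for example
// because another handle already holds the lock), fall back to getFile(),
// which yields a lazily read File/Blob in a real browser.
async function openPreferSync<F>(handle: FileHandleLike<F>): Promise<Opened<F>> {
  if (typeof handle.createSyncAccessHandle === "function") {
    try {
      return { kind: "sync", handle: await handle.createSyncAccessHandle() };
    } catch {
      // Swallowing every error keeps the sketch short; real code would
      // likely inspect the error name before falling back.
    }
  }
  return { kind: "file", file: await handle.getFile() };
}

// Always hand the sync handle a typed-array view, never a bare ArrayBuffer.
function asByteView(data: ArrayBuffer | Uint8Array): Uint8Array {
  return data instanceof Uint8Array ? data : new Uint8Array(data);
}
```

A caller would branch on `kind`, close the sync handle in a `finally` block when it gets one, and wrap the file in a `Response` otherwise.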

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment threads (4): web/src/opfs_store.ts
akaashrp changed the title from "[Web] Add support for OPFS synchronous access handles" to "[Web] Add support for OPFS synchronous access handles and committed records" on Jun 8, 2026
Contributor Author

akaashrp commented Jun 8, 2026

/gemini review

Contributor

gemini-code-assist Bot left a comment

Code Review

This pull request introduces synchronous OPFS (Origin Private File System) access capabilities to OPFSStore and ArtifactOPFSCache via a new OPFSAccessMode option, allowing for synchronous reads and writes in supported dedicated worker contexts. It also adds support for reading cached data directly as an ArrayBuffer. The review feedback highlights two robustness issues in OPFSStore: first, a potential cache corruption scenario where a failed payload overwrite leaves an outdated metadata file intact, which can be mitigated by deleting the metadata file before writing the payload; second, a potential crash during cache reads if the metadata file is corrupted, which should be handled gracefully with a try-catch block to treat it as a cache miss.
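
The second point, treating a corrupted metadata file as a cache miss, might be handled by a defensive parser along these lines. This is a sketch under assumptions: `RecordSketch` and `parseRecord` are hypothetical names, the field names follow the PR description, and returning `undefined` on a URL mismatch (instead of throwing) is a choice made for this example.

```typescript
// Hypothetical record shape; the field names are assumptions.
interface RecordSketch {
  url: string;
  nbytes: number;
  contentType: string;
}

// Returns the parsed record, or undefined if the text is not valid JSON,
// has the wrong shape, or belongs to a different URL.
function parseRecord(text: string, expectedUrl: string): RecordSketch | undefined {
  let value: unknown;
  try {
    value = JSON.parse(text);
  } catch {
    return undefined; // corrupted or truncated file: cache miss
  }
  if (typeof value !== "object" || value === null) {
    return undefined;
  }
  const { url, nbytes, contentType } = value as Record<string, unknown>;
  if (typeof url !== "string" || url !== expectedUrl) {
    return undefined;
  }
  if (typeof nbytes !== "number" || !Number.isInteger(nbytes) || nbytes < 0) {
    return undefined;
  }
  if (typeof contentType !== "string") {
    return undefined;
  }
  return { url, nbytes, contentType };
}
```

Validating the shape as well as the JSON syntax keeps a half-written or hand-edited file from reaching the size comparison with a nonsensical `nbytes`.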

Comment threads (2): web/src/opfs_store.ts
Contributor Author

akaashrp commented Jun 8, 2026

/gemini review

Contributor

gemini-code-assist Bot left a comment

Code Review

This pull request introduces synchronous access mode support (OPFSAccessMode) to the Origin Private File System (OPFS) artifact cache, allowing the use of synchronous access handles in dedicated worker contexts for improved performance. The review feedback highlights several critical improvement opportunities in opfs_store.ts. Specifically, the reviewer recommends aborting writable streams on error in writePayloadWithWritable and writeRecord to prevent persistent file locks, using a sync access handle's getSize() in hasExpectedPayloadSize to avoid the overhead of creating large Blob objects, and returning a lazy Blob instead of an eager ArrayBuffer in read to mitigate memory pressure.

Comment thread web/src/opfs_store.ts
Comment on lines +363 to +390
  private async writePayloadWithWritable(
    handle: OPFSFileHandle,
    response: Response,
  ): Promise<number> {
    const writable = await handle.createWritable();
    if (response.body !== null) {
      let nbytes = 0;
      const reader = response.body.getReader();
      try {
        while (true) {
          const { done, value } = await reader.read();
          if (done) {
            break;
          }
          await writable.write(value);
          nbytes += value.byteLength;
        }
      } finally {
        reader.releaseLock();
      }
      await writable.close();
      return nbytes;
    }
    const payload = await response.arrayBuffer();
    await writable.write(payload);
    await writable.close();
    return payload.byteLength;
  }
Severity: high

If an error occurs during streaming or writing (for example, if the network connection is interrupted or the reader throws an error), the writable stream is never closed or aborted. In the Origin Private File System (OPFS), an unclosed FileSystemWritableFileStream retains an exclusive lock on the file. This can cause subsequent read or write operations on the same file to fail with a NoModificationAllowedError until the page is reloaded or garbage collection runs.

We should wrap the write operations in a try...catch block and ensure that writable.abort() is called if an error occurs, which safely releases the lock and discards any partial temporary file.

  private async writePayloadWithWritable(
    handle: OPFSFileHandle,
    response: Response,
  ): Promise<number> {
    const writable = await handle.createWritable();
    try {
      if (response.body !== null) {
        let nbytes = 0;
        const reader = response.body.getReader();
        try {
          while (true) {
            const { done, value } = await reader.read();
            if (done) {
              break;
            }
            await writable.write(value);
            nbytes += value.byteLength;
          }
        } finally {
          reader.releaseLock();
        }
        await writable.close();
        return nbytes;
      }
      const payload = await response.arrayBuffer();
      await writable.write(payload);
      await writable.close();
      return payload.byteLength;
    } catch (err) {
      try {
        await writable.abort();
      } catch {
        // Ignore secondary errors during abort cleanup
      }
      throw err;
    }
  }

Comment thread web/src/opfs_store.ts
Comment on lines +329 to +336
  private async writeRecord(
    handle: OPFSFileHandle,
    data: Blob | BufferSource | string,
    record: OPFSStoreRecord,
  ): Promise<void> {
    const writable = await handle.createWritable();
    await writable.write(data);
    await writable.write(new TextEncoder().encode(JSON.stringify(record)));
    await writable.close();
  }
Severity: medium

Similar to writePayloadWithWritable, if writable.write throws an error, the writable stream is not closed or aborted, leaving the record file locked. We should wrap this in a try...catch block to abort the stream on failure.

  private async writeRecord(
    handle: OPFSFileHandle,
    record: OPFSStoreRecord,
  ): Promise<void> {
    const writable = await handle.createWritable();
    try {
      await writable.write(new TextEncoder().encode(JSON.stringify(record)));
      await writable.close();
    } catch (err) {
      try {
        await writable.abort();
      } catch {
        // Ignore secondary errors during abort cleanup
      }
      throw err;
    }
  }
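
Since both suggestions repeat the same close-or-abort logic, one option is to factor it into a small helper. The sketch below is only an illustration: `withWritable` and `WritableLike` are hypothetical names that do not appear in the PR, and `WritableLike` mirrors just the `write`, `close`, and `abort` methods of `FileSystemWritableFileStream`.

```typescript
// Minimal structural type covering the writable-stream methods used here.
interface WritableLike {
  write(data: Uint8Array): Promise<void>;
  close(): Promise<void>;
  abort(): Promise<void>;
}

// Opens a writable, runs `body`, and closes the stream on success.
// On any failure the stream is aborted so the file lock is released,
// and the original error is rethrown.
async function withWritable<T>(
  open: () => Promise<WritableLike>,
  body: (writable: WritableLike) => Promise<T>,
): Promise<T> {
  const writable = await open();
  let result: T;
  try {
    result = await body(writable);
    await writable.close();
  } catch (err) {
    try {
      await writable.abort();
    } catch {
      // Ignore secondary errors during abort cleanup.
    }
    throw err;
  }
  return result;
}
```

With a helper like this, `writeRecord` would reduce to a single `withWritable(() => handle.createWritable(), ...)` call, assuming the handle's writable satisfies `WritableLike`.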

Comment thread web/src/opfs_store.ts
Comment on lines +345 to +350
  private async hasExpectedPayloadSize(
    entry: OPFSStoredEntry,
  ): Promise<boolean> {
    const blob = await entry.payloadHandle.getFile();
    return blob.size === entry.record.nbytes;
  }
Severity: medium

In sync mode, calling entry.payloadHandle.getFile() is an asynchronous operation that creates a Blob object. For large model weights (often hundreds of megabytes or gigabytes), creating a Blob just to check its size can introduce non-trivial overhead and disk/memory mapping latency.

Since we are in sync mode, we can open a read-only sync access handle and call getSize() to retrieve the file size. This is a metadata-only operation that is extremely fast and avoids the overhead of creating a Blob.

  private async hasExpectedPayloadSize(
    entry: OPFSStoredEntry,
  ): Promise<boolean> {
    if (this.accessMode === "sync") {
      const syncHandle = await this.openSyncAccessHandle(entry.payloadHandle, "read-only");
      if (syncHandle !== undefined) {
        try {
          return syncHandle.getSize() === entry.record.nbytes;
        } finally {
          syncHandle.close();
        }
      }
    }
    const blob = await entry.payloadHandle.getFile();
    return blob.size === entry.record.nbytes;
  }

Comment thread web/src/opfs_store.ts
Comment on lines 129 to +153
  async read(url: string): Promise<Response | undefined> {
    const directory = await this.getScopedDirectory();
    const baseName = await this.hashUrl(url);
    const dataHandle = await this.getFileHandleIfExists(
      directory,
      `${baseName}.bin`,
      false,
    );
    if (dataHandle === undefined) {
      return undefined;
    try {
      const entry = await this.getStoredEntry(url);
      if (entry === undefined) {
        return undefined;
      }
      if (this.accessMode === "async") {
        const blob = await entry.payloadHandle.getFile();
        if (blob.size !== entry.record.nbytes) {
          return undefined;
        }
        return new Response(blob, this.getResponseInit(entry.record));
      }
      const payload = await this.readPayload(entry.payloadHandle);
      if (payload.byteLength !== entry.record.nbytes) {
        return undefined;
      }
      return new Response(payload, this.getResponseInit(entry.record));
    } catch (err) {
      if (this.handleCacheMissStateError(err)) {
        return undefined;
      }
      throw err;
    }
    const dataBlob = await dataHandle.getFile();
    const metadataHandle = await this.getFileHandleIfExists(
      directory,
      `${baseName}.meta.json`,
      false,
    );
    let metadata: OPFSStoreMetadata | undefined = undefined;
    if (metadataHandle !== undefined) {
      metadata = await this.readMetadata(metadataHandle);
      if (metadata?.url !== undefined && metadata.url !== url) {
        throw new Error("OPFSStore: metadata URL does not match key URL.");
      }
Severity: medium

In sync mode, read(url) eagerly reads the entire payload into an ArrayBuffer in JS memory via readPayload before wrapping it in a Response. Eagerly loading large files (like model weights) into JS memory can cause high memory pressure or out-of-memory (OOM) errors in the worker.

Since read(url) returns a Response object, we should always wrap a lazy Blob (obtained via getFile()) instead of an eager ArrayBuffer. This allows the browser to handle the file content lazily or stream it, which is much more memory-efficient. Callers who specifically need an ArrayBuffer should use readArrayBuffer(url) instead, which is already optimized to use the sync access handle directly.

This also simplifies the read method by removing the need to branch on this.accessMode.

  async read(url: string): Promise<Response | undefined> {
    try {
      const entry = await this.getStoredEntry(url);
      if (entry === undefined) {
        return undefined;
      }
      const blob = await entry.payloadHandle.getFile();
      if (blob.size !== entry.record.nbytes) {
        return undefined;
      }
      return new Response(blob, this.getResponseInit(entry.record));
    } catch (err) { 
      if (this.handleCacheMissStateError(err)) {
        return undefined;
      }
      throw err;
    }
  }
