Add direct sockets api support for isolated web apps #26344
maceip wants to merge 1 commit into emscripten-core:main
Wow, very impressive. Thanks for working on this. I guess one tricky part is going to be testing, but it would be great to get at least one end-to-end test added to the browser tests.
sbc100
left a comment
Could you add a link to the spec for the Direct Sockets API to the PR description (and to libdirectsockets.js)?
Pull request overview
This PR adds an alternative POSIX-socket backend that targets Chrome’s Direct Sockets API for Isolated Web Apps, intended to replace the existing WebSocket/proxy-based networking path and enable real TCP/UDP from Wasm (with JSPI for async bridging).
Changes:
- Introduces a new JS library (libdirectsockets.js) implementing socket-related syscalls via TCPSocket, TCPServerSocket, and UDPSocket.
- Adds new linker settings (DIRECT_SOCKETS, DIRECT_SOCKETS_WEBTRANSPORT) and wires library inclusion via modules.mjs.
- Adjusts existing syscall/WASI glue to avoid conflicting socket implementations and to close Direct Socket fds.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| system/lib/libc/emscripten_syscall_stubs.c | Minor change to the stub setsockopt return line. |
| src/settings.js | Adds new DIRECT_SOCKETS and DIRECT_SOCKETS_WEBTRANSPORT link-time settings. |
| src/modules.mjs | Links libdirectsockets.js when DIRECT_SOCKETS is enabled. |
| src/lib/libsyscall.js | Prevents default JS socket syscall implementations when Direct Sockets are enabled. |
| src/lib/libwasi.js | Adds a fd_close branch to close Direct Sockets-backed fds. |
| src/lib/libdirectsockets.js | New Direct Sockets syscall backend implementation. |
Comments suppressed due to low confidence (5)
src/lib/libdirectsockets.js:211
- This new backend introduces substantial new socket behavior but doesn’t appear to add coverage in the existing sockets test suite (e.g. under test/sockets/ and test/test_sockets.py). Adding targeted tests for DIRECT_SOCKETS (basic TCP connect/accept, UDP sendto/recvfrom, setsockopt/get(sock)opt error paths) would help prevent regressions and validate errno/behavior differences vs the existing SOCKFS/proxy backends.
__syscall_socket__deps: ['$DIRECT_SOCKETS'],
__syscall_socket: (domain, type, protocol) => {
  // Strip flags that don't apply in single-process context
  type &= ~({{{ cDefs.SOCK_CLOEXEC | cDefs.SOCK_NONBLOCK }}});
  // Validate family
  if (domain !== {{{ cDefs.AF_INET }}} && domain !== {{{ cDefs.AF_INET6 }}}) {
    return -{{{ cDefs.EAFNOSUPPORT }}};
  }
  // Validate type
  if (type !== {{{ cDefs.SOCK_STREAM }}} && type !== {{{ cDefs.SOCK_DGRAM }}}) {
    return -{{{ cDefs.EINVAL }}};
  }
  // Validate protocol vs type
  if (type === {{{ cDefs.SOCK_STREAM }}} && protocol !== 0 && protocol !== {{{ cDefs.IPPROTO_TCP }}}) {
    return -{{{ cDefs.EPROTONOSUPPORT }}};
  }
  if (type === {{{ cDefs.SOCK_DGRAM }}} && protocol !== 0 && protocol !== {{{ cDefs.IPPROTO_UDP }}}) {
    return -{{{ cDefs.EPROTONOSUPPORT }}};
  }
  var sock = DIRECT_SOCKETS.createSocketState(domain, type, protocol);
#if SOCKET_DEBUG
  dbg(`direct_sockets: socket(${domain}, ${type}, ${protocol}) -> fd ${sock.fd}`);
#endif
  return sock.fd;
},
src/lib/libdirectsockets.js:100
- parseSockaddr() drops the underlying readSockaddr() error by returning null when info.errno is set, causing callers like connect()/bind() to return -EINVAL even for cases like -EAFNOSUPPORT. Preserve and propagate the specific errno (e.g. return { errno } or throw FS.ErrnoError) so syscall error codes match POSIX expectations.
parseSockaddr(addrPtr, addrLen) {
  var info = readSockaddr(addrPtr, addrLen);
  if (info.errno) return null;
  // readSockaddr returns addr as a string like "1.2.3.4" and port as a number.
  // DNS.lookup_addr resolves emscripten fake IPs back to hostnames.
  var resolvedAddr = DNS.lookup_addr(info.addr) || info.addr;
  return { family: info.family, addr: resolvedAddr, port: info.port };
},
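The errno propagation suggested in this comment could look roughly like the following plain-JavaScript sketch. It is an illustration only: `readSockaddrStub` and `connectStub` are hypothetical stand-ins (not Emscripten's `readSockaddr()` or the PR's `connect()`), and the numeric errno value is a placeholder rather than a real `cDefs` constant.

```javascript
// Placeholder errno value for illustration only (not taken from cDefs).
const EAFNOSUPPORT = 5;

// Hypothetical stand-in for readSockaddr(): returns {errno} on failure,
// or {family, addr, port} on success.
function readSockaddrStub(raw) {
  if (raw.family !== 'inet' && raw.family !== 'inet6') {
    return { errno: EAFNOSUPPORT };
  }
  return { family: raw.family, addr: raw.addr, port: raw.port };
}

// Propagate the specific errno instead of collapsing every failure to null.
function parseSockaddr(raw) {
  const info = readSockaddrStub(raw);
  if (info.errno) return { errno: info.errno };
  return { family: info.family, addr: info.addr, port: info.port };
}

// A caller can then return the precise negative errno to the syscall layer.
function connectStub(raw) {
  const dest = parseSockaddr(raw);
  if (dest.errno) return -dest.errno;
  return 0;
}
```

With this shape, callers no longer need to guess an errno when parsing fails; they forward whatever the parser reported.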
src/lib/libdirectsockets.js:235
- These constructors (TCPSocket) are used unguarded; if the Direct Sockets API is unavailable in the current runtime, this will throw a ReferenceError and crash rather than returning a clean errno. Add feature-detection (e.g. check globalThis.TCPSocket/TCPServerSocket/UDPSocket) and return -ENOSYS/-EOPNOTSUPP (or abort with a clear message under ASSERTIONS) when unavailable.
// TCP connect
var opts = DIRECT_SOCKETS.buildTCPOptions(sock);
var tcpSocket = new TCPSocket(dest.addr, dest.port, opts);
var openInfo = await tcpSocket.opened;
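A minimal sketch of the feature detection described in this comment, assuming the three constructors are exposed as globals when the API is available. The helper names `hasDirectSockets` and `checkDirectSockets` are hypothetical, and the errno value is a placeholder rather than a real `cDefs` constant.

```javascript
// Placeholder errno value for illustration only (not taken from cDefs).
const ENOSYS = 52;

// Returns true only if all three Direct Sockets constructors exist on the
// given global object (defaults to globalThis).
function hasDirectSockets(globalObj = globalThis) {
  return ['TCPSocket', 'TCPServerSocket', 'UDPSocket']
    .every((name) => typeof globalObj[name] === 'function');
}

// Hypothetical guard that a connect()/bind() implementation could run
// before calling `new TCPSocket(...)` or `new UDPSocket(...)`.
function checkDirectSockets(globalObj = globalThis) {
  return hasDirectSockets(globalObj) ? 0 : -ENOSYS;
}
```

Passing the global object as a parameter is only there to make the check easy to exercise outside an Isolated Web App.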
src/lib/libdirectsockets.js:621
- __syscall_setsockopt relies on hard-coded numeric values for SOL_SOCKET, IPPROTO_TCP, and option names (with a comment that these are musl-specific). This makes the implementation harder to audit and more fragile if constants differ across libc/configurations. Prefer using cDefs.* constants where available (and validate optlen/range for integer options) so unsupported/invalid inputs return -EINVAL instead of being silently accepted.
// Direct Sockets only supports a few options, and they must be set at
// construction time. We defer them and apply when connect/bind is called.
// SOL_SOCKET = 1, musl values for socket options:
// SO_REUSEADDR=2, SO_TYPE=3, SO_ERROR=4, SO_SNDBUF=7, SO_RCVBUF=8,
// SO_KEEPALIVE=9, SO_REUSEPORT=15
if (level === 1 /*SOL_SOCKET*/) {
  switch (optname) {
    case 2: // SO_REUSEADDR
    case 15: // SO_REUSEPORT
      // Silently accept - no equivalent, but harmless
      return 0;
    case 7: // SO_SNDBUF
      sock.options.sendBufferSize = {{{ makeGetValue('optval', 0, 'i32') }}};
      return 0;
    case 8: // SO_RCVBUF
      sock.options.receiveBufferSize = {{{ makeGetValue('optval', 0, 'i32') }}};
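The optlen/range validation mentioned in this comment might be factored into a small helper along these lines. This is a sketch under assumptions: `setIntOption` and the `readI32` callback are hypothetical (the callback stands in for reading `optval` from Wasm memory), and the errno value is a placeholder rather than a real `cDefs` constant.

```javascript
// Placeholder errno value for illustration only (not taken from cDefs).
const EINVAL = 28;

// Validate and store an integer socket option.
// `readI32` is a hypothetical callback standing in for reading optval.
function setIntOption(options, key, optlen, readI32) {
  // optval must be large enough to hold a 32-bit int.
  if (optlen < 4) return -EINVAL;
  const value = readI32();
  // Reject non-integers and negative sizes instead of storing them.
  if (!Number.isInteger(value) || value < 0) return -EINVAL;
  options[key] = value;
  return 0;
}
```

A `case` for SO_SNDBUF could then reduce to a single call such as `return setIntOption(sock.options, 'sendBufferSize', optlen, read)`, where `read` is whatever memory accessor the library uses.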
src/lib/libwasi.js:334
- fd_close calls DIRECT_SOCKETS._closeSocket(sock) but does not await/return the resulting Promise (and fd_close itself is synchronous). That means the actual underlying tcpSocket.close()/udpSocket.close() work may still be pending when the fd is considered closed, and it may never complete if the runtime exits immediately after. Consider making the close path synchronous (only synchronous cleanup here) or, when building with JSPI, returning the Promise so the wasm caller can suspend until the close completes.
#elif DIRECT_SOCKETS
  var sock = DIRECT_SOCKETS.getSocket(fd);
  if (sock) {
    DIRECT_SOCKETS._closeSocket(sock);
    delete DIRECT_SOCKETS.sockets[fd];
    return 0;
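One possible shape for the "synchronous cleanup here, finish asynchronously" option is to track in-flight closes so that a shutdown path can wait for them. The sketch below is not the PR's implementation: `closeFd`, `drainCloses`, and `pendingCloses` are hypothetical names, sockets are modeled as plain objects with an async `close()`, and the errno value is a placeholder.

```javascript
// Placeholder errno value for illustration only (not taken from cDefs).
const EBADF = 8;

// Close promises that have been started but have not settled yet.
const pendingCloses = new Set();

// Synchronous part of close: drop the fd immediately, start the async
// close, and remember its promise until it settles.
function closeFd(sockets, fd) {
  const sock = sockets[fd];
  if (!sock) return -EBADF;
  delete sockets[fd];
  const p = Promise.resolve(sock.close())
    .catch(() => {}) // a failed close should not become an unhandled rejection
    .finally(() => pendingCloses.delete(p));
  pendingCloses.add(p);
  return 0;
}

// Hypothetical hook for runtime exit: wait for every in-flight close.
function drainCloses() {
  return Promise.all([...pendingCloses]);
}
```

The fd becomes invalid as soon as `closeFd` returns, while the underlying socket is still released before exit if the runtime awaits `drainCloses()`.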
nextFd: 100, // Start high to avoid conflicts with stdio/FS fds

allocateFd() {
  return DIRECT_SOCKETS.nextFd++;
},
DIRECT_SOCKETS.allocateFd() uses a private monotonically-increasing fd space starting at 100. Since file/socket descriptors share the same numeric namespace, this can collide with real FS-managed fds once enough files are opened (e.g. FS.nextfd() can hand out 100+). Consider allocating fds through the same mechanism as FS streams (or otherwise guaranteeing uniqueness against existing FS.streams) so direct-socket fds remain interoperable with generic fd-based APIs.
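To illustrate the uniqueness requirement, here is a small sketch of an allocator that consults both descriptor tables before handing out an fd. It does not use Emscripten's FS API: `fsStreams` and `socketTable` are hypothetical stand-ins for the FS stream table and the direct-socket table.

```javascript
// Hypothetical stand-ins for the two descriptor tables.
const fsStreams = [];    // index = fd, value = open stream or undefined
const socketTable = {};  // fd -> socket state

// Return the lowest fd (starting past stdio at 3) that is unused in
// both tables, so the two kinds of descriptors can never collide.
function allocateSharedFd() {
  for (let fd = 3; ; fd++) {
    if (!fsStreams[fd] && !(fd in socketTable)) return fd;
  }
}
```

In the real library the cleaner option is probably to allocate through the FS code itself, as the comment suggests; the sketch only shows the invariant that such an allocator has to maintain.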
dbg(`direct_sockets: read error: ${e}`);
#endif
sock.error = {{{ cDefs.EIO }}};
return null;
readFromSocket() returns null both for EOF and for read errors (it sets sock.error but still returns null). Callers like __syscall_recvfrom then treat null as EOF and return 0, which incorrectly hides I/O errors. Return a distinct error signal (or throw) so recv/recvmsg can return -EIO (or the specific errno) instead of 0 on failures.
- return null;
+ return -{{{ cDefs.EIO }}};
} else {
  // UDP "connect" - creates a connected-mode UDPSocket
  var opts = DIRECT_SOCKETS.buildUDPOptions(sock);
  opts.remoteAddress = dest.addr;
  opts.remotePort = dest.port;
  var udpSocket = new UDPSocket(opts);
  var openInfo = await udpSocket.opened;

  sock.udpSocket = udpSocket;
  sock.reader = openInfo.readable.getReader();
  sock.writer = openInfo.writable.getWriter();
  sock.remoteAddress = openInfo.remoteAddress || dest.addr;
  sock.remotePort = openInfo.remotePort || dest.port;
  sock.localAddress = openInfo.localAddress || '0.0.0.0';
  sock.localPort = openInfo.localPort || 0;
  sock.state = 'connected';
For UDP sockets, bind() eagerly constructs a bound UDPSocket, but a subsequent connect() constructs a new UDPSocket without closing the existing one, leaking the original socket and discarding the bound local endpoint. Handle the state === 'bound' case by reusing the existing socket (if possible) or explicitly closing it before replacing it, while preserving POSIX semantics on the same fd.
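The "explicitly close before replacing" option could be sketched as follows. This is an assumption-laden illustration, not the PR's code: `udpConnect` is a hypothetical helper, and `createUdp` is a hypothetical factory standing in for `new UDPSocket(opts)` so that the logic can run outside an Isolated Web App.

```javascript
// Replace a bound UDP socket with a connected one without leaking the
// original. `createUdp` is a hypothetical factory standing in for
// `new UDPSocket(opts)`.
async function udpConnect(sock, dest, createUdp) {
  const opts = { remoteAddress: dest.addr, remotePort: dest.port };
  if (sock.state === 'bound' && sock.udpSocket) {
    // Release the socket that bind() created before replacing it.
    await sock.udpSocket.close();
    sock.udpSocket = null;
  }
  sock.udpSocket = createUdp(opts);
  sock.state = 'connected';
  return 0;
}
```

This avoids the leak but still discards the bound local endpoint, so it only addresses half of the concern raised above.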
var data = await DIRECT_SOCKETS.readFromSocket(sock, len);
if (!data) return 0; // Connection closed (EOF)
In the TCP recv path, __syscall_recvfrom treats falsy data as EOF and returns 0. Because DIRECT_SOCKETS.readFromSocket() can also return null on errors, this will misreport read failures as a clean connection close. Once readFromSocket() distinguishes EOF vs error, update this branch to return a negative errno for errors.
- var data = await DIRECT_SOCKETS.readFromSocket(sock, len);
- if (!data) return 0; // Connection closed (EOF)
+ var result = await DIRECT_SOCKETS.readFromSocket(sock, len);
+ // Distinguish EOF from errors:
+ // - EOF: result === null -> return 0
+ // - Error: result is an object with an `errno` field -> return -errno
+ // - Data: result is a Uint8Array (legacy behavior)
+ if (result === null) {
+   // Connection closed (EOF)
+   return 0;
+ }
+ if (result && typeof result === 'object' && 'errno' in result) {
+   // readFromSocket reported an error via errno
+   return -result.errno;
+ }
+ var data = result;
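For the producer side of this contract, a sketch of a `readFromSocket` that keeps EOF and failure distinct might look like the following. It assumes a reader with the WHATWG Streams `read()` shape (resolving to `{ value, done }`); `recvStub` is a hypothetical caller, and the errno value is a placeholder rather than a real `cDefs` constant.

```javascript
// Placeholder errno value for illustration only (not taken from cDefs).
const EIO = 29;

// Returns null on EOF, {errno} on failure, or a Uint8Array of data.
async function readFromSocket(reader) {
  try {
    const { value, done } = await reader.read();
    if (done) return null; // peer closed the stream
    return value;
  } catch (e) {
    return { errno: EIO }; // read failed; report it instead of hiding it
  }
}

// Hypothetical caller that maps the three outcomes to recv()-style results.
async function recvStub(reader) {
  const result = await readFromSocket(reader);
  if (result === null) return 0;
  if (!(result instanceof Uint8Array)) return -result.errno;
  return result.length;
}
```

Checking `instanceof Uint8Array` rather than `'errno' in result` is a design choice of this sketch; either test works as long as data chunks are always typed arrays.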
try {
  if (sock.type === {{{ cDefs.SOCK_STREAM }}}) {
    // TCP connect
    var opts = DIRECT_SOCKETS.buildTCPOptions(sock);
    var tcpSocket = new TCPSocket(dest.addr, dest.port, opts);
    var openInfo = await tcpSocket.opened;

    sock.tcpSocket = tcpSocket;
    sock.reader = openInfo.readable.getReader();
    sock.writer = openInfo.writable.getWriter();
    sock.remoteAddress = openInfo.remoteAddress || dest.addr;
    sock.remotePort = openInfo.remotePort || dest.port;
    sock.localAddress = openInfo.localAddress || '0.0.0.0';
    sock.localPort = openInfo.localPort || 0;
    sock.state = 'connected';
bind() stores sock.localAddress/sock.localPort, but connect() does not use those values when constructing TCPSocket/UDPSocket. This means bind()-then-connect() won’t honor the requested local endpoint (common for choosing a local port/address). Either plumb the local bind parameters into the Direct Sockets constructor options when sock.state === 'bound', or reject the sequence with an appropriate errno.
if (sock) {
  DIRECT_SOCKETS._closeSocket(sock);
  delete DIRECT_SOCKETS.sockets[fd];
  return 0;
No need for this return statement since we have return 0 below.
// Use Chrome's Direct Sockets API (TCPSocket, TCPServerSocket, UDPSocket)
// for real TCP/UDP networking in Isolated Web Apps, replacing the
// WebSocket-to-POSIX-socket proxy. Requires -sJSPI for async bridging.
Does it really require JSPI or would ASYNCIFY work too?
weak int __syscall_setsockopt(int sockfd, int level, int optname, intptr_t optval, size_t optlen, int dummy) {
  REPORT(setsockopt);
  return -ENOPROTOOPT; // The option is unknown at the level indicated.
Maybe revert this file here to keep the PR focused?
// error: number,
// }
sockets: {},
nextFd: 100, // Start high to avoid conflicts with stdio/FS fds
Maybe we can do better than this by sharing the FD allocator with the common FS code?
  }
},

__syscall_shutdown__deps: ['$DIRECT_SOCKETS'],
If you have a dependency like this that all your functions in a given file depend on, you can use autoAddDeps instead of repeating it everywhere.
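To show what that buys, here is a simplified stand-in that mimics the effect of such a helper on a library object. It is not Emscripten's actual `autoAddDeps` implementation; `autoAddDepsSketch` and the example library contents are hypothetical.

```javascript
// Simplified stand-in illustrating the effect of a helper like autoAddDeps:
// append `name` to the __deps list of every entry in `lib`.
function autoAddDepsSketch(lib, name) {
  for (const key of Object.keys(lib)) {
    // Skip existing __deps lists and the dependency itself.
    if (key.endsWith('__deps') || key === name) continue;
    const depsKey = key + '__deps';
    lib[depsKey] = lib[depsKey] || [];
    if (!lib[depsKey].includes(name)) lib[depsKey].push(name);
  }
}

// Hypothetical library object with one pre-existing dependency list.
const exampleLib = {
  $DIRECT_SOCKETS: {},
  __syscall_shutdown: () => 0,
  __syscall_socket: () => 0,
  __syscall_socket__deps: ['$DNS'],
};
autoAddDepsSketch(exampleLib, '$DIRECT_SOCKETS');
```

After the call, every syscall entry lists `$DIRECT_SOCKETS` as a dependency without each `__deps` array having to be written out by hand.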
var sock = DIRECT_SOCKETS.getSocket(fd);
if (!sock) return -{{{ cDefs.EBADF }}};

if (level === 1 /*SOL_SOCKET*/) {
You can use {{{ cDefs.SOL_SOCKET }}} here instead of the hardcoded 1
Adds a new DIRECT_SOCKETS linker setting that replaces the WebSocket-to-POSIX-socket proxy with Chrome's Direct Sockets API for real TCP/UDP networking from Wasm in Isolated Web Apps.
Uses JSPI to bridge async Direct Sockets promises to synchronous POSIX socket calls without Asyncify overhead.
New files:
src/lib/libdirectsockets.js: implements all socket syscalls against TCPSocket, TCPServerSocket, and UDPSocket
Modified files:
src/settings.js: adds DIRECT_SOCKETS and DIRECT_SOCKETS_WEBTRANSPORT flags
src/modules.mjs: registers libdirectsockets.js when flag is enabled
src/lib/libsyscall.js: guards default socket impls when direct sockets active
src/lib/libwasi.js: adds fd_close path for direct socket fds
Usage:
emcc -sDIRECT_SOCKETS -sJSPI -sPROXY_TO_PTHREAD -pthread server.c -o server.js
Tested with a full QUIC stack (ngtcp2, wolfSSL, nghttp3) compiled to Wasm and running as a Chrome Isolated Web App, achieving 90 percent of native Linux throughput on UDP packet handling.