Run the encoders in-process #29
Merged
Drops the separate multiprocessing.Process encoder-server and the
AF_UNIX wire protocol. The VczReader and per-fh BedEncoder /
BgenEncoder instances now live in the FUSE handler process.
encoder.read, encoder.close, and reader teardown run on worker threads
via trio.to_thread.run_sync so the pyfuse3 trio task stays responsive.
The 30s per-read timeout and 2s aclose timeout are preserved — a slow
read still surfaces EIO to the kernel rather than blocking the
consumer indefinitely. On read timeout the worker thread is abandoned
(abandon_on_cancel=True) and the handle is marked dead; aclose drains
the abandoned thread via a threading.Event before closing the encoder,
or logs a warning and leaks the encoder if it is permanently wedged.
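The abandon-then-drain behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: it uses plain `threading` instead of `trio.to_thread.run_sync(..., abandon_on_cancel=True)` so it runs without dependencies, the `SlowEncoder` and `Handle` classes are hypothetical, the timeouts are shortened stand-ins for the 30s and 2s values, and it assumes one reader per handle.

```python
import errno
import logging
import threading
import time

READ_TIMEOUT = 0.2   # stand-in for the 30s per-read timeout
CLOSE_TIMEOUT = 0.1  # stand-in for the 2s aclose timeout

log = logging.getLogger("sketch")


class SlowEncoder:
    """Hypothetical encoder whose read() takes `delay` seconds."""

    def __init__(self, delay):
        self.delay = delay
        self.closed = False

    def read(self, offset, size):
        time.sleep(self.delay)
        return b"x" * size

    def close(self):
        self.closed = True


class Handle:
    """Hypothetical per-fh wrapper (assumes a single reader at a time)."""

    def __init__(self, encoder):
        self.encoder = encoder
        self.dead = False
        self.idle = threading.Event()  # set while no worker is inside the encoder
        self.idle.set()

    def read(self, offset, size):
        if self.dead:
            raise OSError(errno.EIO, "handle is dead")
        outcome = {}
        self.idle.clear()

        def work():
            try:
                outcome["data"] = self.encoder.read(offset, size)
            except Exception as exc:  # hand the failure back to the caller
                outcome["error"] = exc
            finally:
                self.idle.set()

        threading.Thread(target=work, daemon=True).start()
        if not self.idle.wait(READ_TIMEOUT):
            # Abandon the worker and mark the handle dead so later reads fail fast.
            self.dead = True
            raise OSError(errno.EIO, "read timed out")
        if "error" in outcome:
            raise outcome["error"]
        return outcome["data"]

    def close(self):
        # Drain any abandoned worker before touching the encoder.
        if not self.idle.wait(CLOSE_TIMEOUT):
            log.warning("encoder still busy after %.1fs; leaking it", CLOSE_TIMEOUT)
            return False
        self.encoder.close()
        return True
```

The `Event` is what keeps `close` from racing a worker that is still inside `encoder.read`: closing happens only once the worker has signalled completion, and otherwise the encoder is deliberately left open.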
The pyfuse3 mount Operations layer is largely untouched — it depends
on a Protocol (renamed EncoderClientProto -> EncoderHostProto) that
the new EncoderHost / StreamHandle satisfy. The accompanying tests
(test_encoder_ops, test_plink_apps, test_bgen_apps) track the rename.
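To illustrate why swapping the implementation behind a Protocol leaves the caller untouched, here is a small sketch of structural typing with `typing.Protocol`. The names `HostProto`, `FakeHost`, `serve_read` and all method signatures are assumptions made for the example; the actual members of EncoderHostProto are not shown in this description.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HostProto(Protocol):
    """Hypothetical interface the mount layer depends on."""

    async def open_stream(self, name: str) -> int: ...

    async def read(self, fh: int, offset: int, size: int) -> bytes: ...

    async def aclose(self, fh: int) -> None: ...


class FakeHost:
    """Satisfies HostProto structurally; no inheritance needed."""

    def __init__(self, payload: bytes):
        self.payload = payload
        self.open_fhs = set()

    async def open_stream(self, name: str) -> int:
        fh = len(self.open_fhs) + 1
        self.open_fhs.add(fh)
        return fh

    async def read(self, fh: int, offset: int, size: int) -> bytes:
        return self.payload[offset:offset + size]

    async def aclose(self, fh: int) -> None:
        self.open_fhs.discard(fh)


async def serve_read(host: HostProto, name: str, offset: int, size: int) -> bytes:
    # The caller only knows the Protocol, so any conforming host works.
    fh = await host.open_stream(name)
    try:
        return await host.read(fh, offset, size)
    finally:
        await host.aclose(fh)
```

Any object with matching methods type-checks against the Protocol, which is why a client-to-host swap can be limited to a rename on the caller's side.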
Net diff: -2335 LOC across the deleted encoder_{client,server,protocol}
modules and their dedicated test files.
The glibc-arena fragmentation tuning called out in
notes/memory_rss_investigation.md (MALLOC_ARENA_MAX, malloc_trim) is
deferred to a separate change.
For both plink and bgen, mount the fixture VCZ via biofuse and verify that the first 100 MB of the streaming file (or the whole file, if it is smaller) matches the bytes produced by BedEncoder / BgenEncoder run directly in-process.
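A prefix comparison of this kind could look roughly like the sketch below. It is an assumption-laden illustration rather than the test in this change: `compare_prefix` and the `reference_read(offset, size)` callable are hypothetical, and the 100 MB limit is taken as 104857600 bytes.

```python
from typing import Callable

LIMIT = 100 * 1024 * 1024  # 104857600 bytes
CHUNK = 1 << 20            # compare 1 MiB at a time


def compare_prefix(
    path: str,
    reference_read: Callable[[int, int], bytes],
    total_size: int,
    limit: int = LIMIT,
    chunk: int = CHUNK,
) -> int:
    """Compare the first min(limit, total_size) bytes of `path` with the reference.

    Returns the number of bytes compared; raises AssertionError on the first
    chunk that differs (a short read from the file counts as a difference).
    """
    to_check = min(limit, total_size)
    offset = 0
    with open(path, "rb") as f:
        while offset < to_check:
            want = min(chunk, to_check - offset)
            got = f.read(want)
            expected = reference_read(offset, want)
            if got != expected:
                raise AssertionError(f"mismatch in chunk at offset {offset}")
            offset += want
    return offset
```

Here `path` would be the file under the mountpoint and `reference_read` a thin wrapper around an encoder constructed directly in the test process.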
Here's the fs tests report for this change: biofuse fs_tests report
Overall: PASS, 91 / 91 checks passed across 8 runners.
Per-runner summary
Runner:
| Check | Status | Duration | Detail |
|---|---|---|---|
| open: O_RDONLY succeeds | PASS | 0.002s | |
| open: O_RDONLY \| O_NONBLOCK \| O_CLOEXEC accepted | PASS | | |
| open: O_WRONLY rejected with EROFS or EACCES | PASS | 0.000s | |
| open: O_RDWR rejected with EROFS or EACCES | PASS | 0.000s | |
| open: O_APPEND rejected with EROFS or EACCES | PASS | 0.000s | |
| open: O_CREAT for new file rejected | PASS | 0.000s | |
| open: O_DIRECTORY on regular file -> ENOTDIR | PASS | 0.000s | |
| open: O_DIRECTORY on mountpoint -> ok | PASS | 0.000s | |
| open: nonexistent path -> ENOENT | PASS | 0.000s | |
| read: full file via os.read matches backing | PASS | 0.018s | |
| pread: random offsets match backing | PASS | 4.102s | |
| pread: at EOF returns empty | PASS | 0.000s | |
| pread: spanning EOF returns trailing bytes only | PASS | 0.000s | |
| readv / preadv: bytes match backing | PASS | 0.000s | |
| lseek: SEEK_SET / SEEK_CUR / SEEK_END | PASS | 0.001s | |
| lseek: negative offset -> EINVAL | PASS | 0.000s | |
| lseek: past EOF + read -> 0 bytes | PASS | 0.000s | |
| stat == lstat == fstat for regular files | PASS | 0.000s | |
| stat: st_mode is S_IFREG with no write bits | PASS | 0.000s | |
| stat: st_size matches reads | PASS | 0.000s | |
| stat: st_dev consistent across files in mount | PASS | 0.000s | |
| stat: st_ino unique per file | PASS | 0.000s | |
| statvfs: ST_RDONLY flag set on mount | PASS | 0.001s | |
| access: F_OK true for existing files | PASS | 0.000s | |
| access: R_OK true for existing files | PASS | 0.000s | |
| access: W_OK false on read-only mount | PASS | 0.000s | |
| access: F_OK false for missing file | PASS | 0.000s | |
| readdir: listdir matches backing names | PASS | 0.000s | |
| scandir: entries match listdir | PASS | 0.000s | |
| scandir: each entry is_file() and not is_dir() | PASS | 0.000s | |
| openat / fstatat: relative resolution from dirfd | PASS | 0.000s | |
| dup / dup2: independent offsets | PASS | 0.000s | |
| fcntl: F_GETFL reports O_RDONLY | PASS | 0.000s | |
| mmap: PROT_READ MAP_PRIVATE returns matching bytes | PASS | 0.080s | |
| mmap: MAP_SHARED PROT_WRITE rejected | PASS | 0.000s | |
| path: trailing slash on regular file -> ENOTDIR | PASS | 0.000s | |
| path: redundant ./// segments resolve | PASS | 0.000s | |
| chdir + relative open works | PASS | 0.000s | |
| mutate: write rejected | PASS | 0.000s | |
| mutate: unlink rejected | PASS | 0.000s | |
| mutate: rename rejected | PASS | 0.000s | |
| mutate: mkdir rejected | PASS | 0.000s | |
| mutate: symlink rejected | PASS | 0.000s | |
| mutate: link rejected | PASS | 0.000s | |
| mutate: chmod rejected | PASS | 0.000s | |
| mutate: chown rejected (only if non-root) | PASS | 0.000s | |
| mutate: utime rejected | PASS | 0.000s | |
| mutate: truncate rejected | PASS | 0.000s | |
| xattr: getxattr returns ENOTSUP/ENODATA | PASS | 0.000s | |
| xattr: setxattr rejected | PASS | 0.000s | |
| fd churn: 1000 open/close cycles, no fd leak | PASS | 0.088s | |
Runner: bulk-data
| Check | Status | Duration | Detail |
|---|---|---|---|
| bulk-data:plink | PASS | 5.451s | compared 8952503 bytes (encoder total_size=8952503) |
| bulk-data:bgen | PASS | 9.006s | compared 104857600 bytes (encoder total_size=117100422) |
Runner: pjdfstest
| Check | Status | Duration | Detail |
|---|---|---|---|
| pjdfstest:open | PASS | 5.183s | ok=11 not_ok=312 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:granular | PASS | 0.055s | ok=7 not_ok=0 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:chflags | PASS | 0.097s | ok=14 not_ok=0 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:chmod | PASS | 14.093s | ok=2 not_ok=305 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:chown | PASS | 82.211s | ok=2 not_ok=1495 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:ftruncate | PASS | 3.371s | ok=3 not_ok=86 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:link | PASS | 12.235s | ok=16 not_ok=343 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:mkdir | PASS | 2.615s | ok=3 not_ok=115 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:mkfifo | PASS | 2.457s | ok=3 not_ok=117 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:mknod | PASS | 5.025s | ok=1 not_ok=185 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:rename | PASS | 26.748s | ok=6 not_ok=4851 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:rmdir | PASS | 2.797s | ok=4 not_ok=141 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:symlink | PASS | 2.568s | ok=3 not_ok=92 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:truncate | PASS | 3.593s | ok=3 not_ok=81 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:unlink | PASS | 19.556s | ok=3 not_ok=437 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
| pjdfstest:utimensat | PASS | 1.573s | ok=2 not_ok=120 timeouts=0 (read-only FS — high not_ok is expected; see log for samples) |
Runner: fio
| Check | Status | Duration | Detail |
|---|---|---|---|
| fio:seq-read | PASS | 30.340s | errors=0 io=396.4 MB runtime=30017ms throughput=13.2 MB/s |
| fio:rand-read | PASS | 60.544s | errors=0 io=28.7 MB runtime=30063ms throughput=1.0 MB/s |
| fio:mmap-read | PASS | 30.941s | errors=11 io=136.0 MB runtime=30705ms throughput=4.4 MB/s (informational) |
| fio:mmap-read:concurrent | PASS | 0.000s | records=100 fhs=17 max_overlap=12 |
| fio:parallel-seq-read | PASS | 30.534s | errors=0 io=70.2 MB runtime=30300ms throughput=2.3 MB/s |
| fio:parallel-seq-read:concurrent | PASS | 0.000s | records=556 fhs=21 max_overlap=18 |
| fio:multithread | PASS | 30.305s | errors=11 io=28.2 MB runtime=30070ms throughput=0.9 MB/s (informational) |
| fio:multithread:concurrent | PASS | 0.000s | records=530 fhs=31 max_overlap=16 |
| fio:static-stress-bim | PASS | 10.253s | errors=0 io=224.3 MB runtime=10010ms throughput=22.4 MB/s |
| fio:static-stress-fam | PASS | 10.246s | errors=0 io=361.9 MB runtime=10008ms throughput=36.2 MB/s |
Runner: fsx
| Check | Status | Duration | Detail |
|---|---|---|---|
| fsx:seed-7 | PASS | 8.796s | completed=50000/50000 mismatches=0 short_reads=0 |
| fsx:seed-23 | PASS | 5.869s | completed=50000/50000 mismatches=0 short_reads=0 |
| fsx:seed-101 | PASS | 5.852s | completed=50000/50000 mismatches=0 short_reads=0 |
Runner: stress-ng
| Check | Status | Duration | Detail |
|---|---|---|---|
| open-loop:4p:30s | PASS | 30.044s | workers=4 ops=183 errors=0 |
| open-loop:16p:30s | PASS | 30.123s | workers=16 ops=19856 errors=0 |
| stress-ng:background-load | PASS | 0.000s | rc=0 failed=None completed=None |
Runner: lifecycle
| Check | Status | Duration | Detail |
|---|---|---|---|
| lifecycle:cycles_complete | PASS | 257.420s | completed 50/50; mean=5.15s p99=5.59s max=5.59s |
| lifecycle:no_orphan_mounts | PASS | 0.000s | orphan fuse.biofuse mounts at /home/ubuntu/agents-work/sgkit-dev/biofuse/fs_tests/results/20260518T080912Z/lifecycle/mnt: 0 |
| lifecycle:max_cycle_within_budget | PASS | 0.000s | max cycle 5.59s vs budget 30.0s |
Runner: active-under-stress
| Check | Status | Duration | Detail |
|---|---|---|---|
| liveness:readdir | PASS | 0.000s | attempts=59 ok=59 timeouts=0 errors=0 max_latency=41.3ms |
| liveness:static-read:.bim | PASS | 0.000s | attempts=59 ok=59 timeouts=0 errors=0 max_latency=15.0ms |
| liveness:static-read:.fam | PASS | 0.000s | attempts=59 ok=59 timeouts=0 errors=0 max_latency=20.1ms |