BUG: check direct syscall before multiplexed pseudo-syscall #488
nikita-dubrovskii wants to merge 2 commits into
Docker 29.4.2 removed socketcall(2) from the default seccomp profile. On s390x, this broke socket operations:

    # strace curl icanhazip.com
    socket(AF_UNIX, SOCK_STREAM, 0) = -1 ENOSYS (Function not implemented)

    # scmp_sys_resolver -a s390x socket
    -101

    # ausyscall s390x socket
    socket 359

The abi_syscall_resolve_name_munge() function was returning __PNR_socket (-101) instead of checking if arch implements socket(2) directly (359).

Fix by checking arch->syscall_resolve_name_raw() first, only falling back to multiplexed pseudo-syscalls if the direct implementation doesn't exist.

Affects socket and IPC syscalls on architectures with direct implementations (s390x, aarch64, etc).

Signed-off-by: Nikita Dubrovskii <nikita@linux.ibm.com>
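The change in lookup order described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the libseccomp implementation: the tables, the `resolve_*` helpers, and the `NR_UNDEFINED` marker are hypothetical names invented for illustration, and only the values -101 and 359 come from the output quoted in the commit message.

```c
#include <stddef.h>
#include <string.h>

/* Values taken from the commit message: pseudo-syscall vs. direct socket(2). */
#define PNR_SOCKET   (-101)
/* Hypothetical "name not found" marker for this toy model. */
#define NR_UNDEFINED (-1)

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

struct sys_entry {
    const char *name;
    int nr;
};

/* Toy table for an architecture that implements socket(2) directly. */
static const struct sys_entry direct_arch[] = {
    { "read", 3 },
    { "socket", 359 },
};

/* Toy table for an architecture that only has the multiplexed call. */
static const struct sys_entry mux_only_arch[] = {
    { "read", 3 },
};

/* Raw lookup: does this architecture have a direct syscall of that name? */
static int resolve_raw(const struct sys_entry *tbl, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tbl[i].name, name) == 0)
            return tbl[i].nr;
    return NR_UNDEFINED;
}

/* Pseudo lookup: names that can be multiplexed through socketcall(2). */
static int resolve_pseudo(const char *name)
{
    return strcmp(name, "socket") == 0 ? PNR_SOCKET : NR_UNDEFINED;
}

/* Order before the change: the pseudo-syscall wins whenever one exists. */
static int resolve_old(const struct sys_entry *tbl, size_t n, const char *name)
{
    int nr = resolve_pseudo(name);
    return nr != NR_UNDEFINED ? nr : resolve_raw(tbl, n, name);
}

/* Proposed order: try the direct syscall first, then fall back. */
static int resolve_new(const struct sys_entry *tbl, size_t n, const char *name)
{
    int nr = resolve_raw(tbl, n, name);
    return nr != NR_UNDEFINED ? nr : resolve_pseudo(name);
}
```

In this model, `resolve_old(direct_arch, ARRAY_LEN(direct_arch), "socket")` yields the pseudo-syscall number while `resolve_new` yields the direct one; on the table without a direct `socket` entry, `resolve_new` still falls back to the pseudo-syscall.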
This won't fix Docker itself, but at least the tools will return the correct values:
We need to be careful with changes like this, as it changes the behavior of the library in a way that might not be friendly to all of the users.
Marking this as a BUG for now, but until we understand the impact of a change like this we need to take the classification with a grain of salt ...
After switching to direct syscall resolution, SCMP_SYS(accept) resolves to the direct syscall number instead of the pseudo-syscall. This creates rules for both socketcall and the direct syscall without argument restrictions.

Change expected result from KILL to ALLOW for accept/accept4 with arbitrary arguments, since they now match the direct syscall rule.

Signed-off-by: Nikita Dubrovskii <nikita@linux.ibm.com>

I think having libseccomp use the "standard" syscall before the multiplexed pseudo-syscall is a good discussion to have, but as @pcmoore pointed out, the ramifications could be large. We would definitely have to proceed with caution if we were to make such a change.

With that said, is this still urgent? It looks like Docker v29.4.3 [1] fixed this. Correct?

[1] moby/moby#52555
Yes, it should be working now.
After checking the git history, I have the impression that the current behaviour was implemented as a workaround many years ago.
Then several updates like this with some
The code DID work:
This fix doesn't break rule generation, as
This is my understanding.
Thanks. Your research is thorough and spot on. The worry I have is that this will change the default multiplexing behavior for all users on all architectures. I don't know what the fallout would be in that case.