BUG: check direct syscall before multiplexed pseudo-syscall #488

Open
nikita-dubrovskii wants to merge 2 commits into seccomp:main from nikita-dubrovskii:fix_syscall_name_munging

@nikita-dubrovskii

Docker 29.4.2 removed socketcall(2) from the default seccomp profile. On s390x, this broke socket operations:

  # strace curl icanhazip.com
  socket(AF_UNIX, SOCK_STREAM, 0) = -1 ENOSYS (Function not implemented)

  # scmp_sys_resolver -a s390x socket
  -101

  # ausyscall s390x socket
  socket             359

The abi_syscall_resolve_name_munge() function was returning __PNR_socket (-101) instead of checking if arch implements socket(2) directly (359).

Fix by checking arch->syscall_resolve_name_raw() first, only falling back to multiplexed pseudo-syscalls if the direct implementation doesn't exist.

Affects socket and IPC syscalls on architectures with direct implementations (s390x, aarch64, etc).
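
The lookup order described above can be sketched as a toy model. This is not the libseccomp implementation: every `toy_` name and the table layout are hypothetical, and only the two numbers (359 and -101) come from the output quoted in this description.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NR_ERROR   (-1)    /* hypothetical "name not found" marker */
#define TOY_PNR_SOCKET (-101)  /* __PNR_socket value quoted above */
#define TOY_NR_SOCKET  359     /* s390x socket(2) number quoted above */

/* Hypothetical name -> number table entry. */
struct toy_entry { const char *name; int nr; };

/* Hypothetical arch description: only its direct syscall table. */
struct toy_arch { const struct toy_entry *direct; size_t n_direct; };

static const struct toy_entry toy_direct[] = { { "socket", TOY_NR_SOCKET } };
static const struct toy_entry toy_pseudo[] = { { "socket", TOY_PNR_SOCKET } };

/* An arch that implements socket(2) directly, and one that only multiplexes. */
const struct toy_arch toy_arch_direct = { toy_direct, 1 };
const struct toy_arch toy_arch_muxed  = { NULL, 0 };

/* Linear search; returns TOY_NR_ERROR when the name is absent. */
static int toy_lookup(const struct toy_entry *table, size_t n, const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].nr;
    return TOY_NR_ERROR;
}

/* Direct table first; fall back to the pseudo-syscall table only on a miss. */
int toy_resolve_name(const struct toy_arch *arch, const char *name)
{
    int nr = toy_lookup(arch->direct, arch->n_direct, name);
    if (nr != TOY_NR_ERROR)
        return nr;
    return toy_lookup(toy_pseudo, sizeof(toy_pseudo) / sizeof(toy_pseudo[0]), name);
}
```

In this model, `toy_resolve_name(&toy_arch_direct, "socket")` yields the direct number, while the same call with `&toy_arch_muxed` still falls back to the pseudo-syscall number.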

Signed-off-by: Nikita Dubrovskii <nikita@linux.ibm.com>
@nikita-dubrovskii
Author

This won't fix Docker itself, but at least the tools will return the correct values:

[zukku@a3elp43 libseccomp]$ scmp_sys_resolver socket
-101

$ LD_PRELOAD=./src/.libs/libseccomp.so scmp_sys_resolver socket
359

@pcmoore
Member

pcmoore commented May 18, 2026

We need to be careful with changes like this as it is changing the behavior of the library in a way that might not be friendly to all of the users.

@pcmoore
Member

pcmoore commented May 18, 2026

Marking this as a BUG for now, but until we understand the impact of a change like this we need to take the classification with a grain of salt ...

@pcmoore changed the title from "syscalls: check direct syscall before multiplexed pseudo-syscall" to "BUG: check direct syscall before multiplexed pseudo-syscall" on May 18, 2026

After switching to direct syscall resolution, SCMP_SYS(accept) resolves to
the direct syscall number instead of the pseudo-syscall. This creates rules
for both socketcall and the direct syscall without argument restrictions.

Change expected result from KILL to ALLOW for accept/accept4 with arbitrary
arguments, since they now match the direct syscall rule.

Signed-off-by: Nikita Dubrovskii <nikita@linux.ibm.com>
@drakenclimber
Member

drakenclimber commented May 19, 2026

I think having libseccomp use the "standard" syscall before the multiplexed pseudo-syscall is a good discussion to have, but as @pcmoore pointed out, the ramifications could be large. We would definitely have to proceed with caution if we were to make such a change.

With that said, is this still urgent? It looks like Docker v29.4.3 [1] fixed this. Correct?

[1] moby/moby#52555

@nikita-dubrovskii
Author

> With that said, is this still urgent? It looks like Docker v29.4.3 [1] fixed this. Correct?

Yes, it should be working now.

> I think having libseccomp use the "standard" syscall before the multiplexed pseudo-syscall is a good discussion to have, but as @pcmoore pointed out, the ramifications could be large.

After checking the git history, I have the impression that the current behaviour was implemented as a workaround many years ago. Then came several updates like this with some mux/demux, and finally a consolidation moved all the logic into centralized functions. But the bug was still there: instead of returning the direct syscall, the code always returned the multiplexed one.

The code DID work:

  1. Name resolution (buggy): abi_syscall_resolve_name_munge("socket") returned __PNR_socket (-101) instead of 359
  2. Dual-rule generation in abi_rule_add() logic:

This fix doesn't break rule generation, as abi_rule_add() works correctly regardless of whether you start with the direct syscall (359) or pseudo-syscall (-101).
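
To illustrate the claim above, here is a toy sketch of a dual-rule expansion that produces the same pair of rules from either starting number. It is not the code in abi_rule_add(): every `toy_`/`TOY_` name is hypothetical, and the socketcall number (102 on s390x) and the SYS_SOCKET sub-call value (1) are assumptions taken from the Linux headers as I understand them, not from this thread.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_SOCKET     359     /* s390x socket(2), as quoted in this PR */
#define TOY_PNR_SOCKET    (-101)  /* __PNR_socket, as quoted in this PR */
#define TOY_NR_SOCKETCALL 102     /* assumed s390x socketcall(2) number */
#define TOY_SYS_SOCKET    1       /* assumed socketcall sub-call for socket */

/* Hypothetical rule: a syscall number plus an optional check on argument 0. */
struct toy_rule {
    int nr;
    int check_arg0;       /* non-zero if arg0 must equal the value below */
    unsigned long arg0;
};

/*
 * Expand a socket request into two rules: one for the direct syscall and
 * one for socketcall restricted to the socket sub-call. Returns the number
 * of rules written, or 0 if the number is not one this toy model knows.
 */
size_t toy_expand_socket_rule(int nr, struct toy_rule out[2])
{
    assert(out != NULL);
    if (nr != TOY_NR_SOCKET && nr != TOY_PNR_SOCKET)
        return 0;
    out[0] = (struct toy_rule){ TOY_NR_SOCKET, 0, 0 };
    out[1] = (struct toy_rule){ TOY_NR_SOCKETCALL, 1, TOY_SYS_SOCKET };
    return 2;
}
```

Because the expansion keys on "is this a socket request" rather than on which of the two numbers was passed in, the resulting rule pair is identical in both cases in this model.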

This is my understanding.

@drakenclimber
Member

> The code DID work:
>
>   1. Name resolution (buggy): abi_syscall_resolve_name_munge("socket") returned __PNR_socket (-101) instead of 359
>   2. Dual-rule generation in abi_rule_add() logic:
>
> This fix doesn't break rule generation, as abi_rule_add() works correctly regardless of whether you start with the direct syscall (359) or pseudo-syscall (-101).
>
> This is my understanding.

Thanks. Your research is thorough and spot on.

The worry I have is that this will change the default multiplexing behavior for all users on all architectures. I don't know what the fallout would be in that case.
