bubblewrap: Add --not-a-security-boundary flag to enable fail-open behavior#751
bubblewrap: Add --not-a-security-boundary flag to enable fail-open behavior#751ao2 wants to merge 2 commits into
--not-a-security-boundary flag to enable fail-open behavior#751Conversation
--not-a-security-boundary flag to enable fail-open …--not-a-security-boundary flag to enable fail-open behavior
|
cc @smcv even though it's still a draft |
e64137e to
418a4f1
Compare
smcv
left a comment
There was a problem hiding this comment.
I'd like some smoke-test coverage for this in the test suite. Setting up a failing automounter (so that there will genuinely be an error to ignore) is probably too difficult to do in bubblewrap's test suite, but we can at least exercise the happy path where the new option makes no practical difference.
| (op->type == SETUP_DEV_BIND_MOUNT ? BIND_DEVICES : 0) | | ||
| ((op->type == SETUP_BIND_MOUNT && | ||
| opt_not_a_security_boundary) ? BIND_FAIL_OPEN : 0), | ||
| source, dest); |
There was a problem hiding this comment.
Why does only --bind fail open? If bubblewrap is not a security boundary, shouldn't --dev-bind and --ro-bind be equally willing to fail open?
Separately, I think we're getting enough flags that the repeated ?: operators make it harder to read, and it would be better more like:
bind_option_t bind_flags = 0;
if (opt_not_a_security_boundary)
bind_flags |= BIND_FAIL_OPEN;
if (op->type == SETUP_RO_BIND_MOUNT)
bind_flags |= BIND_READONLY;
if (op->type == SETUP_DEV_BIND_MOUNT)
bind_flags |= BIND_DEVICES;
/* etc. */
setup_op_bind_mount (bind_flags, source, dest);There was a problem hiding this comment.
As mentioned in the commit message -dev-bind and --ro-bind seemed more critical on a first look.
For --dev-bind I thought that, since the argument is probably a device node needed for the intended operations, if remounting failed it might affect functionality and we might want to signal it.
For --ro-bind I thought that remounting ro was asked explicitly and so we might want to fail hard if we cannot satisfy an explicit request.
But maybe I am being overly cautious here, because of my limited familiarity with bwrap use cases, and the warning is good enough, since --not-a-security-boundary was also an explicit directive.
There was a problem hiding this comment.
Done in the latest version.
|
The commit message should have a
Optionally also a mention of |
|
For manual testing, I see you've described how to reproduce the problem in ValveSoftware/steam-for-linux#10571 (comment) (it might be useful to summarize that in the commit message). It would be useful if you could create other mount points - perhaps What I expect will happen is that Similarly, if you request mounting them read-only, something like But it would be good to confirm this. |
Rework how flags are passed to setup_op_bind_mount() when calling it in setup_newroot(); no functional changes are made, this is just in preparation for adding add more flags in future commits and still keep the code readable. Signed-off-by: Antonio Ospite <antonio.ospite@collabora.com>
|
Some steps to manually test the change.
|
|
I'll try to condense the comment above about the manual tests into something suitable for the commit message. |
418a4f1 to
c93289e
Compare
…behavior Some callers of bwrap (e.g. xdg-dbus-proxy, Steam Runtime) use it purely to adjust filesystem layout, without any expectation of a security boundary between the sandbox and the host. For these callers, hard failures during sandbox setup (such as an automount timeout on a bind source) are unnecessarily fatal. So add a new `--not-a-security-boundary` option that can be used to relax the bubblewrap behavior in these specific cases, and allow it to "fail-open". The behavior can be tested by setting up an unavailable automount and check that: 1. bubblewrap succeeds even when accessing the unavailable automount fails; 2. remount flags like `nosuid,nodev` are not applied to the unavailable automount. ``` $ sudo sh -c 'echo "UUID=00000000-0000-0000-0000-000000000000 /mnt/bad ext4 defaults,noatime,nofail,inode_readahead_blks=64,nobarrier,x-systemd.automount 0 1" >> /etc/fstab' $ sudo systemctl daemon-reload $ sudo systemctl restart mnt-bad.automount $ mount | grep /mnt/bad systemd-1 on /mnt/bad type autofs (rw,relatime,fd=76,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=15854) $ ls /mnt/bad &>/dev/null & $ ./_build/bwrap --not-a-security-boundary --bind / / grep /mnt /proc/self/mountinfo bwrap: Can't remount /newroot/mnt/bad submount (No such device), ignoring error 860 769 0:41 / /mnt/bad rw,relatime master:45 - autofs systemd-1 rw,fd=107,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=144522 ``` steamrt/tasks#535 Resolves: containers#653 Helps: flatpak/flatpak#5112 Helps: ValveSoftware/steam-for-linux#10571 Helps: ValveSoftware/steam-runtime#766 Signed-off-by: Antonio Ospite <antonio.ospite@collabora.com>
c93289e to
d983f77
Compare
| { | ||
| /* And if we don't need a security boundary, we can also | ||
| * ignore other remount errors for submounts. */ | ||
| if (options & BIND_FAIL_OPEN) |
There was a problem hiding this comment.
I verified that having the check under the for loop that start with i = 1 seem to be enough:
if (recursive)
{
for (i = 1; mount_tab[i].mountpoint != NULL; i++)
{
...
This seems to cover also host mount points like /hdd directly under /, because even when --bind / / is used mount_tab[0].mountpoint will be /newroot, and so the code will handle /newroot/hdd
And we can assume that /newroot is not on an automount?
So we shouldn't nee to add the check also for mount_tab[0].mountpoint.
There was a problem hiding this comment.
And we can assume that
/newrootis not on an automount?
Normally, yes: it's on a tmpfs, look for comment /* Create a tmpfs which we will use as / in the namespace */.
If we're mounting an automount onto the root (like --bind /mnt/am / where /mnt/am is an automount) then I think mount_tab[0].mountpoint might be the automounter? But in that case, I think mount(2) would trigger the automount, mount the underlying filesystem, and either succeed or fail as appropriate. Please could you confirm that?
There was a problem hiding this comment.
I think it's reasonable to say that if /mnt/bad is a broken automount, then bwrap --not-a-security-boundary ... --bind /mnt/bad $DEST ... should still fail: even though we're not imposing a security boundary, we still asked to mount /mnt/bad onto $DEST, and if we're unable to do so, we should fail.
Conversely, if /mnt/bad is a broken automount and we mount /mnt onto $DEST successfully, it's a lot more reasonable to succeed even though were unable to remount /mnt/bad as nodev.
| (op->type == SETUP_DEV_BIND_MOUNT ? BIND_DEVICES : 0), | ||
| (op->type == SETUP_DEV_BIND_MOUNT ? BIND_DEVICES : 0) | | ||
| ((op->type == SETUP_BIND_MOUNT && | ||
| opt_not_a_security_boundary) ? BIND_FAIL_OPEN : 0), |
There was a problem hiding this comment.
Does it make sense to still fail closed for --ro-bind and --dev-bind ?
There was a problem hiding this comment.
Addressed this in the latest version.
| (op->type == SETUP_DEV_BIND_MOUNT ? BIND_DEVICES : 0) | | ||
| ((op->type == SETUP_BIND_MOUNT && | ||
| opt_not_a_security_boundary) ? BIND_FAIL_OPEN : 0), | ||
| source, dest); |
There was a problem hiding this comment.
As mentioned in the commit message -dev-bind and --ro-bind seemed more critical on a first look.
For --dev-bind I thought that, since the argument is probably a device node needed for the intended operations, if remounting failed it might affect functionality and we might want to signal it.
For --ro-bind I thought that remounting ro was asked explicitly and so we might want to fail hard if we cannot satisfy an explicit request.
But maybe I am being overly cautious here, because of my limited familiarity with bwrap use cases, and the warning is good enough, since --not-a-security-boundary was also an explicit directive.
| (op->type == SETUP_DEV_BIND_MOUNT ? BIND_DEVICES : 0) | | ||
| ((op->type == SETUP_BIND_MOUNT && | ||
| opt_not_a_security_boundary) ? BIND_FAIL_OPEN : 0), | ||
| source, dest); |
There was a problem hiding this comment.
Done in the latest version.
| { | ||
| /* And if we don't need a security boundary, we can also | ||
| * ignore other remount errors for submounts. */ | ||
| if (options & BIND_FAIL_OPEN) |
There was a problem hiding this comment.
And we can assume that
/newrootis not on an automount?
Normally, yes: it's on a tmpfs, look for comment /* Create a tmpfs which we will use as / in the namespace */.
If we're mounting an automount onto the root (like --bind /mnt/am / where /mnt/am is an automount) then I think mount_tab[0].mountpoint might be the automounter? But in that case, I think mount(2) would trigger the automount, mount the underlying filesystem, and either succeed or fail as appropriate. Please could you confirm that?
| { | ||
| /* And if we don't need a security boundary, we can also | ||
| * ignore other remount errors for submounts. */ | ||
| if (options & BIND_FAIL_OPEN) |
There was a problem hiding this comment.
I think it's reasonable to say that if /mnt/bad is a broken automount, then bwrap --not-a-security-boundary ... --bind /mnt/bad $DEST ... should still fail: even though we're not imposing a security boundary, we still asked to mount /mnt/bad onto $DEST, and if we're unable to do so, we should fail.
Conversely, if /mnt/bad is a broken automount and we mount /mnt onto $DEST successfully, it's a lot more reasonable to succeed even though were unable to remount /mnt/bad as nodev.
| intended to create a security boundary between the sandbox and the | ||
| host system. When this option is given, certain non-fatal sandbox | ||
| setup failures (such as a bind mount failing because an automounter | ||
| did not respond in time) will produce a warning and will be skipped, |
There was a problem hiding this comment.
Hmm, technically it's not the bind mount that is failing in the scenario that motivated this option. That scenario is something like this:
- user wants to
bwrap --ro-bind(orbwrap --bind), let's say/mntonto/mnt bwrap --ro-bindimplies that they want the whole/mnthierarchy to be mountedro, and similarly eitherbwrap --ro-bindorbwrap --bindimplies that they want the whole/mnthierarchy to be mountednodev- but
mount(2)can't do that atomically, so we can only achieve that[1] with the C equivalent of:mount --bind /oldroot/mnt /newroot/mntmount -o remount,ro,nodev,nosuid /newroot/mntmount -o remount,ro,nodev,nosuid /newroot/mnt/badmount -o remount,ro,nodev,nosuid /newroot/mnt/good- ...
"A bind mount failing" would be the mount --bind-equivalent, but that's not what is failing here: what is failing here is the remount of /newroot/mnt/bad, specifically.
So I think it would be more accurate to say something like
setup failures (such as a subdirectory remount failing because an automounter
did not respond in time) will produce a warning and will be skipped,
If you're touching this text anyway, consider also doing semantic line-breaks after each sentence, after each comma, before and after the part in parentheses and so on, even if the line-length limit has not yet been reached, to allow later edits with minimal diff. Ordinary flowing text in Docbook gets line-wrapped automatically, so that won't affect the rendered output.
[1] assuming we don't want to increase our minimum kernel version, which I think we don't. In newer kernels we'd be able to use the "new mount APIs", like open_tree(2), mount_setattr(2) and move_mount(2), but using those APIs exclusively would mean losing the ability to run on old host OSs like Ubuntu 20.04.
Some callers of bwrap (e.g. xdg-dbus-proxy, Steam Runtime) use it purely to adjust filesystem layout, without any expectation of a security boundary between the sandbox and the host.
For these callers, hard failures during sandbox setup (such as an automount timeout on a bind source) are unnecessarily fatal.
So add a new
--not-a-security-boundaryoption that can be used to relax the bubblewrap behavior in these specific cases, and allow it to "fail-open".