Summary
sort --files0-from=F reads NUL-separated filenames from F and decodes each with std::str::from_utf8(&line).expect(...). Linux filenames may contain any byte except NUL and '/', so they need not be valid UTF-8. Any non-UTF-8 byte in the list therefore aborts the program (SIGABRT, exit 134) instead of being treated as a (missing) filename. GNU sort reports that it cannot read the file and exits 2.
Steps to reproduce
$ printf '\xff' > list # a single non-UTF-8 byte as the "filename"
$ sort --files0-from=list
thread 'main' panicked at src/uu/sort/src/sort.rs:2046:18:
Could not parse string from zero terminated input.: Utf8Error { valid_up_to: 0, error_len: Some(1) }
$ echo $?
134
Expected behavior
Match GNU: treat the bytes as a filename, fail to open it, and exit 2.
$ /usr/bin/sort --files0-from=list
sort: cannot read: ''$'\377': No such file or directory
$ echo $?
2
Actual behavior
uutils sort panics in .expect() on a Utf8Error and aborts (exit 134, with a core dump under the release panic="abort" profile).
Root cause
The --files0-from reader decodes each NUL-terminated entry as UTF-8 and unwraps:
// src/uu/sort/src/sort.rs:2045-2046
let f = std::str::from_utf8(&line)
    .expect("Could not parse string from zero terminated input.");
// … and the twin at :2062
files.push(OsString::from(
    std::str::from_utf8(&line)
        .expect("Could not parse string from zero terminated input."),
));
The first site (:2046) aborts before the second is reached.
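One possible direction for a fix, as a minimal sketch rather than a patch: on Unix, the raw bytes can be turned into an OsString directly, so no UTF-8 validation (and no panic) is needed. The helper name parse_files0 is hypothetical and does not exist in sort.rs. The sketch assumes a Unix target and silently skips empty entries, which is a simplification and may not match GNU's handling of zero-length names.

```rust
use std::ffi::OsString;
// Unix-only extension trait: on Unix an OsString is just a byte string,
// so it can be built from arbitrary bytes without UTF-8 validation.
use std::os::unix::ffi::OsStringExt;

/// Hypothetical helper (not part of uutils): split a NUL-separated list
/// into file names without requiring the entries to be valid UTF-8.
fn parse_files0(list: &[u8]) -> Vec<OsString> {
    list.split(|&b| b == 0)
        // Simplification: drop empty entries (e.g. after a trailing NUL).
        .filter(|entry| !entry.is_empty())
        // Keep the bytes as-is; opening the file later decides whether it exists.
        .map(|entry| OsString::from_vec(entry.to_vec()))
        .collect()
}
```

With this approach the entry from the reproduction (a single 0xff byte) becomes an ordinary, non-UTF-8 file name, and the later attempt to open it can fail with a regular "No such file or directory" error instead of a panic. A portable fix would need a separate code path for non-Unix targets.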
Found by our static analysis tooling.