fix: sanitize non-UTF-8 label values in filesystem collector#3675
Open
toller892 wants to merge 1 commit into
Open
fix: sanitize non-UTF-8 label values in filesystem collector#3675toller892 wants to merge 1 commit into
toller892 wants to merge 1 commit into
Conversation
Linux paths are byte strings and can contain non-UTF-8 sequences. When a filesystem is mounted on a non-UTF-8 path, the filesystem collector panics because prometheus.MustNewConstMetric requires valid UTF-8 label values. Sanitize device, mountPoint, and fsType label values using strings.ToValidUTF8() in parseFilesystemLabels() to replace invalid UTF-8 sequences with the Unicode replacement character. Fixes prometheus#3662 Signed-off-by: toller892 <toller892@users.noreply.github.com>
| "/run/user/1000": "", | ||
| "/run/user/1000/gvfs": "", | ||
| "/var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[vsanDatastore] bafb9e5a-8856-7e6c-699c-801844e77a4a/kubernetes-dynamic-pvc-3eba5bba-48a3-11e8-89ab-005056b92113.vmdk": "", | ||
| "/var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[vsanDatastore] bafb9e5a-8856-7e6c-699c-801844e77a4a/kubernetes-dynamic-pvc-3eba5bba-48a3-11e8-89ab-005056b92113.vmdk": "", |
There was a problem hiding this comment.
Is this required by the Non-UTF8 bug fix?
| } | ||
| } | ||
|
|
||
| func Test_parseFilesystemLabelsNonUTF8DoesNotPanic(t *testing.T) { |
There was a problem hiding this comment.
This test largely overlaps the table test, both guard the same revert. Its only unique coverage is an invalid fsType; consider folding that into the table test and dropping this one. If you'd rather keep a named regression test for #3662, make it exercise MustNewConstMetric so it actually reproduces the panic, right now neither test hits that path.
|
@toller892 could you also address the lint issues? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The filesystem collector panics when a filesystem is mounted on a non-UTF-8 path. Linux paths are byte strings and can contain arbitrary non-UTF-8 sequences. When
prometheus.MustNewConstMetricreceives a non-UTF-8 label value, it panics:Fixes #3662
Fix
Sanitize the
device,mountPoint, andfsTypelabel values usingstrings.ToValidUTF8()inparseFilesystemLabels(). Invalid UTF-8 sequences are replaced with the Unicode replacement character (U+FFFD), which is valid UTF-8 and won't cause a panic.This is consistent with how other Prometheus exporters handle non-UTF-8 data.
Changes
collector/filesystem_linux.go: Addstrings.ToValidUTF8()sanitization for all string label values inparseFilesystemLabels()collector/filesystem_linux_test.go: Add tests for non-UTF-8 mount points, non-UTF-8 device names, and valid UTF-8 preservationTesting
GOOS=linux go build ./collector/)GOOS=linux go test -c ./collector/)