fix: sanitize non-UTF-8 label values in filesystem collector #3675

Open
toller892 wants to merge 1 commit into prometheus:master from toller892:fix/filesystem-non-utf8-mountpoint

@toller892

Problem

The filesystem collector panics when a filesystem is mounted on a non-UTF-8 path. Linux paths are byte strings and can contain arbitrary non-UTF-8 sequences. When prometheus.MustNewConstMetric receives a non-UTF-8 label value, it panics:

panic: label value "/mnt/cass\xe9" is not valid UTF-8

goroutine 35 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
github.com/prometheus/node_exporter/collector.(*filesystemCollector).Update(0x400019e9a0, 0x40002463f0)
        /app/collector/filesystem_common.go:198 +0x704

Fixes #3662

Fix

Sanitize the device, mountPoint, and fsType label values using strings.ToValidUTF8() in parseFilesystemLabels(). Invalid UTF-8 sequences are replaced with the Unicode replacement character (U+FFFD), which is valid UTF-8 and won't cause a panic.

This is consistent with how other Prometheus exporters handle non-UTF-8 data.
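The sanitization step described above can be sketched with the standard library alone. The helper name `sanitizeLabel` is an assumption made for illustration and is not taken from the PR diff; the only real API used is `strings.ToValidUTF8`, and the sample path is the one from the panic message.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sanitizeLabel is a hypothetical helper (not from the PR diff). It replaces
// each run of invalid UTF-8 bytes with the Unicode replacement character.
func sanitizeLabel(s string) string {
	return strings.ToValidUTF8(s, "\uFFFD")
}

func main() {
	raw := "/mnt/cass\xe9" // 0xe9 is Latin-1 "é", which is not valid UTF-8 on its own
	clean := sanitizeLabel(raw)

	fmt.Println(utf8.ValidString(raw))   // false
	fmt.Println(utf8.ValidString(clean)) // true
	fmt.Printf("%+q\n", clean)           // ASCII-escaped form of the sanitized value
}
```

Note that `strings.ToValidUTF8` replaces a whole run of invalid bytes with a single replacement string, so two distinct non-UTF-8 mount points could in principle map to the same label value. Whether that matters for this collector is not addressed in the PR text.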

Changes

  • collector/filesystem_linux.go: Add strings.ToValidUTF8() sanitization for all string label values in parseFilesystemLabels()
  • collector/filesystem_linux_test.go: Add tests for non-UTF-8 mount points, non-UTF-8 device names, and valid UTF-8 preservation

Testing

  • Verified the fix compiles for Linux (GOOS=linux go build ./collector/)
  • Verified test file compiles for Linux (GOOS=linux go test -c ./collector/)
  • Added test cases covering:
    • Non-UTF-8 mount point is sanitized (replacement character)
    • Non-UTF-8 device name is sanitized
    • Valid UTF-8 is preserved (no false positives)
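The cases listed above could be arranged as a single table, roughly as in the sketch below. The `labels` struct, the `sanitize` function, and the sample values are assumptions for illustration; the real tests call `parseFilesystemLabels()` and are not reproduced here. In an actual `_test.go` file the loop body would normally live inside `t.Run`.

```go
package main

import (
	"fmt"
	"strings"
)

// labels is an illustrative stand-in for the three label values named in the
// PR description; it is not the collector's own type.
type labels struct {
	device, mountPoint, fsType string
}

// sanitize applies strings.ToValidUTF8 to every field (hypothetical helper).
func sanitize(l labels) labels {
	const repl = "\uFFFD"
	return labels{
		device:     strings.ToValidUTF8(l.device, repl),
		mountPoint: strings.ToValidUTF8(l.mountPoint, repl),
		fsType:     strings.ToValidUTF8(l.fsType, repl),
	}
}

// failures runs every case and returns one message per mismatch.
func failures() []string {
	cases := []struct {
		name     string
		in, want labels
	}{
		{"non-UTF-8 mount point",
			labels{"/dev/sda1", "/mnt/cass\xe9", "ext4"},
			labels{"/dev/sda1", "/mnt/cass\uFFFD", "ext4"}},
		{"non-UTF-8 device name",
			labels{"/dev/\xff", "/mnt/data", "ext4"},
			labels{"/dev/\uFFFD", "/mnt/data", "ext4"}},
		{"valid UTF-8 is preserved",
			labels{"/dev/sda1", "/mnt/café", "ext4"},
			labels{"/dev/sda1", "/mnt/café", "ext4"}},
	}
	var out []string
	for _, c := range cases {
		if got := sanitize(c.in); got != c.want {
			out = append(out, fmt.Sprintf("%s: got %q, want %q", c.name, got, c.want))
		}
	}
	return out
}

func main() {
	fmt.Println(len(failures())) // 0
}
```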

Linux paths are byte strings and can contain non-UTF-8 sequences.
When a filesystem is mounted on a non-UTF-8 path, the filesystem
collector panics because prometheus.MustNewConstMetric requires
valid UTF-8 label values.

Sanitize device, mountPoint, and fsType label values using
strings.ToValidUTF8() in parseFilesystemLabels() to replace
invalid UTF-8 sequences with the Unicode replacement character.

Fixes prometheus#3662

Signed-off-by: toller892 <toller892@users.noreply.github.com>
"/run/user/1000": "",
"/run/user/1000/gvfs": "",
"/var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[vsanDatastore] bafb9e5a-8856-7e6c-699c-801844e77a4a/kubernetes-dynamic-pvc-3eba5bba-48a3-11e8-89ab-005056b92113.vmdk": "",
"/var/lib/kubelet/plugins/kubernetes.io/vsphere-volume/mounts/[vsanDatastore] bafb9e5a-8856-7e6c-699c-801844e77a4a/kubernetes-dynamic-pvc-3eba5bba-48a3-11e8-89ab-005056b92113.vmdk": "",

Is this change required by the non-UTF-8 bug fix?

}
}

func Test_parseFilesystemLabelsNonUTF8DoesNotPanic(t *testing.T) {

This test largely overlaps the table test; both guard against the same revert. Its only unique coverage is an invalid fsType, so consider folding that case into the table test and dropping this one. If you'd rather keep a named regression test for #3662, make it exercise MustNewConstMetric so that it actually reproduces the panic. Right now neither test hits that path.

@nicolastakashi

@toller892 could you also address the lint issues?

Development

Successfully merging this pull request may close these issues.

filesystemCollector crash with non-utf8 mount point
