Extend DRM labels #3145

Open
Deezzir wants to merge 1 commit into prometheus:master from Deezzir:add-drm-chip

Deezzir commented Oct 3, 2024

Issue

Add chip label to the node_drm_card_info metric.

Solution

Extend the DRM collector to expose a chip label taken from ClassDRMCardAMDGPUStats.
The corresponding change was proposed in the procfs repo, and that PR has been merged.
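
For reviewers wondering where a value such as `0000:9d:00_0_0000:9e:00_0` comes from, here is a minimal sketch of one way such a label could be derived. It assumes the label is built from the last two components of the device's resolved sysfs path, lowercased, with every character outside `[a-z0-9:_]` replaced by an underscore, which is consistent with the example values in this PR. The function names `chipLabel` and `cleanName` and the sample path are hypothetical and are not taken from the procfs change.

```go
package main

import (
	"path"
	"regexp"
	"strings"
)

// invalidChars matches every character that is not allowed in the label value.
// Assumption: the allowed set is lowercase letters, digits, ':' and '_'.
var invalidChars = regexp.MustCompile(`[^a-z0-9:_]`)

// cleanName lowercases s, replaces disallowed characters with '_' and
// trims leading and trailing underscores.
func cleanName(s string) string {
	lower := strings.ToLower(s)
	replaced := invalidChars.ReplaceAllLiteralString(lower, "_")
	return strings.Trim(replaced, "_")
}

// chipLabel (hypothetical helper) joins the cleaned parent directory name
// and the cleaned device name of a resolved device path with '_'.
// If there is no parent component, only the cleaned device name is returned.
func chipLabel(devicePath string) string {
	parentDir, devName := path.Split(path.Clean(devicePath))
	_, devType := path.Split(strings.TrimRight(parentDir, "/"))

	name, typ := cleanName(devName), cleanName(devType)
	if name != "" && typ != "" {
		return typ + "_" + name
	}
	return name
}
```

With the illustrative path `/sys/devices/pci0000:00/0000:9d:00.0/0000:9e:00.0`, `chipLabel` returns `0000:9d:00_0_0000:9e:00_0`, the same shape as the chip value shown in the example metrics.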

Context

The DRM metrics currently cannot be related to the hwmon metrics, which export other useful information about AMD GPUs. The new label makes it possible to relate metrics from both hwmon and drm and, in turn, to build a better AMD GPU dashboard with proper filtering and labelling.

Example metrics:

# DRM
node_drm_card_info{card="card0",chip="0000:9d:00_0_0000:9e:00_0",memory_vendor="unknown",power_performance_level="auto",unique_id="",vendor="amd"} 1

# HWMON
node_hwmon_chip_names{chip="0000:9d:00_0_0000:9e:00_0",chip_name="amdgpu"} 1
node_hwmon_fan_enable{chip="0000:9d:00_0_0000:9e:00_0",sensor="fan1"} 0
node_hwmon_fan_max_rpm{chip="0000:9d:00_0_0000:9e:00_0",sensor="fan1"} 6900
node_hwmon_fan_min_rpm{chip="0000:9d:00_0_0000:9e:00_0",sensor="fan1"} 1800
node_hwmon_fan_rpm{chip="0000:9d:00_0_0000:9e:00_0",sensor="fan1"} 2035
node_hwmon_fan_target_rpm{chip="0000:9d:00_0_0000:9e:00_0",sensor="fan1"} 2035
node_hwmon_freq_freq_mhz{chip="0000:9d:00_0_0000:9e:00_0",sensor="mclk"} 1500
node_hwmon_freq_freq_mhz{chip="0000:9d:00_0_0000:9e:00_0",sensor="sclk"} 214
node_hwmon_in_volts{chip="0000:9d:00_0_0000:9e:00_0",sensor="in0"} 0.85
node_hwmon_power_average_watt{chip="0000:9d:00_0_0000:9e:00_0",sensor="power1"} 6.075
node_hwmon_power_cap_default_watt{chip="0000:9d:00_0_0000:9e:00_0",sensor="power1"} 35
node_hwmon_power_cap_max_watt{chip="0000:9d:00_0_0000:9e:00_0",sensor="power1"} 35
node_hwmon_power_cap_min_watt{chip="0000:9d:00_0_0000:9e:00_0",sensor="power1"} 0
node_hwmon_power_cap_watt{chip="0000:9d:00_0_0000:9e:00_0",sensor="power1"} 35
node_hwmon_pwm{chip="0000:9d:00_0_0000:9e:00_0",sensor="pwm1"} 49
node_hwmon_pwm_enable{chip="0000:9d:00_0_0000:9e:00_0",sensor="pwm1"} 2
node_hwmon_pwm_max{chip="0000:9d:00_0_0000:9e:00_0",sensor="pwm1"} 255
node_hwmon_pwm_min{chip="0000:9d:00_0_0000:9e:00_0",sensor="pwm1"} 0
node_hwmon_sensor_label{chip="0000:9d:00_0_0000:9e:00_0",label="edge",sensor="temp1"} 1
node_hwmon_sensor_label{chip="0000:9d:00_0_0000:9e:00_0",label="mclk",sensor="freq2"} 1
node_hwmon_sensor_label{chip="0000:9d:00_0_0000:9e:00_0",label="sclk",sensor="freq1"} 1
node_hwmon_sensor_label{chip="0000:9d:00_0_0000:9e:00_0",label="slowPPT",sensor="power1"} 1
node_hwmon_sensor_label{chip="0000:9d:00_0_0000:9e:00_0",label="vddgfx",sensor="in0"} 1
node_hwmon_temp_celsius{chip="0000:9d:00_0_0000:9e:00_0",sensor="temp1"} 42
node_hwmon_temp_crit_celsius{chip="0000:9d:00_0_0000:9e:00_0",sensor="temp1"} 97
node_hwmon_temp_crit_hyst_celsius{chip="0000:9d:00_0_0000:9e:00_0",sensor="temp1"} -273.15000000000003
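
To illustrate the join that a shared chip label enables, here is a small sketch that groups hwmon samples under the DRM card with the same chip value. The `Sample` type and the functions `cardByChip` and `hwmonByCard` are hypothetical stand-ins for illustration only; they are not part of node_exporter, and a dashboard would normally express the same join in its query language.

```go
package main

import "strings"

// Sample is a hypothetical, simplified stand-in for one scraped series.
type Sample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// cardByChip builds a chip -> card lookup from node_drm_card_info samples.
func cardByChip(samples []Sample) map[string]string {
	out := make(map[string]string)
	for _, s := range samples {
		if s.Name != "node_drm_card_info" {
			continue
		}
		if chip := s.Labels["chip"]; chip != "" {
			out[chip] = s.Labels["card"]
		}
	}
	return out
}

// hwmonByCard groups node_hwmon_* samples by the DRM card that shares
// their chip label. Samples whose chip has no matching card are skipped.
func hwmonByCard(samples []Sample) map[string][]Sample {
	lookup := cardByChip(samples)
	out := make(map[string][]Sample)
	for _, s := range samples {
		if !strings.HasPrefix(s.Name, "node_hwmon_") {
			continue
		}
		card, ok := lookup[s.Labels["chip"]]
		if !ok {
			continue
		}
		out[card] = append(out[card], s)
	}
	return out
}
```

Fed the samples above, `hwmonByCard` would place every listed `node_hwmon_*` series under `card0`, since they all carry the chip value reported by `node_drm_card_info`.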

Signed-off-by: Deezzir <deezzir@gmail.com>

Deezzir commented Feb 25, 2026

@discordianfish and @SuperQ PTAL
