I migrated our clusters from running drivers on the host to having the GPU Operator manage them. Previously, we had Node Problem Detector check various information about the device and drivers and set node conditions accordingly. However, many of those checks relied on nvidia-smi.
Given that nvidia-smi is not available on the host when using the GPU Operator, what is the best approach for integrating with Node Problem Detector? Specifically, we want a condition that tells us whether the drivers are installed, which we plan to use with https://kubernetes.io/blog/2026/02/03/introducing-node-readiness-controller/ to taint newly created nodes until they are ready.