Improve nvme_device_critical_warnings_state alert when drive's warranty from the manufacturer is over #17311
-
@andy108369 hey. Can you share the output of sudo nvme smart-log /dev/nvme1? Also, I see you have
which is likely the reason for the alarm and not the percentage used. This is the nvme.device_critical_warnings_state chart description (formatted for better readability):
You can change "group by" to "dimension" + "label:device" to find out which critical warning status is active for your device.
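If you would rather check from the command line, here is a minimal sketch that decodes the critical warning byte reported by nvme smart-log. The bit meanings follow the NVMe base specification's SMART / Health log; the function name and the short descriptions are my own and are not Netdata's dimension names.

```python
# Bit meanings of the "Critical Warning" byte in the NVMe SMART / Health log.
# The descriptions are paraphrased; they are not Netdata dimension names.
CRITICAL_WARNING_BITS = {
    0: "available spare below threshold",
    1: "temperature outside threshold",
    2: "NVM subsystem reliability degraded",
    3: "media placed in read-only mode",
    4: "volatile memory backup device failed",
    5: "persistent memory region read-only or unreliable",
}


def decode_critical_warning(value: int) -> list[str]:
    """Return a description for every bit set in the critical warning byte."""
    return [
        desc
        for bit, desc in CRITICAL_WARNING_BITS.items()
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # Sample values for illustration only, not taken from any real drive.
    for raw in (0x00, 0x04, 0x05):
        active = decode_critical_warning(raw) or ["no warnings"]
        print(f"{raw:#04x}: {', '.join(active)}")
```

Pass in the critical_warning value from your own smart-log output to see which condition the drive is actually reporting.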
-
Same issue. Available spare is fine; it seems related to that kernel bug report about the SMART error log filling up, which is not related to NvmSubsystemReliability. Is there any way to change that behaviour? The NVMe seems fine, and the alert as it stands will mask real upcoming issues.
root@h1pve:/etc/netdata# nvme smart-log /dev/nvme1
-
Hi,
I've noticed this alert:
On the server:
However, Hetzner (DC owner) said it's false alarm:
My question is: how do I address the alert, given Hetzner's response that it's a false alarm caused by "Percentage Used" exceeding 100%, which indicates that the drive's warranty period has ended?
Can I suppress this error either in SMART (smartctl) or the Netdata Agent for my case?
Also, I don't want to disable it entirely, since a different critical alert or warning could come up later for a disk malfunction or similar. I'd like to disable it only for this particular warranty-related case, or improve the Netdata Agent's judgment in this regard.
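To make the request concrete, here is a rough sketch of the judgment I have in mind. It is a hypothetical policy, not something the Netdata Agent does today. It assumes that bit 2 of the critical warning byte is "NVM subsystem reliability degraded" (per the NVMe spec) and that the wear-related case shows up as that bit alone together with Percentage Used at or above 100; both function names are made up for illustration.

```python
# Bit 2 of the critical warning byte: "NVM subsystem reliability degraded".
RELIABILITY_BIT = 1 << 2


def is_wear_only_warning(critical_warning: int, percentage_used: int) -> bool:
    """Hypothetical check: reliability is the only bit set and the
    vendor's endurance estimate (Percentage Used) is already exhausted."""
    return critical_warning == RELIABILITY_BIT and percentage_used >= 100


def should_alert(critical_warning: int, percentage_used: int) -> bool:
    """Alert on any critical warning except the wear-only case above."""
    if critical_warning == 0:
        return False
    return not is_wear_only_warning(critical_warning, percentage_used)
```

With this rule, any other warning bit (spare, temperature, read-only, and so on) would still raise the alert, even on a drive that is past its rated endurance.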