accel/amdxdna: Treat power-off failure as unrecoverable error

Failing to set power off indicates an unrecoverable hardware or firmware
error. Update the driver to treat such a failure as a fatal condition
and stop further operations that depend on successful power state
transition.

This prevents undefined behavior when the hardware remains in an
unexpected state after a failed power-off attempt.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251106180521.1095218-1-lizhi.hou@amd.com
This commit is contained in:
Lizhi Hou
2025-11-06 10:05:21 -08:00
parent 9942d36a73
commit 1750e652a9

View File

@@ -147,6 +147,16 @@ int aie2_smu_init(struct amdxdna_dev_hdl *ndev)
{
int ret;
/*
* Failing to set power off indicates an unrecoverable hardware or
* firmware error.
*/
ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_OFF, 0, NULL);
if (ret) {
XDNA_ERR(ndev->xdna, "Access power failed, ret %d", ret);
return ret;
}
ret = aie2_smu_exec(ndev, AIE2_SMU_POWER_ON, 0, NULL);
if (ret) {
XDNA_ERR(ndev->xdna, "Power on failed, ret %d", ret);