linux

mirror of https://github.com/torvalds/linux.git synced 2025-12-07 20:06:24 +00:00

Author	SHA1	Message	Date
Rafael J. Wysocki	f58f86df6a	Merge branches 'pm-core', 'pm-runtime' and 'pm-sleep' Merge changes related to system sleep and runtime PM framework for 6.18-rc1: - Annotate loops walking device links in the power management core code as _srcu and add macros for walking device links to reduce the likelihood of coding mistakes related to them (Rafael Wysocki) - Document time units for _time functions in the runtime PM API (Brian Norris) - Clear power.must_resume in noirq suspend error path to avoid resuming a dependant device under a suspended parent or supplier (Rafael Wysocki) - Fix GFP mask handling during hybrid suspend and make the amdgpu driver handle hybrid suspend correctly (Mario Limonciello, Rafael Wysocki) - Fix GFP mask handling after aborted hibernation in platform mode and combine exit paths in power_down() to avoid code duplication (Rafael Wysocki) - Use vmalloc_array() and vcalloc() in the hibernation core to avoid open-coded size computations (Qianfeng Rong) - Fix typo in hibernation core code comment (Li Jun) - Call pm_wakeup_clear() in the same place where other functions that do bookkeeping prior to suspend_prepare() are called (Samuel Wu) pm-core: PM: core: Add two macros for walking device links PM: core: Annotate loops walking device links as _srcu * pm-runtime: PM: runtime: Documentation: ABI: Document time units for _time pm-sleep: PM: hibernate: Combine return paths in power_down() PM: hibernate: Restrict GFP mask in power_down() PM: hibernate: Fix pm_hibernation_mode_is_suspend() build breakage drm/amd: Fix hybrid sleep PM: hibernate: Add pm_hibernation_mode_is_suspend() PM: hibernate: Fix hybrid-sleep PM: sleep: core: Clear power.must_resume in noirq suspend error path PM: sleep: Make pm_wakeup_clear() call more clear PM: hibernate: Fix typo in memory bitmaps description comment PM: hibernate: Use vmalloc_array() and vcalloc() to improve code	2025-09-29 12:54:01 +02:00
Mario Limonciello (AMD)	0a6e9e098f	drm/amd: Fix hybrid sleep [Why] commit `530694f54d` ("drm/amdgpu: do not resume device in thaw for normal hibernation") optimized the flow for systems that are going into S4 where the power would be turned off. Basically the thaw() callback wouldn't resume the device if the hibernation image was successfully created since the system would be powered off. This however isn't the correct flow for a system entering into s0i3 after the hibernation image is created. Some of the amdgpu callbacks have different behavior depending upon the intended state of the suspend. [How] Use pm_hibernation_mode_is_suspend() as an input to decide whether to run resume during thaw() callback. Reported-by: Ionut Nechita <ionut_n2001@yahoo.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4573 Tested-by: Ionut Nechita <ionut_n2001@yahoo.com> Fixes: `530694f54d` ("drm/amdgpu: do not resume device in thaw for normal hibernation") Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Kenneth Crudup <kenny@panix.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Cc: 6.17+ <stable@vger.kernel.org> # 6.17+: `495c8d3503`: PM: hibernate: Add pm_hibernation_mode_is_suspend() Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2025-09-25 21:37:38 +02:00
Dave Airlie	feb96ccb33	Merge tag 'amd-drm-fixes-6.17-2025-09-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-09-18: amdgpu: - GC 11.0.1/4 cleaner shader support - DC irq fix - OD fix amdkfd: - S0ix fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250918191428.2553105-1-alexander.deucher@amd.com	2025-09-19 12:35:09 +10:00
Dave Airlie	b55caa69c5	Merge tag 'drm-xe-fixes-2025-09-18' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Release kobject for the failure path (Shuicheng) - SRIOV PF: Drop rounddown_pow_of_two fair (Michal) - Remove type casting on hwmon (Mallesh) - Defer free of NVM auxiliary container to device release (Nitin) - Fix a NULL vs IS_ERR (Dan) - Add cleanup action in xe_device_sysfs_init (Zongyao) - Fix error handling if PXP fails to start (Daniele) - Set GuC RCS/CCS yield policy (Daniele) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aMwL7vxFP1L94IML@intel.com	2025-09-19 11:24:11 +10:00
Dave Airlie	f5a9c2b49f	Merge tag 'drm-misc-fixes-2025-09-18' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes One fix for a documentation warning, a null pointer dereference fix for anx7625, and a mutex unlock fix for cdns-mhdp8546 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250918-orthodox-pretty-puma-1ddeea@houat	2025-09-19 10:28:52 +10:00
Alex Deucher	9272bb34b0	drm/amdgpu: suspend KFD and KGD user queues for S0ix We need to make sure the user queues are preempted so GFX can enter gfxoff. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: David Perry <david.perry@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `f8b367e6fa`) Cc: stable@vger.kernel.org	2025-09-18 14:59:41 -04:00
Alex Deucher	2ade36eaa9	drm/amdkfd: add proper handling for S0ix When in S0i3, the GFX state is retained, so all we need to do is stop the runlist so GFX can enter gfxoff. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: David Perry <david.perry@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `4bfa860993`) Cc: stable@vger.kernel.org	2025-09-18 14:59:24 -04:00
Daniele Ceraolo Spurio	26caeae9fb	drm/xe/guc: Set RCS/CCS yield policy All recent platforms (including all the ones officially supported by the Xe driver) do not allow concurrent execution of RCS and CCS workloads from different address spaces, with the HW blocking the context switch when it detects such a scenario. The DUAL_QUEUE flag helps with this, by causing the GuC to not submit a context it knows will not be able to execute. This, however, causes a new problem: if RCS and CCS queues have pending workloads from different address spaces, the GuC needs to choose from which of the 2 queues to pick the next workload to execute. By default, the GuC prioritizes RCS submissions over CCS ones, which can lead to CCS workloads being significantly (or completely) starved of execution time. The driver can tune this by setting a dedicated scheduling policy KLV; this KLV allows the driver to specify a quantum (in ms) and a ratio (percentage value between 0 and 100), and the GuC will prioritize the CCS for that percentage of each quantum. Given that we want to guarantee enough RCS throughput to avoid missing frames, we set the yield policy to 20% of each 80ms interval. v2: updated quantum and ratio, improved comment, use xe_guc_submit_disable in gt_sanitize Fixes: `d9a1ae0d17` ("drm/xe/guc: Enable WA_DUAL_QUEUE for newer platforms") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Tested-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250905235632.3333247-2-daniele.ceraolospurio@intel.com (cherry picked from commit `8843444843`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo added #include "xe_guc_submit.h" while backporting]	2025-09-17 20:23:47 -04:00
Daniele Ceraolo Spurio	ae5fbbda34	drm/xe: Fix error handling if PXP fails to start Since the PXP start comes after __xe_exec_queue_init() has completed, we need to cleanup what was done in that function in case of a PXP start error. __xe_exec_queue_init calls the submission backend init() function, so we need to introduce an opposite for that. Unfortunately, while we already have a fini() function pointer, it performs other operations in addition to cleaning up what was done by the init(). Therefore, for clarity, the existing fini() has been renamed to destroy(), while a new fini() has been added to only clean up what was done by the init(), with the latter being called by the former (via xe_exec_queue_fini). Fixes: `72d479601d` ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250909221240.3711023-3-daniele.ceraolospurio@intel.com (cherry picked from commit `626667321d`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-17 12:28:55 -04:00
Zongyao Bai	ff89a4d285	drm/xe/sysfs: Add cleanup action in xe_device_sysfs_init On partial failure, some sysfs files created before the failure might not be removed. Add common cleanup step to remove them all immediately, as is should be harmless to attempt to remove non-existing files. Fixes: `0e414bf7ad` ("drm/xe: Expose PCIe link downgrade attributes") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Zongyao Bai <zongyao.bai@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250915214716.1327379-2-zongyao.bai@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `1a869168d9`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-17 12:28:55 -04:00
Mario Limonciello	f9b80514a7	drm/amd: Only restore cached manual clock settings in restore if OD enabled If OD is not enabled then restoring cached clock settings doesn't make sense and actually leads to errors in resume. Check if enabled before restoring settings. Fixes: `4e9526924d` ("drm/amd: Restore cached manual clock settings during resume") Reported-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Closes: https://lore.kernel.org/amd-gfx/0ffe2692-7bfa-4821-856e-dd0f18e2c32b@amd.com/T/#me6db8ddb192626360c462b7570ed7eba0c6c9733 Suggested-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1a4dd33cc6`) Cc: stable@vger.kernel.org	2025-09-16 17:58:34 -04:00
Dan Carpenter	cbc7f3b4f6	drm/xe: Fix a NULL vs IS_ERR() in xe_vm_add_compute_exec_queue() The xe_preempt_fence_create() function returns error pointers. It never returns NULL. Update the error checking to match. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/aJTMBdX97cof_009@stanley.mountain Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `75cc23ffe5`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-16 11:49:16 -04:00
Qi Xi	288dac9fb6	drm: bridge: cdns-mhdp8546: Fix missing mutex unlock on error path Add missing mutex unlock before returning from the error path in cdns_mhdp_atomic_enable(). Fixes: `935a92a1c4` ("drm: bridge: cdns-mhdp8546: Fix possible null pointer dereference") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20250904034447.665427-1-xiqi2@huawei.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>	2025-09-16 15:42:35 +02:00
Aaron Ma	35e526398b	drm/i915/backlight: Honor VESA eDP backlight luminance control capability The VESA AUX backlight fails to be enable luminance based backlight mainpulation becaused luminance_set is false by default. Fix it by using luminance support control capabitliy. Fixes: `e13af5166a` ("drm/i915/backlight: Use drm helper to initialize edp backlight") Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250823121647.275834-1-aaron.ma@canonical.com (cherry picked from commit `72136efb87`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2025-09-16 09:12:24 +01:00
Ivan Lipski	29a2f43047	drm/amd/display: Allow RX6xxx & RX7700 to invoke amdgpu_irq_get/put [Why&How] As reported on https://gitlab.freedesktop.org/drm/amd/-/issues/3936, SMU hang can occur if the interrupts are not enabled appropriately, causing a vblank timeout. This patch reverts commit `5009628d85` ("drm/amd/display: Remove unnecessary amdgpu_irq_get/put"), but only for RX6xxx & RX7700 GPUs, on which the issue was observed. This will re-enable interrupts regardless of whether the user space needed it or not. Fixes: `5009628d85` ("drm/amd/display: Remove unnecessary amdgpu_irq_get/put") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3936 Suggested-by: Sun peng Li <sunpeng.li@amd.com> Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `95d168b367`) Cc: stable@vger.kernel.org	2025-09-15 17:24:42 -04:00
Srinivasan Shanmugam	c1b6b8c770	drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.0.1/11.0.4 GPUs Enable the cleaner shader for additional GFX11.0.1/11.0.4 series GPUs to ensure data isolation among GPU tasks. The cleaner shader is tasked with clearing the Local Data Store (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which helps avoid data leakage and guarantees the accuracy of computational results. This update extends cleaner shader support to GFX11.0.1/11.0.4 GPUs, previously available for GFX11.0.3. It enhances security by clearing GPU memory between processes and maintains a consistent GPU state across KGD and KFD workloads. Cc: Wasee Alam <wasee.alam@amd.com> Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `0a71ceb27f`)	2025-09-15 17:23:42 -04:00
Loic Poulain	a10f910c77	drm: bridge: anx7625: Fix NULL pointer dereference with early IRQ If the interrupt occurs before resource initialization is complete, the interrupt handler/worker may access uninitialized data such as the I2C tcpc_client device, potentially leading to NULL pointer dereference. Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Fixes: `8bdfc5dae4` ("drm/bridge: anx7625: Add anx7625 MIPI DSI/DPI to DP") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250709085438.56188-1-loic.poulain@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>	2025-09-15 21:44:42 +03:00
Nitin Gote	7926ba2143	drm/xe: defer free of NVM auxiliary container to device release callback Do not kfree the intel_dg_nvm_dev in xe_nvm_fini() right after auxiliary_device_delete/uninit. The auxiliary_device embeds the device/kobject (and its name); freeing it too early can race with asynchronous device_del/udev processing and cause a use-after-free. Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Fixes: `c28bfb107d` ("drm/xe/nvm: add on-die non-volatile memory device") Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250911052823.226696-1-nitin.r.gote@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `d4c3ed963e`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-15 13:12:19 -04:00
Mallesh Koujalagi	dcfd151d32	drm/xe/hwmon: Remove type casting Refactor: eliminate type casts by using proper u32 declarations. v2: - Address review comments. (Karthik) v3: - Use the proper u32 type and drop cast. (Lucas De Marchi) - Modify variable when actually using u64 value. - Change r value to reg_value with u32 type. v4: - Remove newline between trailer and Signed-off-by. (Lucas De Marchi) - Change reg_val to val for more user-friendly logging. - Use mul_u32_u32 function since both values are u32. v5: - mul_u32_u32 function with shift. (Lucas De Marchi) Fixes: `7596d839f6` ("drm/xe/hwmon: Add support to manage power limits though mailbox") Signed-off-by: Mallesh Koujalagi <mallesh.koujalagi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250912113458.2815172-1-mallesh.koujalagi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `4e1d3b5e64`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-15 13:11:40 -04:00
Michal Wajdeczko	fef8b64e48	drm/xe/pf: Drop rounddown_pow_of_two fair LMEM limitation This effectively reverts commit `4c3fe5eae4` ("drm/xe/pf: Limit fair VF LMEM provisioning") since we don't need it any more after non-contig VRAM allocations were fixed. This allows larger LMEM auto-provisioning for VFs, so instead: [ ] GT0: PF: LMEM available(14096M) fair(1 x 8192M) [ ] GT0: PF: VF1 provisioned with 8589934592 (8.00 GiB) LMEM or [ ] GT0: PF: LMEM available(14096M) fair(2 x 4096M) [ ] GT0: PF: VF1..VF2 provisioned with 4294967296 (4.00 GiB) LMEM we may get: [ ] GT0: PF: LMEM available(14096M) fair(1 x 14096M) [ ] GT0: PF: VF1 provisioned with 14780727296 (13.8 GiB) LMEM and [ ] GT0: PF: LMEM available(14096M) fair(2 x 7048M) [ ] GT0: PF: VF1..VF2 provisioned with 7390363648 (6.88 GiB) LMEM Fixes: `1e32ffbc9d` ("drm/xe/sriov: support non-contig VRAM provisioning") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250910222439.32869-1-michal.wajdeczko@intel.com (cherry picked from commit `95c1cfa306`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-15 08:26:26 -04:00
Shuicheng Lin	013e484dbd	drm/xe/tile: Release kobject for the failure path Call kobject_put() for the failure path to release the kobject v2: remove extra newline. (Matt) Fixes: `e3d0839aa5` ("drm/xe/tile: Abort driver load for sysfs creation failure") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250819153950.2973344-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `b98775bca9`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-15 08:26:19 -04:00
Bagas Sanjaya	07c24945ca	Revert "drm: Add directive to format code in comment" Commit `6cc44e9618` ("drm: Add directive to format code in comment") fixes original Sphinx indentation warning as introduced in `471920ce25` ("drm/gpuvm: Add locking helpers"), by means of using code-block:: directive. It semantically conflicts with earlier `bb324f85f7` ("drm/gpuvm: Wrap drm_gpuvm_sm_map_exec_lock() expected usage in literal code block") that did the same using double colon syntax instead. These duplicated literal code block directives causes the original warnings not being fixed. Revert `6cc44e9618` to keep things rolling without these warnings. Fixes: `6cc44e9618` ("drm: Add directive to format code in comment") Fixes: `471920ce25` ("drm/gpuvm: Add locking helpers") Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2025-09-13 11:17:34 +02:00
Dave Airlie	9a3f210737	Merge tag 'drm-xe-fixes-2025-09-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Don't touch survivability_mode on fini (Michal) - Fixes around eviction and suspend (Thomas) - Extend Wa_13011645652 to PTL-H, WCL (Julia) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aMLq7QlaEPHGKXKX@intel.com	2025-09-12 09:44:07 +10:00
Dave Airlie	dab1f85526	Merge tag 'drm-misc-fixes-2025-09-11' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A maintainer update, an out-of-bound check for panthor and a revert for nouveau to fix a race. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250911-glistening-uakari-of-serendipity-06ceb1@houat	2025-09-12 09:34:48 +10:00
Dave Airlie	f2c8bbb6e9	Merge tag 'mediatek-drm-fixes-20250910' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20250910 1. fix potential OF node use-after-free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/20250910231813.3526-1-chunkuang.hu@kernel.org	2025-09-12 09:31:27 +10:00
Dave Airlie	1d00adb873	Merge tag 'amd-drm-fixes-6.17-2025-09-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-09-10: amdgpu: - PSP 11.x fix - DPCD quirk handing fix - DCN 3.5 PG fix - Audio suspend fix - OEM i2c clean up fix - Module unload memory leak fix - DC delay fix - ISP firmware fix - VCN fixes amdkfd: - P2P topology fix - APU mem limit calculation fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250910162855.2507853-1-alexander.deucher@amd.com	2025-09-12 09:24:57 +10:00
Dave Airlie	467360e295	Merge tag 'drm-intel-fixes-2025-09-10' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix size for for_each_set_bit() in abox iteration [display] (Jani Nikula) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://lore.kernel.org/r/aMFUtRdJ46qK-EXl@linux	2025-09-12 09:21:20 +10:00
Dave Airlie	2c38074c36	Merge tag 'drm-rust-fixes-2025-09-05' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-fixes - Add drm-rust tree to MAINTAINERS - Require CONFIG_64BIT for Nova Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/aLquN1YvdyI_6PJS@google.com	2025-09-12 09:05:42 +10:00
Johan Hovold	9ba2556cef	drm/mediatek: clean up driver data initialisation The platform and drm devices are only used to look up the drm device and its driver data respectively when initialising the driver data during bind(). Drop the reference counts as soon as they have been used to make the code more readable. Note that the crtc count is never incremented on lookup failures. Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-3-johan@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-09-10 12:52:59 +00:00
Johan Hovold	4de37a48b6	drm/mediatek: fix potential OF node use-after-free The for_each_child_of_node() helper drops the reference it takes to each node as it iterates over children and an explicit of_node_put() is only needed when exiting the loop early. Drop the recently introduced bogus additional reference count decrement at each iteration that could potentially lead to a use-after-free. Fixes: `1f403699c4` ("drm/mediatek: Fix device/node reference count leaks in mtk_drm_get_all_drm_priv") Cc: Ma Ke <make24@iscas.ac.cn> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-2-johan@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-09-10 12:49:37 +00:00
David Rosca	3318f2d20c	drm/amdgpu/vcn: Allow limiting ctx to instance 0 for AV1 at any time There is no reason to require this to happen on first submitted IB only. We need to wait for the queue to be idle, but it can be done at any time (including when there are multiple video sessions active). Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8908fdce06`) Cc: stable@vger.kernel.org	2025-09-09 16:42:26 -04:00
David Rosca	2b10cb58d7	drm/amdgpu/vcn4: Fix IB parsing with multiple engine info packages There can be multiple engine info packages in one IB and the first one may be common engine, not decode/encode. We need to parse the entire IB instead of stopping after finding first engine info. Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `dc8f9f0f45`) Cc: stable@vger.kernel.org	2025-09-09 16:41:49 -04:00
Pratap Nirujogi	857ccfc19f	drm/amd/amdgpu: Declare isp firmware binary file Declare isp firmware file isp_4_1_1.bin required by isp4.1.1 device. Suggested-by: Alexey Zagorodnikov <xglooom@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d97b74a833`) Cc: stable@vger.kernel.org	2025-09-09 16:41:15 -04:00
Alex Deucher	1d66c3f2b8	drm/amd/display: use udelay rather than fsleep This function can be called from an atomic context so we can't use fsleep(). Fixes: `01f60348d8` ("drm/amd/display: Fix 'failed to blank crtc!'") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4549 Cc: Wen Chen <Wen.Chen3@amd.com> Cc: Fangzhi Zuo <jerry.zuo@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `27e4dc2c05`)	2025-09-09 16:39:16 -04:00
Alex Deucher	7838fb5f11	drm/amdgpu: fix a memory leak in fence cleanup when unloading Commit `b61badd20b` ("drm/amdgpu: fix usage slab after free") reordered when amdgpu_fence_driver_sw_fini() was called after that patch, amdgpu_fence_driver_sw_fini() effectively became a no-op as the sched entities we never freed because the ring pointers were already set to NULL. Remove the NULL setting. Reported-by: Lin.Cao <lincao12@amd.com> Cc: Vitaly Prosyak <vitaly.prosyak@amd.com> Cc: Christian König <christian.koenig@amd.com> Fixes: `b61badd20b` ("drm/amdgpu: fix usage slab after free") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `a525fa37aa`) Cc: stable@vger.kernel.org	2025-09-09 16:38:26 -04:00
Julia Filipchuk	fd99415ec8	drm/xe: Extend Wa_13011645652 to PTL-H, WCL Expand workaround to additional graphics architectures. Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.17+ Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250903190122.1028373-2-julia.filipchuk@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `6fc957185e`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:36 -04:00
Thomas Hellström	eb5723a751	drm/xe: Block exec and rebind worker while evicting for suspend / hibernate When the xe pm_notifier evicts for suspend / hibernate, there might be racing tasks trying to re-validate again. This can lead to suspend taking excessive time or get stuck in a live-lock. This behaviour becomes much worse with the fix that actually makes re-validation bring back bos to VRAM rather than letting them remain in TT. Prevent that by having exec and the rebind worker waiting for a completion that is set to block by the pm_notifier before suspend and is signaled by the pm_notifier after resume / wakeup. It's probably still possible to craft malicious applications that block suspending. More work is pending to fix that. v3: - Avoid wait_for_completion() in the kernel worker since it could potentially cause work item flushes from freezable processes to wait forever. Instead terminate the rebind workers if needed and re-launch at resume. (Matt Auld) v4: - Fix some bad naming and leftover debug printouts. - Fix kerneldoc. - Use drmm_mutex_init() for the xe->rebind_resume_lock (Matt Auld). - Rework the interface of xe_vm_rebind_resume_worker (Matt Auld). Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288 Fixes: `c6a4d46ec1` ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-4-thomas.hellstrom@linux.intel.com (cherry picked from commit `599334572a`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:31 -04:00
Thomas Hellström	d84820309e	drm/xe: Allow the pm notifier to continue on failure Its actions are opportunistic anyway and will be completed on device suspend. Marking as a fix to simplify backporting of the fix that follows in the series. v2: - Keep the runtime pm reference over suspend / hibernate and document why. (Matt Auld, Rodrigo Vivi): Fixes: `c6a4d46ec1` ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-3-thomas.hellstrom@linux.intel.com (cherry picked from commit `ebd546fdff`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:26 -04:00
Thomas Hellström	5c87fee3c9	drm/xe: Attempt to bring bos back to VRAM after eviction VRAM+TT bos that are evicted from VRAM to TT may remain in TT also after a revalidation following eviction or suspend. This manifests itself as applications becoming sluggish after buffer objects get evicted or after a resume from suspend or hibernation. If the bo supports placement in both VRAM and TT, and we are on DGFX, mark the TT placement as fallback. This means that it is tried only after VRAM + eviction. This flaw has probably been present since the xe module was upstreamed but use a Fixes: commit below where backporting is likely to be simple. For earlier versions we need to open- code the fallback algorithm in the driver. v2: - Remove check for dgfx. (Matthew Auld) - Update the xe_dma_buf kunit test for the new strategy (CI) - Allow dma-buf to pin in current placement (CI) - Make xe_bo_validate() for pinned bos a NOP. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995 Fixes: `a78a8da51b` ("drm/ttm: replace busy placement with flags v6") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-2-thomas.hellstrom@linux.intel.com (cherry picked from commit `cb3d7b3b46`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:22 -04:00
Michal Wajdeczko	7934fdc25a	drm/xe/configfs: Don't touch survivability_mode on fini This is a user controlled configfs attribute, we should not modify that outside the configfs attr.store() implementation. Fixes: `bc417e54e2` ("drm/xe: Enable configfs support for survivability mode") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250904103521.7130-1-michal.wajdeczko@intel.com (cherry picked from commit `079a5c83db`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:17 -04:00
Yifan Zhang	5350355627	amd/amdkfd: correct mem limit calculation for small APUs Current mem limit check leaks some GTT memory (reserved_for_pt reserved_for_ras + adev->vram_pin_size) for small APUs. Since carveout VRAM is tunable on APUs, there are three case regarding the carveout VRAM size relative to GTT: 1. 0 < carveout < gtt apu_prefer_gtt = true, is_app_apu = false 2. carveout > gtt / 2 apu_prefer_gtt = false, is_app_apu = false 3. 0 = carveout apu_prefer_gtt = true, is_app_apu = true It doesn't make sense to check below limitation in case 1 (default case, small carveout) because the values in the below expression are mixed with carveout and gtt. adev->kfd.vram_used[xcp_id] + vram_needed > vram_size - reserved_for_pt - reserved_for_ras - atomic64_read(&adev->vram_pin_size) gtt: kfd.vram_used, vram_needed, vram_size carveout: reserved_for_pt, reserved_for_ras, adev->vram_pin_size In case 1, vram allocation will go to gtt domain, skip vram check since ttm_mem_limit check already cover this allocation. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `fa7c99f04f`)	2025-09-09 12:28:28 -04:00
Eric Huang	ce42a3b581	drm/amdkfd: fix p2p links bug in topology When creating p2p links, KFD needs to check XGMI link with two conditions, hive_id and is_sharing_enabled, but it is missing to check is_sharing_enabled, so add it to fix the error. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `36cc7d1317`)	2025-09-09 12:28:18 -04:00
Geoffrey McRae	1dfd2864a1	drm/amd/display: remove oem i2c adapter on finish Fixes a bug where unbinding of the GPU would leave the oem i2c adapter registered resulting in a null pointer dereference when applications try to access the invalid device. Fixes: `3d5470c973` ("drm/amd/display/dm: add support for OEM i2c bus") Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Geoffrey McRae <geoffrey.mcrae@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `89923fb7ea`) Cc: stable@vger.kernel.org	2025-09-09 12:27:52 -04:00
Mario Limonciello (AMD)	60f71f0db7	drm/amd/display: Drop dm_prepare_suspend() and dm_complete() [Why] dm_prepare_suspend() was added in commit `50e0bae34f` ("drm/amd/display: Add and use new dm_prepare_suspend() callback") to allow display to turn off earlier in the suspend sequence. This caused a regression that HDMI audio sometimes didn't work properly after resume unless audio was playing during suspend. [How] Drop dm_prepare_suspend() callback. All code in it will still run during dm_suspend(). Also drop unnecessary dm_complete() callback. dm_complete() was used for failed prepare and also for any case of successful resume. The code in it already runs in dm_resume(). This change will introduce more time that the display is turned on during suspend sequence. The compositor can turn it off sooner if desired. Cc: Harry Wentland <harry.wentland@amd.com> Reported-by: Przemysław Kopa <prz.kopa@gmail.com> Closes: https://lore.kernel.org/amd-gfx/1cea0d56-7739-4ad9-bf8e-c9330faea2bb@kernel.org/T/#m383d9c08397043a271b36c32b64bb80e524e4b0f Reported-by: Kalvin <hikaph+oss@gmail.com> Closes: https://github.com/alsa-project/alsa-lib/issues/465 Closes: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4809 Fixes: `50e0bae34f` ("drm/amd/display: Add and use new dm_prepare_suspend() callback") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2fd653b9bb`) Cc: stable@vger.kernel.org	2025-09-09 12:26:28 -04:00
Ovidiu Bunea	70f0b051f8	drm/amd/display: Correct sequences and delays for DCN35 PG & RCG [why] The current PG & RCG programming in driver has some gaps and incorrect sequences. [how] Added delays after ungating clocks to allow ramp up, increased polling to allow more time for power up, and removed the incorrect sequences. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1bde5584e2`) Cc: stable@vger.kernel.org	2025-09-09 12:25:22 -04:00
Fangzhi Zuo	f5c32370db	drm/amd/display: Disable DPCD Probe Quirk Disable dpcd probe quirk to native aux. Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4500 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20250904191351.746707-1-Jerry.Zuo@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `c5f4fb4058`) Cc: stable@vger.kernel.org # 6.16.y: `5281cbe0b5` Cc: stable@vger.kernel.org # 6.16.y: `0b4aa85e89` Cc: stable@vger.kernel.org # 6.16.y: `b87ed522b3` Cc: stable@vger.kernel.org # 6.16.y	2025-09-09 12:23:05 -04:00
Jani Nikula	cfa7b76597	drm/i915/power: fix size for for_each_set_bit() in abox iteration for_each_set_bit() expects size to be in bits, not bytes. The abox mask iteration uses bytes, but it works by coincidence, because the local variable holding the mask is unsigned long, and the mask only ever has bit 2 as the highest bit. Using a smaller type could lead to subtle and very hard to track bugs. Fixes: `62afef2811` ("drm/i915/rkl: RKL uses ABOX0 for pixel transfers") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: stable@vger.kernel.org # v5.9+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250905104149.1144751-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit `7ea3baa6ef`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2025-09-09 09:08:37 +01:00
Lijo Lazar	440cec4ca1	drm/amdgpu: Wait for bootloader after PSPv11 reset Some PSPv11 SOCs take a longer time for PSP based mode-1 reset. Instead of checking for C2PMSG_33 status, add the callback wait_for_bootloader. Wait for bootloader to be back to steady state is already part of the generic mode-1 reset flow. Increase the retry count for bootloader wait and also fix the mask to prevent fake pass. Fixes: `8345a71fc5` ("drm/amdgpu: Add more checks to PSP mailbox") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4531 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `32f73741d6`)	2025-09-08 11:05:53 -04:00
Dave Airlie	8b556ddeee	Merge tag 'amd-drm-fixes-6.17-2025-09-03' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-09-03: amdgpu: - UserQ fixes - MES 11 fix - eDP/LVDS fix - Fix non-DC audio clean up - Fix duplicate cursor issue - Fix error path in PSP init Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250903221656.251254-1-alexander.deucher@amd.com	2025-09-05 08:06:34 +10:00
Chia-I Wu	a00f2015ac	drm/panthor: validate group queue count A panthor group can have at most MAX_CS_PER_CSG panthor queues. Fixes: `4bdca11507` ("drm/panthor: Add the driver frontend block") Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # v1 Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250903192133.288477-1-olvaffe@gmail.com	2025-09-04 15:59:23 +01:00

1 2 3 4 5 ...

116992 Commits