Commit Graph

1325483 Commits

Author SHA1 Message Date
Kuniyuki Iwashima
7fb1073300 net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().
(un)?register_netdevice_notifier_dev_net() hold RTNL before triggering
the notifier for all netdev in the netns.

Let's convert the RTNL to rtnl_net_lock().

Note that move_netdevice_notifiers_dev_net() is assumed to be (but not
yet) protected by per-netns RTNL of both src and dst netns; we need to
convert wireless and hyperv drivers that call dev_change_net_namespace().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250106070751.63146-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 17:49:20 -08:00
Kuniyuki Iwashima
ca779f4065 net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_net().
(un)?register_netdevice_notifier_net() hold RTNL before triggering the
notifier for all netdev in the netns.

Let's convert the RTNL to rtnl_net_lock().

Note that the per-netns netdev notifier is protected by per-netns RTNL.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250106070751.63146-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 17:49:20 -08:00
Kuniyuki Iwashima
a239e06250 net: Hold __rtnl_net_lock() in (un)?register_netdevice_notifier().
(un)?register_netdevice_notifier() hold pernet_ops_rwsem and RTNL,
iterate all netns, and trigger the notifier for all netdev.

Let's hold __rtnl_net_lock() before triggering the notifier.

Note that we will need protection for netdev_chain when RTNL is
removed.  (e.g. blocking_notifier conversion [0] with a lockdep
annotation [1])

Link: https://lore.kernel.org/netdev/20250104063735.36945-2-kuniyu@amazon.com/ [0]
Link: https://lore.kernel.org/netdev/20250105075957.67334-1-kuniyu@amazon.com/ [1]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250106070751.63146-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 17:49:19 -08:00
Dr. David Alan Gilbert
4ce1aeece9 ixgbevf: Remove unused ixgbevf_hv_mbx_ops
The const struct ixgbevf_hv_mbx_ops was added in 2016 as part of
commit c6d45171d7 ("ixgbevf: Support Windows hosts (Hyper-V)")

but has remained unused.

The functions it references are still referenced elsewhere.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patch.msgid.link/20250105122847.27341-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 17:43:47 -08:00
Eric Dumazet
1b960cd193 net: watchdog: rename __dev_watchdog_up() and dev_watchdog_down()
In commit d7811e623d ("[NET]: Drop tx lock in dev_watchdog_up")
dev_watchdog_up() became a simple wrapper for __netdev_watchdog_up()

Herbert also said : "In 2.6.19 we can eliminate the unnecessary
__dev_watchdog_up and replace it with dev_watchdog_up."

This patch consolidates things to have only two functions, with
a common prefix.

- netdev_watchdog_up(), exported for the sake of one freescale driver.
  This replaces __netdev_watchdog_up() and dev_watchdog_up().

- netdev_watchdog_down(), static to net/sched/sch_generic.c
  This replaces dev_watchdog_down().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250105090924.1661822-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 17:43:01 -08:00
Jakub Kicinski
a8a6531164 Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2025-01-07

We've added 7 non-merge commits during the last 32 day(s) which contain
a total of 11 files changed, 190 insertions(+), 103 deletions(-).

The main changes are:

1) Migrate the test_xdp_meta.sh BPF selftest into test_progs
   framework, from Bastien Curutchet.

2) Add ability to configure head/tailroom for netkit devices,
   from Daniel Borkmann.

3) Fixes and improvements to the xdp_hw_metadata selftest,
   from Song Yoong Siang.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Extend netkit tests to validate set {head,tail}room
  netkit: Add add netkit {head,tail}room to rt_link.yaml
  netkit: Allow for configuring needed_{head,tail}room
  selftests/bpf: Migrate test_xdp_meta.sh into xdp_context_test_run.c
  selftests/bpf: test_xdp_meta: Rename BPF sections
  selftests/bpf: Enable Tx hwtstamp in xdp_hw_metadata
  selftests/bpf: Actuate tx_metadata_len in xdp_hw_metadata
====================

Link: https://patch.msgid.link/20250107130908.143644-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 15:39:09 -08:00
Ted Chen
a1942da8a3 bridge: Make br_is_nd_neigh_msg() accept pointer to "const struct sk_buff"
The sk_buff struct in br_is_nd_neigh_msg() is never modified. Mark it as
const.

Signed-off-by: Ted Chen <znscnchen@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250104083846.71612-1-znscnchen@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 15:13:10 +01:00
Paolo Abeni
04ced323ef Merge branch 'dev-hold-per-netns-rtnl-in-register-netdev'
Kuniyuki Iwashima says:

====================
dev: Hold per-netns RTNL in register_netdev().

Patch 1 adds rtnl_net_lock_killable() and Patch 2 uses it in
register_netdev() and converts it and unregister_netdev() to
per-netns RTNL.

With this and the netdev notifier series [0], ASSERT_RTNL_NET()
for NETDEV_REGISTER [1] did not fire on a minimal QEMU setup
such as e1000 + x86_64_defconfig + CONFIG_DEBUG_NET_SMALL_RTNL.

[0]: https://lore.kernel.org/netdev/20250104063735.36945-1-kuniyu@amazon.com/

[1]:
---8<---
diff --git a/net/core/rtnl_net_debug.c b/net/core/rtnl_net_debug.c
index f406045cbd0e..c0c30929002e 100644
--- a/net/core/rtnl_net_debug.c
+++ b/net/core/rtnl_net_debug.c
@@ -21,7 +21,6 @@ static int rtnl_net_debug_event(struct notifier_block *nb,
 	case NETDEV_DOWN:
 	case NETDEV_REBOOT:
 	case NETDEV_CHANGE:
-	case NETDEV_REGISTER:
 	case NETDEV_UNREGISTER:
 	case NETDEV_CHANGEMTU:
 	case NETDEV_CHANGEADDR:
@@ -60,19 +59,10 @@ static int rtnl_net_debug_event(struct notifier_block *nb,
 		ASSERT_RTNL();
 		break;

-	/* Once an event fully supports RTNL_NET, move it here
-	 * and remove "if (0)" below.
-	 *
-	 * case NETDEV_XXX:
-	 *	ASSERT_RTNL_NET(net);
-	 *	break;
-	 */
-	}
-
-	/* Just to avoid unused-variable error for dev and net. */
-	if (0)
+	case NETDEV_REGISTER:
 		ASSERT_RTNL_NET(net);
+		break;
+	}

 	return NOTIFY_DONE;
 }
---8<---
====================

Link: https://patch.msgid.link/20250104082149.48493-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 13:45:58 +01:00
Kuniyuki Iwashima
00fb982393 dev: Hold per-netns RTNL in (un)?register_netdev().
Let's hold per-netns RTNL of dev_net(dev) in register_netdev()
and unregister_netdev().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 13:45:53 +01:00
Kuniyuki Iwashima
7bd72a4aa2 rtnetlink: Add rtnl_net_lock_killable().
rtnl_lock_killable() is used only in register_netdev()
and will be converted to per-netns RTNL.

Let's unexport it and add the corresponding helper.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 13:45:53 +01:00
Mohsin Bashir
2f4f8893e0 eth: fbnic: update fbnic_poll return value
When the work done is less than the budget, `fbnic_poll` returns 0, which
skews the `napi_poll` tracepoint. Return the work done instead, so that
manual tracing reflects the actual work. Below is a snippet of the
`napi_poll` tracepoint output before and after the change.

Before:
@[10]: 1
...
@[64]: 208175
@[0]: 2128008

After:
@[56]: 86
@[48]: 222
...
@[5]: 1885756
@[6]: 1933841

Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Link: https://patch.msgid.link/20250104015316.3192946-1-mohsin.bashr@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 13:22:02 +01:00
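The fbnic_poll change above can be illustrated with a minimal userspace sketch of a NAPI-style poll routine. It is not the fbnic code: struct demo_queue, demo_poll() and the irq_armed flag are hypothetical names used only to show why returning the real work count is more useful to a tracer than returning 0.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device queue; not the fbnic structures. */
struct demo_queue {
	int pending;	/* packets waiting to be processed */
	bool irq_armed;	/* set once polling completes */
};

/* Process up to @budget packets and return how many were handled. */
static int demo_poll(struct demo_queue *q, int budget)
{
	int work_done = 0;

	while (work_done < budget && q->pending > 0) {
		q->pending--;
		work_done++;
	}

	/* Less work than the budget means the queue is drained. */
	if (work_done < budget)
		q->irq_armed = true;

	/* Report the real count rather than 0, so tracing sees the work. */
	return work_done;
}
```

With 5 pending packets and a budget of 64, demo_poll() returns 5 and marks polling complete; with 100 pending packets it returns the full budget of 64 and leaves polling active.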
Paolo Abeni
097691b019 Merge branch 'net-airoha-add-qdisc-offload-support'
Lorenzo Bianconi says:

====================
net: airoha: Add Qdisc offload support

Introduce support for ETS and HTB Qdisc offload available on the Airoha
EN7581 ethernet controller.
====================

Link: https://patch.msgid.link/20250103-airoha-en7581-qdisc-offload-v1-0-608a23fa65d5@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 12:32:53 +01:00
Lorenzo Bianconi
ef1ca92713 net: airoha: Add sched HTB offload support
Introduce support for HTB Qdisc offload available in the Airoha EN7581
ethernet controller. EN7581 can offload only one level of HTB leafs.
Each HTB leaf represents a QoS channel supported by EN7581 SoC.
The typical use-case is creating a HTB leaf for QoS channel to rate
limit the egress traffic and attach an ETS Qdisc to each HTB leaf in
order to enforce traffic prioritization.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 12:32:50 +01:00
Lorenzo Bianconi
20bf7d07c9 net: airoha: Add sched ETS offload support
Introduce support for ETS Qdisc offload available on the Airoha EN7581
ethernet controller. In order to be effective, ETS Qdisc must be configured
as leaf of a HTB Qdisc (HTB Qdisc offload will be added in the following
patch). ETS Qdisc available on EN7581 ethernet controller supports at
most 8 concurrent bands (QoS queues). We can enable an ETS Qdisc for
each available QoS channel.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 12:32:50 +01:00
Lorenzo Bianconi
2b288b8156 net: airoha: Introduce ndo_select_queue callback
Airoha EN7581 SoC supports 32 Tx DMA rings used to feed packets to QoS
channels. Each channel supports 8 QoS queues where the user can apply
QoS scheduling policies. In a similar way, the user can configure hw
rate shaping for each QoS channel.
Introduce ndo_select_queue callback in order to select the tx queue
based on QoS channel and QoS queue. In particular, for dsa device select
QoS channel according to the dsa user port index, rely on port id
otherwise. Select QoS queue based on the skb priority.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 12:32:50 +01:00
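The queue selection described in the ndo_select_queue commit above could look roughly like the sketch below. This is an illustration under stated assumptions, not the driver code: the message gives 32 Tx rings and 8 queues per channel, from which 4 channels are inferred here; the DEMO_* constants, demo_select_queue() and the simple modulo mapping of priority to queue are all hypothetical.

```c
/*
 * Assumed sizes: 32 Tx rings and 8 queues per channel come from the
 * commit message; 4 channels is an inference and may not match the
 * driver's real constants.
 */
#define DEMO_NUM_TX_RINGS	32
#define DEMO_NUM_QOS_QUEUES	8
#define DEMO_NUM_QOS_CHANNELS	(DEMO_NUM_TX_RINGS / DEMO_NUM_QOS_QUEUES)

/*
 * Pick a Tx queue from a port index (which selects the QoS channel)
 * and a packet priority (which selects the QoS queue in that channel).
 */
static unsigned int demo_select_queue(unsigned int port_index,
				      unsigned int priority)
{
	unsigned int channel = port_index % DEMO_NUM_QOS_CHANNELS;
	unsigned int queue = priority % DEMO_NUM_QOS_QUEUES;

	return channel * DEMO_NUM_QOS_QUEUES + queue;
}
```

Under these assumptions, port index 1 with priority 3 maps to Tx queue 11, and every result stays below 32.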
Lorenzo Bianconi
5f79559038 net: airoha: Enable Tx drop capability for each Tx DMA ring
This is a preliminary patch in order to enable hw Qdisc offloading.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 12:32:50 +01:00
Willem de Bruijn
912d6f6697 selftests/net: packetdrill: report benign debug flakes as xfail
A few recently added packetdrill tests that are known time sensitive
(e.g., because testing timestamping) occasionally fail in debug mode:
https://netdev.bots.linux.dev/contest.html?executor=vmksft-packetdrill-dbg

These failures are well understood. Correctness of the tests is
verified in non-debug mode. Continue running in debug mode also, to
keep coverage with debug instrumentation.

But, only in debug mode, mark these tests with well understood
timing issues as XFAIL (known failing) rather than FAIL when failing.

Introduce an allow list xfail_list with known cases.

Expand the ktap infrastructure with XFAIL support.

Fixes: eab35989cc ("selftests/net: packetdrill: import tcp/fast_recovery, tcp/nagle, tcp/timestamping")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20241218100013.0c698629@kernel.org/
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250103113142.129251-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 11:23:37 +01:00
Furong Xu
51cfbed198 net: stmmac: Set dma_sync_size to zero for discarded frames
If the driver is going to discard a frame, it never touches the frame, so
the cache lines never become dirty. page_pool_recycle_direct() then wastes
CPU cycles by calling page_pool_dma_sync_for_device() to sync the entire
frame. Calling page_pool_put_page() with sync_size set to 0 is the proper
method.

Signed-off-by: Furong Xu <0x1207@gmail.com>
Link: https://patch.msgid.link/20250103093733.3872939-1-0x1207@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-07 11:00:02 +01:00
Nihar Chaithanya
49afc040f4 octeontx2-pf: mcs: Remove dead code and semi-colon from rsrc_name()
Every case in the switch-block ends with return statement, and the
default: branch handles the cases where rsrc_type is invalid and
returns "Unknown", this makes the return statement at the end of the
function unreachable and redundant.
The semi-colon is not required after the switch-block's curly braces.

Remove the semi-colon after the switch-block's curly braces and the
return statement at the end of the function.

This issue was reported by Coverity Scan.

Signed-off-by: Nihar Chaithanya <niharchaithanya@gmail.com>
Link: https://patch.msgid.link/20250104171905.13293-1-niharchaithanya@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:51:03 -08:00
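The shape of the rsrc_name() cleanup above can be shown with a small stand-alone sketch. The enum values and strings are hypothetical and are not the octeontx2 resource types; the point is only that a switch whose every branch returns, including default, needs neither a trailing return nor a semicolon after the closing brace.

```c
/* Hypothetical resource types, for illustration only. */
enum demo_rsrc_type {
	DEMO_RSRC_A,
	DEMO_RSRC_B,
};

static const char *demo_rsrc_name(enum demo_rsrc_type type)
{
	switch (type) {
	case DEMO_RSRC_A:
		return "A";
	case DEMO_RSRC_B:
		return "B";
	default:
		/* Invalid values are handled here. */
		return "Unknown";
	}
	/* No return here: every path through the switch already returned. */
}
```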
Krzysztof Kozlowski
21a8a77abb nfc: st21nfca: Drop unneeded null check in st21nfca_tx_work()
Variable 'info' is obtained via container_of() of struct work_struct, so
it cannot be NULL.  Simplify the code and solve Smatch warning:

  drivers/nfc/st21nfca/dep.c:119 st21nfca_tx_work() warn: can 'info' even be NULL?

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250104142043.116045-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:49:29 -08:00
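Why a pointer obtained through container_of() cannot be NULL, as the st21nfca commit above argues, is easier to see in a userspace re-creation. The macro below mimics the kernel's container_of() using offsetof(); struct demo_info, struct demo_work and demo_tx_work() are hypothetical names, not the driver's types.

```c
#include <stddef.h>

/* Userspace re-creation of the container_of() idea. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_work {
	int pending;
};

struct demo_info {
	int id;
	struct demo_work tx_work;	/* embedded member, not a pointer */
};

static int demo_tx_work(struct demo_work *work)
{
	/*
	 * info is the address of @work minus a constant offset, so for a
	 * valid @work it always points at the enclosing structure and a
	 * NULL check on it would never trigger.
	 */
	struct demo_info *info =
		demo_container_of(work, struct demo_info, tx_work);

	return info->id;
}
```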
Jakub Kicinski
3c89a986bb Merge branch 'mlx5-hardware-steering-part-2'
Tariq Toukan says:

====================
mlx5 Hardware Steering part 2

This series contains HWS code cleanups, enhancements, bug fixes, and
additions. Note that some of these patches fix bugs in existing code,
but we submit them without a 'Fixes' tag to avoid an unnecessary
burden on stable releases, as HWS cannot be enabled yet.

Patches 1-5:
HWS, various code cleanups and enhancements

Patches 6-14:
HWS, various bug fixes and additions

Patch 15:
HWS, setting timeout on polling
====================

Link: https://patch.msgid.link/20250102181415.1477316-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:45 -08:00
Yevgeny Kliteynik
d74ee6e197 net/mlx5: HWS, set timeout on polling for completion
Consolidate BWC polling for completion into one function
and set a time limit on the loop that polls for completion.
The timeout can be hit only if there is some issue with FW/PCI/HW,
such as FW being stuck, PCI issue, etc.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-16-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:41 -08:00
Vlad Dogaru
663e61225c net/mlx5: HWS, support flow sampler destination
Since sampler isn't currently supported via HWS, use a FW island
that forwards any packets to the supplied sampler.

Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-15-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:41 -08:00
Yevgeny Kliteynik
85ab9ea325 net/mlx5: HWS, use the right size when writing arg data
The wrong size was used when writing arg data. Fix it.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-14-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:41 -08:00
Vlad Dogaru
a105db854c net/mlx5: HWS, handle returned error value in pool alloc
Handle all negative return values as errors, not just -1.
The code previously treated -ENOMEM (and potentially other negative
values) as valid segment numbers, leading to incorrect behavior.
This fix ensures that any negative return value is treated as an error.

Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-13-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:41 -08:00
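The difference between checking for -1 and checking for any negative value, as described in the pool alloc commit above, can be sketched as follows. The allocator and both check helpers are hypothetical and do not reproduce the mlx5 HWS pool code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical allocator: a segment index >= 0, or a negative errno. */
static int demo_alloc_segment(int free_segments)
{
	if (free_segments <= 0)
		return -ENOMEM;
	return free_segments - 1;
}

/* Too narrow: only -1 counts as failure, so -ENOMEM looks like a segment. */
static bool demo_failed_narrow(int seg)
{
	return seg == -1;
}

/* Every negative return value is treated as an error. */
static bool demo_failed(int seg)
{
	return seg < 0;
}
```

When demo_alloc_segment() returns -ENOMEM, demo_failed() reports the error while demo_failed_narrow() lets it through as if it were a valid segment number.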
Yevgeny Kliteynik
be482f1d10 net/mlx5: HWS, fix definer's HWS_SET32 macro for negative offset
When bit offset for HWS_SET32 macro is negative,
UBSAN complains about the shift-out-of-bounds:

  UBSAN: shift-out-of-bounds in
  drivers/net/ethernet/mellanox/mlx5/core/steering/hws/definer.c:177:2
  shift exponent -8 is negative

Fixes: 74a778b4a6 ("net/mlx5: HWS, added definers handling")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:41 -08:00
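Shifting by a negative count is undefined behaviour in C, which is what UBSAN reports in the HWS_SET32 commit above. The sketch below shows one generic way to guard a shift against a negative offset; it is not necessarily what the patch does, and demo_shift() is a hypothetical helper that assumes the magnitude of the offset is below 32.

```c
#include <stdint.h>

/*
 * Shift @v left by @bit_off, or right by -@bit_off when the offset is
 * negative, so the shift count handed to the operator is never negative.
 * Assumes |bit_off| < 32.
 */
static uint32_t demo_shift(uint32_t v, int bit_off)
{
	if (bit_off >= 0)
		return v << bit_off;
	return v >> -bit_off;
}
```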
Yevgeny Kliteynik
2f851d1702 net/mlx5: HWS, separate SQ that HWS uses from the usual traffic SQs
Mark the HWS SQ as 'non_wire' so that 'Flow Update' flow
won't mix with network traffic.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:40 -08:00
Yevgeny Kliteynik
61fb92701b net/mlx5: HWS, num_of_rules counter on matcher should be atomic
Rule counter in matcher's struct is used in two places:

1. As a heuristic to decide when the number of rules has crossed a
certain percentage threshold and the matcher should be resized.
We don't mind here if the number is off by 1-2 due to concurrency.

2. When destroying matcher, the counter value is checked and the
user is warned if it is not 0. Here we lock all the queues, so the
counter will be correct.

We don't need to always have *exact* number, but we do need this
number to not be corrupted, which is what is happening when the
counter isn't atomic, due to update by different threads.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:40 -08:00
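The two uses of the rule counter described in the commit above can be sketched in userspace with C11 atomics standing in for the kernel's atomic_t. The struct and helpers are hypothetical, not the HWS matcher code; relaxed ordering is chosen here because the message only asks for an uncorrupted count, not an exact one.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical matcher with an atomic rule counter. */
struct demo_matcher {
	atomic_uint num_of_rules;
};

static void demo_rule_added(struct demo_matcher *m)
{
	atomic_fetch_add_explicit(&m->num_of_rules, 1, memory_order_relaxed);
}

static void demo_rule_removed(struct demo_matcher *m)
{
	atomic_fetch_sub_explicit(&m->num_of_rules, 1, memory_order_relaxed);
}

/* Heuristic: has the rule count reached @pct percent of @capacity? */
static bool demo_needs_resize(struct demo_matcher *m,
			      unsigned int capacity, unsigned int pct)
{
	unsigned long long n =
		atomic_load_explicit(&m->num_of_rules, memory_order_relaxed);

	return n * 100 >= (unsigned long long)capacity * pct;
}
```

Concurrent updates may make the heuristic read a value that is slightly stale, but each increment and decrement is applied atomically, so the counter itself is never corrupted.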
Yevgeny Kliteynik
05e3c287b9 net/mlx5: HWS, reduce memory consumption of a matcher struct
Instead of having a large array of action templates allocated with
kmalloc, have smaller array and allocate it with kvmalloc.

The size of the array represents the max number of AT attach
operations for the same matcher. This number is not expected
to be very high. In any case, when the limit is reached, the
next attempt to attach new AT will result in creation of a new
matcher and moving all the rules to this matcher.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:40 -08:00
Yevgeny Kliteynik
ad4da6cc36 net/mlx5: HWS, remove wrong deletion of the miss table list
Remove wrong cleanup of the old miss table list and
simplify the error flow in the function.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:40 -08:00
Yevgeny Kliteynik
1ce840c7a6 net/mlx5: HWS, change error flow on matcher disconnect
Currently, when firmware failure occurs during matcher disconnect flow,
the error flow of the function reconnects the matcher back and returns
an error, which continues running the calling function and eventually
frees the matcher that is being disconnected.
This leads to a case where we have a freed matcher on the matchers list,
which in turn leads to use-after-free and eventual crash.

This patch fixes that by not trying to reconnect the matcher back when
some FW command fails during disconnect.

Note that we're dealing here with FW error. We can't overcome this
problem. This might lead to bad steering state (e.g. wrong connection
between matchers), and will also lead to resource leakage, as it is
the case with any other error handling during resource destruction.

However, the goal here is to allow the driver to continue and not crash
the machine with use-after-free error.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Itamar Gozlan <igozlan@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:40 -08:00
Yevgeny Kliteynik
cc611ab6c7 net/mlx5: HWS, add error message on failure to move rules
Add error message for failure to move rules from
old matcher to new one during rehash.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:40 -08:00
Yevgeny Kliteynik
c86963aae5 net/mlx5: HWS, simplify allocations as we support only FDB
In pools, STCs and actions: no need to allocate array for various
table types, as HWS is used to manage only FDB flow tables.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:40 -08:00
Yevgeny Kliteynik
0a1ef807a4 net/mlx5: HWS, denote how refcounts are protected
Some HWS structs have refcounts that are just u32.
Comment how they are protected and add '__must_hold()'
annotation where applicable.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:39 -08:00
Yevgeny Kliteynik
0647f27a5f net/mlx5: HWS, remove implementation of unused FW commands
Remove functions that manage alias objects - they are not used.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:39 -08:00
Yevgeny Kliteynik
020ca0abae net/mlx5: HWS, remove the use of duplicated structs
Remove definition in HWS of structs that are already defined
in mlx5_ifc.h, and fix the usage of these structs.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250102181415.1477316-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:33:39 -08:00
Jakub Kicinski
7c7ea7056a Merge branch 'net-pcs-add-supported_interfaces-bitmap-for-pcs'
Russell King says:

====================
net: pcs: add supported_interfaces bitmap for PCS

This series adds supported_interfaces for PCS, which gives MAC code
a way to determine the interface modes that the PCS supports without
having to implement functions such as xpcs_get_interfaces(), or
workarounds such as in

 https://lore.kernel.org/20241213090526.71516-3-maxime.chevallier@bootlin.com

Patch 1 adds the new bitmask to struct phylink_pcs, and code within
phylink to validate that the PCS returned by the MAC driver supports
the interface mode - but only if this bitmask is non-empty.

Patches 2 through 4 fill in the interface modes for XPCS, Mediatek LynxI
and Lynx PCS.

Patch 5 adds support to stmmac to make use of this bitmask when filling
in phylink_config.supported_interfaces, eliminating the call to
xpcs_get_interfaces.

As xpcs_get_interfaces() is now unused outside of pcs-xpcs.c, patch 6
makes this function static and removes it from the header file.
====================

Link: https://patch.msgid.link/Z3fG9oTY9F9fCYHv@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:26:17 -08:00
Russell King (Oracle)
2410719cdd net: pcs: xpcs: make xpcs_get_interfaces() static
xpcs_get_interfaces() should no longer be used outside of the XPCS
code, so make it static.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tTffk-007Roi-JM@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:26:13 -08:00
Russell King (Oracle)
d13cefbb10 net: stmmac: use PCS supported_interfaces
Use the PCS' supported_interfaces member to build the MAC level
supported_interfaces bitmap.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tTfff-007Roc-Ff@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:26:13 -08:00
Russell King (Oracle)
b0f88c1b9a net: pcs: lynx: fill in PCS supported_interfaces
Fill in the new PCS supported_interfaces member with the interfaces
that Lynx supports.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tTffa-007RoV-Bo@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:26:13 -08:00
Russell King (Oracle)
b87d4ee16b net: pcs: mtk-lynxi: fill in PCS supported_interfaces
Fill in the new PCS supported_interfaces member with the interfaces
that the Mediatek LynxI supports.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/E1tTffV-007RoP-8D@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:26:13 -08:00
Russell King (Oracle)
906909fabb net: pcs: xpcs: fill in PCS supported_interfaces
Fill in the new PCS supported_interfaces member with the interfaces
that XPCS supports.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tTffQ-007RoJ-4u@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:26:12 -08:00
Russell King (Oracle)
fbb9a9d263 net: phylink: add support for PCS supported_interfaces bitmap
Add support for the PCS to specify which interfaces it supports, which
can be used by MAC drivers to build the main supported_interfaces
bitmap. Phylink also validates that the PCS returned by the MAC driver
supports the interface that the MAC was asked for.

An empty supported_interfaces bitmap from the PCS indicates that it
does not provide this information, and we handle that appropriately.

Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1tTffL-007RoD-1Y@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:26:12 -08:00
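The validation rule in the phylink commit above (check the PCS bitmap, but only when it is non-empty) can be sketched with a plain 32-bit mask. The kernel uses its own bitmap type for interface modes; the enum, struct demo_pcs and demo_pcs_supports() below are hypothetical stand-ins for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of interface modes. */
enum demo_iface {
	DEMO_IFACE_SGMII,
	DEMO_IFACE_1000BASEX,
	DEMO_IFACE_10GBASER,
};

struct demo_pcs {
	uint32_t supported_interfaces;	/* one bit per enum demo_iface */
};

static bool demo_pcs_supports(const struct demo_pcs *pcs,
			      enum demo_iface iface)
{
	/* An empty bitmap means the PCS gives no information: accept. */
	if (pcs->supported_interfaces == 0)
		return true;

	return (pcs->supported_interfaces & (UINT32_C(1) << iface)) != 0;
}
```

A PCS that only sets the SGMII bit is rejected for 10GBASE-R, while a PCS with an empty bitmap is accepted for any mode, matching the backward-compatible behaviour the message describes.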
Eric Dumazet
4475d56145 net: hsr: remove one synchronize_rcu() from hsr_del_port()
Use kfree_rcu() instead of synchronize_rcu()+kfree().

This might allow syzbot to fuzz HSR a bit faster...

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250103101148.3594545-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 16:09:13 -08:00
Eric Dumazet
95fc45d1de ax25: rcu protect dev->ax25_ptr
syzbot found a lockdep issue [1].

We should remove ax25 RTNL dependency in ax25_setsockopt()

This should also fix a variety of possible UAF in ax25.

[1]

WARNING: possible circular locking dependency detected
6.13.0-rc3-syzkaller-00762-g9268abe611b0 #0 Not tainted
------------------------------------------------------
syz.5.1818/12806 is trying to acquire lock:
 ffffffff8fcb3988 (rtnl_mutex){+.+.}-{4:4}, at: ax25_setsockopt+0xa55/0xe90 net/ax25/af_ax25.c:680

but task is already holding lock:
 ffff8880617ac258 (sk_lock-AF_AX25){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1618 [inline]
 ffff8880617ac258 (sk_lock-AF_AX25){+.+.}-{0:0}, at: ax25_setsockopt+0x209/0xe90 net/ax25/af_ax25.c:574

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (sk_lock-AF_AX25){+.+.}-{0:0}:
        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
        lock_sock_nested+0x48/0x100 net/core/sock.c:3642
        lock_sock include/net/sock.h:1618 [inline]
        ax25_kill_by_device net/ax25/af_ax25.c:101 [inline]
        ax25_device_event+0x24d/0x580 net/ax25/af_ax25.c:146
        notifier_call_chain+0x1a5/0x3f0 kernel/notifier.c:85
       __dev_notify_flags+0x207/0x400
        dev_change_flags+0xf0/0x1a0 net/core/dev.c:9026
        dev_ifsioc+0x7c8/0xe70 net/core/dev_ioctl.c:563
        dev_ioctl+0x719/0x1340 net/core/dev_ioctl.c:820
        sock_do_ioctl+0x240/0x460 net/socket.c:1234
        sock_ioctl+0x626/0x8e0 net/socket.c:1339
        vfs_ioctl fs/ioctl.c:51 [inline]
        __do_sys_ioctl fs/ioctl.c:906 [inline]
        __se_sys_ioctl+0xf5/0x170 fs/ioctl.c:892
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (rtnl_mutex){+.+.}-{4:4}:
        check_prev_add kernel/locking/lockdep.c:3161 [inline]
        check_prevs_add kernel/locking/lockdep.c:3280 [inline]
        validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
        __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
        __mutex_lock_common kernel/locking/mutex.c:585 [inline]
        __mutex_lock+0x1ac/0xee0 kernel/locking/mutex.c:735
        ax25_setsockopt+0xa55/0xe90 net/ax25/af_ax25.c:680
        do_sock_setsockopt+0x3af/0x720 net/socket.c:2324
        __sys_setsockopt net/socket.c:2349 [inline]
        __do_sys_setsockopt net/socket.c:2355 [inline]
        __se_sys_setsockopt net/socket.c:2352 [inline]
        __x64_sys_setsockopt+0x1ee/0x280 net/socket.c:2352
        do_syscall_x64 arch/x86/entry/common.c:52 [inline]
        do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
        entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(sk_lock-AF_AX25);
                               lock(rtnl_mutex);
                               lock(sk_lock-AF_AX25);
  lock(rtnl_mutex);

 *** DEADLOCK ***

1 lock held by syz.5.1818/12806:
  #0: ffff8880617ac258 (sk_lock-AF_AX25){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1618 [inline]
  #0: ffff8880617ac258 (sk_lock-AF_AX25){+.+.}-{0:0}, at: ax25_setsockopt+0x209/0xe90 net/ax25/af_ax25.c:574

stack backtrace:
CPU: 1 UID: 0 PID: 12806 Comm: syz.5.1818 Not tainted 6.13.0-rc3-syzkaller-00762-g9268abe611b0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:94 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
  print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
  check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2206
  check_prev_add kernel/locking/lockdep.c:3161 [inline]
  check_prevs_add kernel/locking/lockdep.c:3280 [inline]
  validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
  __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
  lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
  __mutex_lock_common kernel/locking/mutex.c:585 [inline]
  __mutex_lock+0x1ac/0xee0 kernel/locking/mutex.c:735
  ax25_setsockopt+0xa55/0xe90 net/ax25/af_ax25.c:680
  do_sock_setsockopt+0x3af/0x720 net/socket.c:2324
  __sys_setsockopt net/socket.c:2349 [inline]
  __do_sys_setsockopt net/socket.c:2355 [inline]
  __se_sys_setsockopt net/socket.c:2352 [inline]
  __x64_sys_setsockopt+0x1ee/0x280 net/socket.c:2352
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7b62385d29

Fixes: c433570458 ("ax25: fix a use-after-free in ax25_fillin_cb()")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250103210514.87290-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 15:57:01 -08:00
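
The "possible unsafe locking scenario" in the report above boils down to two code paths taking the same pair of locks in opposite order. As a rough illustration only, the following userspace sketch records "held, then wanted" edges between lock classes and refuses an acquisition that would close a cycle. It is a toy model of the idea, not lockdep's actual implementation; the enum names, `dep[][]`, `reachable()` and `record_acquire()` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes, named after the two classes in the report. */
enum lock_class { RTNL_MUTEX, SK_LOCK_AX25, NR_CLASSES };

/* dep[a][b] is true once some path acquired b while already holding a. */
static bool dep[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++) {
        if (dep[from][next] && !seen[next] &&
            reachable((enum lock_class)next, to, seen))
            return true;
    }
    return false;
}

/*
 * Record the ordering "held, then wanted".
 * Returns false, without recording anything, if 'held' is already
 * reachable from 'wanted', i.e. the new edge would create a cycle.
 */
static bool record_acquire(enum lock_class held, enum lock_class wanted)
{
    bool seen[NR_CLASSES] = { false };

    assert(held != wanted);
    if (reachable(wanted, held, seen))
        return false;
    dep[held][wanted] = true;
    return true;
}
```

In this model, the device notifier path first records `record_acquire(RTNL_MUTEX, SK_LOCK_AX25)`, which succeeds. The setsockopt path then attempts `record_acquire(SK_LOCK_AX25, RTNL_MUTEX)`, which is rejected because the reverse ordering is already known. That matches the shape of chains #1 and #0 in the report.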
Guillaume Nault
3f9f5cd005 sctp: Prepare sctp_v4_get_dst() to dscp_t conversion.
Define inet_sk_dscp() to get a dscp_t value from struct inet_sock, so
that sctp_v4_get_dst() can easily set ->flowi4_tos from a dscp_t
variable. For the SCTP_DSCP_SET_MASK case, we can just use
inet_dsfield_to_dscp() to get a dscp_t value.

Then, when converting ->flowi4_tos from __u8 to dscp_t, we'll just have
to drop the inet_dscp_to_dsfield() conversion function.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/1a645f4a0bc60ad18e7c0916642883ce8a43c013.1735835456.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 13:49:38 -08:00
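
To make the intent of the commit above concrete, here is a small userspace sketch of the pattern it describes: read the socket's DS field as a typed DSCP value, optionally override it, and convert back to a plain byte only at the point where the flow key is filled in. Everything here is a model under stated assumptions. The `*_model` types and helpers are hypothetical stand-ins, the `0xfc` mask assumes DSCP occupies the upper six bits of the DS field, and the kernel's sparse-checked `dscp_t` is approximated with a wrapper struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for dscp_t: a wrapper struct so it cannot be mixed with raw bytes. */
typedef struct { uint8_t v; } dscp_model_t;

/* Assumption: DSCP is the upper six bits of the DS field. */
#define DSCP_MODEL_MASK 0xfcu

/* Models the role of inet_dsfield_to_dscp(): raw DS field to typed DSCP. */
static inline dscp_model_t dsfield_to_dscp_model(uint8_t dsfield)
{
    dscp_model_t d = { (uint8_t)(dsfield & DSCP_MODEL_MASK) };
    return d;
}

/* Models the role of inet_dscp_to_dsfield(): typed DSCP back to a raw byte. */
static inline uint8_t dscp_to_dsfield_model(dscp_model_t dscp)
{
    return dscp.v;
}

/* Hypothetical, stripped-down socket and flow key. */
struct sock_model { uint8_t tos; };
struct flow_model { uint8_t flowi4_tos; };

/* Models the role of inet_sk_dscp(): fetch the socket's DSCP as a typed value. */
static inline dscp_model_t sk_dscp_model(const struct sock_model *sk)
{
    return dsfield_to_dscp_model(sk->tos);
}

/*
 * Keep the value typed until the last moment. Once the flow field itself
 * becomes typed, only the final conversion call needs to be dropped.
 */
static void fill_flow_tos(struct flow_model *fl, const struct sock_model *sk,
                          int override_set, uint8_t override_dsfield)
{
    dscp_model_t dscp;

    assert(fl != NULL && sk != NULL);
    dscp = sk_dscp_model(sk);
    if (override_set)   /* stands in for the SCTP_DSCP_SET_MASK case */
        dscp = dsfield_to_dscp_model(override_dsfield);
    fl->flowi4_tos = dscp_to_dsfield_model(dscp);
}
```

The point of the pattern is that all intermediate code handles only the typed value, so the later type change of the flow field touches a single line.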
Jakub Kicinski
286bb9985f Merge branch 'igc-deadcoding'
Dr. David Alan Gilbert says:

====================
igc deadcoding

This set removes some functions that are entirely unused
and have been since ~2018.
====================

Link: https://patch.msgid.link/20250102174142.200700-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 13:41:28 -08:00
Dr. David Alan Gilbert
c758890813 igc: Remove unused igc_read/write_pcie_cap_reg
The last uses of igc_read_pcie_cap_reg() and igc_write_pcie_cap_reg()
were removed in 2019 by
commit 16ecd8d9af ("igc: Remove the obsolete workaround")

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250102174142.200700-4-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 13:32:44 -08:00
Dr. David Alan Gilbert
121c3c6bc6 igc: Remove unused igc_read/write_pci_cfg wrappers
igc_read_pci_cfg() and igc_write_pci_cfg() were added in 2018 as part of
commit 146740f9ab ("igc: Add support for PF")
but have remained unused.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250102174142.200700-3-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 13:32:44 -08:00
Dr. David Alan Gilbert
b37dba891b igc: Remove unused igc_acquire/release_nvm
igc_acquire_nvm() and igc_release_nvm() were added in 2018 as part of
commit ab40561268 ("igc: Add NVM support")

but never used.

Remove them.

igc_i225.c has its own specific implementations.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250102174142.200700-2-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-06 13:32:44 -08:00