FROMLIST: wifi: ath12k: fix EAPOL TX failure caused by stale tcl_metadata bits #1419
A wrong channel survey index was introduced in ath12k_mac_op_get_survey by [1], which can cause ACS to fail. The index is decremented before being used, resulting in an incorrect value when accessing the channel survey data.
Fix the index handling to ensure the correct survey entry is used and avoid ACS failures.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: 4f242b1 ("wifi: ath12k: support get_survey mac op for single wiphy") # [1]
Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com>
Commit [1] introduces dp->reo_cmd_update_rx_queue_list for the purpose of tracking all pending REO queue flush commands. The helper ath12k_dp_prepare_reo_update_elem() allocates an element, populates it with REO queue information, then adds it to the list. The element is used during the cleanup stage to finally unmap/free the corresponding REO queue buffer.
In MLO scenarios with more than one link, for non dp_primary_link_only chips like WCN7850, that helper is called for each link peer. This results in multiple elements being added to the list, all of them pointing to the same REO queue buffer. Consequently the same buffer gets unmapped/freed multiple times:
BUG kmalloc-2k (Tainted: G B W O ): Object already free
-----------------------------------------------------------------------------
Allocated in ath12k_wifi7_dp_rx_assign_reoq+0xce/0x280 [ath12k_wifi7] age=7436 cpu=10 pid=16130
 __kmalloc_noprof
 ath12k_wifi7_dp_rx_assign_reoq
 ath12k_dp_rx_peer_tid_setup
 ath12k_dp_peer_setup
 ath12k_mac_station_add
 ath12k_mac_op_sta_state
 [...]
Freed in ath12k_dp_rx_tid_cleanup.part.0+0x25/0x40 [ath12k] age=1 cpu=27 pid=16137
 kfree
 ath12k_dp_rx_tid_cleanup.part.0
 ath12k_dp_rx_reo_cmd_list_cleanup
 ath12k_dp_cmn_device_deinit
 ath12k_core_stop
 ath12k_core_hw_group_cleanup
 ath12k_pci_remove
Fix this by allowing list addition for the primary link only. Note that dp_primary_link_only chips like QCN9274 are not affected by this change, because that's what they were doing in the first place.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: 3bf2e57 ("wifi: ath12k: Add Retry Mechanism for REO RX Queue Update Failures") # [1]
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221011
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com>
Add support for 5 GHz channel 177 with a center frequency of 5885 MHz and Operating Class 125 per IEEE Std 802.11-2024 Table E-4.
Channels 169, 173, and 177 are in the 5.9 GHz band and must be disabled when the 5.9 GHz service bit is not supported. The 5.9 GHz band is only permitted for WLAN operation under FCC regulations.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Link: https://lore.kernel.org/ath12k/20260415063857.2462256-1-yintang@qti.qualcomm.com
Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com>
ath12k_dp_rx_deliver_msdu() currently uses hal_rx_desc_data::peer_id parsed from the mpdu_start descriptor to do peer lookup. However, in an A-MSDU aggregation scenario, hardware only populates the mpdu_start descriptor for the first sub-msdu, but not the following ones. In that case peer_id could be invalid, leading to peer lookup failure:
ath12k_wifi7_pci 0000:06:00.0: rx skb 00000000c391c041 len 1532 peer (null) 0 ucast sn 0 eht320 rate_idx 12 vht_nss 2 freq 6105 band 3 flag 0x40d1a fcs-err 0 mic-err 0 amsdu-more 0
As a result pubsta is NULL and parts of the ieee80211_rx_status structure are left uninitialized, which may cause unexpected behavior.
Fix it by switching the normal RX path to use ath12k_skb_rxcb::peer_id, which is parsed from the REO ring's rx_mpdu_desc and is always valid.
hal_rx_desc_data::peer_id is still used in ath12k_wifi7_dp_rx_frag_h_mpdu(), which is safe since A-MSDU aggregation does not occur for fragmented frames. Similarly, ath12k_skb_rxcb::peer_id may be overwritten by hal_rx_desc_data::peer_id in ath12k_wifi7_dp_rx_h_mpdu(), which only handles non-aggregated multicast/broadcast traffic.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Link: https://lore.kernel.org/all/20260427-ath12k-fix-peer-id-source-v1-1-b5f701fb8e88@oss.qualcomm.com
Fixes: 11157e0 ("wifi: ath12k: Use ath12k_dp_peer in per packet Tx & Rx paths")
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
HAL_TLV_HDR_LEN was using the wrong bitmask; fix it to cover
bits [21:10]. Also drop HAL_SRNG_TLV_HDR_{TAG,LEN} and use the
generic TLV header bit definitions for TLV32/TLV64 encode/decode
to avoid redundant macros.
Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: d889913 ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260509025819.1641630-2-miaoqing.pan@oss.qualcomm.com/
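
As a minimal sketch of what a length mask covering bits [21:10] means for a 32-bit header word: the GENMASK32 helper and the function names below are illustrative stand-ins and are not the driver's macros.

```c
#include <stdint.h>

/* Illustrative stand-in for the kernel's GENMASK(); valid for 0 <= l <= h <= 31. */
#define GENMASK32(h, l) \
	((uint32_t)((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l))))

/* Length field occupying bits [21:10], as stated in the commit message. */
#define TLV_HDR_LEN_MASK GENMASK32(21, 10)
#define TLV_HDR_LEN_SHIFT 10

/* Extract the length field from a raw header word. */
static inline uint32_t tlv_hdr_len(uint32_t hdr)
{
	return (hdr & TLV_HDR_LEN_MASK) >> TLV_HDR_LEN_SHIFT;
}

/* Build a header word carrying only a length; all other fields stay zero. */
static inline uint32_t tlv_hdr_set_len(uint32_t len)
{
	return (len << TLV_HDR_LEN_SHIFT) & TLV_HDR_LEN_MASK;
}
```

A mask that is wider than the real field would fold neighbouring bits into the decoded length, which is why the exact bit range matters.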
Change TLV decode helpers to return the TLV value pointer and optionally decode tag/len/usrid via out parameters. This allows reusing the helpers for DP monitor RX status header TLV parsing and avoids duplicated header decoding in callers.
No functional change intended.
Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260509025819.1641630-3-miaoqing.pan@oss.qualcomm.com/
… alignment
Wi-Fi 7 monitor RX status TLV parsing needs to decode TLV headers and advance the pointer with the correct header alignment. Different targets use different TLV header layouts (32-bit vs 64-bit), but the HAL ops for dp_mon RX status header decode and header alignment were not populated for all wifi7 targets.
Add dp_mon RX status TLV header decode callbacks and TLV header alignment helpers to the wifi7 HAL ops for QCC2072, QCN9274 and WCN7850. Export helpers to query the required TLV header alignment for 32-bit and 64-bit TLV headers so the caller can align the TLV walk correctly across targets.
Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260509025819.1641630-4-miaoqing.pan@oss.qualcomm.com/
Wi-Fi 7 monitor status parsing in dp_mon currently assumes a 64-bit TLV header and directly decodes tag/len/userid from struct hal_tlv_64_hdr. On chips using a 32-bit TLV header (e.g. QCC2072), this causes monitor RX status packets to be dropped during TLV parsing.
Introduce HAL helpers to decode TLV header fields (tag/len/userid/value) for both 32-bit and 64-bit header layouts, without changing the actual TLV parsing logic.
Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260509025819.1641630-5-miaoqing.pan@oss.qualcomm.com/
Validate the pointer to the next RX monitor TLV more strictly by ensuring that at least a full TLV header is available within the status buffer before continuing TLV parsing. Prevent potential out-of-bounds access when handling malformed or truncated RX monitor status data.
Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00068-QCACOLSWPL_V1_TO_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260509025819.1641630-6-miaoqing.pan@oss.qualcomm.com/
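
The stricter validation described above can be sketched in user-space C roughly as follows. This is a hedged illustration, not the driver's parser: count_valid_tlvs() is a hypothetical helper, the header is assumed to be little-endian with the length in bits [21:10] of its first word, and hdr_size (4 or 8) stands in for the per-target header layout and alignment.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count the well-formed TLVs in buf, stopping at the first truncated one.
 * hdr_size must be 4 or 8 (32-bit or 64-bit header layout). The position
 * of the length field is an assumption made for this sketch.
 */
static size_t count_valid_tlvs(const uint8_t *buf, size_t buf_len,
			       size_t hdr_size)
{
	size_t off = 0, count = 0;

	/* Continue only while a full TLV header still fits in the buffer. */
	while (buf_len - off >= hdr_size) {
		uint32_t word = (uint32_t)buf[off] |
				((uint32_t)buf[off + 1] << 8) |
				((uint32_t)buf[off + 2] << 16) |
				((uint32_t)buf[off + 3] << 24);
		size_t len = (word >> 10) & 0xFFF;

		/* Reject a payload that would run past the end of the buffer. */
		if (len > buf_len - off - hdr_size)
			break;

		count++;

		/* Step over header and payload, rounded up to the header alignment. */
		off += hdr_size + len;
		off = (off + hdr_size - 1) & ~(hdr_size - 1);
		if (off > buf_len)
			break;
	}

	return count;
}
```

The loop condition is the important part: checking only that the next pointer is inside the buffer would still allow a header read that straddles the end of the buffer.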
…y_tkip_mic()
In ath12k_wifi7_dp_rx_h_verify_tkip_mic(), the call to ath12k_dp_rx_check_nwifi_hdr_len_valid() may return false when the NWIFI header length is invalid, causing the function to abort early with -EINVAL.
When this happens, the error propagates to ath12k_wifi7_dp_rx_h_defrag(), which clears first_frag by setting it to NULL. As a result, the corresponding MSDU is no longer referenced by the defragmentation path and is never freed.
This leads to a memory leak for the affected MSDU on this error path. Proper cleanup is required to ensure the MSDU is released when header validation fails during TKIP MIC verification.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: 9a0dddf ("wifi: ath12k: Fix invalid data access in ath12k_dp_rx_h_undecap_nwifi")
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260512021108.2031651-1-miaoqing.pan@oss.qualcomm.com/
…ecap_nwifi
In certain cases, hardware might provide packets with a length greater than the maximum native Wi-Fi header length. This can lead to accessing and modifying fields in the header within the ath11k_dp_rx_h_undecap_nwifi() function for the DP_RX_DECAP_TYPE_NATIVE_WIFI decap type and potentially result in invalid data access and memory corruption.
Kernel stack is corrupted in: ath11k_dp_rx_h_undecap+0x6b0/0x6b0 [ath11k]
Call trace:
 ath11k_dp_rx_h_mpdu+0x0/0x2e8 [ath11k]
 ath11k_dp_rx_h_mpdu+0x1e0/0x2e8 [ath11k]
 ath11k_dp_rx_wbm_err+0x1e0/0x450 [ath11k]
 ath11k_dp_rx_process_wbm_err+0x2fc/0x460 [ath11k]
 ath11k_dp_service_srng+0x2e0/0x348 [ath11k]
Add a sanity check before processing the SKB to prevent invalid data access in the undecap native Wi-Fi function for the DP_RX_DECAP_TYPE_NATIVE_WIFI decap type.
This is adapted from the discussion/patch of the ath12k driver [1].
Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-04685-QCAHSPSWPL_V1_V2_SILICONZ_IOE-1
Link: https://lore.kernel.org/linux-wireless/20250211090302.4105141-1-tamizh.raja@oss.qualcomm.com/ # [1]
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260512022351.2033155-2-miaoqing.pan@oss.qualcomm.com/
In the WBM error path, while processing TKIP MIC errors, MSDU length is fetched from the hal_rx_desc's msdu_end. This MSDU length is directly passed to skb_put() without validation. In stress test scenarios, the WBM error ring may receive invalid descriptors, which could lead to an invalid MSDU length.
To fix this, add a check to drop the skb when the calculated MSDU length is greater than the skb size.
This is adapted from the discussion/patch of the ath12k driver [1].
Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-04685-QCAHSPSWPL_V1_V2_SILICONZ_IOE-1
Link: https://lore.kernel.org/linux-wireless/20250416021903.3178962-1-nithyanantham.paramasivam@oss.qualcomm.com/ # [1]
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260512022351.2033155-3-miaoqing.pan@oss.qualcomm.com/
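
The length check can be modelled outside the kernel as below. This is only a sketch: struct buf_model and buf_put_checked() are hypothetical stand-ins for an skb and for the validated skb_put() call, and the header length parameter is an assumption about how the total length is computed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: total capacity and bytes in use. */
struct buf_model {
	size_t size;
	size_t len;
};

/*
 * Extend the buffer to hdr_len + msdu_len only if that fits in its
 * capacity. Returns false when the descriptor-supplied length is too
 * large, in which case the caller is expected to drop the frame.
 */
static bool buf_put_checked(struct buf_model *b, size_t hdr_len,
			    size_t msdu_len)
{
	/* Written to avoid overflow in hdr_len + msdu_len. */
	if (hdr_len > b->size || msdu_len > b->size - hdr_len)
		return false;

	b->len = hdr_len + msdu_len;
	return true;
}
```

Rejecting the frame before the buffer is extended keeps a corrupt descriptor from ever moving the tail pointer past the allocation.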
For some chipsets, firmware can report HTT_T2H_MSG_TYPE_PEER_MAP2 with
peer_id 0 as a valid value for mapping ath12k_dp_link_peer to
ath12k_dp_peer.
ath12k_dp_peer_find_by_peerid() currently treats peer_id 0 as invalid.
When firmware assigns peer_id 0, peer lookup fails. As a result,
DHCP OFFER packets are dropped in __ieee80211_rx_handle_packet()
because pubsta is NULL.
ath12k_dp_rx_deliver_msdu() <- rx_info->peer_id 0
ath12k_dp_peer_find_by_peerid -> peer NULL
ieee80211_rx_napi <- pubsta NULL
ieee80211_rx_list
__ieee80211_rx_handle_packet <- pubsta NULL, skb undelivered
The following error in the TX completion path is caused by the same issue:
ath12k_wifi7_pci 0000:04:00.0: dp_tx: failed to find the peer with peer_id 0
The error message is triggered by:
ath12k_wifi7_dp_tx_complete_msdu
ath12k_dp_link_peer_find_by_peerid <- ts->peer_id 0
ath12k_dp_peer_find_by_peerid -> peer NULL
ath12k_dp_tx_htt_tx_complete_buf
ath12k_dp_link_peer_find_by_peerid <- peer_id 0
ath12k_dp_peer_find_by_peerid -> peer NULL
Fix this by allowing peer_id 0 in ath12k_dp_peer_find_by_peerid() and
rejecting only values >= ATH12K_DP_PEER_ID_INVALID.
Also update peer_id 0 handling in the monitor path:
- Always call ath12k_dp_link_peer_find_by_peerid() in
  ath12k_dp_rx_h_find_link_peer() to fetch the peer, including when
  peer_id is 0.
- Always store peer_id in ppdu_info->peer_id in
  ath12k_wifi7_dp_mon_rx_parse_status_tlv(), including peer_id 0.
Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00074-QCACOLSWPL_V1_TO_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c7-00108-QCAHMTSWPL_V1.0_V2.0_SILICONZ_UPSTREAM-3
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.6-01243-QCAHKSWPL_SILICONZ-1
Signed-off-by: Hangtian Zhu <hangtian.zhu@oss.qualcomm.com>
Link: https://lore.kernel.org/all/20260512025732.1297849-1-hangtian.zhu@oss.qualcomm.com/
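
A minimal sketch of the lookup rule described above, assuming a simple array-backed table: PEER_ID_INVALID, PEER_TABLE_SIZE, struct peer and peer_find_by_id() are hypothetical names and values chosen for illustration, not the driver's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel and table size; the driver's values may differ. */
#define PEER_ID_INVALID 0x3FFF
#define PEER_TABLE_SIZE 8

struct peer {
	uint16_t peer_id;
};

static struct peer *peer_table[PEER_TABLE_SIZE];

static struct peer *peer_find_by_id(uint16_t peer_id)
{
	/* peer_id 0 is a legitimate ID; only the sentinel and above are rejected. */
	if (peer_id >= PEER_ID_INVALID)
		return NULL;

	/* Bounds check for this sketch's small table. */
	if (peer_id >= PEER_TABLE_SIZE)
		return NULL;

	return peer_table[peer_id];
}
```

With a check of the form `!peer_id`, the first assertion-style case (a peer registered at ID 0) would fail to resolve, which matches the RX and TX completion failures listed in the commit message.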
Export irq_can_set_affinity() for loadable drivers that need a runtime check for IRQ affinity capability.
In hierarchical IRQ setups where the effective irqchip path lacks .irq_set_affinity(), drivers may need to switch to a fallback policy. Without this export, module drivers cannot use the core helper and have to open-code equivalent checks.
Signed-off-by: Hangtian Zhu <hangtian.zhu@oss.qualcomm.com>
Link: https://lore.kernel.org/all/20260519011627.713068-1-hangtian.zhu@oss.qualcomm.com/
…unavailable
Determine threaded NAPI policy from runtime IRQ capability of the DP MSI IRQ.
If irq_can_set_affinity() reports that affinity cannot be set, enable threaded NAPI for DP interrupt groups so datapath processing is not constrained by a single-CPU softirq context.
On RB3Gen2, where IRQ affinity is unavailable in the effective IRQ path, EHT160 UDP downlink throughput improved from 802 Mbps to 2.58 Gbps after enabling threaded NAPI.
Tested-on: QCC2072 hw1.0 PCI WLAN.COL.1.0.c2-00074-QCACOLSWPL_V1_TO_SILICONZ-1
Signed-off-by: Hangtian Zhu <hangtian.zhu@oss.qualcomm.com>
Link: https://lore.kernel.org/all/20260519011627.713068-1-hangtian.zhu@oss.qualcomm.com/
… dual-station support
When P2P support is enabled, wpa_supplicant creates a p2p-device interface by default, which implicitly consumes one vdev. On systems managed by NetworkManager, this interface cannot be reliably disabled, leaving only two usable interfaces for user configurations.
Increase num_vdevs to four for QCA6390 hw2.0, WCN6855 hw2.0/hw2.1, QCA2066 hw2.1, and QCA6698AQ hw2.1 to account for the implicit p2p-device and enable common concurrency scenarios such as AP + AP + STA.
This change increases interface concurrency in the two-channel scenario by raising the maximum vdev limit, while keeping other combination rules unchanged.
Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-05266-QCAHSTSWPLZ_V2_TO_X86-1
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-04685-QCAHSPSWPL_V1_V2_SILICONZ_IOE-1
Tested-on: QCA2066 hw2.1 PCI WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.9
Tested-on: QCA6698AQ hw2.1 PCI WLAN.HSP.1.1-04685-QCAHSPSWPL_V1_V2_SILICONZ_IOE-1
Link: https://lore.kernel.org/linux-wireless/20260525020711.2590815-1-wei.zhang@oss.qualcomm.com/
Signed-off-by: Wei Zhang <wei.zhang@oss.qualcomm.com>
…rror paths
ath12k_mac_vdev_create() has three error path issues that leave arvif in an inconsistent state:
1. When ath12k_wmi_vdev_create() fails, the function returns directly without clearing arvif->ar, which was already set before the WMI call. Subsequent code checking arvif->ar to determine vdev readiness will see a non-NULL value despite no vdev existing in firmware.
2. When ath12k_wmi_send_peer_delete_cmd() fails in err_peer_del, the code jumps to err:, skipping the DP peer cleanup and vdev rollback and leaving num_created_vdevs, vdev maps and arvif list membership live.
3. When ath12k_wait_for_peer_delete_done() fails, the code jumps to err_vdev_del:, skipping the DP peer cleanup.
Fix by changing the ath12k_wmi_vdev_create() failure to goto err instead of returning directly, routing both err_peer_del failure paths through err_dp_peer_del: for proper DP peer and vdev rollback, and consolidating the arvif state cleanup at err:.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: 477cabf ("wifi: ath12k: modify link arvif creation and removal for MLO")
Link: https://lore.kernel.org/linux-wireless/20260512044906.1735821-1-wei.zhang@oss.qualcomm.com/
Signed-off-by: Wei Zhang <wei.zhang@oss.qualcomm.com>
…y link
_ieee80211_set_active_links() calls _ieee80211_link_use_channel() for each newly-added link and WARN_ON_ONCE()s if it fails. The call uses assign_on_failure=true, which allows mac80211 to continue despite driver failures, but when a mac80211-level channel validation fails (e.g., combinations check, DFS, or no available radio), drv_assign_vif_chanctx() is never reached. Since ath12k_mac_vdev_create() is only called from that path, arvif->is_created remains false and arvif->ar remains NULL for the failed link.
The subsequent drv_change_sta_links() call reaches ath12k_mac_op_change_sta_links(), which allocates an arsta and sets ahsta->links_map |= BIT(link_id) for the broken link before checking whether the link is ready. When the vdev was never created, only station_add() is skipped, but the link remains in links_map. Any subsequent operation iterating links_map and dereferencing arvif->ar without a NULL check will crash.
Two observed examples are NULL deref in ath12k_mac_ml_station_remove() on disconnect and in ath12k_mac_op_set_key() when wpa_supplicant installs PTK keys.
BUG: Unable to handle kernel NULL pointer dereference at 0x00000000
pc : ath12k_mac_station_post_remove+0x40/0xe8 [ath12k]
Call trace:
 ath12k_mac_station_post_remove+0x40/0xe8 [ath12k]
 ath12k_mac_op_sta_state+0xb60/0x1720 [ath12k]
 drv_sta_state+0x100/0xbd8 [mac80211]
 __sta_info_destroy_part2+0x148/0x178 [mac80211]
 ieee80211_set_disassoc+0x500/0x678 [mac80211]
BUG: Unable to handle kernel NULL pointer dereference at 0x00000000
pc : ath12k_mac_op_set_key+0x1f8/0x2c0 [ath12k]
Call trace:
 ath12k_mac_op_set_key+0x1f8/0x2c0 [ath12k]
 drv_set_key+0x70/0x100 [mac80211]
 ieee80211_key_enable_hw_accel+0x78/0x260 [mac80211]
 ieee80211_add_key+0x16c/0x2ac [mac80211]
 nl80211_new_key+0x138/0x280 [cfg80211]
Fix this by checking arvif->is_created before calling ath12k_mac_alloc_assign_link_sta(). This prevents the broken link from entering links_map, so all subsequent operations iterating the bitmap are protected. The reliability of arvif->is_created across all error paths is ensured by the preceding patch.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: a27fa61 ("wifi: ath12k: support change_sta_links() mac80211 op")
Link: https://lore.kernel.org/linux-wireless/20260512044906.1735821-1-wei.zhang@oss.qualcomm.com/
Signed-off-by: Wei Zhang <wei.zhang@oss.qualcomm.com>
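
The gating idea can be sketched as follows, assuming a per-link "created" flag and a 16-bit link bitmap. The names struct link_vif, MAX_LINKS and usable_links() are hypothetical; the driver performs this check inline in its change_sta_links handler rather than through a helper like this.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_LINKS 15 /* hypothetical upper bound on link IDs */

/* Hypothetical per-link state: true once the vdev exists in firmware. */
struct link_vif {
	bool is_created;
};

/*
 * Return the subset of requested links whose vdev was actually created.
 * Links that never got a vdev are left out of the map, so later code
 * iterating the map never touches their unset state.
 */
static uint16_t usable_links(const struct link_vif *links, uint16_t new_links)
{
	uint16_t map = 0;

	for (unsigned int id = 0; id < MAX_LINKS; id++) {
		if (!(new_links & (1u << id)))
			continue;
		if (!links[id].is_created)
			continue; /* skip the broken link before any allocation */
		map |= (uint16_t)(1u << id);
	}

	return map;
}
```

Filtering at the point where the bitmap is built means the many consumers of the bitmap do not each need their own NULL check.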
For WCN7850, the MAC buffer ring size was updated to 2048 in 955df16 ("wifi: ath12k: change MAC buffer ring size to 2048") to increase peak throughput. During RX, however, the throughput can still be observed to drop by about 30% from its peak value and then recover, and this behavior repeats.
After increasing the MAC buffer ring size to 4096, the data rate drop is gone.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com>
Commit [1] introduced a regression causing severely degraded MLO RX
throughput on WCN7850.
On WCN7850, there is only a single ar instance, but MLO uses two
link IDs. ath12k_dp_peer->hw_links[] is indexed using ar->hw_link_id,
which causes both MLO link IDs to be stored at the same index.
As a result, an incorrect link ID is assigned to MSDUs in
ath12k_dp_rx_deliver_msdu(), leading to severe MLO RX throughput loss.
Different chipsets identify the per-MSDU link differently:
- On QCN9274 / IPQ5332, the host owns multiple ar instances and the
per-MSDU hw_link_id from the RX descriptor maps cleanly through
dp_peer->hw_links[hw_link_id] to the IEEE link_id.
- On single-ar chipsets like WCN7850 / QCC2072, there is only one ar
instance for both MLO links, so dp_peer->hw_links[] has just one
valid slot and cannot be used to distinguish the two links. To
resolve the link, walk dp_peer->link_peers[] and match by
rxcb->peer_id, which on the link_peer side identifies the link
peer for the MSDU.
Add a new hw_op set_rx_link_id() so each chipset resolves the link
on the RX fast path using whatever signal it actually has, and let
the op itself decide whether to populate rx_status::link_valid and
rx_status::link_id:
QCN9274 / IPQ5332 : always derive link_id from
dp_peer->hw_links[rxcb->hw_link_id] and set
link_valid.
WCN7850 / QCC2072 : walk the link_peers[] of dp_peer to find the
link_peer whose peer_id matches rxcb->peer_id,
and set link_valid only when a match is found.
Otherwise leave link_valid clear so that
mac80211 can fall back to its own link
resolution path (via addr2 / deflink).
For WCN7850 / QCC2072, walking dp_peer->link_peers[] is bounded by
the number of links actually populated, so introduce a link_peers_map
bitmap (unsigned long) in struct ath12k_dp_peer that tracks populated
slots and use for_each_set_bit() to iterate. Non-MLO clients hit one
slot, current MLO clients hit two; the full ATH12K_NUM_MAX_LINKS
array is never scanned. The bitmap is maintained with WRITE_ONCE() on
the write side (under dp_hw->peer_lock) paired with READ_ONCE() on
both the lockless RX read side and the write-side RMW for KCSAN
correctness.
Also guard the dp_peer dereference in ath12k_mac_peer_cleanup_all()
with a NULL check, since peer->dp_peer can be NULL for self-peers or
peers not yet fully assigned. The pre-existing rcu_assign_pointer()
call there had the same latent issue.
This restores the correct link ID on WCN7850 without changing the
QCN9274 / IPQ5332 data path, which keeps its O(1) hw_links[]
indexing.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c5-00302-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.115823.3
Fixes: 11157e0 ("wifi: ath12k: Use ath12k_dp_peer in per packet Tx & Rx paths") # [1]
Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com>
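
The single-ar lookup described above can be sketched in portable C as below. This is an illustration under stated assumptions: struct link_peer, struct dp_peer, MAX_LINKS and resolve_link_id() are simplified, hypothetical stand-ins, the kernel's for_each_set_bit() is replaced by a plain loop, and the READ_ONCE()/WRITE_ONCE() pairing is omitted because the sketch is single-threaded.

```c
#include <stdint.h>

#define MAX_LINKS 15 /* hypothetical size of the link_peers[] array */

struct link_peer {
	uint16_t peer_id;
	uint8_t link_id;
};

struct dp_peer {
	unsigned long link_peers_map; /* bit i set => link_peers[i] is populated */
	struct link_peer *link_peers[MAX_LINKS];
};

/*
 * Return the link ID of the link peer matching peer_id, or -1 when no
 * populated slot matches. On -1 the caller leaves link_valid clear.
 */
static int resolve_link_id(const struct dp_peer *peer, uint16_t peer_id)
{
	/* Ignore any stray bits beyond the array. */
	unsigned long map = peer->link_peers_map & ((1UL << MAX_LINKS) - 1);

	for (unsigned int i = 0; map; i++, map >>= 1) {
		if (!(map & 1UL))
			continue; /* empty slot, never dereferenced */
		if (peer->link_peers[i]->peer_id == peer_id)
			return peer->link_peers[i]->link_id;
	}

	return -1;
}
```

Because the loop ends as soon as the remaining map is zero, a non-MLO client with one populated slot costs one comparison regardless of the array size.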
This reverts commit 8090556.
Call trace:
 rcu_note_context_switch+0x4c4/0x508 (P)
 __schedule+0xbc/0x1204
 schedule+0x34/0x110
 schedule_timeout+0x84/0x11c
 __mhi_device_get_sync+0x164/0x228 [mhi]
 mhi_device_get_sync+0x1c/0x3c [mhi]
 ath12k_wifi7_pci_bus_wake_up+0x20/0x2c [ath12k_wifi7]
 ath12k_pci_read32+0x58/0x350 [ath12k]
 ath12k_pci_clear_dbg_registers+0x28/0xb8 [ath12k]
 ath12k_pci_panic_handler+0x20/0x44 [ath12k]
 ath12k_core_panic_handler+0x28/0x3c [ath12k]
 notifier_call_chain+0x78/0x1c0
 atomic_notifier_call_chain+0x3c/0x5c
ath12k_core_panic_handler() is invoked via atomic_notifier_call_chain(), which runs inside an RCU read-side critical section. The current code calls ath12k_pci_sw_reset() synchronously from this context, which eventually reaches mhi_device_get_sync() and schedule_timeout(), triggering a voluntary context switch within RCU.
Revert change "wifi: ath12k: add panic handler" to avoid this issue.
Tested-on: WLAN.HMT.1.1.c7-00108-QCAHMTSWPL_V1.0_V2.0_SILICONZ_UPSTREAM-3
Link: https://lore.kernel.org/all/20260612032332.2278338-1-yingying.tang@oss.qualcomm.com/
Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com>
Commit afbab6e ("wifi: ath12k: modify ath12k_mac_op_bss_info_changed() for MLO") replaced the bss_info_changed() callback with vif_cfg_changed() and link_info_changed() to support Multi-Link Operation (MLO). As a result, the station power save configuration is no longer correctly applied in ath12k_mac_bss_info_changed().
Move the handling of 'BSS_CHANGED_PS' into ath12k_mac_op_vif_cfg_changed() to align with the updated callback structure introduced for MLO, ensuring proper power-save behavior for station interfaces.
Tested-on: WCN7850 hw2.0 PCI WLAN.IOE_HMT.1.1-00011-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1
Fixes: afbab6e ("wifi: ath12k: modify ath12k_mac_op_bss_info_changed() for MLO")
Signed-off-by: Miaoqing Pan <miaoqing.pan@oss.qualcomm.com>
Link: https://lore.kernel.org/all/20250908015025.1301398-1-miaoqing.pan@oss.qualcomm.com/
Signed-off-by: Daizhuang Bai <daizhuang.bai@oss.qualcomm.com>
…ng_access_begin
In ATH11K_QMI_EVENT_FW_READY, ATH11K_FLAG_REGISTERED is set unconditionally even when ath11k_core_qmi_firmware_ready() fails. This leaves the driver in an inconsistent state where initialization is considered complete although the firmware ready handling did not finish successfully.
During the subsequent SSR, the driver enters the restart path based on this incorrect state and dereferences uninitialized srng members, resulting in a NULL pointer dereference.
Call trace:
 ath11k_hal_srng_access_begin+0xc/0x60 [ath11k] (P)
 ath11k_ce_cleanup_pipes+0x17c/0x180 [ath11k]
 ath11k_core_restart+0x40/0x168 [ath11k]
Fix this by:
- skipping firmware_ready if ATH11K_FLAG_REGISTERED is already set
- setting ATH11K_FLAG_REGISTERED only when firmware_ready succeeds
- setting ATH11K_FLAG_QMI_FAIL and aborting the FW_READY handling on error
Tested-on: WCN6750 hw1.0 AHB WLAN.MSL.2.0.c2-00204-QCAMSLSWPLZ-1
Fixes: 6fe62a8 ("wifi: ath11k: Add cold boot calibration support on WCN6750")
Signed-off-by: Gaole Zhang <gaole.zhang@oss.qualcomm.com>
Link: https://lore.kernel.org/linux-wireless/20260609090609.4041009-1-gaole.zhang@oss.qualcomm.com/
…data bits
On WCN7850, after the following sequence:
1. load ath12k and connect to a non-MLO AP
2. disconnect and connect to an MLO AP
3. disconnect and reconnect to the non-MLO AP
the third connection always fails with a 4-Way handshake timeout. The supplicant transmits message 2 of 4 four times in response to AP retries of message 1, but the AP never sees any of them.
ath12k_dp_vdev_tx_attach() composes dp_link_vif->tcl_metadata using |=, but dp_link_vif is embedded in struct ath12k_dp_vif and its slots are reused across vif/peer teardown and setup. Since tcl_metadata is never cleared on detach, vdev_id bits from a previous attach remain set when the same link slot is reused with a different vdev_id. In this specific issue, the same link slot is used for vdev_id 0, then vdev_id 1, then vdev_id 0 again. The OR yields tcl_metadata == 0x9, which encodes vdev_id 1 in the HTT_TCL_META_DATA_VDEV_ID field even though ti.vdev_id is 0. Firmware then routes the EAPOL frame to the wrong vdev and the AP never receives message 2.
Use plain assignment instead of |= so the field is fully recomputed from the current arvif on every attach.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c7-00108-QCAHMTSWPL_V1.0_V2.0_SILICONZ_UPSTREAM-3
Fixes: af66c76 ("wifi: ath12k: Refactor ath12k_vif structure")
Link: https://lore.kernel.org/all/20260609-ath12k-fix-eapol-tcl-metadata-v1-1-d47e6f90d4ee@oss.qualcomm.com/
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com>
Merge Check Failed: No Component Found Configuration Error: No component found for branch 'tech/net/ath'. There is no component associated with the provided branch in Polaris. Please verify the branch configuration. Branch: |
On WCN7850, after the following sequence:
1. load ath12k and connect to a non-MLO AP
2. disconnect and connect to an MLO AP
3. disconnect and reconnect to the non-MLO AP
the third connection always fails with a 4-Way handshake timeout. The supplicant transmits message 2 of 4 four times in response to AP retries of message 1, but the AP never sees any of them.
ath12k_dp_vdev_tx_attach() composes dp_link_vif->tcl_metadata using |=, but dp_link_vif is embedded in struct ath12k_dp_vif and its slots are reused across vif/peer teardown and setup. Since tcl_metadata is never cleared on detach, vdev_id bits from a previous attach remain set when the same link slot is reused with a different vdev_id. In this specific issue, the same link slot is used for vdev_id 0, then vdev_id 1, then vdev_id 0 again. The OR yields tcl_metadata == 0x9, which encodes vdev_id 1 in the HTT_TCL_META_DATA_VDEV_ID field even though ti.vdev_id is 0. Firmware then routes the EAPOL frame to the wrong vdev and the AP never receives message 2.
Use plain assignment instead of |= so the field is fully recomputed from the current arvif on every attach.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.1.c7-00108-QCAHMTSWPL_V1.0_V2.0_SILICONZ_UPSTREAM-3
Fixes: af66c76 ("wifi: ath12k: Refactor ath12k_vif structure")
Link: https://lore.kernel.org/all/20260609-ath12k-fix-eapol-tcl-metadata-v1-1-d47e6f90d4ee@oss.qualcomm.com/
CRs-Fixed: 4589769
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Signed-off-by: Yingying Tang <yingying.tang@oss.qualcomm.com>
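
The stale-bit accumulation can be reproduced with a small user-space sketch. The bit layout below (type value 1 in the low bits, vdev ID starting at bit 3) is an assumption chosen to be consistent with the 0x9 value quoted in the description; the macro and function names are illustrative and are not the driver's.

```c
#include <stdint.h>

/* Assumed layout for this sketch: type in the low bits, vdev ID in bits [10:3]. */
#define META_TYPE_VDEV      UINT32_C(1)
#define META_VDEV_ID_SHIFT  3
#define META_VDEV_ID_MASK   (UINT32_C(0xFF) << META_VDEV_ID_SHIFT)

/* Encode the metadata word for one vdev. */
static uint32_t meta_encode(uint8_t vdev_id)
{
	return META_TYPE_VDEV |
	       (((uint32_t)vdev_id << META_VDEV_ID_SHIFT) & META_VDEV_ID_MASK);
}

/* Decode the vdev ID the way a consumer of the word would see it. */
static uint8_t meta_vdev_id(uint32_t meta)
{
	return (uint8_t)((meta & META_VDEV_ID_MASK) >> META_VDEV_ID_SHIFT);
}

/* Old behaviour: OR into whatever the reused slot still holds. */
static void attach_or(uint32_t *slot, uint8_t vdev_id)
{
	*slot |= meta_encode(vdev_id);
}

/* Fixed behaviour: recompute the whole word on every attach. */
static void attach_assign(uint32_t *slot, uint8_t vdev_id)
{
	*slot = meta_encode(vdev_id);
}
```

Attaching vdev_id 0, then 1, then 0 through attach_or() leaves the slot at 0x9, which decodes as vdev_id 1. The same sequence through attach_assign() leaves 0x1, which decodes as vdev_id 0.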