Skip to content

[RLC-9] Rebase Custom Changes to rlc-9/5.14.0-611.41.1.el9_7#1012

Open
ciq-kernel-automation[bot] wants to merge 41 commits intorlc-9/5.14.0-611.41.1.el9_7from
{jmaple}_rlc-9/5.14.0-611.41.1.el9_7
Open

[RLC-9] Rebase Custom Changes to rlc-9/5.14.0-611.41.1.el9_7#1012
ciq-kernel-automation[bot] wants to merge 41 commits intorlc-9/5.14.0-611.41.1.el9_7from
{jmaple}_rlc-9/5.14.0-611.41.1.el9_7

Conversation

@ciq-kernel-automation
Copy link

@ciq-kernel-automation ciq-kernel-automation bot commented Mar 25, 2026

Summary

This PR has been automatically created after successful completion of all CI stages.
This is an attempt to see how the KernelCI will work with a RLC rebase

Manually Inserted from local process

https://ciqinc.atlassian.net/browse/KERNEL-750

Update process (This kernel CentOS base for 5.14.0-611.41.1.el9_7)

  • Rolling Release Rebase Process
  • Create rlc-9/5.14.0-611.41.1.el9_7 branch from rocky9_7
  • Cherry-pick all code from previous branch rlc-9/5.14.0-611.36.1.el9_7 into new branch (skipping unneeded code)
    • Fix conflicts as they arise
  • Build and Test

Rebase Log

Already on 'rlc-9/5.14.0-611.36.1.el9_7'
Already on 'jmaple_rlc-9/5.14.0-611.41.1.el9_7'
[rolling release update] Rolling Product:  rlc-9
[rolling release update] Checking out branch:  rlc-9/5.14.0-611.36.1.el9_7
[rolling release update] Gathering all the RESF kernel Tags
[rolling release update] Found 13 RESF kernel tags
[rolling release update] Checking out branch:  rocky9_7
[rolling release update] Gathering all the RESF kernel Tags
[rolling release update] Found 14 RESF kernel tags
[rolling release update] Common tag sha:  b'52342fd2bff0'
"52342fd2bff05234cd1abbd3509551b77c8fe8cc Rebuild rocky9_7 with kernel-5.14.0-611.36.1.el9_7"
[rolling release update] Checking for FIPS protected changes between the common tag and HEAD
[rolling release update] Checking for FIPS protected changes
[rolling release update] Getting SHAS 52342fd2bff0..HEAD
[rolling release update] Number of commits to check:  24
[rolling release update] Checking modifications of shas
[rolling release update] Checked 2 of 24 commits
[rolling release update] Checked 4 of 24 commits
[rolling release update] Checked 6 of 24 commits
[rolling release update] Checked 8 of 24 commits
[rolling release update] Checked 10 of 24 commits
[rolling release update] Checked 12 of 24 commits
[rolling release update] Checked 14 of 24 commits
[rolling release update] Checked 16 of 24 commits
[rolling release update] Checked 18 of 24 commits
[rolling release update] Checked 20 of 24 commits
[rolling release update] Checked 22 of 24 commits
[rolling release update] Checked 24 of 24 commits
[rolling release update] 0 of 24 commits have FIPS protected changes
[rolling release update] Checking out old rolling branch:  rlc-9/5.14.0-611.36.1.el9_7
[rolling release update] Finding the CIQ Kernel and Associated Upstream commits between the last resf tag and HEAD
[rolling release update] Getting SHAS 52342fd2bff0..HEAD
[rolling release update] Last RESF tag sha:  b'52342fd2bff0'
[rolling release update] Total commits in old branch: 41
[rolling release update] Checking out new base branch:  rocky9_7
[rolling release update] Finding the kernel version for the new rolling release
[rolling release update] New Branch to create: rlc-9/5.14.0-611.41.1.el9_7
[rolling release update] Creating new branch: rlc-9/5.14.0-611.41.1.el9_7
[rolling release update] Creating new branch for PR:  jmaple_rlc-9/5.14.0-611.41.1.el9_7
[rolling release update] Creating Map of all new commits from last rolling release fork
[rolling release update] Total commits in new branch: 23
[rolling release update] Checking if any of the commits from the old rolling release are already present in the new base branch
[rolling release update] Found 0 duplicate commits to remove
[rolling release update] Applying 41 remaining commits to the new branch
  [1/41] fbabade1cfee selftests/mm temporary fix of hmm infinite loop
  [2/41] ed0f73c4d1ca SUSE: patch: crypto-ecdh-implement-FIPS-PCT.patch
  [3/41] 636c93eedf7d crypto: essiv - Zeroize keys on exit in essiv_aead_setkey()
  [4/41] fa6c6c89d1b1 crypto: jitter - replace LFSR with SHA3-256
  [5/41] 21fd6d824d3c crypto: aead,cipher - zeroize key buffer after use
  [6/41] e979812e8847 crypto: ecdh - explicitly zeroize private_key
  [7/41] c454e5047a21 crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init
  [8/41] 29ead2ab81de crypto: Kconfig - Make CRYPTO_FIPS depend on the DRBG being built-in
  [9/41] cffe0dc8fef4 random: Restrict extrng registration to init time
  [10/41] 873f7715c906 crypto: rng - Convert crypto_default_rng_refcnt into an unsigned int
  [11/41] 18d07f8180b4 crypto: drbg - Align buffers to at least a cache line
  [12/41] 6222f0a06928 crypto: rng - Fix priority inversions due to mutex locks
  [13/41] 4c31e45146e2 mm/gup: reintroduce pin_user_pages_fast_only()
  [14/41] aa6607107278 crypto: rng - Implement fast per-CPU DRBG instances
  [15/41] c6a8041ffdad configs: Ensure FIPS settings defined
  [16/41] 985394dd7246 github actions: Use reusable validate kernel commits workflow
  [17/41] 80e42b158326 github actions: Add kernelCI for rlc-9
  [18/41] 07b7e647fc15 tools: hv: Enable debug logs for hv_kvp_daemon
  [19/41] a49d9a8d8b29 crypto: rng - Only allow the DRBG to register as "stdrng" in FIPS mode
  [20/41] 4f7116562c7e PCI/MSI: Export pci_msix_prepare_desc() for dynamic MSI-X allocations
  [21/41] 0a1522e87798 PCI: hv: Allow dynamic MSI-X vector allocation
  [22/41] 6c35c5c09f71 net: mana: explain irq_setup() algorithm
  [23/41] f8cb97437920 net: mana: Allow irq_setup() to skip cpus for affinity
  [24/41] 90fe3f8d567e net: mana: Allocate MSI-X vectors dynamically
  [25/41] bcf468a6044a net: mana: Add support for net_shaper_ops
  [26/41] ee212a5244d4 net: mana: Add speed support in mana_get_link_ksettings
  [27/41] ac5935e45cf0 net: mana: Handle unsupported HWC commands
  [28/41] 4bcaec501516 net: mana: Fix build errors when CONFIG_NET_SHAPER is disabled
  [29/41] 6f1f1181c843 RDMA/mana_ib: add additional port counters
  [30/41] 901c8c869a29 RDMA/mana_ib: Drain send wrs of GSI QP
  [31/41] 264c6c8e795f net: hv_netvsc: fix loss of early receive events from host during channel open.
  [32/41] 2ccba0df6bfb net: mana: Reduce waiting time if HWC not responding
  [33/41] d12122d2422b RDMA/mana_ib: Extend modify QP
  [34/41] ecf2b5299929 scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU
  [35/41] 53d7719d751d net: mana: Use page pool fragments for RX buffers instead of full pages to improve memory efficiency.
  [36/41] ffae3b8b6ba3 idpf: add support for Tx refillqs in flow scheduling mode
  [37/41] e3c8b8f9ad8a idpf: improve when to set RE bit logic
  [38/41] 527ad8260f1e idpf: simplify and fix splitq Tx packet rollback error path
  [39/41] 8946be0013f4 idpf: replace flow scheduling buffer ring with buffer pool
  [40/41] 6009f32b882f idpf: stop Tx if there are insufficient buffer resources
  [41/41] ceab0482f216 idpf: remove obsolete stashing code
[rolling release update] Successfully applied all 41 commits

Commit Message(s)

selftests/mm temporary fix of hmm infinite loop

jira SECO-170
SUSE: patch: crypto-ecdh-implement-FIPS-PCT.patch

Signed-off-by: Jeremy Allison <jallison@ciq.com>
crypto: essiv - Zeroize keys on exit in essiv_aead_setkey()

In essiv_aead_setkey(), use the same logic as crypto_authenc_esn_setkey()
to zeroize keys on exit.
crypto: jitter - replace LFSR with SHA3-256

        Using the kernel crypto API, the SHA3-256 algorithm is used as
        conditioning element to replace the LFSR in the Jitter RNG. All other
        parts of the Jitter RNG are unchanged.
crypto: aead,cipher - zeroize key buffer after use

    I.G 9.7.B for FIPS 140-3 specifies that variables temporarily holding
    cryptographic information should be zeroized once they are no longer
    needed. Accomplish this by using kfree_sensitive for buffers that
    previously held the private key.
crypto: ecdh - explicitly zeroize private_key

private_key is overwritten with the key parameter passed in by the
caller (if present), or alternatively a newly generated private key.
However, it is possible that the caller provides a key (or the newly
generated key) which is shorter than the previous key. In that
scenario, some key material from the previous key would not be
overwritten. The easiest solution is to explicitly zeroize the entire
private_key array first.
crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init

[ Upstream commit ba3c5574203034781ac4231acf117da917efcd2a ]
crypto: Kconfig - Make CRYPTO_FIPS depend on the DRBG being built-in

When FIPS mode is enabled (via fips=1), there is an absolute need for the
DRBG to be available. This is at odds with the fact that the DRBG can be
built as a module when in FIPS mode, leaving critical RNG functionality at
the whims of userspace.
random: Restrict extrng registration to init time

It is technically a risk to permit extrng registration by modules after
kernel init completes. Since there is only one user of the extrng interface
and it is imperative that it is the _only_ registered extrng for FIPS
compliance, restrict the extrng registration interface to only permit
registration during kernel init and only from built-in drivers.
crypto: rng - Convert crypto_default_rng_refcnt into an unsigned int

There is no reason this refcount should be a signed int. Convert it to an
unsigned int, thereby also making it less likely to ever overflow.
crypto: drbg - Align buffers to at least a cache line

None of the ciphers used by the DRBG have an alignment requirement; thus,
they all return 0 from .crypto_init, resulting in inconsistent alignment
across all buffers.
crypto: rng - Fix priority inversions due to mutex locks

Since crypto_devrandom_read_iter() is invoked directly by user tasks and is
accessible by every task in the system, there are glaring priority
inversions on crypto_reseed_rng_lock and crypto_default_rng_lock.
mm/gup: reintroduce pin_user_pages_fast_only()

Like pin_user_pages_fast(), but with the internal-only FOLL_FAST_ONLY flag.
crypto: rng - Implement fast per-CPU DRBG instances

When the kernel is booted with fips=1, the RNG exposed to userspace is
hijacked away from the CRNG and redirects to crypto_devrandom_read_iter(),
which utilizes the DRBG.
configs: Ensure FIPS settings defined

We want to hard set the x86_64 FIPS required configs rather than rely on
default settings in the kernel, should these ever change without our
knowing it would not be something we would have actively checked.
github actions: Use reusable validate kernel commits workflow

Simplifies the workflow to use the reusable workflow defined in main
branch. This reduces duplication and makes the workflow easier to
maintain across multiple branches.
github actions: Add kernelCI for rlc-9

Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
tools: hv: Enable debug logs for hv_kvp_daemon

jira LE-3207
feature tools_hv
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit a9c0b33ef2306327dd2db02c6274107065ff9307
crypto: rng - Only allow the DRBG to register as "stdrng" in FIPS mode

In FIPS mode, the DRBG must take precedence over all stdrng algorithms.
The only problem standing in the way of this is that a different stdrng
algorithm could get registered and utilized before the DRBG is registered,
and since crypto_alloc_rng() only allocates an stdrng algorithm when
there's no existing allocation, this means that it's possible for the wrong
stdrng algorithm to remain in use indefinitely.
PCI/MSI: Export pci_msix_prepare_desc() for dynamic MSI-X allocations

jira LE-4466
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit 5da8a8b8090b5f79a816ba016af3a70a9d7287bf
PCI: hv: Allow dynamic MSI-X vector allocation

jira LE-4466
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit ad518f2557b971976fc9d99a6a8cd2b453742bf9
net: mana: explain irq_setup() algorithm

jira LE-4466
commit-author Yury Norov <yury.norov@gmail.com>
commit 4607617af1b4747df0284ea8c1ddcecb21cae528
net: mana: Allow irq_setup() to skip cpus for affinity

jira LE-4466
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit 845c62c543d6bd5d8b80f53835997789e4bb8e29
net: mana: Allocate MSI-X vectors dynamically

jira LE-4466
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit 755391121038c06cb653241aa94dcabd87179f62
upstream-diff There were conflicts seen when applying this patch
due to following commits present in our tree before this patch.
590bcf15ae4a ("net: mana: Add handler for hardware servicing events")
00c2b0fef4b5 ("net: mana: Fix warnings for missing export.h header inclusion")
net: mana: Add support for net_shaper_ops

jira LE-4472
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit 75cabb46935b6de8e2bdfde563e460ac41cfff12
upstream-diff There was a conflict seen when applying this
patch due to the following commit not present in our tree.
92272ec4107e ("eth: add missing xdp.h includes in drivers")
net: mana: Add speed support in mana_get_link_ksettings

jira LE-4472
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit a6d5edf11e0cf5a4650f1d353d20ec29de093813
net: mana: Handle unsupported HWC commands

jira LE-4472
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit ca8ac489ca33c986ff02ee14c3e1c10b86355428
upstream-diff There were conflicts seen when applying this
patch due to the following patch being in our tree before
this one.
7a3c23599984 ("net: mana: Handle Reset Request from MANA NIC")
net: mana: Fix build errors when CONFIG_NET_SHAPER is disabled

jira LE-4472
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit 11cd0206987205ee05b0abd70a8eafa400ba89e3
RDMA/mana_ib: add additional port counters

jira LE-4526
commit-author Zhiyue Qiu <zhiyueqiu@microsoft.com>
commit 084f35b84f57e059b542ea44240a51b294a096a1
RDMA/mana_ib: Drain send wrs of GSI QP

jira LE-4523
commit-author Konstantin Taranov <kotaranov@microsoft.com>
commit 44d69d3cf2e8047c279cbb9708f05e2c43e33234
net: hv_netvsc: fix loss of early receive events from host during channel open.

jira LE-4493
commit-author Dipayaan Roy <dipayanroy@linux.microsoft.com>
commit 9448ccd853368582efa9db05db344f8bb9dffe0f
net: mana: Reduce waiting time if HWC not responding

jira LE-4496
commit-author Haiyang Zhang <haiyangz@microsoft.com>
commit c4deabbc1abe452ea230b86d53ed3711e5a8a062
RDMA/mana_ib: Extend modify QP

jira LE-4520
commit-author Shiraz Saleem <shirazsaleem@microsoft.com>
commit 2bd7dd383609f11330814ecc0d3c10b67073a6be
scsi: storvsc: Prefer returning channel with the same CPU as on the I/O issuing CPU

jira LE-4536
commit-author Long Li <longli@microsoft.com>
commit b69ffeaa0ae43892683113b3f4ddf156398738b9
net: mana: Use page pool fragments for RX buffers instead of full pages to improve memory efficiency.

jira LE-4489
commit-author Dipayaan Roy <dipayanroy@linux.microsoft.com>
commit 730ff06d3f5cc2ce0348414b78c10528b767d4a3
upstream-diff This patch was causing build failures due to missing
commit 0f9214046893 ("memory-provider: dmabuf devmem memory provider")
To fix it, we have removed pprm.queue_idx parameter which seems
is not being used even after being set because of the missing commit.
idpf: add support for Tx refillqs in flow scheduling mode

jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit cb83b559bea39f207ee214ee2972657e8576ed18
upstream-diff |
	adjusted the number of bytes expected in
	libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct introduced in commit
	1a49cf814fe1e ("idpf: add Tx timestamp flows").
idpf: improve when to set RE bit logic

jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit f2d18e16479cac7a708d77cbfb4220a9114a71fc
idpf: simplify and fix splitq Tx packet rollback error path

jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit b61dfa9bc4430ad82b96d3a7c1c485350f91b467
upstream-diff |
	adjusted context in 2 places:
	- when removing func idpf_tx_dma_map_error due to different memset
	call that uses the hardcoded struct type;
	- in func idpf_tx_splitq_frame due to missing expected
	union idpf_flex_tx_ctx_desc *ctx_desc;
	both differences were introduced in commit
	1a49cf814fe1e ("idpf: add Tx timestamp flows").
idpf: replace flow scheduling buffer ring with buffer pool

jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 5f417d551324d2894168b362f2429d120ab06243
upstream-diff |
	adjusted context in:
	- ifpf_tx_splitq_frame and idpf_tx_clean_bufs;
	- libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct;
	all cases are due to missing commit
	1a49cf814fe1e ("idpf: add Tx timestamp flows").
idpf: stop Tx if there are insufficient buffer resources

jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 0c3f135e840d4a2ba4253e15d530ec61bc30718e
upstream-diff |
	adjusted conflict in idpf_tx_splitq_frame func due to missing
	1a49cf814fe1e ("idpf: add Tx timestamp flows").
idpf: remove obsolete stashing code

jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 6c4e68480238274f84aa50d54da0d9e262df6284
upstream-diff |
	- adjusted context due to missing idpf_tx_read_tstamp func;
	- adjusted the number of bytes expected in
	libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct;
	both are due to missing commit
	1a49cf814fe1e ("idpf: add Tx timestamp flows").

Test Results

✅ Build Stage

Architecture Build Time Total Time
x86_64 33m 35s 37m 59s
aarch64 19m 55s 22m 58s

✅ Boot Verification

✅ Kernel Selftests

Architecture Passed Failed
x86_64 205 48
aarch64 153 47

Test Comparison

x86_64:

  • ⚠️ Status: Skipped - No baseline available

aarch64:

  • ⚠️ Status: Skipped - No baseline available

🤖 This PR was automatically generated by GitHub Actions
Run ID: 23512604666

PlaidCat and others added 30 commits March 24, 2026 15:44
jira SECO-170

In Rocky9 if you run ./run_vmtests.sh -t hmm it will fail and cause an
infinite loop on ASSERTs in FIXTURE_TEARDOWN()
This temporary fix is based on the discussion here
https://patchwork.kernel.org/project/linux-kselftest/patch/26017fe3-5ad7-6946-57db-e5ec48063ceb@suse.cz/#25046055

We will investigate further kselftest updates that will resolve the root
causes of this.

Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Jeremy Allison <jallison@ciq.com>
In essiv_aead_setkey(), use the same logic as crypto_authenc_esn_setkey()
to zeroize keys on exit.

[Sultan: touched up commit message]

Signed-off-by: Jason Rodriguez <jrodriguez@ciq.com>
        Using the kernel crypto API, the SHA3-256 algorithm is used as
        conditioning element to replace the LFSR in the Jitter RNG. All other
        parts of the Jitter RNG are unchanged.

        The application and use of the SHA-3 conditioning operation is identical
        to the user space Jitter RNG 3.4.0 by applying the following concept:

        - the Jitter RNG initializes a SHA-3 state which acts as the "entropy
          pool" when the Jitter RNG is allocated.

        - When a new time delta is obtained, it is inserted into the "entropy
          pool" with a SHA-3 update operation. Note, this operation in most of
          the cases is a simple memcpy() onto the SHA-3 stack.

        - To cause a true SHA-3 operation for each time delta operation, a
          second SHA-3 operation is performed hashing Jitter RNG status
          information. The final message digest is also inserted into the
          "entropy pool" with a SHA-3 update operation. Yet, this data is not
          considered to provide any entropy, but it shall stir the entropy pool.

        - To generate a random number, a SHA-3 final operation is performed to
          calculate a message digest followed by an immediate SHA-3 init to
          re-initialize the "entropy pool". The obtained message digest is one
          block of the Jitter RNG that is returned to the caller.

        Mathematically speaking, the random number generated by the Jitter RNG
        is:

        aux_t = SHA-3(Jitter RNG state data)

        Jitter RNG block = SHA-3(time_i || aux_i || time_(i-1) || aux_(i-1) ||
                                 ... || time_(i-255) || aux_(i-255))

        when assuming that the OSR = 1, i.e. the default value.

        This operation implies that the Jitter RNG has an output-blocksize of
        256 bits instead of the 64 bits of the LFSR-based Jitter RNG that is
        replaced with this patch.

        The patch also replaces the varying number of invocations of the
        conditioning function with one fixed number of invocations. The use
        of the conditioning function consistent with the userspace Jitter RNG
        library version 3.4.0.

        The code is tested with a system that exhibited the least amount of
        entropy generated by the Jitter RNG: the SiFive Unmatched RISC-V
        system. The measured entropy rate is well above the heuristically
        implied entropy value of 1 bit of entropy per time delta. On all other
        tested systems, the measured entropy rate is even higher by orders
        of magnitude. The measurement was performed using updated tooling
        provided with the user space Jitter RNG library test framework.

        The performance of the Jitter RNG with this patch is about en par
        with the performance of the Jitter RNG without the patch.

        Signed-off-by: Stephan Mueller <smueller@chronox.de>
        Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

            Back-port of commit bb897c5
            Author: Stephan Müller <smueller@chronox.de>
            Date:   Fri Apr 21 08:08:04 2023 +0200

Signed-off-by: Jeremy Allison <jallison@ciq.com>
    I.G 9.7.B for FIPS 140-3 specifies that variables temporarily holding
    cryptographic information should be zeroized once they are no longer
    needed. Accomplish this by using kfree_sensitive for buffers that
    previously held the private key.

    Signed-off-by: Hailey Mothershead <hailmo@amazon.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

        Back-ported from commit 23e4099
        Author: Hailey Mothershead <hailmo@amazon.com>
        Date:   Mon Apr 15 22:19:15 2024 +0000

Signed-off-by: Jeremy Allison <jallison@ciq.com>
private_key is overwritten with the key parameter passed in by the
caller (if present), or alternatively a newly generated private key.
However, it is possible that the caller provides a key (or the newly
generated key) which is shorter than the previous key. In that
scenario, some key material from the previous key would not be
overwritten. The easiest solution is to explicitly zeroize the entire
private_key array first.

Note that this patch slightly changes the behavior of this function:
previously, if the ecc_gen_privkey failed, the old private_key would
remain. Now, the private_key is always zeroized. This behavior is
consistent with the case where params.key is set and ecc_is_key_valid
fails.

Signed-off-by: Joachim Vandersmissen <git@jvdsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
[ Upstream commit ba3c557 ]

When the mpi_ec_ctx structure is initialized, some fields are not
cleared, causing a crash when referencing the field when the
structure was released. Initially, this issue was ignored because
memory for mpi_ec_ctx is allocated with the __GFP_ZERO flag.
For example, this error will be triggered when calculating the
Za value for SM2 separately.

Fixes: d58bb7e ("lib/mpi: Introduce ec implementation to MPI library")
Cc: stable@vger.kernel.org # v6.5
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Jonathan Maple <jmaple@ciq.com>
When FIPS mode is enabled (via fips=1), there is an absolute need for the
DRBG to be available. This is at odds with the fact that the DRBG can be
built as a module when in FIPS mode, leaving critical RNG functionality at
the whims of userspace.

Userspace could simply rmmod the DRBG module, or not provide it at all and
thus a different stdrng algorithm could be used without anyone noticing.

Additionally, when running a FIPS-enabled userspace, modprobe itself may
perform a getrandom() syscall _before_ loading a given module. As a result,
there's a possible deadlock scenario where the RNG core (crypto/rng.c)
initializes _before_ the DRBG, thereby installing its getrandom() override
without having an stdrng algorithm available. Then, when userspace calls
getrandom() which redirects to the override in crypto/rng.c,
crypto_alloc_rng("stdrng") invokes the UMH (modprobe) to load the DRBG
(which is aliased to stdrng). And *then* that modprobe invocation gets
stuck at getrandom() because there's no stdrng algorithm available!

There are too many risks that come with allowing the DRBG and RNG core to
be modular for FIPS mode. Therefore, make CRYPTO_FIPS require the DRBG to
be built-in, which in turn makes the DRBG require the RNG core to be
built-in. That way, it's guaranteed for these drivers to be built-in when
running in FIPS mode.

Also clean up the CRYPTO_FIPS option name and remove the CRYPTO_ANSI_CPRNG
dependency since it's obsolete for FIPS now.

Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
It is technically a risk to permit extrng registration by modules after
kernel init completes. Since there is only one user of the extrng interface
and it is imperative that it is the _only_ registered extrng for FIPS
compliance, restrict the extrng registration interface to only permit
registration during kernel init and only from built-in drivers.

This also eliminates the risks associated with the extrng interface itself
being designed to solely accommodate a single registration, which would
therefore permit the registered extrng to be overridden or even removed by
an unrelated module.

Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
There is no reason this refcount should be a signed int. Convert it to an
unsigned int, thereby also making it less likely to ever overflow.

Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
None of the ciphers used by the DRBG have an alignment requirement; thus,
they all return 0 from .crypto_init, resulting in inconsistent alignment
across all buffers.

Align all buffers to at least a cache line to improve performance. This is
especially useful when multiple DRBG instances are used, since it prevents
false sharing of cache lines between the different instances.

Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
Since crypto_devrandom_read_iter() is invoked directly by user tasks and is
accessible by every task in the system, there are glaring priority
inversions on crypto_reseed_rng_lock and crypto_default_rng_lock.

Tasks of arbitrary scheduling priority access crypto_devrandom_read_iter().
When a low-priority task owns one of the mutex locks, higher-priority tasks
waiting on that mutex lock are stalled until the low-priority task is done.

Fix the priority inversions by converting the mutex locks into rt_mutex
locks which have PI support.

Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
Like pin_user_pages_fast(), but with the internal-only FOLL_FAST_ONLY flag.

This complements the get_user_pages*() API, which already has
get_user_pages_fast_only().

Note that pin_user_pages_fast_only() used to exist but was removed in
upstream commit edad1bb ("mm/gup: remove pin_user_pages_fast_only()")
due to it not having any users.

Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
When the kernel is booted with fips=1, the RNG exposed to userspace is
hijacked away from the CRNG and redirects to crypto_devrandom_read_iter(),
which utilizes the DRBG.

Notably, crypto_devrandom_read_iter() maintains just two global DRBG
instances _for the entire system_, and the two instances serve separate
request types: one instance for GRND_RANDOM requests (crypto_reseed_rng),
and one instance for non-GRND_RANDOM requests (crypto_default_rng). So in
essence, for requests of a single type, there is just one global RNG for
all CPUs in the entire system, which scales _very_ poorly.

To make matters worse, the temporary buffer used to ferry data between the
DRBG and userspace is woefully small at only 256 bytes, which doesn't do a
good job of maximizing throughput from the DRBG. This results in lost
performance when userspace requests >256 bytes; it is observed that DRBG
throughput improves by 70% on an i9-13900H when the buffer size is
increased to 4096 bytes (one page). Going beyond the size of one page up to
the DRBG maximum request limit of 65536 bytes produces diminishing returns
of only 3% improved throughput in comparison. And going below the size of
one page produces progressively less throughput at each power of 2: there's
a 5% loss going from 4096 bytes to 2048 bytes and a 9% loss going from 2048
bytes to 1024 bytes.

Thus, this implements per-CPU DRBG instances utilizing a page-sized buffer
for each CPU to utilize the DRBG itself more effectively. On top of that,
for non-GRND_RANDOM requests, the DRBG's operations now occur under a local
lock that disables preemption on non-PREEMPT_RT kernels, which not only
keeps each CPU's DRBG instance isolated from another, but also improves
temporal cache locality while the DRBG actively generates a new string of
random bytes.

Prefaulting one user destination page at a time is also employed to prevent
a DRBG instance from getting blocked on page faults, thereby maximizing the
use of the DRBG so that the only bottleneck is the DRBG itself.

Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
We want to hard set the x86_64 FIPS required configs rather than rely on
default settings in the kernel, should these ever change without our
knowing it would not be something we would have actively checked.

The configs are a limited set of configs that is expanded out when
building using `make olddefconfig` a common practice in kernel building.

Note had to manually add the following since its normaly set by the RPM
build process.
CONFIG_CRYPTO_FIPS_NAME="Rocky Linux 9 Kernel Cryptographic API"

Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Simplifies the workflow to use the reusable workflow defined in main
branch. This reduces duplication and makes the workflow easier to
maintain across multiple branches.

The workflow was renamed because it now includes validation over
and above just checking for upstream fixes

Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
jira LE-3207
feature tools_hv
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit a9c0b33

Allow the KVP daemon to log the KVP updates triggered in the VM
with a new debug flag(-d).
When the daemon is started with this flag, it logs updates and debug
information in syslog with loglevel LOG_DEBUG. This information comes
in handy for debugging issues where the key-value pairs for certain
pools show mismatch/incorrect values.
The distro-vendors can further consume these changes and modify the
respective service files to redirect the logs to specific files as
needed.

	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Naman Jain <namjain@linux.microsoft.com>
	Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com
	Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com>
(cherry picked from commit a9c0b33)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
In FIPS mode, the DRBG must take precedence over all stdrng algorithms.
The only problem standing in the way of this is that a different stdrng
algorithm could get registered and utilized before the DRBG is registered,
and since crypto_alloc_rng() only allocates an stdrng algorithm when
there's no existing allocation, this means that it's possible for the wrong
stdrng algorithm to remain in use indefinitely.

This issue is also often impossible to observe from userspace; an RNG other
than the DRBG could be used somewhere in the kernel and userspace would be
none the wiser.

To ensure this can never happen, only allow stdrng instances from the DRBG
to be registered when running in FIPS mode. This works since the previous
commit forces the DRBG to be built into the kernel when CONFIG_CRYPTO_FIPS
is enabled, so the DRBG's presence is guaranteed when fips_enabled is true.

Signed-off-by: Sultan Alsawaf <sultan@ciq.com>
jira LE-4466
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit 5da8a8b

For supporting dynamic MSI-X vector allocation by PCI controllers, enabling
the flag MSI_FLAG_PCI_MSIX_ALLOC_DYN is not enough, msix_prepare_msi_desc()
to prepare the MSI descriptor is also needed.

Export pci_msix_prepare_desc() to allow PCI controllers to support dynamic
MSI-X vector allocation.

	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
	Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
	Acked-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit 5da8a8b)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4466
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit ad518f2

Allow dynamic MSI-X vector allocation for pci_hyperv PCI controller
by adding support for the flag MSI_FLAG_PCI_MSIX_ALLOC_DYN and using
pci_msix_prepare_desc() to prepare the MSI-X descriptors.

Feature support added for both x86 and ARM64

	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
	Acked-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit ad518f2)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4466
commit-author Yury Norov <yury.norov@gmail.com>
commit 4607617

Commit 91bfe21 ("net: mana: add a function to spread IRQs per CPUs")
added the irq_setup() function that distributes IRQs on CPUs according
to a tricky heuristic. The corresponding commit message explains the
heuristic.

Duplicate it in the source code to make available for readers without
digging git in history. Also, add more detailed explanation about how
the heuristics is implemented.

	Signed-off-by: Yury Norov <yury.norov@gmail.com>
	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
(cherry picked from commit 4607617)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4466
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit 845c62c

In order to prepare the MANA driver to allocate the MSI-X IRQs
dynamically, we need to enhance irq_setup() to allow skipping
affinitizing IRQs to the first CPU sibling group.

This would be for cases when the number of IRQs is less than or equal
to the number of online CPUs. In such cases for dynamically added IRQs
the first CPU sibling group would already be affinitized with HWC IRQ.

	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Yury Norov [NVIDIA] <yury.norov@gmail.com>
(cherry picked from commit 845c62c)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4466
commit-author Shradha Gupta <shradhagupta@linux.microsoft.com>
commit 7553911
upstream-diff There were conflicts seen when applying this patch
due to following commits present in our tree before this patch.
590bcf1 ("net: mana: Add handler for hardware servicing events")
00c2b0f ("net: mana: Fix warnings for missing export.h header inclusion")

Currently, the MANA driver allocates MSI-X vectors statically based on
MANA_MAX_NUM_QUEUES and num_online_cpus() values and in some cases ends
up allocating more vectors than it needs. This is because, by this time
we do not have a HW channel and do not know how many IRQs should be
allocated.

To avoid this, we allocate 1 MSI-X vector during the creation of HWC and
after getting the value supported by hardware, dynamically add the
remaining MSI-X vectors.

	Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
(cherry picked from commit 7553911)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4472
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit 75cabb4
upstream-diff There was a conflict seen when applying this
patch due to the following commit not present in our tree.
92272ec ("eth: add missing xdp.h includes in drivers")

Introduce support for net_shaper_ops in the MANA driver,
enabling configuration of rate limiting on the MANA NIC.

To apply rate limiting, the driver issues a HWC command via
mana_set_bw_clamp() and updates the corresponding shaper object
in the net_shaper cache. If an error occurs during this process,
the driver restores the previous speed by querying the current link
configuration using mana_query_link_cfg().

The minimum supported bandwidth is 100 Mbps, and only values that are
exact multiples of 100 Mbps are allowed. Any other values are rejected.

To remove a shaper, the driver resets the bandwidth to the maximum
supported by the SKU using mana_set_bw_clamp() and clears the
associated cache entry. If an error occurs during this process,
the shaper details are retained.

On the hardware that does not support these APIs, the net-shaper
calls to set speed would fail.

Set the speed:
./tools/net/ynl/pyynl/cli.py \
 --spec Documentation/netlink/specs/net_shaper.yaml \
 --do set --json '{"ifindex":'$IFINDEX',
		   "handle":{"scope": "netdev", "id":'$ID' },
		   "bw-max": 200000000 }'

Get the shaper details:
./tools/net/ynl/pyynl/cli.py \
 --spec Documentation/netlink/specs/net_shaper.yaml \
 --do get --json '{"ifindex":'$IFINDEX',
		      "handle":{"scope": "netdev", "id":'$ID' }}'

> {'bw-max': 200000000,
> 'handle': {'scope': 'netdev'},
> 'ifindex': $IFINDEX,
> 'metric': 'bps'}

Delete the shaper object:
./tools/net/ynl/pyynl/cli.py \
 --spec Documentation/netlink/specs/net_shaper.yaml \
 --do delete --json '{"ifindex":'$IFINDEX',
		      "handle":{"scope": "netdev","id":'$ID' }}'

	Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Saurabh Singh Sengar <ssengar@linux.microsoft.com>
	Reviewed-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/1750144656-2021-3-git-send-email-ernis@linux.microsoft.com
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>

(cherry picked from commit 75cabb4)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4472
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit a6d5edf

Allow mana ethtool get_link_ksettings operation to report
the maximum speed supported by the SKU in mbps.

The driver retrieves this information by issuing a
HWC command to the hardware via mana_query_link_cfg(),
which retrieves the SKU's maximum supported speed.

These APIs when invoked on hardware that are older/do
not support these APIs, the speed would be reported as UNKNOWN.

Before:
$ethtool enP30832s1
> Settings for enP30832s1:
        Supported ports: [  ]
        Supported link modes:   Not reported
        Supported pause frame use: No
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: Unknown!
        Duplex: Full
        Auto-negotiation: off
        Port: Other
        PHYAD: 0
        Transceiver: internal
        Link detected: yes

After:
$ethtool enP30832s1
> Settings for enP30832s1:
        Supported ports: [  ]
        Supported link modes:   Not reported
        Supported pause frame use: No
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  Not reported
        Advertised pause frame use: No
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 16000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: Other
        PHYAD: 0
        Transceiver: internal
        Link detected: yes

	Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Saurabh Singh Sengar <ssengar@linux.microsoft.com>
	Reviewed-by: Long Li <longli@microsoft.com>
Link: https://patch.msgid.link/1750144656-2021-4-git-send-email-ernis@linux.microsoft.com
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>

(cherry picked from commit a6d5edf)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4472
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit ca8ac48
upstream-diff There were conflicts seen when applying this
patch due to the following patch being in our tree before
this one.
7a3c235 ("net: mana: Handle Reset Request from MANA NIC")

If any of the HWC commands are not recognized by the
underlying hardware, the hardware returns the response
header status of -1. Log the information using
netdev_info_once to avoid multiple error logs in dmesg.

	Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
	Reviewed-by: Saurabh Singh Sengar <ssengar@linux.microsoft.com>
	Reviewed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Link: https://patch.msgid.link/1750144656-2021-5-git-send-email-ernis@linux.microsoft.com
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>

(cherry picked from commit ca8ac48)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4472
commit-author Erni Sri Satya Vennela <ernis@linux.microsoft.com>
commit 11cd020

Fix build errors when CONFIG_NET_SHAPER is disabled, including:

drivers/net/ethernet/microsoft/mana/mana_en.c:804:10: error:
'const struct net_device_ops' has no member named 'net_shaper_ops'

     804 |         .net_shaper_ops         = &mana_shaper_ops,

drivers/net/ethernet/microsoft/mana/mana_en.c:804:35: error:
initialization of 'int (*)(struct net_device *, struct neigh_parms *)'
from incompatible pointer type 'const struct net_shaper_ops *'
[-Werror=incompatible-pointer-types]

     804 |         .net_shaper_ops         = &mana_shaper_ops,

	Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Fixes: 75cabb4 ("net: mana: Add support for net_shaper_ops")
	Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506230625.bfUlqb8o-lkp@intel.com/
	Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1750851355-8067-1-git-send-email-ernis@linux.microsoft.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 11cd020)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4526
commit-author Zhiyue Qiu <zhiyueqiu@microsoft.com>
commit 084f35b

Add packet and request port counters to mana_ib.

	Signed-off-by: Zhiyue Qiu <zhiyueqiu@microsoft.com>
	Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://patch.msgid.link/1752143395-5324-1-git-send-email-kotaranov@linux.microsoft.com
	Reviewed-by: Long Li <longli@microsoft.com>
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit 084f35b)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4523
commit-author Konstantin Taranov <kotaranov@microsoft.com>
commit 44d69d3

Drain send WRs of the GSI QP on device removal.

In rare servicing scenarios, the hardware may delete the
state of the GSI QP, preventing it from generating CQEs
for pending send WRs. Since WRs submitted to the GSI QP
hold CM resources, the device cannot be removed until
those WRs are completed. This patch marks all pending
send WRs as failed, allowing the GSI QP to release the CM
resources and enabling safe device removal.

	Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://patch.msgid.link/1753779618-23629-1-git-send-email-kotaranov@linux.microsoft.com
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit 44d69d3)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
shreeya-patel98 and others added 11 commits March 24, 2026 15:44
…nnel open.

jira LE-4493
commit-author Dipayaan Roy <dipayanroy@linux.microsoft.com>
commit 9448ccd

The hv_netvsc driver currently enables NAPI after opening the primary and
subchannels. This ordering creates a race: if the Hyper-V host places data
in the host -> guest ring buffer and signals the channel before
napi_enable() has been called, the channel callback will run but
napi_schedule_prep() will return false. As a result, the NAPI poller never
gets scheduled, the data in the ring buffer is not consumed, and the
receive queue may remain permanently stuck until another interrupt happens
to arrive.

Fix this by enabling NAPI and registering it with the RX/TX queues before
vmbus channel is opened. This guarantees that any early host signal after
open will correctly trigger NAPI scheduling and the ring buffer will be
drained.

Fixes: 76bb5db ("netvsc: fix use after free on module removal")
	Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Link: https://patch.msgid.link/20250825115627.GA32189@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit 9448ccd)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4496
commit-author Haiyang Zhang <haiyangz@microsoft.com>
commit c4deabb

If HW Channel (HWC) is not responding, reduce the waiting time, so further
steps will fail quickly.
This will prevent getting stuck for a long time (30 minutes or more), for
example, during unloading while HWC is not responding.

	Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/1757537841-5063-1-git-send-email-haiyangz@linux.microsoft.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit c4deabb)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira LE-4520
commit-author Shiraz Saleem <shirazsaleem@microsoft.com>
commit 2bd7dd3

Extend modify QP to support further attributes: local_ack_timeout, UD qkey,
rate_limit, qp_access_flags, flow_label, max_rd_atomic.

	Signed-off-by: Shiraz Saleem <shirazsaleem@microsoft.com>
	Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com>
Link: https://patch.msgid.link/1757923172-4475-1-git-send-email-kotaranov@linux.microsoft.com
	Signed-off-by: Leon Romanovsky <leon@kernel.org>
(cherry picked from commit 2bd7dd3)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
…/O issuing CPU

jira LE-4536
commit-author Long Li <longli@microsoft.com>
commit b69ffea

When selecting an outgoing channel for I/O, storvsc tries to select a
channel with a returning CPU that is not the same as issuing CPU. This
worked well in the past, however it doesn't work well when the Hyper-V
exposes a large number of channels (up to the number of all CPUs). Use a
different CPU for returning channel is not efficient on Hyper-V.

Change this behavior by preferring to the channel with the same CPU as
the current I/O issuing CPU whenever possible.

Tests have shown improvements in newer Hyper-V/Azure environment, and no
regression with older Hyper-V/Azure environments.

	Tested-by: Raheel Abdul Faizy <rabdulfaizy@microsoft.com>
	Signed-off-by: Long Li <longli@microsoft.com>
Message-Id: <1759381530-7414-1-git-send-email-longli@linux.microsoft.com>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit b69ffea)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
…es to improve memory efficiency.

jira LE-4489
commit-author Dipayaan Roy <dipayanroy@linux.microsoft.com>
commit 730ff06
upstream-diff This patch was causing build failures due to missing
commit 0f92140 ("memory-provider: dmabuf devmem memory provider")
To fix it, we have removed pprm.queue_idx parameter which seems
is not being used even after being set because of the missing commit.

This patch enhances RX buffer handling in the mana driver by allocating
pages from a page pool and slicing them into MTU-sized fragments, rather
than dedicating a full page per packet. This approach is especially
beneficial on systems with large base page sizes like 64KB.

Key improvements:

- Proper integration of page pool for RX buffer allocations.
- MTU-sized buffer slicing to improve memory utilization.
- Reduce overall per Rx queue memory footprint.
- Automatic fallback to full-page buffers when:
   * Jumbo frames are enabled (MTU > PAGE_SIZE / 2).
   * The XDP path is active, to avoid complexities with fragment reuse.

Testing on VMs with 64KB pages shows around 200% throughput improvement.
Memory efficiency is significantly improved due to reduced wastage in page
allocations. Example: We are now able to fit 35 rx buffers in a single 64kb
page for MTU size of 1500, instead of 1 rx buffer per page previously.

Tested:

- iperf3, iperf2, and nttcp benchmarks.
- Jumbo frames with MTU 9000.
- Native XDP programs (XDP_PASS, XDP_DROP, XDP_TX, XDP_REDIRECT) for
  testing the XDP path in driver.
- Memory leak detection (kmemleak).
- Driver load/unload, reboot, and stress scenarios.

	Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
	Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
	Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
	Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Link: https://patch.msgid.link/20250814140410.GA22089@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>

(cherry picked from commit 730ff06)
	Signed-off-by: Shreeya Patel <spatel@ciq.com>
jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit cb83b55
upstream-diff |
	adjusted the number of bytes expected in
	libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct introduced in commit
	1a49cf8 ("idpf: add Tx timestamp flows").

In certain production environments, it is possible for completion tags
to collide, meaning N packets with the same completion tag are in flight
at the same time. In this environment, any given Tx queue is effectively
used to send both slower traffic and higher throughput traffic
simultaneously. This is the result of a customer's specific
configuration in the device pipeline, the details of which Intel cannot
provide. This configuration results in a small number of out-of-order
completions, i.e., a small number of packets in flight. The existing
guardrails in the driver only protect against a large number of packets
in flight. The slower flow completions are delayed which causes the
out-of-order completions. The fast flow will continue sending traffic
and generating tags. Because tags are generated on the fly, the fast
flow eventually uses the same tag for a packet that is still in flight
from the slower flow. The driver has no idea which packet it should
clean when it processes the completion with that tag, but it will look
for the packet on the buffer ring before the hash table.  If the slower
flow packet completion is processed first, it will end up cleaning the
fast flow packet on the ring prematurely. This leaves the descriptor
ring in a bad state resulting in a crash or Tx timeout.

In summary, generating a tag when a packet is sent can lead to the same
tag being associated with multiple packets. This can lead to resource
leaks, crashes, and/or Tx timeouts.

Before we can replace the tag generation, we need a new mechanism for
the send path to know what tag to use next. The driver will allocate and
initialize a refillq for each TxQ with all of the possible free tag
values. During send, the driver grabs the next free tag from the refillq
from next_to_clean. While cleaning the packet, the clean routine posts
the tag back to the refillq's next_to_use to indicate that it is now
free to use.

This mechanism works exactly the same way as the existing Rx refill
queues, which post the cleaned buffer IDs back to the buffer queue to be
reposted to HW. Since we're using the refillqs for both Rx and Tx now,
genericize some of the existing refillq support.

Note: the refillqs will not be used yet. This is only demonstrating how
they will be used to pass free tags back to the send path.

	Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
	Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
	Tested-by: Samuel Salin <Samuel.salin@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit cb83b55)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit f2d18e1

Track the gap between next_to_use and the last RE index. Set RE again
if the gap is large enough to ensure RE bit is set frequently. This is
critical before removing the stashing mechanisms because the
opportunistic descriptor ring cleaning from the out-of-order completions
will go away. Previously the descriptors would be "cleaned" by both the
descriptor (RE) completion and the out-of-order completions. Without the
latter, we must ensure the RE bit is set more frequently. Otherwise,
it's theoretically possible for the descriptor ring next_to_clean to
never advance.  The previous implementation was dependent on the start
of a packet falling on a 64th index in the descriptor ring, which is not
guaranteed with large packets.

	Signed-off-by: Luigi Rizzo <lrizzo@google.com>
	Signed-off-by: Brian Vazquez <brianvv@google.com>
	Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
	Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
	Tested-by: Samuel Salin <Samuel.salin@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit f2d18e1)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit b61dfa9
upstream-diff |
	adjusted context in 2 places:
	- when removing func idpf_tx_dma_map_error due to different memset
	call that uses the hardcoded struct type;
	- in func idpf_tx_splitq_frame due to missing expected
	union idpf_flex_tx_ctx_desc *ctx_desc;
	both differences were introduced in commit
	1a49cf8 ("idpf: add Tx timestamp flows").

Move (and rename) the existing rollback logic to singleq.c since that
will be the only consumer. Create a simplified splitq specific rollback
function to loop through and unmap tx_bufs based on the completion tag.
This is critical before replacing the Tx buffer ring with the buffer
pool since the previous rollback indexing will not work to unmap the
chained buffers from the pool.

Cache the next_to_use index before any portion of the packet is put on
the descriptor ring. In case of an error, the rollback will bump tail to
the correct next_to_use value. Because the splitq path now supports
different types of context descriptors (and potentially multiple in the
future), this will take care of rolling back any and all context
descriptors encoded on the ring for the erroneous packet. The previous
rollback logic was broken for PTP packets since it would not account for
the PTP context descriptor.

Fixes: 1a49cf8 ("idpf: add Tx timestamp flows")
	Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
	Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
	Tested-by: Samuel Salin <Samuel.salin@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit b61dfa9)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 5f417d5
upstream-diff |
	adjusted context in:
	- ifpf_tx_splitq_frame and idpf_tx_clean_bufs;
	- libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct;
	all cases are due to missing commit
	1a49cf8 ("idpf: add Tx timestamp flows").

Replace the TxQ buffer ring with one large pool/array of buffers (only
for flow scheduling). This eliminates the tag generation and makes it
impossible for a tag to be associated with more than one packet.

The completion tag passed to HW through the descriptor is the index into
the array. That same completion tag is posted back to the driver in the
completion descriptor, and used to index into the array to quickly
retrieve the buffer during cleaning.  In this way, the tags are treated
as a fix sized resource. If all tags are in use, no more packets can be
sent on that particular queue (until some are freed up). The tag pool
size is 64K since the completion tag width is 16 bits.

For each packet, the driver pulls a free tag from the refillq to get the
next free buffer index. When cleaning is complete, the tag is posted
back to the refillq. A multi-frag packet spans multiple buffers in the
driver, therefore it uses multiple buffer indexes/tags from the pool.
Each frag pulls from the refillq to get the next free buffer index.
These are tracked in a next_buf field that replaces the completion tag
field in the buffer struct. This chains the buffers together so that the
packet can be cleaned from the starting completion tag taken from the
completion descriptor, then from the next_buf field for each subsequent
buffer.

In case of a dma_mapping_error occurs or the refillq runs out of free
buf_ids, the packet will execute the rollback error path. This unmaps
any buffers previously mapped for the packet. Since several free
buf_ids could have already been pulled from the refillq, we need to
restore its original state as well. Otherwise, the buf_ids/tags
will be leaked and not used again until the queue is reallocated.

Descriptor completions only advance the descriptor ring index to "clean"
the descriptors. The packet completions only clean the buffers
associated with the given packet completion tag and do not update the
descriptor ring index.

When operating in queue based scheduling mode, the array still acts as a
ring and will only have TxQ descriptor count entries. The tx_bufs are
still associated 1:1 with the descriptor ring entries and we can use the
conventional indexing mechanisms.

Fixes: c2d548c ("idpf: add TX splitq napi poll support")
	Signed-off-by: Luigi Rizzo <lrizzo@google.com>
	Signed-off-by: Brian Vazquez <brianvv@google.com>
	Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
	Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
	Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
	Tested-by: Samuel Salin <Samuel.salin@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 5f417d5)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 0c3f135
upstream-diff |
	adjusted conflict in idpf_tx_splitq_frame func due to missing
	1a49cf8 ("idpf: add Tx timestamp flows").

The Tx refillq logic will cause packets to be silently dropped if there
are not enough buffer resources available to send a packet in flow
scheduling mode. Instead, determine how many buffers are needed along
with number of descriptors. Make sure there are enough of both resources
to send the packet, and stop the queue if not.

Fixes: 7292af0 ("idpf: fix a race in txq wakeup")
	Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
	Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
	Tested-by: Samuel Salin <Samuel.salin@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 0c3f135)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
jira KERNEL-169
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 6c4e684
upstream-diff |
	- adjusted context due to missing idpf_tx_read_tstamp func;
	- adjusted the number of bytes expected in
	libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct;
	both are due to missing commit
	1a49cf8 ("idpf: add Tx timestamp flows").

With the new Tx buffer management scheme, there is no need for all of
the stashing mechanisms, the hash table, the reserve buffer stack, etc.
Remove all of that.

	Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
	Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
	Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
	Tested-by: Samuel Salin <Samuel.salin@intel.com>
	Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 6c4e684)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
@PlaidCat PlaidCat changed the title [rlc-9/5.14.0-611.41.1.el9_7] Multiple patches tested (41 commits) [RLC-9] Rebase Custom Changes to rlc-9/5.14.0-611.41.1.el9_7 Mar 25, 2026
@PlaidCat PlaidCat requested a review from a team March 25, 2026 12:53
@PlaidCat PlaidCat self-assigned this Mar 25, 2026
@github-actions
Copy link

🤖 Validation Checks In Progress Workflow run: https://github.com/ctrliq/kernel-src-tree/actions/runs/23542504544

@github-actions
Copy link

🔍 Interdiff Analysis

  • ⚠️ PR commit cda645fd4c2 (net: mana: Allocate MSI-X vectors dynamically) → upstream 755391121038
    Differences found:
================================================================================
*    DELTA DIFFERENCES - code changes that differ between the patches          *
================================================================================

--- b/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -597,7 +594,6 @@
 	 GDMA_DRV_CAP_FLAG_1_HWC_TIMEOUT_RECONFIG | \
 	 GDMA_DRV_CAP_FLAG_1_VARIABLE_INDIRECTION_TABLE_SUPPORT | \
 	 GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP | \
-	 GDMA_DRV_CAP_FLAG_1_DYNAMIC_IRQ_ALLOC_SUPPORT | \
 	 GDMA_DRV_CAP_FLAG_1_SELF_RESET_ON_EQE | \
 	 GDMA_DRV_CAP_FLAG_1_HANDLE_RECONFIG_EQE)
 

################################################################################
!    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
################################################################################

--- b/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -6,6 +6,8 @@
 #include <linux/pci.h>
 #include <linux/utsname.h>
 #include <linux/version.h>
+#include <linux/msi.h>
+#include <linux/irqdomain.h>
 
 #include <net/mana/mana.h>
 
--- b/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -578,6 +578,9 @@
 /* Driver can handle holes (zeros) in the device list */
 #define GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP BIT(11)
 
+/* Driver supports dynamic MSI-X vector allocation */
+#define GDMA_DRV_CAP_FLAG_1_DYNAMIC_IRQ_ALLOC_SUPPORT BIT(13)
+
 #define GDMA_DRV_CAP_FLAGS1 \
 	(GDMA_DRV_CAP_FLAG_1_EQ_SHARING_MULTI_VPORT | \
 	 GDMA_DRV_CAP_FLAG_1_NAPI_WKDONE_FIX | \
@@ -581,7 +584,8 @@
 	 GDMA_DRV_CAP_FLAG_1_NAPI_WKDONE_FIX | \
 	 GDMA_DRV_CAP_FLAG_1_HWC_TIMEOUT_RECONFIG | \
 	 GDMA_DRV_CAP_FLAG_1_VARIABLE_INDIRECTION_TABLE_SUPPORT | \
-	 GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP)
+	 GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP | \
+	 GDMA_DRV_CAP_FLAG_1_DYNAMIC_IRQ_ALLOC_SUPPORT)
 
 #define GDMA_DRV_CAP_FLAGS2 0
 

================================================================================
*    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
================================================================================

--- b/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -7,5 +7,4 @@
 #include <linux/utsname.h>
 #include <linux/version.h>
-#include <linux/export.h>
 
 #include <net/mana/mana.h>
--- b/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -584,5 +582,5 @@
 	 GDMA_DRV_CAP_FLAG_1_VARIABLE_INDIRECTION_TABLE_SUPPORT | \
-	 GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP | \
-	 GDMA_DRV_CAP_FLAG_1_SELF_RESET_ON_EQE | \
-	 GDMA_DRV_CAP_FLAG_1_HANDLE_RECONFIG_EQE)
+	 GDMA_DRV_CAP_FLAG_1_DEV_LIST_HOLES_SUP)
+
+#define GDMA_DRV_CAP_FLAGS2 0
  • ⚠️ PR commit 16779c1ee17 (net: mana: Add support for net_shaper_ops) → upstream 75cabb46935b
    Differences found:
================================================================================
*    DELTA DIFFERENCES - code changes that differ between the patches          *
================================================================================

--- b/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -4,8 +4,6 @@
 #ifndef _MANA_H
 #define _MANA_H
 
-#include <net/net_shaper.h>
-
 #include "gdma.h"
 #include "hw_channel.h"
 

################################################################################
!    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
################################################################################

--- b/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -5,6 +5,7 @@
 #define _MANA_H
 
 #include <net/xdp.h>
+#include <net/net_shaper.h>
 
 #include "gdma.h"
 #include "hw_channel.h"

================================================================================
*    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
================================================================================

--- b/include/net/mana/mana.h
+++ b/include/net/mana/mana.h
@@ -2,4 +1,8 @@
 #define _MANA_H
 
+#include <net/xdp.h>
+
+#include <net/net_shaper.h>
+
 #include "gdma.h"
 #include "hw_channel.h"
  • ⚠️ PR commit 3c53b414114 (net: mana: Handle unsupported HWC commands) → upstream ca8ac489ca33
    Differences found:
################################################################################
!    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
################################################################################

--- b/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -847,6 +847,9 @@
 	err = mana_gd_send_request(gc, in_len, in_buf, out_len,
 				   out_buf);
 	if (err || resp->status) {
+		if (err == -EOPNOTSUPP)
+			return err;
+
 		if (req->req.msg_type != MANA_QUERY_PHY_STAT)
 			dev_err(dev, "Failed to send mana message: %d, 0x%x\n",
 				err, resp->status);

================================================================================
*    ONLY IN PATCH2 - files not modified by patch1                             *
================================================================================

--- a/drivers/net/ethernet/microsoft/mana/hw_channel.c
+++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c
@@ -891,6 +891,10 @@ int mana_hwc_send_request(struct hw_channel_context *hwc, u32 req_len,
 	}
 
 	if (ctx->status_code && ctx->status_code != GDMA_STATUS_MORE_ENTRIES) {
+		if (ctx->status_code == GDMA_STATUS_CMD_UNSUPPORTED) {
+			err = -EOPNOTSUPP;
+			goto out;
+		}
 		if (req_msg->req.msg_type != MANA_QUERY_PHY_STAT)
 			dev_err(hwc->dev, "HWC: Failed hw_channel req: 0x%x\n",
 				ctx->status_code);
--- a/include/net/mana/gdma.h
+++ b/include/net/mana/gdma.h
@@ -10,6 +10,7 @@
 #include "shm_channel.h"
 
 #define GDMA_STATUS_MORE_ENTRIES	0x00000105
+#define GDMA_STATUS_CMD_UNSUPPORTED	0xffffffff
 
 /* Structures labeled with "HW DATA" are exchanged with the hardware. All of
  * them are naturally aligned and hence don't need __packed.
  • ⚠️ PR commit c9c6b123bfa (net: mana: Use page pool fragments for RX buffers instead of full pages to improve memory efficiency.) → upstream 730ff06d3f5c
    Differences found:
================================================================================
*    DELTA DIFFERENCES - code changes that differ between the patches          *
================================================================================

--- b/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2471,6 +2471,7 @@
 	pprm.napi = &rxq->rx_cq.napi;
 	pprm.netdev = rxq->ndev;
 	pprm.order = get_order(rxq->alloc_size);
+	pprm.queue_idx = rxq->rxq_idx;
 	pprm.dev = gc->dev;
 
 	/* Let the page pool do the dma map when page sharing with multiple
  • ⚠️ PR commit 4bd7711dfba (idpf: add support for Tx refillqs in flow scheduling mode) → upstream cb83b559bea3
    Differences found:
================================================================================
*    DELTA DIFFERENCES - code changes that differ between the patches          *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -683,7 +683,7 @@
 	__cacheline_group_end_aligned(cold);
 };
 libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
-			    96 + sizeof(struct u64_stats_sync),
+			    88 + sizeof(struct u64_stats_sync),
 			    24);
 
 /**

################################################################################
!    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
################################################################################

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -694,7 +696,7 @@
 	__cacheline_group_end_aligned(cold);
 };
 libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
-			    112 + sizeof(struct u64_stats_sync),
+			    120 + sizeof(struct u64_stats_sync),
 			    24);
 
 /**

================================================================================
*    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -3306,5 +3431,5 @@
 skip_data:
-		rx_buf->page = NULL;
+		rx_buf->netmem = 0;
 
 		idpf_rx_post_buf_refill(refillq, buf_id);
 		IDPF_RX_BUMP_NTC(rxq, ntc);
--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -678,7 +686,7 @@
 	__cacheline_group_end_aligned(cold);
 };
 libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
-			    88 + sizeof(struct u64_stats_sync),
+			    112 + sizeof(struct u64_stats_sync),
 			    24);
 
 /**
  • ⚠️ PR commit 3627e99e5db (idpf: simplify and fix splitq Tx packet rollback error path) → upstream b61dfa9bc443
    Differences found:
================================================================================
*    DELTA DIFFERENCES - code changes that differ between the patches          *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -2289,4 +2289,55 @@
 
 /**
+ * idpf_tx_dma_map_error - handle TX DMA map errors
+ * @txq: queue to send buffer on
+ * @skb: send buffer
+ * @first: original first buffer info buffer for packet
+ * @idx: starting point on ring to unwind
+ */
+void idpf_tx_dma_map_error(struct idpf_tx_queue *txq, struct sk_buff *skb,
+			   struct idpf_tx_buf *first, u16 idx)
+{
+	struct libeth_sq_napi_stats ss = { };
+	struct libeth_cq_pp cp = {
+		.dev	= txq->dev,
+		.ss	= &ss,
+	};
+
+	u64_stats_update_begin(&txq->stats_sync);
+	u64_stats_inc(&txq->q_stats.dma_map_errs);
+	u64_stats_update_end(&txq->stats_sync);
+
+	/* clear dma mappings for failed tx_buf map */
+	for (;;) {
+		struct idpf_tx_buf *tx_buf;
+
+		tx_buf = &txq->tx_buf[idx];
+		libeth_tx_complete(tx_buf, &cp);
+		if (tx_buf == first)
+			break;
+		if (idx == 0)
+			idx = txq->desc_count;
+		idx--;
+	}
+
+	if (skb_is_gso(skb)) {
+		union idpf_tx_flex_desc *tx_desc;
+
+		/* If we failed a DMA mapping for a TSO packet, we will have
+		 * used one additional descriptor for a context
+		 * descriptor. Reset that here.
+		 */
+		tx_desc = &txq->flex_tx[idx];
+		memset(tx_desc, 0, sizeof(struct idpf_flex_tx_ctx_desc));
+		if (idx == 0)
+			idx = txq->desc_count;
+		idx--;
+	}
+
+	/* Update tail in case netdev_xmit_more was previously true */
+	idpf_tx_buf_hw_update(txq, idx, false);
+}
+
+/**
  * idpf_tx_splitq_bump_ntu - adjust NTU and generation
  * @txq: the tx ring to wrap
@@ -2337,35 +2388,4 @@
 
 /**
- * idpf_tx_splitq_pkt_err_unmap - Unmap buffers and bump tail in case of error
- * @txq: Tx queue to unwind
- * @params: pointer to splitq params struct
- * @first: starting buffer for packet to unmap
- */
-static void idpf_tx_splitq_pkt_err_unmap(struct idpf_tx_queue *txq,
-					 struct idpf_tx_splitq_params *params,
-					 struct idpf_tx_buf *first)
-{
-	struct libeth_sq_napi_stats ss = { };
-	struct idpf_tx_buf *tx_buf = first;
-	struct libeth_cq_pp cp = {
-		.dev    = txq->dev,
-		.ss     = &ss,
-	};
-	u32 idx = 0;
-
-	u64_stats_update_begin(&txq->stats_sync);
-	u64_stats_inc(&txq->q_stats.dma_map_errs);
-	u64_stats_update_end(&txq->stats_sync);
-
-	do {
-		libeth_tx_complete(tx_buf, &cp);
-		idpf_tx_clean_buf_ring_bump_ntc(txq, idx, tx_buf);
-	} while (idpf_tx_buf_compl_tag(tx_buf) == params->compl_tag);
-
-	/* Update tail in case netdev_xmit_more was previously true. */
-	idpf_tx_buf_hw_update(txq, params->prev_ntu, false);
-}
-
-/**
  * idpf_tx_splitq_map - Build the Tx flex descriptor
  * @tx_q: queue to send buffer on

################################################################################
!    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
################################################################################

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -2339,57 +2339,6 @@
 	return count;
 }
 
-/**
- * idpf_tx_dma_map_error - handle TX DMA map errors
- * @txq: queue to send buffer on
- * @skb: send buffer
- * @first: original first buffer info buffer for packet
- * @idx: starting point on ring to unwind
- */
-void idpf_tx_dma_map_error(struct idpf_tx_queue *txq, struct sk_buff *skb,
-			   struct idpf_tx_buf *first, u16 idx)
-{
-	struct libeth_sq_napi_stats ss = { };
-	struct libeth_cq_pp cp = {
-		.dev	= txq->dev,
-		.ss	= &ss,
-	};
-
-	u64_stats_update_begin(&txq->stats_sync);
-	u64_stats_inc(&txq->q_stats.dma_map_errs);
-	u64_stats_update_end(&txq->stats_sync);
-
-	/* clear dma mappings for failed tx_buf map */
-	for (;;) {
-		struct idpf_tx_buf *tx_buf;
-
-		tx_buf = &txq->tx_buf[idx];
-		libeth_tx_complete(tx_buf, &cp);
-		if (tx_buf == first)
-			break;
-		if (idx == 0)
-			idx = txq->desc_count;
-		idx--;
-	}
-
-	if (skb_is_gso(skb)) {
-		union idpf_tx_flex_desc *tx_desc;
-
-		/* If we failed a DMA mapping for a TSO packet, we will have
-		 * used one additional descriptor for a context
-		 * descriptor. Reset that here.
-		 */
-		tx_desc = &txq->flex_tx[idx];
-		memset(tx_desc, 0, sizeof(*tx_desc));
-		if (idx == 0)
-			idx = txq->desc_count;
-		idx--;
-	}
-
-	/* Update tail in case netdev_xmit_more was previously true */
-	idpf_tx_buf_hw_update(txq, idx, false);
-}
-
 /**
  * idpf_tx_splitq_bump_ntu - adjust NTU and generation
  * @txq: the tx ring to wrap
@@ -2438,6 +2387,37 @@
 	return true;
 }
 
+/**
+ * idpf_tx_splitq_pkt_err_unmap - Unmap buffers and bump tail in case of error
+ * @txq: Tx queue to unwind
+ * @params: pointer to splitq params struct
+ * @first: starting buffer for packet to unmap
+ */
+static void idpf_tx_splitq_pkt_err_unmap(struct idpf_tx_queue *txq,
+					 struct idpf_tx_splitq_params *params,
+					 struct idpf_tx_buf *first)
+{
+	struct libeth_sq_napi_stats ss = { };
+	struct idpf_tx_buf *tx_buf = first;
+	struct libeth_cq_pp cp = {
+		.dev    = txq->dev,
+		.ss     = &ss,
+	};
+	u32 idx = 0;
+
+	u64_stats_update_begin(&txq->stats_sync);
+	u64_stats_inc(&txq->q_stats.dma_map_errs);
+	u64_stats_update_end(&txq->stats_sync);
+
+	do {
+		libeth_tx_complete(tx_buf, &cp);
+		idpf_tx_clean_buf_ring_bump_ntc(txq, idx, tx_buf);
+	} while (idpf_tx_buf_compl_tag(tx_buf) == params->compl_tag);
+
+	/* Update tail in case netdev_xmit_more was previously true. */
+	idpf_tx_buf_hw_update(txq, params->prev_ntu, false);
+}
+
 /**
  * idpf_tx_splitq_map - Build the Tx flex descriptor
  * @tx_q: queue to send buffer on
@@ -2482,8 +2462,9 @@
 	for (frag = &skb_shinfo(skb)->frags[0];; frag++) {
 		unsigned int max_data = IDPF_TX_MAX_DESC_DATA_ALIGNED;
 
-		if (dma_mapping_error(tx_q->dev, dma))
-			return idpf_tx_dma_map_error(tx_q, skb, first, i);
+		if (unlikely(dma_mapping_error(tx_q->dev, dma)))
+			return idpf_tx_splitq_pkt_err_unmap(tx_q, params,
+							    first);
 
 		first->nr_frags++;
 		idpf_tx_buf_compl_tag(tx_buf) = params->compl_tag;
@@ -2939,7 +2920,9 @@
 static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
 					struct idpf_tx_queue *tx_q)
 {
-	struct idpf_tx_splitq_params tx_params = { };
+	struct idpf_tx_splitq_params tx_params = {
+		.prev_ntu = tx_q->next_to_use,
+	};
 	union idpf_flex_tx_ctx_desc *ctx_desc;
 	struct idpf_tx_buf *first;
 	unsigned int count;

================================================================================
*    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -2328,7 +2380,7 @@
 		 * descriptor. Reset that here.
 		 */
 		tx_desc = &txq->flex_tx[idx];
-		memset(tx_desc, 0, sizeof(struct idpf_flex_tx_ctx_desc));
+		memset(tx_desc, 0, sizeof(*tx_desc));
 		if (idx == 0)
 			idx = txq->desc_count;
 		idx--;
@@ -2819,4 +2871,5 @@
 {
 	struct idpf_tx_splitq_params tx_params = { };
+	union idpf_flex_tx_ctx_desc *ctx_desc;
 	struct idpf_tx_buf *first;
 	unsigned int count;
  • ⚠️ PR commit e1b85f55748 (idpf: replace flow scheduling buffer ring with buffer pool) → upstream 5f417d551324
    Differences found:
================================================================================
*    DELTA DIFFERENCES - code changes that differ between the patches          *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -1917,11 +1917,8 @@
 		.napi	= budget,
 	};
 
-	tx_buf = &txq->tx_buf[buf_id];
-	if (tx_buf->type == LIBETH_SQE_SKB) {
+	if (tx_buf->type == LIBETH_SQE_SKB)
 		libeth_tx_complete(tx_buf, &cp);
-		idpf_post_buf_refill(txq->refillq, buf_id);
-	}
 
 	while (idpf_tx_buf_next(tx_buf) != IDPF_TXBUF_NULL) {
 		buf_id = idpf_tx_buf_next(tx_buf);

################################################################################
!    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
################################################################################

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -1962,6 +1962,7 @@
 		     idpf_tx_buf_compl_tag(tx_buf) != compl_tag))
 		return false;
 
+	tx_buf = &txq->tx_buf[buf_id];
 	if (tx_buf->type == LIBETH_SQE_SKB) {
 		if (skb_shinfo(tx_buf->skb)->tx_flags & SKBTX_IN_PROGRESS)
 			idpf_tx_read_tstamp(txq, tx_buf->skb);
@@ -1965,6 +1966,7 @@
 			idpf_tx_read_tstamp(txq, tx_buf->skb);
 
 		libeth_tx_complete(tx_buf, &cp);
+		idpf_post_buf_refill(txq->refillq, buf_id);
 	}
 
 	idpf_tx_clean_buf_ring_bump_ntc(txq, idx, tx_buf);
@@ -2892,6 +2859,7 @@
 	struct idpf_tx_buf *first;
 	unsigned int count;
 	int tso, idx;
+	u32 buf_id;
 
 	count = idpf_tx_desc_count_required(tx_q, skb);
 	if (unlikely(!count))
--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -707,7 +715,7 @@
 };
 libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
 			    120 + sizeof(struct u64_stats_sync),
-			    24);
+			    32);
 
 /**
  * struct idpf_buf_queue - software structure representing a buffer queue

================================================================================
*    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -1917,8 +1965,12 @@
 		     idpf_tx_buf_compl_tag(tx_buf) != compl_tag))
 		return false;
 
-	if (tx_buf->type == LIBETH_SQE_SKB)
+	if (tx_buf->type == LIBETH_SQE_SKB) {
+		if (skb_shinfo(tx_buf->skb)->tx_flags & SKBTX_IN_PROGRESS)
+			idpf_tx_read_tstamp(txq, tx_buf->skb);
+
 		libeth_tx_complete(tx_buf, &cp);
+	}
 
 	idpf_tx_clean_buf_ring_bump_ntc(txq, idx, tx_buf);
 
@@ -2746,4 +2798,4 @@
-	struct idpf_flex_tx_ctx_desc *desc;
+	union idpf_flex_tx_ctx_desc *desc;
 	int i = txq->next_to_use;
 
 	txq->tx_buf[i].type = LIBETH_SQE_CTX;
@@ -2804,6 +2856,6 @@
 	struct idpf_tx_buf *first;
 	unsigned int count;
-	int tso;
+	int tso, idx;
 
 	count = idpf_tx_desc_count_required(tx_q, skb);
 	if (unlikely(!count))
@@ -2842,4 +2962,4 @@
-		u64_stats_update_end(&tx_q->stats_sync);
+		idpf_tx_set_tstamp_desc(ctx_desc, idx);
 	}
 
 	/* record the location of the first descriptor for this packet */
--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -685,7 +693,7 @@
 	__cacheline_group_end_aligned(cold);
 };
 libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
-			    96 + sizeof(struct u64_stats_sync),
+			    120 + sizeof(struct u64_stats_sync),
 			    24);
 
 /**
  • ⚠️ PR commit f6596c35c31 (idpf: stop Tx if there are insufficient buffer resources) → upstream 0c3f135e840d
    Differences found:
################################################################################
!    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
################################################################################

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -2909,7 +2926,7 @@
 	};
 	union idpf_flex_tx_ctx_desc *ctx_desc;
 	struct idpf_tx_buf *first;
-	unsigned int count;
+	u32 count, buf_count = 1;
 	int tso, idx;
 	u32 buf_id;
 

================================================================================
*    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -2770,7 +2821,8 @@
 	};
+	union idpf_flex_tx_ctx_desc *ctx_desc;
 	struct idpf_tx_buf *first;
 	unsigned int count;
-	int tso;
+	int tso, idx;
 	u32 buf_id;
 
 	count = idpf_tx_desc_count_required(tx_q, skb);
  • ⚠️ PR commit ddde1d9445f (idpf: remove obsolete stashing code) → upstream 6c4e68480238
    Differences found:
================================================================================
*    DELTA DIFFERENCES - code changes that differ between the patches          *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -1559,6 +1559,82 @@
 	wake_up(&vport->sw_marker_wq);
 }
 
+/**
+ * idpf_tx_clean_stashed_bufs - clean bufs that were stored for
+ * out of order completions
+ * @txq: queue to clean
+ * @compl_tag: completion tag of packet to clean (from completion descriptor)
+ * @cleaned: pointer to stats struct to track cleaned packets/bytes
+ * @budget: Used to determine if we are in netpoll
+ */
+static void idpf_tx_clean_stashed_bufs(struct idpf_tx_queue *txq,
+				       u16 compl_tag,
+				       struct libeth_sq_napi_stats *cleaned,
+				       int budget)
+{
+	struct idpf_tx_stash *stash;
+	struct hlist_node *tmp_buf;
+	struct libeth_cq_pp cp = {
+		.dev	= txq->dev,
+		.ss	= cleaned,
+		.napi	= budget,
+	};
+
+	/* Buffer completion */
+	hash_for_each_possible_safe(txq->stash->sched_buf_hash, stash, tmp_buf,
+				    hlist, compl_tag) {
+		if (unlikely(idpf_tx_buf_compl_tag(&stash->buf) != compl_tag))
+			continue;
+
+		hash_del(&stash->hlist);
+		libeth_tx_complete(&stash->buf, &cp);
+
+		/* Push shadow buf back onto stack */
+		idpf_buf_lifo_push(&txq->stash->buf_stack, stash);
+	}
+}
+
+/**
+ * idpf_stash_flow_sch_buffers - store buffer parameters info to be freed at a
+ * later time (only relevant for flow scheduling mode)
+ * @txq: Tx queue to clean
+ * @tx_buf: buffer to store
+ */
+static int idpf_stash_flow_sch_buffers(struct idpf_tx_queue *txq,
+				       struct idpf_tx_buf *tx_buf)
+{
+	struct idpf_tx_stash *stash;
+
+	if (unlikely(tx_buf->type <= LIBETH_SQE_CTX))
+		return 0;
+
+	stash = idpf_buf_lifo_pop(&txq->stash->buf_stack);
+	if (unlikely(!stash)) {
+		net_err_ratelimited("%s: No out-of-order TX buffers left!\n",
+				    netdev_name(txq->netdev));
+
+		return -ENOMEM;
+	}
+
+	/* Store buffer params in shadow buffer */
+	stash->buf.skb = tx_buf->skb;
+	stash->buf.bytes = tx_buf->bytes;
+	stash->buf.packets = tx_buf->packets;
+	stash->buf.type = tx_buf->type;
+	stash->buf.nr_frags = tx_buf->nr_frags;
+	dma_unmap_addr_set(&stash->buf, dma, dma_unmap_addr(tx_buf, dma));
+	dma_unmap_len_set(&stash->buf, len, dma_unmap_len(tx_buf, len));
+	idpf_tx_buf_compl_tag(&stash->buf) = idpf_tx_buf_compl_tag(tx_buf);
+
+	/* Add buffer to buf_hash table to be freed later */
+	hash_add(txq->stash->sched_buf_hash, &stash->hlist,
+		 idpf_tx_buf_compl_tag(&stash->buf));
+
+	tx_buf->type = LIBETH_SQE_EMPTY;
+
+	return 0;
+}
+
 #define idpf_tx_splitq_clean_bump_ntc(txq, ntc, desc, buf)	\
 do {								\
 	if (unlikely(++(ntc) == (txq)->desc_count)) {		\
--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -653,7 +660,7 @@
 	__cacheline_group_end_aligned(cold);
 };
 libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
-			    80 + sizeof(struct u64_stats_sync),
+			    96 + sizeof(struct u64_stats_sync),
 			    32);
 
 /**

################################################################################
!    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
################################################################################

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -1602,87 +1462,6 @@
 	spin_unlock_bh(&tx_tstamp_caps->status_lock);
 }
 
-/**
- * idpf_tx_clean_stashed_bufs - clean bufs that were stored for
- * out of order completions
- * @txq: queue to clean
- * @compl_tag: completion tag of packet to clean (from completion descriptor)
- * @cleaned: pointer to stats struct to track cleaned packets/bytes
- * @budget: Used to determine if we are in netpoll
- */
-static void idpf_tx_clean_stashed_bufs(struct idpf_tx_queue *txq,
-				       u16 compl_tag,
-				       struct libeth_sq_napi_stats *cleaned,
-				       int budget)
-{
-	struct idpf_tx_stash *stash;
-	struct hlist_node *tmp_buf;
-	struct libeth_cq_pp cp = {
-		.dev	= txq->dev,
-		.ss	= cleaned,
-		.napi	= budget,
-	};
-
-	/* Buffer completion */
-	hash_for_each_possible_safe(txq->stash->sched_buf_hash, stash, tmp_buf,
-				    hlist, compl_tag) {
-		if (unlikely(idpf_tx_buf_compl_tag(&stash->buf) != compl_tag))
-			continue;
-
-		hash_del(&stash->hlist);
-
-		if (stash->buf.type == LIBETH_SQE_SKB &&
-		    (skb_shinfo(stash->buf.skb)->tx_flags & SKBTX_IN_PROGRESS))
-			idpf_tx_read_tstamp(txq, stash->buf.skb);
-
-		libeth_tx_complete(&stash->buf, &cp);
-
-		/* Push shadow buf back onto stack */
-		idpf_buf_lifo_push(&txq->stash->buf_stack, stash);
-	}
-}
-
-/**
- * idpf_stash_flow_sch_buffers - store buffer parameters info to be freed at a
- * later time (only relevant for flow scheduling mode)
- * @txq: Tx queue to clean
- * @tx_buf: buffer to store
- */
-static int idpf_stash_flow_sch_buffers(struct idpf_tx_queue *txq,
-				       struct idpf_tx_buf *tx_buf)
-{
-	struct idpf_tx_stash *stash;
-
-	if (unlikely(tx_buf->type <= LIBETH_SQE_CTX))
-		return 0;
-
-	stash = idpf_buf_lifo_pop(&txq->stash->buf_stack);
-	if (unlikely(!stash)) {
-		net_err_ratelimited("%s: No out-of-order TX buffers left!\n",
-				    netdev_name(txq->netdev));
-
-		return -ENOMEM;
-	}
-
-	/* Store buffer params in shadow buffer */
-	stash->buf.skb = tx_buf->skb;
-	stash->buf.bytes = tx_buf->bytes;
-	stash->buf.packets = tx_buf->packets;
-	stash->buf.type = tx_buf->type;
-	stash->buf.nr_frags = tx_buf->nr_frags;
-	dma_unmap_addr_set(&stash->buf, dma, dma_unmap_addr(tx_buf, dma));
-	dma_unmap_len_set(&stash->buf, len, dma_unmap_len(tx_buf, len));
-	idpf_tx_buf_compl_tag(&stash->buf) = idpf_tx_buf_compl_tag(tx_buf);
-
-	/* Add buffer to buf_hash table to be freed later */
-	hash_add(txq->stash->sched_buf_hash, &stash->hlist,
-		 idpf_tx_buf_compl_tag(&stash->buf));
-
-	tx_buf->type = LIBETH_SQE_EMPTY;
-
-	return 0;
-}
-
 #define idpf_tx_splitq_clean_bump_ntc(txq, ntc, desc, buf)	\
 do {								\
 	if (unlikely(++(ntc) == (txq)->desc_count)) {		\
--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -599,9 +565,6 @@
  * @cleaned_pkts: Number of packets cleaned for the above said case
  * @stash: Tx buffer stash for Flow-based scheduling mode
  * @refillq: Pointer to refill queue
- * @compl_tag_bufid_m: Completion tag buffer id mask
- * @compl_tag_cur_gen: Used to keep track of current completion tag generation
- * @compl_tag_gen_max: To determine when compl_tag_cur_gen should be reset
  * @cached_tstamp_caps: Tx timestamp capabilities negotiated with the CP
  * @tstamp_task: Work that handles Tx timestamp read
  * @stats_sync: See struct u64_stats_sync
@@ -650,10 +611,6 @@
 	struct idpf_txq_stash *stash;
 	struct idpf_sw_queue *refillq;
 
-	u16 compl_tag_bufid_m;
-	u16 compl_tag_cur_gen;
-	u16 compl_tag_gen_max;
-
 	struct idpf_ptp_vport_tx_tstamp_caps *cached_tstamp_caps;
 	struct work_struct *tstamp_task;
 
@@ -671,7 +628,7 @@
 	__cacheline_group_end_aligned(cold);
 };
 libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
-			    120 + sizeof(struct u64_stats_sync),
+			    104 + sizeof(struct u64_stats_sync),
 			    32);
 
 /**

================================================================================
*    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
================================================================================

--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -4,4 +4,4 @@
-#include "idpf.h"
+#include "idpf_ptp.h"
 #include "idpf_virtchnl.h"
 
 struct idpf_tx_stash {
@@ -1727,6 +1770,11 @@
 			continue;
 
 		hash_del(&stash->hlist);
+
+		if (stash->buf.type == LIBETH_SQE_SKB &&
+		    (skb_shinfo(stash->buf.skb)->tx_flags & SKBTX_IN_PROGRESS))
+			idpf_tx_read_tstamp(txq, stash->buf.skb);
+
 		libeth_tx_complete(&stash->buf, &cp);
 
 		/* Push shadow buf back onto stack */
--- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -632,2 +638,4 @@
  * @compl_tag_gen_max: To determine when compl_tag_cur_gen should be reset
+ * @cached_tstamp_caps: Tx timestamp capabilities negotiated with the CP
+ * @tstamp_task: Work that handles Tx timestamp read
  * @stats_sync: See struct u64_stats_sync
@@ -682,6 +690,6 @@
 	u16 compl_tag_cur_gen;
 	u16 compl_tag_gen_max;
 
-	struct u64_stats_sync stats_sync;
-	struct idpf_tx_queue_stats q_stats;
-	__cacheline_group_end_aligned(read_write);
+	struct idpf_ptp_vport_tx_tstamp_caps *cached_tstamp_caps;
+	struct work_struct *tstamp_task;
+
@@ -696,7 +707,7 @@
 	__cacheline_group_end_aligned(cold);
 };
 libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
-			    96 + sizeof(struct u64_stats_sync),
+			    120 + sizeof(struct u64_stats_sync),
 			    32);
 
 /**

This is an automated interdiff check for backported commits.

@github-actions
Copy link

Validation checks completed successfully View full results: https://github.com/ctrliq/kernel-src-tree/actions/runs/23542504544

Copy link
Collaborator

@bmastbergen bmastbergen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

10 participants