aboutsummaryrefslogtreecommitdiff
path: root/src/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
2023-06-21[efi] Always poll for TX completionsdell3440bMichael Brown1-5/+5
Polling for TX completions is arguably redundant when there are no transmissions currently in progress. Commit c6c7e78 ("[efi] Poll for TX completions only when there is an outstanding TX buffer") switched to setting the PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS flag only when there is an in-progress transmission awaiting completion, in order to reduce reported TX errors and debug message noise from buggy NII implementations that report spurious TX completions whenever the transmit queue is empty. Some other NII implementations (observed with the Realtek driver in a Dell Latitude 3440) seem to have a bug in the transmit datapath handling which results in the transmit ring freezing after sending a few hundred packets under heavy load. The symptoms are that the TPPoll register's NPQ bit remains set and the 256-entry transmit ring contains a large number of uncompleted descriptors (with the OWN bit set), the first two of which have identical data buffer addresses. Though iPXE will submit at most one in-progress transmission via NII, the Dell/Realtek driver seems to make a page-aligned copy of each transmit data buffer and to report TX completions immediately without waiting for the packet to actually be transmitted. These synthetic TX completions continue even after the hardware transmit ring freezes. Setting PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll reduces the probability of this Dell/Realtek driver bug being triggered by a factor of around 500, which brings the failure rate down to the point that it can sensibly be managed by external logic such as the "--timeout" option for image downloads. Closing and reopening the interface (via "ifclose"/"ifopen") will clear the error condition and allow transmissions to resume. Revert to setting PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll, and silently ignore situations in which the hardware reports a completion when no transmission is in progress. This approximately matches the behaviour of the SnpDxe driver, which will also generally set PXE_OPFLAGS_GET_TRANSMITTED_BUFFERS on every poll. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-03-14[intel] Add workaround for I210 reset hardware bugsMatt Parrella2-2/+24
The Intel I210's packet buffer size registers reset only on power up, not when a reset signal is asserted. This can lead to the inability to pass traffic in the event that the DMA TX Maximum Packet Size (which does reset to its default value on reset) is bigger than the TX Packet Buffer Size. For example, an operating system may be using the time sensitive networking features of the I210 and the registers may be programmed correctly, but then a reset signal is asserted and iPXE on the next boot will be unable to use the I210. Mimic what Linux does and forcibly set the registers to their default values. Signed-off-by: Matt Parrella <parrella.matthew@gmail.com>
2023-03-05[intelx] Add PCI IDs for Intel 82599 10GBASE-T NICForest Crossman1-0/+1
Signed-off-by: Forest Crossman <cyrozap@gmail.com>
2023-02-01[realtek] Explicitly disable VLAN offloadMichael Brown2-2/+7
Some cards seem to have the receive VLAN tag stripping feature enabled by default, which causes received VLAN packets to be misinterpreted as being received by the trunk device. Fix by disabling VLAN tag stripping in the C+ Command Register. Debugged-by: Xinming Lai <yiyihu@gmail.com> Tested-by: Xinming Lai <yiyihu@gmail.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-01-23[efi] Bind to only the topmost instance of the SNP or NII protocolssnploopMichael Brown1-30/+36
UEFI has the mildly annoying habit of installing copies of the EFI_SIMPLE_NETWORK_PROTOCOL instance on the IPv4 and IPv6 child device handles. This can cause iPXE's SNP driver to attempt to bind to a copy of the EFI_SIMPLE_NETWORK_PROTOCOL that iPXE itself provided on a different handle. Fix by refusing to bind to an SNP (or NII) handle if there exists another instance of the same protocol further up the device path (on the basis that we always want to bind to the highest possible device). Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-01-23[efi] Extend efi_locate_device() to allow searching up the device pathMichael Brown2-2/+2
Extend the functionality of efi_locate_device() to allow callers to find instances of the protocol that may exist further up the device path. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-01-18[ena] Allocate an unused Asynchronous Event Notification Queue (AENQ)aenqAlexander Graf2-0/+139
We currently don't allocate an Asynchronous Event Notification Queue (AENQ) because we don't actually care about any of the events that may come in. The ENA firmware found on Graviton instances requires the AENQ to exist, otherwise all admin queue commands will fail. Fix by allocating an AENQ and disabling all events (so that we do not need to include code to acknowledge any events that may arrive). Signed-off-by: Alexander Graf <graf@amazon.com>
2023-01-15[netdevice] Allow duplicate MAC addressesMichael Brown3-3/+56
Many laptops now include the ability to specify a "system-specific MAC address" (also known as "pass-through MAC"), which is supposed to be used for both the onboard NIC and for any attached docking station or other USB NIC. This is intended to simplify interoperability with software or hardware that relies on a MAC address to recognise an individual machine: for example, a deployment server may associate the MAC address with a particular operating system image to be deployed. This therefore creates legitimate situations in which duplicate MAC addresses may exist within the same system. As described in commit 98d09a1 ("[netdevice] Avoid registering duplicate network devices"), the Xen netfront driver relies on the rejection of duplicate MAC addresses in order to inhibit registration of the emulated PCI devices that a Xen PV-HVM guest will create to shadow each of the paravirtual network devices. Move the code that rejects duplicate MAC addresses from the network device core to the Xen netfront driver, to allow for the existence of duplicate MAC addresses in non-Xen setups. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2023-01-11[efi] Disable receive filters to work around buggy UNDI driversMichael Brown1-10/+47
Some UNDI drivers (such as the AMI UsbNetworkPkg currently in the process of being upstreamed into EDK2) have a bug that will prevent any packets from being received unless at least one attempt has been made to disable some receive filters. Work around these buggy drivers by attempting to disable receive filters before enabling them. Ignore any errors, since we genuinely do not care whether or not the disabling succeeds. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-11-15[intel] Add PCI ID for I219-V and -LM 16,17Christian I. Nilsson1-0/+4
Signed-off-by: Christian I. Nilsson <nikize@gmail.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-09-19[ena] Assign memory BAR if left empty by BIOSMichael Brown1-0/+45
Some BIOSes in AWS EC2 (observed with a c6i.metal instance in eu-west-2) will fail to assign an MMIO address to the ENA device, which causes ioremap() to fail. Experiments show that the ENA device is the only device behind its bridge, even when multiple ENA devices are present, and that the BIOS does assign a memory window to the bridge. We may therefore choose to assign the device an MMIO address at the start of the bridge's memory window. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-26[ena] Increase receive ring size to 128 entriesMichael Brown2-5/+12
Some versions of the ENA hardware (observed on a c6i.large instance in eu-west-2) seem to require a receive ring containing at least 128 entries: any smaller ring will never see receive completions or will stall after the first few completions. Increase the receive ring size to 128 entries (determined empirically) for compatibility with these hardware versions. Limit the receive ring fill level to 16 (as at present) to avoid consuming more memory than will typically be available in the internal heap. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-26[ena] Provide a host information pageMichael Brown2-0/+159
Some versions of the ENA firmware (observed on a c6i.large instance in eu-west-2) seem to require a host information page, without which the CREATE_CQ command will fail with ENA_ADMIN_UNKNOWN_ERROR. These firmware versions also seem to require us to claim that we are a Linux kernel with a specific driver major version number. This appears to be a firmware bug, as revealed by Linux kernel commit 1a63443af ("net/amazon: Ensure that driver version is aligned to the linux kernel"): this commit changed the value of the driver version number field to be the Linux kernel version, and was hastily reverted in commit 92040c6da ("net: ena: fix broken interface between ENA driver and FW") which clarified that the version number field does actually have some undocumented significance to some versions of the firmware. Fix by providing a host information page via the SET_FEATURE command, incorporating the apparently necessary lies about our identity. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-26[ena] Specify the unused completion queue MSI-X vector as 0xffffffffMichael Brown2-0/+9
Some versions of the ENA firmware (observed on a c6i.large instance in eu-west-2) will complain if the completion queue's MSI-X vector field is left empty, even though the queue configuration specifies that interrupts are not used. Work around these firmware versions by passing in what appears to be the magic "no MSI-X vector" value in this field. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-26[ena] Allow for out-of-order completionsMichael Brown2-20/+62
The ENA data path design has separate submission and completion queues. Submission queues must be refilled in strict order (since there is only a single linear tail pointer used to communicate the existence of new entries to the hardware), and completion queue entries include a request identifier copied verbatim from the submission queue entry. Once the submission queue doorbell has been rung, software never again reads from the submission queue entry and nothing ever needs to write back to the submission queue entry since completions are reported via the separate completion queue. This design allows the hardware to complete submission queue entries out of order, provided that it internally caches at least as many entries as it leaves gaps. Record and identify I/O buffers by request identifier (using a circular ring buffer of unique request identifiers), and remove the assumption that submission queue entries will be completed in order. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-26[ena] Limit submission queue fill level to completion queue sizeMichael Brown2-4/+11
The CREATE_CQ command is permitted to return a size smaller than requested, which could leave us in a situation where the completion queue could overflow. Avoid overflow by limiting the submission queue fill level to the actual size of the completion queue. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-16[intelxl] Explicitly request a single queue pair for virtual functionsMichael Brown2-1/+58
Current versions of the E810 PF driver fail to set the number of in-use queue pairs in response to the CONFIG_VSI_QUEUES message. When the number of in-use queue pairs is less than the number of available queue pairs, this results in some packets being directed to nonexistent receive queues and hence silently dropped. Work around this PF driver bug by explicitly configuring the number of available queue pairs via the REQUEST_QUEUES message. This message triggers a VF reset that, in turn, requires us to reopen the admin queue and issue an additional GET_RESOURCES message to restore the VF to a functional state. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-16[intelxl] Allow for admin commands that trigger a VF resetMichael Brown1-13/+28
The RESET_VF admin queue command does not complete via the usual mechanism, but instead requires us to poll registers to wait for the reset to take effect and then reopen the admin queue. Allow for the existence of other admin queue commands that also trigger a VF reset, by separating out the logic that waits for the reset to complete. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-16[intelxl] Negotiate virtual function API version 1.1Michael Brown3-3/+31
Negotiate API version 1.1 in order to allow access to virtual function opcodes that are disallowed by default on the E810. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-16[intelxl] Show virtual function packet statistics for debuggingMichael Brown2-0/+88
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12[intelxl] Add driver for Intel 100 Gigabit Ethernet NICsMichael Brown4-7/+1568
Add a driver for the E810 family of 100 Gigabit Ethernet NICs. The core datapath is identical to that of the 40 Gigabit XL710, and this part of the code is shared between both drivers. The admin queue mechanism is sufficiently similar to make it worth reusing substantial portions of the code, with separate implementations for several commands to handle the (unnecessarily) breaking changes in data structure layouts. The major differences are in the mechanisms for programming queue contexts (where the E810 abandons TX/RX symmetry) and for configuring the transmit scheduler and receive filters: these portions are sufficiently different to justify a separate driver. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12[intelxl] Use admin queue to set port MAC address and maximum frame sizeMichael Brown2-27/+105
Remove knowledge of the PRTGL_SA[HL] registers, and instead use the admin queue to set the MAC address and maximum frame size. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12[intelxl] Use admin queue to get port MAC addressMichael Brown2-51/+82
Remove knowledge of the PRTPM_SA[HL] registers, and instead use the admin queue to retrieve the MAC address. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12[intelxl] Defer fetching MAC address until after opening admin queueMichael Brown1-5/+5
Allow for the MAC address to be fetched using an admin queue command, instead of reading the PRTPM_SA[HL] registers directly. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-12[intelxl] Set maximum frame size to 9728 bytes as per datasheetMichael Brown2-10/+6
The PRTGL_SAH register contains the current maximum frame size, and is not guaranteed on reset to contain the actual maximum frame size supported by the hardware, which the datasheet specifies as 9728 bytes (including the 4-byte CRC). Set the maximum packet size to a hardcoded 9728 bytes instead of reading from the PRTGL_SAH register. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11[intelxl] Always issue "clear PXE mode" admin queue commandMichael Brown2-13/+11
Remove knowledge of the GLLAN_RCTL_0 register (which changes location between the XL810 and E810 register maps), and instead unconditionally issue the "clear PXE mode" command with the EEXIST error silenced. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11[intelxl] Allow expected admin queue command errors to be silencedMichael Brown1-3/+7
The "clear PXE mode" admin queue command will return an EEXIST error if the device is already in non-PXE mode, but there is no other admin queue command that can be used to determine whether the device has already been switched into non-PXE mode. Provide a mechanism to allow expected errors from a command to be silenced, to allow the "clear PXE mode" command to be cleanly used without needing to first check the GLLAN_RCTL_0 register value. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11[intelxl] Increase data buffer size to 4kBMichael Brown1-2/+5
At least one E810 admin queue command (Query Default Scheduling Tree Topology) insists upon being provided with a 4kB data buffer, even when the data to be returned is much smaller. Work around this requirement by increasing the admin queue data buffer size to 4kB. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11[intelxl] Separate virtual function driver definitionsMichael Brown4-259/+320
Move knowledge of the virtual function data structures and admin command definitions from intelxl.h to intelxlvf.h. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11[intelxl] Reuse admin command descriptor and buffer for VF responsesMichael Brown2-17/+15
Remove the large static admin data buffer structure embedded within struct intelxl_nic, and instead copy the response received via the "send to VF" admin queue event to the (already consumed and completed) admin command descriptor and data buffer. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-11[intelxl] Handle admin events via a callbackMichael Brown3-30/+43
The physical and virtual function drivers each care about precisely one admin queue event type. Simplify event handling by using a per-driver callback instead of the existing weak function symbol. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10[intelxl] Rename 8086:1889 PCI ID to "iavf"Michael Brown1-1/+1
The PCI device ID 8086:1889 is for the Intel Ethernet Adaptive Virtual Function, which is a generic virtual function that can be exposed by different generations of Intel hardware. Rename the PCI ID from "xl710-vf-ad" to "iavf" to reflect that the driver is not XL710-specific. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10[intelxl] Increase receive descriptor ring size to 64 entriesMichael Brown1-2/+2
The E810 requires that receive descriptor rings have at least 64 entries (and are a multiple of 32 entries). Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10[intelxl] Negotiate API version for virtual function via admin queueMichael Brown3-10/+75
Do not attempt to use the admin commands to get the firmware version and report the driver version for the virtual function driver, since these will be rejected by the E810 firmware as invalid commands when issued by a virtual function. Instead, use the mailbox interface to negotiate the API version with the physical function driver. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10[intelxl] Use non-zero MSI-X vector for virtual function interruptsMichael Brown4-18/+39
The 100 Gigabit physical function driver requires a virtual function driver to request that transmit and receive queues are mapped to MSI-X vector 1 or higher. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-10[intelxl] Fix invocation of intelxlvf_admin_queues()Michael Brown1-1/+1
The second parameter to intelxlvf_admin_queues() is a boolean used to select the VF opcode, rather than the raw VF opcode itself. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08[intelxl] Use function-level reset instead of PFGEN_CTRL.PFSWRMichael Brown4-39/+18
Remove knowledge of the PFGEN_CTRL register (which changes location between XL710 and E810 register maps), and instead use PCIe FLR to reset the physical function. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08[pci] Generalise function-level reset mechanismMichael Brown1-20/+3
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08[intelxl] Update list of PCI IDsMichael Brown1-0/+5
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08[intelxl] Include admin command response data buffer in debug outputMichael Brown1-1/+5
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08[intelxl] Identify rings consistently in debug messagesMichael Brown1-4/+3
Use the tail register offset (which exists for all ring types) as the ring identifier in all relevant debug messages. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08[intelxl] Add missing padding bytes to receive queue contextMichael Brown1-0/+2
For the sake of completeness, ensure that all 32 bytes of the receive queue context are programmed (including the unused final 8 bytes). Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08[intelxl] Fix bit width of function number in PFFUNC_RID registerMichael Brown1-1/+1
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-08-08[intelxl] Fix retrieval of switch configuration via admin queueMichael Brown1-9/+11
Commit 8f3e648 ("[intelxl] Use one admin queue buffer per admin queue descriptor") changed the API for intelxl_admin_command() such that the caller now constructs the command directly within the next available descriptor ring entry, rather than relying on intelxl_admin_command() to copy the descriptor to and from the descriptor ring. This introduced a regression in intelxl_admin_switch(), since the second and subsequent iterations of the loop will not have constructed a valid command in the new descriptor ring entry before calling intelxl_admin_command(). Fix by constructing the command within the loop. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2022-05-23[ecm] Treat ACPI MAC address as being a non-permanent MAC addressMichael Brown3-14/+18
When applying an ACPI-provided system-specific MAC address, apply it to netdev->ll_addr rather than netdev->hw_addr. This allows iPXE scripts to access the permanent MAC address via the ${netX/hwaddr} setting (and thereby provides scripts with a mechanism to ascertain that the NIC is using a MAC address other than its own permanent hardware address). Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-11-23[efi] Run ExitBootServices shutdown hook at TPL_NOTIFYshutdown_tpl_notifyMichael Brown2-5/+15
On some systems (observed with the Thunderbolt ports on a ThinkPad X1 Extreme Gen3 and a ThinkPad P53), if the IOMMU is enabled then the system firmware will install an ExitBootServices notification event that disables bus mastering on the Thunderbolt xHCI controller and all PCI bridges, and destroys any extant IOMMU mappings. This leaves the xHCI controller unable to perform any DMA operations. As described in commit 236299b ("[xhci] Avoid DMA during shutdown if firmware has disabled bus mastering"), any subsequent DMA operation attempted by the xHCI controller will end up completing after the operating system kernel has reenabled bus mastering, resulting in a DMA operation to an area of memory that the hardware is no longer permitted to access and, on Windows with the Driver Verifier enabled, a STOP 0xE6 (DRIVER_VERIFIER_DMA_VIOLATION). That commit avoids triggering any DMA attempts during the shutdown of the xHCI controller itself. However, this is not a complete solution since any attached and opened USB device (e.g. a USB NIC) may asynchronously trigger DMA attempts that happen to occur after bus mastering has been disabled but before we reset the xHCI controller. Avoid this problem by installing our own ExitBootServices notification event at TPL_NOTIFY, thereby causing it to be invoked before the firmware's own ExitBootServices notification event that disables bus mastering. This unsurprisingly causes the shutdown hook itself to be invoked at TPL_NOTIFY, which causes a fatal error when later code attempts to raise the TPL to TPL_CALLBACK (which is a lower TPL). Work around this problem by redefining the "internal" iPXE TPL to be variable, and set this internal TPL to TPL_NOTIFY when the shutdown hook is invoked. Avoid calling into an underlying SNP protocol instance from within our shutdown hook at TPL_NOTIFY, since the underlying SNP driver may attempt to raise the TPL to TPL_CALLBACK (which would cause a fatal error). Failing to shut down the underlying SNP device is safe to do since the underlying device must, in any case, have installed its own ExitBootServices hook if any shutdown actions are required. Reported-by: Andreas Hammarskjöld <junior@2PintSoftware.com> Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-11-22[intel] Add PCI ID for Intel X553 0x15e4Benedikt Braunger1-0/+1
Modified-by: Michael Brown <mcb30@ipxe.org> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-10-28[virtio] Update driver to use DMA APIAaron Young1-11/+25
Signed-off-by: Aaron Young <aaron.young@oracle.com>
2021-09-09[ecm] Use ACPI-provided system-specific MAC address if presentMichael Brown1-0/+9
Use the "system MAC address" provided within the DSDT/SSDT if such an address is available and has not already been assigned to a network device. Tested-by: Andreas Hammarskjöld <junior@2PintSoftware.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2021-09-09[ecm] Expose USB vendor/device information to ecm_fetch_mac()Michael Brown3-7/+8
Signed-off-by: Michael Brown <mcb30@ipxe.org>