2022-10-25  [crypto] Allow initialisation vector length to vary from cipher blocksize  (Michael Brown; 7 files changed, -16/+24)
Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-10-25  [crypto] Expose null crypto algorithm methods for reuse  (Michael Brown; 4 files changed, -51/+54)
Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-10-11  [tls] Add support for DHE variants of the existing cipher suites  (Michael Brown; 3 files changed, -4/+56)
Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-10-11  [tls] Add support for Ephemeral Diffie-Hellman key exchange  (HEAD)  (Michael Brown; 2 files changed, -0/+247)
Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-10-11  [tls] Add key exchange mechanism to definition of cipher suite  (Michael Brown; 4 files changed, -3/+48)

Allow for the key exchange mechanism to vary depending upon the selected cipher suite.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
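
As a rough illustration of what adding the key exchange mechanism to the cipher suite definition means in practice, the sketch below shows a cipher suite structure carrying a pointer to a key-exchange method alongside the usual cipher and digest algorithms. All identifiers here are assumptions chosen for illustration, not the names used in the iPXE source tree.

    #include <stdint.h>

    struct tls_connection;
    struct pubkey_algorithm;
    struct cipher_algorithm;
    struct digest_algorithm;

    /* Hypothetical key-exchange method: generates the pre-master secret
     * and sends the ClientKeyExchange message for this suite.
     */
    struct key_exchange_algorithm {
            const char *name;
            int ( * exchange ) ( struct tls_connection *tls );
    };

    /* Hypothetical cipher suite definition gaining an exchange pointer,
     * so that RSA and DHE variants can share the remaining fields.
     */
    struct cipher_suite {
            struct key_exchange_algorithm *exchange;
            struct pubkey_algorithm *pubkey;
            struct cipher_algorithm *cipher;
            struct digest_algorithm *digest;
            uint16_t code;          /* TLS cipher suite code point */
    };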

2022-10-11  [tls] Record ServerKeyExchange record, if provided  (Michael Brown; 2 files changed, -0/+40)

Accept and record the ServerKeyExchange record, which is required for key exchange mechanisms such as Ephemeral Diffie-Hellman (DHE).

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-10-11  [tls] Generate pre-master secret at point of sending ClientKeyExchange  (Michael Brown; 2 files changed, -26/+27)

The pre-master secret is currently constructed at the time of instantiating the TLS connection. This precludes the use of key exchange mechanisms such as Ephemeral Diffie-Hellman (DHE), which require a ServerKeyExchange message to exchange additional key material before the pre-master secret can be constructed.

Allow for the use of such cipher suites by deferring generation of the pre-master secret until the point of sending the ClientKeyExchange message.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-10-11  [tls] Generate master secret at point of sending ClientKeyExchange  (Michael Brown; 1 file changed, -8/+13)

The master secret is currently constructed upon receiving the ServerHello message. This precludes the use of key exchange mechanisms such as Ephemeral Diffie-Hellman (DHE), which require a ServerKeyExchange message to exchange additional key material before the pre-master secret and master secret can be constructed.

Allow for the use of such cipher suites by deferring generation of the master secret until the point of sending the ClientKeyExchange message.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-10-11  [crypto] Add Ephemeral Diffie-Hellman key exchange algorithm  (Michael Brown; 5 files changed, -0/+936)

Add an implementation of the Ephemeral Diffie-Hellman key exchange algorithm as defined in RFC 2631, with test vectors taken from the NIST Cryptographic Toolkit.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
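
For reference, the underlying (standard) Diffie-Hellman relations, in roughly the notation RFC 2631 uses for the shared secret ZZ, are:

    \[
    Y_A = g^{X_A} \bmod p, \qquad Y_B = g^{X_B} \bmod p, \qquad
    ZZ = Y_B^{\,X_A} \bmod p = Y_A^{\,X_B} \bmod p = g^{X_A X_B} \bmod p
    \]

where p and g are the agreed group parameters and X_A, X_B are the two parties' ephemeral private exponents.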

2022-10-10  [crypto] Simplify internal HMAC API  (Michael Brown; 16 files changed, -163/+142)

Simplify the internal HMAC API so that the key is provided only at the point of calling hmac_init(), and the (potentially reduced) key is stored as part of the context for later use by hmac_final(). This simplifies the calling code, and avoids the need for callers such as TLS to allocate a potentially variable length block in order to retain a copy of the unmodified key.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
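
A minimal sketch of the calling pattern implied by this commit, assuming hmac_init()/hmac_update()/hmac_final() take the digest algorithm and a caller-provided context. The exact signatures, the hmac_ctxsize() helper, and the header names are assumptions based on the commit message rather than code copied from the tree.

    #include <stddef.h>
    #include <stdint.h>
    #include <ipxe/hmac.h>      /* assumed to declare the HMAC helpers */
    #include <ipxe/sha256.h>    /* assumed to declare sha256_algorithm */

    /* Compute HMAC-SHA256 over a single buffer.  The key is supplied
     * only to hmac_init(); a (potentially reduced) copy lives inside
     * the context, so nothing else needs to be retained for
     * hmac_final().
     */
    static void hmac_sha256 ( const void *key, size_t key_len,
                              const void *data, size_t data_len,
                              void *out ) {
            uint8_t ctx[ hmac_ctxsize ( &sha256_algorithm ) ]; /* assumed helper */

            hmac_init ( &sha256_algorithm, ctx, key, key_len );
            hmac_update ( &sha256_algorithm, ctx, data, data_len );
            hmac_final ( &sha256_algorithm, ctx, out );
    }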

2022-10-10  [test] Add HMAC self-tests  (Michael Brown; 2 files changed, -0/+212)

The HMAC code is already tested indirectly via several consuming algorithms that themselves provide self-tests (e.g. HMAC-DRBG, NTLM authentication, and PeerDist content identification), but lacks any direct test vectors.

Add explicit HMAC tests and ensure that corner cases such as empty keys, block-length keys, and over-length keys are all covered.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-09-19  [ena] Assign memory BAR if left empty by BIOS  (Michael Brown; 1 file changed, -0/+45)

Some BIOSes in AWS EC2 (observed with a c6i.metal instance in eu-west-2) will fail to assign an MMIO address to the ENA device, which causes ioremap() to fail.

Experiments show that the ENA device is the only device behind its bridge, even when multiple ENA devices are present, and that the BIOS does assign a memory window to the bridge. We may therefore choose to assign the device an MMIO address at the start of the bridge's memory window.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
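
A simplified sketch of the workaround described above: if BAR0 is empty, point it at the base of the parent bridge's memory window. The register offsets are from the PCI and PCI-to-PCI bridge specifications; the accessor names follow common iPXE convention, but the code itself is illustrative only, not the driver's actual implementation.

    #include <stdint.h>
    #include <ipxe/pci.h>   /* assumed to declare struct pci_device and accessors */

    #define PCI_MEM_BASE 0x20       /* bridge memory window base register */

    static void assign_bar_from_bridge ( struct pci_device *dev,
                                         struct pci_device *bridge ) {
            uint32_t bar;
            uint16_t base;

            pci_read_config_dword ( dev, PCI_BASE_ADDRESS_0, &bar );
            if ( bar & ~0xfUL )
                    return;         /* BIOS already assigned an address */

            /* Bridge window base: bits 15:4 hold address bits 31:20 */
            pci_read_config_word ( bridge, PCI_MEM_BASE, &base );
            pci_write_config_dword ( dev, PCI_BASE_ADDRESS_0,
                                     ( ( uint32_t ) ( base & 0xfff0 ) ) << 16 );
    }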

2022-09-19  [pci] Add minimal PCI bridge driver  (Michael Brown; 4 files changed, -0/+191)

Add a minimal driver for PCI bridges that can be used to locate the bridge to which a PCI device is attached.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-09-18  [pci] Select PCI I/O API at runtime for cloud images  (Michael Brown; 11 files changed, -1/+256)

Pretty much all physical machines and off-the-shelf virtual machines will provide a functional PCI BIOS. We therefore default to using only the PCI BIOS, with no fallback to an alternative mechanism if the PCI BIOS fails.

AWS EC2 provides the opportunity to experience some exceptions to this rule. For example, the t3a.nano instances in eu-west-1 have no functional PCI BIOS at all. As of commit 83516ba ("[cloud] Use PCIAPI_DIRECT for cloud images") we therefore use direct Type 1 configuration space accesses in the images built and published for use in the cloud.

Recent experience has discovered yet more variation in AWS EC2 instances. For example, some of the metal instance types have multiple PCI host bridges and the direct Type 1 accesses therefore see only a subset of the PCI devices.

Attempt to accommodate future such variations by making the PCI I/O API selectable at runtime and choosing ECAM (if available), falling back to the PCI BIOS (if available), then finally falling back to direct Type 1 accesses.

This is implemented as a dedicated PCIAPI_CLOUD API, rather than by having the PCI core select a suitable API at runtime (as was done for timers in commit 302f1ee ("[time] Allow timer to be selected at runtime")). The common case will remain that only the PCI BIOS API is required, and we would prefer to retain the optimisations that come from inlining the configuration space accesses in this common case. Cloud images are (at present) disk images rather than ROM images, and so the increased code size required for this design approach in the PCIAPI_CLOUD case is acceptable.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
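
The fallback order described above can be summarised by a small sketch. The probe helpers here are hypothetical stand-ins for the MCFG table lookup and the INT 1A,B101 installation check, and the real PCIAPI_CLOUD implementation selects between whole API method tables rather than returning an enum.

    #include <stdbool.h>

    enum pci_api_choice { PCI_API_ECAM, PCI_API_BIOS, PCI_API_DIRECT };

    /* Hypothetical availability probes */
    bool ecam_available ( void );           /* ACPI MCFG table found and usable */
    bool pcibios_available ( void );        /* INT 1A,B101 installation check OK */

    static enum pci_api_choice cloud_select_pci_api ( void ) {
            if ( ecam_available() )
                    return PCI_API_ECAM;
            if ( pcibios_available() )
                    return PCI_API_BIOS;
            return PCI_API_DIRECT;          /* direct Type 1 CF8/CFC accesses */
    }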

2022-09-18  [bios] Allow pcibios_discover() to return an empty range  (Michael Brown; 1 file changed, -3/+5)

Allow pcibios_discover() to return an empty range if the INT 1A,B101 PCI BIOS installation check call fails.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-09-16  [pci] Add support for the Enhanced Configuration Access Mechanism (ECAM)  (Michael Brown; 5 files changed, -0/+461)

The ACPI MCFG table describes a direct mapping of PCI configuration space into MMIO space. This mapping allows access to extended configuration space (up to 4096 bytes) and also provides for the existence of multiple host bridges.

Add support for the ECAM mechanism described by the ACPI MCFG table, as a selectable PCI I/O API alongside the existing PCI BIOS and Type 1 mechanisms.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
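
The mapping that ECAM provides is worth spelling out: each PCI function's 4kB of configuration space appears at a fixed offset from the base address given in the MCFG table. The helper below is an illustrative restatement of that formula from the PCIe specification, not iPXE's implementation.

    #include <stdint.h>

    /* Physical address of a configuration register under ECAM */
    static inline uint64_t ecam_address ( uint64_t base, unsigned int start_bus,
                                          unsigned int bus, unsigned int dev,
                                          unsigned int fn, unsigned int where ) {
            return ( base + ( ( uint64_t ) ( bus - start_bus ) << 20 ) +
                     ( dev << 15 ) + ( fn << 12 ) + where );
    }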

2022-09-15  [pci] Generalise pci_num_bus() to pci_discover()  (Michael Brown; 10 files changed, -43/+78)

Allow pci_find_next() to discover devices beyond the first PCI segment, by generalising pci_num_bus() (which implicitly assumes that there is only a single PCI segment) with pci_discover() (which has the ability to return an arbitrary contiguous chunk of PCI bus:dev.fn address space).

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-09-15  [pci] Check for wraparound in callers of pci_find_next()  (Michael Brown; 3 files changed, -3/+10)

The semantics of the bus:dev.fn parameter passed to pci_find_next() are "find the first existent PCI device at this address or higher", with the caller expected to increment the address between finding devices.

This does not allow the parameter to distinguish between the two cases "start from address zero" and "wrapped after incrementing the maximum possible address", which could therefore lead to an infinite loop in the degenerate case that a device with address ffff:ff:1f.7 really exists.

Fix by checking for wraparound in the caller (which is already responsible for performing the increment).

Signed-off-by: Michael Brown <mcb30@ipxe.org>
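
A sketch of the caller-side pattern this implies: the caller increments the address after each device and stops if the increment wraps back to zero, rather than relying on pci_find_next() to detect exhaustion. The function name is taken from the commit message, but its exact signature and the loop body are assumptions for illustration.

    #include <stdint.h>
    #include <ipxe/pci.h>   /* assumed to declare struct pci_device and pci_find_next() */

    static void enumerate_all ( void ) {
            struct pci_device pci;
            uint32_t busdevfn = 0;
            int rc;

            do {
                    if ( ( rc = pci_find_next ( &pci, &busdevfn ) ) != 0 )
                            break;          /* no further devices */
                    /* ... handle the device at busdevfn ... */
                    busdevfn++;
            } while ( busdevfn );   /* zero again => wrapped past ffff:ff:1f.7 */
    }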

2022-09-15  [pci] Allow pci_find_next() to return non-zero PCI segments  (Michael Brown; 3 files changed, -16/+14)

Separate the return status code from the returned PCI bus:dev.fn address, in order to allow pci_find_next() to be used to find devices with a non-zero PCI segment number.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-09-15  [linux] Add missing PROVIDE_PCIAPI_INLINE() macros  (Michael Brown; 1 file changed, -0/+9)

Ensure type consistency of the PCI I/O API methods by adding the missing PROVIDE_PCIAPI_INLINE() macros.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-09-13  [ipv6] Ignore SLAAC on prefixes with an incompatible prefix length  (Michael Brown; 1 file changed, -11/+25)

Experience suggests that routers are often misconfigured to advertise SLAAC even on prefixes that do not have a SLAAC-compatible prefix length. iPXE will currently treat this as an error, resulting in the prefix being ignored completely.

Handle this misconfiguration by ignoring the autonomous address flag when the prefix length is unsuitable for SLAAC.

Reported-by: Malte Janduda <mail@janduda.net>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
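
The incompatibility in question is a simple one: SLAAC forms an address by appending a 64-bit interface identifier to the advertised prefix, so only a 64-bit prefix length fits. A sketch of the relaxed handling, with illustrative names rather than iPXE's actual code:

    #include <stdbool.h>

    /* Return whether an advertised prefix can be used for SLAAC.  A
     * misconfigured autonomous flag on an unsuitable prefix length is
     * simply ignored rather than treated as an error, so the prefix can
     * still be used for on-link determination and routing.
     */
    static bool slaac_usable ( unsigned int prefix_len, bool autonomous ) {
            return ( autonomous && ( prefix_len == 64 ) );
    }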

2022-09-06  [ipv6] Fix mask calculation when prefix length is not a multiple of 8  (Michael Brown; 2 files changed, -1/+38)

Signed-off-by: Michael Brown <mcb30@ipxe.org>
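
The fix above concerns building a netmask from a prefix length that is not a multiple of 8 (for example /61): whole bytes become 0xff and the final partial byte keeps only its top bits. An illustrative helper, not the actual iPXE routing code:

    #include <stdint.h>
    #include <string.h>

    static void prefix_to_mask ( unsigned int prefix_len, uint8_t mask[16] ) {
            unsigned int whole = ( prefix_len / 8 );        /* complete 0xff bytes */
            unsigned int bits = ( prefix_len % 8 );         /* bits in partial byte */

            memset ( mask, 0, 16 );
            memset ( mask, 0xff, whole );
            if ( bits )
                    mask[whole] = ( uint8_t ) ( 0xff << ( 8 - bits ) );
    }

For /61 this yields seven 0xff bytes followed by 0xf8.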

2022-09-06  [test] Validate constructed IPv6 routing table entries  (Michael Brown; 1 file changed, -12/+52)
Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-26  [ena] Increase receive ring size to 128 entries  (Michael Brown; 2 files changed, -5/+12)

Some versions of the ENA hardware (observed on a c6i.large instance in eu-west-2) seem to require a receive ring containing at least 128 entries: any smaller ring will never see receive completions or will stall after the first few completions.

Increase the receive ring size to 128 entries (determined empirically) for compatibility with these hardware versions. Limit the receive ring fill level to 16 (as at present) to avoid consuming more memory than will typically be available in the internal heap.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-26  [ena] Provide a host information page  (Michael Brown; 2 files changed, -0/+159)

Some versions of the ENA firmware (observed on a c6i.large instance in eu-west-2) seem to require a host information page, without which the CREATE_CQ command will fail with ENA_ADMIN_UNKNOWN_ERROR.

These firmware versions also seem to require us to claim that we are a Linux kernel with a specific driver major version number. This appears to be a firmware bug, as revealed by Linux kernel commit 1a63443af ("net/amazon: Ensure that driver version is aligned to the linux kernel"): this commit changed the value of the driver version number field to be the Linux kernel version, and was hastily reverted in commit 92040c6da ("net: ena: fix broken interface between ENA driver and FW") which clarified that the version number field does actually have some undocumented significance to some versions of the firmware.

Fix by providing a host information page via the SET_FEATURE command, incorporating the apparently necessary lies about our identity.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-26  [ena] Specify the unused completion queue MSI-X vector as 0xffffffff  (Michael Brown; 2 files changed, -0/+9)

Some versions of the ENA firmware (observed on a c6i.large instance in eu-west-2) will complain if the completion queue's MSI-X vector field is left empty, even though the queue configuration specifies that interrupts are not used.

Work around these firmware versions by passing in what appears to be the magic "no MSI-X vector" value in this field.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-26  [ena] Allow for out-of-order completions  (Michael Brown; 2 files changed, -20/+62)

The ENA data path design has separate submission and completion queues. Submission queues must be refilled in strict order (since there is only a single linear tail pointer used to communicate the existence of new entries to the hardware), and completion queue entries include a request identifier copied verbatim from the submission queue entry.

Once the submission queue doorbell has been rung, software never again reads from the submission queue entry and nothing ever needs to write back to the submission queue entry since completions are reported via the separate completion queue. This design allows the hardware to complete submission queue entries out of order, provided that it internally caches at least as many entries as it leaves gaps.

Record and identify I/O buffers by request identifier (using a circular ring buffer of unique request identifiers), and remove the assumption that submission queue entries will be completed in order.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
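
A sketch of the "circular ring buffer of unique request identifiers" mentioned above: an identifier is taken from the ring when a buffer is posted and returned to it when the (possibly out-of-order) completion arrives, so the completion's request ID alone is enough to locate the right I/O buffer. Names and sizes are illustrative, and overflow checks are omitted for brevity.

    #include <stdint.h>

    #define NUM_IDS 128                     /* illustrative ring size */

    struct request_ids {
            uint16_t ring[NUM_IDS];         /* currently-free identifiers */
            unsigned int take;              /* where the next free ID is taken */
            unsigned int give;              /* where freed IDs are returned */
            void *iobuf[NUM_IDS];           /* buffer currently owned by each ID */
    };

    static void request_ids_init ( struct request_ids *req ) {
            unsigned int i;

            for ( i = 0 ; i < NUM_IDS ; i++ )
                    req->ring[i] = i;       /* all identifiers start out free */
            req->take = 0;
            req->give = NUM_IDS;
    }

    static uint16_t request_id_take ( struct request_ids *req, void *iobuf ) {
            uint16_t id = req->ring[ req->take++ % NUM_IDS ];

            req->iobuf[id] = iobuf;
            return id;
    }

    static void *request_id_give ( struct request_ids *req, uint16_t id ) {
            void *iobuf = req->iobuf[id];

            req->iobuf[id] = NULL;
            req->ring[ req->give++ % NUM_IDS ] = id;
            return iobuf;
    }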

2022-08-26  [ena] Limit submission queue fill level to completion queue size  (Michael Brown; 2 files changed, -4/+11)

The CREATE_CQ command is permitted to return a size smaller than requested, which could leave us in a situation where the completion queue could overflow.

Avoid overflow by limiting the submission queue fill level to the actual size of the completion queue.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-16  [intelxl] Explicitly request a single queue pair for virtual functions  (Michael Brown; 2 files changed, -1/+58)

Current versions of the E810 PF driver fail to set the number of in-use queue pairs in response to the CONFIG_VSI_QUEUES message. When the number of in-use queue pairs is less than the number of available queue pairs, this results in some packets being directed to nonexistent receive queues and hence silently dropped.

Work around this PF driver bug by explicitly configuring the number of available queue pairs via the REQUEST_QUEUES message. This message triggers a VF reset that, in turn, requires us to reopen the admin queue and issue an additional GET_RESOURCES message to restore the VF to a functional state.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-16  [intelxl] Allow for admin commands that trigger a VF reset  (Michael Brown; 1 file changed, -13/+28)

The RESET_VF admin queue command does not complete via the usual mechanism, but instead requires us to poll registers to wait for the reset to take effect and then reopen the admin queue.

Allow for the existence of other admin queue commands that also trigger a VF reset, by separating out the logic that waits for the reset to complete.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-16  [intelxl] Negotiate virtual function API version 1.1  (Michael Brown; 3 files changed, -3/+31)

Negotiate API version 1.1 in order to allow access to virtual function opcodes that are disallowed by default on the E810.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-16  [intelxl] Show virtual function packet statistics for debugging  (Michael Brown; 2 files changed, -0/+88)
Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-12  [intelxl] Add driver for Intel 100 Gigabit Ethernet NICs  (Michael Brown; 5 files changed, -7/+1569)

Add a driver for the E810 family of 100 Gigabit Ethernet NICs.

The core datapath is identical to that of the 40 Gigabit XL710, and this part of the code is shared between both drivers. The admin queue mechanism is sufficiently similar to make it worth reusing substantial portions of the code, with separate implementations for several commands to handle the (unnecessarily) breaking changes in data structure layouts.

The major differences are in the mechanisms for programming queue contexts (where the E810 abandons TX/RX symmetry) and for configuring the transmit scheduler and receive filters: these portions are sufficiently different to justify a separate driver.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-12  [intelxl] Use admin queue to set port MAC address and maximum frame size  (Michael Brown; 2 files changed, -27/+105)

Remove knowledge of the PRTGL_SA[HL] registers, and instead use the admin queue to set the MAC address and maximum frame size.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-12  [intelxl] Use admin queue to get port MAC address  (Michael Brown; 2 files changed, -51/+82)

Remove knowledge of the PRTPM_SA[HL] registers, and instead use the admin queue to retrieve the MAC address.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-12  [intelxl] Defer fetching MAC address until after opening admin queue  (Michael Brown; 1 file changed, -5/+5)

Allow for the MAC address to be fetched using an admin queue command, instead of reading the PRTPM_SA[HL] registers directly.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-12  [intelxl] Set maximum frame size to 9728 bytes as per datasheet  (Michael Brown; 2 files changed, -10/+6)

The PRTGL_SAH register contains the current maximum frame size, and is not guaranteed on reset to contain the actual maximum frame size supported by the hardware, which the datasheet specifies as 9728 bytes (including the 4-byte CRC).

Set the maximum packet size to a hardcoded 9728 bytes instead of reading from the PRTGL_SAH register.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-11  [intelxl] Always issue "clear PXE mode" admin queue command  (Michael Brown; 2 files changed, -13/+11)

Remove knowledge of the GLLAN_RCTL_0 register (which changes location between the XL710 and E810 register maps), and instead unconditionally issue the "clear PXE mode" command with the EEXIST error silenced.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-11  [intelxl] Allow expected admin queue command errors to be silenced  (Michael Brown; 1 file changed, -3/+7)

The "clear PXE mode" admin queue command will return an EEXIST error if the device is already in non-PXE mode, but there is no other admin queue command that can be used to determine whether the device has already been switched into non-PXE mode.

Provide a mechanism to allow expected errors from a command to be silenced, to allow the "clear PXE mode" command to be cleanly used without needing to first check the GLLAN_RCTL_0 register value.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-11  [intelxl] Increase data buffer size to 4kB  (Michael Brown; 1 file changed, -2/+5)

At least one E810 admin queue command (Query Default Scheduling Tree Topology) insists upon being provided with a 4kB data buffer, even when the data to be returned is much smaller.

Work around this requirement by increasing the admin queue data buffer size to 4kB.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-11  [intelxl] Separate virtual function driver definitions  (Michael Brown; 4 files changed, -259/+320)

Move knowledge of the virtual function data structures and admin command definitions from intelxl.h to intelxlvf.h.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-11  [intelxl] Reuse admin command descriptor and buffer for VF responses  (Michael Brown; 2 files changed, -17/+15)

Remove the large static admin data buffer structure embedded within struct intelxl_nic, and instead copy the response received via the "send to VF" admin queue event to the (already consumed and completed) admin command descriptor and data buffer.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-11  [intelxl] Handle admin events via a callback  (Michael Brown; 3 files changed, -30/+43)

The physical and virtual function drivers each care about precisely one admin queue event type. Simplify event handling by using a per-driver callback instead of the existing weak function symbol.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-10  [intelxl] Rename 8086:1889 PCI ID to "iavf"  (Michael Brown; 1 file changed, -1/+1)

The PCI device ID 8086:1889 is for the Intel Ethernet Adaptive Virtual Function, which is a generic virtual function that can be exposed by different generations of Intel hardware. Rename the PCI ID from "xl710-vf-ad" to "iavf" to reflect that the driver is not XL710-specific.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-10  [intelxl] Increase receive descriptor ring size to 64 entries  (Michael Brown; 1 file changed, -2/+2)

The E810 requires that receive descriptor rings have at least 64 entries (and are a multiple of 32 entries).

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-10  [intelxl] Negotiate API version for virtual function via admin queue  (Michael Brown; 3 files changed, -10/+75)

Do not attempt to use the admin commands to get the firmware version and report the driver version for the virtual function driver, since these will be rejected by the E810 firmware as invalid commands when issued by a virtual function.

Instead, use the mailbox interface to negotiate the API version with the physical function driver.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-10  [intelxl] Use non-zero MSI-X vector for virtual function interrupts  (Michael Brown; 4 files changed, -18/+39)

The 100 Gigabit physical function driver requires a virtual function driver to request that transmit and receive queues are mapped to MSI-X vector 1 or higher.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-10  [intelxl] Fix invocation of intelxlvf_admin_queues()  (Michael Brown; 1 file changed, -1/+1)

The second parameter to intelxlvf_admin_queues() is a boolean used to select the VF opcode, rather than the raw VF opcode itself.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-08  [intelxl] Use function-level reset instead of PFGEN_CTRL.PFSWR  (Michael Brown; 4 files changed, -39/+18)

Remove knowledge of the PFGEN_CTRL register (which changes location between the XL710 and E810 register maps), and instead use PCIe FLR to reset the physical function.

Signed-off-by: Michael Brown <mcb30@ipxe.org>

2022-08-08  [pci] Generalise function-level reset mechanism  (Michael Brown; 3 files changed, -20/+26)
Signed-off-by: Michael Brown <mcb30@ipxe.org>