path: root/src
Age | Commit message | Author | Files | Lines
2015-07-06 | [ipoib] Transmit multicast packets as broadcasts | Michael Brown | 1 | -2/+4
Multicast MAC addresses will never have REMAC cache entries, and the corresponding multicast IPoIB MAC address cannot be obtained simply by issuing an ARP request. For the trivial volume of multicast packets that we expect to send in any realistic scenario, the simplest solution is to send them as broadcasts instead. Reported-by: Wissam Shoukair <wissams@mellanox.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
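The fallback described above can be sketched in a few lines of C. This is only an illustration, not the code from the commit: `is_multicast_mac()` and `select_tx_dest()` are hypothetical names, and the sketch assumes plain 6-byte Ethernet-style addresses.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Multicast (and broadcast) addresses have the low bit of the first octet set */
static bool is_multicast_mac ( const uint8_t *mac ) {
	return ( ( mac[0] & 0x01 ) != 0 );
}

/* Hypothetical helper: pick the destination address to use for transmission */
static void select_tx_dest ( const uint8_t *dest, uint8_t *out ) {
	static const uint8_t broadcast[ETH_ALEN] =
		{ 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	/* Multicast destinations are sent as broadcasts; others are unchanged */
	memcpy ( out, ( is_multicast_mac ( dest ) ? broadcast : dest ),
		 ETH_ALEN );
}
```

Unicast destinations pass through untouched, so they would still go through the normal REMAC cache lookup.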
2015-07-04 | [tcp] Gracefully close connections during shutdown | Michael Brown | 2 | -1/+63
We currently do not wait for a received FIN before exiting to boot a loaded OS. In the common case of booting from an HTTP server, this means that the TCP connection is left consuming resources on the server side: the server will retransmit the FIN several times before giving up. Fix by initiating a graceful close of all TCP connections and waiting (for up to one second) for all connections to finish closing gracefully (i.e. for the outgoing FIN to have been sent and ACKed, and for the incoming FIN to have been received and ACKed at least once). Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-30 | [xen] Wait for and clear XenStore event before receiving data | Michael Brown | 2 | -0/+23
Older, out-of-tree Xen kernel modules (such as those provided with SuSE Linux Enterprise Server 11) do not clear the leftover "event pending" bit when opening an event channel. Consequently, no event is ever delivered to indicate that there is information in the XenStore ring buffer, and the system hangs shortly after loading the xen-platform-pci kernel module. Work around this problem by always waiting for the XenStore event channel to be signalled, and clearing the event before processing the received data. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-29 | [gdb] Allow gdbstub to be started on an arbitrary serial port | Michael Brown | 2 | -7/+44
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-29 | [ipoib] Attempt to generate ARPs as needed to repopulate REMAC cache | Michael Brown | 3 | -8/+57
The only way to map an eIPoIB MAC address (REMAC) to an IPoIB MAC address is to intercept an incoming ARP request or reply. If we do not have an REMAC cache entry for a particular destination MAC address, then we cannot transmit the packet.

This can arise in at least two situations:

- An external program (e.g. a PXE NBP using the UNDI API) may attempt to transmit to a destination MAC address that has been obtained by some method other than ARP.
- Memory pressure may have caused REMAC cache entries to be discarded. This is fairly likely on a busy network, since REMAC cache entries are created for all received (broadcast) ARP requests. (We can't sensibly avoid creating these cache entries, since they are required in order to send an ARP reply, and when we are being used via the UNDI API we may have no knowledge of which IP addresses are "ours".)

Attempt to ameliorate the situation by generating a semi-spurious ARP request whenever we find a missing REMAC cache entry. This will hopefully trigger an ARP reply, which would then provide us with the information required to populate the REMAC cache.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-29 | [ipoib] Mark REMAC cache as expensive | Michael Brown | 1 | -1/+1
As with the neighbour cache, discarding an REMAC cache entry is potentially very disruptive. Originally-fixed-by: Wissam Shoukair <wissams@mellanox.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-29 | [comboot] Implement INT22,0x000c | Wissam Shoukair | 1 | -0/+4
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-29 | [serial] Use new UART abstraction in serial console driver | Michael Brown | 7 | -282/+155
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-29 | [gdb] Use new UART abstraction in GDB serial transport | Michael Brown | 2 | -17/+45
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-29 | [serial] Add general abstraction of a 16550-compatible UART | Michael Brown | 6 | -0/+371
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-29 | [pxe] Always reconstruct packet for PXENV_GET_CACHED_INFO | Michael Brown | 1 | -10/+8
Avoid accidentally returning stale packets (e.g. for a previously attempted network device) by always constructing a fresh DHCP packet. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-25 | [dhcp] Defer discovery if link is blocked | Michael Brown | 1 | -0/+9
If the link is blocked (e.g. due to a Spanning Tree Protocol port not yet forwarding packets) then defer DHCP discovery until the link becomes unblocked. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-25 | [stp] Fix interpretation of hello time | Michael Brown | 1 | -3/+3
Times in STP packets are expressed in units of 1/256 of a second. Signed-off-by: Michael Brown <mcb30@ipxe.org>
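As a worked example of the unit conversion, the following sketch turns an STP time field into milliseconds. The names `STP_TICKS_PER_SEC` and `stp_time_to_ms()` are illustrative only, and the field is assumed to have already been converted from network to host byte order.

```c
#include <stdint.h>

/* STP time fields count in units of 1/256 of a second */
#define STP_TICKS_PER_SEC 256

/* Hypothetical helper: convert a host-order STP time field to milliseconds */
static unsigned long stp_time_to_ms ( uint16_t stp_time ) {
	return ( ( ( unsigned long ) stp_time * 1000 ) / STP_TICKS_PER_SEC );
}
```

A hello time of two seconds is therefore carried as the value 512 (0x0200), not as the value 2.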
2015-06-25 | [stp] Add support for detecting Spanning Tree Protocol non-forwarding ports | Michael Brown | 5 | -0/+233
A fairly common end-user problem is that the default configuration of a switch may leave the port in a non-forwarding state for a substantial length of time (tens of seconds) after link up. This can cause iPXE to time out and give up attempting to boot. We cannot force the switch to start forwarding packets sooner, since any attempt to send a Spanning Tree Protocol bridge PDU may cause the switch to disable our port (if the switch happens to have the Bridge PDU Guard feature enabled for the port). For non-ancient versions of the Spanning Tree Protocol, we can detect whether or not the port is currently forwarding and use this to inform the network device core that the link is currently blocked. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-25 | [netdevice] Add a generic concept of a "blocked link" | Michael Brown | 3 | -2/+70
When Spanning Tree Protocol (STP) is used, there may be a substantial delay (tens of seconds) from the time that the link goes up to the time that the port starts forwarding packets. Add a generic concept of a "blocked link" (i.e. a link which is up but which is not expected to communicate successfully), and allow "ifstat" to indicate when a link is blocked. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-25 | [ethernet] Add minimal support for receiving LLC frames | Michael Brown | 1 | -2/+36
In some Ethernet framing variants the two-byte protocol field is used as a length, with the Ethernet header being followed by an IEEE 802.2 LLC header. The first two bytes of the LLC header are the DSAP and SSAP. If the received Ethernet packet appears to use this framing, then interpret the two-byte DSAP and SSAP as being the network-layer protocol. This allows support for receiving Spanning Tree Protocol frames (which use an LLC header with {DSAP,SSAP}=0x4242) to be added without requiring a full LLC protocol layer. Signed-off-by: Michael Brown <mcb30@ipxe.org>
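The framing check can be illustrated with a short sketch. It is not the code from the commit: `eth_rx_protocol()` is a hypothetical name, and the threshold of 0x0600 (below which the two-byte field is treated as a length) is the conventional EtherType boundary rather than something stated in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN 14           /* destination + source + type/length */
#define ETH_TYPE_MIN 0x0600   /* smaller values are treated as a length */

/*
 * Hypothetical helper: return the network-layer protocol identifier
 * (in host byte order) for a received frame, or -1 if it is too short.
 */
static int32_t eth_rx_protocol ( const uint8_t *frame, size_t len ) {
	uint16_t field;

	if ( len < ETH_HLEN )
		return -1;
	field = ( ( frame[12] << 8 ) | frame[13] );

	/* Ordinary framing: the field is the protocol itself */
	if ( field >= ETH_TYPE_MIN )
		return field;

	/* LLC framing: use the DSAP and SSAP bytes as the protocol */
	if ( len < ( ETH_HLEN + 2 ) )
		return -1;
	return ( ( frame[14] << 8 ) | frame[15] );
}
```

A Spanning Tree Protocol frame would yield 0x4242 here, so it could be dispatched like any other network-layer protocol.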
2015-06-25 | [mromprefix] Report a dummy size at offset 0x02 of .mrom payload | Michael Brown | 1 | -0/+1
The size of the .mrom payload (the second PCI ROM image) is defined in its PCI header. The code type for the .mrom payload image is deliberately set to an invalid value (0xff) to ensure that no BIOS tries to parse anything in the image other than the PCI header. Since the code type is not set to 0x00 ("Intel x86, PC-AT compatible"), bytes 0x02-0x17 should not be interpreted by the BIOS as being in the standard ISA expansion ROM format. In particular, the byte at offset 0x02 does not represent the length of the ROM image (in 512-byte blocks). However, some Dell BIOSes seem to erroneously use the byte at offset 0x02 to determine the length of the .mrom payload when walking the list of PCI ROM images. Since this byte is currently set to zero, this can lead to the BIOS getting stuck in an infinite loop during POST. (This problem may not arise if the .mrom payload is the final image in the ROM, since the BIOS will then have no reason to attempt to locate the next image.) One possible workaround would be to put the real payload size in this byte, but doing so would constrain the .mrom payload size to 128kB (see commit 8049a52 ("[mromprefix] Allow for .mrom images larger than 128kB") for more details). Another possible workaround would be to put the real payload size as a word in bytes 0x02-0x03 (as is done for EFI ROMs). This would not constrain the .mrom payload size, but a payload size which happened to be exactly 128kB would result in a zero value in the byte at offset 0x02 and so could still result in infinite loops on BIOSes with this bug. We choose to place a fixed value of 0x01 in the byte at offset 0x02. This should at least prevent the BIOS from getting stuck in an infinite loop. (The BIOS may walk into the middle of the .mrom payload, where it will almost certainly not find a valid {0x55,0xaa} signature or a valid PCIR header, and will therefore hopefully abort processing.) 
Reported-by: Wissam Shoukair <wissams@mellanox.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
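The arithmetic behind the rejected workarounds can be checked with a small sketch. `rom_size_byte()` is a hypothetical helper that computes what the byte at offset 0x02 would contain if the real size (in 512-byte blocks) were stored there, whether as a single byte or as the low byte of a little-endian word.

```c
#include <stddef.h>
#include <stdint.h>

#define ROM_BLOCK_SIZE 512

/*
 * Hypothetical helper: value that would appear in the byte at offset
 * 0x02 if the payload length in 512-byte blocks were stored there.
 */
static uint8_t rom_size_byte ( size_t payload_len ) {
	size_t blocks = ( ( payload_len + ROM_BLOCK_SIZE - 1 ) /
			  ROM_BLOCK_SIZE );

	/* Only the low eight bits fit in the byte at offset 0x02 */
	return ( ( uint8_t ) blocks );
}
```

A single byte can describe at most 255 blocks (just under 128kB), and a payload of exactly 128kB is 256 blocks, which leaves zero in this byte. A fixed value of 0x01 avoids both problems.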
2015-06-25 | [tcp] Do not shrink window when discarding received packets | Michael Brown | 1 | -20/+3
We currently shrink the TCP window permanently if we are ever forced (by a low-memory condition) to discard a previously received TCP packet. This behaviour was intended to reduce the number of retransmissions in a lossy network, since lost packets might potentially result in the entire window contents being retransmitted. Since commit e0fc8fe ("[tcp] Implement support for TCP Selective Acknowledgements (SACK)") the cost of lost packets has been reduced by around one order of magnitude, and the reduction in the window size (which affects the maximum throughput) is now the more significant cost. Remove the code which reduces the TCP maximum window size when a received packet is discarded. Reported-by: Wissam Shoukair <wissams@mellanox.com> Tested-by: Wissam Shoukair <wissams@mellanox.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-23 | [pci] Use flat real mode to call INT 1a,b101 | Michael Brown | 1 | -1/+5
Some HP BIOSes (observed with an HP ProLiant m710p Server Cartridge) have a bug in the implementation of INT 1a,b101: they blithely assume that real-mode code is able to read from anywhere in the 32-bit memory space. This problem affects the call to INT 1a,b101 made from within pcibios_num_bus() (which uses REAL_CODE() and hence executes in genuine real mode) but does not affect the call made from within romprefix.S (since with a PMM BIOS, that call executes in flat real mode anyway). Work around the problem by explicitly calling flatten_real_mode() before invoking INT 1a,b101. This is a rarely-used code path, and so the extra overhead of emulating instructions in some VM configurations (see commit 6d4deee ("[librm] Use genuine real mode to accelerate operation in virtual machines") for more details) is negligible. Reported-by: Wissam Shoukair <wissams@mellanox.com> Debugged-by: Wissam Shoukair <wissams@mellanox.com> Debugged-by: Michael Brown <mcb30@ipxe.org> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-18 | [xhci] Ignore invalid protocol speed ID values on Intel Skylake platforms | Michael Brown | 2 | -3/+9
Some Intel Skylake platforms (observed on a prototype Lenovo ThinkPad) report the list of available USB3 protocol speed ID values as {1,2,3} but then report a port's speed using ID value 4. The value 4 happens to be the default value for SuperSpeed (when no protocol speed ID value list is explicitly defined), and the hardware seems to function correctly if we simply ignore its protocol speed ID table and assume that it uses the default values. Fix by adding a "broken PSI values" quirk for this controller. Signed-off-by: Michael Brown <mcb30@ipxe.org>
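A rough sketch of how such a quirk might alter the speed lookup is shown below. All names here (`QUIRK_BROKEN_PSI`, `struct psi_entry`, `port_speed()`) are hypothetical. The commit message only states that the default value 4 means SuperSpeed; the other default values (1 = full speed, 2 = low speed, 3 = high speed) are taken from the xHCI specification's default protocol speed ID assignments.

```c
#include <stddef.h>

enum speed_sketch {
	SPEED_NONE = 0,
	SPEED_FULL,
	SPEED_LOW,
	SPEED_HIGH,
	SPEED_SUPER,
};

/* Hypothetical quirk flag: ignore the reported protocol speed ID table */
#define QUIRK_BROKEN_PSI 0x01

/* One entry of a controller-reported protocol speed ID table */
struct psi_entry {
	unsigned int psiv;
	enum speed_sketch speed;
};

/* Default mapping, used when no table is defined (or it is ignored) */
static enum speed_sketch default_speed ( unsigned int psiv ) {
	switch ( psiv ) {
	case 1: return SPEED_FULL;
	case 2: return SPEED_LOW;
	case 3: return SPEED_HIGH;
	case 4: return SPEED_SUPER;
	default: return SPEED_NONE;
	}
}

/* Look up a port's speed from the ID value reported for that port */
static enum speed_sketch port_speed ( const struct psi_entry *table,
				      size_t count, unsigned int quirks,
				      unsigned int psiv ) {
	size_t i;

	/* Fall back to defaults if there is no table or it is untrusted */
	if ( ( count == 0 ) || ( quirks & QUIRK_BROKEN_PSI ) )
		return default_speed ( psiv );

	for ( i = 0 ; i < count ; i++ ) {
		if ( table[i].psiv == psiv )
			return table[i].speed;
	}
	return SPEED_NONE;
}
```

With a table listing only ID values {1,2,3}, a port reporting ID value 4 cannot be resolved unless the quirk is set, in which case the default mapping identifies it as SuperSpeed.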
2015-06-18 | [xhci] Record device-specific quirks in xHCI device structure | Michael Brown | 2 | -3/+6
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-01 | [ipoib] Fix REMAC cache discarder | Michael Brown | 1 | -3/+11
Originally-fixed-by: Wissam Shoukair <wissams@mellanox.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-01 | [xhci] Fix comparison of signed and unsigned integers | Michael Brown | 1 | -1/+1
gcc 4.8.2 fails to report this erroneous comparison unless assertions are enabled. Reported-by: Mary-Ann Johnson <MaryAnn.Johnson@displaylink.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-01 | [build] Fix .ids.o creation for drivers not in the all-drivers build | Michael Brown | 2 | -2/+2
Commit dc19e63 ("[build] Construct all-drivers list based on driver class") accidentally excluded the USB bus drivers from the list of files parsed in order to create PCI 3.0 device ID lists. Fix by returning $(DRIVERS) to its previous definition as a list of all driver files, and use only $(DRIVERS_ipxe) to contain the filtered list containing only those drivers which we want to include in the "all-drivers" build. Reported-by: Mary-Ann Johnson <MaryAnn.Johnson@displaylink.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-06-01 | [xhci] Fix length of allocated slot array | Michael Brown | 1 | -2/+3
The xHCI slot ID is one-based, not zero-based. Fix the length of the xhci->slot[] array to account for this, and add assertions to check that the hardware returns a valid slot ID in response to the Enable Slot command. Signed-off-by: Michael Brown <mcb30@ipxe.org>
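The off-by-one can be illustrated with a simplified sketch. The structure and function names below are hypothetical stand-ins, not the driver's real definitions: the point is only that an array indexed directly by a one-based ID needs one more element than the number of slots.

```c
#include <stdlib.h>

/* Simplified stand-in for a table indexed directly by slot ID */
struct slot_table {
	unsigned int max_slots;  /* highest valid slot ID */
	void **slot;             /* indexed by slot ID; entry 0 is unused */
};

/* Allocate room for slot IDs 1..max_slots inclusive */
static int slot_table_alloc ( struct slot_table *table,
			      unsigned int max_slots ) {
	table->slot = calloc ( ( max_slots + 1 ), sizeof ( table->slot[0] ) );
	if ( ! table->slot )
		return -1;
	table->max_slots = max_slots;
	return 0;
}

/* Check a slot ID before using it as an index */
static int slot_id_valid ( const struct slot_table *table,
			   unsigned int id ) {
	return ( ( id >= 1 ) && ( id <= table->max_slots ) );
}
```

Allocating only `max_slots` entries would make an access to the highest valid slot ID fall outside the array.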
2015-05-20 | [neighbour] Return success when deferring a packet | Michael Brown | 1 | -1/+1
Deferral of a packet for neighbour discovery is not really an error. If we fail to discover a neighbour then the failure will eventually be reported by the call to neighbour_destroy() when any outstanding I/O buffers are discarded. The current behaviour breaks PXE booting on FreeBSD, which seems to treat the error return from PXENV_UDP_WRITE as a fatal error and so never proceeds to poll PXENV_UDP_READ (and hence never allows iPXE to receive the ARP reply and send the deferred UDP packet). Change neighbour_tx() to return success when deferring a packet. This fixes interoperability with FreeBSD and removes transient neighbour cache misses from the "ifstat" error output, while leaving genuine neighbour discovery failures visible via "ifstat" (once neighbour discovery times out, or the interface is closed). Debugged-by: Wissam Shoukair <wissams@mellanox.com> Tested-by: Wissam Shoukair <wissams@mellanox.com> Signed-off-by: Michael Brown <mcb30@ipxe.org>
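The change in return-value semantics might be sketched as follows. This is a simplified model with hypothetical names (`struct neighbour_sketch`, `neighbour_tx_sketch()`), not the real neighbour_tx() implementation. The fixed-size queue and the error returned when it is full are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 4

/* Simplified model of a neighbour cache entry */
struct neighbour_sketch {
	bool resolved;                        /* link-layer address known */
	const void *deferred[MAX_DEFERRED];   /* packets awaiting discovery */
	size_t deferred_count;
	size_t sent;                          /* packets transmitted */
};

/* Transmit a packet, or defer it until discovery completes */
static int neighbour_tx_sketch ( struct neighbour_sketch *neighbour,
				 const void *packet ) {
	if ( neighbour->resolved ) {
		neighbour->sent++;
		return 0;
	}

	/* Assumption for this sketch: a full queue is a genuine error */
	if ( neighbour->deferred_count >= MAX_DEFERRED )
		return -1;

	/* Deferral is not an error: report success to the caller */
	neighbour->deferred[neighbour->deferred_count++] = packet;
	return 0;
}
```

A caller such as a PXE NBP then sees a successful write and carries on polling for received packets, which is what allows the ARP reply to arrive and the deferred packet to be sent.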
2015-05-19 | [intel] Fix operation when physical function has jumbo frames enabled | Michael Brown | 4 | -2/+134
When jumbo frames are enabled, the Linux ixgbe physical function driver will disable the virtual function's receive datapath by default, and will enable it only if the virtual function negotiates API version 1.1 (or higher) and explicitly selects an MTU. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-19 | [intel] Add intelxvf_stats() to dump packet statistics registers | Michael Brown | 2 | -0/+46
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-19 | [int13con] Add basic ability to log to a local disk via INT 13 | Michael Brown | 5 | -6/+304
Several popular public cloud providers do not provide any sensible mechanism for obtaining debug output from an OS which is failing to boot. For example, Amazon EC2 provides the "Get System Log" facility, which occasionally deigns to report a random subset of the characters emitted via the VM's serial port, but usually returns only a blank screen. (Amazingly, this is still superior to the debugging facilities provided by Azure.)

Work around these shortcomings by adding a console type which sends output to a magically detected raw disk partition, and including such a partition within any iPXE .usb-format image.

To use this facility:

- build an iPXE .usb image with CONSOLE_INT13 enabled
- boot the cloud VM from this image
- after the boot fails, attach the VM's boot disk to a second VM
- from this second VM, use "less -f -R /dev/sdb3" (or similar) to view the iPXE output.

Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-16 | [intel] Add intelxvf driver for Intel 10 GigE virtual function NICs | Michael Brown | 3 | -0/+455
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-16 | [intel] Add support for mailbox used by virtual functions | Michael Brown | 4 | -0/+414
Virtual functions use a mailbox to communicate with the physical function driver: this covers functionality such as obtaining the MAC address. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-16 | [intel] Allow for the use of advanced TX descriptors | Michael Brown | 3 | -42/+126
Intel virtual function NICs almost work with the use of "legacy" transmit and receive descriptors (which are backwards compatible right back to the original Intel Gigabit NICs). Unfortunately the "TX switching" feature (which allows for VM<->VM traffic to be looped back within the NIC itself) does not work when a legacy TX descriptor is used: the packet is instead sent onto the wire. Fix by allowing for the use of an "advanced" TX descriptor (containing exactly the same information as is found in the "legacy" descriptor). Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-15 | [intel] Expose intel_diag() for use by other Intel NIC drivers | Michael Brown | 2 | -26/+19
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-13 | [usb] Always clear recorded disconnections after performing hotplug actions | Michael Brown | 1 | -7/+7
The recorded disconnections (in port->disconnected) will currently be left uncleared if usb_attached() returns an error (e.g. because there are no drivers for a particular USB device). This is incorrect behaviour: the disconnection has been handled and the record should be cleared until the next physical disconnection is detected (via the CSC bit). The problem is masked for EHCI, UHCI, and USB hubs, since these will report a changed port (via usb_port_changed()) only when the underlying hardware reports a change. xHCI will call usb_port_changed() in response to any port status event, at which point the stale value of port->disconnected will be erroneously acted upon. This can lead to an endless loop of repeatedly enumerating the same device when a driverless device is attached to an xHCI root hub port. Fix by unconditionally clearing port->disconnected in usb_hotplugged(). Reported-by: Robin Smidsrød <robin@smidsrod.no> Tested-by: Robin Smidsrød <robin@smidsrod.no> Signed-off-by: Michael Brown <mcb30@ipxe.org>
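The control flow of the fix can be modelled with a small sketch. The types and functions below (`struct port_sketch`, `attach_sketch()`, `hotplugged_sketch()`) are simplified stand-ins rather than the real USB core code. The key point is that the disconnection record is cleared whether or not attachment succeeds.

```c
#include <stdbool.h>

/* Simplified model of a USB port */
struct port_sketch {
	bool disconnected;  /* a disconnection was seen and not yet handled */
	bool attached;      /* a device is currently attached */
};

/* Stand-in for device attachment; fails when no driver is available */
static int attach_sketch ( struct port_sketch *port, bool have_driver ) {
	if ( ! have_driver )
		return -1;
	port->attached = true;
	return 0;
}

/* Handle a hotplug event on a port */
static void hotplugged_sketch ( struct port_sketch *port, bool connected,
				bool have_driver ) {
	/* Handle any recorded disconnection first */
	if ( port->disconnected )
		port->attached = false;

	/* Attach the device now present, if any */
	if ( connected && ! port->attached )
		( void ) attach_sketch ( port, have_driver );

	/* Clear the record unconditionally, even if attachment failed */
	port->disconnected = false;
}
```

In this model, a driverless device leaves the port unattached but with no stale disconnection record, so a later unrelated port status event would not trigger another enumeration attempt on that basis.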
2015-05-13 | [usb] Do not call usb_hotplug() when registering a new hub | Michael Brown | 1 | -3/+3
The action of registering a new hub can itself happen in only two ways: either a new USB hub has been created (in which case we are already inside a call to usb_hotplug()), or a new root hub has been created.

In the former case, we do not need to issue a further call to usb_hotplug(), since the hub's ports will all be marked as changed and so will be handled after the return from register_usb_hub() anyway. Calling usb_hotplug() within register_usb_hub() leads to a confusing order of events, such as:

- root hub port 1 detects a change
- root hub port 2 detects a change
- usb_hotplug() is called
- root hub port 1 finds a USB hub
- usb_hotplug() is called
- this inner call to usb_hotplug() handles root hub port 2

Fix by calling usb_hotplug() only from usb_step() and from register_usb_bus(). This avoids recursive calls to usb_hotplug() and ensures that devices are enumerated in the order of detection.

Tested-by: Robin Smidsrød <robin@smidsrod.no>
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-12 | [usb] Add basic support for USB keyboards | Michael Brown | 7 | -0/+680
When USB network card drivers are used, the BIOS' legacy USB capability is necessarily disabled since there is no way to share the host controller between the BIOS and iPXE. This currently results in USB keyboards becoming non-functional in USB-enabled builds of iPXE. Fix by adding basic support for USB keyboards, enabled by default in iPXE builds which include USB support. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-12 | [usb] Add generic USB human interface device (HID) framework | Michael Brown | 3 | -0/+258
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-11 | [usb] Add USB_INTERRUPT_OUT internal type | Michael Brown | 4 | -5/+9
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-11 | [ipv6] Disambiguate received ICMPv6 errors | Michael Brown | 2 | -2/+90
Originally-implemented-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-10 | [uhci] Use meaningful device names in debug messages | Michael Brown | 2 | -15/+21
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-10 | [ehci] Use meaningful device names in debug messages | Michael Brown | 2 | -43/+52
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-10 | [xhci] Use meaningful device names in debug messages | Michael Brown | 2 | -119/+124
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-09 | [usb] Provide usb_endpoint_name() for use by host controller drivers | Michael Brown | 2 | -33/+30
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-09 | [uhci] Add support for UHCI host controllers | Michael Brown | 7 | -0/+1937
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-09 | [ehci] Allow UHCI/OHCI controllers to locate the EHCI companion controller | Michael Brown | 3 | -0/+29
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-09 | [usb] Add find_usb_bus_by_location() helper function | Michael Brown | 2 | -0/+22
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-09 | [ehci] Poll child companion controllers after disowning port | Michael Brown | 2 | -0/+59
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-09 | [usb] Maintain single lists of halted endpoints and changed ports | Michael Brown | 2 | -50/+55
When an EHCI hotplug action results in the controller disowning the port, it will result in a hotplug action on the corresponding UHCI or OHCI controller. Allow such hotplug actions to be carried out as part of the same call to usb_step() or register_usb_bus(), by maintaining a single central list of changed ports. Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-09 | [usb] Maintain a list of all USB buses | Michael Brown | 2 | -0/+18
Signed-off-by: Michael Brown <mcb30@ipxe.org>
2015-05-08 | [usb] Detect missed disconnections | Michael Brown | 5 | -51/+78
The USB core will currently fail to detect disconnections if a new device has attached by the time the port is examined in usb_hotplug(). Fix by recording the fact that a disconnection has taken place whenever the "connection status changed" (CSC) bit is observed to be set. (Whether the change represents a disconnection or a reconnection, it indicates that the port has experienced some time of being disconnected.) Note that the time at which a disconnection can be detected varies by hub type. In particular: root hubs can observe the CSC bit when polling, and so will record the disconnection before calling usb_port_changed(), but USB hubs read the port status (and hence the CSC bit) only during the call to hub_speed(), long after the call to usb_port_changed(). Signed-off-by: Michael Brown <mcb30@ipxe.org>