From 189c099f75f39da1c1a0f3e527109af2b169a8fe Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Alex=20Benn=C3=A9e?=
Date: Wed, 21 Jul 2021 00:26:36 +0100
Subject: docs: collect the disparate device emulation docs into one section
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

While we are at it add a brief preamble that explains some of the
common concepts in QEMU's device emulation which will hopefully lead
to less confusing about our dizzying command line options.

Signed-off-by: Alex Bennée
Reviewed-by: Markus Armbruster
Cc: Paolo Bonzini
Cc: Daniel P. Berrangé
Cc: Eduardo Habkost
Message-Id: <20210720232703.10650-3-alex.bennee@linaro.org>
---
 docs/system/device-emulation.rst    |  89 ++++++++++++++
 docs/system/devices/ivshmem.rst     |  64 ++++++++++
 docs/system/devices/net.rst         | 100 +++++++++++++++
 docs/system/devices/nvme.rst        | 237 ++++++++++++++++++++++++++++++++++++
 docs/system/devices/usb.rst         | 140 +++++++++++++++++++++
 docs/system/devices/virtio-pmem.rst |  76 ++++++++++++
 docs/system/index.rst               |   6 +-
 docs/system/ivshmem.rst             |  64 ----------
 docs/system/net.rst                 | 100 ---------------
 docs/system/nvme.rst                | 237 ------------------------------------
 docs/system/usb.rst                 | 140 ---------------------
 docs/system/virtio-pmem.rst         |  76 ------------
 12 files changed, 707 insertions(+), 622 deletions(-)
 create mode 100644 docs/system/device-emulation.rst
 create mode 100644 docs/system/devices/ivshmem.rst
 create mode 100644 docs/system/devices/net.rst
 create mode 100644 docs/system/devices/nvme.rst
 create mode 100644 docs/system/devices/usb.rst
 create mode 100644 docs/system/devices/virtio-pmem.rst
 delete mode 100644 docs/system/ivshmem.rst
 delete mode 100644 docs/system/net.rst
 delete mode 100644 docs/system/nvme.rst
 delete mode 100644 docs/system/usb.rst
 delete mode 100644 docs/system/virtio-pmem.rst

diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulation.rst
new file mode 100644
index 0000000..8adf05f
--- /dev/null +++ b/docs/system/device-emulation.rst @@ -0,0 +1,89 @@ +.. _device-emulation: + +Device Emulation +---------------- + +QEMU supports the emulation of a large number of devices from +peripherals such network cards and USB devices to integrated systems +on a chip (SoCs). Configuration of these is often a source of +confusion so it helps to have an understanding of some of the terms +used to describes devices within QEMU. + +Common Terms +~~~~~~~~~~~~ + +Device Front End +================ + +A device front end is how a device is presented to the guest. The type +of device presented should match the hardware that the guest operating +system is expecting to see. All devices can be specified with the +``--device`` command line option. Running QEMU with the command line +options ``--device help`` will list all devices it is aware of. Using +the command line ``--device foo,help`` will list the additional +configuration options available for that device. + +A front end is often paired with a back end, which describes how the +host's resources are used in the emulation. + +Device Buses +============ + +Most devices will exist on a BUS of some sort. Depending on the +machine model you choose (``-M foo``) a number of buses will have been +automatically created. In most cases the BUS a device is attached to +can be inferred, for example PCI devices are generally automatically +allocated to the next free address of first PCI bus found. However in +complicated configurations you can explicitly specify what bus +(``bus=ID``) a device is attached to along with its address +(``addr=N``). + +Some devices, for example a PCI SCSI host controller, will add an +additional buses to the system that other devices can be attached to. +A hypothetical chain of devices might look like: + + --device foo,bus=pci.0,addr=0,id=foo + --device bar,bus=foo.0,addr=1,id=baz + +which would be a bar device (with the ID of baz) which is attached to +the first foo bus (foo.0) at address 1. 
The foo device which provides +that bus is itself is attached to the first PCI bus (pci.0). + + +Device Back End +=============== + +The back end describes how the data from the emulated device will be +processed by QEMU. The configuration of the back end is usually +specific to the class of device being emulated. For example serial +devices will be backed by a ``--chardev`` which can redirect the data +to a file or socket or some other system. Storage devices are handled +by ``--blockdev`` which will specify how blocks are handled, for +example being stored in a qcow2 file or accessing a raw host disk +partition. Back ends can sometimes be stacked to implement features +like snapshots. + +While the choice of back end is generally transparent to the guest, +there are cases where features will not be reported to the guest if +the back end is unable to support it. + +Device Pass Through +=================== + +Device pass through is where the device is actually given access to +the underlying hardware. This can be as simple as exposing a single +USB device on the host system to the guest or dedicating a video card +in a PCI slot to the exclusive use of the guest. + + +Emulated Devices +~~~~~~~~~~~~~~~~ + +.. toctree:: + :maxdepth: 1 + + devices/ivshmem.rst + devices/net.rst + devices/nvme.rst + devices/usb.rst + devices/virtio-pmem.rst diff --git a/docs/system/devices/ivshmem.rst b/docs/system/devices/ivshmem.rst new file mode 100644 index 0000000..b03a48a --- /dev/null +++ b/docs/system/devices/ivshmem.rst @@ -0,0 +1,64 @@ +.. _pcsys_005fivshmem: + +Inter-VM Shared Memory device +----------------------------- + +On Linux hosts, a shared memory device is available. The basic syntax +is: + +.. parsed-literal:: + + |qemu_system_x86| -device ivshmem-plain,memdev=hostmem + +where hostmem names a host memory backend. 
For a POSIX shared memory +backend, use something like + +:: + + -object memory-backend-file,size=1M,share,mem-path=/dev/shm/ivshmem,id=hostmem + +If desired, interrupts can be sent between guest VMs accessing the same +shared memory region. Interrupt support requires using a shared memory +server and using a chardev socket to connect to it. The code for the +shared memory server is qemu.git/contrib/ivshmem-server. An example +syntax when using the shared memory server is: + +.. parsed-literal:: + + # First start the ivshmem server once and for all + ivshmem-server -p pidfile -S path -m shm-name -l shm-size -n vectors + + # Then start your qemu instances with matching arguments + |qemu_system_x86| -device ivshmem-doorbell,vectors=vectors,chardev=id + -chardev socket,path=path,id=id + +When using the server, the guest will be assigned a VM ID (>=0) that +allows guests using the same server to communicate via interrupts. +Guests can read their VM ID from a device register (see +ivshmem-spec.txt). + +Migration with ivshmem +~~~~~~~~~~~~~~~~~~~~~~ + +With device property ``master=on``, the guest will copy the shared +memory on migration to the destination host. With ``master=off``, the +guest will not be able to migrate with the device attached. In the +latter case, the device should be detached and then reattached after +migration using the PCI hotplug support. + +At most one of the devices sharing the same memory can be master. The +master must complete migration before you plug back the other devices. + +ivshmem and hugepages +~~~~~~~~~~~~~~~~~~~~~ + +Instead of specifying the using POSIX shm, you may specify a +memory backend that has hugepage support: + +.. parsed-literal:: + + |qemu_system_x86| -object memory-backend-file,size=1G,mem-path=/dev/hugepages/my-shmem-file,share,id=mb1 + -device ivshmem-plain,memdev=mb1 + +ivshmem-server also supports hugepages mount points with the ``-m`` +memory path argument. 
diff --git a/docs/system/devices/net.rst b/docs/system/devices/net.rst new file mode 100644 index 0000000..4b2640c --- /dev/null +++ b/docs/system/devices/net.rst @@ -0,0 +1,100 @@ +.. _pcsys_005fnetwork: + +Network emulation +----------------- + +QEMU can simulate several network cards (e.g. PCI or ISA cards on the PC +target) and can connect them to a network backend on the host or an +emulated hub. The various host network backends can either be used to +connect the NIC of the guest to a real network (e.g. by using a TAP +devices or the non-privileged user mode network stack), or to other +guest instances running in another QEMU process (e.g. by using the +socket host network backend). + +Using TAP network interfaces +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This is the standard way to connect QEMU to a real network. QEMU adds a +virtual network device on your host (called ``tapN``), and you can then +configure it as if it was a real ethernet card. + +Linux host +^^^^^^^^^^ + +As an example, you can download the ``linux-test-xxx.tar.gz`` archive +and copy the script ``qemu-ifup`` in ``/etc`` and configure properly +``sudo`` so that the command ``ifconfig`` contained in ``qemu-ifup`` can +be executed as root. You must verify that your host kernel supports the +TAP network interfaces: the device ``/dev/net/tun`` must be present. + +See :ref:`sec_005finvocation` to have examples of command +lines using the TAP network interfaces. + +Windows host +^^^^^^^^^^^^ + +There is a virtual ethernet driver for Windows 2000/XP systems, called +TAP-Win32. But it is not included in standard QEMU for Windows, so you +will need to get it separately. It is part of OpenVPN package, so +download OpenVPN from : https://openvpn.net/. 
+ +Using the user mode network stack +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +By using the option ``-net user`` (default configuration if no ``-net`` +option is specified), QEMU uses a completely user mode network stack +(you don't need root privilege to use the virtual network). The virtual +network configuration is the following:: + + guest (10.0.2.15) <------> Firewall/DHCP server <-----> Internet + | (10.0.2.2) + | + ----> DNS server (10.0.2.3) + | + ----> SMB server (10.0.2.4) + +The QEMU VM behaves as if it was behind a firewall which blocks all +incoming connections. You can use a DHCP client to automatically +configure the network in the QEMU VM. The DHCP server assign addresses +to the hosts starting from 10.0.2.15. + +In order to check that the user mode network is working, you can ping +the address 10.0.2.2 and verify that you got an address in the range +10.0.2.x from the QEMU virtual DHCP server. + +Note that ICMP traffic in general does not work with user mode +networking. ``ping``, aka. ICMP echo, to the local router (10.0.2.2) +shall work, however. If you're using QEMU on Linux >= 3.0, it can use +unprivileged ICMP ping sockets to allow ``ping`` to the Internet. The +host admin has to set the ping_group_range in order to grant access to +those sockets. To allow ping for GID 100 (usually users group):: + + echo 100 100 > /proc/sys/net/ipv4/ping_group_range + +When using the built-in TFTP server, the router is also the TFTP server. + +When using the ``'-netdev user,hostfwd=...'`` option, TCP or UDP +connections can be redirected from the host to the guest. It allows for +example to redirect X11, telnet or SSH connections. + +Hubs +~~~~ + +QEMU can simulate several hubs. A hub can be thought of as a virtual +connection between several network devices. These devices can be for +example QEMU virtual ethernet cards or virtual Host ethernet devices +(TAP devices). 
You can connect guest NICs or host network backends to +such a hub using the ``-netdev +hubport`` or ``-nic hubport`` options. The legacy ``-net`` option also +connects the given device to the emulated hub with ID 0 (i.e. the +default hub) unless you specify a netdev with ``-net nic,netdev=xxx`` +here. + +Connecting emulated networks between QEMU instances +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Using the ``-netdev socket`` (or ``-nic socket`` or ``-net socket``) +option, it is possible to create emulated networks that span several +QEMU instances. See the description of the ``-netdev socket`` option in +:ref:`sec_005finvocation` to have a basic +example. diff --git a/docs/system/devices/nvme.rst b/docs/system/devices/nvme.rst new file mode 100644 index 0000000..bff72d1 --- /dev/null +++ b/docs/system/devices/nvme.rst @@ -0,0 +1,237 @@ +============== +NVMe Emulation +============== + +QEMU provides NVMe emulation through the ``nvme``, ``nvme-ns`` and +``nvme-subsys`` devices. + +See the following sections for specific information on + + * `Adding NVMe Devices`_, `additional namespaces`_ and `NVM subsystems`_. + * Configuration of `Optional Features`_ such as `Controller Memory Buffer`_, + `Simple Copy`_, `Zoned Namespaces`_, `metadata`_ and `End-to-End Data + Protection`_, + +Adding NVMe Devices +=================== + +Controller Emulation +-------------------- + +The QEMU emulated NVMe controller implements version 1.4 of the NVM Express +specification. All mandatory features are implement with a couple of exceptions +and limitations: + + * Accounting numbers in the SMART/Health log page are reset when the device + is power cycled. + * Interrupt Coalescing is not supported and is disabled by default. + +The simplest way to attach an NVMe controller on the QEMU PCI bus is to add the +following parameters: + +.. 
code-block:: console + + -drive file=nvm.img,if=none,id=nvm + -device nvme,serial=deadbeef,drive=nvm + +There are a number of optional general parameters for the ``nvme`` device. Some +are mentioned here, but see ``-device nvme,help`` to list all possible +parameters. + +``max_ioqpairs=UINT32`` (default: ``64``) + Set the maximum number of allowed I/O queue pairs. This replaces the + deprecated ``num_queues`` parameter. + +``msix_qsize=UINT16`` (default: ``65``) + The number of MSI-X vectors that the device should support. + +``mdts=UINT8`` (default: ``7``) + Set the Maximum Data Transfer Size of the device. + +``use-intel-id`` (default: ``off``) + Since QEMU 5.2, the device uses a QEMU allocated "Red Hat" PCI Device and + Vendor ID. Set this to ``on`` to revert to the unallocated Intel ID + previously used. + +Additional Namespaces +--------------------- + +In the simplest possible invocation sketched above, the device only support a +single namespace with the namespace identifier ``1``. To support multiple +namespaces and additional features, the ``nvme-ns`` device must be used. + +.. code-block:: console + + -device nvme,id=nvme-ctrl-0,serial=deadbeef + -drive file=nvm-1.img,if=none,id=nvm-1 + -device nvme-ns,drive=nvm-1 + -drive file=nvm-2.img,if=none,id=nvm-2 + -device nvme-ns,drive=nvm-2 + +The namespaces defined by the ``nvme-ns`` device will attach to the most +recently defined ``nvme-bus`` that is created by the ``nvme`` device. Namespace +identifers are allocated automatically, starting from ``1``. + +There are a number of parameters available: + +``nsid`` (default: ``0``) + Explicitly set the namespace identifier. + +``uuid`` (default: *autogenerated*) + Set the UUID of the namespace. This will be reported as a "Namespace UUID" + descriptor in the Namespace Identification Descriptor List. + +``eui64`` + Set the EUI-64 of the namespace. 
This will be reported as a "IEEE Extended + Unique Identifier" descriptor in the Namespace Identification Descriptor List. + Since machine type 6.1 a non-zero default value is used if the parameter + is not provided. For earlier machine types the field defaults to 0. + +``bus`` + If there are more ``nvme`` devices defined, this parameter may be used to + attach the namespace to a specific ``nvme`` device (identified by an ``id`` + parameter on the controller device). + +NVM Subsystems +-------------- + +Additional features becomes available if the controller device (``nvme``) is +linked to an NVM Subsystem device (``nvme-subsys``). + +The NVM Subsystem emulation allows features such as shared namespaces and +multipath I/O. + +.. code-block:: console + + -device nvme-subsys,id=nvme-subsys-0,nqn=subsys0 + -device nvme,serial=a,subsys=nvme-subsys-0 + -device nvme,serial=b,subsys=nvme-subsys-0 + +This will create an NVM subsystem with two controllers. Having controllers +linked to an ``nvme-subsys`` device allows additional ``nvme-ns`` parameters: + +``shared`` (default: ``off``) + Specifies that the namespace will be attached to all controllers in the + subsystem. If set to ``off`` (the default), the namespace will remain a + private namespace and may only be attached to a single controller at a time. + +``detached`` (default: ``off``) + If set to ``on``, the namespace will be be available in the subsystem, but + not attached to any controllers initially. + +Thus, adding + +.. code-block:: console + + -drive file=nvm-1.img,if=none,id=nvm-1 + -device nvme-ns,drive=nvm-1,nsid=1,shared=on + -drive file=nvm-2.img,if=none,id=nvm-2 + -device nvme-ns,drive=nvm-2,nsid=3,detached=on + +will cause NSID 1 will be a shared namespace (due to ``shared=on``) that is +initially attached to both controllers. NSID 3 will be a private namespace +(i.e. only attachable to a single controller at a time) and will not be +attached to any controller initially (due to ``detached=on``). 
+ +Optional Features +================= + +Controller Memory Buffer +------------------------ + +``nvme`` device parameters related to the Controller Memory Buffer support: + +``cmb_size_mb=UINT32`` (default: ``0``) + This adds a Controller Memory Buffer of the given size at offset zero in BAR + 2. + +``legacy-cmb`` (default: ``off``) + By default, the device uses the "v1.4 scheme" for the Controller Memory + Buffer support (i.e, the CMB is initially disabled and must be explicitly + enabled by the host). Set this to ``on`` to behave as a v1.3 device wrt. the + CMB. + +Simple Copy +----------- + +The device includes support for TP 4065 ("Simple Copy Command"). A number of +additional ``nvme-ns`` device parameters may be used to control the Copy +command limits: + +``mssrl=UINT16`` (default: ``128``) + Set the Maximum Single Source Range Length (``MSSRL``). This is the maximum + number of logical blocks that may be specified in each source range. + +``mcl=UINT32`` (default: ``128``) + Set the Maximum Copy Length (``MCL``). This is the maximum number of logical + blocks that may be specified in a Copy command (the total for all source + ranges). + +``msrc=UINT8`` (default: ``127``) + Set the Maximum Source Range Count (``MSRC``). This is the maximum number of + source ranges that may be used in a Copy command. This is a 0's based value. + +Zoned Namespaces +---------------- + +A namespaces may be "Zoned" as defined by TP 4053 ("Zoned Namespaces"). Set +``zoned=on`` on an ``nvme-ns`` device to configure it as a zoned namespace. + +The namespace may be configured with additional parameters + +``zoned.zone_size=SIZE`` (default: ``128MiB``) + Define the zone size (``ZSZE``). + +``zoned.zone_capacity=SIZE`` (default: ``0``) + Define the zone capacity (``ZCAP``). If left at the default (``0``), the zone + capacity will equal the zone size. + +``zoned.descr_ext_size=UINT32`` (default: ``0``) + Set the Zone Descriptor Extension Size (``ZDES``). 
Must be a multiple of 64 + bytes. + +``zoned.cross_read=BOOL`` (default: ``off``) + Set to ``on`` to allow reads to cross zone boundaries. + +``zoned.max_active=UINT32`` (default: ``0``) + Set the maximum number of active resources (``MAR``). The default (``0``) + allows all zones to be active. + +``zoned.max_open=UINT32`` (default: ``0``) + Set the maximum number of open resources (``MOR``). The default (``0``) + allows all zones to be open. If ``zoned.max_active`` is specified, this value + must be less than or equal to that. + +``zoned.zasl=UINT8`` (default: ``0``) + Set the maximum data transfer size for the Zone Append command. Like + ``mdts``, the value is specified as a power of two (2^n) and is in units of + the minimum memory page size (CAP.MPSMIN). The default value (``0``) + has this property inherit the ``mdts`` value. + +Metadata +-------- + +The virtual namespace device supports LBA metadata in the form separate +metadata (``MPTR``-based) and extended LBAs. + +``ms=UINT16`` (default: ``0``) + Defines the number of metadata bytes per LBA. + +``mset=UINT8`` (default: ``0``) + Set to ``1`` to enable extended LBAs. + +End-to-End Data Protection +-------------------------- + +The virtual namespace device supports DIF- and DIX-based protection information +(depending on ``mset``). + +``pi=UINT8`` (default: ``0``) + Enable protection information of the specified type (type ``1``, ``2`` or + ``3``). + +``pil=UINT8`` (default: ``0``) + Controls the location of the protection information within the metadata. Set + to ``1`` to transfer protection information as the first eight bytes of + metadata. Otherwise, the protection information is transferred as the last + eight bytes. diff --git a/docs/system/devices/usb.rst b/docs/system/devices/usb.rst new file mode 100644 index 0000000..eeab78d --- /dev/null +++ b/docs/system/devices/usb.rst @@ -0,0 +1,140 @@ +.. 
_pcsys_005fusb: + +USB emulation +------------- + +QEMU can emulate a PCI UHCI, OHCI, EHCI or XHCI USB controller. You can +plug virtual USB devices or real host USB devices (only works with +certain host operating systems). QEMU will automatically create and +connect virtual USB hubs as necessary to connect multiple USB devices. + +.. _Connecting USB devices: + +Connecting USB devices +~~~~~~~~~~~~~~~~~~~~~~ + +USB devices can be connected with the ``-device usb-...`` command line +option or the ``device_add`` monitor command. Available devices are: + +``usb-mouse`` + Virtual Mouse. This will override the PS/2 mouse emulation when + activated. + +``usb-tablet`` + Pointer device that uses absolute coordinates (like a touchscreen). + This means QEMU is able to report the mouse position without having + to grab the mouse. Also overrides the PS/2 mouse emulation when + activated. + +``usb-storage,drive=drive_id`` + Mass storage device backed by drive_id (see the :ref:`disk images` + chapter in the System Emulation Users Guide) + +``usb-uas`` + USB attached SCSI device, see + `usb-storage.txt `__ + for details + +``usb-bot`` + Bulk-only transport storage device, see + `usb-storage.txt `__ + for details here, too + +``usb-mtp,rootdir=dir`` + Media transfer protocol device, using dir as root of the file tree + that is presented to the guest. + +``usb-host,hostbus=bus,hostaddr=addr`` + Pass through the host device identified by bus and addr + +``usb-host,vendorid=vendor,productid=product`` + Pass through the host device identified by vendor and product ID + +``usb-wacom-tablet`` + Virtual Wacom PenPartner tablet. This device is similar to the + ``tablet`` above but it can be used with the tslib library because in + addition to touch coordinates it reports touch pressure. + +``usb-kbd`` + Standard USB keyboard. Will override the PS/2 keyboard (if present). + +``usb-serial,chardev=id`` + Serial converter. 
This emulates an FTDI FT232BM chip connected to + host character device id. + +``usb-braille,chardev=id`` + Braille device. This will use BrlAPI to display the braille output on + a real or fake device referenced by id. + +``usb-net[,netdev=id]`` + Network adapter that supports CDC ethernet and RNDIS protocols. id + specifies a netdev defined with ``-netdev …,id=id``. For instance, + user-mode networking can be used with + + .. parsed-literal:: + + |qemu_system| [...] -netdev user,id=net0 -device usb-net,netdev=net0 + +``usb-ccid`` + Smartcard reader device + +``usb-audio`` + USB audio device + +``u2f-{emulated,passthru}`` + Universal Second Factor device + +.. _host_005fusb_005fdevices: + +Using host USB devices on a Linux host +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +WARNING: this is an experimental feature. QEMU will slow down when using +it. USB devices requiring real time streaming (i.e. USB Video Cameras) +are not supported yet. + +1. If you use an early Linux 2.4 kernel, verify that no Linux driver is + actually using the USB device. A simple way to do that is simply to + disable the corresponding kernel module by renaming it from + ``mydriver.o`` to ``mydriver.o.disabled``. + +2. Verify that ``/proc/bus/usb`` is working (most Linux distributions + should enable it by default). You should see something like that: + + :: + + ls /proc/bus/usb + 001 devices drivers + +3. Since only root can access to the USB devices directly, you can + either launch QEMU as root or change the permissions of the USB + devices you want to use. For testing, the following suffices: + + :: + + chown -R myuid /proc/bus/usb + +4. Launch QEMU and do in the monitor: + + :: + + info usbhost + Device 1.2, speed 480 Mb/s + Class 00: USB device 1234:5678, USB DISK + + You should see the list of the devices you can use (Never try to use + hubs, it won't work). + +5. 
Add the device in QEMU by using: + + :: + + device_add usb-host,vendorid=0x1234,productid=0x5678 + + Normally the guest OS should report that a new USB device is plugged. + You can use the option ``-device usb-host,...`` to do the same. + +6. Now you can try to use the host USB device in QEMU. + +When relaunching QEMU, you may have to unplug and plug again the USB +device to make it work again (this is a bug). diff --git a/docs/system/devices/virtio-pmem.rst b/docs/system/devices/virtio-pmem.rst new file mode 100644 index 0000000..c82ac06 --- /dev/null +++ b/docs/system/devices/virtio-pmem.rst @@ -0,0 +1,76 @@ + +=========== +virtio pmem +=========== + +This document explains the setup and usage of the virtio pmem device. +The virtio pmem device is a paravirtualized persistent memory device +on regular (i.e non-NVDIMM) storage. + +Usecase +------- + +Virtio pmem allows to bypass the guest page cache and directly use +host page cache. This reduces guest memory footprint as the host can +make efficient memory reclaim decisions under memory pressure. + +How does virtio-pmem compare to the nvdimm emulation? +----------------------------------------------------- + +NVDIMM emulation on regular (i.e. non-NVDIMM) host storage does not +persist the guest writes as there are no defined semantics in the device +specification. The virtio pmem device provides guest write persistence +on non-NVDIMM host storage. + +virtio pmem usage +----------------- + +A virtio pmem device backed by a memory-backend-file can be created on +the QEMU command line as in the following example:: + + -object memory-backend-file,id=mem1,share,mem-path=./virtio_pmem.img,size=4G + -device virtio-pmem-pci,memdev=mem1,id=nv1 + +where: + + - "object memory-backend-file,id=mem1,share,mem-path=, size=" + creates a backend file with the specified size. + + - "device virtio-pmem-pci,id=nvdimm1,memdev=mem1" creates a virtio pmem + pci device whose storage is provided by above memory backend device. 
+ +Multiple virtio pmem devices can be created if multiple pairs of "-object" +and "-device" are provided. + +Hotplug +------- + +Virtio pmem devices can be hotplugged via the QEMU monitor. First, the +memory backing has to be added via 'object_add'; afterwards, the virtio +pmem device can be added via 'device_add'. + +For example, the following commands add another 4GB virtio pmem device to +the guest:: + + (qemu) object_add memory-backend-file,id=mem2,share=on,mem-path=virtio_pmem2.img,size=4G + (qemu) device_add virtio-pmem-pci,id=virtio_pmem2,memdev=mem2 + +Guest Data Persistence +---------------------- + +Guest data persistence on non-NVDIMM requires guest userspace applications +to perform fsync/msync. This is different from a real nvdimm backend where +no additional fsync/msync is required. This is to persist guest writes in +host backing file which otherwise remains in host page cache and there is +risk of losing the data in case of power failure. + +With virtio pmem device, MAP_SYNC mmap flag is not supported. This provides +a hint to application to perform fsync for write persistence. + +Limitations +----------- + +- Real nvdimm device backend is not supported. +- virtio pmem hotunplug is not supported. +- ACPI NVDIMM features like regions/namespaces are not supported. +- ndctl command is not supported. diff --git a/docs/system/index.rst b/docs/system/index.rst index fda4b1b..64a424a 100644 --- a/docs/system/index.rst +++ b/docs/system/index.rst @@ -11,15 +11,12 @@ or Hypervisor.Framework. quickstart invocation + device-emulation keys mux-chardev monitor images - net virtio-net-failover - usb - nvme - ivshmem linuxboot generic-loader guest-loader @@ -30,7 +27,6 @@ or Hypervisor.Framework. gdb managed-startup cpu-hotplug - virtio-pmem pr-manager targets security diff --git a/docs/system/ivshmem.rst b/docs/system/ivshmem.rst deleted file mode 100644 index b03a48a..0000000 --- a/docs/system/ivshmem.rst +++ /dev/null @@ -1,64 +0,0 @@ -.. 
_pcsys_005fivshmem: - -Inter-VM Shared Memory device ------------------------------ - -On Linux hosts, a shared memory device is available. The basic syntax -is: - -.. parsed-literal:: - - |qemu_system_x86| -device ivshmem-plain,memdev=hostmem - -where hostmem names a host memory backend. For a POSIX shared memory -backend, use something like - -:: - - -object memory-backend-file,size=1M,share,mem-path=/dev/shm/ivshmem,id=hostmem - -If desired, interrupts can be sent between guest VMs accessing the same -shared memory region. Interrupt support requires using a shared memory -server and using a chardev socket to connect to it. The code for the -shared memory server is qemu.git/contrib/ivshmem-server. An example -syntax when using the shared memory server is: - -.. parsed-literal:: - - # First start the ivshmem server once and for all - ivshmem-server -p pidfile -S path -m shm-name -l shm-size -n vectors - - # Then start your qemu instances with matching arguments - |qemu_system_x86| -device ivshmem-doorbell,vectors=vectors,chardev=id - -chardev socket,path=path,id=id - -When using the server, the guest will be assigned a VM ID (>=0) that -allows guests using the same server to communicate via interrupts. -Guests can read their VM ID from a device register (see -ivshmem-spec.txt). - -Migration with ivshmem -~~~~~~~~~~~~~~~~~~~~~~ - -With device property ``master=on``, the guest will copy the shared -memory on migration to the destination host. With ``master=off``, the -guest will not be able to migrate with the device attached. In the -latter case, the device should be detached and then reattached after -migration using the PCI hotplug support. - -At most one of the devices sharing the same memory can be master. The -master must complete migration before you plug back the other devices. - -ivshmem and hugepages -~~~~~~~~~~~~~~~~~~~~~ - -Instead of specifying the using POSIX shm, you may specify a -memory backend that has hugepage support: - -.. 
parsed-literal:: - - |qemu_system_x86| -object memory-backend-file,size=1G,mem-path=/dev/hugepages/my-shmem-file,share,id=mb1 - -device ivshmem-plain,memdev=mb1 - -ivshmem-server also supports hugepages mount points with the ``-m`` -memory path argument. diff --git a/docs/system/net.rst b/docs/system/net.rst deleted file mode 100644 index 4b2640c..0000000 --- a/docs/system/net.rst +++ /dev/null @@ -1,100 +0,0 @@ -.. _pcsys_005fnetwork: - -Network emulation ------------------ - -QEMU can simulate several network cards (e.g. PCI or ISA cards on the PC -target) and can connect them to a network backend on the host or an -emulated hub. The various host network backends can either be used to -connect the NIC of the guest to a real network (e.g. by using a TAP -devices or the non-privileged user mode network stack), or to other -guest instances running in another QEMU process (e.g. by using the -socket host network backend). - -Using TAP network interfaces -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -This is the standard way to connect QEMU to a real network. QEMU adds a -virtual network device on your host (called ``tapN``), and you can then -configure it as if it was a real ethernet card. - -Linux host -^^^^^^^^^^ - -As an example, you can download the ``linux-test-xxx.tar.gz`` archive -and copy the script ``qemu-ifup`` in ``/etc`` and configure properly -``sudo`` so that the command ``ifconfig`` contained in ``qemu-ifup`` can -be executed as root. You must verify that your host kernel supports the -TAP network interfaces: the device ``/dev/net/tun`` must be present. - -See :ref:`sec_005finvocation` to have examples of command -lines using the TAP network interfaces. - -Windows host -^^^^^^^^^^^^ - -There is a virtual ethernet driver for Windows 2000/XP systems, called -TAP-Win32. But it is not included in standard QEMU for Windows, so you -will need to get it separately. It is part of OpenVPN package, so -download OpenVPN from : https://openvpn.net/. 
-
-Using the user mode network stack
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-By using the option ``-net user`` (default configuration if no ``-net``
-option is specified), QEMU uses a completely user mode network stack
-(you don't need root privilege to use the virtual network). The virtual
-network configuration is the following::
-
-     guest (10.0.2.15)  <------>  Firewall/DHCP server <-----> Internet
-                           |          (10.0.2.2)
-                           |
-                           ---->  DNS server (10.0.2.3)
-                           |
-                           ---->  SMB server (10.0.2.4)
-
-The QEMU VM behaves as if it was behind a firewall which blocks all
-incoming connections. You can use a DHCP client to automatically
-configure the network in the QEMU VM. The DHCP server assign addresses
-to the hosts starting from 10.0.2.15.
-
-In order to check that the user mode network is working, you can ping
-the address 10.0.2.2 and verify that you got an address in the range
-10.0.2.x from the QEMU virtual DHCP server.
-
-Note that ICMP traffic in general does not work with user mode
-networking. ``ping``, aka. ICMP echo, to the local router (10.0.2.2)
-shall work, however. If you're using QEMU on Linux >= 3.0, it can use
-unprivileged ICMP ping sockets to allow ``ping`` to the Internet. The
-host admin has to set the ping_group_range in order to grant access to
-those sockets. To allow ping for GID 100 (usually users group)::
-
-   echo 100 100 > /proc/sys/net/ipv4/ping_group_range
-
-When using the built-in TFTP server, the router is also the TFTP server.
-
-When using the ``'-netdev user,hostfwd=...'`` option, TCP or UDP
-connections can be redirected from the host to the guest. It allows for
-example to redirect X11, telnet or SSH connections.
-
-Hubs
-~~~~
-
-QEMU can simulate several hubs. A hub can be thought of as a virtual
-connection between several network devices. These devices can be for
-example QEMU virtual ethernet cards or virtual Host ethernet devices
-(TAP devices). You can connect guest NICs or host network backends to
-such a hub using the ``-netdev
-hubport`` or ``-nic hubport`` options. The legacy ``-net`` option also
-connects the given device to the emulated hub with ID 0 (i.e. the
-default hub) unless you specify a netdev with ``-net nic,netdev=xxx``
-here.
-
-Connecting emulated networks between QEMU instances
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Using the ``-netdev socket`` (or ``-nic socket`` or ``-net socket``)
-option, it is possible to create emulated networks that span several
-QEMU instances. See the description of the ``-netdev socket`` option in
-:ref:`sec_005finvocation` to have a basic
-example.
diff --git a/docs/system/nvme.rst b/docs/system/nvme.rst
deleted file mode 100644
index bff72d1..0000000
--- a/docs/system/nvme.rst
+++ /dev/null
@@ -1,237 +0,0 @@
-==============
-NVMe Emulation
-==============
-
-QEMU provides NVMe emulation through the ``nvme``, ``nvme-ns`` and
-``nvme-subsys`` devices.
-
-See the following sections for specific information on
-
-  * `Adding NVMe Devices`_, `additional namespaces`_ and `NVM subsystems`_.
-  * Configuration of `Optional Features`_ such as `Controller Memory Buffer`_,
-    `Simple Copy`_, `Zoned Namespaces`_, `metadata`_ and `End-to-End Data
-    Protection`_,
-
-Adding NVMe Devices
-===================
-
-Controller Emulation
---------------------
-
-The QEMU emulated NVMe controller implements version 1.4 of the NVM Express
-specification. All mandatory features are implement with a couple of exceptions
-and limitations:
-
-  * Accounting numbers in the SMART/Health log page are reset when the device
-    is power cycled.
-  * Interrupt Coalescing is not supported and is disabled by default.
-
-The simplest way to attach an NVMe controller on the QEMU PCI bus is to add the
-following parameters:
-
-.. code-block:: console
-
-   -drive file=nvm.img,if=none,id=nvm
-   -device nvme,serial=deadbeef,drive=nvm
-
-There are a number of optional general parameters for the ``nvme`` device. Some
-are mentioned here, but see ``-device nvme,help`` to list all possible
-parameters.
-
-``max_ioqpairs=UINT32`` (default: ``64``)
-  Set the maximum number of allowed I/O queue pairs. This replaces the
-  deprecated ``num_queues`` parameter.
-
-``msix_qsize=UINT16`` (default: ``65``)
-  The number of MSI-X vectors that the device should support.
-
-``mdts=UINT8`` (default: ``7``)
-  Set the Maximum Data Transfer Size of the device.
-
-``use-intel-id`` (default: ``off``)
-  Since QEMU 5.2, the device uses a QEMU allocated "Red Hat" PCI Device and
-  Vendor ID. Set this to ``on`` to revert to the unallocated Intel ID
-  previously used.
-
-Additional Namespaces
----------------------
-
-In the simplest possible invocation sketched above, the device only support a
-single namespace with the namespace identifier ``1``. To support multiple
-namespaces and additional features, the ``nvme-ns`` device must be used.
-
-.. code-block:: console
-
-   -device nvme,id=nvme-ctrl-0,serial=deadbeef
-   -drive file=nvm-1.img,if=none,id=nvm-1
-   -device nvme-ns,drive=nvm-1
-   -drive file=nvm-2.img,if=none,id=nvm-2
-   -device nvme-ns,drive=nvm-2
-
-The namespaces defined by the ``nvme-ns`` device will attach to the most
-recently defined ``nvme-bus`` that is created by the ``nvme`` device. Namespace
-identifers are allocated automatically, starting from ``1``.
-
-There are a number of parameters available:
-
-``nsid`` (default: ``0``)
-  Explicitly set the namespace identifier.
-
-``uuid`` (default: *autogenerated*)
-  Set the UUID of the namespace. This will be reported as a "Namespace UUID"
-  descriptor in the Namespace Identification Descriptor List.
-
-``eui64``
-  Set the EUI-64 of the namespace. This will be reported as a "IEEE Extended
-  Unique Identifier" descriptor in the Namespace Identification Descriptor List.
-  Since machine type 6.1 a non-zero default value is used if the parameter
-  is not provided. For earlier machine types the field defaults to 0.
-
-``bus``
-  If there are more ``nvme`` devices defined, this parameter may be used to
-  attach the namespace to a specific ``nvme`` device (identified by an ``id``
-  parameter on the controller device).
-
-NVM Subsystems
---------------
-
-Additional features becomes available if the controller device (``nvme``) is
-linked to an NVM Subsystem device (``nvme-subsys``).
-
-The NVM Subsystem emulation allows features such as shared namespaces and
-multipath I/O.
-
-.. code-block:: console
-
-   -device nvme-subsys,id=nvme-subsys-0,nqn=subsys0
-   -device nvme,serial=a,subsys=nvme-subsys-0
-   -device nvme,serial=b,subsys=nvme-subsys-0
-
-This will create an NVM subsystem with two controllers. Having controllers
-linked to an ``nvme-subsys`` device allows additional ``nvme-ns`` parameters:
-
-``shared`` (default: ``off``)
-  Specifies that the namespace will be attached to all controllers in the
-  subsystem. If set to ``off`` (the default), the namespace will remain a
-  private namespace and may only be attached to a single controller at a time.
-
-``detached`` (default: ``off``)
-  If set to ``on``, the namespace will be be available in the subsystem, but
-  not attached to any controllers initially.
-
-Thus, adding
-
-.. code-block:: console
-
-   -drive file=nvm-1.img,if=none,id=nvm-1
-   -device nvme-ns,drive=nvm-1,nsid=1,shared=on
-   -drive file=nvm-2.img,if=none,id=nvm-2
-   -device nvme-ns,drive=nvm-2,nsid=3,detached=on
-
-will cause NSID 1 will be a shared namespace (due to ``shared=on``) that is
-initially attached to both controllers. NSID 3 will be a private namespace
-(i.e. only attachable to a single controller at a time) and will not be
-attached to any controller initially (due to ``detached=on``).
-
-Optional Features
-=================
-
-Controller Memory Buffer
-------------------------
-
-``nvme`` device parameters related to the Controller Memory Buffer support:
-
-``cmb_size_mb=UINT32`` (default: ``0``)
-  This adds a Controller Memory Buffer of the given size at offset zero in BAR
-  2.
-
-``legacy-cmb`` (default: ``off``)
-  By default, the device uses the "v1.4 scheme" for the Controller Memory
-  Buffer support (i.e, the CMB is initially disabled and must be explicitly
-  enabled by the host). Set this to ``on`` to behave as a v1.3 device wrt. the
-  CMB.
-
-Simple Copy
------------
-
-The device includes support for TP 4065 ("Simple Copy Command"). A number of
-additional ``nvme-ns`` device parameters may be used to control the Copy
-command limits:
-
-``mssrl=UINT16`` (default: ``128``)
-  Set the Maximum Single Source Range Length (``MSSRL``). This is the maximum
-  number of logical blocks that may be specified in each source range.
-
-``mcl=UINT32`` (default: ``128``)
-  Set the Maximum Copy Length (``MCL``). This is the maximum number of logical
-  blocks that may be specified in a Copy command (the total for all source
-  ranges).
-
-``msrc=UINT8`` (default: ``127``)
-  Set the Maximum Source Range Count (``MSRC``). This is the maximum number of
-  source ranges that may be used in a Copy command. This is a 0's based value.
-
-Zoned Namespaces
-----------------
-
-A namespaces may be "Zoned" as defined by TP 4053 ("Zoned Namespaces"). Set
-``zoned=on`` on an ``nvme-ns`` device to configure it as a zoned namespace.
-
-The namespace may be configured with additional parameters
-
-``zoned.zone_size=SIZE`` (default: ``128MiB``)
-  Define the zone size (``ZSZE``).
-
-``zoned.zone_capacity=SIZE`` (default: ``0``)
-  Define the zone capacity (``ZCAP``). If left at the default (``0``), the zone
-  capacity will equal the zone size.
-
-``zoned.descr_ext_size=UINT32`` (default: ``0``)
-  Set the Zone Descriptor Extension Size (``ZDES``). Must be a multiple of 64
-  bytes.
-
-``zoned.cross_read=BOOL`` (default: ``off``)
-  Set to ``on`` to allow reads to cross zone boundaries.
-
-``zoned.max_active=UINT32`` (default: ``0``)
-  Set the maximum number of active resources (``MAR``). The default (``0``)
-  allows all zones to be active.
-
-``zoned.max_open=UINT32`` (default: ``0``)
-  Set the maximum number of open resources (``MOR``). The default (``0``)
-  allows all zones to be open. If ``zoned.max_active`` is specified, this value
-  must be less than or equal to that.
-
-``zoned.zasl=UINT8`` (default: ``0``)
-  Set the maximum data transfer size for the Zone Append command. Like
-  ``mdts``, the value is specified as a power of two (2^n) and is in units of
-  the minimum memory page size (CAP.MPSMIN). The default value (``0``)
-  has this property inherit the ``mdts`` value.
-
-Metadata
---------
-
-The virtual namespace device supports LBA metadata in the form separate
-metadata (``MPTR``-based) and extended LBAs.
-
-``ms=UINT16`` (default: ``0``)
-  Defines the number of metadata bytes per LBA.
-
-``mset=UINT8`` (default: ``0``)
-  Set to ``1`` to enable extended LBAs.
-
-End-to-End Data Protection
---------------------------
-
-The virtual namespace device supports DIF- and DIX-based protection information
-(depending on ``mset``).
-
-``pi=UINT8`` (default: ``0``)
-  Enable protection information of the specified type (type ``1``, ``2`` or
-  ``3``).
-
-``pil=UINT8`` (default: ``0``)
-  Controls the location of the protection information within the metadata. Set
-  to ``1`` to transfer protection information as the first eight bytes of
-  metadata. Otherwise, the protection information is transferred as the last
-  eight bytes.
diff --git a/docs/system/usb.rst b/docs/system/usb.rst
deleted file mode 100644
index eeab78d..0000000
--- a/docs/system/usb.rst
+++ /dev/null
@@ -1,140 +0,0 @@
-.. _pcsys_005fusb:
-
-USB emulation
--------------
-
-QEMU can emulate a PCI UHCI, OHCI, EHCI or XHCI USB controller. You can
-plug virtual USB devices or real host USB devices (only works with
-certain host operating systems). QEMU will automatically create and
-connect virtual USB hubs as necessary to connect multiple USB devices.
-
-.. _Connecting USB devices:
-
-Connecting USB devices
-~~~~~~~~~~~~~~~~~~~~~~
-
-USB devices can be connected with the ``-device usb-...`` command line
-option or the ``device_add`` monitor command. Available devices are:
-
-``usb-mouse``
-   Virtual Mouse. This will override the PS/2 mouse emulation when
-   activated.
-
-``usb-tablet``
-   Pointer device that uses absolute coordinates (like a touchscreen).
-   This means QEMU is able to report the mouse position without having
-   to grab the mouse. Also overrides the PS/2 mouse emulation when
-   activated.
-
-``usb-storage,drive=drive_id``
-   Mass storage device backed by drive_id (see the :ref:`disk images`
-   chapter in the System Emulation Users Guide)
-
-``usb-uas``
-   USB attached SCSI device, see
-   `usb-storage.txt `__
-   for details
-
-``usb-bot``
-   Bulk-only transport storage device, see
-   `usb-storage.txt `__
-   for details here, too
-
-``usb-mtp,rootdir=dir``
-   Media transfer protocol device, using dir as root of the file tree
-   that is presented to the guest.
-
-``usb-host,hostbus=bus,hostaddr=addr``
-   Pass through the host device identified by bus and addr
-
-``usb-host,vendorid=vendor,productid=product``
-   Pass through the host device identified by vendor and product ID
-
-``usb-wacom-tablet``
-   Virtual Wacom PenPartner tablet. This device is similar to the
-   ``tablet`` above but it can be used with the tslib library because in
-   addition to touch coordinates it reports touch pressure.
-
-``usb-kbd``
-   Standard USB keyboard. Will override the PS/2 keyboard (if present).
-
-``usb-serial,chardev=id``
-   Serial converter. This emulates an FTDI FT232BM chip connected to
-   host character device id.
-
-``usb-braille,chardev=id``
-   Braille device. This will use BrlAPI to display the braille output on
-   a real or fake device referenced by id.
-
-``usb-net[,netdev=id]``
-   Network adapter that supports CDC ethernet and RNDIS protocols. id
-   specifies a netdev defined with ``-netdev …,id=id``. For instance,
-   user-mode networking can be used with
-
-   .. parsed-literal::
-
-      |qemu_system| [...] -netdev user,id=net0 -device usb-net,netdev=net0
-
-``usb-ccid``
-   Smartcard reader device
-
-``usb-audio``
-   USB audio device
-
-``u2f-{emulated,passthru}``
-   Universal Second Factor device
-
-.. _host_005fusb_005fdevices:
-
-Using host USB devices on a Linux host
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-WARNING: this is an experimental feature. QEMU will slow down when using
-it. USB devices requiring real time streaming (i.e. USB Video Cameras)
-are not supported yet.
-
-1. If you use an early Linux 2.4 kernel, verify that no Linux driver is
-   actually using the USB device. A simple way to do that is simply to
-   disable the corresponding kernel module by renaming it from
-   ``mydriver.o`` to ``mydriver.o.disabled``.
-
-2. Verify that ``/proc/bus/usb`` is working (most Linux distributions
-   should enable it by default). You should see something like that:
-
-   ::
-
-      ls /proc/bus/usb
-      001  devices  drivers
-
-3. Since only root can access to the USB devices directly, you can
-   either launch QEMU as root or change the permissions of the USB
-   devices you want to use. For testing, the following suffices:
-
-   ::
-
-      chown -R myuid /proc/bus/usb
-
-4. Launch QEMU and do in the monitor:
-
-   ::
-
-      info usbhost
-        Device 1.2, speed 480 Mb/s
-          Class 00: USB device 1234:5678, USB DISK
-
-   You should see the list of the devices you can use (Never try to use
-   hubs, it won't work).
-
-5. Add the device in QEMU by using:
-
-   ::
-
-      device_add usb-host,vendorid=0x1234,productid=0x5678
-
-   Normally the guest OS should report that a new USB device is plugged.
-   You can use the option ``-device usb-host,...`` to do the same.
-
-6. Now you can try to use the host USB device in QEMU.
-
-When relaunching QEMU, you may have to unplug and plug again the USB
-device to make it work again (this is a bug).
diff --git a/docs/system/virtio-pmem.rst b/docs/system/virtio-pmem.rst
deleted file mode 100644
index c82ac06..0000000
--- a/docs/system/virtio-pmem.rst
+++ /dev/null
@@ -1,76 +0,0 @@
-
-===========
-virtio pmem
-===========
-
-This document explains the setup and usage of the virtio pmem device.
-The virtio pmem device is a paravirtualized persistent memory device
-on regular (i.e non-NVDIMM) storage.
-
-Usecase
--------
-
-Virtio pmem allows to bypass the guest page cache and directly use
-host page cache. This reduces guest memory footprint as the host can
-make efficient memory reclaim decisions under memory pressure.
-
-How does virtio-pmem compare to the nvdimm emulation?
------------------------------------------------------
-
-NVDIMM emulation on regular (i.e. non-NVDIMM) host storage does not
-persist the guest writes as there are no defined semantics in the device
-specification. The virtio pmem device provides guest write persistence
-on non-NVDIMM host storage.
-
-virtio pmem usage
------------------
-
-A virtio pmem device backed by a memory-backend-file can be created on
-the QEMU command line as in the following example::
-
-    -object memory-backend-file,id=mem1,share,mem-path=./virtio_pmem.img,size=4G
-    -device virtio-pmem-pci,memdev=mem1,id=nv1
-
-where:
-
-  - "object memory-backend-file,id=mem1,share,mem-path=, size="
-    creates a backend file with the specified size.
-
-  - "device virtio-pmem-pci,id=nvdimm1,memdev=mem1" creates a virtio pmem
-    pci device whose storage is provided by above memory backend device.
-
-Multiple virtio pmem devices can be created if multiple pairs of "-object"
-and "-device" are provided.
-
-Hotplug
--------
-
-Virtio pmem devices can be hotplugged via the QEMU monitor. First, the
-memory backing has to be added via 'object_add'; afterwards, the virtio
-pmem device can be added via 'device_add'.
-
-For example, the following commands add another 4GB virtio pmem device to
-the guest::
-
- (qemu) object_add memory-backend-file,id=mem2,share=on,mem-path=virtio_pmem2.img,size=4G
- (qemu) device_add virtio-pmem-pci,id=virtio_pmem2,memdev=mem2
-
-Guest Data Persistence
-----------------------
-
-Guest data persistence on non-NVDIMM requires guest userspace applications
-to perform fsync/msync. This is different from a real nvdimm backend where
-no additional fsync/msync is required. This is to persist guest writes in
-host backing file which otherwise remains in host page cache and there is
-risk of losing the data in case of power failure.
-
-With virtio pmem device, MAP_SYNC mmap flag is not supported. This provides
-a hint to application to perform fsync for write persistence.
-
-Limitations
-------------
-
-- Real nvdimm device backend is not supported.
-- virtio pmem hotunplug is not supported.
-- ACPI NVDIMM features like regions/namespaces are not supported.
-- ndctl command is not supported.
-- 
cgit v1.1