Diffstat (limited to 'docs')
-rw-r--r-- | docs/about/deprecated.rst | 18
-rw-r--r-- | docs/about/removed-features.rst | 13
-rw-r--r-- | docs/conf.py | 4
-rw-r--r-- | docs/meson.build | 1
-rw-r--r-- | docs/tools/index.rst | 1
-rw-r--r-- | docs/tools/virtiofsd.rst | 403
6 files changed, 13 insertions, 427 deletions
diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst
index 2827b0c..ee95bcb 100644
--- a/docs/about/deprecated.rst
+++ b/docs/about/deprecated.rst
@@ -330,24 +330,6 @@ versions, aliases will point to newer CPU model versions
 depending on the machine type, so management software must
 resolve CPU model aliases before starting a virtual machine.
 
-Tools
------
-
-virtiofsd
-'''''''''
-
-There is a new Rust implementation of ``virtiofsd`` at
-``https://gitlab.com/virtio-fs/virtiofsd``;
-since this is now marked stable, new development should be done on that
-rather than the existing C version in the QEMU tree.
-The C version will still accept fixes and patches that
-are already in development for the moment, but will eventually
-be deleted from this tree.
-New deployments should use the Rust version, and existing systems
-should consider moving to it. The command line and feature set
-is very close and moving should be simple.
-
-
 QEMU guest agent
 ----------------
 
diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst
index e901637..5b258b4 100644
--- a/docs/about/removed-features.rst
+++ b/docs/about/removed-features.rst
@@ -889,3 +889,16 @@ The VXHS code did not compile since v2.12.0. It was removed in 5.1.
 The corresponding upstream server project is no longer maintained.
 Users are recommended to switch to an alternative distributed block
 device driver such as RBD.
+
+Tools
+-----
+
+virtiofsd (removed in 8.0)
+''''''''''''''''''''''''''
+
+There is a newer Rust implementation of ``virtiofsd`` at
+``https://gitlab.com/virtio-fs/virtiofsd``; this has been
+stable for some time and is now widely used.
+The command line and feature set is very close to the removed
+C implementation.
+
diff --git a/docs/conf.py b/docs/conf.py
index 73a287a..00767b0 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -290,10 +290,6 @@ man_pages = [
     ('tools/virtfs-proxy-helper', 'virtfs-proxy-helper',
      'QEMU 9p virtfs proxy filesystem helper',
      ['M. Mohan Kumar'], 1),
-    ('tools/virtiofsd', 'virtiofsd',
-     'QEMU virtio-fs shared file system daemon',
-     ['Stefan Hajnoczi <stefanha@redhat.com>',
-      'Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>'], 1),
 ]
 man_make_section_directory = False
 
diff --git a/docs/meson.build b/docs/meson.build
index 9136fed..bbcdccc 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -48,7 +48,6 @@ if build_docs
         'qemu-storage-daemon.1': (have_tools ? 'man1' : ''),
         'qemu-trace-stap.1': (stap.found() ? 'man1' : ''),
         'virtfs-proxy-helper.1': (have_virtfs_proxy_helper ? 'man1' : ''),
-        'virtiofsd.1': (have_virtiofsd ? 'man1' : ''),
         'qemu.1': 'man1',
         'qemu-block-drivers.7': 'man7',
         'qemu-cpu-models.7': 'man7'
diff --git a/docs/tools/index.rst b/docs/tools/index.rst
index 2151adc..8e65ce0 100644
--- a/docs/tools/index.rst
+++ b/docs/tools/index.rst
@@ -16,4 +16,3 @@ command line utilities and other standalone programs.
    qemu-pr-helper
    qemu-trace-stap
    virtfs-proxy-helper
-   virtiofsd
diff --git a/docs/tools/virtiofsd.rst b/docs/tools/virtiofsd.rst
deleted file mode 100644
index 995a754..0000000
--- a/docs/tools/virtiofsd.rst
+++ /dev/null
@@ -1,403 +0,0 @@
-QEMU virtio-fs shared file system daemon
-========================================
-
-Synopsis
---------
-
-**virtiofsd** [*OPTIONS*]
-
-Description
------------
-
-Share a host directory tree with a guest through a virtio-fs device. This
-program is a vhost-user backend that implements the virtio-fs device. Each
-virtio-fs device instance requires its own virtiofsd process.
-
-This program is designed to work with QEMU's ``--device vhost-user-fs-pci``
-but should work with any virtual machine monitor (VMM) that supports
-vhost-user. See the Examples section below.
-
-This program must be run as the root user. The program drops privileges where
-possible during startup although it must be able to create and access files
-with any uid/gid:
-
-* The ability to invoke syscalls is limited using seccomp(2).
-* Linux capabilities(7) are dropped.
-
-In "namespace" sandbox mode the program switches into a new file system
-namespace and invokes pivot_root(2) to make the shared directory tree its root.
-A new pid and net namespace is also created to isolate the process.
-
-In "chroot" sandbox mode the program invokes chroot(2) to make the shared
-directory tree its root. This mode is intended for container environments where
-the container runtime has already set up the namespaces and the program does
-not have permission to create namespaces itself.
-
-Both sandbox modes prevent "file system escapes" due to symlinks and other file
-system objects that might lead to files outside the shared directory.
-
-Options
--------
-
-.. program:: virtiofsd
-
-.. option:: -h, --help
-
-  Print help.
-
-.. option:: -V, --version
-
-  Print version.
-
-.. option:: -d
-
-  Enable debug output.
-
-.. option:: --syslog
-
-  Print log messages to syslog instead of stderr.
-
-.. option:: -o OPTION
-
-  * debug -
-    Enable debug output.
-
-  * flock|no_flock -
-    Enable/disable flock. The default is ``no_flock``.
-
-  * modcaps=CAPLIST
-    Modify the list of capabilities allowed; CAPLIST is a colon separated
-    list of capabilities, each preceded by either + or -, e.g.
-    ''+sys_admin:-chown''.
-
-  * log_level=LEVEL -
-    Print only log messages matching LEVEL or more severe. LEVEL is one of
-    ``err``, ``warn``, ``info``, or ``debug``. The default is ``info``.
-
-  * posix_lock|no_posix_lock -
-    Enable/disable remote POSIX locks. The default is ``no_posix_lock``.
-
-  * readdirplus|no_readdirplus -
-    Enable/disable readdirplus. The default is ``readdirplus``.
-
-  * sandbox=namespace|chroot -
-    Sandbox mode:
-    - namespace: Create mount, pid, and net namespaces and pivot_root(2) into
-    the shared directory.
-    - chroot: chroot(2) into shared directory (use in containers).
-    The default is "namespace".
-
-  * source=PATH -
-    Share host directory tree located at PATH. This option is required.
-
-  * timeout=TIMEOUT -
-    I/O timeout in seconds. The default depends on cache= option.
-
-  * writeback|no_writeback -
-    Enable/disable writeback cache. The cache allows the FUSE client to buffer
-    and merge write requests. The default is ``no_writeback``.
-
-  * xattr|no_xattr -
-    Enable/disable extended attributes (xattr) on files and directories. The
-    default is ``no_xattr``.
-
-  * posix_acl|no_posix_acl -
-    Enable/disable posix acl support. Posix ACLs are disabled by default.
-
-  * security_label|no_security_label -
-    Enable/disable security label support. Security labels are disabled by
-    default. This will allow client to send a MAC label of file during
-    file creation. Typically this is expected to be SELinux security
-    label. Server will try to set that label on newly created file
-    atomically wherever possible.
-
-  * killpriv_v2|no_killpriv_v2 -
-    Enable/disable ``FUSE_HANDLE_KILLPRIV_V2`` support. KILLPRIV_V2 is enabled
-    by default as long as the client supports it. Enabling this option helps
-    with performance in write path.
-
-.. option:: --socket-path=PATH
-
-  Listen on vhost-user UNIX domain socket at PATH.
-
-.. option:: --socket-group=GROUP
-
-  Set the vhost-user UNIX domain socket gid to GROUP.
-
-.. option:: --fd=FDNUM
-
-  Accept connections from vhost-user UNIX domain socket file descriptor FDNUM.
-  The file descriptor must already be listening for connections.
-
-.. option:: --thread-pool-size=NUM
-
-  Restrict the number of worker threads per request queue to NUM. The default
-  is 0.
-
-.. option:: --cache=none|auto|always
-
-  Select the desired trade-off between coherency and performance. ``none``
-  forbids the FUSE client from caching to achieve best coherency at the cost of
-  performance. ``auto`` acts similar to NFS with a 1 second metadata cache
-  timeout. ``always`` sets a long cache lifetime at the expense of coherency.
-  The default is ``auto``.
-
-Extended attribute (xattr) mapping
-----------------------------------
-
-By default the name of xattr's used by the client are passed through to the server
-file system. This can be a problem where either those xattr names are used
-by something on the server (e.g. selinux client/server confusion) or if the
-``virtiofsd`` is running in a container with restricted privileges where it
-cannot access some attributes.
-
-Mapping syntax
-~~~~~~~~~~~~~~
-
-A mapping of xattr names can be made using -o xattrmap=mapping where the ``mapping``
-string consists of a series of rules.
-
-The first matching rule terminates the mapping.
-The set of rules must include a terminating rule to match any remaining attributes
-at the end.
-
-Each rule consists of a number of fields separated with a separator that is the
-first non-white space character in the rule. This separator must then be used
-for the whole rule.
-White space may be added before and after each rule.
-
-Using ':' as the separator a rule is of the form:
-
-``:type:scope:key:prepend:``
-
-**scope** is:
-
-- 'client' - match 'key' against a xattr name from the client for
-             setxattr/getxattr/removexattr
-- 'server' - match 'prepend' against a xattr name from the server
-             for listxattr
-- 'all' - can be used to make a single rule where both the server
-          and client matches are triggered.
-
-**type** is one of:
-
-- 'prefix' - is designed to prepend and strip a prefix; the modified
-  attributes then being passed on to the client/server.
-
-- 'ok' - Causes the rule set to be terminated when a match is found
-  while allowing matching xattr's through unchanged.
-  It is intended both as a way of explicitly terminating
-  the list of rules, and to allow some xattr's to skip following rules.
-
-- 'bad' - If a client tries to use a name matching 'key' it's
-  denied using EPERM; when the server passes an attribute
-  name matching 'prepend' it's hidden. In many ways it's use is very like
-  'ok' as either an explicit terminator or for special handling of certain
-  patterns.
-
-- 'unsupported' - If a client tries to use a name matching 'key' it's
-  denied using ENOTSUP; when the server passes an attribute
-  name matching 'prepend' it's hidden. In many ways it's use is very like
-  'ok' as either an explicit terminator or for special handling of certain
-  patterns.
-
-**key** is a string tested as a prefix on an attribute name originating
-on the client. It maybe empty in which case a 'client' rule
-will always match on client names.
-
-**prepend** is a string tested as a prefix on an attribute name originating
-on the server, and used as a new prefix. It may be empty
-in which case a 'server' rule will always match on all names from
-the server.
-
-e.g.:
-
-  ``:prefix:client:trusted.:user.virtiofs.:``
-
-  will match 'trusted.' attributes in client calls and prefix them before
-  passing them to the server.
-
-  ``:prefix:server::user.virtiofs.:``
-
-  will strip 'user.virtiofs.' from all server replies.
-
-  ``:prefix:all:trusted.:user.virtiofs.:``
-
-  combines the previous two cases into a single rule.
-
-  ``:ok:client:user.::``
-
-  will allow get/set xattr for 'user.' xattr's and ignore
-  following rules.
-
-  ``:ok:server::security.:``
-
-  will pass 'security.' xattr's in listxattr from the server
-  and ignore following rules.
-
-  ``:ok:all:::``
-
-  will terminate the rule search passing any remaining attributes
-  in both directions.
-
-  ``:bad:server::security.:``
-
-  would hide 'security.' xattr's in listxattr from the server.
-
-A simpler 'map' type provides a shorter syntax for the common case:
-
-``:map:key:prepend:``
-
-The 'map' type adds a number of separate rules to add **prepend** as a prefix
-to the matched **key** (or all attributes if **key** is empty).
-There may be at most one 'map' rule and it must be the last rule in the set.
-
-Note: When the 'security.capability' xattr is remapped, the daemon has to do
-extra work to remove it during many operations, which the host kernel normally
-does itself.
-
-Security considerations
-~~~~~~~~~~~~~~~~~~~~~~~
-
-Operating systems typically partition the xattr namespace using
-well defined name prefixes. Each partition may have different
-access controls applied. For example, on Linux there are multiple
-partitions
-
- * ``system.*`` - access varies depending on attribute & filesystem
- * ``security.*`` - only processes with CAP_SYS_ADMIN
- * ``trusted.*`` - only processes with CAP_SYS_ADMIN
- * ``user.*`` - any process granted by file permissions / ownership
-
-While other OS such as FreeBSD have different name prefixes
-and access control rules.
-
-When remapping attributes on the host, it is important to
-ensure that the remapping does not allow a guest user to
-evade the guest access control rules.
-
-Consider if ``trusted.*`` from the guest was remapped to
-``user.virtiofs.trusted*`` in the host. An unprivileged
-user in a Linux guest has the ability to write to xattrs
-under ``user.*``. Thus the user can evade the access
-control restriction on ``trusted.*`` by instead writing
-to ``user.virtiofs.trusted.*``.
-
-As noted above, the partitions used and access controls
-applied, will vary across guest OS, so it is not wise to
-try to predict what the guest OS will use.
-
-The simplest way to avoid an insecure configuration is
-to remap all xattrs at once, to a given fixed prefix.
-This is shown in example (1) below.
-
-If selectively mapping only a subset of xattr prefixes,
-then rules must be added to explicitly block direct
-access to the target of the remapping. This is shown
-in example (2) below.
-
-Mapping examples
-~~~~~~~~~~~~~~~~
-
-1) Prefix all attributes with 'user.virtiofs.'
-
-::
-
- -o xattrmap=":prefix:all::user.virtiofs.::bad:all:::"
-
-
-This uses two rules, using : as the field separator;
-the first rule prefixes and strips 'user.virtiofs.',
-the second rule hides any non-prefixed attributes that
-the host set.
-
-This is equivalent to the 'map' rule:
-
-::
-
- -o xattrmap=":map::user.virtiofs.:"
-
-2) Prefix 'trusted.' attributes, allow others through
-
-::
-
-   "/prefix/all/trusted./user.virtiofs./
-    /bad/server//trusted./
-    /bad/client/user.virtiofs.//
-    /ok/all///"
-
-
-Here there are four rules, using / as the field
-separator, and also demonstrating that new lines can
-be included between rules.
-The first rule is the prefixing of 'trusted.' and
-stripping of 'user.virtiofs.'.
-The second rule hides unprefixed 'trusted.' attributes
-on the host.
-The third rule stops a guest from explicitly setting
-the 'user.virtiofs.' path directly to prevent access
-control bypass on the target of the earlier prefix
-remapping.
-Finally, the fourth rule lets all remaining attributes
-through.
-
-This is equivalent to the 'map' rule:
-
-::
-
- -o xattrmap="/map/trusted./user.virtiofs./"
-
-3) Hide 'security.' attributes, and allow everything else
-
-::
-
-    "/bad/all/security./security./
-     /ok/all///'
-
-The first rule combines what could be separate client and server
-rules into a single 'all' rule, matching 'security.' in either
-client arguments or lists returned from the host. This stops
-the client seeing any 'security.' attributes on the server and
-stops it setting any.
-
-SELinux support
----------------
-One can enable support for SELinux by running virtiofsd with option
-"-o security_label". But this will try to save guest's security context
-in xattr security.selinux on host and it might fail if host's SELinux
-policy does not permit virtiofsd to do this operation.
-
-Hence, it is preferred to remap guest's "security.selinux" xattr to say
-"trusted.virtiofs.security.selinux" on host.
-
-"-o xattrmap=:map:security.selinux:trusted.virtiofs.:"
-
-This will make sure that guest and host's SELinux xattrs on same file
-remain separate and not interfere with each other. And will allow both
-host and guest to implement their own separate SELinux policies.
-
-Setting trusted xattr on host requires CAP_SYS_ADMIN. So one will need
-add this capability to daemon.
-
-"-o modcaps=+sys_admin"
-
-Giving CAP_SYS_ADMIN increases the risk on system. Now virtiofsd is more
-powerful and if gets compromised, it can do lot of damage to host system.
-So keep this trade-off in my mind while making a decision.
-
-Examples
---------
-
-Export ``/var/lib/fs/vm001/`` on vhost-user UNIX domain socket
-``/var/run/vm001-vhost-fs.sock``:
-
-.. parsed-literal::
-
-  host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source=/var/lib/fs/vm001
-  host# |qemu_system| \\
-        -chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \\
-        -device vhost-user-fs-pci,chardev=char0,tag=myfs \\
-        -object memory-backend-memfd,id=mem,size=4G,share=on \\
-        -numa node,memdev=mem \\
-        ...
-  guest# mount -t virtiofs myfs /mnt