aboutsummaryrefslogtreecommitdiff
path: root/hw/9pfs
AgeCommit message (Collapse)AuthorFilesLines
2020-05-25xen/9pfs: increase max ring order to 9Stefano Stabellini1-1/+1
The max order allowed by the protocol is 9. Increase the max order supported by QEMU to 9 to increase performance. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20200521192627.15259-3-sstabellini@kernel.org> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-05-25xen/9pfs: yield when there isn't enough room on the ringStefano Stabellini1-6/+25
Instead of truncating replies, which is problematic, wait until the client reads more data and frees bytes on the reply ring. Do that by calling qemu_coroutine_yield(). The corresponding qemu_coroutine_enter_if_inactive() is called from xen_9pfs_bh upon receiving the next notification from the client. We need to be careful to avoid races in case xen_9pfs_bh and the coroutine are both active at the same time. In xen_9pfs_bh, wait until either the critical section is over (ring->co == NULL) or until the coroutine becomes inactive (qemu_coroutine_yield() was called) before continuing. Then, simply wake up the coroutine if it is inactive. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20200521192627.15259-2-sstabellini@kernel.org> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-05-25Revert "9p: init_in_iov_from_pdu can truncate the size"Stefano Stabellini4-39/+22
This reverts commit 16724a173049ac29c7b5ade741da93a0f46edff7. It causes https://bugs.launchpad.net/bugs/1877688. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20200521192627.15259-1-sstabellini@kernel.org> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-05-259p: Lock directory streams with a CoMutexGreg Kurz1-4/+4
Locking was introduced in QEMU 2.7 to address the deprecation of readdir_r(3) in glibc 2.24. It turns out that the frontend code is the worst place to handle a critical section with a pthread mutex: the code runs in a coroutine on behalf of the QEMU mainloop and then yields control, waiting for the fsdev backend to process the request in a worker thread. If the client resends another readdir request for the same fid before the previous one finally unlocked the mutex, we're deadlocked. This never bit us because the linux client serializes readdir requests for the same fid, but it is quite easy to demonstrate with a custom client. A good solution could be to narrow the critical section in the worker thread code and to return a copy of the dirent to the frontend, but this causes quite some changes in both 9p.c and codir.c. So, instead of that, in order for people to easily backport the fix to older QEMU versions, let's simply use a CoMutex since all the users for this sit in coroutines. Fixes: 7cde47d4a89d ("9p: add locking to V9fsDir") Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <158981894794.109297.3530035833368944254.stgit@bahia.lan> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-05-259pfs: include linux/limits.h for XATTR_SIZE_MAXDan Robertson1-0/+1
linux/limits.h should be included for the XATTR_SIZE_MAX definition used by v9fs_xattrcreate. Fixes: 3b79ef2cf488 ("9pfs: limit xattr size in xattrcreate") Signed-off-by: Dan Robertson <dan@dlrobertson.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <20200515203015.7090-2-dan@dlrobertson.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-05-15qdev: Unrealize must not failMarkus Armbruster3-5/+5
Devices may have component devices and buses. Device realization may fail. Realization is recursive: a device's realize() method realizes its components, and device_set_realized() realizes its buses (which should in turn realize the devices on that bus, except bus_set_realized() doesn't implement that, yet). When realization of a component or bus fails, we need to roll back: unrealize everything we realized so far. If any of these unrealizes failed, the device would be left in an inconsistent state. Must not happen. device_set_realized() lets it happen: it ignores errors in the roll back code starting at label child_realize_fail. Since realization is recursive, unrealization must be recursive, too. But how could a partly failed unrealize be rolled back? We'd have to re-realize, which can fail. This design is fundamentally broken. device_set_realized() does not roll back at all. Instead, it keeps unrealizing, ignoring further errors. It can screw up even for a device with no buses: if the lone dc->unrealize() fails, it still unregisters vmstate, and calls listeners' unrealize() callback. bus_set_realized() does not roll back either. Instead, it stops unrealizing. Fortunately, no unrealize method can fail, as we'll see below. To fix the design error, drop parameter @errp from all the unrealize methods. Any unrealize method that uses @errp now needs an update. This leads us to unrealize() methods that can fail. Merely passing it to another unrealize method cannot cause failure, though. Here are the ones that do other things with @errp: * virtio_serial_device_unrealize() Fails when qbus_set_hotplug_handler() fails, but still does all the other work. On failure, the device would stay realized with its resources completely gone. Oops. Can't happen, because qbus_set_hotplug_handler() can't actually fail here. Pass &error_abort to qbus_set_hotplug_handler() instead. * hw/ppc/spapr_drc.c's unrealize() Fails when object_property_del() fails, but all the other work is already done. On failure, the device would stay realized with its vmstate registration gone. Oops. Can't happen, because object_property_del() can't actually fail here. Pass &error_abort to object_property_del() instead. * spapr_phb_unrealize() Fails and bails out when remove_drcs() fails, but other work is already done. On failure, the device would stay realized with some of its resources gone. Oops. remove_drcs() fails only when chassis_from_bus()'s object_property_get_uint() fails, and it can't here. Pass &error_abort to remove_drcs() instead. Therefore, no unrealize method can fail before this patch. device_set_realized()'s recursive unrealization via bus uses object_property_set_bool(). Can't drop @errp there, so pass &error_abort. We similarly unrealize with object_property_set_bool() elsewhere, always ignoring errors. Pass &error_abort instead. Several unrealize methods no longer handle errors from other unrealize methods: virtio_9p_device_unrealize(), virtio_input_device_unrealize(), scsi_qdev_unrealize(), ... Much of the deleted error handling looks wrong anyway. One unrealize methods no longer ignore such errors: usb_ehci_pci_exit(). Several realize methods no longer ignore errors when rolling back: v9fs_device_realize_common(), pci_qdev_unrealize(), spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(), virtio_device_realize(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200505152926.18877-17-armbru@redhat.com>
2020-05-14xen-9pfs: Fix log messages of reply errorsChristian Schoenebeck1-4/+5
If delivery of some 9pfs response fails for some reason, log the error message by mentioning the 9P protocol reply type, not by client's request type. The latter could be misleading that the error occurred already when handling the request input. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Message-Id: <ad0e5a9b6abde52502aa40b30661d29aebe1590a.1589132512.git.qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-05-149pfs: local: ignore O_NOATIME if we don't have permissionsOmar Sandoval1-0/+13
QEMU's local 9pfs server passes through O_NOATIME from the client. If the QEMU process doesn't have permissions to use O_NOATIME (namely, it does not own the file nor have the CAP_FOWNER capability), the open will fail. This causes issues when from the client's point of view, it believes it has permissions to use O_NOATIME (e.g., a process running as root in the virtual machine). Additionally, overlayfs on Linux opens files on the lower layer using O_NOATIME, so in this case a 9pfs mount can't be used as a lower layer for overlayfs (cf. https://github.com/osandov/drgn/blob/dabfe1971951701da13863dbe6d8a1d172ad9650/vmtest/onoatimehack.c and https://github.com/NixOS/nixpkgs/issues/54509). Luckily, O_NOATIME is effectively a hint, and is often ignored by, e.g., network filesystems. open(2) notes that O_NOATIME "may not be effective on all filesystems. One example is NFS, where the server maintains the access time." This means that we can honor it when possible but fall back to ignoring it. Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Message-Id: <e9bee604e8df528584693a4ec474ded6295ce8ad.1587149256.git.osandov@fb.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-03-109p/proxy: Fix export_flagsGreg Kurz1-2/+2
The common fsdev options are set by qemu_fsdev_add() before it calls the backend specific option parsing code. In the case of "proxy" this means "writeout" or "readonly" were simply ignored. This has been broken from the beginning. Reported-by: Stéphane Graber <stgraber@ubuntu.com> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Message-Id: <158349633705.1237488.8895481990204796135.stgit@bahia.lan>
2020-02-08hw/9pfs/9p-synth: added directory for readdir testChristian Schoenebeck2-0/+24
This will provide the following virtual files by the 9pfs synth driver: - /ReadDirDir/ReadDirFile99 - /ReadDirDir/ReadDirFile98 ... - /ReadDirDir/ReadDirFile1 - /ReadDirDir/ReadDirFile0 This virtual directory and its virtual 100 files will be used by the upcoming 9pfs readdir tests. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <5408c28c8de25dd575b745cef63bf785305ccef2.1579567020.git.qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-02-089pfs: validate count sent by client with T_readdirChristian Schoenebeck1-0/+9
A good 9p client sends T_readdir with "count" parameter that's sufficiently smaller than client's initially negotiated msize (maximum message size). We perform a check for that though to avoid the server to be interrupted with a "Failed to encode VirtFS reply type 41" transport error message by bad clients. This count value constraint uses msize - 11, because 11 is the header size of R_readdir. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <3990d3891e8ae2074709b56449e96ab4b4b93b7d.1579567020.git.qemu_oss@crudebyte.com> [groug: added comment ] Signed-off-by: Greg Kurz <groug@kaod.org>
2020-02-089pfs: require msize >= 4096Christian Schoenebeck2-0/+23
A client establishes a session by sending a Tversion request along with a 'msize' parameter which client uses to suggest server a maximum message size ever to be used for communication (for both requests and replies) between client and server during that session. If client suggests a 'msize' smaller than 4096 then deny session by server immediately with an error response (Rlerror for "9P2000.L" clients or Rerror for "9P2000.u" clients) instead of replying with Rversion. So far any msize submitted by client with Tversion was simply accepted by server without any check. Introduction of some minimum msize makes sense, because e.g. a msize < 7 would not allow any subsequent 9p operation at all, because 7 is the size of the header section common by all 9p message types. A substantial higher value of 4096 was chosen though to prevent potential issues with some message types. E.g. Rreadlink may yield up to a size of PATH_MAX which is usually 4096, and like almost all 9p message types, Rreadlink is not allowed to be truncated by the 9p protocol. This chosen size also prevents a similar issue with Rreaddir responses (provided client always sends adequate 'count' parameter with Treaddir), because even though directory entries retrieval may be split up over several T/Rreaddir messages; a Rreaddir response must not truncate individual directory entries though. So msize should be large enough to return at least one directory entry with the longest possible file name supported by host. Most file systems support a max. file name length of 255. Largest known file name lenght limit would be currently ReiserFS with max. 4032 bytes, which is also covered by this min. msize value because 4032 + 35 < 4096. Furthermore 4096 is already the minimum msize of the Linux kernel's 9pfs client. Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Message-Id: <8ceecb7fb9fdbeabbe55c04339349a36929fb8e3.1579567019.git.qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-01-27Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into stagingPeter Maydell1-1/+1
* Register qdev properties as class properties (Marc-André) * Cleanups (Philippe) * virtio-scsi fix (Pan Nengyuan) * Tweak Skylake-v3 model id (Kashyap) * x86 UCODE_REV support and nested live migration fix (myself) * Advisory mode for pvpanic (Zhenwei) # gpg: Signature made Fri 24 Jan 2020 20:16:23 GMT # gpg: using RSA key BFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full] # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full] # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream: (58 commits) build-sys: clean up flags included in the linker command line target/i386: Add the 'model-id' for Skylake -v3 CPU models qdev: use object_property_help() qapi/qmp: add ObjectPropertyInfo.default-value qom: introduce object_property_help() qom: simplify qmp_device_list_properties() vl: print default value in object help qdev: register properties as class properties qdev: move instance properties to class properties qdev: rename DeviceClass.props qdev: set properties with device_class_set_props() object: return self in object_ref() object: release all props object: add object_class_property_add_link() object: express const link with link property object: add direct link flag object: rename link "child" to "target" object: check strong flag with & object: do not free class properties object: add object_property_set_default ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-24qdev: set properties with device_class_set_props()Marc-André Lureau1-1/+1
The following patch will need to handle properties registration during class_init time. Let's use a device_class_set_props() setter. spatch --macro-file scripts/cocci-macro-file.h --sp-file ./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place --dir . @@ typedef DeviceClass; DeviceClass *d; expression val; @@ - d->props = val + device_class_set_props(d, val) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-22virtio-9p-device: convert to new virtio_delete_queuePan Nengyuan1-1/+1
Use virtio_delete_queue to make it more clear. Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200117060927.51996-3-pannengyuan@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2020-01-22virtio-9p-device: fix memleak in virtio_9p_device_unrealizePan Nengyuan1-0/+1
v->vq forgot to cleanup in virtio_9p_device_unrealize, the memory leak stack is as follow: Direct leak of 14336 byte(s) in 2 object(s) allocated from: #0 0x7f819ae43970 (/lib64/libasan.so.5+0xef970) ??:? #1 0x7f819872f49d (/lib64/libglib-2.0.so.0+0x5249d) ??:? #2 0x55a3a58da624 (./x86_64-softmmu/qemu-system-x86_64+0x2c14624) /mnt/sdb/qemu/hw/virtio/virtio.c:2327 #3 0x55a3a571bac7 (./x86_64-softmmu/qemu-system-x86_64+0x2a55ac7) /mnt/sdb/qemu/hw/9pfs/virtio-9p-device.c:209 #4 0x55a3a58e7bc6 (./x86_64-softmmu/qemu-system-x86_64+0x2c21bc6) /mnt/sdb/qemu/hw/virtio/virtio.c:3504 #5 0x55a3a5ebfb37 (./x86_64-softmmu/qemu-system-x86_64+0x31f9b37) /mnt/sdb/qemu/hw/core/qdev.c:876 Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com> Message-Id: <20200117060927.51996-2-pannengyuan@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Acked-by: Greg Kurz <groug@kaod.org>
2020-01-209pfs/9p.c: remove unneeded labelsDaniel Henrique Barboza1-6/+3
'out' label in v9fs_xattr_write() and 'out_nofid' label in v9fs_complete_rename() can be replaced by appropriate return calls. CC: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> Acked-by: Greg Kurz <groug@kaod.org> Signed-off-by: Greg Kurz <groug@kaod.org>
2020-01-209p: init_in_iov_from_pdu can truncate the sizeGreg Kurz4-21/+38
init_in_iov_from_pdu might not be able to allocate the full buffer size requested, which comes from the client and could be larger than the transport has available at the time of the request. Specifically, this can happen with read operations, with the client requesting a read up to the max allowed, which might be more than the transport has available at the time. Today the implementation of init_in_iov_from_pdu throws an error, both Xen and Virtio. Instead, change the V9fsTransport interface so that the size becomes a pointer and can be limited by the implementation of init_in_iov_from_pdu. Change both the Xen and Virtio implementations to set the size to the size of the buffer they managed to allocate, instead of throwing an error. However, if the allocated buffer size is less than P9_IOHDRSZ (the size of the header) still throw an error as the case is unhandable. Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> CC: groug@kaod.org CC: anthony.perard@citrix.com CC: roman@zededa.com CC: qemu_oss@crudebyte.com [groug: fix 32-bit build] Signed-off-by: Greg Kurz <groug@kaod.org>
2020-01-209p: local: always return -1 on error in local_unlinkat_commonDaniel Henrique Barboza1-8/+6
local_unlinkat_common() is supposed to always return -1 on error. This is being done by jumps to the 'err_out' label, which is a 'return ret' call, and 'ret' is initialized with -1. Unfortunately there is a condition in which the function will return 0 on error: in a case where flags == AT_REMOVEDIR, 'ret' will be 0 when reaching map_dirfd = openat_dir(...) And, if map_dirfd == -1 and errno != ENOENT, the existing 'err_out' jump will execute 'return ret', when ret is still set to zero at that point. This patch fixes it by changing all 'err_out' labels by 'return -1' calls, ensuring that the function will always return -1 on error conditions. 'ret' can be left unintialized since it's now being used just to store the result of 'unlinkat' calls. CC: Greg Kurz <groug@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> [groug: changed prefix in title to be "9p: local:"] Signed-off-by: Greg Kurz <groug@kaod.org>
2020-01-209pfs: local: Fix possible memory leak in local_link()Jiajun Chen1-1/+1
There is a possible memory leak while local_link return -1 without free odirpath and oname. Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Jaijun Chen <chenjiajun8@huawei.com> Signed-off-by: Xiang Zheng <zhengxiang9@huawei.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2019-12-189pfs: make Error **errp const where it is appropriateVladimir Sementsov-Ogievskiy2-2/+2
Mostly, Error ** is for returning error from the function, so the callee sets it. However error_append_security_model_hint and error_append_socket_sockfd_hint get already filled errp parameter. They don't change the pointer itself, only change the internal state of referenced Error object. So we can make it Error *const * errp, to stress the behavior. It will also help coccinelle script (in future) to distinguish such cases from common errp usage. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Greg Kurz <groug@kaod.org> Message-Id: <20191205174635.18758-9-vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Commit message replaced] Signed-off-by: Markus Armbruster <armbru@redhat.com>
2019-11-239pfs: Fix divide by zero bugDan Schatzberg1-2/+4
Some filesystems may return 0s in statfs (trivially, a FUSE filesystem can do so). QEMU should handle this gracefully and just behave the same as if statfs failed. Signed-off-by: Dan Schatzberg <dschatzberg@fb.com> Acked-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2019-10-109p: Use variable length suffixes for inode remappingChristian Schoenebeck2-33/+279
Use variable length suffixes for inode remapping instead of the fixed 16 bit size prefixes before. With this change the inode numbers on guest will typically be much smaller (e.g. around >2^1 .. >2^7 instead of >2^48 with the previous fixed size inode remapping. Additionally this solution is more efficient, since inode numbers in practice can take almost their entire 64 bit range on guest as well, so there is less likely a need for generating and tracking additional suffixes, which might also be beneficial for nested virtualization where each level of virtualization would shift up the inode bits and increase the chance of expensive remapping actions. The "Exponential Golomb" algorithm is used as basis for generating the variable length suffixes. The algorithm has a parameter k which controls the distribution of bits on increasing indeces (minimum bits at low index vs. maximum bits at high index). With k=0 the generated suffixes look like: Index Dec/Bin -> Generated Suffix Bin 1 [1] -> [1] (1 bits) 2 [10] -> [010] (3 bits) 3 [11] -> [110] (3 bits) 4 [100] -> [00100] (5 bits) 5 [101] -> [10100] (5 bits) 6 [110] -> [01100] (5 bits) 7 [111] -> [11100] (5 bits) 8 [1000] -> [0001000] (7 bits) 9 [1001] -> [1001000] (7 bits) 10 [1010] -> [0101000] (7 bits) 11 [1011] -> [1101000] (7 bits) 12 [1100] -> [0011000] (7 bits) ... 65533 [1111111111111101] -> [1011111111111111000000000000000] (31 bits) 65534 [1111111111111110] -> [0111111111111111000000000000000] (31 bits) 65535 [1111111111111111] -> [1111111111111111000000000000000] (31 bits) Hence minBits=1 maxBits=31 And with k=5 they would look like: Index Dec/Bin -> Generated Suffix Bin 1 [1] -> [000001] (6 bits) 2 [10] -> [100001] (6 bits) 3 [11] -> [010001] (6 bits) 4 [100] -> [110001] (6 bits) 5 [101] -> [001001] (6 bits) 6 [110] -> [101001] (6 bits) 7 [111] -> [011001] (6 bits) 8 [1000] -> [111001] (6 bits) 9 [1001] -> [000101] (6 bits) 10 [1010] -> [100101] (6 bits) 11 [1011] -> [010101] (6 bits) 12 [1100] -> [110101] (6 bits) ... 65533 [1111111111111101] -> [0011100000000000100000000000] (28 bits) 65534 [1111111111111110] -> [1011100000000000100000000000] (28 bits) 65535 [1111111111111111] -> [0111100000000000100000000000] (28 bits) Hence minBits=6 maxBits=28 Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2019-10-109p: stat_to_qid: implement slow pathAntonios Motakis2-7/+76
stat_to_qid attempts via qid_path_prefixmap to map unique files (which are identified by 64 bit inode nr and 32 bit device id) to a 64 QID path value. However this implementation makes some assumptions about inode number generation on the host. If qid_path_prefixmap fails, we still have 48 bits available in the QID path to fall back to a less memory efficient full mapping. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next (SHA1 7fc4c49e91). - Updated hash calls to new xxhash API. - Removed unnecessary parantheses in qpf_lookup_func(). - Removed unnecessary g_malloc0() result checks. - Log error message when running out of prefixes in qid_path_fullmap(). - Log warning message about potential degraded performance in qid_path_prefixmap(). - Wrapped qpf_table initialization to dedicated qpf_table_init() function. - Fixed typo in comment. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2019-10-109p: Added virtfs option 'multidevs=remap|forbid|warn'Antonios Motakis3-21/+200
'warn' (default): Only log an error message (once) on host if more than one device is shared by same export, except of that just ignore this config error though. This is the default behaviour for not breaking existing installations implying that they really know what they are doing. 'forbid': Like 'warn', but except of just logging an error this also denies access of guest to additional devices. 'remap': Allows to share more than one device per export by remapping inodes from host to guest appropriately. To support multiple devices on the 9p share, and avoid qid path collisions we take the device id as input to generate a unique QID path. The lowest 48 bits of the path will be set equal to the file inode, and the top bits will be uniquely assigned based on the top 16 bits of the inode and the device id. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next (SHA1 7fc4c49e91). - Added virtfs option 'multidevs', original patch simply did the inode remapping without being asked. - Updated hash calls to new xxhash API. - Updated docs for new option 'multidevs'. - Fixed v9fs_do_readdir() not having remapped inodes. - Log error message when running out of prefixes in qid_path_prefixmap(). - Fixed definition of QPATH_INO_MASK. - Wrapped qpp_table initialization to dedicated qpp_table_init() function. - Dropped unnecessary parantheses in qpp_lookup_func(). - Dropped unnecessary g_malloc0() result checks. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> [groug: - Moved "multidevs" parsing to the local backend. - Added hint to invalid multidevs option error. - Turn "remap" into "x-remap". ] Signed-off-by: Greg Kurz <groug@kaod.org>
2019-10-109p: Treat multiple devices on one export as an errorAntonios Motakis2-14/+57
The QID path should uniquely identify a file. However, the inode of a file is currently used as the QID path, which on its own only uniquely identifies files within a device. Here we track the device hosting the 9pfs share, in order to prevent security issues with QID path collisions from other devices. We only print a warning for now but a subsequent patch will allow users to have finer control over the desired behaviour. Failing the I/O will be one the proposed behaviour, so we also change stat_to_qid() to return an error here in order to keep other patches simpler. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Assign dev_id to export root's device already in v9fs_device_realize_common(), not postponed in stat_to_qid(). - error_report_once() if more than one device was shared by export. - Return -ENODEV instead of -ENOSYS in stat_to_qid(). - Fixed typo in log comment. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> [groug, changed to warning, updated message and changelog] Signed-off-by: Greg Kurz <groug@kaod.org>
2019-10-10fsdev: Add return value to fsdev_throttle_parse_opts()Greg Kurz1-2/+1
It is more convenient to use the return value of the function to notify errors, rather than to be tied up setting up the &local_err boilerplate. Signed-off-by: Greg Kurz <groug@kaod.org>
2019-10-109p: Simplify error path of v9fs_device_realize_common()Greg Kurz3-10/+14
Make v9fs_device_unrealize_common() idempotent and use it for rollback, in order to reduce code duplication. Signed-off-by: Greg Kurz <groug@kaod.org>
2019-10-109p: unsigned type for type, version, pathAntonios Motakis2-10/+10
There is no need for signedness on these QID fields for 9p. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Also make QID type unsigned. - Adjust donttouch_stat() to new types. - Adjust trace-events to new types. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>
2019-08-209p: simplify source file selectionPaolo Bonzini1-0/+5
Express the complex conditions in Kconfig rather than Makefiles, since Kconfig is better suited at expressing dependencies and detecting contradictions. Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-16Include hw/qdev-properties.h lessMarkus Armbruster1-0/+1
In my "build everything" tree, changing hw/qdev-properties.h triggers a recompile of some 2700 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). Many places including hw/qdev-properties.h (directly or via hw/qdev.h) actually need only hw/qdev-core.h. Include hw/qdev-core.h there instead. hw/qdev.h is actually pointless: all it does is include hw/qdev-core.h and hw/qdev-properties.h, which in turn includes hw/qdev-core.h. Replace the remaining uses of hw/qdev.h by hw/qdev-properties.h. While there, delete a few superfluous inclusions of hw/qdev-core.h. Touching hw/qdev-properties.h now recompiles some 1200 objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Daniel P. Berrangé" <berrange@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20190812052359.30071-22-armbru@redhat.com>
2019-08-16Include qemu/main-loop.h lessMarkus Armbruster8-1/+7
In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16Include hw/hw.h exactly where neededMarkus Armbruster1-1/+0
In my "build everything" tree, changing hw/hw.h triggers a recompile of some 2600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). The previous commits have left only the declaration of hw_error() in hw/hw.h. This permits dropping most of its inclusions. Touching it now recompiles less than 200 objects. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <20190812052359.30071-19-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-06-24xen: Import other xen/io/*.hAnthony PERARD1-2/+2
A Xen public header have been imported into QEMU (by f65eadb639 "xen: import ring.h from xen"), but there are other header that depends on ring.h which come from the system when building QEMU. This patch resolves the issue of having headers from the system importing a different copie of ring.h. This patch is prompt by the build issue described in the previous patch: 'Revert xen/io/ring.h of "Clean up a few header guard symbols"' ring.h and the new imported headers are moved to "include/hw/xen/interface" as those describe interfaces with a guest. The imported headers are cleaned up a bit while importing them: some part of the file that QEMU doesn't use are removed (description of how to make hypercall in grant_table.h have been removed). Other cleanup: - xen-mapcache.c and xen-legacy-backend.c don't need grant_table.h. - xenfb.c doesn't need event_channel.h. Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Message-Id: <20190621105441.3025-3-anthony.perard@citrix.com>
2019-06-12Supply missing header guardsMarkus Armbruster1-1/+5
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190604181618.19980-5-armbru@redhat.com>
2019-06-12Include qemu-common.h exactly where neededMarkus Armbruster2-1/+1
No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]
2019-06-12Include qemu/module.h where needed, drop it from qemu-common.hMarkus Armbruster1-0/+1
Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-4-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for hw/usb/dev-hub.c hw/misc/exynos4210_rng.c hw/misc/bcm2835_rng.c hw/misc/aspeed_scu.c hw/display/virtio-vga.c hw/arm/stm32f205_soc.c; ui/cocoa.m fixed up]
2019-03-22trace-events: Fix attribution of trace points to sourceMarkus Armbruster1-1/+1
Some trace points are attributed to the wrong source file. Happens when we neglect to update trace-events for code motion, or add events in the wrong place, or misspell the file name. Clean up with help of cleanup-trace-events.pl. Same funnies as in the previous commit, of course. Manually shorten its change to linux-user/trace-events to */signal.c. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 20190314180929.27722-6-armbru@redhat.com Message-Id: <20190314180929.27722-6-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-22trace-events: Shorten file names in commentsMarkus Armbruster1-1/+1
We spell out sub/dir/ in sub/dir/trace-events' comments pointing to source files. That's because when trace-events got split up, the comments were moved verbatim. Delete the sub/dir/ part from these comments. Gets rid of several misspellings. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190314180929.27722-3-armbru@redhat.com Message-Id: <20190314180929.27722-3-armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-03-07virtio: express virtio dependencies with KconfigYang Zhong1-1/+1
Signed-off-by: Yang Zhong <yang.zhong@intel.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190123065618.3520-42-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07build: switch to KconfigPaolo Bonzini1-0/+2
The make_device_config.sh script is replaced by minikconf, which is modified to support the same command line as its predecessor. The roots of the parsing are default-configs/*.mak, Kconfig.host and hw/Kconfig. One difference with make_device_config.sh is that all symbols have to be defined in a Kconfig file, including those coming from the configure script. This is the reason for the Kconfig.host file introduced in the previous patch. Whenever a file in default-configs/*.mak used $(...) to refer to a config-host.mak symbol, this is replaced by a Kconfig dependency; this part must be done already in this patch for bisectability. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Acked-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190123065618.3520-28-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-07kconfig: introduce kconfig filesPaolo Bonzini1-0/+2
The Kconfig files were generated mostly with this script: for i in `grep -ho CONFIG_[A-Z0-9_]* default-configs/* | sort -u`; do set fnord `git grep -lw $i -- 'hw/*/Makefile.objs' ` shift if test $# = 1; then cat >> $(dirname $1)/Kconfig << EOF config ${i#CONFIG_} bool EOF git add $(dirname $1)/Kconfig else echo $i $* fi done sed -i '$d' hw/*/Kconfig for i in hw/*; do if test -d $i && ! test -f $i/Kconfig; then touch $i/Kconfig git add $i/Kconfig fi done Whenever a symbol is referenced from multiple subdirectories, the script prints the list of directories that reference the symbol. These symbols have to be added manually to the Kconfig files. Kconfig.host and hw/Kconfig were created manually. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20190123065618.3520-27-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-03-079pfs: remove unnecessary conditionalsPaolo Bonzini1-2/+0
The VIRTIO_9P || VIRTFS && XEN condition can be computed in hw/Makefile.objs, removing an "if" from hw/9pfs/Makefile.objs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-01-14xen: re-name XenDevice to XenLegacyDevice...Paul Durrant1-8/+8
...and xen_backend.h to xen-legacy-backend.h Rather than attempting to convert the existing backend infrastructure to be QOM compliant (which would be hard to do in an incremental fashion), subsequent patches will introduce a completely new framework for Xen PV backends. Hence it is necessary to re-name parts of existing code to avoid name clashes. The re-named 'legacy' infrastructure will be removed once all backends have been ported to the new framework. This patch is purely cosmetic. No functional change. Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Anthony Perard <anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2018-12-129p: remove support for the "handle" backendGreg Kurz2-711/+0
The "handle" fsdev backend was deprecated in QEMU 2.12.0 with: commit db3b3c7281ca82e2647e072a1f97db111313dd73 Author: Greg Kurz <groug@kaod.org> Date: Mon Jan 8 11:18:23 2018 +0100 9pfs: deprecate handle backend This backend raise some concerns: - doesn't support symlinks - fails +100 tests in the PJD POSIX file system test suite [1] - requires the QEMU process to run with the CAP_DAC_READ_SEARCH capability, which isn't recommended for security reasons This backend should not be used and wil be removed. The 'local' backend is the recommended alternative. [1] https://www.tuxera.com/community/posix-test-suite/ Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> It has passed the two release cooling period without any complaint. Remove it now. Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Thomas Huth <thuth@redhat.com>
2018-12-12xen/9pfs: use g_new(T, n) instead of g_malloc(sizeof(T) * n)Greg Kurz1-3/+3
Because it is a recommended coding practice (see HACKING). Signed-off-by: Greg Kurz <groug@kaod.org> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2018-12-129p: use g_new(T, n) instead of g_malloc(sizeof(T) * n)Greg Kurz1-2/+2
Because it is a recommended coding practice (see HACKING). Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2018-11-239p: fix QEMU crash when renaming filesGreg Kurz1-0/+3
When using the 9P2000.u version of the protocol, the following shell command line in the guest can cause QEMU to crash: while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done With 9P2000.u, file renaming is handled by the WSTAT command. The v9fs_wstat() function calls v9fs_complete_rename(), which calls v9fs_fix_path() for every fid whose path is affected by the change. The involved calls to v9fs_path_copy() may race with any other access to the fid path performed by some worker thread, causing a crash like shown below: Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault. 0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59 59 while (*path && fd != -1) { (gdb) bt #0 0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59 #1 0x0000555555a25e0c in local_opendir_nofollow (fs_ctx=0x555557d958b8, path=0x0) at hw/9pfs/9p-local.c:92 #2 0x0000555555a261b8 in local_lstat (fs_ctx=0x555557d958b8, fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185 #3 0x0000555555a2b367 in v9fs_co_lstat (pdu=0x555557d97498, path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53 #4 0x0000555555a1e9e2 in v9fs_stat (opaque=0x555557d97498) at hw/9pfs/9p.c:1083 #5 0x0000555555e060a2 in coroutine_trampoline (i0=-669165424, i1=32767) at util/coroutine-ucontext.c:116 #6 0x00007fffef4f5600 in __start_context () at /lib64/libc.so.6 #7 0x0000000000000000 in () (gdb) The fix is to take the path write lock when calling v9fs_complete_rename(), like in v9fs_rename(). Impact: DoS triggered by unprivileged guest users. Fixes: CVE-2018-19489 Cc: P J P <ppandit@redhat.com> Reported-by: zhibin hu <noirfate@gmail.com> Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Greg Kurz <groug@kaod.org>
2018-11-209p: take write lock on fid path updates (CVE-2018-19364)Greg Kurz1-0/+15
Recent commit 5b76ef50f62079a fixed a race where v9fs_co_open2() could possibly overwrite a fid path with v9fs_path_copy() while it is being accessed by some other thread, ie, use-after-free that can be detected by ASAN with a custom 9p client. It turns out that the same can happen at several locations where v9fs_path_copy() is used to set the fid path. The fix is again to take the write lock. Fixes CVE-2018-19364. Cc: P J P <ppandit@redhat.com> Reported-by: zhibin hu <noirfate@gmail.com> Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org> Signed-off-by: Greg Kurz <groug@kaod.org>
2018-11-089p: write lock path in v9fs_co_open2()Greg Kurz1-3/+3
The assumption that the fid cannot be used by any other operation is wrong. At least, nothing prevents a misbehaving client to create a file with a given fid, and to pass this fid to some other operation at the same time (ie, without waiting for the response to the creation request). The call to v9fs_path_copy() performed by the worker thread after the file was created can race with any access to the fid path performed by some other thread. This causes use-after-free issues that can be detected by ASAN with a custom 9p client. Unlike other operations that only read the fid path, v9fs_co_open2() does modify it. It should hence take the write lock. Cc: P J P <ppandit@redhat.com> Reported-by: zhibin hu <noirfate@gmail.com> Signed-off-by: Greg Kurz <groug@kaod.org>