Age | Commit message | Author | Files | Lines
2016-09-27 | qdisk - hw/block/xen_disk: grant copy implementation | Paulina Szubarczyk | 3 | -5/+217
Copy data operated on during a request from/to local buffers to/from the grant references. Before the grant copy operation, local buffers must be allocated, which is done by calling ioreq_init_copy_buffers. For the 'read' operation, the qemu device first invokes the read operation on local buffers; on completion, grant copy is called and the buffers are freed. For the 'write' operation, grant copy is performed before the qemu device invokes the write. A new value 'feature_grant_copy' is added to recognize when the grant copy operation is supported by a guest. Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2016-09-27 | Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging | Peter Maydell | 15 | -302/+540
x86 and machine queue, 2016-09-27
# gpg: Signature made Tue 27 Sep 2016 21:10:06 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
  sysbus: Remove ignored return value of FindSysbusDeviceFunc
  target-i386: Remove has_msr_* global vars for KVM features
  target-i386: Clear KVM CPUID features if KVM is disabled
  target-i386: Remove has_msr_hv_tsc global variable
  target-i386: Remove has_msr_hv_apic global variable
  target-i386: Remove has_msr_mtrr global variable
  target-i386: Move xsave component mask to features array
  target-i386: xsave: Calculate set of xsave components on realize
  target-i386: xsave: Helper function to calculate xsave area size
  target-i386: xsave: Simplify CPUID[0xD,0].{EAX,EDX} calculation
  target-i386: xsave: Calculate enabled components only once
  target-i386: Don't try to enable PT State xsave component
  target-i386: Move feature name arrays inside FeatureWordInfo
  linux-user: remove #define smp_{cores, threads}
  target-i386: Enable CPUID[0x8000000A] if SVM is enabled
  target-i386: Automatically set level/xlevel/xlevel2 when needed
  tests: Test CPUID level handling for old machines
  tests: Add test code for CPUID level/xlevel handling
  target-i386: Add a marker to end of the region zeroed on reset
  target-i386: Remove unused X86CPUDefinition::xlevel2 field
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-27 | sysbus: Remove ignored return value of FindSysbusDeviceFunc | David Gibson | 6 | -16/+8
Functions of type FindSysbusDeviceFunc currently return an integer. However, this return value is always ignored by the caller in find_sysbus_device(). This changes the function type to return void, to avoid confusion over the function semantics. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Remove has_msr_* global vars for KVM features | Eduardo Habkost | 1 | -15/+6
The global variables are not necessary because we can check KVM feature flags in X86CPU directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Clear KVM CPUID features if KVM is disabled | Eduardo Habkost | 1 | -0/+4
This will ensure all checks for features[FEAT_KVM] in the code will be correct in case the KVM CPUID leaf is completely disabled. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Remove has_msr_hv_tsc global variable | Eduardo Habkost | 1 | -6/+8
The global variable is not necessary because we can check cpu->hyperv_time directly. We just need to ensure cpu->hyperv_time will be cleared if the feature is not really being exposed to the guest due to missing KVM_CAP_HYPERV_TIME capability. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Remove has_msr_hv_apic global variable | Eduardo Habkost | 1 | -5/+3
The global variable is not necessary because we can check cpu->hyperv_vapic directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Remove has_msr_mtrr global variable | Eduardo Habkost | 1 | -6/+2
The global variable is not necessary because we can check the CPU feature flags directly. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Move xsave component mask to features array | Eduardo Habkost | 2 | -15/+30
This will reuse the existing check/enforce logic in x86_cpu_filter_features() to check the xsave component bits against GET_SUPPORTED_CPUID. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: xsave: Calculate set of xsave components on realize | Eduardo Habkost | 2 | -23/+33
Instead of doing complex calculations and calling kvm_arch_get_supported_cpuid() inside cpu_x86_cpuid(), calculate the set of required XSAVE components earlier, at realize time. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: xsave: Helper function to calculate xsave area size | Eduardo Habkost | 1 | -7/+15
Move the xsave area size calculation from cpu_x86_cpuid() inside its own function. While doing it, change it to use the XSAVE area struct sizes for the initial size, instead of the magic 0x240 number. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: xsave: Simplify CPUID[0xD,0].{EAX,EDX} calculation | Eduardo Habkost | 1 | -6/+2
Instead of assigning individual bits in a loop, just copy the values from ena_mask. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
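A minimal sketch of the simplification described above, assuming ena_mask is a 64-bit mask whose low half maps to EAX and whose high half maps to EDX. The helper name xsave_mask_to_cpuid is hypothetical and is not taken from the QEMU source.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU's code): split a 64-bit XSAVE component
 * mask into the two 32-bit halves reported by CPUID[0xD,0].EAX and .EDX,
 * instead of assigning each bit in a loop.
 */
static void xsave_mask_to_cpuid(uint64_t ena_mask, uint32_t *eax, uint32_t *edx)
{
    *eax = (uint32_t)ena_mask;          /* low 32 bits of the mask */
    *edx = (uint32_t)(ena_mask >> 32);  /* high 32 bits of the mask */
}
```

Copying both halves in one step yields the same register values as a per-bit loop over the mask.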
2016-09-27 | target-i386: xsave: Calculate enabled components only once | Eduardo Habkost | 1 | -10/+16
Instead of checking both env->features and ena_mask at two different places in the CPUID code, initialize ena_mask based on the features that are enabled for the CPU, and then clear unsupported bits based on kvm_arch_get_supported_cpuid(). The results should be exactly the same, but it will make it easier to move the mask calculation elsewhere, and reuse x86_cpu_filter_features() for the kvm_arch_get_supported_cpuid() check. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Don't try to enable PT State xsave component | Eduardo Habkost | 1 | -3/+3
The code that calculates the set of supported XSAVE components on CPUID looks at ext_save_areas to find out which components should be enabled. However, if there are zeroed entries in the ext_save_areas array, the ((env->features[esa->feature] & esa->bits) == esa->bits) check will always succeed and QEMU will unconditionally try to enable the component. Luckily this never caused any problems because the only missing entry in ext_save_areas is the PT State component (bit 8), and KVM currently doesn't support it (so it was cleared on ena_mask). But the code was still incorrect and would break if KVM starts returning CPUID[EAX=0xD,ECX=0].EAX[bit 8] as supported on GET_SUPPORTED_CPUID. Fix the problem by changing the code to not enable an XSAVE component if ExtSaveArea::bits is zero. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
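The pitfall above can be reproduced in isolation. This is a sketch under simplifying assumptions: ExtSaveAreaSketch and both function names are hypothetical stand-ins, and the feature word is passed in directly rather than looked up through env->features.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for an ext_save_areas entry (hypothetical type). */
typedef struct {
    uint32_t bits;  /* feature bits that must all be set; 0 = unused entry */
} ExtSaveAreaSketch;

/* Original form of the check: vacuously true when bits == 0. */
static bool component_enabled_old(uint32_t features, const ExtSaveAreaSketch *esa)
{
    return (features & esa->bits) == esa->bits;
}

/* Fixed form: an entry with no feature bits never enables a component. */
static bool component_enabled_fixed(uint32_t features, const ExtSaveAreaSketch *esa)
{
    return esa->bits != 0 && (features & esa->bits) == esa->bits;
}
```

With a zeroed entry, `(features & 0) == 0` holds for every feature word, which is why the extra `bits != 0` test is needed.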
2016-09-27 | target-i386: Move feature name arrays inside FeatureWordInfo | Eduardo Habkost | 1 | -200/+170
It makes it easier to guarantee the arrays are the right size, and to find information when looking at the code. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | linux-user: remove #define smp_{cores, threads} | Marc-André Lureau | 3 | -9/+7
Those are unneeded now that CPUState nr_{cores,threads} is always initialized. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Enable CPUID[0x8000000A] if SVM is enabled | Eduardo Habkost | 2 | -5/+14
SVM needs CPUID[0x8000000A] to be available. So if SVM is enabled in a CPU model or explicitly in the command-line, adjust CPUID xlevel to expose the CPUID[0x8000000A] leaf. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Automatically set level/xlevel/xlevel2 when needed | Eduardo Habkost | 4 | -13/+133
Instead of requiring users and management software to be aware of required CPUID level/xlevel/xlevel2 values for each feature, automatically increase those values when features need them. This was already done for CPUID[7].EBX, and is now made generic for all CPUID feature flags. Unit test included, to make sure we don't break ABI on older machine-types and don't mess with the CPUID level values if they are explicitly set by the user. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
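The rule described above (raise the level only when a feature needs it, and never override an explicit user setting) can be sketched as follows. The function and parameter names are hypothetical; the actual QEMU logic and its per-machine-type compatibility handling are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the CPUID level to expose to the guest.
 *   level         - level from the CPU model or the command line
 *   user_set      - true if the user set the level explicitly
 *   feature_word  - feature flags stored in some CPUID leaf
 *   leaf_required - lowest level at which that leaf becomes visible
 */
static uint32_t effective_cpuid_level(uint32_t level, bool user_set,
                                      uint32_t feature_word,
                                      uint32_t leaf_required)
{
    if (user_set) {
        return level;          /* explicit values are left untouched */
    }
    if (feature_word != 0 && level < leaf_required) {
        return leaf_required;  /* raise just enough to expose the leaf */
    }
    return level;
}
```

For example, a model with level 4 and a flag set in leaf 7 would end up with level 7 unless the user chose the level explicitly.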
2016-09-27 | tests: Test CPUID level handling for old machines | Eduardo Habkost | 1 | -0/+13
We're going to change the way level/xlevel/xlevel2 are handled when enabling features, but we need to keep the old behavior on existing machine types. Add test cases for that. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | tests: Add test code for CPUID level/xlevel handling | Eduardo Habkost | 3 | -0/+111
Add test code that will check if the automatic CPUID level changes are working as expected. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Add a marker to end of the region zeroed on reset | Eduardo Habkost | 2 | -1/+2
Instead of using cpuid_level, use an empty struct as a marker (like we already did with {start,end}_init_save). This will avoid accidentally resetting the wrong fields if we change the field ordering on CPUX86State. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | target-i386: Remove unused X86CPUDefinition::xlevel2 field | Eduardo Habkost | 1 | -2/+0
No CPU model in builtin_x86_defs has xlevel2 set, so it is always zero. Delete the field. Note that this is not a user-visible change. It doesn't remove the ability to set xlevel2 on the command-line, it just removes an unused field in builtin_x86_defs. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 | Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging | Peter Maydell | 24 | -53/+1812
# gpg: Signature made Tue 27 Sep 2016 11:05:56 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request: (27 commits)
  imx_fec: fix error in qemu_send_packet argument
  mcf_fec: fix error in qemu_send_packet argument
  net: mcf: limit buffer descriptor count
  e1000e: Fix EIAC register implementation
  e1000e: Fix spurious RX TCP ACK interrupts
  e1000e: Fix OTHER interrupts processing for MSI-X
  e1000e: Fix PBACLR implementation
  e1000e: Fix CTRL_EXT.EIAME behavior
  e1000e: Flush receive queues on link up
  e1000e: Flush all receive queues on receive enable
  net: limit allocation in nc_sendv_compat
  tap: Allow specifying a bridge
  e1000: fix buliding complaint
  docs: Add documentation for COLO-proxy
  MAINTAINERS: add maintainer for COLO-proxy
  filter-rewriter: rewrite tcp packet to keep secondary connection
  filter-rewriter: track connection and parse packet
  filter-rewriter: introduce filter-rewriter initialization
  colo-compare: add TCP, UDP, ICMP packet comparison
  colo-compare: introduce packet comparison thread
  ...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-27 | imx_fec: fix error in qemu_send_packet argument | Paolo Bonzini | 1 | -1/+1
This uses the wrong frame size for packets composed of multiple descriptors. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | mcf_fec: fix error in qemu_send_packet argument | Paolo Bonzini | 1 | -1/+1
This uses the wrong frame size for packets composed of multiple descriptors. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | net: mcf: limit buffer descriptor count | Prasad J Pandit | 1 | -2/+3
The ColdFire Fast Ethernet Controller uses buffer descriptors to manage data flow to and from the receive and transmit queues. While transmitting packets, it could continue to read buffer descriptors indefinitely if a buffer descriptor has a length of zero and crafted values in bd.flags. Set an upper limit on the number of buffer descriptors. Reported-by: Li Qiang <liqiang6-s@360.cn> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
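A sketch of the general technique, bounding a descriptor walk so that guest-controlled flags cannot keep the loop running forever. The type, the flag value, and the cap of 1024 are all assumptions made for illustration and are not taken from the mcf_fec source.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DESC 1024    /* hypothetical upper limit */
#define SKETCH_BD_LAST  0x0800  /* hypothetical "last descriptor" flag */

/* Hypothetical buffer descriptor layout. */
typedef struct {
    uint16_t flags;
    uint16_t length;
} BufDescSketch;

/*
 * Walk a circular descriptor ring until a descriptor carries the "last"
 * flag, but never visit more than SKETCH_MAX_DESC descriptors.
 * ring_len must be greater than zero. Returns the number visited.
 */
static size_t walk_tx_ring(const BufDescSketch *ring, size_t ring_len)
{
    size_t visited = 0;
    size_t idx = 0;

    while (visited < SKETCH_MAX_DESC) {
        const BufDescSketch *bd = &ring[idx];
        visited++;
        if (bd->flags & SKETCH_BD_LAST) {
            break;                   /* normal termination */
        }
        idx = (idx + 1) % ring_len;  /* wrap around the ring */
    }
    return visited;
}
```

Without the cap, a ring in which no descriptor ever sets the terminating flag would be walked endlessly.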
2016-09-27 | e1000e: Fix EIAC register implementation | Dmitry Fleytman | 1 | -5/+9
This patch fixes 2 issues: 1. Bits set in the EIAC register should be cleared from IMS when EIAM is not used. 2. Only the bit that corresponds to the interrupt being raised should be cleared. See spec. 10.2.4.7 Interrupt Auto Clear. Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | e1000e: Fix spurious RX TCP ACK interrupts | Dmitry Fleytman | 1 | -1/+2
Do not raise ACK interrupts when RFCTL.ACKDIS bit is set (see spec. 10.2.5.16). Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | e1000e: Fix OTHER interrupts processing for MSI-X | Dmitry Fleytman | 1 | -1/+1
Interrupt mask for legacy OTHER causes should not apply to MSI-X OTHER cause. Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | e1000e: Fix PBACLR implementation | Dmitry Fleytman | 1 | -1/+1
This patch fixes an incorrect check for the interrupt type being used. The PBACLR register is valid for MSI-X only. See spec. 10.2.3.13 MSI-X PBA Clear. Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | e1000e: Fix CTRL_EXT.EIAME behavior | Dmitry Fleytman | 2 | -3/+3
CTRL_EXT.EIAME bit controls clearing of IAM bits, but current code clears IMS bits instead. See spec. 10.2.2.5 Extended Device Control Register. Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | e1000e: Flush receive queues on link up | Dmitry Fleytman | 1 | -0/+3
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | e1000e: Flush all receive queues on receive enable | Dmitry Fleytman | 3 | -2/+5
Before this patch, only the first netdev queue was flushed. Signed-off-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | net: limit allocation in nc_sendv_compat | Peter Lieven | 1 | -2/+6
We only need to allocate enough memory to hold the packet, which might be less than NET_BUFSIZE. Additionally, fail early if the packet is larger than NET_BUFSIZE. Signed-off-by: Peter Lieven <pl@kamp.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
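The two ideas above (allocate only what the packet needs, and reject oversized packets before allocating) can be sketched like this. IovSketch, flatten_packet, and the SKETCH_NET_BUFSIZE value are illustrative assumptions and do not reproduce the nc_sendv_compat code.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NET_BUFSIZE (4096 + 65536)  /* assumed stand-in for NET_BUFSIZE */

/* Hypothetical scatter/gather element. */
typedef struct {
    const void *base;
    size_t len;
} IovSketch;

/*
 * Flatten a scatter/gather packet into one heap buffer sized to the
 * packet itself. Returns NULL if the packet exceeds the limit or the
 * allocation fails; otherwise the caller frees the returned buffer.
 */
static uint8_t *flatten_packet(const IovSketch *iov, int iovcnt, size_t *out_len)
{
    size_t total = 0;
    size_t off = 0;
    uint8_t *buf;
    int i;

    for (i = 0; i < iovcnt; i++) {
        total += iov[i].len;
    }
    if (total > SKETCH_NET_BUFSIZE) {
        return NULL;                    /* fail early, before allocating */
    }
    buf = malloc(total ? total : 1);    /* exact size, not the maximum */
    if (buf == NULL) {
        return NULL;
    }
    for (i = 0; i < iovcnt; i++) {
        if (iov[i].len > 0) {
            memcpy(buf + off, iov[i].base, iov[i].len);
            off += iov[i].len;
        }
    }
    *out_len = total;
    return buf;
}
```

The sketch does not guard against overflow when summing the lengths, which a production version would need to consider.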
2016-09-27 | tap: Allow specifying a bridge | Alexey Kardashevskiy | 3 | -6/+13
The tap backend is already using qemu-bridge-helper to attach tap interface to a bridge but (unlike the bridge backend) it always uses the default bridge name - br0. This adds a "br" property support to the tap backend. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | e1000: fix buliding complaint | Gonglei | 1 | -1/+1
hw/net/e1000e_core.c:56: warning: e1000e_set_interrupt_cause declared inline after being called hw/net/e1000e_core.c:56: warning: previous declaration of e1000e_set_interrupt_cause was here Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Dmitry Fleytman <dmitry@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | docs: Add documentation for COLO-proxy | Zhang Chen | 1 | -0/+188
Introduce the design of COLO-proxy, and how to use it. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | MAINTAINERS: add maintainer for COLO-proxy | Zhang Chen | 1 | -0/+9
Add Zhang Chen and Li Zhijian as co-maintainers of COLO-proxy. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | filter-rewriter: rewrite tcp packet to keep secondary connection | Zhang Chen | 4 | -2/+124
The filter-rewriter rewrites TCP packets that the secondary receives and sends. Consider the case where the COLO guest is a TCP server. First, the client starts a TCP handshake with a packet carrying seq=client_seq, ack=0, flag=SYN. The COLO primary guest receives this packet and mirrors it (filter-mirror) to the secondary guest, which receives it through filter-redirector. Then the primary guest responds with (seq=primary_seq, ack=client_seq+1, flag=ACK|SYN) and the secondary guest responds with (seq=secondary_seq, ack=client_seq+1, flag=ACK|SYN). At this point filter-rewriter saves secondary_seq in its TCP connection record. To finish the handshake, the client sends (seq=client_seq+1, ack=primary_seq+1, flag=ACK). Here filter-rewriter learns primary_seq, rewrites ack from primary_seq+1 to secondary_seq+1, and recalculates the checksum, so the secondary TCP connection stays valid. When data is sent and received, the client sends (seq=client_seq+1+data_len, ack=primary_seq+1, flag=ACK|PSH); filter-rewriter rewrites ack and sends the packet to the secondary guest. The primary guest responds with (seq=primary_seq+1, ack=client_seq+1+data_len, flag=ACK) and the secondary guest responds with (seq=secondary_seq+1, ack=client_seq+1+data_len, flag=ACK); the secondary guest's seq is rewritten from secondary_seq+1 to primary_seq+1, so the TCP connection stays valid. In the code, offset (= secondary_seq - primary_seq) is used to rewrite seq or ack. handle_primary_tcp_pkt: tcp_pkt->th_ack += offset; handle_secondary_tcp_pkt: tcp_pkt->th_seq -= offset; Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
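The offset arithmetic in the message above can be checked with a small sketch. It assumes sequence numbers are held in host byte order as 32-bit unsigned values, so wraparound works modulo 2^32; TcpHdrSketch and the function names are hypothetical, and byte-order conversion and checksum recalculation are left out.

```c
#include <stdint.h>

/* Hypothetical TCP header subset, fields in host byte order. */
typedef struct {
    uint32_t th_seq;
    uint32_t th_ack;
} TcpHdrSketch;

/* offset = secondary_seq - primary_seq, computed modulo 2^32. */
static uint32_t seq_offset(uint32_t secondary_seq, uint32_t primary_seq)
{
    return secondary_seq - primary_seq;
}

/* Packet heading to the secondary: shift ack into the secondary's space. */
static void rewrite_to_secondary(TcpHdrSketch *pkt, uint32_t offset)
{
    pkt->th_ack += offset;
}

/* Packet coming from the secondary: shift seq back to the primary's space. */
static void rewrite_from_secondary(TcpHdrSketch *pkt, uint32_t offset)
{
    pkt->th_seq -= offset;
}
```

With primary_seq = 1000 and secondary_seq = 5000, the client's ack of 1001 becomes 5001 on the way to the secondary, and the secondary's seq of 5001 becomes 1001 on the way back.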
2016-09-27 | filter-rewriter: track connection and parse packet | Zhang Chen | 3 | -0/+65
We use net/colo.h to track connection and parse packet Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | filter-rewriter: introduce filter-rewriter initialization | Zhang Chen | 4 | -1/+121
Filter-rewriter is part of the COLO project. It rewrites some of the secondary's packets so that the secondary guest's TCP connection can be established successfully. This module rewrites the ack of TCP packets going from the primary to the secondary, and the seq of TCP packets going from the secondary to the primary.
usage:
colo secondary:
  -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
  -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
  -object filter-rewriter,id=rew0,netdev=hn0,queue=all
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | colo-compare: add TCP, UDP, ICMP packet comparison | Zhang Chen | 2 | -4/+146
Add TCP, UDP, and ICMP packet comparison to replace IP packet comparison. This increases the accuracy of the packet comparison; fewer checkpoints mean better efficiency. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | colo-compare: introduce packet comparison thread | Zhang Chen | 4 | -0/+239
If the primary packet is the same as the secondary packet, send the primary packet and drop the secondary packet; otherwise notify the COLO frame to do a checkpoint. If the primary packet arrives but the secondary packet does not, then after REGULAR_PACKET_CHECK_MS milliseconds the primary packet is treated as old_packet and a checkpoint is done. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
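The decision rule described above could be sketched as a single function. Everything here is hypothetical: the types, the byte-for-byte comparison, and the SKETCH_CHECK_MS placeholder that stands in for REGULAR_PACKET_CHECK_MS (whose real value is not given in the message).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CHECK_MS 3000  /* placeholder timeout, value assumed */

typedef enum {
    SEND_PRIMARY,   /* packets match: forward primary, drop secondary */
    DO_CHECKPOINT,  /* mismatch or stale primary packet */
    KEEP_WAITING    /* secondary packet not here yet, still in time */
} CompareAction;

/* Hypothetical packet record. */
typedef struct {
    const uint8_t *data;
    size_t len;
    int64_t arrival_ms;
} PacketSketch;

/* sec may be NULL when no secondary packet has arrived yet. */
static CompareAction compare_step(const PacketSketch *pri,
                                  const PacketSketch *sec, int64_t now_ms)
{
    if (sec == NULL) {
        return (now_ms - pri->arrival_ms > SKETCH_CHECK_MS)
               ? DO_CHECKPOINT : KEEP_WAITING;
    }
    if (pri->len == sec->len && memcmp(pri->data, sec->data, pri->len) == 0) {
        return SEND_PRIMARY;
    }
    return DO_CHECKPOINT;
}
```

A real comparison would look at protocol-specific fields rather than raw bytes, as the later TCP/UDP/ICMP comparison commit suggests.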
2016-09-27 | colo-compare: track connection and enqueue packet | Zhang Chen | 3 | -10/+190
In this patch we use the kernel jhash table to track connections, and then enqueue net packets as follows.
[ASCII figure: CompareState holds a conn list; each conn in the list holds a queue of primary packets and a queue of secondary packets.]
We use conn_list to record connection info. To enqueue a packet, first get the connection from connection_track_table, then push the packet to the g_queue (pri/sec) in its own conn. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | Jhash: add linux kernel jhashtable in qemu | Zhang Chen | 2 | -0/+60
Jhash will be used by colo-compare and filter-rewriter to save and look up net connection info. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | net/colo.c: add colo.c to define and handle packet | Zhang Chen | 5 | -4/+240
net/colo.c is used by colo-compare and filter-rewriter, so that they can share common data structures such as the net packet, along with other functions. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | colo-compare: introduce colo compare initialization | Zhang Chen | 4 | -1/+312
This is a COLO net ASCII figure:
[ASCII figure: Primary qemu and Secondary qemu, each containing a guest and a netfilter layer. Primary netfilter: filter mirror (tx), two filter redirectors (rx), and colo compare with "pri in", "sec in" and "out". Secondary netfilter: filter redirector (tx), TCP rewriter adjusting ack and seq (all), filter redirector (rx). A tap device sits below the primary, with "guest receive" and "guest send" paths.]
NOTE: filter direction is rx/tx/all
rx: receive packets sent to the netdev
tx: receive packets sent by the netdev
COLO-compare does the packet comparison. Packets coming from the primary char indev will be sent to outdev. Packets coming from the secondary char dev will be dropped after comparing. colo-compare needs two input chardevs and one output chardev:
  primary_in=chardev1-id (source: primary send packet)
  secondary_in=chardev2-id (source: secondary send packet)
  outdev=chardev3-id
usage:
primary:
  -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
  -device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
  -chardev socket,id=mirror0,host=3.3.3.3,port=9003,server,nowait
  -chardev socket,id=compare1,host=3.3.3.3,port=9004,server,nowait
  -chardev socket,id=compare0,host=3.3.3.3,port=9001,server,nowait
  -chardev socket,id=compare0-0,host=3.3.3.3,port=9001
  -chardev socket,id=compare_out,host=3.3.3.3,port=9005,server,nowait
  -chardev socket,id=compare_out0,host=3.3.3.3,port=9005
  -object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
  -object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
  -object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
  -object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0
secondary:
  -netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
  -device e1000,netdev=hn0,mac=52:a4:00:12:78:66
  -chardev socket,id=red0,host=3.3.3.3,port=9003
  -chardev socket,id=red1,host=3.3.3.3,port=9004
  -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
  -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | qemu-char: Add qemu_chr_add_handlers_full() for GMaincontext | Zhang Chen | 2 | -25/+63
Add the qemu_chr_add_handlers_full() API. It accepts a GMainContext so that the handlers run in that context rather than in main_loop. This follows comments from Daniel P. Berrange. Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | net: hmp_host_net_remove: Del the -net option of the removed host_net | Shmulik Ladkani | 1 | -0/+1
Upon hmp_host_net_remove(), the appropriate -net client is deleted (according to the given vlan_id and device id), as well as the corresponding hub port. However, the relevant '-net' option that was added by the former hmp_host_net_add() call is still present in the "net" options group. This makes the following legitimate HMP sequence fail:
  (qemu) host_net_add tap id=n1,ifname=tap1,script=no,downscript=no,vlan=1
  (qemu) host_net_remove 1 n1
  (qemu) host_net_add tap id=n1,ifname=tap1,script=no,downscript=no,vlan=1
  Duplicate ID 'n1' for net
Fix this by deleting the stored '-net' option associated with the given device id. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-09-27 | virtio-net: allow increasing rx queue size | Michael S. Tsirkin | 2 | -1/+26
This allows increasing the rx queue size up to 1024: unlike with tx, guests don't put in huge S/G lists into RX so the risk of running into the max 1024 limitation due to some off-by-one seems small. It's helpful for users like OVS-DPDK which don't do any buffering on the host - 1K roughly matches 500 entries in tun + 256 in the current rx queue, which seems to work reasonably well. We could probably make do with ~750 entries but virtio spec limits us to powers of two. It might be a good idea to specify an s/g size limit in a future version. It also might be possible to make the queue size smaller down the road, 64 seems like the minimal value which will still work (as guests seem to assume a queue full of 1.5K buffers is enough to process the largest incoming packet, which is ~64K). No one actually asked for this, and with virtio 1 guests can reduce ring size without need for host configuration, so don't bother with this for now. Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Jason Wang <jasowang@redhat.com> Suggested-by: Patrik Hermansson <phermansson@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
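The constraints mentioned above (a power of two, at most 1024) suggest a validity check along these lines. The function name and macros are hypothetical, and using 256 as the lower bound is an assumption based on the current queue size quoted in the message.

```c
#include <stdbool.h>

#define SKETCH_RX_QUEUE_MIN 256u   /* assumed lower bound (current size) */
#define SKETCH_RX_QUEUE_MAX 1024u  /* upper bound stated in the message */

/* Hypothetical validator for a user-supplied rx queue size. */
static bool rx_queue_size_valid(unsigned int size)
{
    /* A power of two has exactly one bit set. */
    bool is_pow2 = size != 0 && (size & (size - 1)) == 0;

    return is_pow2 && size >= SKETCH_RX_QUEUE_MIN && size <= SKETCH_RX_QUEUE_MAX;
}
```

This is why a value such as 750, which the message says would probably suffice, cannot be used: only 256, 512 and 1024 pass the check.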