commit 184c20bbc978eb7a2e1d3637b7864208822c7ebc Author: Greg Kroah-Hartman Date: Sun Dec 8 08:17:21 2013 -0800 Linux 3.10.23 commit 84a0264e3992db6dd46d4485e171a13be4550d0c Author: Pierre Ossman Date: Wed Nov 6 20:00:32 2013 +0100 drm/radeon/audio: correct ACR table commit 3e71985f2439d8c4090dc2820e497e6f3d72dcff upstream. The values were taken from the HDMI spec, but they assumed exact x/1.001 clocks. Since we round the clocks, we also need to calculate different N and CTS values. Note that the N for 25.2/1.001 MHz at 44.1 kHz audio is out of spec. Hopefully this mode is rarely used and/or HDMI sinks tolerate overly large values of N. bug: https://bugs.freedesktop.org/show_bug.cgi?id=69675 Signed-off-by: Pierre Ossman Signed-off-by: Alex Deucher Cc: Josh Boyer Signed-off-by: Greg Kroah-Hartman commit f789de214b88dd2f5bbb372625b2e284f8469c4b Author: Pierre Ossman Date: Wed Nov 6 20:09:08 2013 +0100 drm/radeon/audio: improve ACR calculation commit a2098250fbda149cfad9e626afe80abe3b21e574 upstream. In order to have any realistic chance of calculating proper ACR values, we need to be able to calculate both N and CTS, not just CTS. We still aim for the ideal N as specified in the HDMI spec though. bug: https://bugs.freedesktop.org/show_bug.cgi?id=69675 Signed-off-by: Pierre Ossman Signed-off-by: Alex Deucher Cc: Josh Boyer Signed-off-by: Greg Kroah-Hartman commit 9baca2ff1035fbdc7f910a4b6fb34d4bec3f443b Author: Miroslav Lichvar Date: Thu Aug 1 19:31:35 2013 +0200 ntp: Make periodic RTC update more reliable commit a97ad0c4b447a132a322cedc3a5f7fa4cab4b304 upstream. The current code requires that the scheduled update of the RTC happens in the closest tick to the half of the second. This seems to be difficult to achieve reliably. The scheduled work may be missing the target time by a tick or two and be constantly rescheduled every second. Relax the limit to 10 ticks. As a typical RTC drifts in the 11-minute update interval by several milliseconds, this shouldn't affect the overall accuracy of the RTC much. Signed-off-by: Miroslav Lichvar Signed-off-by: John Stultz Cc: Josh Boyer Signed-off-by: Greg Kroah-Hartman commit 72b9401c2f5ac465d97969ef7933753642bde8bf Author: Tomoki Sekiyama Date: Tue Oct 15 16:42:19 2013 -0600 elevator: acquire q->sysfs_lock in elevator_change() commit 7c8a3679e3d8e9d92d58f282161760a0e247df97 upstream. Add locking of q->sysfs_lock into elevator_change() (an exported function) to ensure it is held to protect q->elevator from elevator_init(), even if elevator_change() is called from non-sysfs paths. sysfs path (elv_iosched_store) uses __elevator_change(), non-locking version, as the lock is already taken by elv_iosched_store(). Signed-off-by: Tomoki Sekiyama Signed-off-by: Jens Axboe Cc: Josh Boyer Signed-off-by: Greg Kroah-Hartman commit 6d53d39270ccd29938dcbd611a44870c63032521 Author: Tomoki Sekiyama Date: Tue Oct 15 16:42:16 2013 -0600 elevator: Fix a race in elevator switching and md device initialization commit eb1c160b22655fd4ec44be732d6594fd1b1e44f4 upstream. The soft lockup below happens at the boot time of the system using dm multipath and the udev rules to switch scheduler. [ 356.127001] BUG: soft lockup - CPU#3 stuck for 22s! [sh:483] [ 356.127001] RIP: 0010:[] [] lock_timer_base.isra.35+0x1d/0x50 ... [ 356.127001] Call Trace: [ 356.127001] [] try_to_del_timer_sync+0x20/0x70 [ 356.127001] [] ? kmem_cache_alloc_node_trace+0x20a/0x230 [ 356.127001] [] del_timer_sync+0x52/0x60 [ 356.127001] [] cfq_exit_queue+0x32/0xf0 [ 356.127001] [] elevator_exit+0x2f/0x50 [ 356.127001] [] elevator_change+0xf1/0x1c0 [ 356.127001] [] elv_iosched_store+0x20/0x50 [ 356.127001] [] queue_attr_store+0x59/0xb0 [ 356.127001] [] sysfs_write_file+0xc6/0x140 [ 356.127001] [] vfs_write+0xbd/0x1e0 [ 356.127001] [] SyS_write+0x49/0xa0 [ 356.127001] [] system_call_fastpath+0x16/0x1b This is caused by a race between md device initialization by multipathd and shell script to switch the scheduler using sysfs. - multipathd: SyS_ioctl -> do_vfs_ioctl -> dm_ctl_ioctl -> ctl_ioctl -> table_load -> dm_setup_md_queue -> blk_init_allocated_queue -> elevator_init q->elevator = elevator_alloc(q, e); // not yet initialized - sh -c 'echo deadline > /sys/$DEVPATH/queue/scheduler': elevator_switch (in the call trace above) struct elevator_queue *old = q->elevator; q->elevator = elevator_alloc(q, new_e); elevator_exit(old); // lockup! (*) - multipathd: (cont.) err = e->ops.elevator_init_fn(q); // init fails; q->elevator is modified (*) When del_timer_sync() is called, lock_timer_base() will loop infinitely while timer->base == NULL. In this case, as timer will never initialized, it results in lockup. This patch introduces acquisition of q->sysfs_lock around elevator_init() into blk_init_allocated_queue(), to provide mutual exclusion between initialization of the q->scheduler and switching of the scheduler. This should fix this bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=902012 Signed-off-by: Tomoki Sekiyama Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 645451235dc5098f08cc51e2f1205d9fa9ca8262 Author: Neil Horman Date: Fri Sep 27 12:53:35 2013 -0400 iommu: Remove stack trace from broken irq remapping warning commit 05104a4e8713b27291c7bb49c1e7e68b4e243571 upstream. The warning for the irq remapping broken check in intel_irq_remapping.c is pretty pointless. We need the warning, but we know where its comming from, the stack trace will always be the same, and it needlessly triggers things like Abrt. This changes the warning to just print a text warning about BIOS being broken, without the stack trace, then sets the appropriate taint bit. Since we automatically disable irq remapping, theres no need to contiue making Abrt jump at this problem Signed-off-by: Neil Horman CC: Joerg Roedel CC: Bjorn Helgaas CC: Andy Lutomirski CC: Konrad Rzeszutek Wilk CC: Sebastian Andrzej Siewior Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 3de762b3d32ff3d0440ca809e1ba084c682f0ca7 Author: Julian Stecklina Date: Wed Oct 9 10:03:52 2013 +0200 iommu/vt-d: Fixed interaction of VFIO_IOMMU_MAP_DMA with IOMMU address limits commit f9423606ade08653dd8a43334f0a7fb45504c5cc upstream. The BUG_ON in drivers/iommu/intel-iommu.c:785 can be triggered from userspace via VFIO by calling the VFIO_IOMMU_MAP_DMA ioctl on a vfio device with any address beyond the addressing capabilities of the IOMMU. The problem is that the ioctl code calls iommu_iova_to_phys before it calls iommu_map. iommu_map handles the case that it gets addresses beyond the addressing capabilities of its IOMMU. intel_iommu_iova_to_phys does not. This patch fixes iommu_iova_to_phys to return NULL for addresses beyond what the IOMMU can handle. This in turn causes the ioctl call to fail in iommu_map and (correctly) return EFAULT to the user with a helpful warning message in the kernel log. Signed-off-by: Julian Stecklina Acked-by: Alex Williamson Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman commit 7dc39b55ecc16a4db44c820263c73a60fc750b27 Author: Simon Wood Date: Thu Oct 10 08:20:13 2013 -0600 HID: lg: fix Report Descriptor for Logitech MOMO Force (Black) commit 348cbaa800f8161168b20f85f72abb541c145132 upstream. By default the Logitech MOMO Force (Black) presents a combined accel/brake axis ('Y'). This patch modifies the HID descriptor to present seperate accel/brake axes ('Y' and 'Z'). Signed-off-by: Simon Wood Signed-off-by: Jiri Kosina Signed-off-by: Greg Kroah-Hartman commit ca77581ebaa42e9279a231fbcb98ee031c37fe3a Author: Sasha Levin Date: Tue Nov 19 14:25:36 2013 -0500 video: kyro: fix incorrect sizes when copying to userspace commit 2ab68ec927310dc488f3403bb48f9e4ad00a9491 upstream. kyro would copy u32s and specify sizeof(unsigned long) as the size to copy. This would copy more data than intended and cause memory corruption and might leak kernel memory. Signed-off-by: Sasha Levin Signed-off-by: Tomi Valkeinen Signed-off-by: Greg Kroah-Hartman commit f99510dc9a9b63ba8131e1e5eb70884db122a1bc Author: Mel Gorman Date: Tue Nov 12 15:08:32 2013 -0800 mm: numa: return the number of base pages altered by protection changes commit 72403b4a0fbdf433c1fe0127e49864658f6f6468 upstream. Commit 0255d4918480 ("mm: Account for a THP NUMA hinting update as one PTE update") was added to account for the number of PTE updates when marking pages prot_numa. task_numa_work was using the old return value to track how much address space had been updated. Altering the return value causes the scanner to do more work than it is configured or documented to in a single unit of work. This patch reverts that commit and accounts for the number of THP updates separately in vmstat. It is up to the administrator to interpret the pair of values correctly. This is a straight-forward operation and likely to only be of interest when actively debugging NUMA balancing problems. The impact of this patch is that the NUMA PTE scanner will scan slower when THP is enabled and workloads may converge slower as a result. On the flip size system CPU usage should be lower than recent tests reported. This is an illustrative example of a short single JVM specjbb test specjbb 3.12.0 3.12.0 vanilla acctupdates TPut 1 26143.00 ( 0.00%) 25747.00 ( -1.51%) TPut 7 185257.00 ( 0.00%) 183202.00 ( -1.11%) TPut 13 329760.00 ( 0.00%) 346577.00 ( 5.10%) TPut 19 442502.00 ( 0.00%) 460146.00 ( 3.99%) TPut 25 540634.00 ( 0.00%) 549053.00 ( 1.56%) TPut 31 512098.00 ( 0.00%) 519611.00 ( 1.47%) TPut 37 461276.00 ( 0.00%) 474973.00 ( 2.97%) TPut 43 403089.00 ( 0.00%) 414172.00 ( 2.75%) 3.12.0 3.12.0 vanillaacctupdates User 5169.64 5184.14 System 100.45 80.02 Elapsed 252.75 251.85 Performance is similar but note the reduction in system CPU time. While this showed a performance gain, it will not be universal but at least it'll be behaving as documented. The vmstats are obviously different but here is an obvious interpretation of them from mmtests. 3.12.0 3.12.0 vanillaacctupdates NUMA page range updates 1408326 11043064 NUMA huge PMD updates 0 21040 NUMA PTE updates 1408326 291624 "NUMA page range updates" == nr_pte_updates and is the value returned to the NUMA pte scanner. NUMA huge PMD updates were the number of THP updates which in combination can be used to calculate how many ptes were updated from userspace. Signed-off-by: Mel Gorman Reported-by: Alex Thorlton Reviewed-by: Rik van Riel Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Mel Gorman Signed-off-by: Greg Kroah-Hartman commit 7281bb5614bcdbf6789e9559420b814689b122e3 Author: Stephen Boyd Date: Thu Jun 13 11:39:50 2013 -0700 clockevents: Prefer CPU local devices over global devices commit 70e5975d3a04be5479a28eec4a2fb10f98ad2785 upstream. On an SMP system with only one global clockevent and a dummy clockevent per CPU we run into problems. We want the dummy clockevents to be registered as the per CPU tick devices, but we can only achieve that if we register the dummy clockevents before the global clockevent or if we artificially inflate the rating of the dummy clockevents to be higher than the rating of the global clockevent. Failure to do so leads to boot hangs when the dummy timers are registered on all other CPUs besides the CPU that accepted the global clockevent as its tick device and there is no broadcast timer to poke the dummy devices. If we're registering multiple clockevents and one clockevent is global and the other is local to a particular CPU we should choose to use the local clockevent regardless of the rating of the device. This way, if the clockevent is a dummy it will take the tick device duty as long as there isn't a higher rated tick device and any global clockevent will be bumped out into broadcast mode, fixing the problem described above. Reported-and-tested-by: Mark Rutland Signed-off-by: Stephen Boyd Tested-by: soren.brinkmann@xilinx.com Cc: John Stultz Cc: Daniel Lezcano Cc: linux-arm-kernel@lists.infradead.org Cc: John Stultz Link: http://lkml.kernel.org/r/20130613183950.GA32061@codeaurora.org Signed-off-by: Thomas Gleixner Cc: Kim Phillips Signed-off-by: Greg Kroah-Hartman commit 9bae8ea0544becdd8e6716b318c1844aeea41a69 Author: Thomas Gleixner Date: Thu Apr 25 20:31:50 2013 +0000 clockevents: Split out selection logic commit 45cb8e01b2ecef1c2afb18333e95793fa1a90281 upstream. Split out the clockevent device selection logic. Preparatory patch to allow unbinding active clockevent devices. Signed-off-by: Thomas Gleixner Cc: John Stultz Cc: Magnus Damm Link: http://lkml.kernel.org/r/20130425143436.431796247@linutronix.de Signed-off-by: Thomas Gleixner Cc: Kim Phillips Signed-off-by: Greg Kroah-Hartman commit 409d4ffaf0c8b29693243918217cec0044979395 Author: Thomas Gleixner Date: Thu Apr 25 20:31:49 2013 +0000 clockevents: Add module refcount commit ccf33d6880f39a35158fff66db13000ae4943fac upstream. We want to be able to remove clockevent modules as well. Add a refcount so we don't remove a module with an active clock event device. Signed-off-by: Thomas Gleixner Cc: John Stultz Cc: Magnus Damm Link: http://lkml.kernel.org/r/20130425143436.307435149@linutronix.de Signed-off-by: Thomas Gleixner Cc: Kim Phillips Signed-off-by: Greg Kroah-Hartman commit e8d630331dfe32c63438a4558eeda6f79c712485 Author: Thomas Gleixner Date: Thu Apr 25 20:31:47 2013 +0000 clockevents: Get rid of the notifier chain commit 7172a286ced0c1f4f239a0fa09db54ed37d3ead2 upstream. 7+ years and still a single user. Kill it. Signed-off-by: Thomas Gleixner Cc: John Stultz Cc: Magnus Damm Link: http://lkml.kernel.org/r/20130425143436.098520211@linutronix.de Signed-off-by: Thomas Gleixner Cc: Kim Phillips Signed-off-by: Greg Kroah-Hartman commit f8715e7d2a096cedd1c5ef4768deae7126659dcf Author: Mateusz Guzik Date: Thu Dec 5 11:09:02 2013 +0100 aio: restore locking of ioctx list on removal Commit 36f5588905c10a8c4568a210d601fe8c3c27e0f0 "aio: refcounting cleanup" resulted in ioctx_lock not being held during ctx removal, leaving the list susceptible to corruptions. In mainline kernel the issue went away as a side effect of db446a08c23d5475e6b08c87acca79ebb20f283c "aio: convert the ioctx list to table lookup v3". Fix the problem by restoring appropriate locking. Signed-off-by: Mateusz Guzik Reported-by: Eryu Guan Cc: Jeff Moyer Cc: Kent Overstreet Acked-by: Benjamin LaHaise Signed-off-by: Greg Kroah-Hartman commit 604ae797e7cd1eca7897eac555961a5a25a9bde5 Author: KOBAYASHI Yoshitake Date: Sun Jul 7 07:35:45 2013 +0900 mmc: block: fix a bug of error handling in MMC driver commit c8760069627ad3b0dbbea170f0c4c58b16e18d3d upstream. Current MMC driver doesn't handle generic error (bit19 of device status) in write sequence. As a result, write data gets lost when generic error occurs. For example, a generic error when updating a filesystem management information causes a loss of write data and corrupts the filesystem. In the worst case, the system will never boot. This patch includes the following functionality: 1. To enable error checking for the response of CMD12 and CMD13 in write command sequence 2. To retry write sequence when a generic error occurs Messages are added for v2 to show what occurs. Signed-off-by: KOBAYASHI Yoshitake Signed-off-by: Chris Ball Signed-off-by: Greg Kroah-Hartman commit bdd0a8e5ace90cad14c1bf1bf49c3bfabb77f831 Author: Dwight Engen Date: Thu Aug 15 14:08:03 2013 -0400 xfs: add capability check to free eofblocks ioctl commit 8c567a7fab6e086a0284eee2db82348521e7120c upstream. Check for CAP_SYS_ADMIN since the caller can truncate preallocated blocks from files they do not own nor have write access to. A more fine grained access check was considered: require the caller to specify their own uid/gid and to use inode_permission to check for write, but this would not catch the case of an inode not reachable via path traversal from the callers mount namespace. Add check for read-only filesystem to free eofblocks ioctl. Reviewed-by: Brian Foster Reviewed-by: Dave Chinner Reviewed-by: Gao feng Signed-off-by: Dwight Engen Signed-off-by: Ben Myers Cc: Kees Cook Signed-off-by: Greg Kroah-Hartman commit e8ef7efffbd51a4f256c5a3fe0c77f067db0c525 Author: Eric Dumazet Date: Fri Oct 25 17:26:17 2013 -0700 tcp: gso: fix truesize tracking [ Upstream commit 0d08c42cf9a71530fef5ebcfe368f38f2dd0476f ] commit 6ff50cd55545 ("tcp: gso: do not generate out of order packets") had an heuristic that can trigger a warning in skb_try_coalesce(), because skb->truesize of the gso segments were exactly set to mss. This breaks the requirement that skb->truesize >= skb->len + truesizeof(struct sk_buff); It can trivially be reproduced by : ifconfig lo mtu 1500 ethtool -K lo tso off netperf As the skbs are looped into the TCP networking stack, skb_try_coalesce() warns us of these skb under-estimating their truesize. Signed-off-by: Eric Dumazet Reported-by: Alexei Starovoitov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit dd5e5276c4e7922e784c156be630972341d51370 Author: fan.du Date: Sun Dec 1 16:28:48 2013 +0800 {pktgen, xfrm} Update IPv4 header total len and checksum after tranformation [ Upstream commit 3868204d6b89ea373a273e760609cb08020beb1a ] commit a553e4a6317b2cfc7659542c10fe43184ffe53da ("[PKTGEN]: IPSEC support") tried to support IPsec ESP transport transformation for pktgen, but acctually this doesn't work at all for two reasons(The orignal transformed packet has bad IPv4 checksum value, as well as wrong auth value, reported by wireshark) - After transpormation, IPv4 header total length needs update, because encrypted payload's length is NOT same as that of plain text. - After transformation, IPv4 checksum needs re-caculate because of payload has been changed. With this patch, armmed pktgen with below cofiguration, Wireshark is able to decrypted ESP packet generated by pktgen without any IPv4 checksum error or auth value error. pgset "flag IPSEC" pgset "flows 1" Signed-off-by: Fan Du Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e761bada6f1eb056e720afd5704a21a79fb8df35 Author: Hannes Frederic Sowa Date: Fri Nov 29 06:39:44 2013 +0100 ipv6: fix possible seqlock deadlock in ip6_finish_output2 [ Upstream commit 7f88c6b23afbd31545c676dea77ba9593a1a14bf ] IPv6 stats are 64 bits and thus are protected with a seqlock. By not disabling bottom-half we could deadlock here if we don't disable bh and a softirq reentrantly updates the same mib. Cc: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7be560d6fa090edd57fa2761282f46ef85a20088 Author: Eric Dumazet Date: Thu Nov 28 09:51:22 2013 -0800 inet: fix possible seqlock deadlocks [ Upstream commit f1d8cba61c3c4b1eb88e507249c4cb8d635d9a76 ] In commit c9e9042994d3 ("ipv4: fix possible seqlock deadlock") I left another places where IP_INC_STATS_BH() were improperly used. udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from process context, not from softirq context. This was detected by lockdep seqlock support. Reported-by: jongman heo Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP") Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Signed-off-by: Eric Dumazet Cc: Hannes Frederic Sowa Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a52a9149f57d0b00dab0699028174a4f91dcb02f Author: Jiri Pirko Date: Thu Nov 28 18:01:38 2013 +0100 team: fix master carrier set when user linkup is enabled [ Upstream commit f5e0d34382e18f396d7673a84df8e3342bea7eb6 ] When user linkup is enabled and user sets linkup of individual port, we need to recompute linkup (carrier) of master interface so the change is reflected. Fix this by calling __team_carrier_check() which does the needed work. Please apply to all stable kernels as well. Thanks. Reported-by: Jan Tluka Signed-off-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 86a243440a102e315438c6faf57881796533528d Author: Shawn Landden Date: Sun Nov 24 22:36:28 2013 -0800 net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST [ Upstream commit d3f7d56a7a4671d395e8af87071068a195257bf6 ] Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once) added an internal flag MSG_SENDPAGE_NOTLAST, similar to MSG_MORE. algif_hash, algif_skcipher, and udp used MSG_MORE from tcp_sendpages() and need to see the new flag as identical to MSG_MORE. This fixes sendfile() on AF_ALG. v3: also fix udp Reported-and-tested-by: Shawn Landden Cc: Tom Herbert Cc: Eric Dumazet Cc: David S. Miller Original-patch: Richard Weinberger Signed-off-by: Shawn Landden Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ac3e7d9bc445b256fc386a30df55dfd1b3fda0be Author: Yang Yingliang Date: Wed Nov 27 14:32:52 2013 +0800 net: 8139cp: fix a BUG_ON triggered by wrong bytes_compl [ Upstream commit 7fe0ee099ad5e3dea88d4ee1b6f20246b1ca57c3 ] Using iperf to send packets(GSO mode is on), a bug is triggered: [ 212.672781] kernel BUG at lib/dynamic_queue_limits.c:26! [ 212.673396] invalid opcode: 0000 [#1] SMP [ 212.673882] Modules linked in: 8139cp(O) nls_utf8 edd fuse loop dm_mod ipv6 i2c_piix4 8139too i2c_core intel_agp joydev pcspkr hid_generic intel_gtt floppy sr_mod mii button sg cdrom ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata scsi_mod [last unloaded: 8139cp] [ 212.676084] CPU: 0 PID: 4124 Comm: iperf Tainted: G O 3.12.0-0.7-default+ #16 [ 212.676084] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 212.676084] task: ffff8800d83966c0 ti: ffff8800db4c8000 task.ti: ffff8800db4c8000 [ 212.676084] RIP: 0010:[] [] dql_completed+0x17f/0x190 [ 212.676084] RSP: 0018:ffff880116e03e30 EFLAGS: 00010083 [ 212.676084] RAX: 00000000000005ea RBX: 0000000000000f7c RCX: 0000000000000002 [ 212.676084] RDX: ffff880111dd0dc0 RSI: 0000000000000bd4 RDI: ffff8800db6ffcc0 [ 212.676084] RBP: ffff880116e03e48 R08: 0000000000000992 R09: 0000000000000000 [ 212.676084] R10: ffffffff8181e400 R11: 0000000000000004 R12: 000000000000000f [ 212.676084] R13: ffff8800d94ec840 R14: ffff8800db440c80 R15: 000000000000000e [ 212.676084] FS: 00007f6685a3c700(0000) GS:ffff880116e00000(0000) knlGS:0000000000000000 [ 212.676084] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 212.676084] CR2: 00007f6685ad6460 CR3: 00000000db714000 CR4: 00000000000006f0 [ 212.676084] Stack: [ 212.676084] ffff8800db6ffc00 000000000000000f ffff8800d94ec840 ffff880116e03eb8 [ 212.676084] ffffffffa041509f ffff880116e03e88 0000000f16e03e88 ffff8800d94ec000 [ 212.676084] 00000bd400059858 000000050000000f ffffffff81094c36 ffff880116e03eb8 [ 212.676084] Call Trace: [ 212.676084] [ 212.676084] [] cp_interrupt+0x4ef/0x590 [8139cp] [ 212.676084] [] ? ktime_get+0x56/0xd0 [ 212.676084] [] handle_irq_event_percpu+0x53/0x170 [ 212.676084] [] handle_irq_event+0x3c/0x60 [ 212.676084] [] handle_fasteoi_irq+0x55/0xf0 [ 212.676084] [] handle_irq+0x1f/0x30 [ 212.676084] [] do_IRQ+0x5b/0xe0 [ 212.676084] [] common_interrupt+0x6a/0x6a [ 212.676084] [ 212.676084] [] ? cp_start_xmit+0x621/0x97c [8139cp] [ 212.676084] [] ? cp_start_xmit+0x609/0x97c [8139cp] [ 212.676084] [] dev_hard_start_xmit+0x2c9/0x550 [ 212.676084] [] sch_direct_xmit+0x179/0x1d0 [ 212.676084] [] dev_queue_xmit+0x293/0x440 [ 212.676084] [] ip_finish_output+0x236/0x450 [ 212.676084] [] ? __alloc_pages_nodemask+0x187/0xb10 [ 212.676084] [] ip_output+0x88/0x90 [ 212.676084] [] ip_local_out+0x24/0x30 [ 212.676084] [] ip_queue_xmit+0x14d/0x3e0 [ 212.676084] [] tcp_transmit_skb+0x501/0x840 [ 212.676084] [] tcp_write_xmit+0x1e3/0xb20 [ 212.676084] [] ? skb_page_frag_refill+0x87/0xd0 [ 212.676084] [] tcp_push_one+0x2b/0x40 [ 212.676084] [] tcp_sendmsg+0x926/0xc90 [ 212.676084] [] inet_sendmsg+0x61/0xc0 [ 212.676084] [] sock_aio_write+0x101/0x120 [ 212.676084] [] ? vma_adjust+0x2e1/0x5d0 [ 212.676084] [] ? timerqueue_add+0x60/0xb0 [ 212.676084] [] do_sync_write+0x60/0x90 [ 212.676084] [] ? rw_verify_area+0x54/0xf0 [ 212.676084] [] vfs_write+0x186/0x190 [ 212.676084] [] SyS_write+0x5d/0xa0 [ 212.676084] [] system_call_fastpath+0x16/0x1b [ 212.676084] Code: ca 41 89 dc 41 29 cc 45 31 db 29 c2 41 89 c5 89 d0 45 29 c5 f7 d0 c1 e8 1f e9 43 ff ff ff 66 0f 1f 44 00 00 31 c0 e9 7b ff ff ff <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 c7 47 40 00 [ 212.676084] RIP [] dql_completed+0x17f/0x190 ------------[ cut here ]------------ When a skb has frags, bytes_compl plus skb->len nr_frags times in cp_tx(). It's not the correct value(actually, it should plus skb->len once) and it will trigger the BUG_ON(bytes_compl > num_queued - dql->num_completed). So only increase bytes_compl when finish sending all frags. pkts_compl also has a wrong value, fix it too. It's introduced by commit 871f0d4c ("8139cp: enable bql"). Suggested-by: Eric Dumazet Signed-off-by: Yang Yingliang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1562432658a5bf8aa32d25a99c30cfdda5b43818 Author: David Chang Date: Wed Nov 27 15:48:36 2013 +0800 r8169: check ALDPS bit and disable it if enabled for the 8168g [ Upstream commit 1bac1072425c86f1ac85bd5967910706677ef8b3 ] Windows driver will enable ALDPS function, but linux driver and firmware do not have any configuration related to ALDPS function for 8168g. So restart system to linux and remove the NIC cable, LAN enter ALDPS, then LAN RX will be disabled. This issue can be easily reproduced on dual boot windows and linux system with RTL_GIGA_MAC_VER_40 chip. Realtek said, ALDPS function can be disabled by configuring to PHY, switch to page 0x0A43, reg0x10 bit2=0. Signed-off-by: David Chang Acked-by: Hayes Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5b9e9be7756854ddd525fb9882f1feddd9435831 Author: Veaceslav Falico Date: Fri Nov 29 09:53:23 2013 +0100 af_packet: block BH in prb_shutdown_retire_blk_timer() [ Upstream commit ec6f809ff6f19fafba3212f6aff0dda71dfac8e8 ] Currently we're using plain spin_lock() in prb_shutdown_retire_blk_timer(), however the timer might fire right in the middle and thus try to re-aquire the same spinlock, leaving us in a endless loop. To fix that, use the spin_lock_bh() to block it. Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.") CC: "David S. Miller" CC: Daniel Borkmann CC: Willem de Bruijn CC: Phil Sutter CC: Eric Dumazet Reported-by: Jan Stancek Tested-by: Jan Stancek Signed-off-by: Veaceslav Falico Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 026bb4057a0473132bdfa910bd2089de1a7af3f6 Author: Daniel Borkmann Date: Thu Nov 21 16:50:58 2013 +0100 packet: fix use after free race in send path when dev is released [ Upstream commit e40526cb20b5ee53419452e1f03d97092f144418 ] Salam reported a use after free bug in PF_PACKET that occurs when we're sending out frames on a socket bound device and suddenly the net device is being unregistered. It appears that commit 827d9780 introduced a possible race condition between {t,}packet_snd() and packet_notifier(). In the case of a bound socket, packet_notifier() can drop the last reference to the net_device and {t,}packet_snd() might end up suddenly sending a packet over a freed net_device. To avoid reverting 827d9780 and thus introducing a performance regression compared to the current state of things, we decided to hold a cached RCU protected pointer to the net device and maintain it on write side via bind spin_lock protected register_prot_hook() and __unregister_prot_hook() calls. In {t,}packet_snd() path, we access this pointer under rcu_read_lock through packet_cached_dev_get() that holds reference to the device to prevent it from being freed through packet_notifier() while we're in send path. This is okay to do as dev_put()/dev_hold() are per-cpu counters, so this should not be a performance issue. Also, the code simplifies a bit as we don't need need_rls_dev anymore. Fixes: 827d978037d7 ("af-packet: Use existing netdev reference for bound sockets.") Reported-by: Salam Noureddine Signed-off-by: Daniel Borkmann Signed-off-by: Salam Noureddine Cc: Ben Greear Cc: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1acb97aeff84a72999eeb9cbea523a0079c5ce6d Author: Ding Tianhong Date: Sat Dec 7 22:12:05 2013 +0800 bridge: flush br's address entry in fdb when remove the bridge dev [ Upstream commit f873042093c0b418d2351fe142222b625c740149 ] When the following commands are executed: brctl addbr br0 ifconfig br0 hw ether rmmod bridge The calltrace will occur: [ 563.312114] device eth1 left promiscuous mode [ 563.312188] br0: port 1(eth1) entered disabled state [ 563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects [ 563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G O 3.12.0-0.7-default+ #9 [ 563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 563.468200] 0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8 [ 563.468204] ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8 [ 563.468206] ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78 [ 563.468209] Call Trace: [ 563.468218] [] dump_stack+0x6a/0x78 [ 563.468234] [] kmem_cache_destroy+0xfd/0x100 [ 563.468242] [] br_fdb_fini+0x10/0x20 [bridge] [ 563.468247] [] br_deinit+0x4e/0x50 [bridge] [ 563.468254] [] SyS_delete_module+0x199/0x2b0 [ 563.468259] [] system_call_fastpath+0x16/0x1b [ 570.377958] Bridge firewalling registered --------------------------- cut here ------------------------------- The reason is that when the bridge dev's address is changed, the br_fdb_change_mac_address() will add new address in fdb, but when the bridge was removed, the address entry in the fdb did not free, the bridge_fdb_cache still has objects when destroy the cache, Fix this by flushing the bridge address entry when removing the bridge. v2: according to the Toshiaki Makita and Vlad's suggestion, I only delete the vlan0 entry, it still have a leak here if the vlan id is other number, so I need to call fdb_delete_by_port(br, NULL, 1) to flush all entries whose dst is NULL for the bridge. Suggested-by: Toshiaki Makita Suggested-by: Vlad Yasevich Signed-off-by: Ding Tianhong Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 05cf2143b56099f6452b13b60593e2cc8c9aff51 Author: Vlad Yasevich Date: Tue Nov 19 20:47:15 2013 -0500 net: core: Always propagate flag changes to interfaces [ Upstream commit d2615bf450694c1302d86b9cc8a8958edfe4c3a4 ] The following commit: b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 net: only invoke dev->change_rx_flags when device is UP tried to fix a problem with VLAN devices and promiscuouse flag setting. The issue was that VLAN device was setting a flag on an interface that was down, thus resulting in bad promiscuity count. This commit blocked flag propagation to any device that is currently down. A later commit: deede2fabe24e00bd7e246eb81cd5767dc6fcfc7 vlan: Don't propagate flag changes on down interfaces fixed VLAN code to only propagate flags when the VLAN interface is up, thus fixing the same issue as above, only localized to VLAN. The problem we have now is that if we have create a complex stack involving multiple software devices like bridges, bonds, and vlans, then it is possible that the flags would not propagate properly to the physical devices. A simple examle of the scenario is the following: eth0----> bond0 ----> bridge0 ---> vlan50 If bond0 or eth0 happen to be down at the time bond0 is added to the bridge, then eth0 will never have promisc mode set which is currently required for operation as part of the bridge. As a result, packets with vlan50 will be dropped by the interface. The only 2 devices that implement the special flag handling are VLAN and DSA and they both have required code to prevent incorrect flag propagation. As a result we can remove the generic solution introduced in b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 and leave it to the individual devices to decide whether they will block flag propagation or not. Reported-by: Stefan Priebe Suggested-by: Veaceslav Falico Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 62713c4b6bc10c2d082ee1540e11b01a2b2162ab Author: Alexei Starovoitov Date: Tue Nov 19 19:12:34 2013 -0800 ipv4: fix race in concurrent ip_route_input_slow() [ Upstream commit dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4 ] CPUs can ask for local route via ip_route_input_noref() concurrently. if nh_rth_input is not cached yet, CPUs will proceed to allocate equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input via rt_cache_route() Most of the time they succeed, but on occasion the following two lines: orig = *p; prev = cmpxchg(p, orig, rt); in rt_cache_route() do race and one of the cpus fails to complete cmpxchg. But ip_route_input_slow() doesn't check the return code of rt_cache_route(), so dst is leaking. dst_destroy() is never called and 'lo' device refcnt doesn't go to zero, which can be seen in the logs as: unregister_netdevice: waiting for lo to become free. Usage count = 1 Adding mdelay() between above two lines makes it easily reproducible. Fix it similar to nh_pcpu_rth_output case. Fixes: d2d68ba9fe8b ("ipv4: Cache input routes in fib_info nexthops.") Signed-off-by: Alexei Starovoitov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8392415810ff70ffb61ed1d4dbe5672ce34de9a2 Author: Andrey Vagin Date: Tue Nov 19 22:10:06 2013 +0400 tcp: don't update snd_nxt, when a socket is switched from repair mode [ Upstream commit dbde497966804e63a38fdedc1e3815e77097efc2 ] snd_nxt must be updated synchronously with sk_send_head. Otherwise tp->packets_out may be updated incorrectly, what may bring a kernel panic. Here is a kernel panic from my host. [ 103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048 [ 103.044025] IP: [] tcp_rearm_rto+0xcf/0x150 ... [ 146.301158] Call Trace: [ 146.301158] [] tcp_ack+0xcc0/0x12c0 Before this panic a tcp socket was restored. This socket had sent and unsent data in the write queue. Sent data was restored in repair mode, then the socket was switched from reapair mode and unsent data was restored. After that the socket was switched back into repair mode. In that moment we had a socket where write queue looks like this: snd_una snd_nxt write_seq |_________|________| | sk_send_head After a second switching from repair mode the state of socket was changed: snd_una snd_nxt, write_seq |_________ ________| | sk_send_head This state is inconsistent, because snd_nxt and sk_send_head are not synchronized. Bellow you can find a call trace, how packets_out can be incremented twice for one skb, if snd_nxt and sk_send_head are not synchronized. In this case packets_out will be always positive, even when sk_write_queue is empty. tcp_write_wakeup skb = tcp_send_head(sk); tcp_fragment if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq)) tcp_adjust_pcount(sk, skb, diff); tcp_event_new_data_sent tp->packets_out += tcp_skb_pcount(skb); I think update of snd_nxt isn't required, when a socket is switched from repair mode. Because it's initialized in tcp_connect_init. Then when a write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent, so it's always is in consistent state. I have checked, that the bug is not reproduced with this patch and all tests about restoring tcp connections work fine. Signed-off-by: Andrey Vagin Cc: Pavel Emelyanov Cc: Eric Dumazet Cc: "David S. Miller" Cc: Alexey Kuznetsov Cc: James Morris Cc: Hideaki YOSHIFUJI Cc: Patrick McHardy Acked-by: Pavel Emelyanov Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e1667fa212d940b130c4f5d839bdd0091d2b0128 Author: Ying Xue Date: Tue Nov 19 18:09:27 2013 +0800 atm: idt77252: fix dev refcnt leak [ Upstream commit b5de4a22f157ca345cdb3575207bf46402414bc1 ] init_card() calls dev_get_by_name() to get a network deceive. But it doesn't decrease network device reference count after the device is used. Signed-off-by: Ying Xue Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit bdf1643e0536042df717443d6bfd73e07643c237 Author: fan.du Date: Tue Nov 19 16:53:28 2013 +0800 xfrm: Release dst if this dst is improper for vti tunnel [ Upstream commit 236c9f84868534c718b6889aa624de64763281f9 ] After searching rt by the vti tunnel dst/src parameter, if this rt has neither attached to any transformation nor the transformation is not tunnel oriented, this rt should be released back to ip layer. otherwise causing dst memory leakage. Signed-off-by: Fan Du Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit fddd8b501c59c87d63a0917c8e9e14bd28e3c724 Author: Jiri Pirko Date: Wed Nov 6 17:52:20 2013 +0100 netfilter: push reasm skb through instead of original frag skbs [ Upstream commit 6aafeef03b9d9ecf255f3a80ed85ee070260e1ae ] Pushing original fragments through causes several problems. For example for matching, frags may not be matched correctly. Take following example: On HOSTA do: ip6tables -I INPUT -p icmpv6 -j DROP ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT and on HOSTB you do: ping6 HOSTA -s2000 (MTU is 1500) Incoming echo requests will be filtered out on HOSTA. This issue does not occur with smaller packets than MTU (where fragmentation does not happen) As was discussed previously, the only correct solution seems to be to use reassembled skb instead of separete frags. Doing this has positive side effects in reducing sk_buff by one pointer (nfct_reasm) and also the reams dances in ipvs and conntrack can be removed. Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c entirely and use code in net/ipv6/reassembly.c instead. Signed-off-by: Jiri Pirko Acked-by: Julian Anastasov Signed-off-by: Marcelo Ricardo Leitner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2d9a30962d0310d33372c245a6191499a06409f3 Author: Jiri Pirko Date: Wed Nov 6 17:52:19 2013 +0100 ip6_output: fragment outgoing reassembled skb properly [ Upstream commit 9037c3579a277f3a23ba476664629fda8c35f7c4 ] If reassembled packet would fit into outdev MTU, it is not fragmented according the original frag size and it is send as single big packet. The second case is if skb is gso. In that case fragmentation does not happen according to the original frag size. This patch fixes these. Signed-off-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2d71627c5af3a50d3b22fe6900b6dd29e0445fd7 Author: Hannes Frederic Sowa Date: Sat Nov 23 07:22:33 2013 +0100 ipv6: fix leaking uninitialized port number of offender sockaddr [ Upstream commit 1fa4c710b6fe7b0aac9907240291b6fe6aafc3b8 ] Offenders don't have port numbers, so set it to 0. Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ed47bba241c68cd39964dac3248e0a2962c8b7ff Author: Dan Carpenter Date: Wed Nov 27 15:40:21 2013 +0300 net: clamp ->msg_namelen instead of returning an error [ Upstream commit db31c55a6fb245fdbb752a2ca4aefec89afabb06 ] If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the original code that would lead to memory corruption in the kernel if you had audit configured. If you didn't have audit configured it was harmless. There are some programs such as beta versions of Ruby which use too large of a buffer and returning an error code breaks them. We should clamp the ->msg_namelen value instead. Fixes: 1661bf364ae9 ("net: heap overflow in __audit_sockaddr()") Reported-by: Eric Wong Signed-off-by: Dan Carpenter Tested-by: Eric Wong Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 08c62a109ed5f716556b2211f8cfd0d5fe6d18d2 Author: Hannes Frederic Sowa Date: Sat Nov 23 00:46:12 2013 +0100 inet: fix addr_len/msg->msg_namelen assignment in recv_error and rxpmtu functions [ Upstream commit 85fbaa75037d0b6b786ff18658ddf0b4014ce2a4 ] Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") conditionally updated addr_len if the msg_name is written to. The recv_error and rxpmtu functions relied on the recvmsg functions to set up addr_len before. As this does not happen any more we have to pass addr_len to those functions as well and set it to the size of the corresponding sockaddr length. This broke traceroute and such. Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls") Reported-by: Brad Spengler Reported-by: Tom Labanowski Cc: mpb Cc: David S. Miller Cc: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5c586f163dc452d8cc19b456f6f2f3e704025462 Author: Hannes Frederic Sowa Date: Thu Nov 21 03:14:34 2013 +0100 net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage) [ Upstream commit 68c6beb373955da0886d8f4f5995b3922ceda4be ] In that case it is probable that kernel code overwrote part of the stack. So we should bail out loudly here. The BUG_ON may be removed in future if we are sure all protocols are conformant. Suggested-by: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2f73d7fde99d702cba6a05062c27605a6eef1b78 Author: Hannes Frederic Sowa Date: Thu Nov 21 03:14:22 2013 +0100 net: rework recvmsg handler msg_name and msg_namelen logic [ Upstream commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c ] This patch now always passes msg->msg_namelen as 0. recvmsg handlers must set msg_namelen to the proper size <= sizeof(struct sockaddr_storage) to return msg_name to the user. This prevents numerous uninitialized memory leaks we had in the recvmsg handlers and makes it harder for new code to accidentally leak uninitialized memory. Optimize for the case recvfrom is called with NULL as address. We don't need to copy the address at all, so set it to NULL before invoking the recvmsg handler. We can do so, because all the recvmsg handlers must cope with the case a plain read() is called on them. read() also sets msg_name to NULL. Also document these changes in include/linux/net.h as suggested by David Miller. Changes since RFC: Set msg->msg_name = NULL if user specified a NULL in msg_name but had a non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't affect sendto as it would bail out earlier while trying to copy-in the address. It also more naturally reflects the logic by the callers of verify_iovec. With this change in place I could remove " if (!uaddr || msg_sys->msg_namelen == 0) msg->msg_name = NULL ". This change does not alter the user visible error logic as we ignore msg_namelen as long as msg_name is NULL. Also remove two unnecessary curly brackets in ___sys_recvmsg and change comments to netdev style. Cc: David Miller Suggested-by: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a2214488937a84d8d0b5a3b546f97b2485029a17 Author: Hannes Frederic Sowa Date: Mon Nov 18 04:20:45 2013 +0100 inet: prevent leakage of uninitialized memory to user in recv syscalls [ Upstream commit bceaa90240b6019ed73b49965eac7d167610be69 ] Only update *addr_len when we actually fill in sockaddr, otherwise we can return uninitialized memory from the stack to the caller in the recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL) checks because we only get called with a valid addr_len pointer either from sock_common_recvmsg or inet_recvmsg. If a blocking read waits on a socket which is concurrently shut down we now return zero and set msg_msgnamelen to 0. Reported-by: mpb Suggested-by: Eric Dumazet Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ac6a5b926d2d8df1596f7cb91f46afc18e5e39f9 Author: Eric Dumazet Date: Thu Nov 14 13:37:54 2013 -0800 ipv4: fix possible seqlock deadlock [ Upstream commit c9e9042994d37cbc1ee538c500e9da1bb9d1bcdf ] ip4_datagram_connect() being called from process context, it should use IP_INC_STATS() instead of IP_INC_STATS_BH() otherwise we can deadlock on 32bit arches, or get corruptions of SNMP counters. Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP") Signed-off-by: Eric Dumazet Reported-by: Dave Jones Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f0173a11a965e9e6cc248ccf9efa88cfd22355f9 Author: Chris Metcalf Date: Thu Nov 14 12:09:21 2013 -0500 connector: improved unaligned access error fix [ Upstream commit 1ca1a4cf59ea343a1a70084fe7cc96f37f3cf5b1 ] In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access bug for ia64 by converting a 64-bit write to use put_unaligned(). Unfortunately, since gcc will convert a short memset() to a series of appropriately-aligned stores, the problem is now visible again on tilegx, where the memset that zeros out proc_event is converted to three 64-bit stores, causing an unaligned access panic. A better fix for the original problem is to ensure that proc_event is aligned to 8 bytes here. We can do that relatively easily by arranging to start the struct cn_msg aligned to 8 bytes and then offset by 4 bytes. Doing so means that the immediately following proc_event structure is then correctly aligned to 8 bytes. The result is that the memset() stores are now aligned, and as an added benefit, we can remove the put_unaligned() calls in the code. Signed-off-by: Chris Metcalf Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0056eb8b08101a3f9c21d3069ebc300fe5008813 Author: Dan Carpenter Date: Thu Nov 14 11:21:10 2013 +0300 isdnloop: use strlcpy() instead of strcpy() [ Upstream commit f9a23c84486ed350cce7bb1b2828abd1f6658796 ] These strings come from a copy_from_user() and there is no way to be sure they are NUL terminated. Signed-off-by: Dan Carpenter Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5e8c945dc7573ba2f28bddf51ca06b152b4bf749 Author: Eric Dumazet Date: Wed Nov 13 15:00:46 2013 -0800 net-tcp: fix panic in tcp_fastopen_cache_set() [ Upstream commit dccf76ca6b626c0c4a4e09bb221adee3270ab0ef ] We had some reports of crashes using TCP fastopen, and Dave Jones gave a nice stack trace pointing to the error. Issue is that tcp_get_metrics() should not be called with a NULL dst Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache") Signed-off-by: Eric Dumazet Reported-by: Dave Jones Cc: Yuchung Cheng Acked-by: Yuchung Cheng Tested-by: Dave Jones Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9571243bac246e21eab0301fe48b7456d8804821 Author: Nikolay Aleksandrov Date: Wed Nov 13 17:07:46 2013 +0100 bonding: fix two race conditions in bond_store_updelay/downdelay [ Upstream commit b869ccfab1e324507fa3596e3e1308444fb68227 ] This patch fixes two race conditions between bond_store_updelay/downdelay and bond_store_miimon which could lead to division by zero as miimon can be set to 0 while either updelay/downdelay are being set and thus miss the zero check in the beginning, the zero div happens because updelay/downdelay are stored as new_value / bond->params.miimon. Use rtnl to synchronize with miimon setting. Signed-off-by: Nikolay Aleksandrov CC: Jay Vosburgh CC: Andy Gospodarek CC: Veaceslav Falico Acked-by: Veaceslav Falico Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6ef30bdab8dab92939edbc964237b844efbe5946 Author: Eric Dumazet Date: Wed Nov 13 06:32:54 2013 -0800 tcp: tsq: restore minimal amount of queueing [ Upstream commit 98e09386c0ef4dfd48af7ba60ff908f0d525cdee ] After commit c9eeec26e32e ("tcp: TSQ can use a dynamic limit"), several users reported throughput regressions, notably on mvneta and wifi adapters. 802.11 AMPDU requires a fair amount of queueing to be effective. This patch partially reverts the change done in tcp_write_xmit() so that the minimal amount is sysctl_tcp_limit_output_bytes. It also remove the use of this sysctl while building skb stored in write queue, as TSO autosizing does the right thing anyway. Users with well behaving NICS and correct qdisc (like sch_fq), can then lower the default sysctl_tcp_limit_output_bytes value from 128KB to 8KB. This new usage of sysctl_tcp_limit_output_bytes permits each driver authors to check how their driver performs when/if the value is set to a minimum of 4KB. Normally, line rate for a single TCP flow should be possible, but some drivers rely on timers to perform TX completion and too long TX completion delays prevent reaching full throughput. Fixes: c9eeec26e32e ("tcp: TSQ can use a dynamic limit") Signed-off-by: Eric Dumazet Reported-by: Sujith Manoharan Reported-by: Arnaud Ebalard Tested-by: Sujith Manoharan Cc: Felix Fietkau Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a6c8afd6ef8037c550d646ddb195ad5f0b5bebcf Author: Jason Wang Date: Wed Nov 13 14:00:40 2013 +0800 macvtap: limit head length of skb allocated [ Upstream commit 16a3fa28630331e28208872fa5341ce210b901c7 ] We currently use hdr_len as a hint of head length which is advertised by guest. But when guest advertise a very big value, it can lead to an 64K+ allocating of kmalloc() which has a very high possibility of failure when host memory is fragmented or under heavy stress. The huge hdr_len also reduce the effect of zerocopy or even disable if a gso skb is linearized in guest. To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the head, which guarantees an order 0 allocation each time. Signed-off-by: Jason Wang Cc: Stefan Hajnoczi Cc: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 4ccc92f8e5ae05dddd4cce8bee59cc37152b41cc Author: Jason Wang Date: Wed Nov 13 14:00:39 2013 +0800 tuntap: limit head length of skb allocated [ Upstream commit 96f8d9ecf227638c89f98ccdcdd50b569891976c ] We currently use hdr_len as a hint of head length which is advertised by guest. But when guest advertise a very big value, it can lead to an 64K+ allocating of kmalloc() which has a very high possibility of failure when host memory is fragmented or under heavy stress. The huge hdr_len also reduce the effect of zerocopy or even disable if a gso skb is linearized in guest. To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the head, which guarantees an order 0 allocation each time. Signed-off-by: Jason Wang Cc: Stefan Hajnoczi Cc: Michael S. Tsirkin Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2d02839a2b8439eab70e53401029c7a6c0629ffd Author: Jukka Rissanen Date: Wed Nov 13 11:03:39 2013 +0200 6lowpan: Uncompression of traffic class field was incorrect [ Upstream commit 1188f05497e7bd2f2614b99c54adfbe7413d5749 ] If priority/traffic class field in IPv6 header is set (seen when using ssh), the uncompression sets the TC and Flow fields incorrectly. Example: This is IPv6 header of a sent packet. Note the priority/TC (=1) in the first byte. 00000000: 61 00 00 00 00 2c 06 40 fe 80 00 00 00 00 00 00 00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00 00000020: 02 1e ab ff fe 4c 52 57 This gets compressed like this in the sending side 00000000: 72 31 04 06 02 1e ab ff fe 4c 52 57 ec c2 00 16 00000010: aa 2d fe 92 86 4e be c6 .... In the receiving end, the packet gets uncompressed to this IPv6 header 00000000: 60 06 06 02 00 2a 1e 40 fe 80 00 00 00 00 00 00 00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00 00000020: ab ff fe 4c 52 57 ec c2 First four bytes are set incorrectly and we have also lost two bytes from destination address. The fix is to switch the case values in switch statement when checking the TC field. Signed-off-by: Jukka Rissanen Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5a9b1ba637310f5bf1f684303dd5abdb65259357 Author: Felix Fietkau Date: Tue Nov 12 16:34:41 2013 +0100 usbnet: fix status interrupt urb handling [ Upstream commit 52f48d0d9aaa621ffa5e08d79da99a3f8c93b848 ] Since commit 7b0c5f21f348a66de495868b8df0284e8dfd6bbf "sierra_net: keep status interrupt URB active", sierra_net triggers status interrupt polling before the net_device is opened (in order to properly receive the sync message response). To be able to receive further interrupts, the interrupt urb needs to be re-submitted, so this patch removes the bogus check for netif_running(). Signed-off-by: Felix Fietkau Tested-by: Dan Williams Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e9a2fa2a6192f03f095772aa02cff12c60475b43 Author: Veaceslav Falico Date: Tue Nov 12 15:37:40 2013 +0100 bonding: don't permit to use ARP monitoring in 802.3ad mode [ Upstream commit ec9f1d15db8185f63a2c3143dc1e90ba18541b08 ] Currently the ARP monitoring is not supported with 802.3ad, and it's prohibited to use it via the module params. However we still can set it afterwards via sysfs, cause we only check for *LB modes there. To fix this - add a check for 802.3ad mode in bonding_store_arp_interval. Signed-off-by: Veaceslav Falico CC: Jay Vosburgh CC: Andy Gospodarek Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e29507fecb79c00ae76f9ef626d282af8b1f4225 Author: Daniel Borkmann Date: Mon Nov 11 12:20:32 2013 +0100 random32: fix off-by-one in seeding requirement [ Upstream commit 51c37a70aaa3f95773af560e6db3073520513912 ] For properly initialising the Tausworthe generator [1], we have a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15. Commit 697f8d0348 ("random32: seeding improvement") introduced a __seed() function that imposes boundary checks proposed by the errata paper [2] to properly ensure above conditions. However, we're off by one, as the function is implemented as: "return (x < m) ? x + m : x;", and called with __seed(X, 1), __seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15 would be possible, whereas the lower boundary should actually be of at least 2, 8, 16, just as GSL does. Fix this, as otherwise an initialization with an unwanted seed could have the effect that Tausworthe's PRNG properties cannot not be ensured. Note that this PRNG is *not* used for cryptography in the kernel. [1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps [2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps Joint work with Hannes Frederic Sowa. Fixes: 697f8d0348a6 ("random32: seeding improvement") Cc: Stephen Hemminger Cc: Florian Weimer Cc: Theodore Ts'o Signed-off-by: Daniel Borkmann Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e688cd4d32cfbf0b5198ba95da428038e907e633 Author: Hannes Frederic Sowa Date: Fri Nov 8 19:26:21 2013 +0100 ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh [ Upstream commit f8c31c8f80dd882f7eb49276989a4078d33d67a7 ] Fixes a suspicious rcu derference warning. Cc: Florent Fourcot Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ea136c0d81a94c812dbcf6edc21b6fb3c210b4ca Author: Duan Jiong Date: Fri Nov 8 09:56:53 2013 +0800 ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv [ Upstream commit f104a567e673f382b09542a8dc3500aa689957b4 ] As the rfc 4191 said, the Router Preference and Lifetime values in a ::/0 Route Information Option should override the preference and lifetime values in the Router Advertisement header. But when the kernel deals with a ::/0 Route Information Option, the rt6_get_route_info() always return NULL, that means that overriding will not happen, because those default routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router(). In order to deal with that condition, we should call rt6_get_dflt_router when the prefix length is 0. Signed-off-by: Duan Jiong Acked-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d9160deb040a6b6dbd467cd2a20d731155b742fd Author: Andreas Henriksson Date: Thu Nov 7 18:26:38 2013 +0100 net: Fix "ip rule delete table 256" [ Upstream commit 13eb2ab2d33c57ebddc57437a7d341995fc9138c ] When trying to delete a table >= 256 using iproute2 the local table will be deleted. The table id is specified as a netlink attribute when it needs more then 8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0). Preconditions to matching the table id in the rule delete code doesn't seem to take the "table id in netlink attribute" into condition so the frh_get_table helper function never gets to do its job when matching against current rule. Use the helper function twice instead of peaking at the table value directly. Originally reported at: http://bugs.debian.org/724783 Reported-by: Nicolas HICHER Signed-off-by: Andreas Henriksson Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 99e9489477a4b5c90d64f82a40dbc9680c630b1c Author: Amir Vadai Date: Thu Nov 7 11:08:30 2013 +0200 net/mlx4_en: Fixed crash when port type is changed [ Upstream commit 1ec4864b10171b0691ee196d7006ae56d2c153f2 ] timecounter_init() was was called only after first potential timecounter_read(). Moved mlx4_en_init_timestamp() before mlx4_en_init_netdev() Signed-off-by: Amir Vadai Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 024abeeced731bf7cba90edad7e352bbc4ee1dd1 Author: Hannes Frederic Sowa Date: Tue Nov 5 02:41:27 2013 +0100 ipv6: fix headroom calculation in udp6_ufo_fragment [ Upstream commit 0e033e04c2678dbbe74a46b23fffb7bb918c288e ] Commit 1e2bd517c108816220f262d7954b697af03b5f9c ("udp6: Fix udp fragmentation for tunnel traffic.") changed the calculation if there is enough space to include a fragment header in the skb from a skb->mac_header dervived one to skb_headroom. Because we already peeled off the skb to transport_header this is wrong. Change this back to check if we have enough room before the mac_header. This fixes a panic Saran Neti reported. He used the tbf scheduler which skb_gso_segments the skb. The offsets get negative and we panic in memcpy because the skb was erroneously not expanded at the head. Reported-by: Saran Neti Cc: Pravin B Shelar Signed-off-by: Hannes Frederic Sowa Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman