aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-03-28Bump version to 5.0.2llvmorg-5.0.2-rc2llvmorg-5.0.2release/5.xTom Stellard1-1/+1
llvm-svn: 328729
2018-03-16Backporting r325647 and r325713:llvmorg-5.0.2-rc1Simon Dardis4-5/+17
------------------------------------------------------------------------ r325713 | sdardis | 2018-02-21 20:01:43 +0000 (Wed, 21 Feb 2018) | 5 lines [mips][lld] Address post commit review nit. Address @ruiu's post commit review comment about a value which is intended to be a unsigned 32 bit integer as using uint32_t rather than unsigned. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r325647 | sdardis | 2018-02-20 23:49:17 +0000 (Tue, 20 Feb 2018) | 27 lines [mips][lld] Spectre variant two mitigation for MIPSR2 This patch provides migitation for CVE-2017-5715, Spectre variant two, which affects the P5600 and P6600. It implements the LLD part of -z hazardplt. Like the Clang part of this patch, I have opted for that specific option name in case alternative migitation methods are required in the future. The mitigation strategy suggested by MIPS for these processors is to use hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard barrier variants of the 'jalr' and 'jr' instructions respectively. These instructions impede the execution of instruction stream until architecturally defined hazards (changes to the instruction stream, privileged registers which may affect execution) are cleared. These instructions in MIPS' designs are not speculated past. These instructions are defined by the MIPS32R2 ISA, so this mitigation method is not compatible with processors which implement an earlier revision of the MIPS ISA. For LLD, this changes PLT stubs to use 'jalr.hb' and 'jr.hb'. Reviewers: atanasyan, ruiu Differential Revision: https://reviews.llvm.org/D43488 ------------------------------------------------------------------------ llvm-svn: 327757
2018-03-16Backporting 325651::Simon Dardis6-1/+58
------------------------------------------------------------------------ r325651 | sdardis | 2018-02-21 00:05:05 +0000 (Wed, 21 Feb 2018) | 34 lines [mips] Spectre variant two mitigation for MIPSR2 This patch provides mitigation for CVE-2017-5715, Spectre variant two, which affects the P5600 and P6600. It provides the option -mindirect-jump=hazard, which instructs the LLVM backend to replace indirect branches with their hazard barrier variants. This option is accepted when targeting MIPS revision two or later. The migitation strategy suggested by MIPS for these processors is to use two hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard barrier variants of the 'jalr' and 'jr' instructions respectively. These instructions impede the execution of instruction stream until architecturally defined hazards (changes to the instruction stream, privileged registers which may affect execution) are cleared. These instructions in MIPS' designs are not speculated past. These instructions are used with the option -mindirect-jump=hazard when branching indirectly and for indirect function calls. These instructions are defined by the MIPS32R2 ISA, so this mitigation method is not compatible with processors which implement an earlier revision of the MIPS ISA. Implementation note: I've opted to provide this as an -mindirect-jump={hazard,...} style option in case alternative mitigation methods are required for other implementations of the MIPS ISA in future, e.g. retpoline style solutions. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D43487 ------------------------------------------------------------------------ llvm-svn: 327755
2018-03-16Backporting r325653:Simon Dardis25-40/+1474
------------------------------------------------------------------------ r325653 | sdardis | 2018-02-21 00:06:53 +0000 (Wed, 21 Feb 2018) | 31 lines [mips] Spectre variant two mitigation for MIPSR2 This patch provides mitigation for CVE-2017-5715, Spectre variant two, which affects the P5600 and P6600. It implements the LLVM part of -mindirect-jump=hazard. It is _not_ enabled by default for the P5600. The migitation strategy suggested by MIPS for these processors is to use hazard barrier instructions. 'jalr.hb' and 'jr.hb' are hazard barrier variants of the 'jalr' and 'jr' instructions respectively. These instructions impede the execution of instruction stream until architecturally defined hazards (changes to the instruction stream, privileged registers which may affect execution) are cleared. These instructions in MIPS' designs are not speculated past. These instructions are used with the attribute +use-indirect-jump-hazard when branching indirectly and for indirect function calls. These instructions are defined by the MIPS32R2 ISA, so this mitigation method is not compatible with processors which implement an earlier revision of the MIPS ISA. Performance benchmarking of this option with -fpic and lld using -z hazardplt shows a difference of overall 10%~ time increase for the LLVM testsuite. Certain benchmarks such as methcall show a substantially larger increase in time due to their nature. Reviewers: atanasyan, zoran.jovanovic Differential Revision: https://reviews.llvm.org/D43486 ------------------------------------------------------------------------ llvm-svn: 327751
2018-02-14Merging r325085:Reid Kleckner1-26/+0
------------------------------------------------------------------------ r325085 | rnk | 2018-02-13 16:24:29 -0800 (Tue, 13 Feb 2018) | 3 lines [X86] Remove dead code from retpoline thunk generation Follow-up to r325049 ------------------------------------------------------------------------ llvm-svn: 325091
2018-02-14Merging r325049:Reid Kleckner4-72/+76
------------------------------------------------------------------------ r325049 | rnk | 2018-02-13 12:47:49 -0800 (Tue, 13 Feb 2018) | 17 lines [X86] Use EDI for retpoline when no scratch regs are left Summary: Instead of solving the hard problem of how to pass the callee to the indirect jump thunk without a register, just use a CSR. At a call boundary, there's nothing stopping us from using a CSR to hold the callee as long as we save and restore it in the prologue. Also, add tests for this mregparm=3 case. I wrote execution tests for __llvm_retpoline_push, but they never got committed as lit tests, either because I never rewrote them or because they got lost in merge conflicts. Reviewers: chandlerc, dwmw2 Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D43214 ------------------------------------------------------------------------ llvm-svn: 325090
2018-02-14Merging r324645:Reid Kleckner2-1/+24
------------------------------------------------------------------------ r324645 | dwmw2 | 2018-02-08 12:06:05 -0800 (Thu, 08 Feb 2018) | 5 lines [X86] Support 'V' register operand modifier This allows the register name to be printed without the leading '%'. This can be used for emitting calls to the retpoline thunks from inline asm. ------------------------------------------------------------------------ llvm-svn: 325089
2018-02-14Merging r324449:Reid Kleckner2-39/+68
------------------------------------------------------------------------ r324449 | chandlerc | 2018-02-06 22:16:24 -0800 (Tue, 06 Feb 2018) | 15 lines [x86/retpoline] Make the external thunk names exactly match the names that happened to end up in GCC. This is really unfortunate, as the names don't have much rhyme or reason to them. Originally in the discussions it seemed fine to rely on aliases to map different names to whatever external thunk code developers wished to use but there are practical problems with that in the kernel it turns out. And since we're discovering this practical problems late and since GCC has already shipped a release with one set of names, we are forced, yet again, to blindly match what is there. Somewhat rushing this patch out for the Linux kernel folks to test and so we can get it patched into our releases. Differential Revision: https://reviews.llvm.org/D42998 ------------------------------------------------------------------------ llvm-svn: 325088
2018-02-01Merging r323288:Reid Kleckner2-47/+27
------------------------------------------------------------------------ r323288 | ruiu | 2018-01-23 16:26:57 -0800 (Tue, 23 Jan 2018) | 3 lines Fix retpoline PLT header size for i386. Differential Revision: https://reviews.llvm.org/D42397 ------------------------------------------------------------------------ llvm-svn: 324026
2018-02-01Merging r323155 in LLD, with modifications to handle int3 fillReid Kleckner8-10/+548
Original commit message: ------------------------------------------------------------------------ r323155 | chandlerc | 2018-01-22 14:05:25 -0800 (Mon, 22 Jan 2018) | 133 lines Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 ------------------------------------------------------------------------ llvm-svn: 324025
2018-02-01Merging r323155:Reid Kleckner3-0/+23
------------------------------------------------------------------------ r323155 | chandlerc | 2018-01-22 14:05:25 -0800 (Mon, 22 Jan 2018) | 133 lines Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 ------------------------------------------------------------------------ llvm-svn: 324012
2018-02-01Merging r323915:Reid Kleckner3-53/+87
------------------------------------------------------------------------ r323915 | chandlerc | 2018-01-31 12:56:37 -0800 (Wed, 31 Jan 2018) | 17 lines [x86] Make the retpoline thunk insertion a machine function pass. Summary: This removes the need for a machine module pass using some deeply questionable hacks. This should address PR36123 which is a case where in full LTO the memory usage of a machine module pass actually ended up being significant. We should revert this on trunk as soon as we understand and fix the memory usage issue, but we should include this in any backports of retpolines themselves. Reviewers: echristo, MatzeB Subscribers: sanjoy, mcrosier, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D42726 ------------------------------------------------------------------------ llvm-svn: 324009
2018-02-01Merging r323155:Reid Kleckner32-12/+1364
------------------------------------------------------------------------ r323155 | chandlerc | 2018-01-22 14:05:25 -0800 (Mon, 22 Jan 2018) | 133 lines Introduce the "retpoline" x86 mitigation technique for variant #2 of the speculative execution vulnerabilities disclosed today, specifically identified by CVE-2017-5715, "Branch Target Injection", and is one of the two halves to Spectre.. Summary: First, we need to explain the core of the vulnerability. Note that this is a very incomplete description, please see the Project Zero blog post for details: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html The basis for branch target injection is to direct speculative execution of the processor to some "gadget" of executable code by poisoning the prediction of indirect branches with the address of that gadget. The gadget in turn contains an operation that provides a side channel for reading data. Most commonly, this will look like a load of secret data followed by a branch on the loaded value and then a load of some predictable cache line. The attacker then uses timing of the processors cache to determine which direction the branch took *in the speculative execution*, and in turn what one bit of the loaded value was. Due to the nature of these timing side channels and the branch predictor on Intel processors, this allows an attacker to leak data only accessible to a privileged domain (like the kernel) back into an unprivileged domain. The goal is simple: avoid generating code which contains an indirect branch that could have its prediction poisoned by an attacker. In many cases, the compiler can simply use directed conditional branches and a small search tree. LLVM already has support for lowering switches in this way and the first step of this patch is to disable jump-table lowering of switches and introduce a pass to rewrite explicit indirectbr sequences into a switch over integers. However, there is no fully general alternative to indirect calls. We introduce a new construct we call a "retpoline" to implement indirect calls in a non-speculatable way. It can be thought of loosely as a trampoline for indirect calls which uses the RET instruction on x86. Further, we arrange for a specific call->ret sequence which ensures the processor predicts the return to go to a controlled, known location. The retpoline then "smashes" the return address pushed onto the stack by the call with the desired target of the original indirect call. The result is a predicted return to the next instruction after a call (which can be used to trap speculative execution within an infinite loop) and an actual indirect branch to an arbitrary address. On 64-bit x86 ABIs, this is especially easily done in the compiler by using a guaranteed scratch register to pass the target into this device. For 32-bit ABIs there isn't a guaranteed scratch register and so several different retpoline variants are introduced to use a scratch register if one is available in the calling convention and to otherwise use direct stack push/pop sequences to pass the target address. This "retpoline" mitigation is fully described in the following blog post: https://support.google.com/faqs/answer/7625886 We also support a target feature that disables emission of the retpoline thunk by the compiler to allow for custom thunks if users want them. These are particularly useful in environments like kernels that routinely do hot-patching on boot and want to hot-patch their thunk to different code sequences. They can write this custom thunk and use `-mretpoline-external-thunk` *in addition* to `-mretpoline`. In this case, on x86-64 thu thunk names must be: ``` __llvm_external_retpoline_r11 ``` or on 32-bit: ``` __llvm_external_retpoline_eax __llvm_external_retpoline_ecx __llvm_external_retpoline_edx __llvm_external_retpoline_push ``` And the target of the retpoline is passed in the named register, or in the case of the `push` suffix on the top of the stack via a `pushl` instruction. There is one other important source of indirect branches in x86 ELF binaries: the PLT. These patches also include support for LLD to generate PLT entries that perform a retpoline-style indirection. The only other indirect branches remaining that we are aware of are from precompiled runtimes (such as crt0.o and similar). The ones we have found are not really attackable, and so we have not focused on them here, but eventually these runtimes should also be replicated for retpoline-ed configurations for completeness. For kernels or other freestanding or fully static executables, the compiler switch `-mretpoline` is sufficient to fully mitigate this particular attack. For dynamic executables, you must compile *all* libraries with `-mretpoline` and additionally link the dynamic executable and all shared libraries with LLD and pass `-z retpolineplt` (or use similar functionality from some other linker). We strongly recommend also using `-z now` as non-lazy binding allows the retpoline-mitigated PLT to be substantially smaller. When manually apply similar transformations to `-mretpoline` to the Linux kernel we observed very small performance hits to applications running typical workloads, and relatively minor hits (approximately 2%) even for extremely syscall-heavy applications. This is largely due to the small number of indirect branches that occur in performance sensitive paths of the kernel. When using these patches on statically linked applications, especially C++ applications, you should expect to see a much more dramatic performance hit. For microbenchmarks that are switch, indirect-, or virtual-call heavy we have seen overheads ranging from 10% to 50%. However, real-world workloads exhibit substantially lower performance impact. Notably, techniques such as PGO and ThinLTO dramatically reduce the impact of hot indirect calls (by speculatively promoting them to direct calls) and allow optimized search trees to be used to lower switches. If you need to deploy these techniques in C++ applications, we *strongly* recommend that you ensure all hot call targets are statically linked (avoiding PLT indirection) and use both PGO and ThinLTO. Well tuned servers using all of these techniques saw 5% - 10% overhead from the use of retpoline. We will add detailed documentation covering these components in subsequent patches, but wanted to make the core functionality available as soon as possible. Happy for more code review, but we'd really like to get these patches landed and backported ASAP for obvious reasons. We're planning to backport this to both 6.0 and 5.0 release streams and get a 5.0 release with just this cherry picked ASAP for distros and vendors. This patch is the work of a number of people over the past month: Eric, Reid, Rui, and myself. I'm mailing it out as a single commit due to the time sensitive nature of landing this and the need to backport it. Huge thanks to everyone who helped out here, and everyone at Intel who helped out in discussions about how to craft this. Also, credit goes to Paul Turner (at Google, but not an LLVM contributor) for much of the underlying retpoline design. Reviewers: echristo, rnk, ruiu, craig.topper, DavidKreitzer Subscribers: sanjoy, emaste, mcrosier, mgorny, mehdi_amini, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D41723 ------------------------------------------------------------------------ llvm-svn: 324007
2017-12-06Merging r312509:llvmorg-5.0.1-rc3llvmorg-5.0.1Tom Stellard2-7/+58
------------------------------------------------------------------------ r312509 | dannyb | 2017-09-04 19:17:42 -0700 (Mon, 04 Sep 2017) | 1 line NewGVN: Fix PR 34452 by passing instruction all the way down when we do aggregate value simplification ------------------------------------------------------------------------ llvm-svn: 319952
2017-12-05Merging r314733:Tom Stellard4-10/+72
------------------------------------------------------------------------ r314733 | rsmith | 2017-10-02 15:43:36 -0700 (Mon, 02 Oct 2017) | 5 lines PR33839: Fix -Wunused handling for structured binding declarations. We warn about a structured binding declaration being unused only if none of its bindings are used. ------------------------------------------------------------------------ llvm-svn: 319847
2017-12-05Merging r314107:Tom Stellard1-1/+4
------------------------------------------------------------------------ r314107 | sylvestre | 2017-09-25 07:08:35 -0700 (Mon, 25 Sep 2017) | 9 lines Fix clangd when built with LLVM_LINK_LLVM_DYLIB=ON Reviewers: malaperle, malaperle-ericsson, bkramer Reviewed By: bkramer Subscribers: bkramer, mgorny, ilya-biryukov, cfe-commits Differential Revision: https://reviews.llvm.org/D38228 ------------------------------------------------------------------------ llvm-svn: 319846
2017-12-05Merging r311734:Tom Stellard7-2/+79
------------------------------------------------------------------------ r311734 | ruiu | 2017-08-24 16:51:40 -0700 (Thu, 24 Aug 2017) | 44 lines [MACH-O] Fix the ASM code generated for __stub_helpers section Patch by Patricio Villalobos. I discovered that lld for darwin is generating the wrong code for lazy bindings in the __stub_helper section (at least for osx 10.12). This is the way i can reproduce this problem, using this program: #include <stdio.h> int main(int argc, char **argv) { printf("C: printf!\n"); puts("C: puts!\n"); return 0; } Then I link it using i have tested it in 3.9, 4.0 and 4.1 versions: $ clang -c hello.c $ lld -flavor darwin hello.o -o h1 -lc When i execute the binary h1 the system gives me the following error: C: printf! dyld: lazy symbol binding failed: BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB has segment 4 which is too large (0..3) dyld: BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB has segment 4 which is too large (0..3) Trace/BPT trap: 5 Investigating the code, it seems that the problem is that the asm code generated in the file StubPass.cpp, specifically in the line 323,when it adds, what it seems an arbitrary number (12) to the offset into the lazy bind opcodes section, but it should be calculated depending on the MachONormalizedFileBinaryWrite::lazyBindingInfo result. I confirmed this bug by patching the code manually in the binary and writing the right offset in the asm code (__stub_helper). This patch fixes the content of the atom that contains the assembly code when the offset is known. Differential Revision: https://reviews.llvm.org/D35387 ------------------------------------------------------------------------ llvm-svn: 319825
2017-12-03bpf: fix bug on silently truncating 64-bit immediateYonghong Song3-3/+42
We came across an llvm bug when compiling some testcases that 64-bit immediates are silently truncated into 32-bit and then packed into BPF_JMP | BPF_K encoding. This caused comparison with wrong value. This bug looks to be introduced by r308080 (llvm 5.0). The Select_Ri pattern is supposed to be lowered into J*_Ri while the latter only support 32-bit immediate encoding, therefore Select_Ri should have similar immediate predicate check as what J*_Ri are doing. The bug is fixed by git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315889 91177308-0d34-0410-b5e6-96231b3b80d8 in llvm 6.0. This patch is largely the same as the fix in llvm 6.0 except one minor adjustment for the test case. Reported-by: John Fastabend <john.fastabend@gmail.com> Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Yonghong Song <yhs@fb.com> llvm-svn: 319633
2017-11-28Merging r316035:llvmorg-5.0.1-rc2Tom Stellard3-0/+28
------------------------------------------------------------------------ r316035 | tnorthover | 2017-10-17 14:43:52 -0700 (Tue, 17 Oct 2017) | 6 lines AArch64: account for possible frame index operand in compares. If the address of a local is used in a comparison, AArch64 can fold the address-calculation into the comparison via "adds". Unfortunately, a couple of places (both hit in this one test) are not ready to deal with that yet and just assume the first source operand is a register. ------------------------------------------------------------------------ llvm-svn: 319231
2017-11-28Merging r319130:Tom Stellard2-1/+28
------------------------------------------------------------------------ r319130 | matze | 2017-11-27 17:17:52 -0800 (Mon, 27 Nov 2017) | 7 lines ARM: Fix PR32578 https://llvm.org/PR32578 I simplified and converted the reproducer into a lit test. Patch by Vedant Kumar! ------------------------------------------------------------------------ llvm-svn: 319181
2017-11-28Merging r311456:Tom Stellard1-1/+1
------------------------------------------------------------------------ r311456 | krasimir | 2017-08-22 07:28:01 -0700 (Tue, 22 Aug 2017) | 13 lines [clang-format] Fix lines regression in clang-format.py Summary: This patch fixes a regression after https://reviews.llvm.org/rL305665, which updates the structure of the `lines` variable. Reviewers: djasper Reviewed By: djasper Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D37011 ------------------------------------------------------------------------ llvm-svn: 319179
2017-11-27Merging r318848:Tom Stellard2-2/+9
------------------------------------------------------------------------ r318848 | hahnfeld | 2017-11-22 09:15:20 -0800 (Wed, 22 Nov 2017) | 7 lines Fix for OMP doacross implementation on Power Power has a weak consistency model so we need memory barriers to make writes (both from runtime and from user code) available for all threads. Differential Revision: https://reviews.llvm.org/D40175 ------------------------------------------------------------------------ llvm-svn: 319057
2017-11-27Merging r318658:Tom Stellard1-8/+15
------------------------------------------------------------------------ r318658 | achurbanov | 2017-11-20 08:00:42 -0800 (Mon, 20 Nov 2017) | 4 lines Fixed OMP doacross implementation on 32-bit platforms. Differential Revision: https://reviews.llvm.org/D40171 ------------------------------------------------------------------------ llvm-svn: 319053
2017-11-27Merging r316106:Tom Stellard1-2/+2
------------------------------------------------------------------------ r316106 | labath | 2017-10-18 11:52:16 -0700 (Wed, 18 Oct 2017) | 4 lines lldb-server tests: Fix undefined behavior We were creating a StringRef pointing to a temporary string. Problem manifested itself when running the test on osx. ------------------------------------------------------------------------ llvm-svn: 319035
2017-11-27Merging r316181:Tom Stellard2-1/+17
------------------------------------------------------------------------ r316181 | jvesely | 2017-10-19 13:40:13 -0700 (Thu, 19 Oct 2017) | 4 lines AMDGPU: Parse r600 CPU name early and expose FMAF capability Improve amdgcn macro test Differential Revision: https://reviews.llvm.org/D38667 ------------------------------------------------------------------------ llvm-svn: 319032
2017-11-22Merging r313182:Tom Stellard1-1/+1
------------------------------------------------------------------------ r313182 | sylvestre | 2017-09-13 13:03:29 -0700 (Wed, 13 Sep 2017) | 13 lines SplitEmptyFunction should be true in the Mozilla coding style Summary: As defined here: https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style#Classes See for the downstream bug report: https://bugzilla.mozilla.org/show_bug.cgi?id=1399359 Reviewers: Typz, djasper Reviewed By: Typz Subscribers: klimek, cfe-commits Differential Revision: https://reviews.llvm.org/D37795 ------------------------------------------------------------------------ llvm-svn: 318858
2017-11-22Merging r318788:Tom Stellard2-62/+1
------------------------------------------------------------------------ r318788 | mcrosier | 2017-11-21 10:08:34 -0800 (Tue, 21 Nov 2017) | 16 lines [AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as *having* side effects. This partially reverts r298851. The the underlying issue is that we don't currently model the dependency between mrs (read system register) and msr (write system register) instructions. Something like the below should never be reordered: msr TPIDR_EL0, x0 ;; set thread pointer mrs x8, TPIDR_EL0 ;; read thread pointer but was being reordered after r298851. The functional part of the patch that wasn't reverted needed to remain in place in order to not break r299462. PR35317 ------------------------------------------------------------------------ llvm-svn: 318854
2017-11-22Merging r315086:Tom Stellard5-1/+64
------------------------------------------------------------------------ r315086 | compnerd | 2017-10-06 11:06:59 -0700 (Fri, 06 Oct 2017) | 8 lines Bitcode: add an auto-upgrade for LTO section name The bitcode reader looks specifically for `__DATA, __objc_catlist` as a section name. However, SVN r304661 removed the spaces (the two names are functionally equivalent but do not compare equally lexicographically). This causes compatibility issues. Add an auto-upgrade path for removing the spaces as well as use the new name in the LTO plugin. ------------------------------------------------------------------------ llvm-svn: 318851
2017-11-22Merging r313398:Tom Stellard2-2/+22
------------------------------------------------------------------------ r313398 | steven_wu | 2017-09-15 14:12:14 -0700 (Fri, 15 Sep 2017) | 19 lines [AutoUpgrade] Fix a compatibility issue with module flag Summary: After r304661, module flag to record objective-c image info section is encoded without whitespaces after comma. The new name is equivalent to the old one, except that when LTO a module built by old compiler and a module built by a new compiler, it will fail with conflicting values. Fix the issue by removing whitespaces in bitcode upgrade path. rdar://problem/34416934 Reviewers: compnerd Reviewed By: compnerd Subscribers: mehdi_amini, hans, llvm-commits Differential Revision: https://reviews.llvm.org/D37909 ------------------------------------------------------------------------ llvm-svn: 318850
2017-11-22Merging r318039:Tom Stellard1-2/+1
------------------------------------------------------------------------ r318039 | labath | 2017-11-13 06:03:17 -0800 (Mon, 13 Nov 2017) | 37 lines Revert "[lldb] Use OrcMCJITReplacement rather than MCJIT as the underlying JIT for LLDB" This commit really did not introduce any functional changes (for most people) but it turns out it's not for the reason we thought it was. The reason wasn't that Orc is a perfect drop-in replacement for MCJIT, but it was because we were never using Orc in the first place, as it was not initialized. Orc's initialization relies on a global constructor in the LLVMOrcJIT.a. Since this archive does not expose any symbols referenced from other object files, it does not get linked into liblldb when linking against llvm components statically. However, in an LLVM_LINK_LLVM_DYLIB=On build, LLVMOrcJit.a is linked into libLLVM.so using --whole-archive, so the global constructor does end up firing. The result of using Orc jit is pr34194, where lldb fails to evaluate even very simple expressions. This bug can be reproduced in non-LLVM_LINK_LLVM_DYLIB builds by making sure Orc jit is linked into liblldb, for example by #including llvm/ExecutionEngine/OrcMCJITReplacement.h in IRExecutionUnit.cpp (and adding OrcJIT as a dependency to the relevant CMakeLists.txt file). The bug reproduces (at least) on linux and osx. The root cause of the bug seems to be related to relocation processing. It seems Orc processes relocations earlier than the system it is replacing. This means the relocation processing happens before we have had a chance to remap section load addresses to reflect their address in the target process memory, so they end up pointing to locations in the lldb's address space instead. I am not sure whether this is a bug in Orc jit, or in how we are using it from lldb, but in any case it is preventing us from using Orc right now. Reverting this fixes LLVM_LINK_LLVM_DYLIB build, and makes it clear that we are in fact *not* using Orc, and we never really were. This reverts commit r279327. ------------------------------------------------------------------------ llvm-svn: 318845
2017-11-22Merging r315994:Tom Stellard12-74/+795
------------------------------------------------------------------------ r315994 | ericwf | 2017-10-17 06:03:17 -0700 (Tue, 17 Oct 2017) | 18 lines [libc++] Fix PR34898 - vector iterator constructors and assign method perform push_back instead of emplace_back. Summary: The constructors `vector(Iter, Iter, Alloc = Alloc{})` and `assign(Iter, Iter)` don't correctly perform EmplaceConstruction from the result of dereferencing the iterator. This results in them performing an additional and unneeded copy. This patch addresses the issue by correctly using `emplace_back` in C++11 and newer. There are also some bugs in our `insert` implementation, but those will be handled separately. @mclow.lists We should probably merge this into 5.1, agreed? Reviewers: mclow.lists, dlj, EricWF Reviewed By: mclow.lists, EricWF Subscribers: cfe-commits, mclow.lists Differential Revision: https://reviews.llvm.org/D38757 ------------------------------------------------------------------------ llvm-svn: 318837
2017-11-22Merging r312892:Tom Stellard3-26/+54
------------------------------------------------------------------------ r312892 | ericwf | 2017-09-10 16:41:20 -0700 (Sun, 10 Sep 2017) | 10 lines Fix PR34298 - Allow std::function with an incomplete return type. This patch fixes llvm.org/PR34298. Previously libc++ incorrectly evaluated the __invokable trait via the converting constructor `function(Tp)` [with Tp = std::function] whenever the copy constructor or copy assignment operator was required. This patch further constrains that constructor to short circut before evaluating the troublesome SFINAE when `Tp` matches std::function. The original patch is from Alex Lorenz. ------------------------------------------------------------------------ llvm-svn: 318835
2017-11-17Merging r318289:Tom Stellard3-8/+67
------------------------------------------------------------------------ r318289 | jdevlieghere | 2017-11-15 02:57:05 -0800 (Wed, 15 Nov 2017) | 14 lines [DebugInfo] Fix potential CU mismatch for SubprogramScopeDIEs. In constructAbstractSubprogramScopeDIE there can be a potential mismatch between `this` and the CU of ContextDIE when a scope is shared between two DISubprograms belonging to a different CU. In that case, `this` is the CU that was specified in the IR, but the CU of ContextDIE is that of the first subprogram that was emitted. This patch fixes the mismatch by looking up the CU of ContextDIE, and switching to use that. This fixes PR35212 (https://bugs.llvm.org/show_bug.cgi?id=35212) Patch by Philip Craig! Differential revision: https://reviews.llvm.org/D39981 ------------------------------------------------------------------------ llvm-svn: 318542
2017-11-16Merging r318207:Simon Dardis16-14/+547
------------------------------------------------------------------------ r318207 | sdardis | 2017-11-14 22:26:42 +0000 (Tue, 14 Nov 2017) | 18 lines Reland "[mips][mt][6/7] Add support for mftr, mttr instructions." This adjusts the tests to hopfully pacify the llvm-clang-x86_64-expensive-checks-win buildbot. Unlike many other instructions, these instructions have aliases which take coprocessor registers, gpr register, accumulator (and dsp accumulator) registers, floating point registers, floating point control registers and coprocessor 2 data and control operands. For the moment, these aliases are treated as pseudo instructions which are expanded into the underlying instruction. As a result, disassembling these instructions shows the underlying instruction and not the alias. Reviewers: slthakur, atanasyan Differential Revision: https://reviews.llvm.org/D35253 ------------------------------------------------------------------------ llvm-svn: 318386
2017-11-15Merging r312748:Tom Stellard2-6/+5
------------------------------------------------------------------------ r312748 | jroelofs | 2017-09-07 15:01:25 -0700 (Thu, 07 Sep 2017) | 10 lines Fix validation of the -mthread-model flag in the Clang driver The ToolChain class validates the -mthread-model flag in the constructor which doesn't work correctly since the thread model methods are virtual methods. The check is moved into Clang::ConstructJob() when constructing the internal command line. https://reviews.llvm.org/D37496 Patch by: Ian Tessier! ------------------------------------------------------------------------ llvm-svn: 318346
2017-11-15Merging r312043:Tom Stellard2-7/+4
------------------------------------------------------------------------ r312043 | rnk | 2017-08-29 14:44:21 -0700 (Tue, 29 Aug 2017) | 25 lines [cmake] Stop putting the revision info in LLVM_VERSION_STRING Summary: This reduces the number of build actions after a no-op commit from thousands to about six, which should be acceptable. If six actions is still too many, developers can disable the LLVM_APPEND_VC_REV cmake option. llvm-config.h is a widely included header that should rarely change. Before this patch, it would change after every re-configure. Very few users of llvm-config.h need to know the precise version, and those that do can migrate to incorporating LLVM_REVISION as provided by llvm/Support/VCSRevision.h. This should bring LLVM back to the behavior that it had before r306858 from June 30 2017. Most LLVM tools will now print a version string like "6.0.0svn" instead of "6.0.0-git-c40c2a23de4". Fixes PR34308 Reviewers: pcc, rafael, hans Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D37272 ------------------------------------------------------------------------ llvm-svn: 318344
2017-11-15Merging r310475:Tom Stellard2-0/+23
------------------------------------------------------------------------ r310475 | belleyb | 2017-08-09 06:47:01 -0700 (Wed, 09 Aug 2017) | 28 lines [Support] PR33388 - Fix formatv_object move constructor formatv_object currently uses the implicitly defined move constructor, but it is buggy. In typical use-cases, the problem doesn't show-up because all calls to the move constructor are elided. Thus, the buggy constructors are never invoked. The issue especially shows-up when code is compiled using the -fno-elide-constructors compiler flag. For instance, this is useful when attempting to collect accurate code coverage statistics. The exact issue is the following: The Parameters data member is correctly moved, thus making the parameters occupy a new memory location in the target object. Unfortunately, the default copying of the Adapters blindly copies the vector of pointers, leaving each of these pointers referencing the parameters in the original object instead of the copied one. These pointers quickly become dangling when the original object is deleted. This quickly leads to crashes. The solution is to update the Adapters pointers when performing a move. The copy constructor isn't useful for format objects and can thus be deleted. This resolves PR33388. Differential Revision: https://reviews.llvm.org/D34463 ------------------------------------------------------------------------ llvm-svn: 318333
2017-11-15Merging r315578:Tom Stellard2-0/+37
------------------------------------------------------------------------ r315578 | abataev | 2017-10-12 06:51:32 -0700 (Thu, 12 Oct 2017) | 7 lines [OPENMP] Fix PR34925: Fix getting thread_id lvalue for inlined regions in C. If we try to get the lvalue for thread_id variables in inlined regions, we did not use the correct version of function. Fixed this bug by adding overrided version of the function getThreadIDVariableLValue for inlined regions. ------------------------------------------------------------------------ llvm-svn: 318315
2017-11-15Merging r313776:Tom Stellard1-2/+3
------------------------------------------------------------------------ r313776 | marshall | 2017-09-20 10:34:11 -0700 (Wed, 20 Sep 2017) | 1 line Fix a bit of UB in __independent_bits_engine. Fixes PR#34663 ------------------------------------------------------------------------ llvm-svn: 318236
2017-11-15Merging r315464:Tom Stellard3-3/+22
------------------------------------------------------------------------ r315464 | abataev | 2017-10-11 08:29:40 -0700 (Wed, 11 Oct 2017) | 5 lines [OPENMP] Fix PR34916: Crash on mixing taskloop|tasks directives. If both taskloop and task directives are used at the same time in one program, we may ran into the situation when the particular type for task directive is reused for taskloop directives. Patch fixes this problem. ------------------------------------------------------------------------ llvm-svn: 318233
2017-11-14Merging r310905 and r310994:Tom Stellard1-13/+11
------------------------------------------------------------------------ r310905 | rnk | 2017-08-14 18:17:47 -0700 (Mon, 14 Aug 2017) | 11 lines Avoid PointerIntPair of constexpr EvalInfo structs They are stack allocated, so their alignment is not to be trusted. 32-bit MSVC only guarantees 4 byte stack alignment, even though alignof would tell you otherwise. I tried fixing this with __declspec align, but that apparently upsets GCC. Hopefully this version will satisfy all compilers. See PR32018 for some info about the mingw issues. Should supercede https://reviews.llvm.org/D34873 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r310994 | chandlerc | 2017-08-16 00:22:49 -0700 (Wed, 16 Aug 2017) | 6 lines Fix a UBSan failure where this boolean was copied when uninitialized. When r310905 moved the pointer and bool out of a PointerIntPair, it made them end up uninitialized and caused UBSan failures when copying the uninitialized boolean. However, making the pointer be null should avoid the reference to the boolean entirely. ------------------------------------------------------------------------ llvm-svn: 318225
2017-11-14Merging r315310:Tom Stellard3-4/+84
------------------------------------------------------------------------ r315310 | sdardis | 2017-10-10 06:34:45 -0700 (Tue, 10 Oct 2017) | 22 lines [mips] Partially fix PR34391 Previously, the parsing of the 'subu $reg, ($reg,) imm' relied on a parser which also rendered the operand to the instruction. In some cases the general parser could construct an MCExpr which was not a MCConstantExpr which MipsAsmParser was expecting. Address this by altering the special handling to cope with unexpected inputs and fine-tune the handling of cases where an register name that is not available in the current ABI is regarded as not a match for the custom parser but also not as an outright error. Also enforces the binutils restriction that only constants are accepted. This partially resolves PR34391. Thanks to Ed Maste for reporting the issue! Reviewers: nitesh.jain, arichardson Differential Revision: https://reviews.llvm.org/D37476 ------------------------------------------------------------------------ llvm-svn: 318192
2017-11-14Merging r317204 and r318172:Tom Stellard4-41/+263
------------------------------------------------------------------------ r317204 | sdardis | 2017-11-02 05:47:22 -0700 (Thu, 02 Nov 2017) | 15 lines [mips] Use register scavenging with MSA. MSA stores and loads to the stack are more likely to require an emergency GPR spill slot due to the smaller offsets available with those instructions. Handle this by overestimating the size of the stack by determining the largest offset presuming that all callee save registers are spilled and accounting of incoming arguments when determining whether an emergency spill slot is required. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39056 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r318172 | sdardis | 2017-11-14 11:11:45 -0800 (Tue, 14 Nov 2017) | 5 lines [mips] Simplify test for 5.0.1 (NFC) Simplify testing that an emergency spill slot is used when MSA is used so that it can be included in the 5.0.1 release. ------------------------------------------------------------------------ llvm-svn: 318191
2017-11-14Merging r317470:Tom Stellard2-4/+32
------------------------------------------------------------------------ r317470 | sdardis | 2017-11-06 02:50:04 -0800 (Mon, 06 Nov 2017) | 12 lines [mips] Fix PR35140 Mark all symbols involved with TLS relocations as being TLS symbols. This resolves PR35140. Thanks to Alex Crichton for reporting the issue! Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D39591 ------------------------------------------------------------------------ llvm-svn: 318188
2017-11-14Merging r314798:Tom Stellard4-0/+69
------------------------------------------------------------------------ r314798 | sdardis | 2017-10-03 06:45:49 -0700 (Tue, 03 Oct 2017) | 9 lines [mips] Enable spilling and reloading of the dsp register set. The dsp register class is an alias of the gpr register class, so we have to define instructions for spilling and reloading. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D38038 ------------------------------------------------------------------------ llvm-svn: 318183
2017-11-14Merging r310543:Tom Stellard2-0/+21
------------------------------------------------------------------------ r310543 | pcc | 2017-08-09 18:07:44 -0700 (Wed, 09 Aug 2017) | 9 lines Linker: Create a function declaration when moving a non-prevailing alias of function type. We were previously creating a global variable of function type, which is invalid IR. This issue was exposed by r304690, in which we started asserting that global variables were of a valid type. Fixes PR33462. Differential Revision: https://reviews.llvm.org/D36438 ------------------------------------------------------------------------ llvm-svn: 318181
2017-11-14Merging r310522:Tom Stellard11-2/+176
------------------------------------------------------------------------ r310522 | belleyb | 2017-08-09 13:58:39 -0700 (Wed, 09 Aug 2017) | 8 lines [Linker] PR33527 - Linker::LinkOnlyNeeded should import AppendingLinkage globals Linker::LinkOnlyNeeded should always import globals with AppendingLinkage. This resolves PR33527. Differential Revision: https://reviews.llvm.org/D34448 ------------------------------------------------------------------------ llvm-svn: 318180
2017-11-14Merging r317115:Tom Stellard2-3/+46
------------------------------------------------------------------------ r317115 | jlpeyton | 2017-11-01 12:44:42 -0700 (Wed, 01 Nov 2017) | 19 lines [OpenMP] Fix race condition in omp_init_lock This is a partial fix for bug 34050. This prevents callers of omp_set_lock (which does not hold __kmp_global_lock) from ever seeing an uninitialized version of __kmp_i_lock_table.table. It does not solve a use-after-free race condition if omp_set_lock obtains a pointer to __kmp_i_lock_table.table before it is updated and then attempts to dereference afterwards. That race is far less likely and can be handled in a separate patch. The unit test usually segfaults on the current trunk revision. It passes with the patch. Patch by Adam Azarchs Differential Revision: https://reviews.llvm.org/D39439 ------------------------------------------------------------------------ llvm-svn: 318178
2017-11-14Merging r316452:Tom Stellard1-0/+7
------------------------------------------------------------------------ r316452 | jlpeyton | 2017-10-24 09:10:09 -0700 (Tue, 24 Oct 2017) | 9 lines Disable threadprivate data cleanup if runtime is terminating The problem is due to the runtime's threadprivate cleanup code which tries to access data that was already destroyed by one of the root threads. __kmp_init_gtid is used as a checker here since it is set to false before actual resource cleanup is done in __kmp_cleanup(). Patch by Hansang Bae ------------------------------------------------------------------------ llvm-svn: 318176
2017-11-14Merging r311269:Tom Stellard1-1/+1
------------------------------------------------------------------------ r311269 | jlpeyton | 2017-08-19 16:53:36 -0700 (Sat, 19 Aug 2017) | 8 lines Use va_copy instead of __va_copy to fix building libomp against musl libc Fixes https://bugs.llvm.org/show_bug.cgi?id=34040 Patch by Peter Levine Differential Revision: https://reviews.llvm.org/D36343 ------------------------------------------------------------------------ llvm-svn: 318175