Age | Commit message (Collapse) | Author | Files | Lines |
|
Reviewers: hansw, jdoerfert
Subscribers: sylvestre.ledru, mgorny, hans, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70460
(cherry picked from commit edf6717d8d30034da932b95350898e03c90a5082)
|
|
------------------------------------------------------------------------
r372480 | ssarda | 2019-09-21 11:16:37 -0700 (Sat, 21 Sep 2019) | 9 lines
SROA: Check Total Bits of vector type
While Promoting alloca instruction of Vector Type,
Check total size in bits of its slices too.
If they don't match, don't promote the alloca instruction.
Bug : https://bugs.llvm.org/show_bug.cgi?id=42585
------------------------------------------------------------------------
|
|
These two intrinsics are lowered to calls so should prevent the formation of
CTR loops. In a subsequent patch, we will handle all currently known intrinsics
and prevent the formation of HW loops if any unknown intrinsics are encountered.
Differential revision: https://reviews.llvm.org/D68841
(cherry picked from commit 97e36260709c541044f30092b420238511e13e5b)
|
|
------------------------------------------------------------------------
r366572 | thanm | 2019-07-19 06:13:54 -0700 (Fri, 19 Jul 2019) | 12 lines
[NFC] include cstdint/string prior to using uint8_t/string
Summary: include proper header prior to use of uint8_t typedef
and std::string.
Subscribers: llvm-commits
Reviewers: cherry
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64937
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r370113 | luismarques | 2019-08-27 14:37:57 -0700 (Tue, 27 Aug 2019) | 5 lines
[RISCV] Implement RISCVRegisterInfo::getPointerRegClass
Fixes bug 43041
Differential Revision: https://reviews.llvm.org/D66752
------------------------------------------------------------------------
|
|
When converting reg+reg shifts to reg+imm rotates, we neglect to consider the
CodeGenOnly versions of the 32-bit shift mnemonics. This means we produce a
rotate with missing operands which causes a crash.
Committing this fix without review since it is non-controversial that the list
of mnemonics to consider should include the 64-bit aliases for the exact
mnemonics.
Fixes PR44183.
(cherry picked from commit 241cbf201a6f4b7658697e3c76fc6e741d049a01)
|
|
The Overflow version of XO-Form instruction uses the SO, OV and
OV32 special registers.
This changes modifies existing multiclasses and instruction
definitions to allow for the use of the XER register to record
the various types if overflow from possible add, subtract and
multiply instructions. It then modifies the existing instructions
as to use these multiclasses as needed.
Patch By: Kamau Bridgeman
Differential Revision: https://reviews.llvm.org/D66902
(cherry picked from commit fdf3d1766bbabb48a397fae646facbe2690313f6)
|
|
------------------------------------------------------------------------
r373389 | leonardchan | 2019-10-01 13:30:46 -0700 (Tue, 01 Oct 2019) | 10 lines
[ASan] Make GlobalsMD member a const reference.
PR42924 points out that copying the GlobalsMetadata type during
construction of AddressSanitizer can result in exteremely lengthened
build times for translation units that have many globals. This can be addressed
by just making the GlobalsMD member in AddressSanitizer a reference to
avoid the copy. The GlobalsMetadata type is already passed to the
constructor as a reference anyway.
Differential Revision: https://reviews.llvm.org/D68287
------------------------------------------------------------------------
|
|
The store splitting transform was assuming a simple type (MVT),
but that's not necessarily the case as shown in the test.
(cherry picked from commit 8e34dd941cb304c785ef623633ad663b59cfced0)
|
|
On RHEL, the OS tooling (ar, ranlib) is not deterministic by default.
Therefore, we cannot get bit-for-bit identical builds.
The goal of this patch is that it adds the flags required to force determinism.
Differential Revision: https://reviews.llvm.org/D64817
(cherry picked from commit c84c62c50aa8524dbf96217c337f3b5ee4139000)
|
|
------------------------------------------------------------------------
r373655 | nickdesaulniers | 2019-10-03 13:10:02 -0700 (Thu, 03 Oct 2019) | 16 lines
[AArch64InstPrinter] prefer bfi to bfc for < armv8.2-a
Summary:
Fixes pr/42576.
Link: https://github.com/ClangBuiltLinux/linux/issues/697
Reviewers: t.p.northover
Reviewed By: t.p.northover
Subscribers: kristof.beyls, hiraditya, llvm-commits, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68356
------------------------------------------------------------------------
|
|
Summary:
This also changes the test-release.sh script to build using the monorepo
layout instead of copying sub-projects into llvm/tools or llvm/projects.
Reviewers: jdoerfert, hans
Reviewed By: hans
Subscribers: hans, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70353
(cherry picked from commit c97f303880c26007b4e74e3fd0bde5ed802a25f1)
|
|
------------------------------------------------------------------------
r373220 | probinson | 2019-09-30 08:11:23 -0700 (Mon, 30 Sep 2019) | 12 lines
[SSP] [3/3] cmpxchg and addrspacecast instructions can now
trigger stack protectors. Fixes PR42238.
Add test coverage for llvm.memset, as proxy for all llvm.mem*
intrinsics. There are two issues here: (1) they could be lowered to a
libc call, which could be intercepted, and do Bad Stuff; (2) with a
non-constant size, they could overwrite the current stack frame.
The test was mostly written by Matt Arsenault in r363169, which was
later reverted; I tweaked what he had and added the llvm.memset part.
Differential Revision: https://reviews.llvm.org/D67845
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r373219 | probinson | 2019-09-30 08:08:38 -0700 (Mon, 30 Sep 2019) | 3 lines
[SSP] [2/3] Refactor an if/dyn_cast chain to switch on opcode. NFC
Differential Revision: https://reviews.llvm.org/D67844
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r373216 | probinson | 2019-09-30 08:01:35 -0700 (Mon, 30 Sep 2019) | 7 lines
[SSP] [1/3] Revert "StackProtector: Use PointerMayBeCaptured"
"Captured" and "relevant to Stack Protector" are not the same thing.
This reverts commit f29366b1f594f48465c5a2754bcffac6d70fd0b1.
aka r363169.
Differential Revision: https://reviews.llvm.org/D67842
------------------------------------------------------------------------
To avoid changing the ABI, the VisitedPHIs member from the StackProtector class
was replaced with a local variable in StackProtector::RequiresStackProtector().
|
|
------------------------------------------------------------------------
r370547 | wmi | 2019-08-30 16:01:22 -0700 (Fri, 30 Aug 2019) | 24 lines
[GVN] Verify value equality before doing phi translation for call instruction
This is an updated version of https://reviews.llvm.org/D66909 to fix PR42605.
Basically, current phi translatation translates an old value number to an new
value number for a call instruction based on the literal equality of call
expression, without verifying there is no clobber in between. This is incorrect.
To get a finegrain check, use MachineDependence analysis to do the job. However,
this is still not ideal. Although given a call instruction,
`MemoryDependenceResults::getCallDependencyFrom` returns identical call
instructions without clobber in between using MemDepResult with its DepType to
be `Def`. However, identical is too strict here and we want it to be relaxed a
little to consider phi-translation -- callee is the same, param operands can be
different. That means changing the semantic of `MemDepResult::Def` and I don't
know the potential impact.
So currently the patch is still conservative to only handle
MemDepResult::NonFuncLocal, which means the current call has no function local
clobber. If there is clobber, even if the clobber doesn't stand in between the
current call and the call with the new value, we won't do phi-translate.
Differential Revision: https://reviews.llvm.org/D67013
------------------------------------------------------------------------
|
|
Summary:
Rolls back the remaining bad optimizations introduced in
eb15d00193f. Some of them were already rolled back in e661f946a7db and
this finishes the job.
Fixes https://bugs.llvm.org/show_bug.cgi?id=44012.
Reviewers: dschuff, aheejin
Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70347
(cherry picked from commit 194d7ec081c31ee4ed82bfa3cade4ef30afab912)
|
|
Summary: This is a bug fix for further issues in PR43585.
Reviewers: rnk, RKSimon, craig.topper, andrew.w.kaylor
Subscribers: hiraditya, llvm-commits, annita.zhang
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70224
(cherry picked from commit 8723b95cefa4f2a891c2b496dca79f1734cf1d1c)
|
|
------------------------------------------------------------------------
r375403 | lenary | 2019-10-21 03:00:34 -0700 (Mon, 21 Oct 2019) | 30 lines
[MemCpyOpt] Fixing Incorrect Code Motion while Handling Aggregate Type Values
Summary:
When MemCpyOpt is handling aggregate type values, if an instruction (let's call it P) between the targeting load (L) and store (S) clobbers the source pointer of L, it will try to hoist S before P. This process will also hoist S's data dependency instructions.
However, the current implementation has a bug that if one of S's dependency instructions is //also// a user of P, MemCpyOpt will not prevent it from being hoisted above P and cause a use-before-define error. For example, in the newly added test file (i.e. `aggregate-type-crash.ll`), it will try to hoist both `store %my_struct %1, %my_struct* %3` and its dependent, `%3 = bitcast i8* %2 to %my_struct*`, above `%2 = call i8* @my_malloc(%my_struct* %0)`. Creating the following BB:
```
entry:
%1 = bitcast i8* %4 to %my_struct*
%2 = bitcast %my_struct* %1 to i8*
%3 = bitcast %my_struct* %0 to i8*
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %2, i8* align 4 %3, i64 8, i1 false)
%4 = call i8* @my_malloc(%my_struct* %0)
ret void
```
Where there is a use-before-define error between `%1` and `%4`.
Update: The compiler for the Pony Programming Language [also encounter the same bug](https://github.com/ponylang/ponyc/issues/3140)
Patch by Min-Yih Hsu (myhsu)
Reviewers: eugenis, pcc, dblaikie, dneilson, t.p.northover, lattner
Reviewed By: eugenis
Subscribers: lenary, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66060
------------------------------------------------------------------------
|
|
This change introduced a use of std::make_unique() which some compilers
reject in c++11 mode.
|
|
This still fixes PR43479, but does it in a way that does not change
the libLLVM-9.so ABI.
|
|
|
|
Summary:
Add instruction marker to MachineInstr ExtraInfo. This does almost the
same thing as Pre/PostInstrSymbols, except that it doesn't create a label until
printing instructions. This allows for labels to be put around instructions that
are deleted/duplicated somewhere.
Use this marker to track heap alloc site call instructions.
Reviewers: rnk
Subscribers: MatzeB, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69536
cherry picked from 742043047c973999eac7734e53f7872973933f24 with some
modifications.
|
|
------------------------------------------------------------------------
r373397 | ctopper | 2019-10-01 14:55:55 -0700 (Tue, 01 Oct 2019) | 8 lines
[X86] convertToThreeAddress, make sure second operand of SUB32ri is really an immediate before calling getImm().
It might be a symbol instead. We can't fold those since we can't
negate them.
Similar for other SUB with immediates.
Fixes PR43529.
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r372886 | spatel | 2019-09-25 08:08:33 -0700 (Wed, 25 Sep 2019) | 7 lines
[DAGCombiner] add one-use restriction to vector transform with cheap extract
We might be able to do better on the example in the test,
but in general, we should not scalarize a splatted vector
binop if there are other uses of the binop. Otherwise, we
can end up with code as we had - a scalar op that is
redundant with a vector op.
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r372883 | spatel | 2019-09-25 07:57:45 -0700 (Wed, 25 Sep 2019) | 1 line
[x86] add test for multi-use scalarization of vector binop; NFC
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r372675 | aemerson | 2019-09-23 17:09:23 -0700 (Mon, 23 Sep 2019) | 7 lines
[GlobalISel][IRTranslator] Fix switch table lowering to use signed LE not unsigned.
We were miscompiling switch value comparisons with the wrong signedness, which
shows up when we have things like switch case values with i1 types, which end up
being legalized incorrectly.
Fixes PR43383
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r374598 | atanasyan | 2019-10-11 14:51:33 -0700 (Fri, 11 Oct 2019) | 12 lines
[mips] Store 64-bit `li.d' operand as a single 8-byte value
Now assembler generates two consecutive `.4byte` directives to store
64-bit `li.d' operand. The first directive stores high 4-byte of the
value. The second directive stores low 4-byte of the value. But on
64-bit system we load this value at once and get wrong result if the
system is little-endian.
This patch fixes the bug. It stores the `li.d' operand as a single
8-byte value.
Differential Revision: https://reviews.llvm.org/D68778
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r374544 | atanasyan | 2019-10-11 05:33:12 -0700 (Fri, 11 Oct 2019) | 12 lines
[mips] Fix loading "double" immediate into a GPR and FPR
If a "double" (64-bit) value has zero low 32-bits, it's possible to load
such value into a GP/FP registers as an instruction immediate. But now
assembler loads only high 32-bits of the value.
For example, if a target register is GPR the `li.d $4, 1.0` instruction
converts into the `lui $4, 16368` one. As a result, we get `0x3FF00000`
in the register. While a correct representation of the `1.0` value is
`0x3FF0000000000000`. The patch fixes that.
Differential Revision: https://reviews.llvm.org/D68776
------------------------------------------------------------------------
------------------------------------------------------------------------
r374548 | atanasyan | 2019-10-11 05:58:37 -0700 (Fri, 11 Oct 2019) | 1 line
[mips] Follow-up to r374544. Fix test case.
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r374165 | atanasyan | 2019-10-09 06:12:27 -0700 (Wed, 09 Oct 2019) | 1 line
[mips] Rename local variable. NFC
------------------------------------------------------------------------
|
|
------------------------------------------------------------------------
r374164 | atanasyan | 2019-10-09 06:12:21 -0700 (Wed, 09 Oct 2019) | 8 lines
[mips] Split expandLoadImmReal into multiple methods. NFC
The `expandLoadImmReal` handles four different and almost non-overlapping
cases: loading a "single" float immediate into a GPR, loading a "single"
float immediate into a FPR, and the same couple for a "double" float
immediate.
It's better to move each `else if` branch into separate methods.
------------------------------------------------------------------------
|
|
This works around a bug in Debian's patchset for glibc. The bug is
described in detail in the upstream debian bug:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=943798, but the short
version of it is that glibc on any Debian based distro don't load
libraries unless it has a .ARM.attribute section.
Reviewed by: jhenderson, rupprecht, MaskRay, jakehehrlich
Differential Revision: https://reviews.llvm.org/D69188
Patch by Tobias Hieta.
(cherry picked from commit fb4a55010ee9bd03720609c8542f770775576fc8)
|
|
------------------------------------------------------------------------
r375265 | kerbowa | 2019-10-18 11:20:30 -0700 (Fri, 18 Oct 2019) | 13 lines
AMDGPU: Fix SMEM WAR hazard for gfx10 readlane
Summary: Hazard recognizer fails to see hazard with V_READLANE_B32_gfx10.
Reviewers: rampitec
Reviewed By: rampitec
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69172
------------------------------------------------------------------------
|
|
Summary:
This is an alternate approach to D63396
Currently funclets reuse the same stack slots that are used in the
parent function for saving callee-saved xmm registers. If the parent
function modifies a callee-saved xmm register before an excpetion is
thrown, the catch handler will overwrite the original saved value.
This patch allocates space in funclets stack for saving callee-saved xmm
registers and uses RSP instead RBP to access memory.
Signed-off-by: Pengfei Wang <pengfei.wang@intel.com>
Reviewers: rnk, RKSimon, craig.topper, annita.zhang, LuoYuanke, andrew.w.kaylor
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D66596
Signed-off-by: Pengfei Wang <pengfei.wang@intel.com>
llvm-svn: 370005
(cherry picked from commit 564fb58a32a808c34d809820d00e2f23c0307a71)
|
|
Summary:
In the long run we should come up with another mechanism for marking
call instructions as heap allocation sites, and remove this workaround.
For now, we've had two bug reports about this, so let's apply this
workaround. SLH (the other client of instruction labels) probably has
the same bug, but the solution there is more likely to be to mark the
call instruction as not duplicatable, which doesn't work for debug info.
Reviewers: akhuang
Subscribers: aprantl, hiraditya, aganea, chandlerc, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69068
llvm-svn: 375137
(cherry picked from commit fc69ad09882ccfbedc2d06afc971d59eb6a24ee0)
|
|
------------------------------------------------------------------------
r372020 | rnk | 2019-09-16 11:49:09 -0700 (Mon, 16 Sep 2019) | 30 lines
[PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups
This fixes relocations against __profd_ symbols in discarded sections,
which is PR41380.
In general, instrumentation happens very early, and optimization and
inlining happens afterwards. The counters for a function are calculated
early, and after inlining, counters for an inlined function may be
widely referenced by other functions.
For C++ inline functions of all kinds (linkonce_odr &
available_externally mainly), instr profiling wants to deduplicate these
__profc_ and __profd_ globals. Otherwise the binary would be quite
large.
I made __profd_ and __profc_ comdat in r355044, but I chose to make
__profd_ internal. At the time, I was only dealing with coverage, and in
that case, none of the instrumentation needs to reference __profd_.
However, if you use PGO, then instrumentation passes add calls to
__llvm_profile_instrument_range which reference __profd_ globals. The
solution is to make these globals externally visible by using
linkonce_odr linkage for data as was done for counters.
This is safe because PGO adds a CFG hash to the names of the data and
counter globals, so if different TUs have different globals, they will
get different data and counter arrays.
Reviewers: xur, hans
Differential Revision: https://reviews.llvm.org/D67579
------------------------------------------------------------------------
------------------------------------------------------------------------
r372182 | rnk | 2019-09-17 14:10:49 -0700 (Tue, 17 Sep 2019) | 12 lines
[PGO] Don't use comdat groups for counters & data on COFF
For COFF, a comdat group is really a symbol marked
IMAGE_COMDAT_SELECT_ANY and zero or more other symbols marked
IMAGE_COMDAT_SELECT_ASSOCIATIVE. Typically the associative symbols in
the group are not external and are not referenced by other TUs, they are
things like debug info, C++ dynamic initializers, or other section
registration schemes. The Visual C++ linker reports a duplicate symbol
error for symbols marked IMAGE_COMDAT_SELECT_ASSOCIATIVE even if they
would be discarded after handling the leader symbol.
Fixes coverage-inline.cpp in check-profile after r372020.
------------------------------------------------------------------------
llvm-svn: 374858
|
|
------------------------------------------------------------------------
r372606 | spatel | 2019-09-23 06:30:23 -0700 (Mon, 23 Sep 2019) | 3 lines
[x86] fix assert with horizontal math + broadcast of vector (PR43402)
https://bugs.llvm.org/show_bug.cgi?id=43402
------------------------------------------------------------------------
llvm-svn: 374633
|
|
Summary:
Patch for 9.0.1.
Simplified version of r372186/r372187: fix the meaning of the "vfpv2" and "vfpv2sp" features, but keep around the useless "vfp2d16" and "vfp2d16sp" features, to reduce the risk on the release branch.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43365
Reviewers: t.p.northover, tstellar
Reviewed By: t.p.northover
Subscribers: kristof.beyls, hiraditya, kristina, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68675
llvm-svn: 374433
|
|
llvm-svn: 374286
|
|
llvm-svn: 374218
|
|
llvm-svn: 374193
|
|
LDC, the LLVM-based D compiler, is already ready for LLVM 9.0.0.
llvm-svn: 372167
|
|
llvm-svn: 371964
|
|
------------------------------------------------------------------------
r371434 | efriedma | 2019-09-09 20:29:27 +0200 (Mon, 09 Sep 2019) | 15 lines
[IfConversion] Correctly handle cases where analyzeBranch fails.
If analyzeBranch fails, on some targets, the out parameters point to
some blocks in the function. But we can't use that information, so make
sure to clear it out. (In some places in IfConversion, we assume that
any block with a TrueBB is analyzable.)
The change to the testcase makes it trigger a bug on builds without this
fix: IfConvertDiamond tries to perform a followup "merge" operation,
which isn't legal, and we somehow end up with a branch to a deleted MBB.
I'm not sure how this doesn't crash the compiler.
Differential Revision: https://reviews.llvm.org/D67306
------------------------------------------------------------------------
llvm-svn: 371490
|
|
------------------------------------------------------------------------
r370592 | rksimon | 2019-08-31 18:21:31 +0200 (Sat, 31 Aug 2019) | 3 lines
[X86] EltsFromConsecutiveLoads - Don't confuse elt count with vector element count (PR43170)
EltsFromConsecutiveLoads was assuming that the number of input elts was the same as the number of elements in the output vector type when creating a zeroing shuffle, causing an assert when subvectors were being combined instead of just scalars.
------------------------------------------------------------------------
llvm-svn: 371382
|
|
------------------------------------------------------------------------
r371221 | spatel | 2019-09-06 18:10:18 +0200 (Fri, 06 Sep 2019) | 3 lines
[SimplifyLibCalls] handle pow(x,-0.0) before it can assert (PR43233)
https://bugs.llvm.org/show_bug.cgi?id=43233
------------------------------------------------------------------------
------------------------------------------------------------------------
r371224 | jfb | 2019-09-06 18:26:59 +0200 (Fri, 06 Sep 2019) | 39 lines
[InstCombine] pow(x, +/- 0.0) -> 1.0
Summary:
This isn't an important optimization at all... We're already doing:
pow(x, 0.0) -> 1.0
My patch merely teaches instcombine that -0.0 does the same.
However, doing this fixes an AMAZING bug! Compile this program:
extern "C" double pow(double, double);
double boom(double base) {
return pow(base, -0.0);
}
With:
clang++ ~/Desktop/fast-math.cpp -ffast-math -O2 -S
And clang will crash with a signal. Wow, fast math is so fast it ICEs the
compiler! Arguably, the generated math is infinitely fast.
What's actually happening is that we recurse infinitely in getPow. In debug we
hit its assertion:
assert(Exp != 0 && "Incorrect exponent 0 not handled");
We avoid this entire mess if we instead recognize that an exponent of positive
and negative zero yield 1.0.
A separate commit, r371221, fixed the same problem. This only contains the added
tests.
<rdar://problem/54598300>
Reviewers: scanon
Subscribers: hiraditya, jkorous, dexonsmith, ributzka, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67248
------------------------------------------------------------------------
llvm-svn: 371381
|
|
------------------------------------------------------------------------
r371305 | nikic | 2019-09-07 14:03:48 +0200 (Sat, 07 Sep 2019) | 1 line
[X86] Add test for PR43230; NFC
------------------------------------------------------------------------
------------------------------------------------------------------------
r371307 | nikic | 2019-09-07 14:13:44 +0200 (Sat, 07 Sep 2019) | 9 lines
[X86] Fix pshuflw formation from repeated shuffle mask (PR43230)
Fix for https://bugs.llvm.org/show_bug.cgi?id=43230.
When creating PSHUFLW from a repeated shuffle mask, we have to apply
the checks to the repeated mask, not the original one. For the test
case from PR43230 the inspected part of the original mask is all undef.
Differential Revision: https://reviews.llvm.org/D67314
------------------------------------------------------------------------
llvm-svn: 371378
|
|
------------------------------------------------------------------------
r371111 | efriedma | 2019-09-05 22:02:38 +0200 (Thu, 05 Sep 2019) | 17 lines
[IfConversion] Fix diamond conversion with unanalyzable branches.
The code was incorrectly counting the number of identical instructions,
and therefore tried to predicate an instruction which should not have
been predicated. This could have various effects: a compiler crash,
an assembler failure, a miscompile, or just generating an extra,
unnecessary instruction.
Instead of depending on TargetInstrInfo::removeBranch, which only
works on analyzable branches, just remove all branch instructions.
Fixes https://bugs.llvm.org/show_bug.cgi?id=43121 and
https://bugs.llvm.org/show_bug.cgi?id=41121 .
Differential Revision: https://reviews.llvm.org/D67203
------------------------------------------------------------------------
llvm-svn: 371377
|
|
------------------------------------------------------------------------
r371262 | nickdesaulniers | 2019-09-06 23:50:11 +0200 (Fri, 06 Sep 2019) | 45 lines
[IR] CallBrInst: scan+update arg list when indirect dest list changes
Summary:
There's an unspoken invariant of callbr that the list of BlockAddress
Constants in the "function args" list match the BasicBlocks in the
"other labels" list. (This invariant is being added to the LangRef in
https://reviews.llvm.org/D67196).
When modifying the any of the indirect destinations of a callbr
instruction (possible jump targets), we need to update the function
arguments if the argument is a BlockAddress whose BasicBlock refers to
the indirect destination BasicBlock being replaced. Otherwise, many
transforms that modify successors will end up violating that invariant.
A recent change to the arm64 Linux kernel exposed this bug, which
prevents the kernel from booting.
I considered maintaining a mapping from indirect destination BasicBlock
to argument operand BlockAddress, but this ends up being a one to
potentially many (though usually one) mapping. Also, the list of
arguments to a function (or more typically inline assembly) ends up
being less than 10. The implementation is significantly simpler to just
rescan the full list of arguments. Because of the one to potentially
many relationship, the full arg list must be scanned (we can't stop at
the first instance).
Thanks to the following folks that reported the issue and helped debug
it:
* Nathan Chancellor
* Will Deacon
* Andrew Murray
* Craig Topper
Link: https://bugs.llvm.org/show_bug.cgi?id=43222
Link: https://github.com/ClangBuiltLinux/linux/issues/649
Link: https://lists.infradead.org/pipermail/linux-arm-kernel/2019-September/678330.html
Reviewers: craig.topper, chandlerc
Reviewed By: craig.topper
Subscribers: void, javed.absar, kristof.beyls, hiraditya, llvm-commits, nathanchance, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67252
------------------------------------------------------------------------
llvm-svn: 371376
|
|
------------------------------------------------------------------------
r371088 | spatel | 2019-09-05 18:58:18 +0200 (Thu, 05 Sep 2019) | 1 line
[x86] add test for horizontal math bug (PR43225); NFC
------------------------------------------------------------------------
------------------------------------------------------------------------
r371095 | spatel | 2019-09-05 19:28:17 +0200 (Thu, 05 Sep 2019) | 3 lines
[x86] fix horizontal math bug exposed by improved demanded elements analysis (PR43225)
https://bugs.llvm.org/show_bug.cgi?id=43225
------------------------------------------------------------------------
llvm-svn: 371178
|