aboutsummaryrefslogtreecommitdiff
path: root/llvm
AgeCommit message (Collapse)AuthorFilesLines
2019-06-11Add release note for DIBuilder API changesTom Stellard1-0/+20
llvm-svn: 363027
2019-06-10Merging r358042:Tom Stellard2-2/+25
------------------------------------------------------------------------ r358042 | jimlin | 2019-04-09 18:56:32 -0700 (Tue, 09 Apr 2019) | 34 lines [Sparc] Fix incorrect MI insertion position for spilling f128. Summary: Obviously, new built MI (sethi+add or sethi+xor+add) for constructing large offset should be inserted before new created MI for storing even register into memory. So the insertion position should be *StMI instead of II. before fixed: std %f0, [%g1+80] sethi 4, %g1 <<< add %g1, %sp, %g1 <<< this two instructions should be put before "std %f0, [%g1+80]". sethi 4, %g1 add %g1, %sp, %g1 std %f2, [%g1+88] after fixed: sethi 4, %g1 add %g1, %sp, %g1 std %f0, [%g1+80] sethi 4, %g1 add %g1, %sp, %g1 std %f2, [%g1+88] Reviewers: venkatra, jyknight Reviewed By: jyknight Subscribers: jyknight, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60397 ------------------------------------------------------------------------ llvm-svn: 363010
2019-06-10Merging r351577:Tom Stellard1-1/+1
------------------------------------------------------------------------ r351577 | ssijaric | 2019-01-18 11:34:20 -0800 (Fri, 18 Jan 2019) | 5 lines Fix the buildbot issue introduced by r351421 The EXPENSIVE_CHECK x86_64 Windows buildbot is failing due to this change. Fix the map access. ------------------------------------------------------------------------ llvm-svn: 363006
2019-06-06Merging r355154:Tom Stellard1-2/+2
------------------------------------------------------------------------ r355154 | joerg | 2019-02-28 15:33:09 -0800 (Thu, 28 Feb 2019) | 2 lines [PPC] Secure PLT only has meaning for PIC ------------------------------------------------------------------------ llvm-svn: 362729
2019-06-06Merging r361237:Tom Stellard2-6/+41
------------------------------------------------------------------------ r361237 | maskray | 2019-05-21 03:41:25 -0700 (Tue, 21 May 2019) | 14 lines [PPC64] Update LocalEntry from assigned symbols On PowerPC64 ELFv2 ABI, functions may have 2 entry points: global and local. The local entry point location of a function is stored in the st_other field of the symbol, as an offset relative to the global entry point. In order to make symbol assignments (e.g. .equ/.set) work properly with this, PPCTargetELFStreamer already copies the local entry bits from the source symbol to the destination one, on emitAssignment(). The problem is that this copy is performed only at the assignment location, where the source symbol may not yet have processed the .localentry directive, that sets the local entry. This may cause the destination symbol to end up with wrong local entry information. Other symbol info is not affected by this because, in this case, the destination symbol value is actually a symbol reference. This change keeps track of these assignments, and update all needed st_other fields when finish() is called. Patch by Leandro Lupori! Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D56586 ------------------------------------------------------------------------ llvm-svn: 362671
2019-06-06Merging r360442:Tom Stellard2-0/+18
------------------------------------------------------------------------ r360442 | maskray | 2019-05-10 10:09:25 -0700 (Fri, 10 May 2019) | 12 lines [MC][ELF] Copy top 3 bits of st_other to .symver aliases On PowerPC64 ELFv2 ABI, the top 3 bits of st_other encode the local entry offset. A versioned symbol alias created by .symver should copy the bits from the source symbol. This partly fixes PR41048. A full fix needs tracking of .set assignments and updating st_other fields when finish() is called, see D56586. Patch by Alfredo Dal'Ava JĂșnior Differential Revision: https://reviews.llvm.org/D59436 ------------------------------------------------------------------------ llvm-svn: 362669
2019-06-06Merging r360439:Tom Stellard2-6/+61
------------------------------------------------------------------------ r360439 | maskray | 2019-05-10 09:24:57 -0700 (Fri, 10 May 2019) | 8 lines [llvm-objdump] Print st_other Add support for ".hidden" ".internal" ".protected" and " 0x%02x" for other st_other bits used by some architectures. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D61718 ------------------------------------------------------------------------ llvm-svn: 362668
2019-06-05Merging r360293:Matt Arsenault8-74/+202
------------------------------------------------------------------------ r360293 | arsenm | 2019-05-08 15:09:57 -0700 (Wed, 08 May 2019) | 21 lines AMDGPU: Select VOP3 form of add The VOP3 form should always be the preferred selection, to be shrunk later. This should only be an optimization issue, but this partially works around a problem from clobbering VCC when SIFixSGPRCopies rewrites an SCC defining operation directly to VCC. 3 of the testcases are regressions from failing to fold the immediate in cases it should. These can be avoided by improving the VCC liveness handling in SIFoldOperands. Simply increasing the threshold to computeRegisterLiveness works, although this is common enough that VCC liveness should probably be tracked throughout the pass. The hack of leaving behind an implicit_def instruction to avoid breaking iterator wastes instruction count, which inhibits finding the VCC def in long chains of adds. Doing this however exposes different, worse looking regressions from poor scheduling behavior. This could probably be avoided around by forcing the shrink of the addc here, but the scheduler should probably be fixed. The r600 add test needs to be split out because it asserts on the arguments in the new test during the calling convention lowering. ------------------------------------------------------------------------ llvm-svn: 362658
2019-06-05Merging r359899:Matt Arsenault3-53/+197
------------------------------------------------------------------------ r359899 | arsenm | 2019-05-03 08:37:07 -0700 (Fri, 03 May 2019) | 7 lines AMDGPU: Select VOP3 form of sub The VOP3 form should always be the preferred selection form to be shrunk later. The r600 sub test needs to be split out because it asserts on the arguments in the new test during the calling convention lowering. ------------------------------------------------------------------------ llvm-svn: 362654
2019-06-05Merging r359898:Matt Arsenault2-35/+267
------------------------------------------------------------------------ r359898 | arsenm | 2019-05-03 08:21:53 -0700 (Fri, 03 May 2019) | 3 lines AMDGPU: Support shrinking add with FI in SIFoldOperands Avoids test regression in a future patch ------------------------------------------------------------------------ llvm-svn: 362648
2019-06-05Correct test in r362634Matt Arsenault1-10/+10
llvm-svn: 362635
2019-06-05Merging r359891:Matt Arsenault2-4/+64
------------------------------------------------------------------------ r359891 | arsenm | 2019-05-03 07:40:10 -0700 (Fri, 03 May 2019) | 9 lines AMDGPU: Replace shrunk instruction with dummy implicit_def This was broken if the original operand was killed. The kill flag would appear on both instructions, and fail the verifier. Keep the kill flag, but remove the operands from the old instruction. This has an added benefit of really reducing the use count for future folds. Ideally the pass would be structured more like what PeepholeOptimizer does to avoid this hack to avoid breaking instruction iterators. ------------------------------------------------------------------------ llvm-svn: 362634
2019-05-30Merging r353865, r353866, and r353874:Tom Stellard5-2/+60
------------------------------------------------------------------------ r353865 | sfertile | 2019-02-12 09:48:22 -0800 (Tue, 12 Feb 2019) | 1 line [PowerPC] Fix printing of negative offsets in call instruction dissasembly. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r353866 | sfertile | 2019-02-12 09:49:04 -0800 (Tue, 12 Feb 2019) | 4 lines [PPC64] Update tests to reflect change in printing of call operand. [NFC] The printing of branch operands for call instructions was changed to properly handle negative offsets. Updating the tests to reflect that. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r353874 | sfertile | 2019-02-12 12:03:04 -0800 (Tue, 12 Feb 2019) | 5 lines Fix undefined behaviour in PPCInstPrinter::printBranchOperand. Fix the undefined behaviour introduced by my previous patch r353865 (left shifting a potentially negative value), which was caught by the bots that run UBSan. ------------------------------------------------------------------------ llvm-svn: 362043
2019-05-23Merging r351523:Tom Stellard8-17/+70
------------------------------------------------------------------------ r351523 | dylanmckay | 2019-01-17 22:10:41 -0800 (Thu, 17 Jan 2019) | 12 lines [AVR] Expand 8/16-bit multiplication to libcalls on MCUs that don't have hardware MUL This change modifies the LLVM ISel lowering settings so that 8-bit/16-bit multiplication is expanded to calls into the compiler runtime library if the MCU being targeted does not support multiplication in hardware. Before this, MUL instructions would be generated on CPUs like the ATtiny85, triggering a CPU reset due to an illegal instruction at runtime. First raised in https://github.com/avr-rust/rust/issues/124. ------------------------------------------------------------------------ llvm-svn: 361551
2019-05-16Merging r355038:llvmorg-8.0.1-rc1Tom Stellard2-0/+7
------------------------------------------------------------------------ r355038 | joerg | 2019-02-27 13:53:14 -0800 (Wed, 27 Feb 2019) | 3 lines Default to Secure PLT on PPC for NetBSD and OpenBSD. This matches the default settings of clang. ------------------------------------------------------------------------ llvm-svn: 360950
2019-05-16Merging r352806:Tom Stellard2-1/+17
------------------------------------------------------------------------ r352806 | sbc | 2019-01-31 14:38:22 -0800 (Thu, 31 Jan 2019) | 5 lines [WebAssembly] MC: Fix for outputing wasm object to /dev/null Subscribers: dschuff, jgravelle-google, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D57479 ------------------------------------------------------------------------ llvm-svn: 360949
2019-05-15Merging r354846:Tom Stellard2-3/+32
------------------------------------------------------------------------ r354846 | djg | 2019-02-25 21:20:19 -0800 (Mon, 25 Feb 2019) | 12 lines [WebAssembly] Properly align fp128 arguments in outgoing varargs arguments For outgoing varargs arguments, it's necessary to check the OrigAlign field of the corresponding OutputArg entry to determine argument alignment, rather than just computing an alignment from the argument value type. This is because types like fp128 are split into multiple argument values, with narrower types that don't reflect the ABI alignment of the full fp128. This fixes the printf("printfL: %4.*Lf\n", 2, lval); testcase. Differential Revision: https://reviews.llvm.org/D58656 ------------------------------------------------------------------------ llvm-svn: 360826
2019-05-15Merging r359569:Tom Stellard1-0/+1
------------------------------------------------------------------------ r359569 | russell_gallop | 2019-04-30 08:35:16 -0700 (Tue, 30 Apr 2019) | 7 lines Add llvm-profdata to LLVM_TOOLCHAIN_TOOLS This is required for using PGO on Windows but isn't in the Windows release packages. Windows packages are built with LLVM_INSTALL_TOOLCHAIN_ONLY so only includes llvm "tools" listed here. Differential Revision: https://reviews.llvm.org/D61317 ------------------------------------------------------------------------ llvm-svn: 360801
2019-05-15Merging r360099:Tom Stellard2-26/+39
------------------------------------------------------------------------ r360099 | efriedma | 2019-05-06 16:21:59 -0700 (Mon, 06 May 2019) | 26 lines [ARM] Glue register copies to tail calls. This generally follows what other targets do. I don't completely understand why the special case for tail calls existed in the first place; even when the code was committed in r105413, call lowering didn't work in the way described in the comments. Stack protector lowering breaks if the register copies are not glued to a tail call: we have to insert the stack protector check before the tail call, and we choose the location based on the assumption that all physical register dependencies of a tail call are adjacent to the tail call. (See FindSplitPointForStackProtector.) This is sort of fragile, but I don't see any reason to break that assumption. I'm guessing nobody has seen this before just because it's hard to convince the scheduler to actually schedule the code in a way that breaks; even without the glue, the only computation that could actually be scheduled after the register copies is the computation of the call address, and the scheduler usually prefers to schedule that before the copies anyway. Fixes https://bugs.llvm.org/show_bug.cgi?id=41417 Differential Revision: https://reviews.llvm.org/D60427 ------------------------------------------------------------------------ llvm-svn: 360793
2019-05-15Merging r359883:Tom Stellard2-9/+12
------------------------------------------------------------------------ r359883 | arsenm | 2019-05-03 06:42:56 -0700 (Fri, 03 May 2019) | 6 lines AMDGPU: Fix incorrect commute with sub when folding immediates When a fold of an immediate into a sub/subrev required shrinking the instruction, the wrong VOP2 opcode was used. This was using the VOP2 equivalent of the original instruction, not the commuted instruction with the inverted opcode. ------------------------------------------------------------------------ llvm-svn: 360752
2019-05-15Merging r356982:Tom Stellard2-1/+4
------------------------------------------------------------------------ r356982 | mstorsjo | 2019-03-26 02:02:44 -0700 (Tue, 26 Mar 2019) | 12 lines [llvm-dlltool] Set a proper machine type for weak symbol object files This makes GNU binutils not reject the libraries outright. GNU ld handles weak externals slightly differently though, so it can't use them for aliases in import libraries, but this makes GNU ld able to use the rest of the import libraries. LLD accepted object files with machine type 0 aka IMAGE_FILE_MACHINE_UNKNOWN. Differential Revision: https://reviews.llvm.org/D59742 ------------------------------------------------------------------------ llvm-svn: 360750
2019-05-15Merging r360512:Tom Stellard2-17/+57
------------------------------------------------------------------------ r360512 | ctopper | 2019-05-10 21:19:33 -0700 (Fri, 10 May 2019) | 5 lines [X86] Don't emit MOVNTDQA loads from fast-isel without SSE4.1. We were checking for SSE4.1 for FP types, but not integer 128-bit types. Fixes PR41837. ------------------------------------------------------------------------ llvm-svn: 360749
2019-05-04Merging r359496:Tom Stellard2-1/+169
------------------------------------------------------------------------ r359496 | mstorsjo | 2019-04-29 13:25:51 -0700 (Mon, 29 Apr 2019) | 8 lines [X86] Run CFIInstrInserter on Windows if Dwarf is used This is necessary since SVN r330706, as tail merging can include CFI instructions since then. This fixes PR40322 and PR40012. Differential Revision: https://reviews.llvm.org/D61252 ------------------------------------------------------------------------ llvm-svn: 359952
2019-05-03Merging r359834:Tom Stellard3-82/+18
------------------------------------------------------------------------ r359834 | evandro | 2019-05-02 15:01:39 -0700 (Thu, 02 May 2019) | 3 lines [AArch64] Update for Exynos Fix the forwarding of multiplication results for Exynos M4. ------------------------------------------------------------------------ llvm-svn: 359946
2019-05-03Merging r357701:Tom Stellard1-1/+9
------------------------------------------------------------------------ r357701 | mgorny | 2019-04-04 07:21:38 -0700 (Thu, 04 Apr 2019) | 8 lines [llvm] [cmake] Add additional headers only if they exist Modify the add_header_files_for_glob() function to only add files that do exist, rather than all matches of the glob. This fixes CMake error when one of the include directories (which happen to include /usr/include) contain broken symlinks. Differential Revision: https://reviews.llvm.org/D59632 ------------------------------------------------------------------------ llvm-svn: 359857
2019-05-03Merging r355607:Tom Stellard2-4/+4
------------------------------------------------------------------------ r355607 | petarj | 2019-03-07 08:31:08 -0800 (Thu, 07 Mar 2019) | 11 lines [DebugInfo] Fix the type of the formated variable Change the format type of *Personality and *LSDAAddress to PRIx64 since they are of type uint64_t. The problem was detected on mips builds, where it was printing junk values and causing test failure. Patch by Milos Stojanovic. Differential Revision: https://reviews.llvm.org/D58451 ------------------------------------------------------------------------ llvm-svn: 359851
2019-04-23Merging r356039:Tom Stellard3-1/+69
------------------------------------------------------------------------ r356039 | atanasyan | 2019-03-13 04:04:38 -0700 (Wed, 13 Mar 2019) | 11 lines [MIPS][microMIPS] Fix PseudoMTLOHI_MM matching and expansion On micromips MipsMTLOHI is always matched to PseudoMTLOHI_DSP regardless of +dsp argument. This patch checks is HasDSP predicate is present for PseudoMTLOHI_DSP so PseudoMTLOHI_MM can be matched when appropriate. Add expansion of PseudoMTLOHI_MM instruction into a mtlo/mthi pair. Patch by Mirko Brkusanin. Differential Revision: http://reviews.llvm.org/D59203 ------------------------------------------------------------------------ llvm-svn: 358941
2019-04-22Merging r355825:Tom Stellard3-1/+426
------------------------------------------------------------------------ r355825 | petarj | 2019-03-11 07:13:31 -0700 (Mon, 11 Mar 2019) | 10 lines [MIPS][microMIPS] Add a pattern to match TruncIntFP A pattern needed to match TruncIntFP was missing. This was causing multiple tests from llvm test suite to fail during compilation for micromips. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D58722 ------------------------------------------------------------------------ llvm-svn: 358936
2019-04-22Merging r354882:Tom Stellard3-3/+15
------------------------------------------------------------------------ r354882 | atanasyan | 2019-02-26 06:45:17 -0800 (Tue, 26 Feb 2019) | 4 lines [mips] Emit `.module softfloat` directive This change fixes crash on an assertion in case of using `soft float` ABI for mips32r6 target. ------------------------------------------------------------------------ llvm-svn: 358934
2019-04-22Merging r354808:Tom Stellard3-12/+49
------------------------------------------------------------------------ r354808 | nikic | 2019-02-25 10:54:17 -0800 (Mon, 25 Feb 2019) | 11 lines [Mips] Fix missing masking in fast-isel of br (PR40325) Fixes https://bugs.llvm.org/show_bug.cgi?id=40325 by zero extending (and x, 1) the condition before branching on it. To avoid regressing trivial cases, I'm combining emission of cmp+br sequences for the single-use + same block case (similar to what we do in x86). icmpbr1.ll still regresses due to the cross-bb usage of the condition. Differential Revision: https://reviews.llvm.org/D58576 ------------------------------------------------------------------------ llvm-svn: 358925
2019-04-22Merging r354672:Tom Stellard2-0/+69
------------------------------------------------------------------------ r354672 | petarj | 2019-02-22 06:53:58 -0800 (Fri, 22 Feb 2019) | 13 lines [mips][micromips] fix filling delay slots for PseudoIndirectBranch_MM Filling a delay slot in 32bit jump instructions with a 16bit instruction can cause issues. According to the documentation such an operation is unpredictable. This patch adds opcode Mips::PseudoIndirectBranch_MM alongside Mips::PseudoIndirectBranch and other instructions that are expanded to jr instruction and do not allow a 16bit instruction in their delay slots. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D58507 ------------------------------------------------------------------------ llvm-svn: 358920
2019-04-22Merging r355854:Tom Stellard2-0/+811
------------------------------------------------------------------------ r355854 | jonpa | 2019-03-11 12:00:37 -0700 (Mon, 11 Mar 2019) | 13 lines [RegAlloc] Avoid compile time regression with multiple copy hints. As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile time building opencollada"), this patch makes sure that no phys reg is hinted more than once from getRegAllocationHints(). This handles the case were many virtual registers are assigned to the same physreg. The previous compile time fix (r343686) in weightCalcHelper() only made sure that physical/virtual registers are passed no more than once to addRegAllocationHint(). Review: Dimitry Andric, Quentin Colombet https://reviews.llvm.org/D59201 ------------------------------------------------------------------------ llvm-svn: 358905
2019-03-29Merging r356924:Tom Stellard1-0/+3
------------------------------------------------------------------------ r356924 | tstellar | 2019-03-25 10:01:29 -0700 (Mon, 25 Mar 2019) | 1 line merge-request.sh: Update 8.0 metabug for 8.0.1 ------------------------------------------------------------------------ llvm-svn: 357234
2019-03-21git-llvm: Update for release_80 branchTom Stellard1-3/+3
llvm-svn: 356714
2019-03-21Bump version to 8.0.1Tom Stellard2-2/+2
llvm-svn: 356708
2019-03-13ReleaseNotes: Changes to the JIT APIs; by Lang HamesHans Wennborg1-0/+16
llvm-svn: 356034
2019-03-12ReleaseNotes: SystemZ, by Ulrich Weigand.Hans Wennborg1-0/+20
llvm-svn: 355916
2019-03-07ReleaseNotes: Open Dylan; by Peter HouselHans Wennborg1-0/+16
llvm-svn: 355584
2019-03-05Merging r355227 and r355228:Hans Wennborg2-120/+155
------------------------------------------------------------------------ r355227 | ctopper | 2019-03-01 22:02:34 +0100 (Fri, 01 Mar 2019) | 3 lines [X86] Add test case for D58805. NFC This demonstrates dead store elimination removing a store that may alias a gather that uses null as its base. ------------------------------------------------------------------------ ------------------------------------------------------------------------ r355228 | ctopper | 2019-03-01 22:02:40 +0100 (Fri, 01 Mar 2019) | 7 lines [X86] Remove IntrArgMemOnly from target specific gather/scatter intrinsics IntrArgMemOnly implies that only memory pointed to by pointer typed arguments will be accessed. But these intrinsics allow you to pass null to the pointer argument and put the full address into the index argument. Other passes won't be able to understand this. A colleague found that ISPC was creating gathers like this and then dead store elimination removed some stores because it didn't understand what the gather was doing since the pointer argument was null. Differential Revision: https://reviews.llvm.org/D58805 ------------------------------------------------------------------------ llvm-svn: 355383
2019-03-04Merging r355136:Hans Wennborg3-7/+46
------------------------------------------------------------------------ r355136 | efriedma | 2019-02-28 21:38:45 +0100 (Thu, 28 Feb 2019) | 12 lines [AArch64] [Windows] Don't skip constructing UnwindHelp. In certain cases, the first non-frame-setup instruction in a function is a branch. For example, it could be a cbz on an argument. Make sure we correctly allocate the UnwindHelp, and find an appropriate register to use to initialize it. Fixes https://bugs.llvm.org/show_bug.cgi?id=40184 Differential Revision: https://reviews.llvm.org/D58752 ------------------------------------------------------------------------ llvm-svn: 355313
2019-03-04Merging r352465:Hans Wennborg3-4/+61
------------------------------------------------------------------------ r352465 | mstorsjo | 2019-01-29 10:36:48 +0100 (Tue, 29 Jan 2019) | 17 lines [COFF, ARM64] Don't put jump table into a separate COFF section for EK_LabelDifference32 Windows ARM64 has PIC relocation model and uses jump table kind EK_LabelDifference32. This produces jump table entry as ".word LBB123 - LJTI1_2" which represents the distance between the block and jump table. A new relocation type (IMAGE_REL_ARM64_REL32) is needed to do the fixup correctly if they are in different COFF section. This change saves the jump table to the same COFF section as the associated code. An ideal fix could be utilizing IMAGE_REL_ARM64_REL32 relocation type. Patch by Tom Tan! Differential Revision: https://reviews.llvm.org/D57277 ------------------------------------------------------------------------ llvm-svn: 355311
2019-03-04Merging r355116 and r355117:Hans Wennborg2-2/+27
------------------------------------------------------------------------ r355116 | ctopper | 2019-02-28 19:49:29 +0100 (Thu, 28 Feb 2019) | 7 lines [X86] Don't peek through bitcasts before checking ISD::isBuildVectorOfConstantSDNodes in combineTruncatedArithmetic We don't have any combines that can look through a bitcast to truncate a build vector of constants. So the truncate will stick around and give us something like this pattern (binop (trunc X), (trunc (bitcast (build_vector)))) which has two truncates in it. Which will be reversed by hoistLogicOpWithSameOpcodeHands in the generic DAG combiner. Thus causing an infinite loop. Even if we had a combine for (truncate (bitcast (build_vector))), I think it would need to be implemented in getNode otherwise DAG combiner visit ordering would probably still visit the binop first and reverse it. Or combineTruncatedArithmetic would need to do its own constant folding. Differential Revision: https://reviews.llvm.org/D58705 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r355117 | ctopper | 2019-02-28 19:50:16 +0100 (Thu, 28 Feb 2019) | 1 line [X86] Add test case that was supposed to go with r355116. ------------------------------------------------------------------------ llvm-svn: 355310
2019-02-27Merging r354505:Hans Wennborg1-0/+2
------------------------------------------------------------------------ r354505 | ibiryukov | 2019-02-20 20:08:06 +0100 (Wed, 20 Feb 2019) | 13 lines [clangd] Store index in '.clangd/index' instead of '.clangd-index' Summary: To take up the .clangd folder for other potential uses in the future. Reviewers: kadircet, sammccall Reviewed By: kadircet Subscribers: ioeric, MaskRay, jkorous, arphaman, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58440 ------------------------------------------------------------------------ llvm-svn: 354981
2019-02-27ReleaseNotes: add Known Issues, clean up, etc.Hans Wennborg1-44/+17
llvm-svn: 354973
2019-02-27Merging r354207:Hans Wennborg1-1/+1
------------------------------------------------------------------------ r354207 | whitequark | 2019-02-16 23:33:10 +0100 (Sat, 16 Feb 2019) | 16 lines [bindings/go] Fix building on 32-bit systems (ARM etc.) Summary: The patch in https://reviews.llvm.org/D53883 (by me) fails to build on 32-bit systems like ARM. Fix the array size to be less ridiculously large. 2<<20 should still be enough for all practical purposes. Bug: https://bugs.llvm.org/show_bug.cgi?id=40426 Reviewers: whitequark, pcc Reviewed By: whitequark Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58030 ------------------------------------------------------------------------ llvm-svn: 354956
2019-02-26Add note on libFuzzer for Windows to release notesHans Wennborg1-0/+2
By Jonathan Metzman! Differential revision: https://reviews.llvm.org/D58676 llvm-svn: 354892
2019-02-26Merging r354733:Hans Wennborg3-17/+27
------------------------------------------------------------------------ r354733 | nikic | 2019-02-23 19:59:01 +0100 (Sat, 23 Feb 2019) | 10 lines [WebAssembly] Fix select of and (PR40805) Fixes https://bugs.llvm.org/show_bug.cgi?id=40805 introduced by patterns added in D53676. I'm removing the patterns entirely here, as they are not correct in the general case. If necessary something more specific can be added in the future. Differential Revision: https://reviews.llvm.org/D58575 ------------------------------------------------------------------------ llvm-svn: 354860
2019-02-26Merging r354756:Hans Wennborg2-5/+79
------------------------------------------------------------------------ r354756 | ctopper | 2019-02-24 20:33:37 +0100 (Sun, 24 Feb 2019) | 36 lines [X86] Fix tls variable lowering issue with large code model Summary: The problem here is the lowering for tls variable. Below is the DAG for the code. SelectionDAG has 11 nodes: t0: ch = EntryToken t8: i64,ch = load<(load 8 from `i8 addrspace(257)* null`, addrspace 257)> t0, Constant:i64<0>, undef:i64 t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10] t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64 t12: i64 = add t8, t11 t4: i32,ch = load<(dereferenceable load 4 from @x)> t0, t12, undef:i64 t6: ch = CopyToReg t0, Register:i32 %0, t4 And when mcmodel is large, below instruction can NOT be folded. t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10] t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64 So "t11: i64,ch = load<(load 8 from got)> t0, t10, undef:i64" is lowered to " Morphed node: t11: i64,ch = MOV64rm<Mem:(load 8 from got)> t10, TargetConstant:i8<1>, Register:i64 $noreg, TargetConstant:i32<0>, Register:i32 $noreg, t0" When llvm start to lower "t10: i64 = X86ISD::WrapperRIP TargetGlobalTLSAddress:i64<i32* @x> 0 [TF=10]", it fails. The patch is to fold the load and X86ISD::WrapperRIP. Fixes PR26906 Patch by LuoYuanke Reviewers: craig.topper, rnk, annita.zhang, wxiao3 Reviewed By: rnk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58336 ------------------------------------------------------------------------ llvm-svn: 354857
2019-02-26Merging r354764:Hans Wennborg1-48/+61
------------------------------------------------------------------------ r354764 | lebedevri | 2019-02-25 08:39:07 +0100 (Mon, 25 Feb 2019) | 123 lines [XRay][tools] Revert "Use Support/JSON.h in llvm-xray convert" Summary: This reverts D50129 / rL338834: [XRay][tools] Use Support/JSON.h in llvm-xray convert Abstractions are great. Readable code is great. JSON support library is a *good* idea. However unfortunately, there is an internal detail that one needs to be aware of in `llvm::json::Object` - it uses `llvm::DenseMap`. So for **every** `llvm::json::Object`, even if you only store a single `int` entry there, you pay the whole price of `llvm::DenseMap`. Unfortunately, it matters for `llvm-xray`. I was trying to analyse the `llvm-exegesis` analysis mode performance, and for that i wanted to view the LLVM X-Ray log visualization in Chrome trace viewer. And the `llvm-xray convert` is sluggish, and sometimes even ended up being killed by OOM. `xray-log.llvm-exegesis.lwZ0sT` was acquired from `llvm-exegesis` (compiled with ` -fxray-instruction-threshold=128`) analysis mode over `-benchmarks-file` with 10099 points (one full latency measurement set), with normal runtime of 0.387s. Timings: Old: (copied from D58580) ``` $ perf stat -r 5 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (5 runs): 21346.24 msec task-clock # 1.000 CPUs utilized ( +- 0.28% ) 314 context-switches # 14.701 M/sec ( +- 59.13% ) 1 cpu-migrations # 0.037 M/sec ( +-100.00% ) 2181354 page-faults # 102191.251 M/sec ( +- 0.02% ) 85477442102 cycles # 4004415.019 GHz ( +- 0.28% ) (83.33%) 14526427066 stalled-cycles-frontend # 16.99% frontend cycles idle ( +- 0.70% ) (83.33%) 32371533721 stalled-cycles-backend # 37.87% backend cycles idle ( +- 0.27% ) (33.34%) 67896890228 instructions # 0.79 insn per cycle # 0.48 stalled cycles per insn ( +- 0.03% ) (50.00%) 14592654840 branches # 683631198.653 M/sec ( +- 0.02% ) (66.67%) 212207534 branch-misses # 1.45% of all branches ( +- 0.94% ) (83.34%) 21.3502 +- 0.0585 seconds time elapsed ( +- 0.27% ) ``` New: ``` $ perf stat -r 9 ./bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT Performance counter stats for './bin/llvm-xray convert -sort -symbolize -instr_map=./bin/llvm-exegesis -output-format=trace_event -output=/tmp/trace.yml xray-log.llvm-exegesis.lwZ0sT' (9 runs): 7178.38 msec task-clock # 1.000 CPUs utilized ( +- 0.26% ) 182 context-switches # 25.402 M/sec ( +- 28.84% ) 0 cpu-migrations # 0.046 M/sec ( +- 70.71% ) 33701 page-faults # 4694.994 M/sec ( +- 0.88% ) 28761053971 cycles # 4006833.933 GHz ( +- 0.26% ) (83.32%) 2028297997 stalled-cycles-frontend # 7.05% frontend cycles idle ( +- 1.61% ) (83.32%) 10773154901 stalled-cycles-backend # 37.46% backend cycles idle ( +- 0.38% ) (33.36%) 36199132874 instructions # 1.26 insn per cycle # 0.30 stalled cycles per insn ( +- 0.03% ) (50.02%) 6434504227 branches # 896420204.421 M/sec ( +- 0.03% ) (66.68%) 73355176 branch-misses # 1.14% of all branches ( +- 1.46% ) (83.33%) 7.1807 +- 0.0190 seconds time elapsed ( +- 0.26% ) ``` So using `llvm::json` nearly triples run-time on that test case. (+3x is times, not percent.) Memory: Old: ``` total runtime: 39.88s. bytes allocated in total (ignoring deallocations): 79.07GB (1.98GB/s) calls to allocation functions: 33267816 (834135/s) temporary memory allocations: 5832298 (146235/s) peak heap memory consumption: 9.21GB peak RSS (including heaptrack overhead): 147.98GB total memory leaked: 1.09MB ``` New: ``` total runtime: 17.42s. bytes allocated in total (ignoring deallocations): 5.12GB (293.86MB/s) calls to allocation functions: 21382982 (1227284/s) temporary memory allocations: 232858 (13364/s) peak heap memory consumption: 350.69MB peak RSS (including heaptrack overhead): 2.55GB total memory leaked: 79.95KB ``` Diff: ``` total runtime: -22.46s. bytes allocated in total (ignoring deallocations): -73.95GB (3.29GB/s) calls to allocation functions: -11884834 (529155/s) temporary memory allocations: -5599440 (249307/s) peak heap memory consumption: -8.86GB peak RSS (including heaptrack overhead): 0B total memory leaked: -1.01MB ``` So using `llvm::json` increases *peak* memory consumption on *this* testcase ~+27x. And total allocation count +15x. Both of these numbers are times, *not* percent. And note that memory usage is clearly unbound with `llvm::json`, it directly depends on the length of the log, so peak memory consumption is always increasing. This isn't so with the dumb code, there is no accumulating memory consumption, peak memory consumption is fixed. Naturally, that means it will handle *much* larger logs without OOM'ing. Readability is good, but the price is simply unacceptable here. Too bad none of this analysis was done as part of the development/review D50129 itself. Reviewers: dberris, kpw, sammccall Reviewed By: dberris Subscribers: riccibruno, hans, courbet, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58584 ------------------------------------------------------------------------ llvm-svn: 354856
2019-02-22Release notes: a few lldb changes, by Raphael Isemann!Hans Wennborg1-0/+8
llvm-svn: 354659