path: root/llvm
Age  Commit message  Author  Files  Lines
2024-06-04[LV] Apply loop guards when checking recur during hoisting RT checks.Florian Hahn2-15/+13
Apply loop guards when checking if the recurrence is non-negative in cases where runtime checks are hoisted out of an inner loop.
2024-06-04[DirectX] Update for removal of icmp and fcmp constant expressionsJustin Bogner1-8/+0
The icmp and fcmp constant expressions were removed in deab451e7a7f "[IR] Remove support for icmp and fcmp constant expressions (#93038)". Update the DXILBitcodeWriter to stop referencing them.
2024-06-04AMDGPU/NFC: Make MACH indentation consistent (#94370)Konstantin Zhuravlyov1-58/+58
2024-06-04gn build: Sync GENERIC_TF_SOURCES with CMake.pcc1-6/+6
Reviewers: aeubanks Reviewed By: aeubanks Pull Request: https://github.com/llvm/llvm-project/pull/88456
2024-06-04gn build: Define SANITIZER_COMMON_NO_REDEFINE_BUILTINS for ubsan_minimal.pcc1-0/+1
Matches the cmake build. Reviewers: aeubanks Reviewed By: aeubanks Pull Request: https://github.com/llvm/llvm-project/pull/88458
2024-06-04gn build: Pass -fno-sanitize=vptr,function with use_ubsanpcc1-0/+1
Matches CMake LLVM_UBSAN_FLAGS. Reviewers: aeubanks Reviewed By: aeubanks Pull Request: https://github.com/llvm/llvm-project/pull/93911
2024-06-04gn build: Support llvm_enable_zstd.pcc8-4/+46
Reviewers: aeubanks Reviewed By: aeubanks Pull Request: https://github.com/llvm/llvm-project/pull/88457
2024-06-04[WebAssembly] Implement all f16x8 unary instructions. (#94063)Brendan Dahl3-2/+128
All of these instructions can be generated using regular LL intrinsics. Specified at: https://github.com/WebAssembly/half-precision/blob/29a9b9462c9285d4ccc1a5dc39214ddfd1892658/proposals/half-precision/Overview.md
2024-06-04[llvm-readobj][COFF] Consistent PDBGUID Formatting (#94256)Miguel A. Arroyo2-2/+5
## Consistent PDB GUID in `llvm-readobj`

Currently, the PDB GUID is shown as a byte array:
`PDBGUID: (D8 4C 88 D9 26 15 1F 11 4C 4C 44 20 50 44 42 2E)`

This is inconsistent with `llvm-pdbutil` (e.g. `llvm-pdbutil dump --summary`) which shows it as a hexadecimal string. Additionally, `yaml2obj` uses the same hexadecimal string format. In general, the hexadecimal string is the common representation for PDB GUIDs on Windows.

This PR changes it to be consistent as shown below:
`PDBGUID: {D9884CD8-1526-111F-4C4C-44205044422E}`
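The relationship between the two representations quoted above can be checked with a short sketch. This is only an illustration in Python, not the code from the PR; the helper name `format_pdb_guid` is hypothetical. It assumes the usual Windows GUID layout, where the first three fields are stored little-endian and the last eight bytes are kept in order.

```python
import uuid


def format_pdb_guid(raw: bytes) -> str:
    """Render 16 raw GUID bytes as a braced, upper-case GUID string.

    Hypothetical helper for illustration; not part of llvm-readobj.
    """
    if len(raw) != 16:
        raise ValueError("a GUID is exactly 16 bytes")
    # bytes_le interprets the first three fields as little-endian,
    # which matches the in-memory layout of a Windows GUID.
    return "{" + str(uuid.UUID(bytes_le=raw)).upper() + "}"


# The byte array from the commit message above.
raw = bytes.fromhex("D8 4C 88 D9 26 15 1F 11 4C 4C 44 20 50 44 42 2E")
print(format_pdb_guid(raw))  # prints {D9884CD8-1526-111F-4C4C-44205044422E}
```

The first four bytes `D8 4C 88 D9` become `D9884CD8`, the next two pairs are swapped the same way, and the trailing eight bytes are copied unchanged.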
2024-06-04[AMDGPU] Do not override PseudoInstr in SMEM Pseudo definitions. NFC.Jay Foad1-9/+1
2024-06-04[AMDGPU] Do not override PseudoInstr in FLAT Pseudo definitions. NFC. (#94369)Jay Foad1-17/+4
Simplify by setting PseudoInstr to the tablegen name of the Pseudo in the first place.
2024-06-04[InstCombine] Fold `select Cond, not X, X` into `Cond ^ X` (#93591)Yingwei Zheng2-0/+166
See the following example:
```
define i1 @src(i64 %x, i1 %y) {
  %1526 = icmp ne i64 %x, 0
  %1527 = icmp eq i64 %x, 0
  %sel = select i1 %y, i1 %1526, i1 %1527
  ret i1 %sel
}

define i1 @tgt(i64 %x, i1 %y) {
  %1527 = icmp eq i64 %x, 0
  %sel = xor i1 %y, %1527
  ret i1 %sel
}
```
I find that this pattern is common in C/C++/Rust code base. This patch folds `select Cond, Y, X` into `Cond ^ X` iff:
1. X has the same type as Cond
2. X is poison -> Y is poison
3. X == !Y

Alive2: https://alive2.llvm.org/ce/z/hSmkHS
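The boolean identity behind the fold can be confirmed by brute force. The sketch below is a Python model over plain `i1` values and is not taken from the patch; it deliberately ignores poison, which condition 2 above and the Alive2 proof cover. The names `select` and `fold_holds` are hypothetical.

```python
from itertools import product


def select(cond: bool, if_true: bool, if_false: bool) -> bool:
    """Model of the IR select instruction on i1 values (poison not modeled)."""
    return if_true if cond else if_false


def fold_holds() -> bool:
    """Check select(c, not x, x) == c xor x for every i1 combination."""
    return all(
        select(c, not x, x) == (c ^ x)
        for c, x in product([False, True], repeat=2)
    )


print(fold_holds())
```

When `Cond` is true the select yields `not X`, which is `true ^ X`; when `Cond` is false it yields `X`, which is `false ^ X`.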
2024-06-04[BasicAA] Consider 'nneg' flag when comparing CastedValues (#94129)Alex MacLean2-19/+224
Any of the `zext` bits in a `zext nneg` can be converted to `sext` but when checking if casts are compatible `BasicAA` fails to take into account `nneg`. This change adds tracking of `nneg` to the `CastedValue` struct and ensures that `sext` and `zext` bits are treated as interchangeable when either `CastedValue` has a `nneg`. When distributing casted values in `GetLinearExpression` we conservatively discard the `nneg` from the `CastedValue`, except in the case of `shl nsw`, where we know the sign has not changed to negative.
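Why `nneg` makes the two extensions interchangeable can be seen from a small numeric model. This is a Python sketch with hypothetical helper names, assuming an `i8` to `i16` extension; it is not BasicAA code.

```python
def zext(value: int, from_bits: int) -> int:
    """Zero-extend: keep the low from_bits bits, fill the rest with zeros."""
    return value & ((1 << from_bits) - 1)


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend: replicate the sign bit into the new high bits."""
    low = value & ((1 << from_bits) - 1)
    sign_bit = 1 << (from_bits - 1)
    signed = (low ^ sign_bit) - sign_bit  # reinterpret as a signed integer
    return signed & ((1 << to_bits) - 1)


def casts_agree(value: int, from_bits: int = 8, to_bits: int = 16) -> bool:
    """True when zext and sext produce the same bit pattern."""
    return zext(value, from_bits) == sext(value, from_bits, to_bits)


# Non-negative i8 values (sign bit clear) extend identically either way.
print(all(casts_agree(v) for v in range(0, 128)))
# Values with the sign bit set do not.
print(any(casts_agree(v) for v in range(128, 256)))
```

The two casts differ only when the sign bit is set, which is exactly the case `nneg` rules out.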
2024-06-04[AMDGPU][NFC] Rename the clamp modifier definition to follow the prevailing convention. (#94353)Ivan Kosarev8-52/+55
Allows simplifying the definition itself. Part of <https://github.com/llvm/llvm-project/issues/62629>.
2024-06-04[PATCH] [Xtensa] Implement FrameLowering methods and stack operation lowering. (#92960)Andrei Safronov10-14/+404
Implement the emitPrologue/emitEpilogue methods and the determine/spill/restore functionality for callee-saved registers, with a test. Also implement lowering of the DYNAMIC_STACKALLOC/STACKSAVE/STACKRESTORE stack operations, with tests.
2024-06-04[gn] remove goma configs (#93941)Takuto Ikuta3-35/+3
goma is deprecated and not maintained anymore.
2024-06-04update_test_checks: drop the other arm64_32 handlersJon Roelofs1-5/+0
2024-06-04[AArch64] Enable CmpBcc fusion for Neoverse-v2 (#90608)Elvina Yakubova2-0/+36
This adds compare and branch instructions fusion for Neoverse V2. According to the Software Optimization Guide:
Specific Aarch64 instruction pairs that can be fused are as follows:
CMP/CMN (immediate) + B.cond
CMP/CMN (register) + B.cond
Performance for SPEC2017 is neutral, but another benchmark improves significantly.
Results for SPEC2017 on a Neoverse V2:
500.perlbench 0%
502.gcc_r 0%
505.mcf_r -0.15%
523.xalancbmk_r -0.43%
525.x264_r 0%
531.deepsjeng_r 0%
541.leela_r -0.16%
557.xz_r -0.47%
2024-06-04[InstCombine] Drop range attr in select of ctz foldNikita Popov2-6/+7
The range may no longer be valid after the select has been optimized away. This fixes the kernel miscompiles reported at https://github.com/ClangBuiltLinux/linux/issues/2031.
2024-06-04[InstCombine] Add tests for incorrect range handling in ctz fold (NFC)Nikita Popov1-0/+24
2024-06-04[cmake][runtimes] Add missing dependency on LLVMgold.so (#94199)Nikita Popov1-0/+5
When doing a runtimes build with LTO using ld.bfd (or ld.gold), the build starts failing with ninja 1.12, which added a new critical path scheduler. The reason is that LLVMgold.so is not available yet at the point where runtimes start being built, leading to configuration failures in the nested cmake invocation. Fix this by adding an explicit dependency on LLVMgold.so if it is available. (It may not always be necessary, e.g. if the used linker is lld, but it would be hard to detect when exactly it may or may not be needed, so always adding the dependency is safer.)
2024-06-04[IR] Accept GEPNoWrapFlags in creation APIsNikita Popov10-49/+72
Add overloads of GetElementPtrInst::Create() that accept GEPNoWrapFlags, and switch the bool parameters in IRBuilder to accept it instead as well. As a sample use, switch GEP i8 canonicalization in InstCombine to preserve the original flags.
2024-06-04[LoopUtils] Simplify code for runtime check generation a bit (NFCI).Florian Hahn1-15/+14
Store getSE result in variable to re-use and use structured bindings when looping over bounds.
2024-06-04[AMDGPU] Add gfx12 run lines to fence MMRA tests (#94333)Pierre van Houtryve2-0/+602
2024-06-04[InstSimplify] Accept GEPNoWrapFlags instead of only InBounds flagNikita Popov6-15/+27
This preserves the flags if a constexpr GEP is created (at least as long as they don't get dropped later -- the test case uses a constexpr index to avoid that).
2024-06-04[LV] Add test for RT check hoisting where loop guards simplify check.Florian Hahn1-21/+112
Add a test case with a missed simplification when hoisting runtime checks due to not applying loop guards.
2024-06-04[ConstantFolding] Preserve all flags in CastGEPIndices()Nikita Popov1-4/+4
This preserves the flags during that transform, but currently they will still end up getting dropped at a later stage.
2024-06-04[InstCombine] Add more gep index canonicalization tests (NFC)Nikita Popov1-3/+35
Flags are already fully preserved for the instruction case, but lost on constant expressions.
2024-06-04[PowerPC] Add test for ppc-mi-peepholes on MMA register COPYs. NFC.Kai Luo1-0/+86
2024-06-04[Local] Use nusw and nuw flags in emitGEPOffset()Nikita Popov2-6/+39
2024-06-04VPlan: add missing case for LogicalAnd; fix crash (#93553)Ramkumar Ramachandra3-64/+120
VPTypeAnalysis::inferScalarTypeForRecipe is missing the case for VPInstruction::LogicalAnd, due to which the test vplan-incomplete-cases.ll crashes. Add this missing case, and move the test in vplan-infer-not-or-type.ll to vplan-incomplete-cases.ll, showing correct codegen for trip-counts 2 and 3.
2024-06-04[InstCombine] Simplify isMergedGEPInBounds() (NFCI)Nikita Popov1-6/+1
Since the switch to opaque pointers, zero-index GEPs will be optimized away anyway, so there is no need to explicitly handle them here.
2024-06-04[InstCombine] Preserve all gep nowrap flags in PointerReplacerNikita Popov2-1/+21
2024-06-04[AArch64LoopIdiomTransform] Simplify GEP construction (NFC)Nikita Popov1-12/+8
2024-06-04[PHITransAddr] Preserve all GEP nowrap flagsNikita Popov2-8/+50
2024-06-04[IR] Remove support for icmp and fcmp constant expressions (#93038)Nikita Popov116-739/+357
Remove support for the icmp and fcmp constant expressions. This is part of: https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179 As usual, many of the updated tests will no longer test what they were originally intended to -- this is hard to preserve when constant expressions get removed, and in many cases just impossible as the existence of a specific kind of constant expression was the cause of the issue in the first place.
2024-06-04[gn build] Port 8ea59ec6077eLLVM GN Syncbot1-0/+1
2024-06-03[MemProf][NFC] Use range for loop (#94308)Teresa Johnson1-3/+2
With the change in 2fa059195bb54f422cc996db96ac549888268eae we can now use a range for loop.
2024-06-03[MemProf] Use remove_if to erase MapVector elements in bulk (#94269)Teresa Johnson1-18/+8
A cycle profile showed that we were spending a lot of time invoking MapVector::erase. According to https://llvm.org/docs/ProgrammersManual.html#llvm-adt-mapvector-h, erasing elements one at a time is very inefficient for MapVector and it is better to use remove_if. This change resulted in around 7% time reduction on a large thin link. While here remove an unused function that also invokes erase on MapVectors.
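The cost difference described above can be illustrated with a toy model. The Python class below is a simplified stand-in for a vector-plus-index container and is not the LLVM `MapVector` implementation; `TinyMapVector` and `erase_one_by_one` are hypothetical names, and keys are assumed to be unique.

```python
class TinyMapVector:
    """Toy model: insertion-ordered entries plus a key -> position index."""

    def __init__(self, items):
        self.entries = list(items)  # (key, value) pairs in insertion order
        self.index = {k: i for i, (k, _) in enumerate(self.entries)}

    def erase(self, key):
        """Erase one key: shift the tail and fix every later index (linear)."""
        pos = self.index.pop(key)
        del self.entries[pos]
        for k, _ in self.entries[pos:]:
            self.index[k] -= 1

    def remove_if(self, pred):
        """Drop all matching entries in one pass, then rebuild the index once."""
        self.entries = [(k, v) for k, v in self.entries if not pred(k, v)]
        self.index = {k: i for i, (k, _) in enumerate(self.entries)}


def erase_one_by_one(mv, pred):
    """Repeated single erases: each one pays the linear shift again."""
    for key in [k for k, v in mv.entries if pred(k, v)]:
        mv.erase(key)


items = [(k, k * k) for k in range(10)]
is_even = lambda k, v: k % 2 == 0

slow = TinyMapVector(items)
erase_one_by_one(slow, is_even)

fast = TinyMapVector(items)
fast.remove_if(is_even)

print(slow.entries == fast.entries)
```

Both paths end in the same state, but in this model the one-at-a-time path repeats the tail shift for every erased element while the bulk path compacts the container once.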
2024-06-03[Attributor][FIX] Replace AANoFPClass MBEC propagation (#91030)Johannes Doerfert34-1654/+1644
The old use of must-be-executed-context (MBEC) did propagate through calls even if that was not allowed. We now only propagate from call site arguments. If there are calls/intrinsics that allow propagation, we need to add them explicitly. Fixes: https://github.com/llvm/llvm-project/issues/78507
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2024-06-04[Asan] Teach FunctionStackPoisoner to filter out struct type with scalable vector type. (#93406)Yeting Kuo2-2/+15
FunctionStackPoisoner does not handle an `AllocaInst` of scalable vector type, but it did not filter out struct types containing scalable vectors, which were introduced by c8eb535aed0368c20b25fe05bca563ab38dd91e9.
2024-06-04[AArch64] Sink llvm.vscale.i32 into blocks for better isel (#93465)Fangcao Wang2-1/+177
Sink vscale calls as well when indvars is not widened (-indvars-widen-indvars=false).
2024-06-03[MemProf] Determine stack id references in BitcodeWriter without sorting (#94285)Teresa Johnson1-21/+31
A cycle profile of a thin link showed a lot of time spent in sort called from the BitcodeWriter, which was being used to compute the unique references to stack ids in the summaries emitted for each backend in a distributed thinlto build. We were also frequently invoking lower_bound to locate stack id indices in the resulting vector when writing out the referencing memprof records. Change this to use a map to uniquify the references, and to hold the index of the corresponding stack id in the StackIds vector, which is now populated at the same time. This reduced the time of a large thin link by about 10%.
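The two strategies described above can be contrasted in a short sketch. This is a Python illustration of the general technique, not the BitcodeWriter code; the function names and the sample ids are hypothetical.

```python
from bisect import bisect_left


def index_by_sort(refs):
    """Earlier shape: sort and deduplicate, then binary-search each lookup."""
    unique = sorted(set(refs))
    return unique, lambda sid: bisect_left(unique, sid)


def index_by_map(refs):
    """Newer shape: one pass that fills the id vector and the index map together."""
    stack_ids = []
    index_of = {}
    for sid in refs:
        if sid not in index_of:
            index_of[sid] = len(stack_ids)  # position the id is about to occupy
            stack_ids.append(sid)
    return stack_ids, index_of


refs = [42, 7, 42, 19, 7]  # made-up stack id references

unique, lookup = index_by_sort(refs)
stack_ids, index_of = index_by_map(refs)

# Either way, an index resolves back to the id it was computed for.
print(all(unique[lookup(s)] == s for s in refs))
print(all(stack_ids[index_of[s]] == s for s in refs))
```

The map-based version needs no sort and answers each later lookup with a hash probe instead of a binary search. In this sketch it keeps ids in first-seen order, which is fine as long as the indices written out match the vector they refer to.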
2024-06-04[PowerPC] Remove DAG matching in ADDIStocHA (#93905)Kai Luo4-9/+7
The MI is generated in `PPCDAGToDAGISel::Select` so the match pattern isn't used and can be removed.
2024-06-04[NewPM][CodeGen] Port `finalize-isel` to new pass manager (#94214)paperchalice16-8/+76
It should preserve more analysis results, but it happens immediately after instruction selection.
2024-06-04[CodeGen] Fix compiler conditional combination (#94297)Keith Smiley1-2/+2
Previously this assumed that `LLVM_ENABLE_ABI_BREAKING_CHECKS` would always be enabled in this case; if it is not, `TTI` does not exist. Introduced in 7652a59407018c057cdc1163c9f64b5b6f0954eb
2024-06-04[LoongArch] Use R_LARCH_ALIGN without symbol as much as possible (#93775)Lu Weining3-20/+28
To support the third parameter of the alignment directive, R_LARCH_ALIGN relocations need a non-zero symbol index. In many cases we don't need the third parameter and can set the symbol index to 0. This patch will remove a lot of .Lla-relax-align* symbols and mitigate the size regression due to https://github.com/llvm/llvm-project/pull/72962. Co-authored-by: Jinyang He <hejinyang@loongson.cn> Co-authored-by: Weining Lu <luweining@loongson.cn>
2024-06-03[Codegen, BasicBlockSections] Avoid cloning blocks which have their machine block address taken. (#94296)Rahman Lavaee2-6/+31
These blocks usually show up in the form of branches within inline assembly. Since it's hard to rewire them, we fully omit paths with such blocks from path cloning.
2024-06-03gn build: Use -fvisibility-global-new-delete=force-hidden to build libcxx/libcxxabi/libunwind.pcc3-3/+3
-fvisibility-global-new-delete-hidden is deprecated and clang was warning about it on every build command. These libraries are always built using a stage2 compiler, so we can use the new build flag unconditionally. Reviewers: aeubanks Reviewed By: aeubanks Pull Request: https://github.com/llvm/llvm-project/pull/88459
2024-06-04Reland "[NewPM][CodeGen] Port selection dag isel to new pass manager" (#94149)paperchalice122-306/+823
- Fix build with `EXPENSIVE_CHECKS`
- Remove unused `PassName::ID` to resolve warning
- Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly