Age | Commit message | Author | Files | Lines
2024-03-07[ValueTracking] Fix KnownBits conflict for calls (range vs returned) (#84353)Björn Pettersson2-0/+20
If a function only exits for certain input values we can still derive that an argument is "returned". We can also derive range metadata that describes the possible value range returned by the function. However, it turns out that those two analyses can result in conflicting information.
Example:
  declare i16 @foo(i16 returned)
  ...
  %A = call i16 @foo(i16 4095), !range !{i16 32, i16 33}
To avoid "Bits known to be one AND zero?" assertion failures we now make sure to discard the known bits for this kind of scenario.
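The conflict can be reproduced with a small model of known-bits tracking. This is an illustrative Python sketch, not LLVM's `KnownBits` class: the names `Known`, `known_from_constant`, `known_from_range` and `merge` are hypothetical, the range helper assumes a non-wrapping half-open range, and dropping all information on conflict is one plausible reading of "discard the known bits".

```python
from dataclasses import dataclass

BITS = 16
MASK = (1 << BITS) - 1


@dataclass(frozen=True)
class Known:
    zero: int  # bits known to be 0
    one: int   # bits known to be 1

    def has_conflict(self) -> bool:
        # A bit cannot be known-zero and known-one at the same time.
        return (self.zero & self.one) != 0


def known_from_constant(value: int) -> Known:
    """Every bit of a constant is known (models the 'returned' argument)."""
    value &= MASK
    return Known(zero=~value & MASK, one=value)


def known_from_range(lo: int, hi: int) -> Known:
    """Bits shared by every value in [lo, hi); assumes lo < hi (models !range)."""
    zero, one = MASK, MASK
    for v in range(lo, hi):
        one &= v
        zero &= ~v & MASK
    return Known(zero, one)


def merge(a: Known, b: Known) -> Known:
    """Combine two facts; on conflict, fall back to 'nothing known'."""
    merged = Known(a.zero | b.zero, a.one | b.one)
    return Known(0, 0) if merged.has_conflict() else merged


returned = known_from_constant(4095)   # from `i16 returned` with argument 4095
ranged = known_from_range(32, 33)      # from !range !{i16 32, i16 33}
naive = Known(returned.zero | ranged.zero, returned.one | ranged.one)
safe = merge(returned, ranged)
```

Here `naive` claims some bits are both zero and one, which is the state the assertion guards against, while `safe` simply carries no information.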
2024-03-07[libc] Refactor stdfix extension from llvm_libc_ext.td to ↵lntue4-21/+26
llvm_libc_stdfix_ext.td. (#84365) This fixes runtime build for armv6 baremetal targets: https://github.com/llvm/llvm-project/pull/83959#issuecomment-1984221249
2024-03-07[libc++] Enable availability based on the compiler instead of ↵Louis Dionne2-4/+25
__has_extension (#84065) __has_extension(...) doesn't work as intended when -pedantic-errors is used with Clang. With that flag, __has_extension(...) is equivalent to __has_feature(...), which means that checks like __has_extension(pragma_clang_attribute_external_declaration) will return 0. In turn, this has the effect of disabling availability markup in libc++, which is undesirable. rdar://124078119
2024-03-07[CVP] Add test coverage for an upcoming generalization of expandUDivOrURemPhilip Reames2-0/+161
2024-03-07[clang-tidy] `isOnlyUsedAsConst`: Handle static method calls. (#84005)Clement Courbet2-4/+14
... using method syntax:
```
struct S {
  static void f();
};
void DoIt(S& s) {
  s.f();  // Does not mutate `s` through the `this` parameter.
}
```
2024-03-07[libc][c23] add memset_explicit (#83577)Schrodinger ZHU Yifan8-0/+104
2024-03-07[bazel] Port 3714f937b835c06c8c32ca4f3f61ba2317db2296Benjamin Kramer1-0/+1
2024-03-07[GISEL] Silence unused variable warning. NFCBenjamin Kramer1-2/+1
2024-03-07AMDGPU: Use OtherPredicates for v_dot2_bf16_bf16(f16_f16) pseudo (#84354)Changpeng Fang1-1/+1
This is because SubtargetPredicate is not copied from the pseudo to the dpp16 and dpp8 reals. This is in fact a common issue for instructions with _Realtriple_: we should avoid using SubtargetPredicate to define the pseudo, because the predicate will be lost in the real.
2024-03-07[ORC] Propagate defineMaterializing failure when resource tracker is defunct.Lang Hames1-3/+4
Remove an overly aggressive cantFail: This call to defineMaterializing should never fail with a duplicate symbols error (since all new symbols should be weak), but may fail if the tracker has become defunct in the meantime. In that case we need to propagate the error.
2024-03-07[gn build] Port 00f412168cf6LLVM GN Syncbot2-0/+2
2024-03-07[ORC][JITLink] Add Intel VTune support to JITLink (#83957)Hongyu Chen10-1/+691
[ORC] Re-land https://github.com/llvm/llvm-project/pull/81826 This patch adds two plugins: VTuneSupportPlugin.cpp and JITLoaderVTune.cpp. The testing is done in a manner similar to llvm-jitlistener. Currently, we only support the old version of Intel VTune API.
2024-03-07[X86] Fold `(icmp ult (add x,-C),2)` -> `(or (icmp eq X,C), (icmp eq ↵Noah Goldstein2-223/+214
X,C+1))` for Vectors This is undoing a middle-end transform which does the opposite. Since X86 doesn't have unsigned vector comparison instructions pre-AVX512, the simplified form gets worse codegen. Fixes #66479 Proofs: https://alive2.llvm.org/ce/z/UCz3wt Closes #84104 Closes #66479
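The equivalence behind this fold can be checked by brute force on a narrow type. This is a rough Python sketch over 8-bit scalars rather than vectors, and it does not replace the Alive2 proof linked above; the function names are hypothetical.

```python
BITS = 8
MOD = 1 << BITS


def icmp_ult_add(x: int, c: int) -> bool:
    """(icmp ult (add x, -C), 2) using wrapping BITS-wide arithmetic."""
    return ((x - c) % MOD) < 2


def or_of_eq(x: int, c: int) -> bool:
    """(or (icmp eq x, C), (icmp eq x, C+1)), where C+1 also wraps."""
    return x == c or x == (c + 1) % MOD


def forms_agree() -> bool:
    """Compare both forms for every (x, C) pair of BITS-wide values."""
    return all(
        icmp_ult_add(x, c) == or_of_eq(x, c)
        for x in range(MOD)
        for c in range(MOD)
    )
```

The wrap-around case (for example C = 255 and x = 0) is covered because both the subtraction and C+1 are reduced modulo 2^8.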
2024-03-07[X86] Add tests for folding `(icmp ult (add x,-C),2)` -> `(or (icmp eq X,C), ↵Noah Goldstein1-0/+786
(icmp eq X,C+1))`; NFC
2024-03-07[gn build] Port a6a6fca7911fLLVM GN Syncbot1-1/+2
2024-03-07[mlir][sparse] Migrate more tests to sparse_tensor.print (#84249)Yinying Li20-560/+478
Continued effort following #83946.
2024-03-07[mlir][sparse] Migrate to sparse_tensor.print (#83946)Yinying Li20-190/+249
Continued effort following #83506.
2024-03-07[libc] Fix forward missing `BigInt` specialization of `mask_leading_ones` / ↵Guillaume Chatelet8-34/+123
`mask_trailing_ones` (#84325) #84299 broke the arm32 build, this patch fixes it forward.
2024-03-07Change Get|SetNumChildren to use uint32_tAdrian Prantl2-4/+4
2024-03-07[TBAA] Add extra tests to copy structs with union members.Florian Hahn1-0/+40
Adds extra test coverage for TBAA generation for copies of structs with union members.
2024-03-07Change GetChildAtIndex to take a uint32_tAdrian Prantl37-99/+100
2024-03-07Change the return type of SyntheticFrontend::CalculateNumChildren to int32_tAdrian Prantl34-89/+89
This way it is consistent with ValueObject and TypeSystem.
2024-03-07Change the return type of ValueObject::CalculateNumChildren to uint32_t.Adrian Prantl20-23/+23
In the end this value comes from TypeSystem::GetNumChildren which returns a uint32_t, so ValueObject should be consistent with that.
2024-03-07[ubsan][pgo] Pass to remove ubsan checks based on profile data (#83471)Vitaly Buka6-0/+536
UBSAN checks can be too expensive to be used in release binaries. However, not all code affects performance in the same way. By removing a small number of checks in hot code we can reduce the performance loss while preserving most of the checks.
2024-03-07[clang] Add CodeGen tests for CWG 5xx issues (#84303)Vlad Serebrennikov4-19/+60
This patch covers [CWG519](https://cplusplus.github.io/CWG/issues/519.html) "Null pointer preservation in `void*` conversions", [CWG571](https://cplusplus.github.io/CWG/issues/571.html) "References declared const".
2024-03-07[clang][Interp][NFC] Add [[nodiscard]] attribute to emit functionsTimm Bäder2-2/+3
2024-03-07[lldb] Minor cleanup in StoringDiagnosticConsumer (#84263)Dave Lee1-8/+5
Removes an unused field. Retypes unshared smart pointers to `unique_ptr`.
2024-03-07[acc] Add attribute for combined constructs (#80319)Razvan Lupusoru5-7/+168
Combined constructs are decomposed into separate operations. However, this does not adhere to `acc` dialect's goal to be able to regenerate semantically equivalent clauses as user's intent. Thus, add an attribute to keep track of the combined constructs.
2024-03-07[clang][Interp][NFC] Remove unneeded forward declarationTimm Bäder1-1/+0
We already import Record.h.
2024-03-07[lldb] Don't report all progress event as completed. (#84281)Jonas Devlieghere3-11/+18
Currently, progress events reported by the ProgressManager and broadcast to eBroadcastBitProgressCategory always specify they're complete. The problem is that the ProgressManager reports kNonDeterministicTotal for both the total and the completed number of (sub)events. Because the values are the same, the event reports itself as complete. This patch fixes the issue by reporting 0 as the completed value for the start event and kNonDeterministicTotal for the end event.
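The completion logic described above can be modelled in a few lines. This is a hedged Python sketch, not lldb code: `ProgressEvent` and the helper functions are hypothetical, and `K_NON_DETERMINISTIC_TOTAL` is assumed to be the maximum 64-bit unsigned value.

```python
from dataclasses import dataclass

# Stand-in for kNonDeterministicTotal (assumed: largest uint64_t value).
K_NON_DETERMINISTIC_TOTAL = (1 << 64) - 1


@dataclass
class ProgressEvent:
    completed: int
    total: int

    def is_complete(self) -> bool:
        # An event reports itself as complete when both counters match.
        return self.completed == self.total


def start_event_before_fix() -> ProgressEvent:
    # Both values were the sentinel, so the start event already looked complete.
    return ProgressEvent(K_NON_DETERMINISTIC_TOTAL, K_NON_DETERMINISTIC_TOTAL)


def start_event_after_fix() -> ProgressEvent:
    # Reporting 0 as the completed value keeps the start event open.
    return ProgressEvent(0, K_NON_DETERMINISTIC_TOTAL)


def end_event() -> ProgressEvent:
    # The end event still uses the sentinel for both values.
    return ProgressEvent(K_NON_DETERMINISTIC_TOTAL, K_NON_DETERMINISTIC_TOTAL)
```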
2024-03-07[LinkerWrapper] Accept compression arguments for HIP fatbins (#84337)Joseph Huber4-2/+11
Summary: The HIP toolchain has support for compressing the final output. We should respect that when we create the executable.
2024-03-07Revert "[AArch64][GlobalISel] Fix incorrect selection of monotonic s32->s64 ↵Florian Mayer2-34/+8
anyext load." This reverts commit 7524ad9aa7b1b5003fe554a6ac8e434d50027dfb. Broke sanitizer build bots, e.g. https://lab.llvm.org/buildbot/#/builders/5/builds/41588/steps/9/logs/stdio
2024-03-07Fix vfork test strcmp buildbot failure (#84224)jeffreytan811-0/+1
The buildbot seems to complain about the `strcmp` function not being available in the vfork patch (https://github.com/llvm/llvm-project/pull/81564): https://lab.llvm.org/buildbot/#/builders/68/builds/70093/steps/6/logs/stdio Unfortunately, I can't reproduce the failure on my Linux machine, so this fix is a guess. If anyone has a way to reproduce the failure and verify this fix, please feel free to merge this change. Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
2024-03-07[InstCombine] ptrmask of gep for dynamic pointer alignment (#80002)Jon Chesterfield2-4/+198
Targets the dynamic realignment pattern of `(Ptr + Align - 1) & -Align;` as implemented by gep then ptrmask. Specifically, when the pointer already has alignment information, dynamically realigning it to less than is already known should be a no-op. Discovered while writing test cases for another patch.
For the zero low bits of a known aligned pointer, adding the gep index then removing it with a mask is a no-op. Folding the ptrmask effect entirely into the gep is the ideal result as that unblocks other optimisations that are not aware of ptrmask.
In some other cases the gep is known to be dead and is removed without changing the ptrmask.
In the least effective case, this transform creates a new gep with a rounded-down index and still leaves the ptrmask unchanged. That simplified gep is still a minor improvement. Geps are cheap and ptrmask occurs in address calculation contexts, so I don't think it's worth special casing to avoid the extra instruction.
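The no-op claim is easy to check numerically. This is a small Python sketch using plain integers as addresses, not the InstCombine transform itself; `realign` and `is_noop_for_known_alignment` are hypothetical helper names and alignments are assumed to be powers of two.

```python
def realign(ptr: int, align: int) -> int:
    """Round ptr up to a multiple of align: (ptr + align - 1) & -align."""
    return (ptr + align - 1) & -align


def is_noop_for_known_alignment(known_align: int, requested_align: int,
                                samples: int = 1024) -> bool:
    """True if realigning leaves every sampled known_align-aligned address unchanged."""
    return all(
        realign(k * known_align, requested_align) == k * known_align
        for k in range(samples)
    )
```

For example, addresses already aligned to 16 are unchanged by a realignment to 8, while an address aligned only to 4 can move when realigned to 16.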
2024-03-07[MLIR] Add llvm (debug) attributes to CAPI (#83992)Edgar6-1/+562
This PR adds the following to the MLIR C API:
- The distinct MLIR builtin attribute.
- LLVM attributes (mostly debug-related ones).
2024-03-07[GISEL] Add IRTranslation for shufflevector on scalable vector types (#80378)Michael Maitland15-21/+1890
Recommits llvm/llvm-project#80378 which was reverted in llvm/llvm-project#84330. The problem was that the change in llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used 217 as an opcode instead of a regex.
2024-03-07[AArch64] Add -verify-machineinstrs to a testJay Foad1-1/+1
This would have helped identify problems with #83905 which only showed up in an LLVM_ENABLE_EXPENSIVE_CHECKS build.
2024-03-07[HIP] Do not include the CUID module hash with the new driver (#84332)Joseph Huber1-1/+1
Summary: The new driver does not need this hash and it can lead to redefined symbol errors when the CUID hash isn't set.
2024-03-07[mlir][ArmSME] Rewrite illegal `shape_casts` to `vector.transpose` ops (#82985)Benjamin Maxwell2-14/+116
This adds a rewrite that converts illegal 2D unit-dim `shape_casts` into `vector.transpose` ops. E.g.
```mlir
// Case 1:
%a = vector.shape_cast %0 : vector<[4]x1xf32> to vector<1x[4]xf32>
// Case 2:
%b = vector.shape_cast %1 : vector<[4]x1xf32> to vector<[4]xf32>
```
Becomes:
```mlir
// Case 1:
%a = vector.transpose %0 : [1, 0] vector<[4]x1xf32> to vector<1x[4]xf32>
// Case 2:
%t = vector.transpose %1 : [1, 0] vector<[4]x1xf32> to vector<1x[4]xf32>
%b = vector.shape_cast %t : vector<1x[4]xf32> to vector<[4]xf32>
```
Various lowerings and drop unit-dims patterns add such shape_casts, however, if they do not cancel out (which they likely won't if we've reached the vector-legalization pass) they will prevent lowering the IR. Rewriting them as a transpose gives `LiftIllegalVectorTransposeToMemory` a chance to eliminate the illegal types.
2024-03-07[libomptarget][nextgen-plugin][NFC] Clean-up InputSignal checks (#83458)Gheorghe-Teodor Bercea1-24/+4
Clean-up InputSignal checks.
2024-03-07[clang][Interp][NFC] Use ArrayElem{,Pop} ops more oftenTimm Bäder3-37/+42
Instead of the longer ArrayElemPtr + Load.
2024-03-07[libc][stdbit] implement stdc_bit_floor (C23) (#84233)Nick Desaulniers23-9/+357
2024-03-07[SLP][NFC]Add lshr version of the test with casting, NFC.Alexey Bataev1-0/+54
2024-03-07[flang] Changes to map variables in link clause of declare target (#83643)Anchu Rajendran S4-34/+151
As per the OpenMP standard, "If a variable appears in a link clause on a declare target directive that does not have a device_type clause with the nohost device-type-description then it is treated as if it had appeared in a map clause with a map-type of tofrom" is an implicit mapping rule. Before this change, such variables were mapped as to by default.
2024-03-07Reapply "[mlir][py] better support for arith.constant construction" (#84142)Oleksandr "Alex" Zinenko2-2/+68
Arithmetic constants for vector types can be constructed from objects implementing Python buffer protocol such as `array.array`. Note that until Python 3.12, there is no typing support for buffer protocol implementers, so the annotations use array explicitly. Reverts llvm/llvm-project#84103
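For readers unfamiliar with the buffer protocol, the following standard-library-only Python sketch shows what a consumer can read from an `array.array` through `memoryview`. It does not call the MLIR Python bindings, and `describe_buffer` is a hypothetical helper used only for illustration.

```python
import array


def describe_buffer(obj):
    """Return (format, shape, values) for an object exposing the buffer protocol."""
    with memoryview(obj) as view:
        return view.format, view.shape, view.tolist()


# array.array implements the buffer protocol, so memoryview can wrap it directly.
values = array.array("i", [1, 2, 3, 4])
fmt, shape, items = describe_buffer(values)
```

The element format code and the shape are the kind of metadata a binding would need in order to match a buffer against a vector type.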
2024-03-07[clang][Interp] Implement complex comparisonsTimm Bäder4-6/+157
2024-03-07[NFC][Asan] Prepare AddressSanitizer to detect inserted runtime calls (#84223)sylvain-audi1-73/+111
This is in preparation for an upcoming commit that will add "funclet" OpBundle to the inserted runtime calls where the function's EH personality requires it. See PR https://github.com/llvm/llvm-project/pull/82533
2024-03-07[LinkerWrapper] Use the correct empty file on Windows (#84322)Joseph Huber1-0/+4
Summary: The clang-offload-bundler uses an empty file to control the bundles made for embedding. Previously this still used `/dev/null` by mistake even on Windows.
2024-03-07[SLP]Improve minbitwidth analysis.Alexey Bataev15-305/+553
This improves overall analysis for minbitwidth in SLP. It allows analyzing trees with store/insertelement root nodes. Also, instead of using a single minbitwidth, detected from the very first analysis stage, it tries to detect the best one for each trunc/ext subtree in the graph and use it for the subtree. Results in better code and less vector register pressure.

Metric: size..text

Program                                                                             size..text
                                                                                    results      results0     diff
test-suite :: SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant.test    92549.00     92609.00     0.1%
test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test                663381.00    663493.00    0.0%
test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test                 663381.00    663493.00    0.0%
test-suite :: MultiSource/Benchmarks/Bullet/bullet.test                             307182.00    307214.00    0.0%
test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test           1394420.00   1394484.00   0.0%
test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test            1394420.00   1394484.00   0.0%
test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test              2040257.00   2040273.00   0.0%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test            12396098.00  12395858.00  -0.0%
test-suite :: External/SPEC/CINT2006/445.gobmk/445.gobmk.test                       909944.00    909768.00    -0.0%

SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant - 4 scalar instructions remain scalar (good).
Spec2017/x264 - the whole function idct4x4dc is vectorized using <16 x i16> instead of <16 x i32>, also zext/trunc are removed. In other places last vector zext/sext removed and replaced by extractelement + scalar zext/sext pair.
MultiSource/Benchmarks/Bullet/bullet - reduce or <4 x i32> replaced by reduce or <4 x i8>
Spec2017/imagick - Removed extra zext from 2 packs of the operations.
Spec2017/parest - Removed extra zext, replaced by extractelement+scalar zext
Spec2017/blender - the whole bunch of vector zext/sext replaced by extractelement+scalar zext/sext, some extra code vectorized in smaller types.
Spec2006/gobmk - fixed cost estimation, some small code remains scalar.

Reviewers: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/84334
2024-03-07Revert "[GISEL] Add IRTranslation for shufflevector on scalable vector ↵Michael Maitland15-1890/+21
types" (#84330) Reverts llvm/llvm-project#80378, which caused Buildbot failures that did not show up with check-llvm or CI.