Age | Commit message | Author | Files | Lines |
|
If a function only exits for certain input values we can still derive
that an argument is "returned". We can also derive range metadata that
describes the possible value range returned by the function. However, it
turns out that those two analyses can result in conflicting information.
Example:
declare i16 @foo(i16 returned)
...
%A = call i16 @foo(i16 4095), !range !{i16 32, i16 33}
To avoid "Bits known to be one AND zero?" assertion failures we now
make sure to discard the known bits for this kind of scenario.
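As a rough illustration of the conflict described above, the following sketch models known bits for a 16-bit value. The `KnownBits16` struct and the helper names are hypothetical and are not LLVM's `KnownBits` API; the range `[32, 33)` is treated as the single constant 32.

```cpp
#include <cstdint>

// Hypothetical stand-in for a known-bits record: a set bit in `zero` means
// "this bit is known to be 0", a set bit in `one` means "known to be 1".
struct KnownBits16 {
  std::uint16_t zero;
  std::uint16_t one;
};

// Every bit of a constant is known.
constexpr KnownBits16 fromConstant(std::uint16_t c) {
  return KnownBits16{static_cast<std::uint16_t>(~c), c};
}

// Combine two independently derived facts about the same value.
constexpr KnownBits16 unionKnown(KnownBits16 a, KnownBits16 b) {
  return KnownBits16{static_cast<std::uint16_t>(a.zero | b.zero),
                     static_cast<std::uint16_t>(a.one | b.one)};
}

// A bit claimed to be both 0 and 1 means the two facts contradict each other.
constexpr bool hasConflict(KnownBits16 k) { return (k.zero & k.one) != 0; }

// On a conflict, discard everything and report that nothing is known.
constexpr KnownBits16 sanitize(KnownBits16 k) {
  return hasConflict(k) ? KnownBits16{0, 0} : k;
}

// "returned" argument 4095 versus !range [32, 33), i.e. exactly 32.
constexpr KnownBits16 fromReturned = fromConstant(4095);
constexpr KnownBits16 fromRange = fromConstant(32);
constexpr KnownBits16 merged = unionKnown(fromReturned, fromRange);
```

With these inputs, bit 0 is claimed to be one by the returned argument and zero by the range, so the merged facts conflict and `sanitize` drops them.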
|
|
llvm_libc_stdfix_ext.td. (#84365)
This fixes runtime build for armv6 baremetal targets:
https://github.com/llvm/llvm-project/pull/83959#issuecomment-1984221249
|
|
__has_extension (#84065)
__has_extension(...) doesn't work as intended when -pedantic-errors is
used with Clang. With that flag, __has_extension(...) is equivalent to
__has_feature(...), which means that checks like
__has_extension(pragma_clang_attribute_external_declaration)
will return 0. In turn, this has the effect of disabling availability
markup in libc++, which is undesirable.
rdar://124078119
|
|
|
|
... using method syntax:
```
struct S {
static void f();
};
void DoIt(S& s) {
s.f(); // Does not mutate `s` through the `this` parameter.
}
```
|
|
|
|
|
|
|
|
This is because SubtargetPredicate is not copied from the pseudo to the
dpp16 and dpp8 reals. This is a common issue for instructions defined with
_Realtriple_: we should avoid using SubtargetPredicate to define the
pseudo, because the predicate will be lost in the real.
|
|
Remove an overly aggressive cantFail: This call to defineMaterializing should
never fail with a duplicate symbols error (since all new symbols should be
weak), but may fail if the tracker has become defunct in the meantime. In that
case we need to propagate the error.
|
|
|
|
[ORC] Re-land https://github.com/llvm/llvm-project/pull/81826
This patch adds two plugins: VTuneSupportPlugin.cpp and
JITLoaderVTune.cpp. The testing is done in a manner similar to
llvm-jitlistener. Currently, we only support the old version of Intel
VTune API.
|
|
X,C+1))` for Vectors
This is undoing a middle-end transform which does the opposite. Since
X86 doesn't have unsigned vector comparison instructions pre-AVX512,
the simplified form gets worse codegen.
Fixes #66479
Proofs: https://alive2.llvm.org/ce/z/UCz3wt
Closes #84104
Closes #66479
|
|
(icmp eq X,C+1))`; NFC
|
|
|
|
Continued work following #83946.
|
|
Continued work following #83506.
|
|
`mask_trailing_ones` (#84325)
#84299 broke the arm32 build; this patch fixes it forward.
|
|
|
|
Adds extra test coverage for TBAA generation for copies of structs with
union members.
|
|
|
|
This way it is consistent with ValueObject and TypeSystem.
|
|
In the end this value comes from TypeSystem::GetNumChildren which
returns a uint32_t, so ValueObject should be consistent with that.
|
|
UBSAN checks can be too expensive to be used
in release binaries. However, not all code affects
performance in the same way. By removing a small
number of checks in hot code, we can reduce the performance
loss while preserving most of the checks.
|
|
This patch covers
[CWG519](https://cplusplus.github.io/CWG/issues/519.html) "Null pointer
preservation in `void*` conversions",
[CWG571](https://cplusplus.github.io/CWG/issues/571.html) "References
declared const".
|
|
|
|
Removes an unused field. Retypes unshared smart pointers to `unique_ptr`.
|
|
Combined constructs are decomposed into separate operations. However,
this does not adhere to the `acc` dialect's goal of being able to regenerate
clauses that are semantically equivalent to the user's intent. Thus, add an
attribute to keep track of the combined constructs.
|
|
We already import Record.h.
|
|
Currently, progress events reported by the ProgressManager and broadcast
to eBroadcastBitProgressCategory always specify they're complete. The
problem is that the ProgressManager reports kNonDeterministicTotal for
both the total and the completed number of (sub)events. Because the
values are the same, the event reports itself as complete.
This patch fixes the issue by reporting 0 as the completed value for the
start event and kNonDeterministicTotal for the end event.
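The completeness check behind this bug can be sketched as follows. `ProgressEvent` and the helper functions are hypothetical and are not LLDB's actual classes, and the sentinel is assumed here to be the largest `uint64_t` value.

```cpp
#include <cstdint>
#include <limits>

// Assumption: the "total is unknown" sentinel is the largest uint64_t value.
constexpr std::uint64_t kNonDeterministicTotal =
    std::numeric_limits<std::uint64_t>::max();

// Hypothetical event payload, only for illustration.
struct ProgressEvent {
  std::uint64_t completed;
  std::uint64_t total;
};

// An event counts as complete once the completed count reaches the total.
constexpr bool isComplete(ProgressEvent e) { return e.completed == e.total; }

// Old behaviour: the start event carries the sentinel in both fields.
constexpr ProgressEvent startEventBefore() {
  return {kNonDeterministicTotal, kNonDeterministicTotal};
}

// New behaviour: the start event reports 0 completed.
constexpr ProgressEvent startEventAfter() {
  return {0, kNonDeterministicTotal};
}

// The end event keeps the sentinel in both fields, so it reads as complete.
constexpr ProgressEvent endEvent() {
  return {kNonDeterministicTotal, kNonDeterministicTotal};
}
```

Under these assumptions, only the old start event is wrongly treated as complete; the new start event is not, and the end event still is.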
|
|
Summary:
The HIP toolchain has support for compressing the final output. We
should respect that when we create the executable.
|
|
anyext load."
This reverts commit 7524ad9aa7b1b5003fe554a6ac8e434d50027dfb.
Broke sanitizer build bots, e.g. https://lab.llvm.org/buildbot/#/builders/5/builds/41588/steps/9/logs/stdio
|
|
The buildbot seems to complain that the `strcmp` function is not available in
the vfork patch (https://github.com/llvm/llvm-project/pull/81564):
https://lab.llvm.org/buildbot/#/builders/68/builds/70093/steps/6/logs/stdio
Unfortunately, I can't reproduce the failure on my Linux machine, so this
fix is a guess. If anyone has a way to reproduce the failure and verify this
fix, please feel free to merge this change.
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
|
|
Targets the dynamic realignment pattern of `(Ptr + Align - 1) & -Align;`
as implemented by gep then ptrmask.
Specifically, when the pointer already has alignment information,
dynamically realigning it to less than what is already known should be a
no-op. Discovered while writing test cases for another patch.
For the zero low bits of a known aligned pointer, adding the gep index
then removing it with a mask is a no-op. Folding the ptrmask effect
entirely into the gep is the ideal result as that unblocks other
optimisations that are not aware of ptrmask.
In some other cases the gep is known to be dead and is removed without
changing the ptrmask.
In the least effective case, this transform creates a new gep with a
rounded-down index and still leaves the ptrmask unchanged. That
simplified gep is still a minor improvement: geps are cheap and ptrmask
occurs in address calculation contexts, so I don't think it's worth
special-casing to avoid the extra instruction.
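The no-op case can be checked with plain integer arithmetic. This is a minimal sketch on raw addresses rather than LLVM IR; `alignUp` and `realignIsNoOpForAll` are hypothetical helper names, and `align` is assumed to be a power of two.

```cpp
#include <cstdint>

// Round `addr` up to the next multiple of `align` (assumed to be a power of
// two), mirroring the pattern (Ptr + Align - 1) & -Align.
constexpr std::uint64_t alignUp(std::uint64_t addr, std::uint64_t align) {
  // For unsigned values, ~(align - 1) is the same bit pattern as -align.
  return (addr + align - 1) & ~(align - 1);
}

// Check the first `count` addresses that are multiples of `knownAlign` and
// report whether realigning each of them to `requested` leaves it unchanged.
constexpr bool realignIsNoOpForAll(std::uint64_t knownAlign,
                                   std::uint64_t requested,
                                   std::uint64_t count) {
  for (std::uint64_t i = 0; i < count; ++i) {
    const std::uint64_t addr = i * knownAlign;
    if (alignUp(addr, requested) != addr) {
      return false;
    }
  }
  return true;
}
```

Realigning 64-byte-aligned addresses to 16 bytes changes nothing, whereas realigning 16-byte-aligned addresses to 64 bytes can move them, which is why the fold only applies when the requested alignment does not exceed the known one.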
|
|
This PR adds the following to the MLIR C API:
- The distinct MLIR builtin attribute.
- LLVM attributes (mostly debug-related ones)
|
|
Recommits llvm/llvm-project#80378 which was reverted in
llvm/llvm-project#84330. The problem was that the change in
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used
217 as an opcode instead of a regex.
|
|
This would have helped identify problems with #83905 which only showed
up in an LLVM_ENABLE_EXPENSIVE_CHECKS build.
|
|
Summary:
The new driver does not need this hash and it can lead to redefined
symbol errors when the CUID hash isn't set.
|
|
This adds a rewrite that converts illegal 2D unit-dim `shape_casts` into
`vector.transpose` ops.
E.g.
```mlir
// Case 1:
%a = vector.shape_cast %0 : vector<[4]x1xf32> to vector<1x[4]xf32>
// Case 2:
%b = vector.shape_cast %1 : vector<[4]x1xf32> to vector<[4]xf32>
```
Becomes:
```mlir
// Case 1:
%a = vector.transpose %0 : [1, 0] vector<[4]x1xf32> to vector<1x[4]xf32>
// Case 2:
%t = vector.transpose %1 : [1, 0] vector<[4]x1xf32> to vector<1x[4]xf32>
%b = vector.shape_cast %t : vector<1x[4]xf32> to vector<[4]xf32>
```
Various lowerings and drop unit-dims patterns add such shape_casts,
however, if they do not cancel out (which they likely won't if we've
reached the vector-legalization pass) they will prevent lowering the IR.
Rewriting them as a transpose gives `LiftIllegalVectorTransposeToMemory`
a chance to eliminate the illegal types.
|
|
Clean-up InputSignal checks.
|
|
Instead of the longer ArrayElemPtr + Load.
|
|
|
|
|
|
As per the OpenMP standard, "If a variable appears in a link clause on a
declare target directive that does not have a device_type clause with
the nohost device-type-description then it is treated as if it had
appeared in a map clause with a map-type of tofrom" is an implicit
mapping rule. Before this change, such variables were mapped as `to` by
default.
|
|
Arithmetic constants for vector types can be constructed from objects
implementing Python buffer protocol such as `array.array`. Note that
until Python 3.12, there is no typing support for buffer protocol
implementers, so the annotations use array explicitly.
Reverts llvm/llvm-project#84103
|
|
|
|
This is in preparation for an upcoming commit that will add "funclet"
OpBundle to the inserted runtime calls where the function's EH
personality requires it.
See PR https://github.com/llvm/llvm-project/pull/82533
|
|
Summary:
The clang-offload-bundler uses an empty file to control the bundles made
for embedding. Previously this still used `/dev/null` by mistake even on
Windows.
|
|
This improves the overall minbitwidth analysis in SLP. It allows analyzing
trees with store/insertelement root nodes. Also, instead of using a single
minbitwidth, detected in the very first analysis stage, it tries to detect
the best one for each trunc/ext subtree in the graph and uses it for that
subtree.
Results in better code and less vector register pressure.
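The legality idea behind evaluating a trunc/ext subtree in a narrower type can be sketched with scalar arithmetic. This is only an illustration and not the SLP vectorizer's analysis; the function names are hypothetical, and inputs are assumed to fit in 16 bits.

```cpp
#include <cstdint>

constexpr std::uint32_t kMask16 = 0xFFFFu;

// Wide form: operate on zero-extended 32-bit values, truncate once at the end.
constexpr std::uint32_t mulAddWide(std::uint32_t a, std::uint32_t b,
                                   std::uint32_t c) {
  return ((a + b) * c) & kMask16;
}

// Narrow form: every intermediate result is kept to 16 bits.
constexpr std::uint32_t mulAddNarrow(std::uint32_t a, std::uint32_t b,
                                     std::uint32_t c) {
  return (((a + b) & kMask16) * c) & kMask16;
}

// A right shift reads bits above the narrow width, so narrowing it is only
// safe when those bits are known to be zero.
constexpr std::uint32_t halfSumWide(std::uint32_t a, std::uint32_t b) {
  return ((a + b) >> 1) & kMask16;
}

constexpr std::uint32_t halfSumNarrow(std::uint32_t a, std::uint32_t b) {
  return (((a + b) & kMask16) >> 1) & kMask16;
}
```

Addition and multiplication give the same low 16 bits in either form, while the shift example differs, which hints at why the usable width has to be determined per subtree rather than assumed globally.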
Metric: size..text

Program                                                                           results      results0     diff
test-suite :: SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant.test  92549.00     92609.00     0.1%
test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test              663381.00    663493.00    0.0%
test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test               663381.00    663493.00    0.0%
test-suite :: MultiSource/Benchmarks/Bullet/bullet.test                           307182.00    307214.00    0.0%
test-suite :: External/SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test         1394420.00   1394484.00   0.0%
test-suite :: External/SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test          1394420.00   1394484.00   0.0%
test-suite :: External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test            2040257.00   2040273.00   0.0%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test          12396098.00  12395858.00  -0.0%
test-suite :: External/SPEC/CINT2006/445.gobmk/445.gobmk.test                     909944.00    909768.00    -0.0%
SingleSource/Benchmarks/Adobe-C++/simple_types_loop_invariant - 4 scalar
instructions remain scalar (good).
Spec2017/x264 - the whole function idct4x4dc is vectorized using <16
x i16> instead of <16 x i32>, also zext/trunc are removed. In other
places last vector zext/sext removed and replaced by
extractelement + scalar zext/sext pair.
MultiSource/Benchmarks/Bullet/bullet - reduce or <4 x i32> replaced by
reduce or <4 x i8>
Spec2017/imagick - Removed extra zext from 2 packs of the operations.
Spec2017/parest - Removed extra zext, replaced by extractelement+scalar
zext
Spec2017/blender - the whole bunch of vector zext/sext replaced by
extractelement+scalar zext/sext, some extra code vectorized in smaller
types.
Spec2006/gobmk - fixed cost estimation, some small code remains scalar.
Reviewers: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/84334
|
|
types" (#84330)
Reverts llvm/llvm-project#80378, which caused Buildbot failures that did
not show up with check-llvm or CI.
|