Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
This reverts commit ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2.
The changes broke several bots:
https://lab.llvm.org/buildbot/#/builders/176/builds/3408
https://lab.llvm.org/buildbot/#/builders/198/builds/4028
https://lab.llvm.org/buildbot/#/builders/197/builds/8491
|
|
|
|
rocm-device-libs and llpc were avoiding using f64 sqrt
intrinsics in favor of their own expansions. Port the
expansion into the backend. Both of these users should be
updated to call the intrinsic instead.
The library and llpc expansions are slightly different.
llpc uses an ldexp to do the scale; the library uses a multiply.
Use ldexp to do the scale instead of the multiply.
I believe v_ldexp_f64 and v_mul_f64 are always the same number of
cycles, but it's cheaper to materialize the 32-bit integer constant
than the 64-bit double constant.
The libraries have another fast version of sqrt which will
be handled separately.
I am tempted to do this in an IR expansion instead. In the IR
we could take advantage of computeKnownFPClass to avoid
the 0-or-inf argument check.
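
A minimal sketch of the scaling idea, not the backend code: `scaledSqrt` is a hypothetical name, `std::sqrt` stands in for the iterative core of the expansion, and the threshold and scale factors are assumed values chosen for illustration. It relies on sqrt(x * 2^(2k)) == sqrt(x) * 2^k, so very small inputs can be scaled up with ldexp before the core runs and the result scaled back down afterwards.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper illustrating the ldexp-based scale. std::sqrt stands in
// for the iterative core; the constants below are assumptions.
double scaledSqrt(double X) {
  const double ScaleThreshold = std::ldexp(1.0, -767); // assumed cutoff
  const int ScaleUp = 256;                             // even exponent
  const int ScaleDown = -128;                          // minus half of ScaleUp

  // Scale tiny (including denormal) inputs up. The comparison is false for NaN.
  const bool NeedsScale = X < ScaleThreshold;
  const double Scaled = NeedsScale ? std::ldexp(X, ScaleUp) : X;

  double Root = std::sqrt(Scaled);
  if (NeedsScale)
    Root = std::ldexp(Root, ScaleDown); // undo the scale on the result

  // The 0-or-inf argument check: these inputs are returned unchanged.
  if (X == 0.0 || X == std::numeric_limits<double>::infinity())
    return X;
  return Root;
}
```

Both ldexp steps only adjust the exponent, so for inputs in this range they introduce no rounding of their own.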
|
|
Almost all permutations of the flags are potentially relevant.
|
|
|
|
Adds a TODO for checking inlining opportunities while traversing
the users of the specialization arguments. This was brought up in
the review of D154852.
|
|
Since the spec doesn't describe these behaviors as invalid,
llvm-mc should accept them and leave the handling to the hardware.
Differential Revision: https://reviews.llvm.org/D155669
|
|
for the false lanes.
Differential Revision: https://reviews.llvm.org/D155972
|
|
This patch allows constant folding of PHIs when estimating the user
bonus. Phi nodes are a special case since some of their inputs may
remain unresolved until all the specialization arguments have been
processed by the InstCostVisitor. Therefore, we keep a list of dead
basic blocks and then lazily visit the Phi nodes once the user bonus
has been computed for all the specialization arguments.
Differential Revision: https://reviews.llvm.org/D154852
|
|
SetVector's default template parameter to SmallVector<*, 0>"
This is failing on Windows MSVC builds:
llvm\unittests\Support\ThreadPool.cpp(380): error C2440: 'return': cannot convert from 'Vector' to 'std::vector<llvm::BitVector,std::allocator<llvm::BitVector>>'
with
[
Vector=llvm::SmallVector<llvm::BitVector,0>
]
|
|
|
|
Differential Revision: https://reviews.llvm.org/D156195
|
|
|
|
that have the same content. NFC.
|
|
/Users/jiefu/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMachineCFGStructurizer.cpp:2603:10: error: variable 'CNI' set but not used [-Werror,-Wunused-but-set-variable]
auto CNI = CI;
^
1 error generated.
|
|
Similar to D156016 for MapVector.
|
|
This reverts commit eea9258648ce73507f6f85c395de978af659d498.
That commit triggered crashes in the following testcase:
$ cat reduced.c
typedef struct {
int a[8]
} b;
typedef struct {
b *c;
short d
} e;
void f() {
int g;
char *h;
e *i = f;
short j = i->d;
int a = i->c->a[0];
for (;;)
for (; g < a; g++) {
*h = j * i->d >> 8;
h++;
}
}
$ clang -target aarch64-linux-gnu -w -c -O2 reduced.c
|
|
Reduce the scope of some variables.
Replace an if with an assertion.
Reviewed By: kmitropoulou
Differential Revision: https://reviews.llvm.org/D156140
|
|
Reviewed By: wangpc
Differential Revision: https://reviews.llvm.org/D156200
|
|
Similar to D117454, add vl patterns and test cases.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155466
|
|
Reviewed By: pengfei, RKSimon, skan
Differential Revision: https://reviews.llvm.org/D155798
|
|
Depends on D152706
Solves SWDEV-408279
Reviewed By: #amdgpu, arsenm
Differential Revision: https://reviews.llvm.org/D155699
|
|
This allows PromoteAlloca to not be reliant on a second SROA run to remove the alloca completely. It just does the full transformation directly.
Note PromoteAlloca is still reliant on SROA running first to
canonicalize the IR. For instance, PromoteAlloca will no longer handle aggregate types because those should be simplified by SROA before reaching the pass.
Reviewed By: #amdgpu, arsenm
Differential Revision: https://reviews.llvm.org/D152706
|
|
CUDA and HIP have kernel attributes to tune the code generation (in the
backend). To reuse this functionality for OpenMP target regions we
introduce the `ompx_attribute` clause that takes these kernel
attributes and emits code as if they had been attached to the kernel
function (which is implicitly generated).
To limit the impact, we only support three kernel attributes:
`amdgpu_waves_per_eu`, for AMDGPU
`amdgpu_flat_work_group_size`, for AMDGPU
`launch_bounds`, for NVPTX
The existing implementations of those attributes are used for error
checking and code generation. `ompx_attribute` can be attached to any
executable target region and it can hold more than one kernel attribute.
Differential Revision: https://reviews.llvm.org/D156184
|
|
SmallVector<*, 0> is often a better replacement for std::vector :
both the object size and the code size are smaller.
(SmallMapVector uses SmallVector as well, but it is not common.)
clang size decreases by 0.0226%.
instructions:u decreases by 0.037% when compiling a sqlite3 amalgamation.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D156016
|
|
into %t instead
The cwd of the test might not be writable.
|
|
This came up in the context of #63169 - if this assert were in place it
would've been much easier to reduce the test case.
|
|
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D155784
|
|
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156092
|
|
Implement isSExtCheaperThanZExt.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Differential Revision: https://reviews.llvm.org/D154919
|
|
Add test case showing suboptimal codegen when zero extending.
Signed-off-by: WANG Rui <wangrui@loongson.cn>
Reviewed By: xen0n
Differential Revision: https://reviews.llvm.org/D154918
|
|
The author of the following files is licongtian <licongtian@loongson.cn>:
- clang/lib/Basic/Targets/LoongArch.cpp
- llvm/lib/Target/LoongArch/LoongArchAsmPrinter.cpp
- llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
The files mentioned above implement InlineAsm for LSX and LASX as follows:
- Enable clang parsing LSX/LASX register name, such as $vr0.
- Support the case where the operand type is 128-bit or 256-bit when the
constraint is 'f'.
- Support the way of specifying LSX/LASX register by using constraint,
such as "={$xr0}".
- Support the operand modifiers 'u' and 'w'.
- Support and legalize the data types and register classes involved in
LSX/LASX in the lowering process.
Reviewed By: xen0n, SixWeining
Differential Revision: https://reviews.llvm.org/D154931
|
|
|
|
A libatomic that supports inline quadword atomic operations has been released on AIX, so compatibility is no longer an issue and we can enable quadword atomics by default.
Reviewed By: #powerpc, nemanjai
Differential Revision: https://reviews.llvm.org/D151312
|
|
This makes it possible to use canonicalize to perform a dynamic check
for whether denormal flushing is enabled, which will fold out when the
denormal mode is known. Previously it would only fold if denormal
flushing were known enabled.
https://reviews.llvm.org/D156107
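
A rough host-side sketch of what such a dynamic check looks like, under stated assumptions: `denormalFlushingLooksEnabled` is a hypothetical name, and a multiply by 1.0 through `volatile` values stands in for the canonicalize operation. It is not the intrinsic itself.

```cpp
#include <limits>

// Hypothetical helper: reports whether a denormal value survives an
// arithmetic operation in the current floating-point environment.
// The multiply by 1.0 is a stand-in for canonicalize (an assumption);
// volatile keeps the compiler from folding the computation away.
bool denormalFlushingLooksEnabled() {
  volatile double Tiny = std::numeric_limits<double>::denorm_min();
  volatile double One = 1.0;
  volatile double Canon = Tiny * One;
  // If denormals are flushed, the product compares equal to zero.
  return Canon == 0.0;
}
```

When the denormal mode is known at compile time, a check of this shape is the kind of code that can fold to a constant.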
|
|
UUIDs and the `installapi` flag are no longer useful in the recent Apple linker/tapi.
They are being removed because they record how a library was built rather than
describing the library itself. TBD files now only track information that is
important for link-time dependencies.
Reviewed By: ributzka
Differential Revision: https://reviews.llvm.org/D149861
|
|
This reverts commit 6c48f57c14dcfe2410afcb4c6778dcbb40d294b5.
Build broken on GCC.
|
|
Unlike fmaxnum and fminnum, these operations propagate nan and
consider -0.0 to be less than +0.0.
Without Zfa, we don't have a single instruction for this. The
lowering I've used forces the other input to nan if one input
is a nan. If both inputs are nan, they get swapped. Then use
the fmax or fmin instruction.
New ISD nodes are needed because fmaxnum/fminnum do not define
the order of -0.0 and +0.0.
This lowering ensures that snans are quieted (though that is probably not
required in the default environment). It also ensures non-canonical nans
are canonicalized, though I'm also not sure that's needed.
Another option could be to use fmax/fmin and then overwrite the
result based on the inputs being nan, but I'm not sure we can do
that with any less code.
Future work will handle the nonans FMF and the case where
we can prove the input isn't nan.
This does fix the crash in #64022, but we need to do more work
to avoid scalarization.
Reviewed By: fakepaper56
Differential Revision: https://reviews.llvm.org/D156069
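
The lowering described above can be sketched on the host as follows. This is an illustration only: `fmaxInstr` and `fmaximumSketch` are hypothetical names, and `fmaxInstr` models the assumed behavior of the fmax instruction (it returns the non-nan operand when exactly one input is nan and treats -0.0 as less than +0.0).

```cpp
#include <cmath>
#include <limits>

// Hypothetical model of the fmax instruction (assumed semantics): a single
// nan input is ignored, two nan inputs give a quiet nan, and -0.0 < +0.0.
double fmaxInstr(double A, double B) {
  if (std::isnan(A) && std::isnan(B))
    return std::numeric_limits<double>::quiet_NaN();
  if (std::isnan(A))
    return B;
  if (std::isnan(B))
    return A;
  if (A == 0.0 && B == 0.0)
    return std::signbit(A) ? B : A; // prefer +0.0 when the signs differ
  return A > B ? A : B;
}

// Sketch of the lowering: if one input is nan, force the other input to nan
// too (two nans end up swapped), then use the fmax instruction.
double fmaximumSketch(double X, double Y) {
  const double NewY = std::isnan(X) ? X : Y;
  const double NewX = std::isnan(Y) ? Y : X;
  return fmaxInstr(NewX, NewY);
}
```

The fminimum case would mirror this with an fmin model in place of `fmaxInstr`.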
|
|
This is a preparation for ARM64EC/ARM64X binaries, which may contain both ARM64
and x86_64 code in the same file. llvm-objdump already has partial support for
mixing disassemblers for ARM thumb mode support. However, for ARM64EC we can't
share MCContext, MCInstrAnalysis and PrettyPrinter instances. This patch
provides additional abstraction which makes adding mixed code support later in
the series easier.
Reviewed By: jhenderson, MaskRay
Differential Revision: https://reviews.llvm.org/D149093
|
|
This is fixing a mistake in 4f4f49137.
|
|
|
|
|
|
SymbolSet is a structure that acts as a simple container class for exported symbols that
belong to a library interface. It allows tapi to decouple the globals
from the other library attributes. Symbols are uniqued by name and `kind`, and each one records its assigned target triples.
Reviewed By: zixuw
Differential Revision: https://reviews.llvm.org/D149860
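
A minimal sketch of a container uniqued by name and kind, assuming nothing about the real class: `SymbolSetSketch`, `SymbolKindSketch`, and the method names are hypothetical and only illustrate the idea of merging targets into a single entry per (kind, name) pair.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical symbol kinds, for illustration only.
enum class SymbolKindSketch { GlobalSymbol, ObjCClass };

// Hypothetical container: one entry per (kind, name) pair, each entry
// holding the set of target triples assigned to that symbol.
class SymbolSetSketch {
  std::map<std::pair<SymbolKindSketch, std::string>, std::set<std::string>>
      Symbols;

public:
  // Adding an existing (kind, name) pair merges the target into its entry.
  void add(SymbolKindSketch Kind, const std::string &Name,
           const std::string &Target) {
    Symbols[{Kind, Name}].insert(Target);
  }

  // Number of unique symbols.
  std::size_t size() const { return Symbols.size(); }

  // Number of targets recorded for a symbol, or 0 if it is absent.
  std::size_t targetCount(SymbolKindSketch Kind,
                          const std::string &Name) const {
    auto It = Symbols.find({Kind, Name});
    return It == Symbols.end() ? 0 : It->second.size();
  }
};
```

Keeping the symbols in their own container is what lets them be handled separately from the other library attributes.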
|
|
Differential Revision: https://reviews.llvm.org/D149091
|
|
ARM64EC/ARM64X binaries use ARM64 or AMD64 machine types, but provide
additional CHPE metadata that may be used to distinguish them from
pure ARM64/AMD64 binaries.
Reviewed By: jhenderson, MaskRay, mstorsjo
Differential Revision: https://reviews.llvm.org/D149091
|
|
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D149440
|
|
Any users of LoopAccessAnalysis should use MaxSafeVectorWidthInBits.
Differential Revision: https://reviews.llvm.org/D156034
|
|
|