path: root/llvm
Age Commit message Author Files Lines
2023-07-25Clear release notes for 18.xllvmorg-18-initTobias Hieta1-288/+0
2023-07-25Bump trunk version to 18.0.0gitTobias Hieta3-3/+3
2023-07-25Revert "[OpenMP] Add the `ompx_attribute` clause for target directives"Aaron Ballman1-63/+28
This reverts commit ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2. The changes broke several bots: https://lab.llvm.org/buildbot/#/builders/176/builds/3408 https://lab.llvm.org/buildbot/#/builders/198/builds/4028 https://lab.llvm.org/buildbot/#/builders/197/builds/8491 https://lab.llvm.org/buildbot/#/builders/197/builds/8491
2023-07-25AMDGPU: Remove trailing whitespace from documentationMatt Arsenault1-4/+4
2023-07-25AMDGPU: Correctly expand f64 sqrt intrinsicMatt Arsenault13-881/+5932
rocm-device-libs and llpc were avoiding using f64 sqrt intrinsics in favor of their own expansions. Port the expansion into the backend. Both of these users should be updated to call the intrinsic instead. The library and llpc expansions are slightly different. llpc uses an ldexp to do the scale; the library uses a multiply. Use ldexp to do the scale instead of the multiply. I believe v_ldexp_f64 and v_mul_f64 are always the same number of cycles, but it's cheaper to materialize the 32-bit integer constant than the 64-bit double constant. The libraries have another fast version of sqrt which will be handled separately. I am tempted to do this in an IR expansion instead. In the IR we could take advantage of computeKnownFPClass to avoid the 0-or-inf argument check.
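The ldexp-based scaling described in this message can be illustrated with a minimal host-side sketch. This is not the backend expansion itself: `scaled_sqrt`, the threshold and the scale exponents are hypothetical choices made for illustration, and `std::sqrt` stands in for the hardware approximation and refinement steps.

```cpp
#include <cmath>

// Hypothetical helper, not code from the commit: computes sqrt(x) by
// pre-scaling very small inputs with ldexp so the core computation only
// sees values well inside the normal range.
double scaled_sqrt(double x) {
  // Illustrative threshold (2^-767); an assumption, not taken from the source.
  const double threshold = std::ldexp(1.0, -767);
  const bool needs_scale = x < threshold;

  // Scale up by an even power of two so that the square root of the
  // scale factor is itself an exact power of two.
  const double scaled = std::ldexp(x, needs_scale ? 256 : 0);

  // Stand-in for the approximation + refinement sequence on the device.
  const double root = std::sqrt(scaled);

  // sqrt(x * 2^256) == sqrt(x) * 2^128, so undo half of the exponent.
  return std::ldexp(root, needs_scale ? -128 : 0);
}
```

The scale is applied with an integer exponent, which mirrors the point made above that a small integer constant is cheaper to materialize than a 64-bit double multiplier.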
2023-07-25AMDGPU: Add more sqrt f64 lowering testsMatt Arsenault2-303/+2727
Almost all permutations of the flags are potentially relevant.
2023-07-25Attributor: Fix typoMatt Arsenault1-1/+1
2023-07-25[FuncSpec][NFC] Leave a comment for future improvements.Alexandros Lamprineas1-0/+3
Adds a TODO for checking inlining opportunities while traversing the users of the specialization arguments. This was brought up in the review of D154852.
2023-07-25[RISCV] Remove zvk uimm constraints4vtomat6-25/+21
Since the spec doesn't describe these behaviors as invalid, llvm-mc should accept them and leave their handling to the hardware. Differential Revision: https://reviews.llvm.org/D155669
2023-07-25[SVE] Add vselect(mla/mls) patterns for cases where a multiplicand is used for the false lanes.Paul Walker3-129/+128
Differential Revision: https://reviews.llvm.org/D155972
2023-07-25[FuncSpec] Add Phi nodes to the InstCostVisitor.Alexandros Lamprineas3-7/+143
This patch allows constant folding of PHIs when estimating the user bonus. Phi nodes are a special case since some of their inputs may remain unresolved until all the specialization arguments have been processed by the InstCostVisitor. Therefore, we keep a list of dead basic blocks and then lazily visit the Phi nodes once the user bonus has been computed for all the specialization arguments. Differential Revision: https://reviews.llvm.org/D154852
2023-07-25Revert rGfae7b98c221b5b28797f7b56b656b6b819d99f27 "[Support] Change SetVector's default template parameter to SmallVector<*, 0>"Simon Pilgrim3-32/+20
This is failing on Windows MSVC builds: llvm\unittests\Support\ThreadPool.cpp(380): error C2440: 'return': cannot convert from 'Vector' to 'std::vector<llvm::BitVector,std::allocator<llvm::BitVector>>' with [ Vector=llvm::SmallVector<llvm::BitVector,0> ]
2023-07-25[gn build] Port 6084ee742064LLVM GN Syncbot1-0/+1
2023-07-25[docs] Add llvm & clang release notes for LoongArchWeining Lu1-0/+4
Differential Revision: https://reviews.llvm.org/D156195
2023-07-25[JITLink][PowerPC] Pre-commit test for D155925. NFC.Kai Luo3-0/+67
2023-07-25[RISCV] Merge rv32/rv64 vector narrowing integer right shift intrinsic tests that have the same content. NFC.Jim Lin4-4592/+728
2023-07-25[AMDGPU] Remove unused variable 'CNI' in /AMDGPUMachineCFGStructurizer.cpp (NFC)Jie Fu1-3/+0
/Users/jiefu/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMachineCFGStructurizer.cpp:2603:10: error: variable 'CNI' set but not used [-Werror,-Wunused-but-set-variable] auto CNI = CI; ^ 1 error generated.
2023-07-25[Support] Change SetVector's default template parameter to SmallVector<*, 0>Fangrui Song3-20/+32
Similar to D156016 for MapVector.
2023-07-25Revert "[LV] Re-use existing broadcast value for live-ins."Martin Storsjö19-587/+742
This reverts commit eea9258648ce73507f6f85c395de978af659d498. That commit triggered crashes in the following testcase: $ cat reduced.c typedef struct { int a[8] } b; typedef struct { b *c; short d } e; void f() { int g; char *h; e *i = f; short j = i->d; int a = i->c->a[0]; for (;;) for (; g < a; g++) { *h = j * i->d >> 8; h++; } } $ clang -target aarch64-linux-gnu -w -c -O2 reduced.c
2023-07-25[DAGCombiner] Minor improvements to foldAndOrOfSETCC. NFCCraig Topper1-5/+4
Reduce the scope of some variables. Replace an if with an assertion. Reviewed By: kmitropoulou Differential Revision: https://reviews.llvm.org/D156140
2023-07-24[RISCV] Don't print a tab after mnemonics that don't have operands.Craig Topper1-1/+1
Reviewed By: wangpc Differential Revision: https://reviews.llvm.org/D156200
2023-07-25[RISCV] Match ext_vl+sra_vl/srl_vl+trunc_vector_vl to vnsra.wv/vnsrl.wvLiaoChunyu4-0/+236
Similar to D117454, this adds VL patterns and test cases. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D155466
2023-07-25[X86] Support -march=graniterapids-d and update -march=graniterapidsFreddy Ye7-5/+26
Reviewed By: pengfei, RKSimon, skan Differential Revision: https://reviews.llvm.org/D155798
2023-07-25[AMDGPU] Allow vector access types in PromoteAllocaToVectorpvanhout3-32/+480
Depends on D152706 Solves SWDEV-408279 Reviewed By: #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D155699
2023-07-25[AMDGPU] Use SSAUpdater in PromoteAllocapvanhout10-341/+570
This allows PromoteAlloca to not be reliant on a second SROA run to remove the alloca completely. It just does the full transformation directly. Note PromoteAlloca is still reliant on SROA running first to canonicalize the IR. For instance, PromoteAlloca will no longer handle aggregate types because those should be simplified by SROA before reaching the pass. Reviewed By: #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D152706
2023-07-24[OpenMP] Add the `ompx_attribute` clause for target directivesJohannes Doerfert1-28/+63
CUDA and HIP have kernel attributes to tune the code generation (in the backend). To reuse this functionality for OpenMP target regions we introduce the `ompx_attribute` clause that takes these kernel attributes and emits code as if they had been attached to the kernel function (which is implicitly generated). To limit the impact, we only support three kernel attributes: `amdgpu_waves_per_eu` (for AMDGPU), `amdgpu_flat_work_group_size` (for AMDGPU), and `launch_bounds` (for NVPTX). The existing implementations of those attributes are used for error checking and code generation. `ompx_attribute` can be attached to any executable target region and it can hold more than one kernel attribute. Differential Revision: https://reviews.llvm.org/D156184
2023-07-24[Support] Change MapVector's default template parameter to SmallVector<*, 0>Fangrui Song2-7/+8
SmallVector<*, 0> is often a better replacement for std::vector: both the object size and the code size are smaller. (SmallMapVector uses SmallVector as well, but it is not common.) clang size decreases by 0.0226%. instructions:u decreases by 0.037% when compiling a sqlite3 amalgamation. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D156016
2023-07-25test/.../print-dot-dom.ll: Avoid writing to cwd of test by creating/cding into %t insteadDavid Blaikie1-0/+3
The cwd of the test might not be writable.
2023-07-25ADT: ArrayRef: Assert that begin <= endDavid Blaikie1-1/+3
This came up in the context of #63169 - if this assert were in place it would've been much easier to reduce the test case.
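As a rough illustration of the kind of check this adds, here is a simplified stand-in for a pointer-range view. `SimpleArrayRef` is a hypothetical class written for this sketch and is not the real llvm::ArrayRef.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for a non-owning array view.
template <typename T> class SimpleArrayRef {
  const T *Data = nullptr;
  std::size_t Length = 0;

public:
  SimpleArrayRef() = default;

  // Construct from a [Begin, End) pointer range. A reversed range would
  // otherwise wrap around to a huge length, so reject it in assert builds.
  SimpleArrayRef(const T *Begin, const T *End)
      : Data(Begin), Length(static_cast<std::size_t>(End - Begin)) {
    assert(Begin <= End && "range must not be reversed");
  }

  std::size_t size() const { return Length; }
  bool empty() const { return Length == 0; }

  const T &operator[](std::size_t Index) const {
    assert(Index < Length && "index out of range");
    return Data[Index];
  }
};
```

With the assertion in place, a reversed range fails at construction time instead of surfacing later as an out-of-bounds access far from the cause.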
2023-07-25[X86] Update features for sierraforest, grandridgeFreddy Ye2-2/+4
Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D155784
2023-07-25[RISCV] Add a common class for cm.push, cm.popret, cm.popretz and cm.pop.Jim Lin1-42/+16
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156092
2023-07-25[LoongArch] Implement isSExtCheaperThanZExtWANG Rui13-85/+83
Implement isSExtCheaperThanZExt. Signed-off-by: WANG Rui <wangrui@loongson.cn> Differential Revision: https://reviews.llvm.org/D154919
2023-07-25[LoongArch] Add test case showing suboptimal codegen when zero extendingWANG Rui1-0/+17
Add test case showing suboptimal codegen when zero extending. Signed-off-by: WANG Rui <wangrui@loongson.cn> Reviewed By: xen0n Differential Revision: https://reviews.llvm.org/D154918
2023-07-25[LoongArch] Support InlineAsm for LSX and LASXchenli6-1/+174
The author of the following files is licongtian <licongtian@loongson.cn>:
- clang/lib/Basic/Targets/LoongArch.cpp
- llvm/lib/Target/LoongArch/LoongArchAsmPrinter.cpp
- llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
The files mentioned above implement InlineAsm for LSX and LASX as follows:
- Enable clang to parse LSX/LASX register names, such as $vr0.
- Support operands whose type is 128-bit or 256-bit when the constraint is 'f'.
- Support specifying an LSX/LASX register through a constraint, such as "={$xr0}".
- Support the operand modifiers 'u' and 'w'.
- Support and legalize the data types and register classes involved in LSX/LASX in the lowering process.
Reviewed By: xen0n, SixWeining Differential Revision: https://reviews.llvm.org/D154931
2023-07-24[llvm-objdump][test] Improve elf-aarch64-mapping-symbols.testFangrui Song1-15/+24
2023-07-25[PowerPC][AIX] Enable quadword atomics by default for AIXKai Luo5-297/+254
On AIX, a libatomic supporting inline quadword atomic operations has been released, so compatibility is no longer an issue and we can enable quadword atomics by default. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D151312
2023-07-24ConstantFolding: Constant fold denormal inputs to canonicalize for IEEEMatt Arsenault2-12/+244
This makes it possible to use canonicalize to perform a dynamic check for whether denormal flushing is enabled, which will fold out when the denormal mode is known. Previously it would only fold if denormal flushing were known enabled. https://reviews.llvm.org/D156107
2023-07-24[TextAPI] Remove TBD file attributes that aren't used anymore.Cyndy Ishida14-181/+10
UUIDs and the `installapi` flag are no longer useful in recent Apple linker/tapi. The reason for removing them is that these attributes record how a library was built rather than describing the library itself. TBD files now only track information that is important for link-time dependencies. Reviewed By: ributzka Differential Revision: https://reviews.llvm.org/D149861
2023-07-24Revert "[llvm-objdump] [NFC] Factor out DisassemblerTarget class."Jacek Caban1-121/+77
This reverts commit 6c48f57c14dcfe2410afcb4c6778dcbb40d294b5. Build broken on GCC.
2023-07-24[RISCV] Add lowering for scalar fmaximum/fminimum.Craig Topper9-20/+521
Unlike fmaxnum and fminnum, these operations propagate nan and consider -0.0 to be less than +0.0. Without Zfa, we don't have a single instruction for this. The lowering I've used forces the other input to nan if one input is a nan. If both inputs are nan, they get swapped. Then use the fmax or fmin instruction. New ISD nodes are needed because fmaxnum/fminnum do not define the order of -0.0 and +0.0. This lowering ensures the snans are quieted (though that is probably not required in the default environment). It also ensures non-canonical nans are canonicalized, though I'm also not sure that's needed. Another option could be to use fmax/fmin and then overwrite the result based on the inputs being nan, but I'm not sure we can do that with any less code. Future work will handle nonans FMF and the case where we can prove the input isn't nan. This does fix the crash in #64022, but we need to do more work to avoid scalarization. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D156069
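The nan-forcing step in this lowering can be sketched on the host as follows. Both function names are hypothetical: `riscv_fmax` is a model, written for this sketch, of an fmax that returns the non-nan input and orders -0.0 below +0.0, and `fmaximum_lowered` applies the select-then-fmax sequence described in the message.

```cpp
#include <cmath>
#include <limits>

// Hypothetical model of a hardware fmax: if exactly one input is nan the
// other input is returned, and -0.0 is treated as less than +0.0.
double riscv_fmax(double a, double b) {
  if (std::isnan(a) && std::isnan(b))
    return std::numeric_limits<double>::quiet_NaN();
  if (std::isnan(a))
    return b;
  if (std::isnan(b))
    return a;
  if (a == 0.0 && b == 0.0)
    return std::signbit(a) ? b : a; // prefer +0.0 over -0.0
  return a > b ? a : b;
}

// Sketch of the described lowering: if one input is nan, force the other
// input to that nan as well, so the fmax sees nan on both sides and
// therefore propagates it. If both are nan, the inputs end up swapped.
double fmaximum_lowered(double x, double y) {
  const double new_y = std::isnan(x) ? x : y;
  const double new_x = std::isnan(y) ? y : x;
  return riscv_fmax(new_x, new_y);
}
```

The fminimum case would follow the same shape with an fmin model in place of the fmax model.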
2023-07-24[llvm-objdump] [NFC] Factor out DisassemblerTarget class.Jacek Caban1-77/+121
This is a preparation for ARM64EC/ARM64X binaries, which may contain both ARM64 and x86_64 code in the same file. llvm-objdump already has partial support for mixing disassemblers for ARM thumb mode support. However, for ARM64EC we can't share MCContext, MCInstrAnalysis and PrettyPrinter instances. This patch provides additional abstraction which makes adding mixed code support later in the series easier. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D149093
2023-07-24[RISCV] Adjust memcpy lowering test coverage w/VPhilip Reames1-618/+95
This is fixing a mistake in 4f4f49137.
2023-07-24[RISCV] Add memcpy lowering test coverage with and without VPhilip Reames3-304/+3007
2023-07-24[gn build] Port 0882c70df222LLVM GN Syncbot1-0/+1
2023-07-24[TextAPI] Introduce SymbolSetCyndy Ishida8-135/+324
SymbolSet is a structure that acts as a simple container class for exported symbols that belong to a library interface. It allows tapi to decouple the globals from the other library attributes. It's uniqued by symbol name and `kind`, which all contain their assigned target triples. Reviewed By: zixuw Differential Revision: https://reviews.llvm.org/D149860
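A container uniqued by symbol name and kind, with each entry carrying its targets, might look roughly like the sketch below. Everything here is hypothetical (`SimpleSymbolSet`, `SymbolKind`, and the use of plain strings for target triples); it is not the TextAPI class introduced by the commit.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical symbol kinds, for illustration only.
enum class SymbolKind { GlobalSymbol, ObjCClass };

// Hypothetical container: one entry per (kind, name) pair, each holding
// the set of target triples that export that symbol.
class SimpleSymbolSet {
  using Key = std::pair<SymbolKind, std::string>;
  std::map<Key, std::set<std::string>> Symbols;

public:
  // Adding the same (kind, name) twice merges the targets into one entry.
  void add(SymbolKind Kind, const std::string &Name,
           const std::string &Target) {
    Symbols[Key{Kind, Name}].insert(Target);
  }

  std::size_t size() const { return Symbols.size(); }

  // Returns the targets for a symbol, or nullptr if it is not present.
  const std::set<std::string> *targets(SymbolKind Kind,
                                       const std::string &Name) const {
    auto It = Symbols.find(Key{Kind, Name});
    return It == Symbols.end() ? nullptr : &It->second;
  }
};
```

Keying on the pair rather than on the name alone lets two symbols share a name while differing in kind, which is what uniquing by name and kind implies.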
2023-07-24[llvm-objdump] [NFC] Add missing REQUIRES to arm64ec.yaml.Jacek Caban1-0/+2
Differential Revision: https://reviews.llvm.org/D149091
2023-07-24Recognize ARM64EC binaries in COFFObjectFile::getMachine.Jacek Caban2-1/+98
ARM64EC/ARM64X binaries use ARM64 or AMD64 machine types, but provide additional CHPE metadata that may be used to distinguish them from pure ARM64/AMD64 binaries. Reviewed By: jhenderson, MaskRay, mstorsjo Differential Revision: https://reviews.llvm.org/D149091
2023-07-24[yaml2obj] Add support for load config section data.Jacek Caban7-6/+498
Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D149440
2023-07-24[LAA] Make MaxSafeDepDistBytes private in LoopAccessAnalysis. NFCMichael Maitland1-1/+0
Any users of LoopAccessAnalysis should use MaxSafeVectorWidthInBits. Differential Revision: https://reviews.llvm.org/D156034
2023-07-24[gn build] Port 7c36b416b6b1LLVM GN Syncbot1-0/+1