Age | Commit message | Author | Files | Lines |
|
|
|
|
|
Added in D156163
|
|
scf.forall
|
|
|
|
llvm/unittests/Target/AArch64/AArch64SVESchedPseudoTest.cpp:38:10: error: module @llvm-project//llvm/unittests:target_aarch64_tests does not depend on a module exporting 'AArch64GenInstrInfo.inc'
Test was added in 57329ca94630742ce3b0f6b239b263d757a9eb4a
|
|
|
|
|
|
70c2e0618a0f3c09ed7149d88b4987b932eb6705
|
|
|
|
|
|
|
|
|
|
|
|
The new printf writer design focuses on optimizing the fast path. It
inlines any write to a buffer or string, and by handling buffering
itself can more effectively work with both internal and external file
implementations. The overflow hook should allow for expansion to
asprintf with minimal extra code.
Reviewed By: sivachandra
Differential Revision: https://reviews.llvm.org/D153999
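The buffering idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `WriteBuffer`, `OverflowHook`, and `append_to_string` are hypothetical names chosen for this example. The fast path is a bounds check plus a `memcpy`; the hook runs only when the buffer cannot hold the incoming data, which is also the point where a file write or an asprintf-style growing string could be plugged in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical sketch; these names are NOT the LLVM libc API.
struct WriteBuffer {
  // Called with data that no longer fits in the buffer. Returns 0 on success.
  using OverflowHook = int (*)(const char *data, std::size_t len, void *ctx);

  char *buf;
  std::size_t cap;
  std::size_t pos;
  OverflowHook hook;
  void *ctx;

  WriteBuffer(char *b, std::size_t c, OverflowHook h, void *x)
      : buf(b), cap(c), pos(0), hook(h), ctx(x) {}

  // Fast path: small enough to inline, a single bounds check and a memcpy.
  int write(const char *data, std::size_t len) {
    if (len <= cap - pos) {
      std::memcpy(buf + pos, data, len);
      pos += len;
      return 0;
    }
    return overflow(data, len);
  }

  // Hands any buffered bytes to the hook and empties the buffer.
  int flush() {
    if (pos == 0)
      return 0;
    int result = hook(buf, pos, ctx);
    pos = 0;
    return result;
  }

  // Slow path: drain the buffer, then either buffer the new data or, if it
  // is larger than the whole buffer, pass it straight to the hook.
  int overflow(const char *data, std::size_t len) {
    int result = flush();
    if (result != 0)
      return result;
    if (len > cap)
      return hook(data, len, ctx);
    std::memcpy(buf, data, len);
    pos = len;
    return 0;
  }
};

// Example hook (an assumption for this sketch): append to a growing
// std::string, roughly what an asprintf-style expansion would need.
int append_to_string(const char *data, std::size_t len, void *ctx) {
  static_cast<std::string *>(ctx)->append(data, len);
  return 0;
}
```

A caller would construct a `WriteBuffer` over a small stack array, pass `append_to_string` with a `std::string *` as the context, and call `flush()` once formatting is done.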
|
|
|
|
8b5d3ba829c162fd4890fd65a4629ce0715825ee
|
|
|
|
Adds a generic lowering that supports all cases of bufferization.dealloc
and one specialized, more efficient lowering for the simple case. Using
a helper function with for loops in the general case enables
O(|num_dealloc_memrefs|+|num_retain_memrefs|) size of the lowered code.
Depends on D155467
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D155468
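The code-size argument above can be illustrated with a simplified model. This is a sketch under assumptions, not the lowering from the patch: `dealloc_helper` and its parameters are hypothetical, and aliasing is approximated here by comparing base pointers for equality. Because the pairwise comparisons live in loops inside one shared helper, each call site only has to pack its operands into arrays, so the code emitted per op grows with the number of operands and not with the number of pairs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, emitted once and shared by every call site.
// should_free[i] is set when all of the following hold:
//   - conds[i] is true,
//   - no retained memref has the same base pointer,
//   - no earlier dealloc candidate has the same base pointer
//     (avoids freeing one allocation twice).
void dealloc_helper(const std::uintptr_t *dealloc_bases, const bool *conds,
                    std::size_t num_dealloc,
                    const std::uintptr_t *retain_bases, std::size_t num_retain,
                    bool *should_free) {
  for (std::size_t i = 0; i < num_dealloc; ++i) {
    bool free_it = conds[i];
    // Compare against every retained memref.
    for (std::size_t j = 0; j < num_retain && free_it; ++j)
      if (retain_bases[j] == dealloc_bases[i])
        free_it = false;
    // Compare against earlier candidates to skip duplicates.
    for (std::size_t j = 0; j < i && free_it; ++j)
      if (dealloc_bases[j] == dealloc_bases[i])
        free_it = false;
    should_free[i] = free_it;
  }
}
```

The helper still performs a quadratic number of comparisons at run time; only the size of the generated code at each call site is linear in the operand counts.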
|
|
Differential Revision: https://reviews.llvm.org/D155686
|
|
|
|
This patch mostly renames files so that they better reflect the functions they declare.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D155607
|
|
Preparation to update bazel builder to use LLVM 16 release
where layering check was enabled https://reviews.llvm.org/D132779
The current setup missed some backsliding in the layering check, as it was
only on for projects with the check enforced.
Disabled it completely for libc and fixed it for DWARFLinkerParallel.
It would be great to re-enable it for libc later.
|
|
|
|
|
|
|
|
This transform looks for suitable vector transfers from global memory to shared memory and converts them to async device copies.
Differential Revision: https://reviews.llvm.org/D155569
|
|
This is a follow-up to D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155515
|
|
|
|
This fixes the builds for 7e78ecfe10ea9071234de8d385b87d338d280266 (both CMake and Bazel) and trims unnecessary dependencies.
This is achieved by moving the functionality to test/lib/GPU which is a more natural landing pad.
|
|
and Vector dialects
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155450
|
|
Add a simple transform operation to the NVGPU extension that performs
software pipelining of copies to shared memory. The functionality is
extremely minimalistic in this version and only supports copies from
global to shared memory inside an `scf.for` loop with either
`vector.transfer` or `nvgpu.device_async_copy` operations when
pipelining preconditions are already satisfied in the IR. This is the
minimally useful version that uses the more general loop pipeliner in an
NVGPU-specific way. Further extensions and orthogonalizations will be
necessary.
This required a change to the loop pipeliner itself to properly
propagate errors should the predicate generator fail.
This is loosely inspired by the version in IREE, but makes fewer unsafe
assumptions and has a more principled way of communicating decisions.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155223
|
|
* Move passes to `Transforms` directory.
* Add `Utils.h` (will be utilized in a subsequent change).
Differential Revision: https://reviews.llvm.org/D155427
|
|
This is a follow-up to D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155181
|
|
This is a follow-up to D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155174
|
|
|
|
This new option allows users to specify a custom memcpy op.
Differential Revision: https://reviews.llvm.org/D155280
|
|
It also unifies the computation of StridedLayoutAttr. If the stride is
a statically known value, we can just use it.
Differential Revision: https://reviews.llvm.org/D155017
|
|
This is a follow-up to D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155099
|
|
This is a follow-up to D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155076
|
|
|
|
Switch the parsing of command-line options from llvm::cl to OptTable.
The motivation for this change is to continue adding LLVM-based tools
to the llvm driver multicall. For more information about the proposal
and motivation, please see https://discourse.llvm.org/t/rfc-llvm-busybox-proposal/58494
Reviewed By: abrachet
Differential Revision: https://reviews.llvm.org/D154642
|
|
|
|
|
|
|
|
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D155004
|
|
This is the counterpart to the forward dense dataflow analysis and
integrates into the dataflow framework. The implementation follows the
structure of existing dataflow analyses.
Reviewed By: Mogball, phisiart
Differential Revision: https://reviews.llvm.org/D154713
|
|
Differential Revision: https://reviews.llvm.org/D154976
|
|
(global-isel-combiner-matchtable)
|
|
|