Age | Commit message (Collapse) | Author | Files | Lines |
|
OpenMP 6.0 has changed the modifiers on the MAP clause. A previous patch
introduced parsing support for them. This patch introduces
processing of the new forms in semantic checks and in lowering. This
only applies to existing modifiers, which were updated in the 6.0 spec.
Any of the newly introduced modifiers (SELF and REF) are ignored.
|
|
Additionally, add sentinel values <Enum>::First_ and <Enum>::Last_ to
each one of those enums.
This will allow using `enum_seq_inclusive` to generate the list of
enum-typed values of any generated scoped (non-bitmask) enum.
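As a rough illustration of why inclusive sentinels help, the sketch below walks every value of a scoped enum from `First_` to `Last_`. The `Color` enum and the `enumRangeInclusive` helper are hypothetical stand-ins written for this example; in-tree code would use `enum_seq_inclusive` from `llvm/ADT/Sequence.h` rather than this hand-rolled loop.

```cpp
#include <type_traits>
#include <vector>

// Hypothetical generated enum; First_ and Last_ alias the first and last
// real enumerators so that they can serve as inclusive range bounds.
enum class Color { Red, Green, Blue, First_ = Red, Last_ = Blue };

// Collects every value in the closed range [First, Last]. This assumes the
// enumerators are contiguous, which holds for non-bitmask enums like Color.
template <typename E> std::vector<E> enumRangeInclusive(E First, E Last) {
  using U = std::underlying_type_t<E>;
  std::vector<E> Values;
  for (U I = static_cast<U>(First); I <= static_cast<U>(Last); ++I)
    Values.push_back(static_cast<E>(I));
  return Values;
}
```

Because the sentinels are aliases rather than extra enumerators, they do not add a value that switch statements would need to handle.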
|
|
With the advent of intrinsic-less debug-info, we no longer need to
scatter calls to getPrevNonDebugInstruction around the codebase. Remove
most of them -- there are one or two that have the "SkipPseudoOp" flag
turned on, however they don't seem to be in positions where skipping
anything would be reasonable.
|
|
There are no longer debug-info instructions, thus we don't need this
skipping. Hooray!
|
|
Reduction support: https://github.com/llvm/llvm-project/pull/146671
IF support is fixed in this PR
The problem for the IF clause in composite constructs was that wsloop
and simd both operate on the same CanonicalLoopInfo structure: with the
SIMD processed first, followed by the wsloop. Previously the IF clause
generated code like
```
if (cond) {
  while (...) {
    simd_loop_body;
  }
} else {
  while (...) {
    non_simd_loop_body;
  }
}
```
The problem with this is that this invalidates the CanonicalLoopInfo
structure to be processed by the wsloop later. To avoid this, in this
patch I preserve the original loop, moving the IF clause inside of the
loop:
```
while (...) {
  if (cond) {
    simd_loop_body;
  } else {
    non_simd_loop_body;
  }
}
```
On the simple examples I tried, LLVM was able to hoist the if condition
outside of the loop at -O3.
The disadvantage of this is that we cannot add the
llvm.loop.vectorize.enable attribute on either the SIMD or non-SIMD
loops because they both share a loop back edge. There's no way of
solving this without keeping the old design of having two different
loops: which cannot be represented using only one CanonicalLoopInfo
structure. I don't think the presence or absence of this attribute makes
much difference. In my testing it is the llvm.loop.parallel_access
metadata which makes the difference to vectorization. LLVM will
vectorize if legal whether or not this attribute is there in the TRUE
branch. In the FALSE branch this means the loop might be vectorized even
when the condition is false: but I think this is still standards
compliant: OpenMP 6.0 says that when the if clause is false, the construct
should be treated as if the SIMDLEN clause were one. The SIMDLEN clause is defined
as a "hint". For the same reason, SIMDLEN and SAFELEN clauses are
silently ignored when SIMD IF is used.
I think it is better to implement SIMD IF and ignore SIMDLEN and SAFELEN
and some vectorization encouragement metadata when combined with IF than
to ignore IF because IF could have correctness consequences whereas the
rest are optimization hints. For example, the user might use the IF
clause to disable SIMD programmatically when it is known not safe to
vectorize the loop. In this case it is not at all safe to add the
parallel access or SAFELEN metadata.
|
|
Version (#145828)
This pr updates `setDefaultFlags` in `HLSLRootSignature.h` to account
for which version it should initialize the default flag values for.
- Updates `setDefaultFlags` with a `Version` argument and initializes
them to be compliant as described
[here](https://github.com/llvm/wg-hlsl/pull/297).
- Updates `RootSignatureParser` to retain the `Version` and pass this
into `setDefaultFlags`
- Updates all uses of `setDefaultFlags` in test-cases
- Adds some new unit testing to ensure behaviour is as expected and that
the Parser correctly passes down the version
Resolves https://github.com/llvm/llvm-project/issues/145820.
|
|
This pr breaks up `HLSLRootSignatureUtils` into separate, orthogonal and
meaningful libraries. This prevents it from becoming a dumping ground for
many unrelated parts.
- Creates a library `RootSignatureMetadata` to contain helper functions
for interacting with root signatures in their metadata representation
- Creates a library `RootSignatureValidations` to contain helper
functions that will validate various values of root signatures
- Moves the serialization of root signature elements to
`HLSLRootSignature`
Resolves: https://github.com/llvm/llvm-project/issues/145946
|
|
(#145986)
This pr removes the redundancy of having the same enums defined in both
the front-end and back-end of handling root signatures. Since there are
many more uses of the enum in the front-end of the code, we will adhere
to the naming conventions used in the front-end, to minimize the diff.
The macros in `DXContainerConstants.def` are also touched-up to be
consistent and to have each macro name follow its respective definition
in d3d12.h and searchable by name
[here](https://learn.microsoft.com/en-us/windows/win32/api/d3d12/).
Additionally, the many `getEnumNames` are moved to `DXContainer` from
`HLSLRootSignatureUtils` as we will want them to be exposed
publicly anyway.
Changes for each enum follow the pattern of a commit that will make the
enum definition in `DXContainer` adhere to above listed naming
conventions, followed by a commit to actually use that enum in the
front-end.
Resolves https://github.com/llvm/llvm-project/issues/145815
|
|
Implement a state machine that consumes tokens (words delimited by white
space), and returns the corresponding directive id, or fails if the tokens
did not form a valid name.
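One way to picture such a state machine is as a trie keyed on whole tokens, where each node is a state and only some states are accepting. The sketch below is a minimal illustration under that assumption; `DirectiveNameParser`, `DirectiveId` and the registered names are hypothetical and are not taken from the patch.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical directive ids used only for this sketch.
enum class DirectiveId { Target, TargetData, TargetEnterData };

class DirectiveNameParser {
  struct State {
    std::optional<DirectiveId> Id;                      // set if accepting
    std::map<std::string, std::unique_ptr<State>> Next; // token transitions
  };
  State Root;

public:
  // Registers a directive name given as space-separated tokens.
  void insert(const std::string &Name, DirectiveId Id) {
    std::istringstream In(Name);
    State *S = &Root;
    for (std::string Tok; In >> Tok;) {
      auto &Child = S->Next[Tok];
      if (!Child)
        Child = std::make_unique<State>();
      S = Child.get();
    }
    S->Id = Id;
  }

  // Consumes whitespace-delimited tokens; fails (nullopt) on an unknown
  // token or when the tokens stop at a non-accepting state.
  std::optional<DirectiveId> parse(const std::string &Text) const {
    std::istringstream In(Text);
    const State *S = &Root;
    for (std::string Tok; In >> Tok;) {
      auto It = S->Next.find(Tok);
      if (It == S->Next.end())
        return std::nullopt;
      S = It->second.get();
    }
    return S->Id;
  }
};
```

After registering "target", "target data" and "target enter data", parsing "target enter" fails because that prefix ends in a non-accepting state, while "target data" succeeds regardless of how much white space separates the two words.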
|
|
For background information see
https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507
|
|
(#144769)
This pr extends `dumpRootElements` to invoke the print methods of all
`RootElement`s now that they are all implemented.
Extends the `RootSignatures-AST.hlsl` testcase to have a root element of
each type being parsed, constructed to the in-memory representation mode
and then being dumped as part of the AST dump.
- Update `HLSLRootSignatureUtils.cpp` to extend `dumpRootElements`
- Extend `AST/HLSL/RootSignatures-AST.hlsl` testcase
- Defines the helper `operator<<` for `RootElement`
- Small correction to the output of `numDescriptors` to be `unbounded`
in special case
Resolves https://github.com/llvm/llvm-project/issues/124595.
|
|
(#143198)
Implements serialization of the remaining `RootElement`s, namely
`RootDescriptor`s and `StaticSampler`s.
- Adds unit testing for the serialization methods
Resolves https://github.com/llvm/llvm-project/issues/138191
Resolves https://github.com/llvm/llvm-project/issues/138193
|
|
A resource range consists of a closed interval, `[a;b]`, denoting which
shader registers it is bound to.
For instance:
- `CBV(b1)` corresponds to the resource range of `[1;1]`
- `CBV(b0, numDescriptors = 3)` likewise to `[0;2]`
We want to provide an error diagnostic when there is an overlap in the
required registers (an overlap in the resource ranges).
The goal of this pr is to implement a structure to model a set of
resource ranges and provide an api to detect any overlap over a set of
resource ranges.
`ResourceRange` models this by implementing an `IntervalMap` to denote a
mapping from an interval of registers back to a resource range. It
allows for a new `ResourceRange` to be added to the mapping and it will
report if and what the first overlap is.
For the context of how this will be used in validation of a
`RootSignatureDecl` please see the follow-up pull request here:
https://github.com/llvm/llvm-project/pull/140962.
- Implements `ResourceRange` as an `IntervalMap`
- Adds unit testing of the various `insert` scenarios
Note: it was also considered to implement this as an `IntervalTree`,
this would allow reporting of a diagnostic for each overlap that is
encountered, as opposed to just the first. However, error generation of
just reporting the first error is already rather verbose, and adding the
additional diagnostics only made this worse.
Part 1 of https://github.com/llvm/llvm-project/issues/129942
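The overlap check described above can be sketched with a plain `std::map` in place of `llvm::IntervalMap`. This is a simplified stand-in, not the patch's implementation: `RangeInfo` and `ResourceRangeSet` are hypothetical names, and leaving the set unchanged when an overlap is found is an assumption made for this example.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>

// Closed interval [Lower, Upper] of shader registers. Hypothetical type.
struct RangeInfo {
  uint32_t Lower;
  uint32_t Upper;
};

class ResourceRangeSet {
  // Maps each interval's lower bound to its upper bound. Stored intervals
  // never overlap, so they are ordered by both bounds.
  std::map<uint32_t, uint32_t> Intervals;

public:
  // Inserts R if it overlaps nothing; otherwise leaves the set unchanged and
  // returns the lowest stored interval that overlaps R.
  std::optional<RangeInfo> insert(RangeInfo R) {
    // First stored interval that starts after R.Lower.
    auto It = Intervals.upper_bound(R.Lower);
    // The predecessor starts at or before R.Lower; it overlaps if it ends
    // at or after R.Lower.
    if (It != Intervals.begin()) {
      auto Prev = std::prev(It);
      if (Prev->second >= R.Lower)
        return RangeInfo{Prev->first, Prev->second};
    }
    // The successor overlaps if it starts at or before R.Upper.
    if (It != Intervals.end() && It->first <= R.Upper)
      return RangeInfo{It->first, It->second};
    Intervals.emplace(R.Lower, R.Upper);
    return std::nullopt;
  }
};
```

With the ranges from the examples above, inserting `[1;1]` for `CBV(b1)` succeeds, and a later insert of `[0;2]` for `CBV(b0, numDescriptors = 3)` reports `[1;1]` as the first overlap.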
|
|
RootFlags" (#143019)
This relands #141130.
The initial commit uncovered that we are missing the correct linking of
FrontendHLSL into clang/lib/Parse and clang/lib/unittests/Parse.
This change addresses this by linking them accordingly.
It was also checked and ensured that the LexHLSLRootSignature libraries
do not depend on FrontendHLSL and so we are not required to link there.
Resolves: #138190 and #138192
|
|
`HLSLRootSignature.h` was originally created to hold the struct
definitions of an `llvm::hlsl::rootsig::RootElement` and some helper
functions for it.
However, there are many users of the structs that don't require any of the
helper methods. This requires us to link the `FrontendHLSL` library,
where we otherwise wouldn't need to.
For instance:
- This [revert](https://github.com/llvm/llvm-project/pull/142005) was
required as it requires linking to the unrequired `FrontendHLSL` library
- As part of the change required here:
https://github.com/llvm/llvm-project/issues/126557. We will want to add
an `HLSLRootSignatureVersion` enum. Ideally this could live with the
root signature struct defs, but we don't want to link the helper objects
into `clang/Basic/TargetOptions.h`
This change allows the struct definitions to be kept in a single header
file and to then have the `FrontendHLSL` library only be linked when
required.
|
|
`RootFlags`" (#142005)
The commit caused build failures,
[here](https://lab.llvm.org/buildbot/#/builders/10/builds/6308), due to
a missing linked llvm library (HLSLFrontend) into
`clang/unittests/Parse/CMakeLists.txt`.
While it seems like the fix is straightforwardly to just add this
library, I will revert now to build and verify locally it correctly
fixes it.
Reverts llvm/llvm-project#141130
|
|
`RootFlags` (#141130)
- Implements serialization of the currently completely defined
`RootElement`s, namely `RootConstants` and `RootFlags`
- Adds unit testing for the serialization methods
Resolves: https://github.com/llvm/llvm-project/issues/138190 and
https://github.com/llvm/llvm-project/issues/138192
|
|
(#141127)
- we will need to provide a way to dump `RootFlags` for serialization
and by using operator overloads we can maintain a consistent interface
This is an NFC to allow for
https://github.com/llvm/llvm-project/issues/138192 to be more
straightforwardly implemented.
|
|
Of the 128-bits of buffer descriptor only 48 bits are address bits, so
following the discussion on https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54,
the logical conclusion is to set the index width to 48 bits instead of
the current value of 128.
Most of the test changes are mechanical datalayout updates, but there
is one actual change: the ptrmask test now uses .i48 instead of .i128
and I had to update SelectionDAGBuilder to correctly extend the mask.
Reviewed By: krzysz00
Pull Request: https://github.com/llvm/llvm-project/pull/139419
|
|
Static analysis flagged the passing of Dependencies to emitTargetCall as a
place we could use std::move to avoid copying. A closer look indicated we could
instead turn the parameter into a const & and not have a default value since it
was only used in two lines in a test and changing those two locations was easy.
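The trade-off can be sketched in isolation as follows. The `DependData` struct and both functions are hypothetical stand-ins for this example and do not reproduce the real `emitTargetCall` signature.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a dependency record.
struct DependData {
  std::string Name;
};

// Before: taking the vector by value (with a default) copies it on every
// call that passes an lvalue, unless the caller remembers to std::move.
std::size_t countByValue(std::vector<DependData> Dependencies = {}) {
  return Dependencies.size();
}

// After: a const reference never copies, and dropping the default makes
// every caller state its dependencies explicitly.
std::size_t countByConstRef(const std::vector<DependData> &Dependencies) {
  return Dependencies.size();
}
```

A caller that previously relied on the default now passes an empty vector explicitly, which is the kind of two-line test update the message mentions.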
|
|
- defines the `dump` method for in-memory descriptor table data structs
in `Frontend/HLSLRootSignature`
- creates unit test infrastructure to support unit tests of the dump
methods
Resolves https://github.com/llvm/llvm-project/issues/138189
|
|
Some OpenMP directives have different spellings in different versions of
the OpenMP spec. To use the proper spelling for a given spec version
pass "version" as a parameter to getOpenMPDirectiveName.
This parameter won't be used at the moment, and will have a default
value to allow callers not to pass it, for gradual adoption in various
components.
RFC:
https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507
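A defaulted parameter is what makes this kind of gradual adoption possible. The sketch below shows the shape of such an interface; `Directive`, `FallbackVersion` and `getDirectiveName` are hypothetical names for this example, not the actual `getOpenMPDirectiveName` declaration.

```cpp
#include <string>

// Hypothetical directive enum and version constant for this sketch.
enum class Directive { TargetData, Parallel };
constexpr unsigned FallbackVersion = 52;

// Version has a default so existing callers compile unchanged. As in the
// description above, it is accepted but not yet consulted.
std::string getDirectiveName(Directive D, unsigned Version = FallbackVersion) {
  (void)Version; // reserved for version-specific spellings
  switch (D) {
  case Directive::TargetData:
    return "target data";
  case Directive::Parallel:
    return "parallel";
  }
  return "unknown";
}
```

Callers can be migrated one component at a time to pass the version explicitly, after which the default could be removed.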
|
|
|
|
This PR adds functionality for `__atomic_store` libcall in AtomicInfo.
This allows for supporting complex types in `atomic write`.
Fixes https://github.com/llvm/llvm-project/issues/113479
Fixes https://github.com/llvm/llvm-project/issues/115652
|
|
|
|
Current implementation of `__atomic_compare_exchange` uses an alloca for
`__atomic_load`, leading to issues like
https://github.com/llvm/llvm-project/issues/120724. This PR hoists this
alloca to `AllocaIP`.
Fixes: https://github.com/llvm/llvm-project/issues/120724
|
|
This patch adds the lowering of teams reductions from the omp dialect to
LLVM-IR. Some minor cleanup was done in clang to remove an unused
parameter.
|
|
This patch splits off the calculation of canonical loop trip counts from
the creation of canonical loops. This makes it possible to reuse this
logic to, for instance, populate the `__tgt_target_kernel` runtime call
for SPMD kernels.
This feature is used to simplify one of the existing OpenMPIRBuilder
tests.
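For reference, the arithmetic behind a canonical loop trip count can be sketched on plain integers. This is an illustrative model only: `tripCount` is a hypothetical helper, it works on host integers rather than emitting IR, and it assumes the distance between the bounds fits in `int64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Number of iterations of `for (i = Start; i < Stop; i += Step)` when
// Step > 0, or `for (i = Start; i > Stop; i += Step)` when Step < 0. With
// InclusiveStop the comparison becomes <= (or >=). Step must be non-zero.
uint64_t tripCount(int64_t Start, int64_t Stop, int64_t Step,
                   bool InclusiveStop) {
  assert(Step != 0 && "a zero step never terminates");
  // Normalize to an upward-counting loop over a non-negative span.
  int64_t Lo = Step > 0 ? Start : Stop;
  int64_t Hi = Step > 0 ? Stop : Start;
  uint64_t Incr = Step > 0 ? static_cast<uint64_t>(Step)
                           : static_cast<uint64_t>(-Step);
  if (InclusiveStop ? Hi < Lo : Hi <= Lo)
    return 0; // zero-trip loop
  uint64_t Span = static_cast<uint64_t>(Hi - Lo);
  // Exclusive: ceil(Span / Incr). Inclusive: floor(Span / Incr) + 1.
  return InclusiveStop ? Span / Incr + 1 : (Span - 1) / Incr + 1;
}
```

Keeping this computation separate from loop creation means the same value can be handed to another consumer, such as a kernel launch, without building a loop first.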
|
|
(#124746)
This patch adds OpenMPToLLVMIRTranslation support for the OpenMP Declare
Mapper directive.
Since both MLIR and Clang now support custom mappers, I've changed the
respective function params to no longer be optional as well.
Depends on #121005
|
|
Fixes #125088.
When splitBB is called with createBranch=true, it creates a branch
instruction in the old block. But no debug loc is set on that branch
instruction. If that is used as InsertPoint in the restoreIP, it has the
potential to set the current debug location to null and subsequent
instructions will come out without a debug location. This caused the
verification check to fail as shown in the bug report.
This PR changes the splitBB and spliceBB functions to also take a debugLoc
parameter which can be used to set the debug location of the branch
instruction.
|
|
This is the same as PR #125106 which somehow is stuck in a "Processing
Update" loop for many hours now. I am going to close that one and push
this one instead.
While working on https://github.com/llvm/llvm-project/issues/125088, I
noticed a problem with the TargetBodyGenCallbackTy and
TargetGenArgAccessorsCallbackTy. The OMPIRBuilder and MLIR side both
maintain their own IRBuilder and when control goes from one to the other, we
have to take care not to use a stale debug location. The code currently
relies on restoreIP to set the insertion point and the debug location. But
if the passed InsertPointTy has an empty block, then the debug location
will not be updated (see SetInsertPoint). This can cause an invalid debug
location to be attached to instructions and the verifier will complain.
Similarly when we exit the callback, the debug location of the Builder
is not set to what it was before the callback. This again can cause
verification failures.
This PR resets the debug location at the start and also uses an
InsertPointGuard to restore the debug location at exit.
Both of these problems would have been caught by the unit tests but they
were not setting the debug location of the builder before calling the
createTarget so the problem was hidden. I have updated the tests
accordingly.
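The save-and-restore pattern can be illustrated with a toy builder. Everything in this sketch is hypothetical (`ToyBuilder`, `StateGuard`, `runCallback`); it only models the idea behind an insert-point guard and is not the LLVM `IRBuilder` API.

```cpp
#include <string>

// Toy builder holding just the two pieces of state discussed above.
struct ToyBuilder {
  int InsertPoint = 0;
  std::string DebugLoc;
};

// RAII guard: records the builder state on entry and restores it on exit,
// so a callback cannot leak its insertion point or debug location.
class StateGuard {
  ToyBuilder &B;
  int SavedPoint;
  std::string SavedLoc;

public:
  explicit StateGuard(ToyBuilder &B)
      : B(B), SavedPoint(B.InsertPoint), SavedLoc(B.DebugLoc) {}
  StateGuard(const StateGuard &) = delete;
  StateGuard &operator=(const StateGuard &) = delete;
  ~StateGuard() {
    B.InsertPoint = SavedPoint;
    B.DebugLoc = SavedLoc;
  }
};

// Simulates a body-generation callback that moves the builder elsewhere.
void runCallback(ToyBuilder &B, const std::string &EntryLoc) {
  StateGuard Guard(B);
  B.DebugLoc = EntryLoc; // reset explicitly instead of trusting stale state
  B.InsertPoint = 42;    // the callback emits code somewhere else
}
```

Because restoration happens in the destructor, it also covers early returns from the callback wrapper.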
|
|
This patch adds initial support for target_device selector set - Section
9.2 (Spec 6.0)
|
|
(#124287)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're
replacing methods that use Instruction*'s as positions with iterators. A
number of these (such as getFirstNonPHIOrDbg) are sufficiently
infrequently used that we can just replace the pointer-returning version
with an iterator-returning version, hopefully without much/any
disruption.
Thus this patch has getFirstNonPHIOrDbg and
getFirstNonPHIOrDbgOrLifetime return an iterator, and updates all
call-sites. There are no concerns about the iterators returned being
converted to Instruction*'s and losing the debug-info bit: because the
methods skip debug intrinsics, the iterator head bit is always false
anyway.
|
|
Specifying a kernel with the `ptx_kernel` or `amdgpu_kernel` calling
convention is more idiomatic and compile-time performant than using
the `nvvm.annotations !"kernel"` metadata.
Transition OMPIRBuilder to use calling conventions for PTX kernels and
no longer emit `nvvm.annotations`. Update OpenMPOpt to work with kernels
specified via calling convention as well as metadata. Update OpenMP
tests to use the calling conventions.
|
|
(#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.
This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.
We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).
---------
Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
|
|
(NFC) (#123901)
Follow up to https://github.com/llvm/llvm-project/issues/123569
|
|
This patch implements support for handling the 'if' clause of OpenMP
'target' constructs in the OMPIRBuilder and updates MLIR to LLVM IR
translation of the `omp.target` MLIR operation to make use of this new
feature.
|
|
This patch copies the target-cpu and target-features attributes of
functions containing target regions into the corresponding outlined
function holding the target region.
This mirrors what is currently being done for all other outlined
functions through the `CodeExtractor` in `OpenMPIRBuilder::finalize()`.
|
|
(#116051)
This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count
values passed to the runtime kernel offloading call.
Additionally, kernel type information is used to influence target device
code generation and the `IsSPMD` flag is replaced by `ExecFlags`, which
provides more granularity.
|
|
This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs`
structure used to simplify passing default and constant values for
number of teams and threads, and possibly other target kernel-related
information in the future.
This is used to forward values passed to `createTarget` to
`createTargetInit`, which previously used a default unrelated set of
values.
|
|
The preprocessor definition used to enable asserts and the one that
`llvm::Error` and `llvm::Expected` use to ensure all created instances are
checked are not the same. By making these checks inside of an `assert` in cases
where errors are not expected, certain build configurations would trigger
runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF
-DLLVM_UNREACHABLE_OPTIMIZE=ON`).
The `llvm::cantFail()` function, which was intended for this use case, is used
by this patch in place of `assert` to prevent these runtime failures. In tests,
new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and
`EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release
builds.
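The difference between the two approaches can be sketched without LLVM. In the example below, `Result`, `mustSucceed` and `makeAnswer` are hypothetical; `mustSucceed` only mirrors the idea behind `llvm::cantFail` and is not its implementation.

```cpp
#include <cstdlib>
#include <optional>
#include <string>
#include <utility>

// Hypothetical minimal result type: holds a value or an error message.
template <typename T> struct Result {
  std::optional<T> Value;
  std::string Error;
};

// The check is ordinary code, so it runs in every build configuration,
// unlike an expression placed inside assert(), which is compiled out when
// NDEBUG is defined.
template <typename T> T mustSucceed(Result<T> R) {
  if (!R.Value)
    std::abort(); // failure here indicates a programming error
  return std::move(*R.Value);
}

// Example producer that always succeeds.
Result<int> makeAnswer() { return {42, ""}; }
```

Writing `int X = mustSucceed(makeAnswer());` inspects the result whether or not assertions are enabled, which is the property the patch relies on.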
|
|
This patch:
- Adds translation support for the aligned clause in the SIMD directive by passing the alignment details to the "llvm.assume" intrinsic.
- Updates the insertion point for the llvm.assume intrinsic call in "OMPIRBuilder.cpp".
- Adds a check in the aligned clause MLIR lowering to ensure that the alignment value is a power of 2.
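The power-of-2 requirement in the last item is cheap to test. A minimal sketch of such a check, using a hypothetical `isPowerOfTwo` helper rather than the code from the patch, could look like this:

```cpp
#include <cstdint>

// True when Value has exactly one bit set, i.e. it is a power of two.
// Zero is rejected explicitly because 0 & (0 - 1) is also zero.
constexpr bool isPowerOfTwo(uint64_t Value) {
  return Value != 0 && (Value & (Value - 1)) == 0;
}
```

An alignment of 64 passes this check, while 48 or 0 would be rejected.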
|
|
The OmpLinearClause class was a variant of two classes, one for when the
linear modifier was present, and one for when it was absent. These two
classes did not follow the conventions for parse tree nodes, (i.e.
tuple/wrapper/union formats), which necessitated specialization of the
parse tree visitor.
The new form of OmpLinearClause is the standard tuple with a list of
modifiers and an object list. The specialization of parse tree visitor
for it has been removed.
Parsing and unparsing of the new form bears additional complexity due to
syntactical differences between OpenMP 5.2 and prior versions: in OpenMP
5.2 the argument list is post-modified, while in the prior versions, the
step modifier was a post-modifier while the linear modifier had an
unusual syntax of `modifier(list)`.
With this change the LINEAR clause is no different from any other
clauses in terms of its structure and use of modifiers. Modifier
validation and all other checks work the same as with other clauses.
|
|
privatization of allocatables in `omp.target` ops (#116576)
This PR adds support to translate the `private` clause from MLIR to
LLVMIR when used on allocatables in the context of an `omp.target` op.
This replaces https://github.com/llvm/llvm-project/pull/113208.
Parent PR: https://github.com/llvm/llvm-project/pull/116770. Only the
latest commit is relevant to the PR.
|
|
Again, this simplifies the semantic checks and lowering quite a bit.
Update the check for positive alignment to use a more informative
message, and to highlight the modifier itself, not the whole clause.
Remove the checks for the allocator expression itself being positive:
there is nothing in the spec that says that it should be positive.
Remove the "simple" modifier from the AllocateT template, since both
simple and complex modifiers are the same thing, only differing in
syntax.
|
|
This patch implements an approach to communicate errors between the
OMPIRBuilder and its users. It introduces `llvm::Error` and
`llvm::Expected` objects to replace the values returned by callbacks
passed to `OMPIRBuilder` codegen functions. These functions then check
the result for errors when callbacks are called and forward them back to
the caller, which has the flexibility to recover, exit cleanly or dump a
stack trace.
This prevents a failed callback from leaving the IR in an invalid state and
still continuing the codegen process, triggering unrelated assertions or
segmentation faults. In the case of MLIR to LLVM IR translation of the
'omp' dialect, this change results in the compiler emitting errors and
exiting early instead of triggering a crash for not-yet-implemented
errors. The behavior in Clang and openmp-opt stays unchanged, since
callbacks will always continue to return 'success'.
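The forwarding scheme can be modelled in a few lines of standard C++. The sketch below swaps `llvm::Error` for a `std::optional<std::string>` and uses hypothetical names (`Err`, `BodyCallback`, `createRegion`), so it shows the control flow only, not the real OMPIRBuilder interface.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical error type: an empty optional means success.
using Err = std::optional<std::string>;
using BodyCallback = std::function<Err(std::vector<std::string> &IR)>;

// Builder-style function: runs the callback and, on failure, forwards the
// error to its caller instead of continuing with half-built IR.
Err createRegion(std::vector<std::string> &IR, const BodyCallback &BodyGen) {
  IR.push_back("region.begin");
  if (Err E = BodyGen(IR))
    return E; // stop early; the caller decides how to recover
  IR.push_back("region.end");
  return std::nullopt;
}
```

A caller that receives an error can emit a diagnostic and exit cleanly, while a callback that always returns `std::nullopt` behaves exactly as before.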
|
|
`llvm::Type::getPointerTo()` is to be deprecated & removed soon.
|
|
(#109388)
Follow up to #109133.
|
|
When an outlined function is generated for omp target region, a
corresponding DISubprogram was not being generated. This resulted in all
the debug information for the target region being dropped.
This commit adds DISubprogram for the outlined function if there is one
available for the parent function. It also updates the current debug
location so that the right scope is used for the entries in the outlined
function.
There are places in the OpenMPIRBuilder which change the insertion point but
don't update the debug location accordingly. They cause issues when debug info
is enabled. I have fixed a few that I observed to cause issues. But there may be
more and a systematic cleanup may be required.
With this change in place, I can set source line breakpoints in the target
region and run to them in the debugger.
|
|
(#100156)
This patch modifies MLIR to LLVM IR lowering of the OpenMP dialect to take into
consideration the contents of the `omp.target_triples` module attribute while
generating code for `omp.target` operations.
It adds the `OpenMPIRBuilderConfig::TargetTriples` field and initializes it
using the `amendOperation` flow of the `OpenMPToLLVMIRTranslation` pass. Some
changes are introduced into the `OpenMPIRBuilder` to allow passing the
information about whether a target region is intended to be offloaded from
outside.
The result of this change is that offloading calls are only generated when the
`--offload-arch` or `-fopenmp-targets` options are given to the compiler.
Otherwise, only the host fallback code is generated. This fixes linker errors
currently triggered by `flang-new` if a source file containing a `target`
construct is compiled without any of the aforementioned options.
Several unit tests impacted by these changes, which are intended to check host
code generated for `omp.target` operations, are updated to contain the new
attribute. Without it, no calls to `__tgt_target_kernel` and associated control
flow operations are generated.
Fixes #100209.
|