Age | Commit message (Collapse) | Author | Files | Lines |
|
atomic semantics (#71480)
This PR fixes https://github.com/llvm/llvm-project/issues/69069 .
|
|
This patch moves `OMPDeclareReductionDecl::InitKind` to DeclBase.h, so that it's complete at the point where corresponding bit-field is declared. This patch also converts it to scoped enum named `OMPDeclareReductionInitKind`
|
|
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.
Fixes: https://github.com/llvm/llvm-project/issues/70249
|
|
This patch moves `ArraySizeModifier` before `Type` declaration so that it's complete at `ArrayTypeBitfields` declaration. It's also converted to scoped enum along the way.
|
|
This reverts commit 85f6b2fac9a367337e43ca288c45ea783981cc16.
|
|
Differential revision: https://reviews.llvm.org/D158266
|
|
modifier on 'linear' clause
Reference:
(1) OpenMP 5.2 Specification - Seciton 5.4.6
Differential revision: https://reviews.llvm.org/D159546
|
|
This patch starts the support for OpenMP kernel language, basically to write
OpenMP target region in SIMT style, similar to kernel languages such as CUDA.
What included in this first patch is the `ompx_bare` clause for `target teams`
directive. When `ompx_bare` exists, globalization is disabled such that local
variables will not be globalized. The runtime init/deinit function calls will
not be emitted. That being said, almost all OpenMP executable directives are
not supported in the region, such as parallel, task. This patch doesn't include
the Sema checks for that, so the use of them is UB. Simple directives, such as
atomic, can be used. We provide a set of APIs (for C, they are prefix with
`ompx_`; for C++, they are in `ompx` namespace) to get thread id, block id, etc.
Please refer to
https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf for more details.
|
|
(#66844)"
This reverts commit e997dca3333823ffe2ea3aea288299f551532dcd.
|
|
This patch starts the support for OpenMP kernel language, basically to
write
OpenMP target region in SIMT style, similar to kernel languages such as
CUDA.
What included in this first patch is the `ompx_bare` clause for `target
teams`
directive. When `ompx_bare` exists, globalization is disabled such that
local
variables will not be globalized. The runtime init/deinit function calls
will
not be emitted. That being said, almost all OpenMP executable directives
are
not supported in the region, such as parallel, task. This patch doesn't
include
the Sema checks for that, so the use of them is UB. Simple directives,
such as
atomic, can be used. We provide a set of APIs (for C, they are prefix
with
`ompx_`; for C++, they are in `ompx` namespace) to get thread id, block
id, etc.
For more details, you can refer to
https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf.
|
|
|
|
offloading
- This patch adds support for thread_limit clause on target directive according to OpenMP 51 [2.14.5]
- The idea is to create an outer task for target region, when there is a thread_limit clause, and manipulate the thread_limit of task instead. This way, thread_limit will be applied to all the relevant constructs enclosed by the target region.
Differential Revision: https://reviews.llvm.org/D152054
|
|
structured-block
where clause is one of the following:
private(list)
reduction([reduction-modifier ,] reduction-identifier : list)
nowait
Differential Revision: https://reviews.llvm.org/D157933
|
|
Support for Code Generation of "#pragma loop bind" clause.
1) bind(parallel)
2) bind(teams)
3) bind(thread)
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D144634
|
|
This reverts commit 8e7f0320ad7fb760fff598aba4b2c86528c58c2d.
As requested on https://reviews.llvm.org/rG8e7f0320ad7fb760f#1236457
|
|
list
|
|
|
|
This reverts commit 4097a24584121dba562d471fab97d3dfec1b5bff.
Breaks tests on macOS, see https://reviews.llvm.org/rG4097a2458412#1235854
|
|
|
|
The if-clause on 'target teams loop' should only accept "target" as a
directive name modifier. Any other directive name should generate an
error.
Differential Revision: https://reviews.llvm.org/D156352
|
|
This reverts commit 0d12683046ca75fb08e285f4622f2af5c82609dc and
reapplies ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2 with an extension to
fix the Flang build.
Differential Revision: https://reviews.llvm.org/D156184
|
|
This reverts commit ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2.
The changes broke several bots:
https://lab.llvm.org/buildbot/#/builders/176/builds/3408
https://lab.llvm.org/buildbot/#/builders/198/builds/4028
https://lab.llvm.org/buildbot/#/builders/197/builds/8491
https://lab.llvm.org/buildbot/#/builders/197/builds/8491
|
|
CUDA and HIP have kernel attributes to tune the code generation (in the
backend). To reuse this functionality for OpenMP target regions we
introduce the `ompx_attribute` clause that takes these kernel
attributes and emits code as if they had been attached to the kernel
fuction (which is implicitly generated).
To limit the impact, we only support three kernel attributes:
`amdgpu_waves_per_eu`, for AMDGPU
`amdgpu_flat_work_group_size`, for AMDGPU
`launch_bounds`, for NVPTX
The existing implementations of those attributes are used for error
checking and code generation. `ompx_attribute` can be attached to any
executable target region and it can hold more than one kernel attribute.
Differential Revision: https://reviews.llvm.org/D156184
|
|
Currently, clang gives an incorrect reduction identifier error for the PLUS
operator for OpenMP version > 52. But, PLUS operator is allowed in OpenMP
version > 52. This revision fixes this issue and also modified the error
messages to show the correct expected operators in the message based on the OpenMP
version used (prior to OMP 6.0 and since OMP 6.0).
Test Src:
void foo() {
int a = 0 ;
#pragma omp parallel reduction(+:a)
;
#pragma omp parallel reduction(-:a)
;
}
Before this revision:
$ clang -fopenmp -fopenmp-version=60 test.c -c
test.c:3:34: error: incorrect reduction identifier, expected one of '+', '-', '*', '&', '|', '^', '&&', '||', 'min' or 'max' or declare reduction for type 'int'
3 | #pragma omp parallel reduction(+:a)
| ^
test.c:5:34: error: incorrect reduction identifier, expected one of '+', '-', '*', '&', '|', '^', '&&', '||', 'min' or 'max' or declare reduction for type 'int'
5 | #pragma omp parallel reduction(-:a)
| ^
2 errors generated.
Wit this revision:
$ clang -fopenmp -fopenmp-version=60 test.c -c
test.c:5:34: error: incorrect reduction identifier, expected one of '+', '*', '&', '|', '^', '&&', '||', 'min' or 'max' or declare reduction for type 'int'
5 | #pragma omp parallel reduction(-:a)
|
1 error generated.
Differential Revision: https://reviews.llvm.org/D155635
|
|
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.
`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.
Differential Revision: https://reviews.llvm.org/D154591
|
|
clause.
This is just syntax to make it easier for the user. It doesn't add any
new functionality.
for
doacross(sink: omp_cur_iteration - 1)
Equivalent to
doacross(sink: ConterVar - 1, ...)
doacross(source: omp_cur_iteration)
Equivalent to
doacross(source)
And restriction is:
OMP5.2 p.327
If vector is specified with the omp_cur_iteration keyword and with
sink as the dependence-type then it must be omp_cur_iteration - 1.
If vector is specified with source as the dependence-type then it must be
omp_cur_iteration.
Differential Revision: https://reviews.llvm.org/D154556
|
|
The loop directive is a descriptive construct which allows the compiler
flexibility in how it generates code for the directive's associated
loop(s). See OpenMP specification 5.2 [257:8-9].
Codegen added in this patch for the combined 'loop' directives are:
'target teams loop' -> 'target teams distribute parallel for'
'teams loop' -> 'teams distribute parallel for'
'target parallel loop' -> 'target parallel for'
'parallel loop' -> 'parallel for'
NOTE: The implementation of the 'loop' directive itself is unchanged.
Differential Revision: https://reviews.llvm.org/D145823
|
|
|
|
This was added by 453e02ca0903c9f65529d21c513925ab0fdea1e1.Use
isa instead since we don't use the result.
Fixes:
<..>SemaOpenMP.cpp:23149:13: warning: unused variable ‘TargetVarDecl’ [-Wunused-variable]
23149 | if (auto *TargetVarDecl = dyn_cast_or_null<VarDecl>(TargetDecl))
| ^~~~~~~~~~~~~
Which came up when building with GCC 9.
|
|
Without this patch, the following example crashes Clang:
```
#pragma omp target map(i)
#pragma omp tile sizes(2)
for (i = 0; i < N; ++i)
;
```
This patch fixes the crash by changing `Sema::isOpenMPPrivateDecl` not
to identify `i` as private just because it's the loop variable of a
`tile` construct.
While OpenMP TR11 and earlier do specify privacy for loop variables of
loops *generated* from a `tile` construct, I haven't found text
stating that the original loop variable must be private in the above
example, so this patch leaves it shared. Even so, it is a bit
unexpected that value of `i` after the loop is `N - 1` instead of `N`.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D151356
|
|
Initial support for OpenMP 5.0 declare target "as if" behavior for "initializer expressions".
OpenMP 5.0, 2.12.7 declare target.
Reviewed By: Alexey
Differential Revision: https://reviews.llvm.org/D146418
|
|
|
|
This patch migrates uses of StringRef::{starts,ends}with_insensitive
to StringRef::{starts,ends}_with_insensitive so that we can use names
similar to those used in std::string_view.
Note that the llvm/ directory has migrated in commit
6c3ea866e93003e16fc55d3b5cedd3bc371d1fde.
I'll post a separate patch to deprecate
StringRef::{starts,ends}with_insensitive.
Differential Revision: https://reviews.llvm.org/D150506
|
|
Sema.h is huge. This makes a small reduction to it by moving
EnterExpressionEvaluationContext into a new header, since it is an
independent component.
Differential Revision: https://reviews.llvm.org/D149796
|
|
And some general code style cleanup.
Differential Revision: https://reviews.llvm.org/D148696
|
|
When constructing an attribute, the syntactic form was specified
using two arguments: an attribute-independent syntax type and an
attribute-specific spelling index. This patch replaces them with
a single argument.
In most cases, that's done using a new Form class that combines the
syntax and spelling into a single object. This has the minor benefit
of removing a couple of constructors. But the main purpose is to allow
additional information to be stored as well, beyond just the syntax and
spelling enums.
In the case of the attribute-specific Create and CreateImplicit
functions, the patch instead uses the attribute-specific spelling
enum. This helps to ensure that the syntax and spelling are
consistent with each other and with the Attr.td definition.
If a Create or CreateImplicit caller specified a syntax and
a spelling, the patch drops the syntax argument and keeps the
spelling. If the caller instead specified only a syntax
(so that the spelling was SpellingNotCalculated), the patch
simply drops the syntax argument.
There were two cases of the latter: TargetVersion and Weak.
TargetVersionAttrs were created with GNU syntax, which matches
their definition in Attr.td, but which is also the default.
WeakAttrs were created with Pragma syntax, which does not match
their definition in Attr.td. Dropping the argument switches
them to AS_GNU too (to match [GCC<"weak">]).
Differential Revision: https://reviews.llvm.org/D148102
|
|
Adds a warning, issued by the clang semantic analysis, if HIP and OpenMP target offloading is requested concurrently.
That is, if HIP language mode is active but OpenMP target directives are encountered.
Previously, a user might not have been aware that target directives are ignored in such a case.
Generation of this warning is (lit-)tested via "make check-clang-semaopenmp".
The warning can be ignored via "-Wno-hip-omp-target-directives".
Differential Revision: https://reviews.llvm.org/D145591
|
|
|
|
This reverts commit 8cf85a0cadb033fed3d96aa5283deb4bfbbaf2c8.
This is add back change of "Add map info for dereference pointer."
In addition turn off test run on amdgpu, since I don't know the way to
reprodue the problem.
|
|
breaks amdgpu buildbot
This reverts commit 0f2f378425821de77e50a0dcb67c4504389a56e8.
|
|
This is to fix run time problem when use:
int **a;
map((*a)[:3]), (*a)[1] or map(**a).
current we skip generate map info for dereference pointer:
&(*a), &(*a)[0], 3*sizeof(int), TARGET_PARAM | TO | FROM
One way to fix runtime problem is to generate map info for dereference
pointer.
map((*a)[:3]):
&(*a), &(*a), sizeof(pointer), TARGET_PARAM | TO | FROM
&(*a), &(*a)[0], 3*sizeof(int), PTR_AND_OBJ | TO | FROM
map(**a):
&(*a), &(*a), sizeof(pointer), TARGET_PARAM | TO | FROM
&(*a), &(**a), sizeof(int), PTR_AND_OBJ | TO | FROM
The change in CGOpenMPRuntime.cpp add that.
The change in SemaOpenMP is to fix variable of dereference pointer to array
captured by reference. That is wrong. That cause run time to fail.
The rule is:
If variable is identified in a map clause it is always captured by
reference except if it is a pointer that is dereferenced somehow.
Differential Revision: https://reviews.llvm.org/D145093
|
|
The indeces of the dependent loops are properly ordered, just start from
1, so need just subtract 1 to get correct loop index.
Differential Revision: https://reviews.llvm.org/D145514
|
|
Need to assign the calculated lower bound back to temp variable,
otherwise incorrect value (upper bound instead of lower bound) might be
used.
Differential Revision: https://reviews.llvm.org/D144015
|
|
Deprecate and diagnose uses of the 'destroy' clauses without an argument
in 'omp depobj' directives.
Differential Revision: https://reviews.llvm.org/D143021
|
|
In Clang, in order to determine the type of `omp_allocator_handle_t`, Clang
checks the type of those predefined allocators. The first one it checks is
`omp_null_allocator`. If the language is C, and the system is 64-bit, what Clang
gets is a `int`, instead of an enum of size 8, given the fact how we define
`omp_allocator_handle_t` in `omp.h`. If the allocator is captured by a region,
let's say a parallel region, the allocator will be privatized. Because Clang deems
`omp_allocator_handle_t` as an `int`, it will first cast the value returned by
the runtime library (for `libomp` it is a `void *`) to `int`, and then in the
outlined function, it casts back to `omp_allocator_handle_t`. This two casts
completely shaves the first 32-bit of the pointer value returned from `libomp`,
and when the private "new" pointer is fed to another runtime function
`__kmpc_allocate()`, it causes segment fault. That is the root cause of PR54082.
I have no idea why `-fno-pic` could hide this bug.
In this patch, we detect `omp_allocator_handle_t` using roughly the same method
as `omp_event_handle_t`, by looking it up into the identifier table.
Fix #54082.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D142297
|
|
Dynamic memory allows users to allocate fast shared memory when a kernel
is launched. We support a single size for all kernels via the
`LIBOMPTARGET_SHARED_MEMORY_SIZE` environment variable but now we can
control it per kernel invocation, hence allow computed values.
Note: Only the nextgen plugins will allocate memory based on the clause,
the old plugins will silently miscompile.
Differential Revision: https://reviews.llvm.org/D141233
|
|
When there is any compile error, clang still tries to compile as many code as
possible, therefore `Scope` can be `nullptr` here. However, we didn't check it
beforehand, causing compiler crash.
Fix #59944.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D142233
|
|
|
|
task region
This patch fixes the issue that a functor is not captured properly if
that is used in a task region. It was introduced by https://reviews.llvm.org/D114546
where `CallExpr` is treated specially, but the callee itself is not properly visited.
https://reviews.llvm.org/D115902 already did some fix for one case. This patch
fixes another case.
Fix #57757.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D141873
|
|
This patch replaces (llvm::|)Optional< with std::optional<. I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
|