Age | Commit message (Collapse) | Author | Files | Lines |
|
The default build of openmp (`cmake -S <llvm-project>/runtimes
-DLLVM_ENABLE_RUNTIMES=openmp`) current fails with
```
CMake Error at /home/meinersbur/src/llvm/flangrt/_src/cmake/Modules/GetClangResourceDir.cmake:17 (string):
string sub-command REGEX, mode MATCH needs at least 5 arguments total to
command.
Call Stack (most recent call first):
/home/meinersbur/src/llvm/flangrt/_src/openmp/CMakeLists.txt:126 (get_clang_resource_dir)
```
The reason is that because it is not a bootstrapping-build, the clang
resource dir that it intends to write files such as `omp-tools.h` into,
is unavailable. Using the Clang resource dir for writing files is
conceptually broken, as that dir might be located in
`/usr/lib/clang/<version>/`. Writing to it is only intended in
bootstrapping builds where Clang is built alongside openmp.
This patch unifies the identification of being in a bootstrapping built.
The same `LLVM_TREE_AVAILABLE` definition is going to be used in
#137828. No reason for each runtime to define its own.
|
|
A lot of these only trip when using sanitizers with the library.
* Insert forgotten free()s
* Change (-1) << amount to 0xffffffffu as left shifting a negative is UB
* Fixup integer parser to return INT_MAX when parsing huge string of
digits. e.g., 452523423423423423 returns INT_MAX
* Fixup range parsing for affinity mask so integer overflow does not
occur
* Don't assert when branch bits are 0, instead warn user that is invalid
and use the default value.
* Fixup kmp_set_defaults() so the C version only uses null terminated
strings and the Fortran version uses the string + size version.
* Make sure the KMP_ALIGN_ALLOC is power of two, otherwise use
CACHE_LINE.
* Disallow ability to set KMP_TASKING=1 (task barrier) this doesn't work
and hasn't worked for a long time.
* Limit KMP_HOT_TEAMS_MAX_LEVEL to 1024, an array is allocated based on
this value.
* Remove integer values for OMP_PROC_BIND. The specification only allows
strings and CSV of strings.
* Fix setting KMP_AFFINITY=disabled + OMP_DISPLAY_AFFINITY=TRUE
|
|
The feature was introduced back in 2014 and has been on ever since.
Leave the feature in place. Removing only the macro.
|
|
Ticket lock has a yield operation (shown below) which degrades
performance on larger server machines due to an unconditional pause
operation.
```
#define KMP_YIELD(cond) \
{ \
KMP_CPU_PAUSE(); \
if ((cond) && (KMP_TRY_YIELD)) \
__kmp_yield(); \
}
```
|
|
This code hasn't been enabled since the first code changes were
introduced. Remove the dead code.
|
|
OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary host runtime changes.
Reland https://github.com/llvm/llvm-project/pull/146403. After manual
testing on a gfx90a machine, I could not reproduce the failing test,
which makes it even more likely that the test has just been flaky. (Or
at least that it's not an issue related to this patch.)
|
|
Initial runtime support for threadset clause in task and taskloop
directives [Section 14.8 in in OpenMP 6.0 spec]
Frontend PR- https://github.com/llvm/llvm-project/pull/135807
|
|
(#147379)
Reverts llvm/llvm-project#146403
|
|
OpenMP 6.0 12.1.2 specifies the behavior of the strict modifier for the
num_threads clause on parallel directives, along with the message and
severity clauses. This commit implements necessary host runtime changes.
|
|
When running the `openmp` testsuite on 32-bit SPARC, several tests
`FAIL` apparently randomly, but always with the same kind of error:
```
# error: command failed with exit status: -11
```
The tests die with `SIGBUS`, as can be seen in `truss` output:
```
26461/1: Incurred fault #5, FLTACCESS %pc = 0x00010EAC
26461/1: siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C
26461/1: Received signal #10, SIGBUS [default]
26461/1: siginfo: SIGBUS BUS_ADRALN addr=0x0013D12C
```
i.e. the code is trying an unaligned access which cannot work on SPARC,
a strict-alignment target which enforces natural alignment on access.
This explains the apparent randomness of the failures: if the memory
happens to be aligned appropriately, the tests work, but fail if not.
A `Debug` build reveals much more:
- `__kmp_alloc` currently aligns to `sizeof(void *)`, which isn't enough
on strict-alignment targets when the data are accessed as types
requiring larger alignment. Therefore, this patch increases `alignment`
to `SizeQuant`.
- 32-bit Solaris/sparc `libc` guarantees 8-byte alignment from `malloc`,
so this patch adjusts `SizeQuant` to match.
- There's a `SIGBUS` in
```
__kmpc_fork_teams (loc=0x112f8, argc=0,
microtask=0x16cc8
<__omp_offloading_ffbc020a_4b1abe_main_l9_debug__.omp_outlined>)
at openmp/runtime/src/kmp_csupport.cpp:573
573 *(kmp_int64 *)(&this_thr->th.th_teams_size) = 0L;
```
Casting to a pointer to a type requiring 64-bit alignment when that
isn't guaranteed is wrong. Instead, this patch uses `memset` instead.
- There's another `SIGBUS` in
```
0xfef8cb9c in __kmp_taskloop_recur (loc=0x10cb8, gtid=0, task=0x23cd00,
lb=0x23cd18, ub=0x23cd20, st=1, ub_glob=499, num_tasks=100, grainsize=5,
extras=0, last_chunk=0, tc=500, num_t_min=20,
codeptr_ra=0xfef8dbc8 <__kmpc_taskloop(ident_t*, int, kmp_task_t*, int,
kmp_uint64*, kmp_uint64*, kmp_int64, int, int, kmp_uint64, void*)+240>,
task_dup=0x0)
at openmp/runtime/src/kmp_tasking.cpp:5147
5147 p->st = st;
```
`p->st` doesn't currently guarantee the 8-byte alignment required by
`kmp_int64 st`. `p` is set in
```
__taskloop_params_t *p = (__taskloop_params_t *)new_task->shareds;
```
but `shareds_offset` is currently aligned to `sizeof(void *)` only.
Increasing it to `sizeof(kmp_uint64)` to match its use fixes the
`SIGBUS`.
With these fixes I get clean `openmp` test results on 32-bit SPARC (both
Solaris and Linux), with one unrelated exception.
Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`,
`sparc-unknown-linux-gnu`, `sparc64-unknown-linux-gnu`,
`i386-pc-solaris2.11`, `amd64-pc-solaris2.11`, `i686-pc-linux-gnu`, and
`x86_64-pc-linux-gnu`.
|
|
This can happen in static destructors when called after the
runtime is already shutdown (e.g., by ompt_finalize_tool). Even
though it is technically an error to call omp_destroy_lock after
shutdown, the application doesn't necessarily know that omp_destroy_lock
was already called. This is safe becaues all indirect locks are
destoryed in __kmp_cleanup_indirect_user_locks so the return
value will always be valid or a nullptr, not garbage.
|
|
Removes usage of __AMDGCN_WAVEFRONT_SIZE as compile time constant.
---------
Co-authored-by: Shilei Tian <i@tianshilei.me>
|
|
(#142990)
|
|
`openmp` currently doesn't compile on 32-bit Solaris:
```
FAILED: projects/openmp/runtime/src/CMakeFiles/omp.dir/z_Linux_util.cpp.o
[...]
In file included from openmp/runtime/src/z_Linux_util.cpp:78:
In file included from /usr/include/libproc.h:25:
In file included from /usr/include/gelf.h:10:
/usr/include/libelf.h:22:2: error: "large files are not supported by libelf"
22 | #error "large files are not supported by libelf"
| ^
In file included from openmp/runtime/src/z_Linux_util.cpp:78:
/usr/include/libproc.h:42:2: error: "Cannot use libproc in the large file compilation environment"
42 | #error "Cannot use libproc in the large file compilation environment"
| ^
```
Looking closer, there's no point in using `Pgrab` (the only interface
from `<libproc.h>`) at all: the resulting `ps_prochandle_t *` isn't used
in the remainder of the code and the original PR #82930 gives no
indication why it is deemed necessary/useful.
While at it, this patch switches to use `/proc/self/xmap`, matching
`compiler-rt`'s `sanitizer_procmaps_solaris.cpp`, and makes some minor
formatting fixes.
Tested on `sparc-sun-solaris2.11`, `sparcv9-sun-solaris2.11`,
`i386-pc-solaris2.11`, and `amd64-pc-solaris2.11`.
|
|
`libomp` uses `__builtin_return_address` in two places. However, on some
targets those calls need to wrapped in `___builtin_extract_return_addr`
to get at the actual return address. SPARC is among those targets and
the only one where `clang` actually implements this, cf. [[clang][Sparc]
Fix __builtin_extract_return_addr
etc.](https://reviews.llvm.org/D91607). `compiler-rt` needed the same
adjustment, cf. [[sanitizer_common][test] Enable tests on
SPARC](https://reviews.llvm.org/D91608). On other targets, this is a
no-op. However, there are more targets that have the same issue and
`gcc`, unlike `clang`, correctly implements it, so there might be issues
when building `libomp` with `gcc`.
This patch adds the necessary calls.
Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`,
`amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
|
|
PR #138517 broke the Android LLVM builders: ARM doesn't understand the
`@object` form. As it turns out, one can use `%object` instead, which
does assemble on all targets currently supported by `z_Linux_asm.S`.
Tested by rebuilding `libomp.so` on `sparcv9-sun-solaris2.11`.
|
|
`libomp` doesn't currently build on Linux/sparc64 due to lack of
`__NR_sched_setaffinity` and `__NR_sched_getaffinity` definitions.
This patch provides those.
Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`,
`amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
|
|
`libomp.so` currently fails to link on SPARC, both Solaris/sparcv9 and
Linux/sparc64:
```
Undefined first referenced
symbol in file
__kmp_unnamed_critical_addr projects/openmp/runtime/src/CMakeFiles/omp.dir/kmp_gsupport.cpp.o
ld: fatal: symbol referencing errors
```
This patch provides the necessary definition. While at it, I noticed
that on non-x86 targets the symbol wasn't marked as `@object`, which
this patch corrects, too.
Tested on `sparcv9-sun-solaris2.11`, `sparc64-unknown-linux-gnu`,
`amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
|
|
Building `openmp` on Solaris/amd64, I get
```
In file included from openmp/runtime/src/kmp_utility.cpp:16:
openmp/runtime/src/kmp_wrapper_getpid.h:47:2: warning: No gettid found, use getpid instead [-W#warnings]
47 | #warning No gettid found, use getpid instead
| ^
```
There's no reason to do this: Solaris can use `pthread_self` just as AIX
does.
Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.
|
|
When running the `openmp` testsuite on Solaris/amd64, many tests `FAIL`
like
```
# | OMP: Error #11: Stack overflow detected for OpenMP thread #1
```
In a `Debug` build, I also get
```
# | Assertion failure at kmp_runtime.cpp(203): __kmp_gtid_get_specific() < 0 || __kmp_gtid_get_specific() == i.
```
Further investigation shows that just setting `__kmp_gtid_mode` to 3
massively reduces the number of failures.
Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.
|
|
When building `openmp` on Linux/sparc64, I get
```
In file included fromopenmp/runtime/src/kmp_utility.cpp:16:
openmp/runtime/src/kmp_wrapper_getpid.h:47:2: warning: No gettid found, use getpid instead [-W#warnings]
47 | #warning No gettid found, use getpid instead
| ^
```
This is highly confusing since `<sys/syscall.h>` **does** define
`SYS_gettid` and the header is supposed to be included:
```
#if !defined(KMP_OS_AIX) && !defined(KMP_OS_HAIKU)
#include <sys/syscall.h>
#endif
```
However, this actually is **not** the case for two reasons:
- `KMP_OS_HAIKU` is always defined, either as 1 on Haiku or as 0
otherwise.
- `KMP_OS_AIX` is even worse: it is only defined as 1 on on AIX, but
undefined otherwise.
All those `KMP_OS_*` macros are supposed to always be defined as 1/0 as
appropriate, and to be checked with `#if`, not `#ifdef`. AIX is
violating this, causing the problem above.
Other targets probably get `<sys/syscall.h>` indirectly otherwise, but
Linux/sparc64 does not.
This patch fixes this by also defining `KMP_OS_AIX` as 0 on other OSes
and changing the checks to `#if` as necessary.
Tested on `sparc64-unknown-linux-gnu`, `sparcv9-sun-solaris2.11`,
`amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`.
|
|
initialization (#136837)
This commit resolves multiple issues in the OpenMP taskgraph implementation:
- Fix a potential use of uninitialized is_taskgraph and tdg fields when a task is created outside of a taskgraph construct.
- Fix use of task ID field when accessing the taskgraph’s record_map.
- Fix resizing and copying of the successors array when its capacity is exceeded.
Fixes memory management flaws, invalid memory accesses, and uninitialized data risks in taskgraph operations.
|
|
|
|
Haiku does not have a means of retrieving the desired information
and the -1 setting causes the code to fallback anyway.
|
|
TR11 introduced changes to support target memory management in a unified
way by defining a series of API routines and additional traits. Host
runtime is oblivious to how actual memory resources are mapped when
using the new API routines, so it can only support how the composed
memory space is maintained, and the offload backend must handle which
memory resources are actually used to allocate memory from the memory
space.
Here is summary of the implementation.
* Implemented 12 API routines to get/mainpulate memory space/allocator.
* Memory space composed with a list of devices has a state with resource
description, and runtime is responsible for maintaining the allocated
memory space objects.
* Defined interface with offload runtime to access memory resource list,
and to redirect calls to omp_alloc/omp_free since it requires
backend-specific information.
* Value of omp_default_mem_space changed from 0 to 99, and
omp_null_mem_space took the value 0 as defined in the language.
* New allocator traits were introduced, but how to use them is up to the
offload backend.
* Added basic tests for the new API routines.
|
|
This patch adds support for memory allocation using hwloc. To enable
memory allocation using hwloc, env KMP_TOPOLOGY_METHOD=hwloc needs to be
used. If hwloc is not supported/available, allocation will fallback to
default path.
|
|
|
|
Changes from https://github.com/llvm/llvm-project/pull/133034 removed a
`}` presumably accidentally that are causing failures in the AIX flang
bot.
|
|
Co-authored-by: Jérôme Duval <jerome.duval@gmail.com>
|
|
(#131571)
|
|
This patch attempts to provide a fix for an issue that appears when the
`__kmp_dist_for_static_init` function is called from a serialized team.
This is triggered by code generated by flang for `distribute parallel
do` constructs whenever an `if` clause for the `parallel` leaf construct
is present. This results in the introduction of a call to
`__kmpc_fork_call_if` in place of `__kmpc_fork_call`. When it evaluates
to `false`, it defers execution to `__kmp_serialized_parallel`, which
creates a new serial team that is picked up by
`__kmp_dist_for_static_init`, resulting in an incorrect `team` pointer
that causes the `nteams == (kmp_uint32)team->t.t_parent->t.t_nproc`
assertion to fail.
The sequence of calls replicating this issue can be summarized as:
- `__kmpc_fork_teams`
- `__kmpc_fork_call_if`
- `__kmpc_dist_for_static_init_*`
Since I am not familiar with the implementation of the OpenMP runtime,
it is possible that the above sequence of calls is incorrect, or that
the bug can be better fixed in another way, so I am open to discussing
this.
The following Fortran program can be compiled with flang to show the
issue:
```f90
! Compile and run: flang -fopenmp test.f90 -o test && ./test
! Check LLVM IR: flang -fc1 -emit-llvm -fopenmp test.f90 -o -
program main
implicit none
integer, parameter :: n = 10
integer :: i, idx(n)
!$omp teams
!$omp distribute parallel do if(.false.)
do i=1,n
idx(i) = i
end do
!$omp end teams
print *, idx
end program
```
|
|
(#130751)
Updating OpenMP runtime taskgraph support(record/replay mechanism):
- Adds a `graph_reset` bit in `kmp_taskgraph_flags_t` to discard
existing TDG records.
- Switches from a strict index-based TDG ID/IDX to a more flexible
integer-based, which can be any integer (e.g. hashed).
- Adds helper functions like `__kmp_find_tdg`, `__kmp_alloc_tdg`, and
`__kmp_free_tdg` to manage TDGs by their IDs.
These changes pave the way for the integration of OpenMP taskgraph (spec
6.0). Taskgraphs are still recorded in an array with a lookup efficiency
reduced to O(n), where n ≤ `__kmp_max_tdgs`. This can be optimized by
moving the TDGs to a hashtable, making lookups more efficient. The
provided helper routines facilitate easier future optimizations.
|
|
the debugger (#130660)
This PR creates a new member for task data, which is used to identify
the task in its taskgraph (when ompx taskgraph is enabled).
It aims to remove the overloading of the td_task_id member, which was
used both by the debugger and the taskgraph. This resulted in the
identifier's non-unicity in the case of multiple taskgraphs.
Co-authored-by: Rémy Neveu <rem2007@free.fr>
|
|
signedness (#128204)
This PR fixes warning which occurs if one compiles OpenMP runtime with
GCC:
```
warning: comparison of integer expressions of different signedness: 'kmp_intptr_t' {aka 'long int'} and 'long unsigned int' [-Wsign-compare]
```
|
|
In several cases the flags entries in ompt_frame_t are not initialized.
According to @jdelsign the address provided as reenter and exit address
is the canonical frame address (cfa) rather than a "framepointer". This
patch makes sure that the flags entry is always initialized and changes
the value from ompt_frame_framepointer to ompt_frame_cfa.
The assertion in the tests makes sure that the flags are always set,
when a tool (callback.h in this case) looks at the value.
Fixes #89058
|
|
|
|
|
|
|
|
Fixes build on OpenBSD/aarch64.
```
FAILED: openmp/runtime/src/CMakeFiles/omp.dir/kmp_runtime.cpp.o
/home/brad/tmp/llvm-build/bin/clang++ --target=aarch64-unknown-openbsd7.6 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Domp_EXPORTS -I/home/brad/tmp/llvm-build/runtimes/runtimes-bins/openmp/runtime/src -I/home/brad/tmp/llvm-brad/openmp/runtime/src -I/home/brad/tmp/llvm-brad/openmp/runtime/src/i18n -I/home/brad/tmp/llvm-brad/openmp/runtime/src/include -I/home/brad/tmp/llvm-brad/openmp/runtime/src/thirdparty/ittnotify -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wall -fcolor-diagnostics -Wcast-qual -Wformat-pedantic -Wimplicit-fallthrough -Wsign-compare -Wno-extra -Wno-pedantic -fno-semantic-interposition -fdata-sections -O3 -DNDEBUG -std=c++17 -fPIC -D _GNU_SOURCE -D _REENTRANT -U_GLIBCXX_ASSERTIONS -UNDEBUG -fno-exceptions -fno-rtti -Wno-covered-switch-default -Wno-frame-address -Wno-strict-aliasing -Wno-switch -Wno-uninitialized -Wno-return-type-c-linkage -Wno-cast-qual -Wno-int-to-void-pointer-cast -MD -MT openmp/runtime/src/CMakeFiles/omp.dir/kmp_runtime.cpp.o -MF openmp/runtime/src/CMakeFiles/omp.dir/kmp_runtime.cpp.o.d -o openmp/runtime/src/CMakeFiles/omp.dir/kmp_runtime.cpp.o -c /home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:1449:47: error: value of type 'kmp_va_list' (aka '__builtin_va_list') is not contextually convertible to 'bool'
1449 | return (master_th->th.th_teams_microtask && ap &&
| ^~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:1449:44: error: invalid operands to binary expression ('microtask_t' (aka 'void (*)(int *, int *, ...)') and 'kmp_va_list' (aka '__builtin_va_list'))
1449 | return (master_th->th.th_teams_microtask && ap &&
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:1457:15: warning: comparison between NULL and non-pointer ('kmp_va_list' (aka '__builtin_va_list') and NULL) [-Wnull-arithmetic]
1457 | return ((ap == NULL && active_level == 0) ||
| ~~ ^ ~~~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:1457:15: error: invalid operands to binary expression ('kmp_va_list' (aka '__builtin_va_list') and 'long')
1457 | return ((ap == NULL && active_level == 0) ||
| ~~ ^ ~~~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:1458:12: error: value of type 'kmp_va_list' (aka '__builtin_va_list') is not contextually convertible to 'bool'
1458 | (ap && teams_level > 0 && teams_level == level));
| ^~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:1458:15: error: invalid operands to binary expression ('kmp_va_list' (aka '__builtin_va_list') and 'bool')
1458 | (ap && teams_level > 0 && teams_level == level));
| ~~ ^ ~~~~~~~~~~~~~~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:1735:9: error: invalid argument type 'kmp_va_list' (aka '__builtin_va_list') to unary expression
1735 | if (!ap) {
| ^~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:2169:66: warning: comparison between NULL and non-pointer ('kmp_va_list' (aka '__builtin_va_list') and NULL) [-Wnull-arithmetic]
2169 | !(microtask == (microtask_t)__kmp_teams_master || ap == NULL))
| ~~ ^ ~~~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:2169:66: error: invalid operands to binary expression ('kmp_va_list' (aka '__builtin_va_list') and 'long')
2169 | !(microtask == (microtask_t)__kmp_teams_master || ap == NULL))
| ~~ ^ ~~~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:2284:9: error: value of type 'kmp_va_list' (aka '__builtin_va_list') is not contextually convertible to 'bool'
2284 | if (ap) {
| ^~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:2302:58: error: invalid argument type 'kmp_va_list' (aka '__builtin_va_list') to unary expression
2302 | __kmp_fork_team_threads(root, team, master_th, gtid, !ap);
| ^~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:2363:9: error: value of type 'kmp_va_list' (aka '__builtin_va_list') is not contextually convertible to 'bool'
2363 | if (ap) {
| ^~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:7803:3: error: no matching function for call to '__kmp_fork_call'
7803 | __kmp_fork_call(loc, gtid, fork_context_intel, team->t.t_argc,
| ^~~~~~~~~~~~~~~
/home/brad/tmp/llvm-brad/openmp/runtime/src/kmp_runtime.cpp:1927:5: note: candidate function not viable: no known conversion from 'long' to 'kmp_va_list' (aka '__builtin_va_list') for 7th argument
1927 | int __kmp_fork_call(ident_t *loc, int gtid,
| ^
1928 | enum fork_context_e call_context, // Intel, GNU, ...
1929 | kmp_int32 argc, microtask_t microtask, launch_t invoker,
1930 | kmp_va_list ap) {
| ~~~~~~~~~~~~~~
2 warnings and 11 errors generated.
```
|
|
This patch series demonstrates and fixes a bug that causes crashes with
OpenMP 'taskwait' directives in heavily multi-threaded scenarios.
TLDR: The early return from __kmpc_omp_taskwait_deps_51 missed the
synchronization mechanism in place for the late return.
Additional debug assertions check for the implied invariants of the code.
@jpeyton52 found the timing hole as this sequence of events:
>
> 1. THREAD 1: A regular task with dependences is created, call it T1
> 2. THREAD 1: Call into `__kmpc_omp_taskwait_deps_51()` and create a stack
based depnode (`NULL` task), call it T2 (stack)
> 3. THREAD 2: Steals task T1 and executes it getting to
`__kmp_release_deps()` region.
> 4. THREAD 1: During processing of dependences for T2 (stack) (within
`__kmp_check_deps()` region), a link is created T1 -> T2. This increases
T2's (stack) `nrefs` count.
> 5. THREAD 2: Iterates through the successors list: decrement the T2's
(stack) npredecessor count. BUT HASN'T YET `__kmp_node_deref()`-ed it.
> 6. THREAD 1: Now when finished with `__kmp_check_deps()`, it returns false
because npredecessor count is 0, but T2's (stack) `nrefs` count is 2 because
THREAD 2 still references it!
> 7. THREAD 1: Because `__kmp_check_deps()` returns false, early exit.
> _Now the stack based depnode is invalid, but THREAD 2 still references it._
>
> We've reached improper stack referencing behavior. Varied results/crashes/
asserts can occur if THREAD 1 comes back and recreates the exact same depnode
in the exact same stack address during the same time THREAD 2 calls
`__kmp_node_deref()`.
|
|
|
|
Summary:
Freestanding builds have `stddef.h` and `stdint.h` but not `stdlib.h`.
We don't actually use any `stdlib.h` definitions in the OpenMP headers,
and some definitions from this header are usable without the OpenMP
runtime (allocators) so we should be able to do this. This ignores the
include if possible, removing the implicit include would possibly break
some applications so it stays here.
|
|
When libomp is built with -cf-protection, add endbr instructions to the
start of functions for Intel CET support.
|
|
Emscripten on Windows (#116874)
|
|
The `openmp` runtime failed to build on LoP with LLVM18 on LoP due to
the addition of `-Wvla-cxx-extension` as
```
llvm-project/openmp/runtime/src/ompt-general.cpp:711:15: error: variable length arrays in C++ are a Clang extension [-Werror,-Wvla-cxx-extension]
711 | int tmp_ids[ids_size];
| ^~~~~~~~
llvm-project/openmp/runtime/src/ompt-general.cpp:711:15: note: function parameter 'ids_size' with unknown value cannot be used in a constant expression
llvm-project/openmp/runtime/src/ompt-general.cpp:704:65: note: declared here
704 | OMPT_API_ROUTINE int ompt_get_place_proc_ids(int place_num, int ids_size,
| ^
1 error generated.
```
This patch is to ignore the checking against this usage.
|
|
|
|
Add libgomp.1.dylib for MacOS and libgomp.so.1 for Linux
Linkers on Mac and Linux pick up versioned libgomp dynamic library
files. The existing softlinks (libgomp.dylib for MacOS and libgomp.so
for Linux) are insufficient. This helps alleviate the issue of mixing
libgomp and libomp at runtime.
|
|
This patch modifies the signature of the `__kmp_print_tdg_dot` function
in `kmp_tasking.cpp` to include the global thread ID (gtid) as an
argument. The gtid is now correctly passed to the function.
- Updated the function declaration to accept the gtid parameter.
- Modified all calls to `__kmp_print_tdg_dot` to pass the correct gtid
value.
This change addresses issues encountered when compiling with
`OMPX_TASKGRAPH` enabled. No functional changes are expected beyond
successful compilation.
|
|
On powerpc, physical_package_id may not be available. Currently, this
causes openmp to fall back to flat topology and various affinity tests
fail.
Fix this by parsing core_siblings_list to deterimine which cpus belong
to the same socket. This matches what the testing code does. The code to
parse the CPU list format thankfully already exists.
Fixes https://github.com/llvm/llvm-project/issues/111809.
|
|
For `libomp` on AIX, we build shared object `libomp.so` first and then
archive it into libomp.a. This patch changes to use SO version `1` and
name the shared object `libomp.so.1` so that it is consistent with the
naming of other shared objects in AIX libraries, e.g., `libc++.so.1` in
`libc++.a`. With this change, the change made in commit
bde51d9b0d473447ea12fb14924f14ea167eec85 to ensure only `libomp.a` is
published on AIX is no longer necessary and is removed.
|