path: root/openmp/runtime/src
Age  Commit message  Author  Files  Lines
2024-04-12[OpenMP] Fix re-locking hang found in issue 86684 (#88539)Jonathan Peyton1-6/+6
This was initially reported here (including stacktraces): https://stackoverflow.com/questions/78183545/does-compiling-imagick-with-openmp-enabled-in-freebsd-13-2-cause-sched-yield If `__kmp_register_library_startup()` detects that another instance of the library is present, `__kmp_is_address_mapped()` is eventually called, which uses `kmpc_malloc()` to allocate memory. This function calls `__kmp_entry_thread()` to access the thread-local memory pool, which is a bad idea during initialization. This macro internally calls `__kmp_get_global_thread_id_reg()`, which sets the bootstrap lock at the beginning (before calling `__kmp_register_library_startup()`). The fix is to use `KMP_INTERNAL_MALLOC()`/`KMP_INTERNAL_FREE()` instead of `kmpc_malloc()`/`kmpc_free()`. `KMP_INTERNAL_MALLOC` and `KMP_INTERNAL_FREE` do not use any bootstrap locks. They just translate to `malloc()`/`free()` and are meant to be used during library initialization before other library-specific allocators have been initialized. Fixes: #86684
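The hang described above can be modeled in a few lines. This is only a sketch under stated assumptions: `bootstrap_lock`, `alloc_needing_lock`, `alloc_internal`, and `free_internal` are hypothetical names, not the runtime's own symbols, and the lock-taking path returns `nullptr` where a real spin lock would loop forever.

```cpp
#include <atomic>
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a non-recursive bootstrap lock.
static std::atomic_flag bootstrap_lock = ATOMIC_FLAG_INIT;

// Returns true if the lock was acquired, false if it was already held.
bool try_acquire_bootstrap_lock() {
  return !bootstrap_lock.test_and_set(std::memory_order_acquire);
}

void release_bootstrap_lock() {
  bootstrap_lock.clear(std::memory_order_release);
}

// Models an allocator that must take the bootstrap lock before allocating.
// A real spin lock would retry forever if the caller already held the lock;
// this sketch reports that situation by returning nullptr instead.
void *alloc_needing_lock(std::size_t size) {
  if (!try_acquire_bootstrap_lock())
    return nullptr; // this is where the re-locking hang would occur
  void *p = std::malloc(size);
  release_bootstrap_lock();
  return p;
}

// Models the lock-free path: per the commit message, the internal
// allocation macros simply translate to malloc()/free().
void *alloc_internal(std::size_t size) {
  assert(size > 0);
  return std::malloc(size);
}

void free_internal(void *p) { std::free(p); }
```

While the lock is held, `alloc_needing_lock` cannot make progress but `alloc_internal` still succeeds, which is the property the fix relies on during initialization.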
2024-04-09[Libomp] Place generated OpenMP headers into build resource directory (#88007)Joseph Huber1-9/+17
Summary: These headers are a part of the compiler's resource directory once installed. However, they are currently placed in the binary directory temporarily. This makes it more difficult to use the compiler out of the build directory and will cause issues when moving to `liboffload`. This patch changes the logic to write these instead to the compiler's resource directory inside of the build tree. NOTE: This doesn't change the Fortran headers; I don't know enough about those and it won't use the same directory.
2024-04-08Revert "[Libomp] Place generated OpenMP headers into build resource directory (#88007)" (#88083)Pete Steinfeld1-16/+9
This reverts commit 8671429151d5e67d3f21a737809953ae8bdfbfde. This commit broke the flang build, so I'm reverting it. See the comments in merge request #88007 for more information.
2024-04-08[Libomp] Place generated OpenMP headers into build resource directory (#88007)Joseph Huber1-9/+16
Summary: These headers are a part of the compiler's resource directory once installed. However, they are currently placed in the binary directory temporarily. This makes it more difficult to use the compiler out of the build directory and will cause issues when moving to `liboffload`. This patch changes the logic to write these instead to the compiler's resource directory inside of the build tree. NOTE: This doesn't change the Fortran headers; I don't know enough about those and it won't use the same directory.
2024-04-03[OpenMP] Add absolute KMP_HW_SUBSET functionality (#85326)Jonathan Peyton2-75/+139
Users can put a : in front of the KMP_HW_SUBSET value to indicate that the specified subset is an "absolute" subset. Currently, when a user sets KMP_HW_SUBSET=1t, this gets translated to KMP_HW_SUBSET="*s,*c,1t", where * means "use all of". If a user wants only one thread as the entire topology, they can now do KMP_HW_SUBSET=:1t. Along with the absolute syntax comes a fix for newer machines that makes them easier to use with only the 3-level topology syntax. When a user puts KMP_HW_SUBSET=1s,4c,2t on a machine which actually has 4 layers (say 1s,2m,3c,2t as the entire machine), the user gets an unexpected "too many resources asked" message because KMP_HW_SUBSET currently translates the "4c" value to mean 4 cores per module. To help users out, the runtime can assume that these newer layers, module in this case, should be ignored if they are not specified, but the topology should always take into account the sockets, cores, and threads layers.
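To make the difference between the default and the absolute form concrete, here is a rough sketch of how such a value could be interpreted. It is not the runtime's parser: `HwSubset`, `parse_hw_subset`, and `effective_subset` are hypothetical names, and the sketch only understands single-letter `s`, `c`, and `t` suffixes.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct HwSubset {
  bool absolute = false;          // true when the value starts with ':'
  std::vector<std::string> items; // e.g. {"1s", "4c", "2t"}
};

// Splits a KMP_HW_SUBSET-style value into items and notes a leading ':'.
HwSubset parse_hw_subset(const std::string &value) {
  HwSubset result;
  std::string rest = value;
  if (!rest.empty() && rest[0] == ':') {
    result.absolute = true;
    rest.erase(0, 1);
  }
  std::stringstream stream(rest);
  std::string item;
  while (std::getline(stream, item, ','))
    if (!item.empty())
      result.items.push_back(item);
  return result;
}

// Non-absolute subsets get "*" (use all of) for any unspecified
// socket/core/thread layer; absolute subsets are taken as written.
// Assumption: other layer kinds are ignored by this sketch.
std::string effective_subset(const HwSubset &subset) {
  std::vector<std::string> layers;
  if (subset.absolute) {
    layers = subset.items;
  } else {
    const char required[] = {'s', 'c', 't'};
    for (char layer : required) {
      std::string chosen = std::string("*") + layer;
      for (const std::string &item : subset.items) {
        assert(!item.empty());
        if (item.back() == layer)
          chosen = item;
      }
      layers.push_back(chosen);
    }
  }
  std::string joined;
  for (std::size_t i = 0; i < layers.size(); ++i) {
    if (i > 0)
      joined += ',';
    joined += layers[i];
  }
  return joined;
}
```

With this sketch, `1t` expands to `*s,*c,1t`, while `:1t` stays `1t`, matching the two cases the commit message describes.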
2024-04-02[OpenMP] Fix nested parallel with tasking (#87309)Jonathan Peyton1-6/+9
When a nested parallel region ends, the runtime calls __kmp_join_call(). During this call, the primary thread of the nested parallel region will reset its tid (retval of omp_get_thread_num()) to what it was in the outer parallel region. A data race occurs with the current code when another worker thread from the nested inner parallel region tries to steal tasks from the primary thread's task deque. The worker thread reads the tid value directly from the primary thread's data structure and may read the wrong value. This change just uses the calculated victim_tid from execute_tasks() directly in the steal_task() routine rather than reading tid from the data structure. Fixes: #87307
2024-04-02[OpenMP] get logical core count on modern apple platform (#87231)nihui1-15/+2
`hw.logicalcpu` returns the available logical core count. Fix build error for watchOS:
```
runtime/src/z_Linux_util.cpp:1821:8: error: 'host_info' is unavailable: not available on watchOS
rc = host_info(mach_host_self(), HOST_BASIC_INFO, (host_info_t)&info, &num);
^
/Applications/Xcode_15.2.app/Contents/Developer/Platforms/WatchOS.platform/Developer/SDKs/WatchOS10.2.sdk/usr/include/mach/mach_host.h:82:15: note: 'host_info' has been explicitly marked unavailable here
kern_return_t host_info
^
1 warning and 1 error generated.
make[2]: *** [runtime/src/CMakeFiles/omp.dir/z_Linux_util.cpp.o] Error 1
```
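A minimal sketch of querying `hw.logicalcpu` follows. The function name `logical_cpu_count` and the non-Apple fallback are assumptions made for illustration; the commit itself only states that the `hw.logicalcpu` sysctl is used on Apple platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Returns the number of logical CPUs, never less than 1.
int logical_cpu_count() {
  int result = 0;
#if defined(__APPLE__)
  // Ask the kernel directly instead of calling host_info().
  int count = 0;
  std::size_t len = sizeof(count);
  if (sysctlbyname("hw.logicalcpu", &count, &len, nullptr, 0) == 0 && count > 0)
    result = count;
#endif
  if (result == 0) {
    // Portable fallback for this sketch; may report 0 if unknown.
    unsigned n = std::thread::hardware_concurrency();
    result = n > 0 ? static_cast<int>(n) : 1;
  }
  assert(result >= 1);
  return result;
}
```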
2024-04-02[OpenMP] arm64_32 port for Apple WatchOS (#87246)nihui6-17/+26
Detect `aarch64_32` with the compiler-defined macro `__ARM64_ARCH_8_32__`. Reuse ARM `__kmp_unnamed_critical_addr` and add the `KMP_PREFIX_UNDERSCORE` macro like AARCH64. Reuse AARCH64 `__kmp_invoke_microtask`. Build log for watchOS armv7k + arm64_32 and watchOS simulator x86_64 + arm64: https://github.com/nihui/action-protobuf/actions/runs/8520684611/job/23337305030
2024-03-29[OpenMP] Have hidden helper team allocate new OS threads only (#87119)Jonathan Peyton1-3/+5
The hidden helper team pre-allocates the gtid space [1, num_hidden_helpers] (inclusive). If regular host threads are allocated, then put back in the thread pool, then the hidden helper team is initialized, the hidden helper team tries to allocate the threads from the thread pool with gtids higher than [1, num_hidden_helpers]. Instead, have the hidden helper team fork OS threads so the correct gtid range is used for hidden helper threads. Fixes: #87117
2024-03-28[OpenMP] Fix node destruction race in __kmpc_omp_taskwait_deps_51 (#86130)Ulrich Weigand1-0/+6
The __kmpc_omp_taskwait_deps_51 function allocates a kmp_depnode_t node on its stack, and there is currently a race condition where another thread might still be accessing that node after the function has returned and its stack frame was released. While the function does wait until the node's npredecessors count has reached zero before exiting, there is still a window where the function that last decremented the npredecessors count assumes the node is still accessible. For heap-allocated kmp_depnode_t nodes, this normally works via a separate ndeps count that only reaches zero at the point where no accesses to the node are expected at all; in fact, at this point the heap allocation will be freed. For this case of a stack-allocated kmp_depnode_t node, it therefore makes sense to similarly respect the ndeps count; we need to wait until this reaches 1 (not 0, because it is not heap-allocated so there's always one extra count to prevent it from being freed), before we can safely deallocate our stack frame. As this is expected to be a short race window of only a few instructions, it should be fine to just use a busy wait loop checking the ndeps count. Fixes: https://github.com/llvm/llvm-project/issues/85963
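The waiting rule described above can be sketched with two atomic counters. `DepNode`, `release_node`, and `wait_before_stack_release` are hypothetical, heavily simplified stand-ins (the real `kmp_depnode_t` carries much more state); the field name `ndeps` follows the wording of the commit message.

```cpp
#include <atomic>
#include <cassert>

// Simplified stand-in for a stack-allocated dependency node.
struct DepNode {
  std::atomic<int> npredecessors{0}; // predecessors still outstanding
  std::atomic<int> ndeps{1};         // the owner's extra count keeps this >= 1
};

// Called by another thread once it has finished touching the node.
void release_node(DepNode &node) {
  int previous = node.ndeps.fetch_sub(1, std::memory_order_acq_rel);
  // A stack node must never drop to 0: the owner always holds one count.
  assert(previous > 1);
  (void)previous;
}

// Called by the owner before its stack frame (and the node) goes away.
void wait_before_stack_release(DepNode &node) {
  // First wait for all predecessors to complete ...
  while (node.npredecessors.load(std::memory_order_acquire) > 0) {
  }
  // ... then busy-wait until only the owner's own count remains.
  while (node.ndeps.load(std::memory_order_acquire) > 1) {
  }
}
```

The second loop is the part the fix adds conceptually: reaching zero predecessors is not enough, because the thread that performed the last decrement may still be inside the node for a few instructions.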
2024-03-27[OpenMP] Close up permissions on /tmp files (#85469)Terry Wilmarth1-5/+5
The SHM or /tmp files that might be created during library registration don't need to have such open permissions, so this change fixes that.
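As an illustration of tighter permissions, the sketch below creates a file readable and writable by its owner only. The exact mode used by the commit is not stated in the log, so the owner-only mode here is an assumption, and `create_private_file` and `is_owner_only` are hypothetical helper names.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Creates a new file that only its owner can read or write.
// Returns the file descriptor, or -1 on failure (including "already exists").
int create_private_file(const char *path) {
  assert(path != nullptr);
  return open(path, O_CREAT | O_EXCL | O_RDWR, S_IRUSR | S_IWUSR);
}

// Returns true if neither the group nor others have any access bits set.
bool is_owner_only(const char *path) {
  struct stat st;
  if (stat(path, &st) != 0)
    return false;
  return (st.st_mode & (S_IRWXG | S_IRWXO)) == 0;
}
```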
2024-03-27[NFC][OpenMP] Use `SimpleVLA` to replace variable length arrays in C++Shilei Tian1-2/+3
2024-03-27[NFC][OpenMP] Silence unused variable in `kmp_collapse.cpp`Shilei Tian1-2/+2
2024-03-26[OpenMP] add loop collapse tests (#86243)Vadim Paretsky1-8/+3
This PR adds loop collapse tests ported from MSVC. --------- Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
2024-03-22[OpenMP][AIX] Affinity implementation for AIX (#84984)Xing Xue5-18/+232
This patch implements `affinity` for AIX, which is quite different from platforms such as Linux.
- Setting CPU affinity through masks and related functions is not supported. The system call `bindprocessor()` is used to bind a thread to one CPU per call.
- There are no system routines to get the affinity info of a thread. The implementation of `get_system_affinity()` for AIX gets the mask of all available CPUs, to be used as the full mask only.
- Topology is not available from the file system. It is obtained through system SRAD (Scheduler Resource Allocation Domain).
This patch has run through the libomp LIT tests successfully with `affinity` enabled.
2024-03-20[flang][OpenMP] Compile proper `omp_lib.mod` from the `openmp/src/include` sources (#80874)Michael Klemm3-151/+372
This PR changes the build system to use the sources for the module `omp_lib` and the `omp_lib.h` include file from the `openmp` runtime project and not from a separate copy of these files. This will greatly reduce potential for inconsistencies when adding features to the OpenMP runtime implementation. When the OpenMP subproject is not configured, this PR also disables the corresponding LIT tests with a "REQUIRES" directive at the beginning of the OpenMP test files. --------- Co-authored-by: Valentin Clement (バレンタイン クレメン) <clementval@gmail.com>
2024-03-18[OpenMP] Add OpenMP extension API to dump mapping tables (#85381)nicebert1-0/+2
This adds an API call ompx_dump_mapping_tables. This allows users to debug the mapping tables and can be especially useful for unified shared memory applications to check if the code behaves in the way it should. The implementation reuses code already present to dump mapping tables (in a debug setting). --------- Co-authored-by: Joseph Huber <huberjn@outlook.com>
2024-03-16Revert "[openmp] __kmp_x86_cpuid fix for i386/PIC builds." (#85526)David CARLIER1-10/+0
Reverts llvm/llvm-project#84626
2024-03-14[openmp][wasm] Fix microtask type mismatch (#84355)Andrew Brown1-30/+77
When OpenMP is compiled for WebAssembly (see #71297), it invokes a microtask via a `switch` statement that dispatches to the `void *` microtask pointer with spelled-out arguments (not varargs). As #83329 points out, however, this can result in a type mismatch when the indirect call is executed by WebAssembly; WebAssembly expects the called pointer to have the precise type of the call site. This change fixes the issue by bringing back the approach in [D142593] of type-casting all the `switch` arms to the precise type. This fixes #83329. [D142593]: https://reviews.llvm.org/D142593
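The dispatch pattern described above looks roughly like the following. This is a cut-down sketch, not the runtime's `__kmp_invoke_microtask`: the function `invoke_microtask` here handles only zero to two arguments, and `store_ids` is a hypothetical microtask added for demonstration.

```cpp
#include <cassert>

// Calls `pkfn` with pointers to gtid and tid, followed by `argc` pointer
// arguments. Each case casts the untyped pointer to the exact function type
// of the call site, which is what an indirect call in WebAssembly requires.
// Returns 1 on success, 0 if this sketch does not support `argc`.
int invoke_microtask(void *pkfn, int gtid, int tid, int argc, void *argv[]) {
  assert(pkfn != nullptr);
  switch (argc) {
  case 0:
    reinterpret_cast<void (*)(int *, int *)>(pkfn)(&gtid, &tid);
    break;
  case 1:
    reinterpret_cast<void (*)(int *, int *, void *)>(pkfn)(&gtid, &tid,
                                                           argv[0]);
    break;
  case 2:
    reinterpret_cast<void (*)(int *, int *, void *, void *)>(pkfn)(
        &gtid, &tid, argv[0], argv[1]);
    break;
  default:
    return 0; // more arguments would need more spelled-out cases
  }
  return 1;
}

// Hypothetical one-argument microtask: records both ids in *out.
void store_ids(int *gtid, int *tid, void *out) {
  *static_cast<int *>(out) = *gtid * 100 + *tid;
}
```

Calling `invoke_microtask(reinterpret_cast<void *>(&store_ids), 3, 7, 1, args)` reaches the `case 1` arm, whose cast matches the signature of `store_ids` exactly.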
2024-03-13[OpenMP] Sort topology after adding processor group layer. (#83943)MessyHack1-0/+3
Various behavior around creating affinity masks and detecting uniform topology depends on the topology being sorted. Re-sort the topology after adding the processor group layer to ensure that the updated topology reflects the newly added processor group info. Observed that the topology was not sorted correctly on high core count AMD Epyc Genoa (2 sockets, 96 cores, 2 threads) using NUMA (NPS 2+).
2024-03-12[OpenMP] Remove unused logical/physical CPUID information (#83298)Jonathan Peyton2-69/+1
2024-03-12[OpenMP] Make sure mask is set to nullptr (#83299)Jonathan Peyton1-2/+2
2024-03-12[OpenMP] Add debug checks for divide by zero (#83300)Jonathan Peyton2-3/+22
2024-03-11[OpenMP] Fixup while loops to avoid bad NULL check (#83302)Jonathan Peyton2-12/+7
2024-03-11[OpenMP] Remove unnecessary check of ap (#83303)Jonathan Peyton1-8/+2
2024-03-11[OpenMP] Make sure ptr is used after NULL check (#83304)Jonathan Peyton2-8/+8
2024-03-11[OpenMP] Remove dead code of checking int > INT_MAX (#83305)Jonathan Peyton1-6/+0
2024-03-11[openmp] __kmp_x86_cpuid fix for i386/PIC builds. (#84626)David CARLIER1-0/+10
2024-03-10[openmp] adding affinity support to DragonFlyBSD. (#84672)David CARLIER6-16/+22
2024-03-09[OpenMP] fix endianness dependent definitions in OMP headers for MSVC (#84540)Vadim Paretsky2-3/+4
MSVC does not define __BYTE_ORDER__ making the check for BigEndian erroneously evaluate to true and breaking the struct definitions in MSVC compiled builds correspondingly. The fix adds an additional check for whether __BYTE_ORDER__ is defined by the compiler to fix these. --------- Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
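The reason the unguarded check misfires is that, inside `#if`, an identifier that is not defined as a macro evaluates to 0, so comparing two undefined macros yields `0 == 0`. A guarded check might look like the sketch below; `EXAMPLE_BIG_ENDIAN`, `runtime_is_big_endian`, and `check_endianness_macro` are hypothetical names, and the runtime probe is included only as a cross-check.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Only trust the comparison when the compiler actually defines both macros.
// Compilers that define neither fall through to the little-endian branch.
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) &&               \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define EXAMPLE_BIG_ENDIAN 1
#else
#define EXAMPLE_BIG_ENDIAN 0
#endif

// Runtime probe: inspects the first byte of a known 32-bit pattern.
bool runtime_is_big_endian() {
  const std::uint32_t probe = 0x01020304u;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 0x01;
}

// Confirms that the compile-time answer matches the machine.
void check_endianness_macro() {
  assert((EXAMPLE_BIG_ENDIAN == 1) == runtime_is_big_endian());
}
```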
2024-03-09[openmp] porting affinity feature to netbsd. (#84618)David CARLIER6-14/+26
NetBSD supports the portable hwloc layer as well. On hardware with 4 CPUs, a CPU set is 4 and maxcpus is 256.
2024-03-08[OpenMP] Implements __kmp_is_address_mapped for Solaris/Illumos. (#82930)David CARLIER1-4/+68
Also fixing OpenMP build itself for this platform.
2024-03-07[OpenMP] runtime support for efficient partitioning of collapsed triangular loops (#83939)vadikp-intel2-2/+320
This PR adds OMP runtime support for more efficient partitioning of certain types of collapsed loops that can be used by compilers that support loop collapsing (e.g. MSVC) to achieve better thread load balancing. In particular, this PR addresses doubly nested upper and lower isosceles triangular loops of the following types:
1. lower triangular 'less_than': for (int i=0; i<N; i++) for (int j=0; j<i; j++)
2. lower triangular 'less_than_equal': for (int i=0; i<N; i++) for (int j=0; j<=i; j++)
3. upper triangular: for (int i=0; i<N; i++) for (int j=i; j<N; j++)
Includes tests for the three supported loop types. --------- Co-authored-by: Vadim Paretsky <b-vadipa@microsoft.com>
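One way to balance a collapsed triangular loop is to number its iterations with a single flat index, hand each thread an equal slice of that index range, and map each flat index back to `(i, j)`. The sketch below does this for the 'less_than_equal' lower triangular case only. It illustrates the general idea and is not taken from the runtime; `lower_triangle_size`, `flat_to_ij`, and `thread_chunk` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <utility>

// Iteration count of: for (i = 0; i < n; i++) for (j = 0; j <= i; j++)
std::uint64_t lower_triangle_size(std::uint64_t n) { return n * (n + 1) / 2; }

// Maps flat index k to (i, j), where row i is the largest value with
// i * (i + 1) / 2 <= k and j is the offset within that row.
std::pair<std::uint64_t, std::uint64_t> flat_to_ij(std::uint64_t k) {
  std::uint64_t i = static_cast<std::uint64_t>(
      (std::sqrt(8.0 * static_cast<double>(k) + 1.0) - 1.0) / 2.0);
  // Correct any floating-point rounding with exact integer checks.
  while (i * (i + 1) / 2 > k)
    --i;
  while ((i + 1) * (i + 2) / 2 <= k)
    ++i;
  return {i, k - i * (i + 1) / 2};
}

// Half-open range [begin, end) of flat indices assigned to thread `tid`;
// the first (total % nthreads) threads receive one extra iteration.
std::pair<std::uint64_t, std::uint64_t>
thread_chunk(std::uint64_t total, unsigned tid, unsigned nthreads) {
  assert(nthreads > 0 && tid < nthreads);
  std::uint64_t base = total / nthreads;
  std::uint64_t extra = total % nthreads;
  std::uint64_t begin = tid * base + std::min<std::uint64_t>(tid, extra);
  std::uint64_t end = begin + base + (tid < extra ? 1 : 0);
  return {begin, end};
}
```

Splitting by rows of `i` would give later threads far more work than earlier ones; splitting the flat index keeps per-thread iteration counts within one of each other.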
2024-02-27[OpenMP] Fix distributed barrier hang for OMP_WAIT_POLICY=passive (#83058)Jonathan Peyton1-3/+2
The resume thread logic inside __kmp_free_team() is faulty. Only checking b_go for sleep status doesn't wake up distributed barrier. Change to generic check for th_sleep_loc and calling __kmp_null_resume_wrapper(). Fixes: #80664
2024-02-27[OpenMP][OMPD] libompd must not link libomp (#83119)Joachim1-2/+3
Fixes a regression introduced in 91ccd8248. The code for libompd includes kmp.h for enum kmp_sched. The dependency to hwloc is not necessary. Avoid the dependency by skipping the definitions in kmp.h using types from hwloc.h. Fixes #80750
2024-02-25[OpenMP] Implement __kmp_is_address_mapped on DragonFlyBSD. (#82895)David CARLIER1-5/+46
implement internal __kmp_is_address_mapped.
2024-02-20[OpenMP][AIX]Add assembly file containing microtasking routines and unnamed common block definitions (#81770)Xing Xue1-0/+410
This patch adds assembly file `z_AIX_asm.S` that contains the 32- and 64-bit XCOFF version of microtasking routines and unnamed common block definitions. This code has been run through the libomp LIT tests and a user package successfully.
2024-02-16[OpenMP][AIX] Set worker stack size to 2 x KMP_DEFAULT_STKSIZE if system stack size is too big (#81996)Xing Xue2-0/+9
This patch sets the stack size of worker threads to `2 x KMP_DEFAULT_STKSIZE` (2 x 4MB) for AIX if the system stack size is too big. Also defines maximum stack size for 32-bit AIX.
2024-02-13[OpenMP][AIX]Define struct kmp_base_tas_lock with the order of two members swapped for big-endian (#79188)Xing Xue4-10/+20
The direct lock data structure has bit `0` (the least significant bit) of the first 32-bit word set to `1` to indicate it is a direct lock. On the other hand, the first word (in 32-bit mode) or first two words (in 64-bit mode) of an indirect lock are the address of the entry allocated from the indirect lock table. The runtime checks bit `0` of the first 32-bit word to tell if this is a direct or an indirect lock. This works fine for 32-bit and 64-bit little-endian because its memory layout of a 64-bit address is (`low word`, `high word`). However, this causes problems for big-endian where the memory layout of a 64-bit address is (`high word`, `low word`). If an address of the indirect lock table entry is something like `0x110035300`, i.e., (`0x1`, `0x10035300`), it is treated as a direct lock. This patch defines `struct kmp_base_tas_lock` with the ordering of the two 32-bit members flipped for big-endian PPC64 so that when checking/setting tags in member `poll`, the second word (the low word) is used. This patch also changes places where `poll` is not already explicitly specified for checking/setting tags.
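The misclassification can be reproduced with plain integer arithmetic, using the address from the commit message. The helpers below are hypothetical and simulate both memory layouts rather than depending on the host's byte order.

```cpp
#include <cassert>
#include <cstdint>

// Returns the 32-bit word that is stored first in memory for a 64-bit value:
// the low word on little-endian, the high word on big-endian.
std::uint32_t first_word_in_memory(std::uint64_t value, bool big_endian) {
  return big_endian ? static_cast<std::uint32_t>(value >> 32)
                    : static_cast<std::uint32_t>(value & 0xFFFFFFFFu);
}

// The check as described: bit 0 of the first 32-bit word marks a direct lock.
bool looks_like_direct_lock(std::uint64_t value, bool big_endian) {
  return (first_word_in_memory(value, big_endian) & 1u) != 0;
}

// The idea of the fix: always inspect the low word, whatever the layout.
bool looks_like_direct_lock_fixed(std::uint64_t value) {
  return (value & 1u) != 0;
}

// Walks through the example address from the commit message.
void check_example_address() {
  const std::uint64_t entry = 0x110035300ULL; // (0x1, 0x10035300)
  assert(!looks_like_direct_lock(entry, /*big_endian=*/false));
  assert(looks_like_direct_lock(entry, /*big_endian=*/true)); // the bug
  assert(!looks_like_direct_lock_fixed(entry));
}
```

On the simulated big-endian layout the first word is `0x1`, whose bit 0 is set, so the indirect lock's table address is mistaken for a direct lock; checking the low word avoids this.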
2024-02-09[OpenMP] Fix libomp debug build. (#81029)Daniil Fukalov1-0/+4
Disable libstdc++ assertions in the runtime library just like in https://reviews.llvm.org/D143168.
2024-02-07[OpenMP][test]Flip bit-fields in 'struct flags' for big-endian in test cases (#79895)Xing Xue1-1/+2
This patch flips bit-fields in `struct flags` for big-endian in test cases to be consistent with the definition of the structure in libomp `kmp.h`.
2024-02-06[OMPD] Runtime Entry Point functions for OMPD in libomp.so need C linkage as per standard. (#79246)vigbalu1-0/+8
Adding extern "C" to all the entry point functions to make sure that these functions are not mangled.
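For readers unfamiliar with the mechanism, a minimal illustration of C linkage follows. The function `example_entry_point` is a hypothetical placeholder, not one of the OMPD entry points.

```cpp
#include <cassert>

// Declared with C linkage: the symbol is emitted under its plain name
// (no C++ name mangling), so tools that look it up by that name can find it.
extern "C" int example_entry_point(int value);

// The definition inherits the linkage of the earlier declaration.
int example_entry_point(int value) {
  assert(value >= 0);
  return value + 1;
}
```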
2024-02-03[openmp] Add a dependency on the separate import library (#80449)Martin Storsjö1-0/+1
Currently, when doing e.g. "ninja check-openmp", the check-openmp target only depends on the target "omp", which builds the library. Thus by doing that, the separate import library "libomp.lib", which is generated directly from a def file, never gets created, unless one does a separate invocation first, that builds all targets. To fix this, make the "omp" target depend on the target for the separate import library, whenever that is created/used.
2024-01-25[openmp] Silence warning when compiling with MSVC targeting x86Alexandre Ganea1-1/+1
This fixes: ``` [3593/7449] Building CXX object projects\openmp\runtime\src\CMakeFiles\omp.dir\kmp_debug.cpp.obj C:\git\llvm-project\openmp\runtime\src\kmp_os.h(471): warning C4163: '_InlineInterlockedExchange64': not available as an intrinsic function ```
2024-01-23Re-land [openmp] Fix warnings when building on Windows with latest MSVC or Clang ToT (#77853)Alexandre Ganea7-38/+58
This reverts 94f960925b7f609636fc2ffd83053814d5e45ed1 and fixes it.
2024-01-23Revert 10f3296dd7d74c975f208a8569221dc8f96d1db1 - [openmp] Fix warnings when building on Windows with latest MSVC or Clang ToT (#77853)Alexandre Ganea7-51/+34
It broke the AMDGPU buildbot: https://lab.llvm.org/buildbot/#/builders/193/builds/45378
2024-01-23[openmp] Fix warnings when building on Windows with latest MSVC or Clang ToT (#77853)Alexandre Ganea7-34/+51
There were quite a few compilation warnings when building openmp on Windows with the latest Visual Studio 2022 version 17.8.4. Some other warnings were visible with the latest Clang at tip. This commit fixes all of them.
2024-01-18[openmp] Revert 64874e5ab5fd102344d43ac9465537a44130bf19 since it was committed by mistake and the PR (https://github.com/llvm/llvm-project/pull/77853) wasn't approved yet.Alexandre Ganea9-84/+15
2024-01-18[OpenMP][omp_lib] Restore compatibility with more restrictive Fortran compilers (#77780)Paul Osmialowski1-3/+3
The most recent changes to `omp_lib.h.var` have re-introduced some compatibility issues that had to be fixed after similar changes in the past. Namely: 1. D120707 removed the "use omp_lib_kinds" statement and replaced it with import. 2. D114537 added line continuation to the long lines. This patch introduces the same kind of changes in order to restore compatibility with some more restrictive Fortran compilers so their users can still benefit from LLVM's OpenMP Fortran library.
2024-01-17[openmp] Silence warnings when building the LLVM release with MSVCAlexandre Ganea9-15/+84