aboutsummaryrefslogtreecommitdiff
path: root/offload
AgeCommit message (Collapse)AuthorFilesLines
2026-01-12[NFC][Offload] Rename a function (#175673)Hansang Bae1-6/+6
Renamed a function as suggested in #175664.
2026-01-12[Offload] Fix level_zero plugin build (#175664)Hansang Bae1-2/+0
Build has been broken when OMPTARGET_DEBUG is undefined.
2026-01-12[Offload] Update debug message printig in the plugins (#175205)Hansang Bae17-264/+302
* Prepare a set of debug types in llvm::offload::debug to be used in plugin code * Update debug messages in the plugins
2026-01-12[OFFLOAD] Add memory data locking API for libomptarget migration (#173138)fineg745-3/+149
Add liboffload memory data locking API for libomptarget migration This PR adds liboffload memory data locking API that needed to make libomptarget to use liboffload
2026-01-12[OFFLOAD][OpenMP] Remove old style REPORT support (#175607)Alex Duran4-21/+6
Fix the few remaining usages and remove the support for the old REPORT macro.
2026-01-12[CMake][NFC] Don't use uninitialized LLVM_REQUIRES_* (#175554)Alexis Engelke1-1/+1
LLVM_REQUIRES_* are per-target flags that are never set globally. Yet, some files used these (undefined) flags for some logic. This patch emoves these dead checks/unconditionally executes the logic. Note that the referenced *.exports files are empty, so there is no need to make related logic conditional on MSVC.
2026-01-09[OpenMP] Preserve the original address when `use_device_ptr/addr` lookup ↵Abhinav Gaba11-118/+74
fails. (#174659) As per OpenMP 5.1, we need to assume that when the lookup for `use_device_ptr/addr` fails, the incoming pointer was already device accessible. Prior to 5.1, a lookup-failure meant a user-error (for `use_device_ptr`), so we could do anything in that scenario. For `use_device_addr`, it was always incorrect to set the address to null. OpenMP 6.1 adds a way to retain the previous behavior of nullifying a pointer when the lookup fails. That will be tackled by the PR stack starting with https://github.com/llvm/llvm-project/pull/169603.
2026-01-09[Offload] Fix failing Fortran test w/ line number (#175247)Jan Patrick Lehr1-1/+1
This test also depends on the line number. Following similar approach as other with [[@LINE]] macro.
2026-01-09[OpenMP] Fix tests that depend on the line numberJoseph Huber11-34/+34
2026-01-09[OpenMP] Remove testing LTO variant on CPU targets (#175187)Joseph Huber31-85/+7
Summary: This is only really meaningful for the NVPTX target. Not all build environments support host LTO and these are redundant tests, just clean this up and make it run faster.
2026-01-09[flang][OpenMP] Prevent `omp.map.info` ops with user-defined mappers from ↵Kareem Ergawy1-0/+34
being marked as parial maps (#175133) The following test was triggering a runtime crash **on the host before launching the kernel**: ```fortran program test_omp_target_map_bug_v5 implicit none type nested_type real, allocatable :: alloc_field(:) end type nested_type type nesting_type integer :: int_field type(nested_type) :: derived_field end type nesting_type type(nesting_type) :: config allocate(config%derived_field%alloc_field(1)) !$OMP TARGET ENTER DATA MAP(TO:config, config%derived_field%alloc_field) !$OMP TARGET config%derived_field%alloc_field(1) = 1.0 !$OMP END TARGET deallocate(config%derived_field%alloc_field) end program test_omp_target_map_bug_v5 ``` In particular, the runtime was producing a segmentation fault when the test is compiled with any optimization level > 0; if you compile with -O0 the sample ran fine. After debugging the runtime, it turned out the crash was happening at the point where the runtime calls the default mapper emitted by the compiler for `nesting_type; in particular at this point in the runtime: https://github.com/llvm/llvm-project/blob/c62cd2877cc25a0d708ad22a70c2a57590449c4d/offload/libomptarget/omptarget.cpp#L307. Bisecting the optimization pipeline using `-mllvm -opt-bisect-limit=N`, the first pass that triggered the issue on `O1` was the `instcombine` pass. Debugging this further, the issue narrows down to canonicalizing `getelementptr` instructions from using struct types (in this case the `nesting_type` in the sample above) to using addressing bytes (`i8`). In particular, in `O0`, you would see something like this: ```llvm define internal void @.omp_mapper._QQFnesting_type_omp_default_mapper(ptr noundef %0, ptr noundef %1, ptr noundef %2, i64 noundef %3, i64 noundef %4, ptr noundef %5) #6 { entry: %6 = udiv exact i64 %3, 56 %7 = getelementptr %_QFTnesting_type, ptr %2, i64 %6 .... } ``` ```llvm define internal void @.omp_mapper._QQFnesting_type_omp_default_mapper(ptr noundef %0, ptr noundef %1, ptr noundef %2, i64 noundef %3, i64 noundef %4, ptr noundef %5) #6 { entry: %6 = getelementptr i8, ptr %2, i64 %3 .... } ``` The `udiv exact` instruction emitted by the OMP IR Builder (see: https://github.com/llvm/llvm-project/blob/c62cd2877cc25a0d708ad22a70c2a57590449c4d/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp#L9154) allows `instcombine` to assume that `%3` is divisible by the struct size (here `56`) and, therefore, replaces the result of the division with direct GEP on `i8` rather than the struct type. However, the runtime was calling `@.omp_mapper._QQFnesting_type_omp_default_mapper` not with `56` (the proper struct size) but with `48`! Debugging this further, I found that the size of `omp.map.info` operation to which the default mapper is attached computes the value of `48` because we set the map to partial (see: https://github.com/llvm/llvm-project/blob/c62cd2877cc25a0d708ad22a70c2a57590449c4d/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp#L1146 and https://github.com/llvm/llvm-project/blob/c62cd2877cc25a0d708ad22a70c2a57590449c4d/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp#L4501-L4512). However, I think this is incorrect since the emitted mapper (and user-defined mappers in general) are defined on the whole struct type and should never be marked as partial. Hence, the fix in this PR.
2026-01-08[offload][lit] Enable llvm-omp-device-info.c on Intel GPUs (#175084)Nick Sarnie1-1/+0
It's XPASSing after https://github.com/llvm/llvm-project/pull/172946. https://lab.llvm.org/staging/#/builders/225/builds/313 Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2026-01-08[OFFLOAD] Make L0 provide more information about device to be consistent ↵fineg742-1/+26
with other plugins (#172946) Update information about devices provided by level zero plugin in order to be more consistent with other plugins.
2026-01-08[Offload] Fix line numbers after #174804 (#174932)Jan Patrick Lehr12-35/+33
The changes in line numbers caused a few CHECK macros to now fail. This is fixed by this PR. Build w/ breakages: https://lab.llvm.org/staging/#/builders/105/builds/39748
2026-01-07[offload][lit] XFAIL all failing tests on the Level Zero plugin (#174804)Nick Sarnie217-4/+246
We finally got our buildbot added (to staging, at least) so we want to start running L0 tests in CI. We need `check-offload` to pass though, so XFAIL everything failing. There's a couple `UNSUPPORTED` as well, those are for sporadic fails. Also make set the `gpu` and `intelgpu` LIT variables when testing the `spirv64-intel` triple. We have no DeviceRTL yet so basically everything fails, but we manage to get ``` Total Discovered Tests: 432 Unsupported : 169 (39.12%) Passed : 67 (15.51%) Expectedly Failed: 196 (45.37%) ``` We still don't build the level zero plugin by default and these tests don't run unless the plugin was built, so this has no effect on most builds. --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2026-01-05[offload][lit] Use '%not' instead of 'not' in requires.c (#174506)Nick Sarnie1-1/+1
Typo exposed by recent `not` behavior change, we need to make sure we're using the LLVM one. Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2026-01-05[offload][lit] Fix requires.c after 'not' behavior change (#174499)Nick Sarnie1-1/+1
`not` behavior change in https://github.com/llvm/llvm-project/pull/174298 requires `--crash` passed now. Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2026-01-03[offload] [test] Mark bug 51781 test as requiring GPU (#174284)Michał Górny1-0/+1
While the main problem with the test is that it requires LLD, given that it is unlikely to be testing anything meaningful for a CPU-only build, just mark it as requiring GPU. Fixes #100780 Signed-off-by: Michał Górny <mgorny@gentoo.org>
2025-12-23[Clang][OpenMP] Handle check for non-contiguous mapping in pointer-based ↵Amit Tiwari12-0/+1182
array sections (#157443) ### 1. ElementType deduction for pointer-based array sections Problem: Pointer-based array sections were previously ignored during `ElementType` deduction, leading to incorrect assumptions about array item types. This often resulted in out-of-bounds access, as seen in the assertion failure: ``` Assertion `idx < size()' failed. llvm-project/llvm/include/llvm/ADT/SmallVector.h:292: reference llvm::SmallVectorTemplateCommon<llvm::Value *>::operatorsize_type [T = llvm::Value *] ``` Fix: Added a check in clang/lib/CodeGen/CGOpenMPRuntime.cpp to ensure `ElementType` is correctly detected for cases involving non-contiguous updates with a base pointer. Impact: Resolves failures in OpenMP_VV (formerly sollve_vv) and other offload/clang-OpenMP tests: All tests under: https://github.com/OpenMP-Validation-and-Verification/OpenMP_VV/tree/master/tests/5.0/target_update test_target_update_mapper_from_discontiguous.c test_target_update_mapper_to_discontiguous.c test_target_update_to_discontiguous.c test_target_update_from_discontiguous.c ### 2. Zero-dimension propagation in struct member mappings Problem: A zero-dimension entry for struct members introduced inconsistencies in complex mapping logic within OMPIRBuilder.cpp. Placeholder zeros propagated to emitNonContiguousDescriptor(), breaking reverse indexing logic and corrupting IR: Loops assume `Dims[I] >= 1`. When `Dims[I] == 0`: Reverse indexing still stores pointers to uninitialized allocas or mismatched slots. Runtime interprets `ArgSizes[I]` (derived from `Dims[I])` as dimensionality, causing size/offset calculations to collapse to zero → results in `size=0` async copy and plugin interface errors. Fix: Prepend a synthetic dimension of size 1 instead of appending a zero, preserving correctness in `targetDataUpdate()` for non-contiguous updates. Impact: Added dedicated test cases that previously failed on main.
2025-12-21[offload] Fix unittests when multiple devices are available (#173209)Kevin Sala Penades1-6/+6
This commit appends a device number after the device name (used as unittest param name). The number is between 0 and the number of available non-host devices. In this way, it allows multiple devices of the same vendor to be tested.
2025-12-21[offload] Fix kernel launch unittest (#173203)Kevin Sala Penades1-2/+2
This commit fixes the error introduced in #172249.
2025-12-19[Offload] Make sure error is consumed (#172924)Hansang Bae1-2/+2
2025-12-18[OFFLOAD][L0] Expose native ELF to upper layers (#172819)Alex Duran5-94/+91
This PR refactors how the device image is built so we can expose the native ELF of the device to DeviceImageTy which solves several issues regarding symbol look up (as DeviceImageTy expects an ELF). It also simplifies the module linking code taking into account the latest changes in the driver (which adds "-library-compilation when necessary). --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18[OFFLOAD] Recognize level_zero backend in liboffload (#172818)Alex Duran2-0/+3
The code to recognize the level_zero plugin as a liboffload backend was split from #158900. This PR adds the support back. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18[OFFLOAD][L0] Improve symbol device lookup (#172820)Alex Duran3-12/+11
When looking for the device address of a symbol, we need to also look if it's a function symbol if not found as global symbol in the device. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-18[OFFLOAD][L0] Fix usages of getDebugLevel in L0 plugin (#172815)Alex Duran2-50/+60
Support for getDebugLevel was removed as part of the new debug macros (#165416). This PR updates such usages to use the new ODBG_* macros. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18[OpenMP][Offload] Fix test after #172382 (#172865)Jan Patrick Lehr1-0/+2
The test added in #172382 requires a debug build.
2025-12-18Revert "[OMPT][Offload][OpenMP] Fixes for OMPT data used by libomptarget" ↵Joachim6-480/+21
(#172827) Reverts llvm/llvm-project#156020 We will need some time for investigating buildbot failures
2025-12-18[OMPT][Offload][OpenMP] Fixes for OMPT data used by libomptarget (#156020)Kaloyan Ignatov6-21/+480
These commits fix issues regarding storage of tool data within libomptarget. Both libomp and libomptarget have been modified to accommodate this. We differentiate between two cases depending on the type of the target region: - merged target regions (default, without `nowait` clause): behavior remains unchanged, tool data is stored in the thread local RegionInterface class within libomptarget. - deferred target regions (using `nowait` clause): tool data is moved to `ompt_task_info_t` struct within libomp, as `RegionInterface` is thread local and its data is lost whenever another task is scheduled on the thread, which happens with deferred target regions. In the new implementation, `RegionInterface` receives pointers to `ompt_task_info_t` within libomp which are handled transparently within libomptarget. Thus, the problem of tool data getting lost when a thread receives a new task is resolved: `target_data` and `target_task_data` remain set. Another issue was the value of `task_data` which is supposed to belong to the generating task of the region according to the OpenMP standard, but instead had been set to the `task_data` of the target task itself until now. Test cases have been added which check both of these fixes. --------- Co-authored-by: Joachim <jenke@itc.rwth-aachen.de>
2025-12-18[OFFLOAD] Add plugin with support for Intel oneAPI Level Zero (#158900)Alex Duran26-1/+5888
Add a new nextgen plugin that supports GPU devices through the Intel oneAPI Level Zero library. The plugin is not enabled by default and needs to be added to LIBOMPTARGET_PLUGINS_TO_BUILD explicitely. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-17[Offload] Debug message update part 3 (#171684)Hansang Bae5-127/+140
Update debug messages based on the new method from #170425. Updated the following files. - plugins-nextgen/common/include/MemoryManager.h - plugins-nextgen/common/include/PluginInterface.h - plugins-nextgen/common/src/GlobalHandler.cpp - plugins-nextgen/common/src/PluginInterface.cpp - plugins-nextgen/host/dynamic_ffi/ffi.cpp
2025-12-17[Offload] Debug message update part 2 (#171683)Hansang Bae7-109/+128
Update debug messages based on the new method from #170425. Added a new debug type `Tool` and updated the following files. - include/OffloadPolicy.h - include/OpenMP/OMPT/Connector.h - include/Shared/Debug.h - include/Shared/EnvironmentVar.h - libomptarget/OpenMP/Mapping.cpp - libomptarget/OpenMP/OMPT/Callback.cpp - libomptarget/PluginManager.cpp
2025-12-17[Offload] Debug message update part 1 (#171672)Hansang Bae3-134/+149
Update debug messages based on the new method from #170425. Updated the following files. - libomptarget/LegacyAPI.cpp - libomptarget/OpenMP/API.cpp - libomptarget/OpenMP/InteropAPI.cpp
2025-12-17[NFC][Offload] Missing test change in #153683 (#172587)Hansang Bae1-1/+1
2025-12-17[OpenMP][Offload] Add support for lambdas with debug conditions (#172573)Alex Duran1-9/+45
This PR adds a new set of debug macros that allow a certain code to be only executed when certain debug conditions are met. This is useful to guard things that are not strictly messages but compute and store things that are related to those messages. Strictly speaking the existing ODBG_OS could be used as well but that requires a stream object to be created which is unnecessary in some cases. Example of how it works: ```cpp ODBG_IF("Counters", [&](uint32_t Level) { someCounter++; if (Level == 2) moreDetailedCounter += f(); }); ODBG("Counters") << "Counter" = someCounter << ODBG_IF(2) << "DetailedCounter" << moreDetailedCounter; ```
2025-12-16[OpenMP][Offload] Add `LIBOMPTARGET_TREAT_ATTACH_AUTO_AS_ALWAYS` to treat ↵Abhinav Gaba4-1/+76
`attach(auto)` as `attach(always)`. (#172382) This is needed as a way to support older code that was expecting unconditional attachment to happen for cases like: ```c int *p; int x; #pragma omp targret enter data map(p) // (A) #pragma omp target enter data map(x) // (B) p = &x; // By default, this does NOT attach p and x #pragma omp target enter data map(p[0:0]) // (C) ``` When the environment variable is set, such maps, where both the pointer and the pointee already have corresponding copies on the device, but are not attached to one another, will be attached as-if OpenMP 6.1 TR14's `attach(always)` map-type-modifier was specified on `(C)`.
2025-12-16Revert "[OpenMP][Offload] Add support for lambdas with debug conditions" ↵Alex Duran1-46/+9
(#172570) Reverts llvm/llvm-project#172107
2025-12-16[OpenMP][Offload] Add support for lambdas with debug conditions (#172107)Alex Duran1-9/+46
This PR adds a new set of debug macros that allow a certain code to be only executed when certain debug conditions are met. This is useful to guard things that are not strictly messages but compute and store things that are related to those messages. Strictly speaking the existing ODBG_OS could be used as well but that requires a stream object to be created which is unnecessary in some cases. Example of how it works: ``` ODBG_IF("Counters", [&](uint32_t Level) { someCounter++; if (Level == 2) moreDetailedCounter += f(); }); ODBG("Counters") << "Counter" = someCounter << ODBG_IF(2) << "DetailedCounter" << moreDetailedCounter; ```
2025-12-16[OpenMP][Offload] Revert format of changed messages (#171995)Alex Duran3-7/+14
Adjust format of some of the updated debug output to match the old format as there are a number of tests that rely on it.
2025-12-15[OpenMP][Clang] Use `ATTACH` map-type for list-items with base-pointers. ↵Abhinav Gaba23-82/+55
(#153683) This adds support for using `ATTACH` map-type for proper pointer-attachment when mapping list-items that have base-pointers. For example, for the following: ```c int *p; #pragma omp target enter data map(p[1:10]) ``` The following maps are now emitted by clang: ``` (A) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM &p, &p[1], sizeof(p), ATTACH ``` Previously, the two possible maps emitted by clang were: ``` (B) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM (C) &p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ ```` (B) does not perform any pointer attachment, while (C) also maps the pointer p, both of which are incorrect. ----- With this change, we are using ATTACH-style maps, like `(A)`, for cases where the expression has a base-pointer. For example: ```cpp int *p, **pp; S *ps, **pps; ... map(p[0]) ... map(p[10:20]) ... map(*p) ... map(([20])p) ... map(ps->a) ... map(pps->p->a) ... map(pp[0][0]) ... map(*(pp + 10)[0]) ``` #### Grouping of maps based on attach base-pointers We also group mapping of clauses with the same base decl in the order of the increasing complexity of their base-pointers, e.g. for something like: ``` S **spp; map(spp[0][0], spp[0][0].a), // attach-ptr: spp[0] map(spp[0]), // attach-ptr: spp map(spp), // attach-ptr: N/A ``` We first map `spp`, then `spp[0]` then `spp[0][0]` and `spp[0][0].a`. This allows us to also group "struct" allocation based on their attach pointers. This resolves the issues of us always mapping everything from the beginning of the symbol `spp`. Each group is mapped independently, and at the same level, like `spp[0][0]` and its member `spp[0][0].a`, we still get map them together as part of the same contiguous struct `spp[0][0]`. This resolves issue #141042. #### use_device_ptr/addr fixes The handling of `use_device_ptr/addr` was updated to use the attach-ptr information, and works for many cases that were failing before. It has to be done as part of this series because otherwise, the switch from ptr_to_obj to attach-style mapping would have caused regressions in existing use_device_ptr/addr tests. #### Handling of attach-pointers that are members of implicitly mapped structs: * When a struct member-pointer, like `p` below, is a base-pointer in a `map` clause on a target construct (like `map(p[0:1])`, and the base of that struct is either the `this` pointer (implicitly or explicitly), or a struct that is implicitly mapped on that construct, we add an implicit `map(p)` so that we don't implicitly map the full struct. ```c struct S { int *p; void f1() { #pragma omp target map(p[0:1]) // Implicitly map this->p, to ensure // that the implicit map of `this[:]` does // not map the full struct printf("%p %p\n", &p, p); } ``` #### Scope for improvement: * We may be able to compute attach-ptr expr while collecting component-lists in Sema. * But we cache the computation results already, and `findAttachPtrExpr` is fairly simple, and fast. * There may be a better way to implement semantic expr comparison. #### Needs future work: * Attach-style maps not yet emitted for declare mappers. * Mapping of class member references: We are still using PTR_AND_OBJ maps for them. We will likely need to change that to handle `ref_ptr/ref_ptee`, and `attach` map-type-modifier on them. * Implicit capturing of "this" needs to map the full `this[0:1]` unless there is an explicit map on one of the members, or a map with a member as its base-pointer. * Implicit map added for capturing a class member pointer needs to also add a zero-length-array-section map. * `use_device_addr` on array-sections-on-pointers need further improvements (documented using FIXMEs) #### Why a large PR While it's unfortunate that this PR has gotten large and difficult to review, the issue is that all the functional changes have to be made together, to prevent regressions from partially implemented changes. For example, the changes to capturing were previously done separately (#145454), but they would still cause stability issues in absence of full attach-mapping. And attach-mapping needs those changes to be able to launch kernels. We extracted the utilities and functions, like those for finding attach-ptrs, or comparing exprs, out as a separate NFC PR that doesn't call those functions, just adds them (#155625). Maybe the change that adds a new error message for use_device_addr on array-sections with non-var base-pointers could have been extracted out too (but that would have had to be a follow-up change in that case, and we would get comp-fails with this PR when the erroneous case was not caught/diagnosed). --------- Co-authored-by: Alex Duran <alejandro.duran@intel.com>
2025-12-15[NFC][Offload] Fix minor debug print issues introduced in #170425. (#172377)Abhinav Gaba3-24/+25
2025-12-14[offload] Fix CUDA args size by subtracting tail padding (#172249)Kevin Sala Penades7-5/+52
This commit makes the cuLaunchKernel call to pass the total arguments size without tail padding.
2025-12-10[OpenMP][Offload] Continue to update libomptarget debug messages (#170425)Alex Duran3-253/+336
* Add support to use lambdas to output debug messages (like LDBG_OS) * Update messages for interface.cpp and omptarget.cpp
2025-12-05Reland "[Flang][OpenMP] Add lowering support for is_device_ptr clause ↵Akash Banerjee1-0/+60
(#169331)" (#170851) Add support for OpenMP is_device_ptr clause for target directives. [MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr #169367 This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.
2025-12-04Revert "[Flang][OpenMP] Add lowering support for is_device_ptr clause" (#170778)theRonShark1-60/+0
Reverts llvm/llvm-project#169331
2025-12-04[Flang][OpenMP] Add lowering support for is_device_ptr clause (#169331)Akash Banerjee1-0/+60
Add support for OpenMP is_device_ptr clause for target directives. [MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr #169367 This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.
2025-12-02[OFFLOAD][LIBOMPTARGET] Start to update debug messages in libomptarget (#170265)Alex Duran3-142/+205
* Add compatibility support for DP and REPORT macros * Define a set of predefined Debug Type for libomptarget * Start to update libomptarget files (OffloadRTL.cpp, device.cpp)
2025-12-01Reland: [OpenMP] Implement omp_get_uid_from_device() / ↵Robert Imschweiler5-0/+146
omp_get_device_from_uid() (#168554) Reland https://github.com/llvm/llvm-project/pull/164392 with Fortran support moved to follow-up PR
2025-11-26[OpenMP][clang] Register vtables on device for indirect calls runtime (#167011)Jason-VanBeusekom4-19/+167
This is a branch off of https://github.com/llvm/llvm-project/pull/159856, in which consists of the runtime portion of the changes required to support indirect function and virtual function calls on an `omp target device` when the virtual class / indirect function is mapped to the device from the host. Key Changes - Introduced a new flag OMP_DECLARE_TARGET_INDIRECT_VTABLE to mark VTable registrations - Modified setupIndirectCallTable to support both VTable entries and indirect function pointers Details: The setupIndirectCallTable implementation was modified to support this registration type by retrieving the first address of the VTable and inferring the remaining data needed to build the indirect call table. Since the Vtables / Classes registered as indirect can be larger than 8 bytes, and the vtables may not be at the first address we either need to pass the size to __llvm_omp_indirect_call_lookup and have a check at each step of the binary search, or add multiple entries to the indirect table for each address registered. The latter was chosen. Commit: a00def3f20e166d4fb9328e6f0bc0742cd0afa31 is not a part of this PR and is handled / reviewed in: https://github.com/llvm/llvm-project/pull/159856, This is PR (2/3) Register Vtable PR (1/3): https://github.com/llvm/llvm-project/pull/159856, Codegen / _llvm_omp_indirect_call_lookup PR (3/3): https://github.com/llvm/llvm-project/pull/159857
2025-11-26[OFFLOAD] Add support for indexed per-thread containers (#164263)Alex Duran2-59/+208
Split from #158900 it adds a PerThreadContainer that can use STL-like indexed containers based on a slightly refactored PerThreadTable. --------- Co-authored-by: Joseph Huber <huberjn@outlook.com>