path: root/offload
Age | Commit message | Author | Files | Lines
2026-01-03 | [offload] [test] Mark bug 51781 test as requiring GPU (#174284) | Michał Górny | 1 | -0/+1
While the main problem with the test is that it requires LLD, given that it is unlikely to be testing anything meaningful for a CPU-only build, just mark it as requiring GPU. Fixes #100780 Signed-off-by: Michał Górny <mgorny@gentoo.org>
2025-12-23 | [Clang][OpenMP] Handle check for non-contiguous mapping in pointer-based array sections (#157443) | Amit Tiwari | 12 | -0/+1182
### 1. ElementType deduction for pointer-based array sections

Problem: Pointer-based array sections were previously ignored during `ElementType` deduction, leading to incorrect assumptions about array item types. This often resulted in out-of-bounds access, as seen in the assertion failure:

```
Assertion `idx < size()' failed.
llvm-project/llvm/include/llvm/ADT/SmallVector.h:292: reference llvm::SmallVectorTemplateCommon<llvm::Value *>::operator[](size_type) [T = llvm::Value *]
```

Fix: Added a check in clang/lib/CodeGen/CGOpenMPRuntime.cpp to ensure `ElementType` is correctly detected for cases involving non-contiguous updates with a base pointer.

Impact: Resolves failures in OpenMP_VV (formerly sollve_vv) and other offload/clang-OpenMP tests. All tests under: https://github.com/OpenMP-Validation-and-Verification/OpenMP_VV/tree/master/tests/5.0/target_update
- test_target_update_mapper_from_discontiguous.c
- test_target_update_mapper_to_discontiguous.c
- test_target_update_to_discontiguous.c
- test_target_update_from_discontiguous.c

### 2. Zero-dimension propagation in struct member mappings

Problem: A zero-dimension entry for struct members introduced inconsistencies in complex mapping logic within OMPIRBuilder.cpp. Placeholder zeros propagated to emitNonContiguousDescriptor(), breaking reverse indexing logic and corrupting IR:
- Loops assume `Dims[I] >= 1`.
- When `Dims[I] == 0`, reverse indexing still stores pointers to uninitialized allocas or mismatched slots.
- The runtime interprets `ArgSizes[I]` (derived from `Dims[I]`) as dimensionality, causing size/offset calculations to collapse to zero, which results in a `size=0` async copy and plugin interface errors.

Fix: Prepend a synthetic dimension of size 1 instead of appending a zero, preserving correctness in `targetDataUpdate()` for non-contiguous updates.

Impact: Added dedicated test cases that previously failed on main.
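The effect of the zero placeholder described in part 2 can be illustrated with a small sketch. This is not the OMPIRBuilder code: `totalElements`, `withZeroPlaceholder`, and `withSyntheticDim` are hypothetical names, and the sketch only assumes that a transfer size is derived from the product of the dimension counts.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: number of elements described by a list of
// dimension counts, computed as their product.
std::size_t totalElements(const std::vector<std::size_t> &Dims) {
  std::size_t Total = 1;
  for (std::size_t D : Dims)
    Total *= D;
  return Total;
}

// Appending a zero placeholder collapses the product to zero.
std::vector<std::size_t> withZeroPlaceholder(std::vector<std::size_t> Dims) {
  Dims.push_back(0);
  return Dims;
}

// Prepending a synthetic dimension of size 1 leaves the product unchanged.
std::vector<std::size_t> withSyntheticDim(std::vector<std::size_t> Dims) {
  Dims.insert(Dims.begin(), 1);
  return Dims;
}
```

Under these assumptions, a 4 x 8 section keeps its 32 elements with the synthetic dimension, while the zero placeholder reduces the computed size to 0.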
2025-12-21 | [offload] Fix unittests when multiple devices are available (#173209) | Kevin Sala Penades | 1 | -6/+6
This commit appends a device number after the device name (used as unittest param name). The number is between 0 and the number of available non-host devices. In this way, it allows multiple devices of the same vendor to be tested.
2025-12-21 | [offload] Fix kernel launch unittest (#173203) | Kevin Sala Penades | 1 | -2/+2
This commit fixes the error introduced in #172249.
2025-12-19 | [Offload] Make sure error is consumed (#172924) | Hansang Bae | 1 | -2/+2
2025-12-18 | [OFFLOAD][L0] Expose native ELF to upper layers (#172819) | Alex Duran | 5 | -94/+91
This PR refactors how the device image is built so we can expose the native ELF of the device to DeviceImageTy, which solves several issues regarding symbol lookup (as DeviceImageTy expects an ELF). It also simplifies the module linking code, taking into account the latest changes in the driver (which adds "-library-compilation" when necessary).
---------
Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com>
Co-authored-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18 | [OFFLOAD] Recognize level_zero backend in liboffload (#172818) | Alex Duran | 2 | -0/+3
The code to recognize the level_zero plugin as a liboffload backend was split from #158900. This PR adds the support back. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18 | [OFFLOAD][L0] Improve symbol device lookup (#172820) | Alex Duran | 3 | -12/+11
When looking up the device address of a symbol that is not found as a global symbol on the device, also check whether it is a function symbol.
---------
Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com>
Co-authored-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-18 | [OFFLOAD][L0] Fix usages of getDebugLevel in L0 plugin (#172815) | Alex Duran | 2 | -50/+60
Support for getDebugLevel was removed as part of the new debug macros (#165416). This PR updates such usages to use the new ODBG_* macros. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18 | [OpenMP][Offload] Fix test after #172382 (#172865) | Jan Patrick Lehr | 1 | -0/+2
The test added in #172382 requires a debug build.
2025-12-18 | Revert "[OMPT][Offload][OpenMP] Fixes for OMPT data used by libomptarget" (#172827) | Joachim | 6 | -480/+21
Reverts llvm/llvm-project#156020. We need some time to investigate buildbot failures.
2025-12-18 | [OMPT][Offload][OpenMP] Fixes for OMPT data used by libomptarget (#156020) | Kaloyan Ignatov | 6 | -21/+480
These commits fix issues regarding storage of tool data within libomptarget. Both libomp and libomptarget have been modified to accommodate this. We differentiate between two cases depending on the type of the target region:
- merged target regions (default, without `nowait` clause): behavior remains unchanged, tool data is stored in the thread local `RegionInterface` class within libomptarget.
- deferred target regions (using `nowait` clause): tool data is moved to the `ompt_task_info_t` struct within libomp, as `RegionInterface` is thread local and its data is lost whenever another task is scheduled on the thread, which happens with deferred target regions.

In the new implementation, `RegionInterface` receives pointers to `ompt_task_info_t` within libomp which are handled transparently within libomptarget. Thus, the problem of tool data getting lost when a thread receives a new task is resolved: `target_data` and `target_task_data` remain set.

Another issue was the value of `task_data`, which is supposed to belong to the generating task of the region according to the OpenMP standard, but instead had been set to the `task_data` of the target task itself until now. Test cases have been added which check both of these fixes.
---------
Co-authored-by: Joachim <jenke@itc.rwth-aachen.de>
2025-12-18 | [OFFLOAD] Add plugin with support for Intel oneAPI Level Zero (#158900) | Alex Duran | 26 | -1/+5888
Add a new nextgen plugin that supports GPU devices through the Intel oneAPI Level Zero library. The plugin is not enabled by default and needs to be added to LIBOMPTARGET_PLUGINS_TO_BUILD explicitly.
---------
Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com>
Co-authored-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-17 | [Offload] Debug message update part 3 (#171684) | Hansang Bae | 5 | -127/+140
Update debug messages based on the new method from #170425. Updated the following files.
- plugins-nextgen/common/include/MemoryManager.h
- plugins-nextgen/common/include/PluginInterface.h
- plugins-nextgen/common/src/GlobalHandler.cpp
- plugins-nextgen/common/src/PluginInterface.cpp
- plugins-nextgen/host/dynamic_ffi/ffi.cpp
2025-12-17 | [Offload] Debug message update part 2 (#171683) | Hansang Bae | 7 | -109/+128
Update debug messages based on the new method from #170425. Added a new debug type `Tool` and updated the following files.
- include/OffloadPolicy.h
- include/OpenMP/OMPT/Connector.h
- include/Shared/Debug.h
- include/Shared/EnvironmentVar.h
- libomptarget/OpenMP/Mapping.cpp
- libomptarget/OpenMP/OMPT/Callback.cpp
- libomptarget/PluginManager.cpp
2025-12-17 | [Offload] Debug message update part 1 (#171672) | Hansang Bae | 3 | -134/+149
Update debug messages based on the new method from #170425. Updated the following files.
- libomptarget/LegacyAPI.cpp
- libomptarget/OpenMP/API.cpp
- libomptarget/OpenMP/InteropAPI.cpp
2025-12-17 | [NFC][Offload] Missing test change in #153683 (#172587) | Hansang Bae | 1 | -1/+1
2025-12-17 | [OpenMP][Offload] Add support for lambdas with debug conditions (#172573) | Alex Duran | 1 | -9/+45
This PR adds a new set of debug macros that allow code to be executed only when certain debug conditions are met. This is useful to guard things that are not strictly messages but compute and store things that are related to those messages. Strictly speaking, the existing ODBG_OS could be used as well, but that requires a stream object to be created, which is unnecessary in some cases.

Example of how it works:

```cpp
ODBG_IF("Counters", [&](uint32_t Level) {
  someCounter++;
  if (Level == 2)
    moreDetailedCounter += f();
});
ODBG("Counters") << "Counter = " << someCounter << ODBG_IF(2)
                 << "DetailedCounter" << moreDetailedCounter;
```
2025-12-16 | [OpenMP][Offload] Add `LIBOMPTARGET_TREAT_ATTACH_AUTO_AS_ALWAYS` to treat `attach(auto)` as `attach(always)`. (#172382) | Abhinav Gaba | 4 | -1/+76
This is needed as a way to support older code that was expecting unconditional attachment to happen for cases like:

```c
int *p;
int x;
#pragma omp target enter data map(p) // (A)
#pragma omp target enter data map(x) // (B)
p = &x;
// By default, this does NOT attach p and x
#pragma omp target enter data map(p[0:0]) // (C)
```

When the environment variable is set, such maps, where both the pointer and the pointee already have corresponding copies on the device but are not attached to one another, will be attached as if OpenMP 6.1 TR14's `attach(always)` map-type-modifier was specified on `(C)`.
2025-12-16 | Revert "[OpenMP][Offload] Add support for lambdas with debug conditions" (#172570) | Alex Duran | 1 | -46/+9
Reverts llvm/llvm-project#172107
2025-12-16 | [OpenMP][Offload] Add support for lambdas with debug conditions (#172107) | Alex Duran | 1 | -9/+46
This PR adds a new set of debug macros that allow code to be executed only when certain debug conditions are met. This is useful to guard things that are not strictly messages but compute and store things that are related to those messages. Strictly speaking, the existing ODBG_OS could be used as well, but that requires a stream object to be created, which is unnecessary in some cases.

Example of how it works:

```
ODBG_IF("Counters", [&](uint32_t Level) {
  someCounter++;
  if (Level == 2)
    moreDetailedCounter += f();
});
ODBG("Counters") << "Counter = " << someCounter << ODBG_IF(2)
                 << "DetailedCounter" << moreDetailedCounter;
```
2025-12-16 | [OpenMP][Offload] Revert format of changed messages (#171995) | Alex Duran | 3 | -7/+14
Adjust format of some of the updated debug output to match the old format as there are a number of tests that rely on it.
2025-12-15 | [OpenMP][Clang] Use `ATTACH` map-type for list-items with base-pointers. (#153683) | Abhinav Gaba | 23 | -82/+55
This adds support for using `ATTACH` map-type for proper pointer-attachment when mapping list-items that have base-pointers.

For example, for the following:

```c
int *p;
#pragma omp target enter data map(p[1:10])
```

The following maps are now emitted by clang:

```
(A)
&p[0], &p[1], 10 * sizeof(p[1]), TO | FROM
&p, &p[1], sizeof(p), ATTACH
```

Previously, the two possible maps emitted by clang were:

```
(B)
&p[0], &p[1], 10 * sizeof(p[1]), TO | FROM

(C)
&p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ
```

(B) does not perform any pointer attachment, while (C) also maps the pointer p, both of which are incorrect.

-----

With this change, we are using ATTACH-style maps, like `(A)`, for cases where the expression has a base-pointer. For example:

```cpp
int *p, **pp;
S *ps, **pps;
... map(p[0])
... map(p[10:20])
... map(*p)
... map(([20])p)
... map(ps->a)
... map(pps->p->a)
... map(pp[0][0])
... map(*(pp + 10)[0])
```

#### Grouping of maps based on attach base-pointers

We also group mapping of clauses with the same base decl in order of increasing complexity of their base-pointers, e.g. for something like:

```
S **spp;
map(spp[0][0], spp[0][0].a), // attach-ptr: spp[0]
map(spp[0]),                 // attach-ptr: spp
map(spp),                    // attach-ptr: N/A
```

We first map `spp`, then `spp[0]`, then `spp[0][0]` and `spp[0][0].a`.

This allows us to also group "struct" allocation based on their attach pointers. This resolves the issues of us always mapping everything from the beginning of the symbol `spp`. Each group is mapped independently, and at the same level, like `spp[0][0]` and its member `spp[0][0].a`, we still map them together as part of the same contiguous struct `spp[0][0]`. This resolves issue #141042.

#### use_device_ptr/addr fixes

The handling of `use_device_ptr/addr` was updated to use the attach-ptr information, and works for many cases that were failing before. It has to be done as part of this series because otherwise, the switch from ptr_to_obj to attach-style mapping would have caused regressions in existing use_device_ptr/addr tests.

#### Handling of attach-pointers that are members of implicitly mapped structs:

* When a struct member-pointer, like `p` below, is a base-pointer in a `map` clause on a target construct (like `map(p[0:1])`), and the base of that struct is either the `this` pointer (implicitly or explicitly), or a struct that is implicitly mapped on that construct, we add an implicit `map(p)` so that we don't implicitly map the full struct.

```c
struct S {
  int *p;
  void f1() {
    #pragma omp target map(p[0:1]) // Implicitly map this->p, to ensure
                                   // that the implicit map of `this[:]` does
                                   // not map the full struct
    printf("%p %p\n", &p, p);
  }
};
```

#### Scope for improvement:

* We may be able to compute attach-ptr expr while collecting component-lists in Sema.
* But we cache the computation results already, and `findAttachPtrExpr` is fairly simple, and fast.
* There may be a better way to implement semantic expr comparison.

#### Needs future work:

* Attach-style maps not yet emitted for declare mappers.
* Mapping of class member references: We are still using PTR_AND_OBJ maps for them. We will likely need to change that to handle `ref_ptr/ref_ptee`, and `attach` map-type-modifier on them.
* Implicit capturing of "this" needs to map the full `this[0:1]` unless there is an explicit map on one of the members, or a map with a member as its base-pointer.
* Implicit map added for capturing a class member pointer needs to also add a zero-length-array-section map.
* `use_device_addr` on array-sections-on-pointers needs further improvements (documented using FIXMEs)

#### Why a large PR

While it's unfortunate that this PR has gotten large and difficult to review, the issue is that all the functional changes have to be made together, to prevent regressions from partially implemented changes.

For example, the changes to capturing were previously done separately (#145454), but they would still cause stability issues in the absence of full attach-mapping. And attach-mapping needs those changes to be able to launch kernels.

We extracted the utilities and functions, like those for finding attach-ptrs, or comparing exprs, out as a separate NFC PR that doesn't call those functions, just adds them (#155625).

Maybe the change that adds a new error message for use_device_addr on array-sections with non-var base-pointers could have been extracted out too (but that would have had to be a follow-up change in that case, and we would get comp-fails with this PR when the erroneous case was not caught/diagnosed).
---------
Co-authored-by: Alex Duran <alejandro.duran@intel.com>
2025-12-15 | [NFC][Offload] Fix minor debug print issues introduced in #170425. (#172377) | Abhinav Gaba | 3 | -24/+25
2025-12-14 | [offload] Fix CUDA args size by subtracting tail padding (#172249) | Kevin Sala Penades | 7 | -5/+52
This commit makes the cuLaunchKernel call pass the total arguments size without tail padding.
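To illustrate what "without tail padding" means, here is a minimal sketch; it is not the plugin's actual code. It assumes the kernel arguments are laid out like the members of a C struct, and `ArgLayout`, `alignTo`, `sizeWithoutTailPadding`, and `sizeWithTailPadding` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one kernel argument.
struct ArgLayout {
  std::size_t Size;  // size in bytes
  std::size_t Align; // required alignment in bytes (power of two)
};

// Round Offset up to the next multiple of Align.
std::size_t alignTo(std::size_t Offset, std::size_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two expected");
  return (Offset + Align - 1) & ~(Align - 1);
}

// End offset of the last argument: padding between arguments is kept,
// padding after the last argument is not counted.
std::size_t sizeWithoutTailPadding(const std::vector<ArgLayout> &Args) {
  std::size_t Offset = 0;
  for (const ArgLayout &A : Args)
    Offset = alignTo(Offset, A.Align) + A.Size;
  return Offset;
}

// Size a struct with these members would have: the end offset rounded
// up to the largest member alignment.
std::size_t sizeWithTailPadding(const std::vector<ArgLayout> &Args) {
  std::size_t MaxAlign = 1;
  for (const ArgLayout &A : Args)
    if (A.Align > MaxAlign)
      MaxAlign = A.Align;
  return alignTo(sizeWithoutTailPadding(Args), MaxAlign);
}
```

For an 8-byte pointer followed by a 4-byte integer, the struct-style size is 16 bytes while the size without tail padding is 12 bytes; the commit describes passing the latter kind of value.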
2025-12-10 | [OpenMP][Offload] Continue to update libomptarget debug messages (#170425) | Alex Duran | 3 | -253/+336
* Add support to use lambdas to output debug messages (like LDBG_OS)
* Update messages for interface.cpp and omptarget.cpp
2025-12-05 | Reland "[Flang][OpenMP] Add lowering support for is_device_ptr clause (#169331)" (#170851) | Akash Banerjee | 1 | -0/+60
Add support for OpenMP is_device_ptr clause for target directives.

[MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr #169367

This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.
2025-12-04 | Revert "[Flang][OpenMP] Add lowering support for is_device_ptr clause" (#170778) | theRonShark | 1 | -60/+0
Reverts llvm/llvm-project#169331
2025-12-04 | [Flang][OpenMP] Add lowering support for is_device_ptr clause (#169331) | Akash Banerjee | 1 | -0/+60
Add support for OpenMP is_device_ptr clause for target directives.

[MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr #169367

This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.
2025-12-02 | [OFFLOAD][LIBOMPTARGET] Start to update debug messages in libomptarget (#170265) | Alex Duran | 3 | -142/+205
* Add compatibility support for DP and REPORT macros
* Define a set of predefined Debug Type for libomptarget
* Start to update libomptarget files (OffloadRTL.cpp, device.cpp)
2025-12-01 | Reland: [OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid() (#168554) | Robert Imschweiler | 5 | -0/+146
Reland https://github.com/llvm/llvm-project/pull/164392 with Fortran support moved to follow-up PR
2025-11-26 | [OpenMP][clang] Register vtables on device for indirect calls runtime (#167011) | Jason-VanBeusekom | 4 | -19/+167
This is branched off of https://github.com/llvm/llvm-project/pull/159856 and consists of the runtime portion of the changes required to support indirect function and virtual function calls on an `omp target device` when the virtual class / indirect function is mapped to the device from the host.

Key Changes
- Introduced a new flag OMP_DECLARE_TARGET_INDIRECT_VTABLE to mark VTable registrations
- Modified setupIndirectCallTable to support both VTable entries and indirect function pointers

Details: The setupIndirectCallTable implementation was modified to support this registration type by retrieving the first address of the VTable and inferring the remaining data needed to build the indirect call table. Since the VTables / classes registered as indirect can be larger than 8 bytes, and the VTables may not be at the first address, we either need to pass the size to __llvm_omp_indirect_call_lookup and have a check at each step of the binary search, or add multiple entries to the indirect table for each address registered. The latter was chosen.

Commit a00def3f20e166d4fb9328e6f0bc0742cd0afa31 is not a part of this PR and is handled / reviewed in https://github.com/llvm/llvm-project/pull/159856.

This is PR (2/3).
Register Vtable PR (1/3): https://github.com/llvm/llvm-project/pull/159856
Codegen / __llvm_omp_indirect_call_lookup PR (3/3): https://github.com/llvm/llvm-project/pull/159857
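The trade-off described above (one entry per address so that the lookup stays a plain binary search) can be sketched as follows. This is only an illustration under stated assumptions, not the runtime's implementation: `IndirectEntry`, `registerRange`, `finalizeTable`, `lookup`, and the 8-byte `SlotSize` are hypothetical and chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical table entry mapping a host address to a device address.
struct IndirectEntry {
  std::uintptr_t Host;
  std::uintptr_t Device;
};

// Assumed slot granularity for this sketch (one pointer-sized slot).
constexpr std::size_t SlotSize = 8;

// Add one entry per slot of a registered object, so that any slot address
// inside it can later be found by an exact-match search.
void registerRange(std::vector<IndirectEntry> &Table, std::uintptr_t Host,
                   std::uintptr_t Device, std::size_t Size) {
  assert(Size % SlotSize == 0 && "sketch assumes whole slots");
  for (std::size_t Off = 0; Off < Size; Off += SlotSize)
    Table.push_back({Host + Off, Device + Off});
}

// Sort once after all registrations; lookup relies on this order.
void finalizeTable(std::vector<IndirectEntry> &Table) {
  std::sort(Table.begin(), Table.end(),
            [](const IndirectEntry &A, const IndirectEntry &B) {
              return A.Host < B.Host;
            });
}

// Plain binary search without a per-step size check.
// Returns 0 when the host address is not in the table.
std::uintptr_t lookup(const std::vector<IndirectEntry> &Table,
                      std::uintptr_t Host) {
  auto It = std::lower_bound(
      Table.begin(), Table.end(), Host,
      [](const IndirectEntry &E, std::uintptr_t H) { return E.Host < H; });
  return (It != Table.end() && It->Host == Host) ? It->Device : 0;
}
```

The cost of this choice is a larger table (a 24-byte object contributes three entries here); in exchange, the search needs no size information at lookup time.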
2025-11-26 | [OFFLOAD] Add support for indexed per-thread containers (#164263) | Alex Duran | 2 | -59/+208
Split from #158900, this adds a PerThreadContainer that can use STL-like indexed containers based on a slightly refactored PerThreadTable.
---------
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-11-25 | [offload][lit] Fix compilation of two offload tests (#169399) | Nick Sarnie | 2 | -3/+3
These are C tests, not C++, so no function parameters means unspecified number of parameters, not `void`. These compile fine on the current tested offload targets because an error is only [thrown](https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDecl.cpp#L10695) if the calling convention doesn't support variadic arguments, which they happen to. When compiling this test for other targets that do not support variadic arguments, we get an error, which does not seem intentional. Just add `void` to the parameter list. --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2025-11-24 | [NFC][OpenMP] Add use_device_ptr/addr tests for when the lookup fails. (#169428) | Abhinav Gaba | 3 | -0/+77
As per OpenMP 5.1, the pointers are expected to retain their original values when a lookup fails and there is no device pointer to translate to.
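The expected fallback can be sketched in a few lines. This is an illustration only, not libomptarget code: `PointerMap` and `translateOrKeep` are hypothetical names, and the table stands in for whatever host-to-device mapping the runtime keeps.

```cpp
#include <unordered_map>

// Hypothetical host-to-device pointer table used only for this sketch.
using PointerMap = std::unordered_map<const void *, void *>;

// Translate a host pointer to its device counterpart. If the lookup
// fails, the pointer retains its original value, as described above.
void *translateOrKeep(const PointerMap &Map, void *HostPtr) {
  auto It = Map.find(HostPtr);
  return It == Map.end() ? HostPtr : It->second;
}
```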
2025-11-24 | [OpenMP][flang] Lowering of OpenMP custom reductions to MLIR (#168417) | Jan Leyonberg | 1 | -0/+88
This patch adds support for lowering of custom reductions to MLIR. It also enhances the capability of the pass to automatically mark functions as "declare target" by traversing custom reduction initializers and combiners.
2025-11-24 | [Flang][OpenMP][MLIR] Initial declare target to for variables implementation (#119589) | agozillon | 3 | -0/+111
While the infrastructure for declare target to/enter and link for variables exists in the MLIR dialect and at the Flang level, the current lowering from MLIR -> LLVM IR isn't in place; it's only in place for variables that have the link clause applied. This PR aims to extend that lowering to an initial implementation that incorporates declare target to as well, which primarily requires changes in the OpenMPToLLVMIRTranslation phase. However, a minor addition to the OpenMP dialect was required to extend the declare target enumerator to include a default None field as well.

This also requires a minor change to the Flang lowering's MapInfoFinalization.cpp pass to alter the map type for descriptors to deal with cases where a variable is marked declare target to. Currently, when a descriptor variable is mapped declare target to, the descriptor component can become attached and cannot be updated; this results in issues when an unusual allocation range is specified (effectively an off-by-X error). The current solution is to map the descriptor always, as we always require an up-to-date version of this data. However, this also requires an interlinked PR that adds a more intricate type of mapping of structures/record types that clang currently implements, to circumvent the overwriting of the pointer in the descriptor.

3/3 required PRs to enable declare target to mapping; this PR should pass all tests and provide an all green CI.

Co-authored-by: Raghu Maddhipatla <raghu.maddhipatla@amd.com>
2025-11-24 | [MLIR][OpenMP] Introduce overlapped record type map support (#119588) | agozillon | 1 | -0/+56
This PR introduces an additional type of map lowering for record types that Clang currently supports, in which a user can map a top-level record type and then individual members with different mapping, effectively creating a sort of "overlapping" mapping that we attempt to cut around.

This is currently most predominantly used in Fortran when mapping descriptors and their data: we map the descriptor and its data with separate map modifiers and "cut around" the pointer data, so that we do not overwrite it unless the runtime deems it a necessary action based on its reference counting mechanism. However, it is a mechanism that will come in handy/trigger when a user explicitly maps a record type (derived type or structure) and then explicitly maps a member with a different map type.

These additions were predominantly in the OpenMPToLLVMIRTranslation.cpp file and phase; however, one Flang test that checks end-to-end IR compilation (as far as we care for now at least) was altered.

2/3 required PRs to enable declare target to mapping; look at PR 3/3 to check for full green passes (this one will fail a number due to some dependencies).

Co-authored-by: Raghu Maddhipatla <raghu.maddhipatla@amd.com>
2025-11-20 | [OFFLOAD] Add support for more fine grained debug messages control (#165416) | Alex Duran | 4 | -2/+285
This PR introduces new debug macros that allow finer control of which debug messages to output and introduces C++ stream style for debug messages. Changing existing messages (except a few that I changed for testing) will come in subsequent PRs. I also think that we should make debug enabling OpenMP agnostic but, for now, I prioritized maintaining the current libomptarget behavior, and we might need more changes further down the line as we decouple libomptarget.
2025-11-19 | [Offload] Make the RPC thread sleep briefly when idle (#168596) | Joseph Huber | 1 | -3/+18
Summary: We start this thread if the RPC client symbol is detected in the loaded binary. We should make this sleep if there's no work to avoid the thread running at high priority when the (scarcely used) RPC call is actually required. So, right now after 25 microseconds we will assume the server is inactive and begin sleeping. This resets once we do find work.

AMD supports a more intelligent way to do this. HSA signals can wake a sleeping thread from the kernel, and signals can be sent from the GPU side. This would be nice to have and I'm planning on working with it in the future to make this infrastructure more usable with existing AMD workloads.
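The idle policy described above (begin sleeping after 25 microseconds without work, reset once work is found) can be sketched with a small helper. This is a hedged illustration, not the actual server loop: `IdleTracker` and its members are hypothetical, and the time points are passed in explicitly so the behavior is easy to check.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper tracking how long a polling thread has been idle.
class IdleTracker {
public:
  explicit IdleTracker(
      Clock::time_point Now,
      std::chrono::microseconds Limit = std::chrono::microseconds(25))
      : LastWork(Now), Threshold(Limit) {}

  // Call when a poll found work: resets the idle window.
  void noteWork(Clock::time_point Now) { LastWork = Now; }

  // True once no work has been seen for longer than the threshold.
  bool shouldSleep(Clock::time_point Now) const {
    return Now - LastWork > Threshold;
  }

private:
  Clock::time_point LastWork;
  std::chrono::microseconds Threshold;
};
```

A polling loop built on this would call `noteWork(Clock::now())` after handling a request and sleep for some short interval whenever `shouldSleep(Clock::now())` returns true; the commit message does not state the sleep interval, so none is assumed here.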
2025-11-19 | [Runtimes] Default build must use its own output dirs (#168266) | Michael Kruse | 3 | -9/+9
Post-commit fix of #164794 reported at https://github.com/llvm/llvm-project/pull/164794#issuecomment-3536253493

`LLVM_LIBRARY_OUTPUT_INTDIR` and `LLVM_RUNTIME_OUTPUT_INTDIR` are used by `AddLLVM.cmake` as output directories. Unless we are in a bootstrapping build, they must not point to directories found by `find_package(LLVM)`, which may be read-only directories. MLIR, for instance, sets these variables to its own build output directory, and so should the runtimes.
2025-11-18 | Revert "[OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid()" (#168547) | Robert Imschweiler | 5 | -145/+0
Reverts llvm/llvm-project#164392 due to Fortran issues
2025-11-18 | [OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid() (#164392) | Robert Imschweiler | 5 | -0/+145
Use the implementation in libomptarget. If libomptarget is not available, always return the UID / device number of the host / the initial device.
2025-11-14 | [OpenMP][Flang] Emit default declare mappers implicitly for derived types (#140562) | Akash Banerjee | 1 | -0/+65
This patch adds support for emitting default declare mappers for implicit mapping of derived types when not supplied by the user. This especially helps tackle mapping of allocatables of derived types.
2025-11-13 | [Offload] Add device info for shared memory (#167817) | Kevin Sala Penades | 8 | -4/+47
2025-11-13 | [offload] defer "---> olInit" trace message (#167893) | Łukasz Plewa | 1 | -5/+11
Tracing requires liboffload to be initialized, so calling isTracingEnabled() before olInit always returns false. This caused the first trace log to look like:

```
-> OL_SUCCESS
```

instead of:

```
---> olInit()
-> OL_SUCCESS
```

This patch moves the pre-call trace print for olInit so it is emitted only after initialization. It would be possible to add extra logic to detect whether liboffload is already initialized and only postpone the first pre-call print, but this would add unnecessary complexity, especially since this is tablegen code. The difference would matter only in the unlikely case of a crash during a second olInit call.
---------
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-11-10 | [PGO][Offload] Fix missing names bug in GPU PGO (#166444) | Ethan Luis McDonough | 4 | -4/+0
After #163011 was merged, the tests in [`offload/test/offloading/gpupgo`](https://github.com/llvm/llvm-project/compare/main...EthanLuisMcDonough:llvm-project:gpupgo-names-fix-pr?expand=1#diff-f769f6cebd25fa527bd1c1150cc64eb585c41cb8a8b325c2bc80c690e47506a1) broke because the offload plugins were no longer able to find `__llvm_prf_nm`. This pull request explicitly makes `__llvm_prf_nm` visible to the host on GPU targets and reverses the changes made in f7e9968a5ba99521e6e51161f789f0cc1745193f.
2025-11-08 | [Offload] Remove unused KernelArgsTy instantiation (#167197) | Kevin Sala Penades | 1 | -4/+0
2025-11-06 | [OpenMP] Fix tests relying on the heap size variable | Joseph Huber | 4 | -7/+12
Summary: I made that an unimplemented error, but forgot that it was used for this environment variable.
2025-11-06 | [Offload] Remove handling for device memory pool (#163629) | Joseph Huber | 9 | -150/+12
Summary: This was a lot of code that was only used for upstream LLVM builds of AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so just use that. Simplifies code, can be added back if we start providing alternate forms but I don't think there's a single use-case that would justify it yet.