aboutsummaryrefslogtreecommitdiff
path: root/offload/include
AgeCommit message (Collapse)AuthorFilesLines
13 days[OFFLOAD] Implement excluding filters for debugging (#180538)Alex Duran1-12/+32
Allow a to define a set of Types that are not shown by default when doing default debug loggin (e.g., LIBOMPTARGET_DEBUG=All). Users can enable output of those types of messages by explicitly adding them to LIBOMPTARGET_DEBUG. Used to implement: #180545 --------- Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2026-01-30[Offload] Add a function to register an RPC Server callback (#178774)Joseph Huber1-0/+5
Summary: We provide an RPC server to manage calls initiated by the device to run on the host. This is very useful for the built-in handling we have, however there are cases where we would want to extend this functionality. Cases like Fortran or MPI would be useful, but we cannot put references to these in the core offloading runtime. This way, we can provide this as a library interface that registers custom handlers for whatever code people want.
2026-01-27[Offload] Allow specifying debug level with `all` debug type (#178213)Hansang Bae1-2/+8
This change allows users to specify debug level with `all` debug type. The effect of `all:<num>` is equivalent to `<num>`.
2026-01-21[OpenMP][Offload] Remove old DP macro (#177156)Alex Duran1-5/+2
Old usages have been updated so we can remove it now
2026-01-15[OpenMP][Offload] Translate Info types to Debug types when debug enabled ↵Alex Duran1-1/+37
(#175599) Eventually we might want to rework the INFO macro to work like the new ODBG macro but in the meantime at least translate the Info type to the correct Debug type instead of just using DP directly (which uses the default type).
2026-01-15[OpenMP][Offload] Add a buffer layer to debug messaging (#176153)Alex Duran1-18/+20
To reduce interference between threads, instead of writing the components of a debug message directly to the underlying stream, write them to a buffer and flush the buffer to the stream when its completed.
2026-01-14[OpenMP][Offload] Add FB_NULLIFY map-type for `use_device_ptr(fb_nullify)`. ↵Abhinav Gaba1-0/+4
(1/4) (#169603) Depends on #174659. This PR adds a new map-type bit to control the fallback behavior when when a pointer lookup fails. For now, this is only meaningful with `RETURN_PARAM`, and can be used for `need_device_ptr` (for which the default is to use `nullptr` as the result when lookup fails), and OpenMP 6.1's `use_device_ptr(fb_nullify)`. Eventually, this can be extended to work with assumed-size maps on `target` constructs, to control what the argument should be set to when lookup fails (the OpenMP spec does not have a way to control that yet). Dependent PR: #170578.
2026-01-12[Offload] Update debug message printig in the plugins (#175205)Hansang Bae2-17/+32
* Prepare a set of debug types in llvm::offload::debug to be used in plugin code * Update debug messages in the plugins
2026-01-12[OFFLOAD][OpenMP] Remove old style REPORT support (#175607)Alex Duran2-19/+4
Fix the few remaining usages and remove the support for the old REPORT macro.
2025-12-18Revert "[OMPT][Offload][OpenMP] Fixes for OMPT data used by libomptarget" ↵Joachim1-20/+9
(#172827) Reverts llvm/llvm-project#156020 We will need some time for investigating buildbot failures
2025-12-18[OMPT][Offload][OpenMP] Fixes for OMPT data used by libomptarget (#156020)Kaloyan Ignatov1-9/+20
These commits fix issues regarding storage of tool data within libomptarget. Both libomp and libomptarget have been modified to accommodate this. We differentiate between two cases depending on the type of the target region: - merged target regions (default, without `nowait` clause): behavior remains unchanged, tool data is stored in the thread local RegionInterface class within libomptarget. - deferred target regions (using `nowait` clause): tool data is moved to `ompt_task_info_t` struct within libomp, as `RegionInterface` is thread local and its data is lost whenever another task is scheduled on the thread, which happens with deferred target regions. In the new implementation, `RegionInterface` receives pointers to `ompt_task_info_t` within libomp which are handled transparently within libomptarget. Thus, the problem of tool data getting lost when a thread receives a new task is resolved: `target_data` and `target_task_data` remain set. Another issue was the value of `task_data` which is supposed to belong to the generating task of the region according to the OpenMP standard, but instead had been set to the `task_data` of the target task itself until now. Test cases have been added which check both of these fixes. --------- Co-authored-by: Joachim <jenke@itc.rwth-aachen.de>
2025-12-18[OFFLOAD] Add plugin with support for Intel oneAPI Level Zero (#158900)Alex Duran1-0/+1
Add a new nextgen plugin that supports GPU devices through the Intel oneAPI Level Zero library. The plugin is not enabled by default and needs to be added to LIBOMPTARGET_PLUGINS_TO_BUILD explicitely. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-17[Offload] Debug message update part 2 (#171683)Hansang Bae4-12/+22
Update debug messages based on the new method from #170425. Added a new debug type `Tool` and updated the following files. - include/OffloadPolicy.h - include/OpenMP/OMPT/Connector.h - include/Shared/Debug.h - include/Shared/EnvironmentVar.h - libomptarget/OpenMP/Mapping.cpp - libomptarget/OpenMP/OMPT/Callback.cpp - libomptarget/PluginManager.cpp
2025-12-17[OpenMP][Offload] Add support for lambdas with debug conditions (#172573)Alex Duran1-9/+45
This PR adds a new set of debug macros that allow a certain code to be only executed when certain debug conditions are met. This is useful to guard things that are not strictly messages but compute and store things that are related to those messages. Strictly speaking the existing ODBG_OS could be used as well but that requires a stream object to be created which is unnecessary in some cases. Example of how it works: ```cpp ODBG_IF("Counters", [&](uint32_t Level) { someCounter++; if (Level == 2) moreDetailedCounter += f(); }); ODBG("Counters") << "Counter" = someCounter << ODBG_IF(2) << "DetailedCounter" << moreDetailedCounter; ```
2025-12-16[OpenMP][Offload] Add `LIBOMPTARGET_TREAT_ATTACH_AUTO_AS_ALWAYS` to treat ↵Abhinav Gaba1-0/+11
`attach(auto)` as `attach(always)`. (#172382) This is needed as a way to support older code that was expecting unconditional attachment to happen for cases like: ```c int *p; int x; #pragma omp targret enter data map(p) // (A) #pragma omp target enter data map(x) // (B) p = &x; // By default, this does NOT attach p and x #pragma omp target enter data map(p[0:0]) // (C) ``` When the environment variable is set, such maps, where both the pointer and the pointee already have corresponding copies on the device, but are not attached to one another, will be attached as-if OpenMP 6.1 TR14's `attach(always)` map-type-modifier was specified on `(C)`.
2025-12-16Revert "[OpenMP][Offload] Add support for lambdas with debug conditions" ↵Alex Duran1-46/+9
(#172570) Reverts llvm/llvm-project#172107
2025-12-16[OpenMP][Offload] Add support for lambdas with debug conditions (#172107)Alex Duran1-9/+46
This PR adds a new set of debug macros that allow a certain code to be only executed when certain debug conditions are met. This is useful to guard things that are not strictly messages but compute and store things that are related to those messages. Strictly speaking the existing ODBG_OS could be used as well but that requires a stream object to be created which is unnecessary in some cases. Example of how it works: ``` ODBG_IF("Counters", [&](uint32_t Level) { someCounter++; if (Level == 2) moreDetailedCounter += f(); }); ODBG("Counters") << "Counter" = someCounter << ODBG_IF(2) << "DetailedCounter" << moreDetailedCounter; ```
2025-12-16[OpenMP][Offload] Revert format of changed messages (#171995)Alex Duran1-0/+7
Adjust format of some of the updated debug output to match the old format as there are a number of tests that rely on it.
2025-12-10[OpenMP][Offload] Continue to update libomptarget debug messages (#170425)Alex Duran1-0/+53
* Add support to use lambdas to output debug messages (like LDBG_OS) * Update messages for interface.cpp and omptarget.cpp
2025-12-02[OFFLOAD][LIBOMPTARGET] Start to update debug messages in libomptarget (#170265)Alex Duran1-131/+192
* Add compatibility support for DP and REPORT macros * Define a set of predefined Debug Type for libomptarget * Start to update libomptarget files (OffloadRTL.cpp, device.cpp)
2025-12-01Reland: [OpenMP] Implement omp_get_uid_from_device() / ↵Robert Imschweiler2-0/+10
omp_get_device_from_uid() (#168554) Reland https://github.com/llvm/llvm-project/pull/164392 with Fortran support moved to follow-up PR
2025-11-26[OpenMP][clang] Register vtables on device for indirect calls runtime (#167011)Jason-VanBeusekom1-0/+2
This is a branch off of https://github.com/llvm/llvm-project/pull/159856, in which consists of the runtime portion of the changes required to support indirect function and virtual function calls on an `omp target device` when the virtual class / indirect function is mapped to the device from the host. Key Changes - Introduced a new flag OMP_DECLARE_TARGET_INDIRECT_VTABLE to mark VTable registrations - Modified setupIndirectCallTable to support both VTable entries and indirect function pointers Details: The setupIndirectCallTable implementation was modified to support this registration type by retrieving the first address of the VTable and inferring the remaining data needed to build the indirect call table. Since the Vtables / Classes registered as indirect can be larger than 8 bytes, and the vtables may not be at the first address we either need to pass the size to __llvm_omp_indirect_call_lookup and have a check at each step of the binary search, or add multiple entries to the indirect table for each address registered. The latter was chosen. Commit: a00def3f20e166d4fb9328e6f0bc0742cd0afa31 is not a part of this PR and is handled / reviewed in: https://github.com/llvm/llvm-project/pull/159856, This is PR (2/3) Register Vtable PR (1/3): https://github.com/llvm/llvm-project/pull/159856, Codegen / _llvm_omp_indirect_call_lookup PR (3/3): https://github.com/llvm/llvm-project/pull/159857
2025-11-26[OFFLOAD] Add support for indexed per-thread containers (#164263)Alex Duran2-59/+208
Split from #158900 it adds a PerThreadContainer that can use STL-like indexed containers based on a slightly refactored PerThreadTable. --------- Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-11-20[OFFLOAD] Add support for more fine grained debug messages control (#165416)Alex Duran1-0/+281
This PR introduces new debug macros that allow a more fined control of which debug message to output and introduce C++ stream style for debug messages. Changing existing messages (except a few that I changed for testing) will come in subsequent PRs. I also think that we should make debug enabling OpenMP agnostic but, for now, I prioritized maintaing the current libomptarget behavior for now, and we might need more changes further down the line as we we decouple libomptarget.
2025-11-18Revert "[OpenMP] Implement omp_get_uid_from_device() / ↵Robert Imschweiler2-9/+0
omp_get_device_from_uid()" (#168547) Reverts llvm/llvm-project#164392 due to fortran issues
2025-11-18[OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid() ↵Robert Imschweiler2-0/+9
(#164392) Use the implementation in libomptarget. If libomptarget is not available, always return the UID / device number of the host / the initial device.
2025-11-08[Offload] Remove unused KernelArgsTy instantiation (#167197)Kevin Sala Penades1-4/+0
2025-11-06[Offload] Remove handling for device memory pool (#163629)Joseph Huber1-22/+0
Summary: This was a lot of code that was only used for upstream LLVM builds of AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so just use that. Simplifies code, can be added back if we start providing alternate forms but I don't think there's a single use-case that would justify it yet.
2025-10-24[OFFLOAD] Remove weak from __kmpc_* calls and gather them in one header ↵Alex Duran3-7/+18
(#164613) Follow-up from #162652 --------- Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2025-10-22[OpenMP] Adds omp_target_is_accessible routine (#138294)Nicole Aschenbrenner2-0/+4
Adds omp_target_is_accessible routine. Refactors common code from omp_target_is_present to work for both routines. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>
2025-10-17[OFFLOAD] Interop fixes for Windows (#162652)Alex Duran1-6/+6
On Windows, for a reason I don't fully understand boolean bits get extra padding (even when asking for packed structures) in the structures that messes the offsets between the compiler and the runtime. Also, "weak" works differently on Windows than Linux (i.e., the "local" routine has preference) which causes it to crash as we don't really have an alternate implementation of __kmpc_omp_wait_deps. Given this, it doesn't make sense to mark it as "weak" for Linux either.
2025-09-20[Offload] Remove non-blocking allocation type (#159851)Joseph Huber1-2/+0
Summary: This was originally added in as a hack to work around CUDA's limitation on allocation. The `libc` implementation now isn't even used for CUDA so this code is never hit. Even if this case, this code never truly worked. A true solution would be to use CUDA's virtual memory API instead to allocate 2MiB slabs independenctly from the normal memory management done in the stream.
2025-09-15[Offload][OpenMP] Support shadow-pointer tracking for Fortran descriptors. ↵Abhinav Gaba1-3/+56
(#158370) This change adds support for saving full contents of attached Fortran descriptors, and not just their pointee address, in the shadow-pointer table. With this, we now support: * comparing full contents of descriptors to check whether a previous shadow-pointer entry is stale; * restoring the full contents of descriptors And with that, we can now use ATTACH map-types (added in #149036) for mapping Fortran pointer/allocatable arrays, and array-sections on them. e.g.: ```f90 integer, allocatable :: x(:) !$omp target enter data map(to: x(:)) ``` as: ``` void* addr_of_pointee = allocated(x) ? &x(1) : nullptr; int64_t sizeof_pointee = allocated(x) ? sizeof(x(:)) : 0 addr_of_pointee, addr_of_pointee, sizeof_pointee, TO addr_of_descriptor, addr_of_pointee, size_of_descriptor, ATTACH ```
2025-09-08[OpenMP] Move `__omp_rtl_data_environment' handling to OpenMP (#157182)Joseph Huber1-0/+2
Summary: This operation is done every time we load a binary, this behavior should be moved into OpenMP since it concerns an OpenMP specific data struct. This is a little messy, because ideally we should only be using public APIs, but more can be extracted later.
2025-08-17[Offload] Introduce ATTACH map-type support for pointer attachment. (#149036)Abhinav Gaba3-2/+45
This patch introduces libomptarget support for the ATTACH map-type, which can be used to implement OpenMP conditional compliant pointer attachment, based on whether the pointer/pointee is newly mapped on a given construct. For example, for the following: ```c int *p; #pragma omp target enter data map(p[1:10]) ``` The following maps can be emitted by clang: ``` (A) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM &p, &p[1], sizeof(p), ATTACH ``` Without this map-type, these two possible maps could be emitted by clang: ``` (B) &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM (C) &p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ ```` (B) does not perform any pointer attachment, while (C) also maps the pointer p, which are both incorrect. In terms of implementation, maps with the ATTACH map-type are handled after all other maps have been processed, as it requires knowledge of which new allocations happened as part of the construct. As per OpenMP 5.0, an attachment should happen only when either the pointer or the pointee was newly mapped while handling the construct. Maps with ATTACH map-type-bit do not increase/decrease the ref-count. With OpenMP 6.1, `attach(always/never)` can be used to force/prevent attachment. For `attach(always)`, the compiler will insert the ALWAYS map-type, which would let libomptarget bypass the check about one of the pointer/pointee being new. With `attach(never)`, the ATTACH map will not be emitted at all. The size argument of the ATTACH map-type can specify values greater than `sizeof(void*)` which can be used to support pointer attachment on Fortran descriptors. Note that this also requires shadow-pointer tracking to also support them. That has not been implemented in this patch. This was worked upon in coordination with Ravi Narayanaswamy, who has since retired. Happy retirement, Ravi! --------- Co-authored-by: Alex Duran <alejandro.duran@intel.com>
2025-08-10[Offload][NFC] Re-enable clang-format for omptarget.h (#152937)Kevin Sala Penades1-22/+19
2025-08-08[Offload] Make olLaunchKernel test thread safe (#149497)Ross Brunton1-0/+4
This sprinkles a few mutexes around the plugin interface so that the olLaunchKernel CTS test now passes when ran on multiple threads. Part of this also involved changing the interface for device synchronise so that it can optionally not free the underlying queue (which introduced a race condition in liboffload).
2025-08-06[OFFLOAD][OPENMP] 6.0 compatible interop interface (#143491)Alex Duran5-28/+296
The following patch introduces a new interop interface implementation with the following characteristics: * It supports the new 6.0 prefer_type specification * It supports both explicit objects (from interop constructs) and implicit objects (from variant calls). * Implements a per-thread reuse mechanism for implicit objects to reduce overheads. * It provides a plugin interface that allows selecting the supported interop types, and managing all the backend related interop operations (init, sync, ...). * It enables cooperation with the OpenMP runtime to allow progress on OpenMP synchronizations. * It cleanups some vendor/fr_id mismatchs from the current query routines. * It supports extension to define interop callbacks for library cleanup.
2025-06-03[Offload] Don't check in generated files (#141982)Callum Fare2-138/+0
Previously we decided to check in files that we generate with tablegen. The justification at the time was that it helped reviewers unfamiliar with `offload-tblgen` see the actual changes to the headers in PRs. After trying it for a while, it's ended up causing some headaches and is also not how tablegen is used elsewhere in LLVM. This changes our use of tablegen to be more conventional. Where possible, files are still clang-formatted, but this is no longer a hard requirement. Because `OffloadErrcodes.inc` is shared with libomptarget it now gets generated in a more appropriate place.
2025-05-20[Offload] Use new error code handling mechanism and lower-case messages ↵Ross Brunton2-17/+67
(#139275) [Offload] Use new error code handling mechanism This removes the old ErrorCode-less error method and requires every user to provide a concrete error code. All calls have been updated. In addition, for consistency with error messages elsewhere in LLVM, all messages have been made to start lower case.
2025-05-19[Offload] Add Error Codes to PluginInterface (#138258)Ross Brunton2-0/+88
A new ErrorCode enumeration is present in PluginInterface which can be used when returning an llvm::Error from offload and PluginInterface functions. This enum must be kept up to sync with liboffload's ol_errc_t enum, so both are automatically generated from liboffload's enum definition. Some error codes have also been shuffled around to allow for future work. Note that this patch only adds the machinery; actual error codes will be added in a future patch. ~~Depends on #137339 , please ignore first commit of this MR.~~ This has been merged.
2025-04-21[OpenMP] Remove dependency on LLVM include directory from DeviceRTL (#136359)Joseph Huber1-10/+2
Summary: Currently we depend on a single LLVM include directory. This is actually only required to define one enum, which is highly unlikely to change. THis patch makes the `Environment.h` include directory more hermetic so we no long depend on other libraries. In exchange, we get a simpler dependency list for the price of hard-coding `1` somewhere. I think it's a valid trade considering that this flag is highly unlikely to change at this point. @ronlieb AMD version https://gist.github.com/jhuber6/3313e6f957be14dc79fe85e5126d2cb3
2025-03-28[OFFLOAD] Stricter enforcement of user offload disable (#133470)Alex1-1/+6
If user specifies offload is disabled (e.g., OMP_TARGET_OFFLOAD=disable), disable library almost completely. This reduces resources spent to a minimum and ensures all APIs behave as if the only available device is the host device. Currently some of the APIs behave as if there were devices avaible for offload even when under OMP_TARGET_OFFLOAD=disable. --------- Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-03-18[openmp][nfc] Use builtin align in the devicertl (#131918)Jon Chesterfield1-4/+0
Noticed while extracting the smartstack as a test case
2025-02-11 [PGO][Offload] Profile profraw generation for GPU instrumentation #76587 ↵Ethan Luis McDonough1-0/+1
(#93365) This pull request is the second part of an ongoing effort to extends PGO instrumentation to GPU device code and depends on #76587. This PR makes the following changes: - Introduces `__llvm_write_custom_profile` to PGO compiler-rt library. This is an external function that can be used to write profiles with custom data to target-specific files. - Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so that it can write the collected data to a profraw file. - Adds `PGODump` debug flag and only displays dump when the aforementioned flag is set
2025-01-31[Offload][NFC] Fix typos discovered by codespell (#125119)Christian Clauss4-5/+5
https://github.com/codespell-project/codespell % `codespell --ignore-words-list=archtype,hsa,identty,inout,iself,nd,te,ths,vertexes --write-changes`
2025-01-28[Offload] Rework offloading entry type to be more generic (#124018)Joseph Huber1-0/+6
Summary: The previous offloading entry type did not fit the current use-cases very well. This widens it and adds a version to prevent further annoyances. It also includes the kind to better sort who's using it. The first 64-bytes are reserved as zero so the OpenMP runtime can detect the old format for binary compatibilitry.
2025-01-21[Offload][NFC] Factor out and rename the `__tgt_offload_entry` struct (#123785)Joseph Huber4-31/+27
Summary: This patch is an NFC renaming to make using the offloading entry type more portable between other targets. Right now this is just moving its definition to LLVM so others can use it. Future work will rework the struct layout.
2025-01-09[OpenMP] Use __builtin_bit_cast instead of UB type punning (#122325)Joseph Huber1-5/+0
Summary: Use a normal bitcast, remove from the shared utils since it's not available in GCC 7.4
2024-12-02[OpenMP] Unconditionally provide an RPC client interface for OpenMP (#117933)Joseph Huber1-0/+25
Summary: This patch adds an RPC interface that lives directly in the OpenMP device runtime. This allows OpenMP to implement custom opcodes. Currently this is only providing the host call interface, which is the raw version of reverse offloading. Previously this lived in `libc/` as an extension which is not the correct place. The interface here uses a weak symbol for the RPC client by the same name that the `libc` interface uses. This means that it will defer to the libc one if both are present so we don't need to set up multiple instances. The presense of this symbol is what controls whether or not we set up the RPC server. Because this is an external symbol it normally won't be optimized out, so there's a special pass in OpenMPOpt that deletes this symbol if it is unused during linking. That means at `O0` the RPC server will always be present now, but will be removed trivially if it's not used at O1 and higher.