aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/CGOpenMPRuntime.cpp
AgeCommit message (Collapse)AuthorFilesLines
10 days[OpenMP] Fix initialization order for CopyOverlappedEntryGaps (#150431)Julian Brown1-2/+2
NFC.
10 days[OpenMP] Don't emit redundant zero-sized mapping nodes for overlapped ↵Julian Brown1-47/+110
structs (#148947) The handling of overlapped structure mapping in CGOpenMPRuntime.cpp can lead to redundant zero-sized mapping nodes at runtime. This patch fixes it using a combination of approaches: trivially adjacent struct members won't have a mapping node created between them, and for more complicated cases (inheritance) the physical layout of the struct/class is used to make sure that elements aren't missed. I've introduced a new class to track the state whilst iterating over the struct. This reduces a bit of redundancy in the code (accumulating CombinedInfo both during and after the loop), which I think is a bit neater. Before: omptarget --> Entry 0: Base=0x00007fff8d483830, Begin=0x00007fff8d483830, Size=48, Type=0x20, Name=unknown omptarget --> Entry 1: Base=0x00007fff8d483830, Begin=0x00007fff8d483830, Size=0, Type=0x1000000000003, Name=unknown omptarget --> Entry 2: Base=0x00007fff8d483830, Begin=0x00007fff8d483834, Size=0, Type=0x1000000000003, Name=unknown omptarget --> Entry 3: Base=0x00007fff8d483830, Begin=0x00007fff8d483838, Size=0, Type=0x1000000000003, Name=unknown omptarget --> Entry 4: Base=0x00007fff8d483830, Begin=0x00007fff8d48383c, Size=20, Type=0x1000000000003, Name=unknown omptarget --> Entry 5: Base=0x00007fff8d483830, Begin=0x00007fff8d483854, Size=0, Type=0x1000000000003, Name=unknown omptarget --> Entry 6: Base=0x00007fff8d483830, Begin=0x00007fff8d483858, Size=0, Type=0x1000000000003, Name=unknown omptarget --> Entry 7: Base=0x00007fff8d483830, Begin=0x00007fff8d48385c, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 8: Base=0x00007fff8d483830, Begin=0x00007fff8d483830, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 9: Base=0x00007fff8d483830, Begin=0x00007fff8d483834, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 10: Base=0x00007fff8d483830, Begin=0x00007fff8d483838, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 11: Base=0x00007fff8d483840, Begin=0x00005e7665275130, Size=32, Type=0x1000000000013, Name=unknown omptarget --> Entry 12: Base=0x00007fff8d483830, Begin=0x00007fff8d483850, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 13: Base=0x00007fff8d483830, Begin=0x00007fff8d483854, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 14: Base=0x00007fff8d483830, Begin=0x00007fff8d483858, Size=4, Type=0x1000000000003, Name=unknown After: omptarget --> Entry 0: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f562e0, Size=48, Type=0x20, Name=unknown omptarget --> Entry 1: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f562ec, Size=20, Type=0x1000000000003, Name=unknown omptarget --> Entry 2: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f5630c, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 3: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f562e0, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 4: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f562e4, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 5: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f562e8, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 6: Base=0x00007fffd0f562f0, Begin=0x000058b6013fb130, Size=32, Type=0x1000000000013, Name=unknown omptarget --> Entry 7: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f56300, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 8: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f56304, Size=4, Type=0x1000000000003, Name=unknown omptarget --> Entry 9: Base=0x00007fffd0f562e0, Begin=0x00007fffd0f56308, Size=4, Type=0x1000000000003, Name=unknown For code: #include <cstdlib> #include <cstdio> struct S { int x; int y; int z; int *p1; int *p2; }; struct T : public S { int a; int b; int c; }; int main() { T v; v.p1 = (int*) calloc(8, sizeof(int)); v.p2 = (int*) calloc(8, sizeof(int)); #pragma omp target map(tofrom: v, v.x, v.y, v.z, v.p1[:8], v.a, v.b, v.c) { v.x++; v.y += 2; v.z += 3; v.p1[0] += 4; v.a += 7; v.b += 5; v.c += 6; } return 0; }
2025-07-15[clang][modules] Serialize `CodeGenOptions` (#146422)Jan Svoboda1-2/+2
Some `LangOptions` duplicate their `CodeGenOptions` counterparts. My understanding is that this was done solely because some infrastructure (like preprocessor initialization, serialization, module compatibility checks, etc.) were only possible/convenient for `LangOptions`. This PR implements the missing support for `CodeGenOptions`, which makes it possible to remove some duplicate `LangOptions` fields and simplify the logic. Motivated by https://github.com/llvm/llvm-project/pull/146342.
2025-07-07[NFC][Clang][OpenMP] Refactor mapinfo generation for captured vars (#146891)Abhinav Gaba1-42/+86
The refactored code would allow creating multiple member-of maps for the same captured var, which would be useful for changes like https://github.com/llvm/llvm-project/pull/145454.
2025-06-21[CodeGen] Use range-based for loops (NFC) (#145142)Kazu Hirata1-16/+10
2025-06-11[OpenMP 6.0 ]Codegen for Reduction over private variables with reduction ↵CHANDRA GHALE1-9/+283
clause (#134709) Codegen support for reduction over private variable with reduction clause. Section 7.6.10 in in OpenMP 6.0 spec. - An internal shared copy is initialized with an initializer value. - The shared copy is updated by combining its value with the values from the private copies created by the clause. - Once an encountering thread verifies that all updates are complete, its original list item is updated by merging its value with that of the shared copy and then broadcast to all threads. Sample Test Case from OpenMP 6.0 Example ``` #include <assert.h> #include <omp.h> #define N 10 void do_red(int n, int *v, int &sum_v) { sum_v = 0; // sum_v is private #pragma omp for reduction(original(private),+: sum_v) for (int i = 0; i < n; i++) { sum_v += v[i]; } } int main(void) { int v[N]; for (int i = 0; i < N; i++) v[i] = i; #pragma omp parallel num_threads(4) { int s_v; // s_v is private do_red(N, v, s_v); assert(s_v == 45); } return 0; } ``` Expected Codegen: ``` // A shared global/static variable is introduced for the reduction result. // This variable is initialized (e.g., using memset or a UDR initializer) // e.g., .omp.reduction.internal_private_var // Barrier before any thread performs combination call void @__kmpc_barrier(...) // Initialization block (executed by thread 0) // e.g., call void @llvm.memset.p0.i64(...) or call @udr_initializer(...) call void @__kmpc_critical(...) // Inside critical section: // Load the current value from the shared variable // Load the thread-local private variable's value // Perform the reduction operation // Store the result back to the shared variable call void @__kmpc_end_critical(...) // Barrier after all threads complete their combinations call void @__kmpc_barrier(...) // Broadcast phase: // Load the final result from the shared variable) // Store the final result to the original private variable in each thread // Final barrier after broadcast call void @__kmpc_barrier(...) ``` --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
2025-06-05[Clang] Remap paths in OpenMP runtime calls (#82541) (#141250)Dan McGregor1-5/+18
Apply the debug prefix mapping to the OpenMP location strings. Fixes https://github.com/llvm/llvm-project/issues/82541
2025-05-19[clang] Use *Map::try_emplace (NFC) (#140477)Kazu Hirata1-1/+1
We can simplify the code with *Map::try_emplace where we need default-constructed values while avoding calling constructors when keys are already present.
2025-05-18[clang] Use llvm::max_element (NFC) (#140435)Kazu Hirata1-2/+1
2025-04-14[clang] AST: remove source locations from [Variable/Dependent]SizedArrayType ↵Matheus Izvekov1-2/+2
(#135511)
2025-04-03[CodeGen] Don't include CGDebugInfo.h in CodeGenFunction.h (NFC) (#134100)Nikita Popov1-0/+1
This is an expensive header, only include it where needed. Move some functions out of line to achieve that. This reduces time to build clang by ~0.5% in terms of instructions retired.
2025-03-28[clang][flang][Triple][llvm] Add isOffload function to LangOpts and isGPU ↵Nick Sarnie1-4/+2
function to Triple (#126956) I'm adding support for SPIR-V, so let's consolidate these checks. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-03-03Remove leftover unused variable from #128711Mats Jun Larsen1-1/+0
2025-03-03[CodeGen] Replace PointerType::getUnqual(Type) with opaque pointer version ↵Mats Jun Larsen1-12/+2
(NFC) (#128711) pointer version (NFC) Follow-up to #123569
2025-02-18[MLIR][OpenMP] Add LLVM translation support for OpenMP UserDefinedMappers ↵Akash Banerjee1-13/+14
(#124746) This patch adds OpenMPToLLVMIRTranslation support for the OpenMP Declare Mapper directive. Since both MLIR and Clang now support custom mappers, I've changed the respective function params to no longer be optional as well. Depends on #121005
2025-02-13[clang][NFC] Avoid potential null dereferences (#127017)schittir1-2/+2
Add null checking.
2025-02-11[CodeGen] Avoid repeated hash lookups (NFC) (#126672)Kazu Hirata1-6/+5
2025-01-24[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)Jeremy Morse1-1/+1
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; but to ensure some type safety however, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).
2025-01-23[CodeGen] Migrate away from PointerUnion::dyn_cast (NFC) (#124076)Kazu Hirata1-2/+2
Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect Pos to be nonnull.
2025-01-15[CodeGen] Migrate away from PointerUnion::dyn_cast (NFC) (#123013)Kazu Hirata1-1/+1
Note that PointerUnion::dyn_cast has been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> Literal migration would result in dyn_cast_if_present (see the definition of PointerUnion::dyn_cast), but this patch uses dyn_cast because we expect Data to be nonnull.
2025-01-14[OMPIRBuilder] Introduce struct to hold default kernel teams/threads (#116050)Sergio Afonso1-5/+8
This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure used to simplify passing default and constant values for number of teams and threads, and possibly other target kernel-related information in the future. This is used to forward values passed to `createTarget` to `createTargetInit`, which previously used a default unrelated set of values.
2025-01-09[OpenMP][OMPIRBuilder] Handle non-failing calls properly (#115863)Sergio Afonso1-18/+14
The preprocessor definition used to enable asserts and the one that `llvm::Error` and `llvm::Expected` use to ensure all created instances are checked are not the same. By making these checks inside of an `assert` in cases where errors are not expected, certain build configurations would trigger runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_UNREACHABLE_OPTIMIZE=ON`). The `llvm::cantFail()` function, which was intended for this use case, is used by this patch in place of `assert` to prevent these runtime failures. In tests, new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and `EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release builds.
2024-12-18[OpenMP][Clang] Migrate OpenMP UserDefinedMapper from Clang to OMPIRBuilder ↵Akash Banerjee1-317/+49
(#110001) This patch migrates the OpenMP UserDefinedMapper codegen from Clang to the OpenMPIRBuilder. I will be adding further patches in the near future so that OpenMP dialect in MLIR can make use of these.
2024-12-12[clang] Migrate away from PointerUnion::{is,get} (NFC) (#119724)Kazu Hirata1-2/+2
Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.
2024-12-06[CodeGen] Migrate away from PointerUnion::{is,get} (NFC) (#118600)Kazu Hirata1-7/+7
Note that PointerUnion::{is,get} have been soft deprecated in PointerUnion.h: // FIXME: Replace the uses of is(), get() and dyn_cast() with // isa<T>, cast<T> and the llvm::dyn_cast<T> I'm not touching PointerUnion::dyn_cast for now because it's a bit complicated; we could blindly migrate it to dyn_cast_if_present, but we should probably use dyn_cast when the operand is known to be non-null.
2024-11-28Codegen changes for strict modifier with grainsize/num_tasks of taskloop ↵CHANDRA GHALE1-6/+12
construct (#117196) Initial parsing/sema for 'strict' modifier with 'num_tasks' and ‘grainsize’ clause is present in these commits [grainsize_parsing](https://github.com/llvm/llvm-project/commit/ab9eac762c35068e77f57795e660d06f578c9614) and [num_tasks_parsing](https://github.com/llvm/llvm-project/commit/56c166017055595a9f26933e85bfd89e30c528d0#diff-4184486638e85284c3a2c961a81e7752231022daf97e411007c13a6732b50db9R6545) . However, this implementation appears incomplete as it lacks code generation support. A runtime patch was introduced in this runtime commit [runtime_patch](https://github.com/llvm/llvm-project/commit/540007b42701b5ac9adba076824bfd648a265413#diff-5e95f9319910d6965d09c301359dbe6b23f3eef5ce4d262ef2c2d2137875b5c4R374) , which adds a new API, _kmpc_taskloop_5, to accommodate the strict modifier.  In this patch I have added codegen support. When the strict modifier is present alongside the grainsize or num_tasks clauses of taskloop construct, the code now emits a call to _kmpc_taskloop_5, which includes an additional parameter of type i32 with the value 1 to indicate the strict modifier. If the strict modifier is not present, it falls back to the existing _kmpc_taskloop API call. --------- Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
2024-11-16[CodeGen] Remove unused includes (NFC) (#116459)Kazu Hirata1-5/+0
Identified with misc-include-cleaner.
2024-10-25[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533)Sergio Afonso1-10/+22
This patch implements an approach to communicate errors between the OMPIRBuilder and its users. It introduces `llvm::Error` and `llvm::Expected` objects to replace the values returned by callbacks passed to `OMPIRBuilder` codegen functions. These functions then check the result for errors when callbacks are called and forward them back to the caller, which has the flexibility to recover, exit cleanly or dump a stack trace. This prevents a failed callback to leave the IR in an invalid state and still continue the codegen process, triggering unrelated assertions or segmentation faults. In the case of MLIR to LLVM IR translation of the 'omp' dialect, this change results in the compiler emitting errors and exiting early instead of triggering a crash for not-yet-implemented errors. The behavior in Clang and openmp-opt stays unchanged, since callbacks will continue always returning 'success'.
2024-10-24[clang] Use {} instead of std::nullopt to initialize empty ArrayRef (#109399)Jay Foad1-16/+15
Follow up to #109133.
2024-10-23[flang][OpenMP] Support `target enter|update|exit .. nowait` (#113305)Kareem Ergawy1-2/+2
Extends `nowait` support for other device directives. This PR refactors the task generation utils used for the `target` directive so that they are general enough to be reused for other device directives as well.
2024-10-11[clang][CGOpenMPRuntime] Avoid Type::getPointerTo() (NFC) (#112017)Youngsuk Kim1-31/+14
`llvm::Type::getPointerTo()` is to be deprecated & removed soon.
2024-09-24[codegen][NFC] add static mark for internal usage variable and function ↵Congcong Cai1-5/+5
(#109431) Detect by clang-tidy misc-use-internal-linkage
2024-09-15[CodeGen] Avoid repeated hash lookup (NFC) (#108735)Kazu Hirata1-6/+1
2024-09-13[clang][CodeGen] Strip unneeded calls to raw_string_ostream::str() (NFC)JOE19941-1/+0
Try to avoid excess layer of indirection when possible. p.s. Remove a call to raw_string_ostream::flush() which is a no-op.
2024-09-05[CGOpenMPRuntime] Avoid repeated hash lookups (NFC) (#107358)Kazu Hirata1-3/+1
2024-09-04[CGOpenMPRuntime] Use DenseMap::operator[] (NFC) (#107185)Kazu Hirata1-15/+14
I'm planning to deprecate DenseMap::FindAndConstruct in favor of DenseMap::operator[].
2024-09-03[CGOpenMPRuntime] Use DenseMap::operator[] (NFC) (#107158)Kazu Hirata1-14/+7
I'm planning to deprecate DenseMap::FindAndConstruct in favor of DenseMap::operator[].
2024-08-16[Clang][OMPX] Add the code generation for multi-dim `thread_limit` clause ↵Shilei Tian1-13/+20
(#102717)
2024-08-10[Clang][Sema][OpenMP] Allow `thread_limit` to accept multiple expressions ↵Shilei Tian1-5/+10
(#102715)
2024-08-09[Clang][OMPX] Add the code generation for multi-dim `num_teams` (#101407)Shilei Tian1-1/+21
This patch adds the code generation support for multi-dim `num_teams` clause when it is used with `target teams ompx_bare` construct.
2024-08-06[Clang][Sema][OpenMP] Allow `num_teams` to accept multiple expressions (#99732)Shilei Tian1-3/+4
By the OpenMP standard, `num_teams` clause can only accept one expression (for now). In this patch, we extend it to allow to accept multiple expressions when it is used with `target teams ompx_bare` construct. This will allow to launch a multi-dim grid, same as CUDA/HIP.
2024-08-05[OpenMP][Map][NFC] improve map chain. (#101903)jyu2-git1-8/+14
This is for mapping structure has data members, which have 'default' mappers, where needs to map these members individually using their 'default' mappers. example map(tofrom: spp[0][0]), look at test case. currently create 6 maps: 1>&spp, &spp[0], size 8, maptype TARGET_PARAM | FROM | TO 2>&spp[0], &spp[0][0], size(D)with maptype OMP_MAP_NONE, nullptr 3>&spp[0], &spp[0][0].e, size(e) with maptype MEMBER_OF | FROM | TO 4>&spp[0], &spp[0][0].h, size(h) with maptype MEMBER_OF | FROM | TO 5>&spp, &spp[0],size(8), maptype MEMBER_OF | IMPLICIT | FROM | TO 6>&spp[0], &spp[0][0].f size(D) with maptype MEMBER_OF |IMPLICIT |PTR_AND_OBJ, @.omp_mapper._ZTS1C.default maptype with/without OMP_MAP_PTR_AND_OBJ For "2" and "5", since it is mapping pointer and pointee pair, PTR_AND_OBJ should be set But for "6" the PTR_AND_OBJ should not set. However, "5" is duplicate with "1" can be skip. To fix "2", during the call to emitCombinEntry with false with NotTargetParams instead !PartialStruct.PreliminaryMapData.BasePointers.empty(), since all captures need to be TARGET_PARAM And inside emitCombineEntry: check !PartialStruct.PreliminaryMapData.BasePointers.empty() to set PTR_AND_OBJ For "5" and "6": the fix in generateInfoForComponentList: Add new variable IsPartialMapped set with !PartialStruct.PreliminaryMapData.BasePointers.empty(); When that is true, skip generate "5" and don"t set IsExpressionFirstInfo to false, so that PTR_AND_OBJ would be set. After fix: will have 5 maps instead 6 1>&spp, &spp[0], size 8, maptype TARGET_PARAM | FROM | TO 2>&spp[0], &spp[0][0], size(D), maptype PTR_AND_OBJ, nullptr 3>&spp[0], &spp[0][0].e, size(e), maptype MEMBER_OF_2 | FROM | TO 4>&spp[0], &spp[0][0].h, size(h), maptype MEMBER_OF_2 | FROM | TO 5>&spp[0], &spp[0][0].f size(32), maptype MEMBER_OF_2 | IMPLICIT, @.omp_mapper._ZTS1C.default For map(sppp[0][0][0]): after fix: will have 6 maps instead 8. https://github.com/llvm/llvm-project/pull/101903
2024-07-30[clang][OpenMP] Rename `varlists` to `varlist`, NFC (#101058)Krzysztof Parzyszek1-9/+9
It returns a range of variables (via Expr*), not a range of lists.
2024-07-25[OpenMPIRBuilder][Clang][NFC] - Combine `emitOffloadingArrays` and ↵Pranav Bhandarkar1-62/+71
`emitOffloadingArraysArgument` in OpenMPIRBuilder (#97088) This patch introduces a new interface in `OpenMPIRBuilder` that combines the creation of the so-called offloading pointer arrays and their subsequent preparation as arguments to the OpenMP runtime library. We then use this in Clang. This is intended to be used in the near future by other frontends such as Flang when lowering MLIR to LLVMIR.
2024-07-25[Clang] Remove some dead code in getNumTeamsExprForTargetDirective (#95695)Shivam Gupta1-5/+0
This was reported in https://pvs-studio.com/en/blog/posts/cpp/1126/, fragment N9. V523 The 'then' statement is equivalent to the subsequent code fragment. CGOpenMPRuntime.cpp:6040, 6036 --------- Co-authored-by: Shivam Gupta <shivma98.tkg@gmail.com>
2024-07-18[OpenMP] Fix calculation of dependencies for multi-dimensional iteration ↵Joachim1-4/+8
space (#99347) The expectation for multiple iterators used in a single depend clause (`depend(iterator(i=0:5,j=0:5), in:x[i][j])`) is that the iterator space is the product of the iteration vectors (25 in that case). The current codeGen only works correctly, if `numIterators() = 1`. For more iterators, the execution results in runtime assertions or segfaults. The modified codeGen first calculates the iteration space, then multiplies to the number of dependencies in the depend clause and finally adds to the total number of iterator dependencies.
2024-07-16[clang][CGRecordLayout] Remove dependency on isZeroSize (#96422)Michael Buch1-8/+15
This is a follow-up from the conversation starting at https://github.com/llvm/llvm-project/pull/93809#issuecomment-2173729801 The root problem that motivated the change are external AST sources that compute `ASTRecordLayout`s themselves instead of letting Clang compute them from the AST. One such example is LLDB using DWARF to get the definitive offsets and sizes of C++ structures. Such layouts should be considered correct (modulo buggy DWARF), but various assertions and lowering logic around the `CGRecordLayoutBuilder` relies on the AST having `[[no_unique_address]]` attached to them. This is a layout-altering attribute which is not encoded in DWARF. This causes us LLDB to trip over the various LLVM<->Clang layout consistency checks. There has been precedent for avoiding such layout-altering attributes from affecting lowering with externally-provided layouts (e.g., packed structs). This patch proposes to replace the `isZeroSize` checks in `CGRecordLayoutBuilder` (which roughly means "empty field with [[no_unique_address]]") with checks for `CodeGen::isEmptyField`/`CodeGen::isEmptyRecord`. **Details** The main strategy here was to change the `isZeroSize` check in `CGRecordLowering::accumulateFields` and `CGRecordLowering::accumulateBases` to use the `isEmptyXXX` APIs instead, preventing empty fields from being added to the `Members` and `Bases` structures. The rest of the changes fall out from here, to prevent lookups into these structures (for field numbers or base indices) from failing. Added `isEmptyRecordForLayout` and `isEmptyFieldForLayout` (open to better naming suggestions). The main difference to the existing `isEmptyRecord`/`isEmptyField` APIs, is that the `isEmptyXXXForLayout` counterparts don't have special treatment for `unnamed bitfields`/arrays and also treat fields of empty types as if they had `[[no_unique_address]]` (i.e., just like the `AsIfNoUniqueAddr` in `isEmptyField` does).
2024-07-05[OpenMP] Fix stack corruption due to argument mismatch (#96386)Sushant Gokhale1-10/+13
While lowering (#pragma omp target update from), clang's generated .omp_task_entry. is setting up 9 arguments while calling __tgt_target_data_update_nowait_mapper. At the same time, in __tgt_target_data_update_nowait_mapper, call to targetData<TaskAsyncInfoWrapperTy>() is converted to a sibcall assuming it has the argument count listed in the signature. AARCH64 asm sequence for this is as follows (removed unrelated insns): ` .omp_task_entry..108: sub sp, sp, #32 stp x29, x30, sp, #16 // 16-byte Folded Spill add x29, sp, #16 str x8, sp, #8. // stack canary str xzr, [sp] bl __tgt_target_data_update_nowait_mapper __tgt_target_data_update_nowait_mapper: sub sp, sp, #32 stp x29, x30, sp, #16 // 16-byte Folded Spill add x29, sp, #16 str x8, sp, #8 // stack canary // Sibcall argument setup adrp x8, :got:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb ldr x8, [x8, :got_lo12:_Z16targetDataUpdateP7ident_tR8DeviceTyiPPvS4_PlS5_S4_S4_R11AsyncInfoTyb] stp x9, x8, x29, #16 adrp x8, .L.str.8 add x8, x8, :lo12:.L.str.8 str x8, x29, #32. <==. This is the insn that erases $fp ldp x29, x30, sp, #16 // 16-byte Folded Reload add sp, sp, #32 // Sibcall b ZL10targetDataI22TaskAsyncInfoWrapperTyEvP7ident_tliPPvS4_PlS5_S4_S4_PFiS2_R8DeviceTyiS4_S4_S5_S5_S4_S4_R11AsyncInfoTybEPKcSD ` On AArch64, call to __tgt_target_data_update_nowait_mapper in .omp_task_entry. sets up only single space on stack and this results in ovewriting $fp and subsequent stack corruption. This issue can be credited to discrepancy of __tgt_target_data_update_nowait_mapper signature in openmp/libomptarget/include/omptarget.h taking 13 arguments while clang/lib/CodeGen/CGOpenMPRuntime.cpp and llvm/include/llvm/Frontend/OpenMP/OMPKinds.def taking only 9 arguments. This patch modifies __tgt_target_data_update_nowait_mapper signature to match .omp_task_entry usage(and other 2 files mentioned above). Co-authored-by: Kugan Vivekanandarajah <kvivekananda@nvidia.com>
2024-07-03[Clang][OpenMP] This is addition fix for #92210. (#94802)jyu2-git1-1/+17
Fix another runtime problem when explicit map both pointer and pointee in target data region. In #92210, problem is only addressed in target region, but missing for target data region. The change just passing AreBothBasePtrAndPteeMapped in generateInfoForComponentList when processing target data. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
2024-07-01[OpenMP][offload] Fix dynamic schedule tracking (#97065)Gheorghe-Teodor Bercea1-0/+14
This patch fixes the dynamic schedule tracking.