path: root/clang/lib/Sema/SemaCUDA.cpp
Age  Commit message  Author  Files  Lines
13 days  [clang] [CUDA] Support calling `consteval` function between different target. (#158688)  Tang Jiajun  1  -0/+3
In CUDA, calling `consteval` functions across execution spaces is allowed, so a function declared `consteval` needs to be treated as a `__host__ __device__` function.
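As a rough illustration of the rule this commit describes (a sketch, not code from the patch): the immediate function below carries no target attribute, yet a device function and a host function both call it. The fallback macros and the names `IMMEDIATE`, `square`, `device_side`, and `host_side` are assumptions introduced here so the snippet also builds as plain C++.

```cpp
// Fallbacks (assumption): let the sketch build outside a CUDA/HIP compile.
#if !defined(__CUDACC__) && !defined(__CUDA__) && !defined(__HIP__)
#define __host__
#define __device__
#endif

// Pre-C++20 fallback (assumption): degrade to constexpr so the sketch still compiles.
#if defined(__cpp_consteval)
#define IMMEDIATE consteval
#else
#define IMMEDIATE constexpr
#endif

// No __host__ or __device__ annotation on the immediate function.
IMMEDIATE int square(int x) { return x * x; }

// Hypothetical callers living in two different execution spaces.
__device__ int device_side() { return square(4); }
__host__ int host_side() { return square(5); }
```

Because the call is evaluated at compile time, neither caller needs an emitted copy of `square` for its own target.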
2025-08-25  [clang] NFC: change more places to use Type::getAsTagDecl and friends (#155313)  Matheus Izvekov  1  -12/+5
This changes a bunch of places which use getAs<TagType>, including derived types, just to obtain the tag definition. This is preparation for #155028, offloading all the changes that PR used to introduce which don't depend on any new helpers.
2025-08-09  [clang] Improve nested name specifier AST representation (#147835)  Matheus Izvekov  1  -2/+5
This is a major change on how we represent nested name qualifications in the AST.

* The nested name specifier itself and how it's stored is changed. The prefixes for types are handled within the type hierarchy, which makes canonicalization for them super cheap, no memory allocation required. Also translating a type into nested name specifier form becomes a no-op. An identifier is stored as a DependentNameType. The nested name specifier gains a lightweight handle class, to be used instead of passing around pointers, which is similar to what is implemented for TemplateName. There is still one free bit available, and this handle can be used within a PointerUnion and PointerIntPair, which should keep bit-packing aficionados happy.
* The ElaboratedType node is removed; all type nodes to which it could previously apply can now store the elaborated keyword and name qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing these, as opposed to the previous situation of there only existing one TagType per entity. This increases the amount of type sugar retained, and can have several applications, for example in tracking module ownership, and other tools which care about source file origins, such as IWYU. These TagTypes are lazily allocated, in order to limit the increase in AST size.

This patch offers a great performance benefit. It greatly improves compilation time for [stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for `test_on2.cpp` in that project, which is the slowest compiling test, this patch improves `-c` compilation time by about 7.2%, with the `-fsyntax-only` improvement being at ~12%. This has great results on compile-time-tracker as well:
![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831)

This patch also further enables other optimizations in the future, and will reduce the performance impact of template specialization resugaring when that lands.

It has some other miscellaneous drive-by fixes.

About the review: Yes the patch is huge, sorry about that. Part of the reason is that I started by the nested name specifier part, before the ElaboratedType part, but that had a huge performance downside, as ElaboratedType is a big performance hog. I didn't have the steam to go back and change the patch after the fact. There is also a lot of internal API changes, and it made sense to remove ElaboratedType in one go, versus removing it from one type at a time, as that would present much more churn to the users. Also, the nested name specifier having a different API avoids missing changes related to how prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in `clang/include/clang/AST` and `clang/lib/AST`, with also important changes in `clang/lib/Sema/TreeTransform.h`. The rest and bulk of the changes are mostly consequences of the changes in API.

PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just for easier rebasing. I plan to rename it back after this lands.

Fixes #136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
2025-06-09  [CUDA] Disallow use of address_space(N) on CUDA device variables. (#142857)  Artem Belevich  1  -6/+11
The variables have implicit host-side shadow instances, and an explicit address space attribute breaks them on the host.
2025-05-25  [Sema] Remove unused includes (NFC) (#141419)  Kazu Hirata  1  -1/+0
These are identified by misc-include-cleaner. I've filtered out those that break builds. Also, I'm staying away from llvm-config.h, config.h, and Compiler.h, which likely cause platform- or compiler-specific build failures.
2025-05-18  [clang] Use llvm::max_element (NFC) (#140435)  Kazu Hirata  1  -3/+4
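For readers unfamiliar with range helpers of this kind: the idea is to pass a whole range instead of an iterator pair. The snippet below is only a sketch that uses a local stand-in, `range_max_element` (a hypothetical name defined here, not the LLVM function), so it builds without LLVM headers; `highest_priority` is likewise an illustrative name.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Local stand-in (assumption) for a range-based wrapper over std::max_element.
template <typename Range>
auto range_max_element(Range &&r) {
  return std::max_element(std::begin(r), std::end(r));
}

// Returns the largest priority, or -1 for an empty list.
int highest_priority(const std::vector<int> &priorities) {
  auto it = range_max_element(priorities);
  return it == priorities.end() ? -1 : *it;
}
```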
2025-05-01  [clang] Add scoped enum support to `StreamingDiagnostic` (#138089)  Vlad Serebrennikov  1  -10/+7
This patch adds templated `operator<<` for diagnostics that pass scoped enums, saving people from `llvm::to_underlying()` clutter on the side of emitting the diagnostic. This eliminates 80 out of 220 usages of `llvm::to_underlying()` in Clang. I also backported `std::is_scoped_enum_v` from C++23.
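The mechanism can be sketched roughly as follows. This is not Clang's implementation: `ToyDiagnostic`, `is_scoped_enum_backport`, and `Target` are hypothetical names introduced here, and the toy builder merely records integer arguments. It shows a templated `operator<<` that participates in overload resolution only for scoped enums, so the caller no longer writes the cast.

```cpp
#include <type_traits>
#include <vector>

// Backport-style trait (sketch): an enum is scoped if it does not convert
// implicitly to its underlying type.
template <typename T, bool = std::is_enum_v<T>>
struct is_scoped_enum_backport : std::false_type {};
template <typename T>
struct is_scoped_enum_backport<T, true>
    : std::bool_constant<
          !std::is_convertible_v<T, std::underlying_type_t<T>>> {};

// Hypothetical stand-in for a diagnostic builder; it only records arguments.
struct ToyDiagnostic {
  std::vector<long long> args;
};

// Enabled only for scoped enums, so call sites need no explicit cast.
template <typename E,
          std::enable_if_t<is_scoped_enum_backport<E>::value, int> = 0>
ToyDiagnostic &operator<<(ToyDiagnostic &d, E value) {
  d.args.push_back(static_cast<std::underlying_type_t<E>>(value));
  return d;
}

enum class Target { Host = 0, Device = 1 };  // hypothetical enum

static_assert(is_scoped_enum_backport<Target>::value, "scoped enum detected");
static_assert(!is_scoped_enum_backport<int>::value, "int is not an enum");
```

With this in place, `diag << Target::Device` compiles directly, whereas an unscoped enum or an `int` would not select this overload.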
2025-04-23  [CUDA][HIP] capture possible ODR-used var (#136645)  Yaxun (Sam) Liu  1  -0/+47
In a lambda function, a call of a function may resolve to host and device functions with different signatures. In particular, a constexpr local variable may be passed by value to the device function and passed by reference to the host function, which causes the constexpr variable to be captured by the lambda function in host compilation but not in device compilation. The discrepancy in the lambda captures violates the ODR and causes UB for kernels using these lambdas. This PR fixes the issue by identifying discrepancies between ODR and non-ODR uses of constexpr local variables passed to host/device functions and conservatively capturing them. Fixes: https://github.com/llvm/llvm-project/issues/132068
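The underlying C++ rule can be seen without any CUDA at all. In this sketch (all names are hypothetical), passing a constexpr local by value only reads its constant value, which is not an odr-use and needs no capture, while binding it to a reference is an odr-use and requires one. If the host and device overloads differ in exactly this way, the two compilations end up with different closure types.

```cpp
int take_by_value(int x) { return x; }
int take_by_reference(const int &x) { return x; }

int capture_demo() {
  constexpr int limit = 3;
  // By value: only the constant is read, so this is not an odr-use and
  // the lambda compiles with an empty capture list.
  auto no_capture = [] { return take_by_value(limit); };
  // By reference: `limit` is odr-used, so the lambda must capture it.
  auto with_capture = [&] { return take_by_reference(limit); };
  return no_capture() + with_capture();
}
```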
2025-03-03  [CUDA][HIP] fix virtual dtor host/device attr (#128926)  Yaxun (Sam) Liu  1  -2/+6
Currently, when CUDA/HIP users use a template class with a virtual dtor and a std::string data member with C++20 and MSVC, and the template class is explicitly instantiated, there is an error about a host function called by a host device function (this used to be undefined symbols at the linking stage, before member destructors were checked by deferred diagnostics).

It was caused by clang inferring host/device attributes for default dtors. Since all dtors of member and parent classes have implicit host device attrs, clang infers that the virtual dtor has implicit host and device attrs. Since the virtual dtor of an explicitly instantiated template class must be emitted, this causes the constexpr dtor of std::string to be emitted, which calls a host function that was not emitted on the device side. This is a serious issue since it prevents users from using std::string with C++20 on Windows.

When inferring the host device attr of a virtual dtor of an explicit template class instantiation, clang should be conservative since the dtor is sure to be emitted. Since an implicit host device function may call a host function, clang cannot assume it is always available on device. This guarantees that dtors that may call host functions do not have an implicit device attr and therefore will not be emitted on the device side.

Fixes: https://github.com/llvm/llvm-project/issues/108548
Fixes: SWDEV-517435
2025-02-28  [CUDA][HIP] check dtor in deferred diag (#129117)  Yaxun (Sam) Liu  1  -0/+15
Currently the deferred diag fails to diagnose calls to host functions from host device functions in device compilation when the calls are triggered by destructors. This can be further divided into two issues:

1. the deferred diag visitor does not visit the dtors of member and parent classes when visiting a dtor, which it should
2. the deferred diag visitor does not visit the virtual dtor of an explicit template class instantiation, which it should

Due to these issues, some constexpr functions which call host functions are emitted on the device side, which causes undefined symbols at the linking stage, as revealed by https://github.com/llvm/llvm-project/issues/108548

By fixing these issues, clang will diagnose the problems early during compilation instead of at link time.
2025-01-28  [clang] Extend diagnose_if to accept more detailed warning information, take 2 (#119712)  Nikolas Klauser  1  -2/+2
This is take two of #70976. This iteration of the patch makes sure that custom diagnostics without any warning group don't get promoted by `-Werror` or `-Wfatal-errors`. This implements parts of the extension proposed in https://discourse.llvm.org/t/exposing-the-diagnostic-engine-to-c/73092/7. Specifically, this makes it possible to specify a diagnostic group in an optional third argument.
2024-11-16  [Sema] Remove unused includes (NFC) (#116461)  Kazu Hirata  1  -2/+0
Identified with misc-include-cleaner.
2024-11-15  [clang] [NFC] Merge two ifs to a single one (#116226)  Boaz Brickner  1  -3/+3
2024-09-26  Revert "[clang] Extend diagnose_if to accept more detailed warning information (#70976)"  Kadir Cetinkaya  1  -2/+2
This reverts commit e39205654dc11c50bd117e8ccac243a641ebd71f. Discussion has been ongoing in https://github.com/llvm/llvm-project/pull/70976 for the past two weeks. Since there have been no responses for a couple of weeks now, reverting until the author is back.
2024-09-14  Reapply "Reapply "[clang] Extend diagnose_if to accept more detailed warning information (#70976)" (#108453)"  Nikolas Klauser  1  -2/+2
This reverts commit e1bd9740faa62c11cc785a7b70ec1ad17e286bd1. Fixes incorrect use of the `DiagnosticsEngine` in the clangd tests.
2024-09-13  Revert "Reapply "[clang] Extend diagnose_if to accept more detailed warning information (#70976)" (#108453)"  Florian Mayer  1  -2/+2
This reverts commit e7f782e7481cea23ef452a75607d3d61f5bd0d22. This had UBSan failures:
[----------] 1 test from ConfigCompileTests
[ RUN ] ConfigCompileTests.DiagnosticSuppression
Config fragment: compiling <unknown>:0 -> 0x00007B8366E2F7D8 (trusted=false)
/usr/local/google/home/fmayer/large/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:203:33: runtime error: reference binding to null pointer of type 'clang::DiagnosticIDs'
UndefinedBehaviorSanitizer: undefined-behavior /usr/local/google/home/fmayer/large/llvm-project/llvm/include/llvm/ADT/IntrusiveRefCntPtr.h:203:33
Pull Request: https://github.com/llvm/llvm-project/pull/108645
2024-09-13  Reapply "[clang] Extend diagnose_if to accept more detailed warning information (#70976)" (#108453)  Nikolas Klauser  1  -2/+2
This reverts commit e0cd11eba526234ca14a0b91f5598ca3363b6aca. Update the use of `getWarningOptionForDiag` in flang to use the DiagnosticIDs.
2024-09-12  Revert "[clang] Extend diagnose_if to accept more detailed warning information (#70976)"  Kazu Hirata  1  -2/+2
This reverts commit 030c6da7af826b641db005be925b20f956c3a6bb. Several build bots are failing:
https://lab.llvm.org/buildbot/#/builders/89/builds/6211
https://lab.llvm.org/buildbot/#/builders/157/builds/7578
https://lab.llvm.org/buildbot/#/builders/140/builds/6429
2024-09-12  [clang] Extend diagnose_if to accept more detailed warning information (#70976)  Nikolas Klauser  1  -2/+2
This implements parts of the extension proposed in https://discourse.llvm.org/t/exposing-the-diagnostic-engine-to-c/73092/7. Specifically, this makes it possible to specify a diagnostic group in an optional third argument.
2024-08-12  [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (#94549)  Johannes Doerfert  1  -0/+3
Through the new `-foffload-via-llvm` flag, CUDA kernels can now be lowered to the LLVM/Offload API. On the Clang side, this is simply done by using the OpenMP offload toolchain and emitting calls to `llvm*` functions to orchestrate the kernel launch rather than `cuda*` functions. These `llvm*` functions are implemented on top of the existing LLVM/Offload API.

As we are about to redefine the Offload API, this will help us in the design process as a second offload language.

We do not support any CUDA APIs yet, however, we could: https://www.osti.gov/servlets/purl/1892137

For proper host execution we need to resurrect/rebase https://tianshilei.me/wp-content/uploads/2021/12/llpp-2021.pdf (which was designed for debugging).

```
❯❯❯ cat test.cu
extern "C" {
void *llvm_omp_target_alloc_shared(size_t Size, int DeviceNum);
void llvm_omp_target_free_shared(void *DevicePtr, int DeviceNum);
}

__global__ void square(int *A) { *A = 42; }

int main(int argc, char **argv) {
  int DevNo = 0;
  int *Ptr = reinterpret_cast<int *>(llvm_omp_target_alloc_shared(4, DevNo));
  *Ptr = 7;
  printf("Ptr %p, *Ptr %i\n", Ptr, *Ptr);
  square<<<1, 1>>>(Ptr);
  printf("Ptr %p, *Ptr %i\n", Ptr, *Ptr);
  llvm_omp_target_free_shared(Ptr, DevNo);
}

❯❯❯ clang++ test.cu -O3 -o test123 -foffload-via-llvm --offload-arch=native

❯❯❯ llvm-objdump --offloading test123

test123: file format elf64-x86-64

OFFLOADING IMAGE [0]:
kind     elf
arch     gfx90a
triple   amdgcn-amd-amdhsa
producer openmp

❯❯❯ LIBOMPTARGET_INFO=16 ./test123
Ptr 0x155448ac8000, *Ptr 7
Ptr 0x155448ac8000, *Ptr 42
```
2024-06-10  [CUDA][HIP] warn incompatible redeclare (#77359)  Yaxun (Sam) Liu  1  -16/+25
nvcc warns about the following code: `void f(); __device__ void f() {}` but clang does not, since clang allows a device function to overload a host function. Users want clang to emit a similar warning to help keep code compatible with nvcc. Since this may cause regressions with existing code, the warning is off by default and can be enabled by -Wnvcc-compat. It won't cause warnings in system headers, even with -Wnvcc-compat.
2024-04-13  [clang] Introduce `SemaCUDA` (#88559)  Vlad Serebrennikov  1  -134/+143
This patch moves CUDA-related `Sema` functions into a new `SemaCUDA` class, following the recent example of SYCL, OpenACC, and HLSL. This is a part of the effort to split Sema. Additional context can be found in https://github.com/llvm/llvm-project/pull/82217, https://github.com/llvm/llvm-project/pull/84184, https://github.com/llvm/llvm-project/pull/87634.
2024-04-12  [clang][NFC] Refactor `CUDAFunctionTarget`  Vlad Serebrennikov  1  -65/+81
Refactor `CUDAFunctionTarget` into a scoped enum at namespace scope, so that it can be forward declared. This is done in preparation for `SemaCUDA`.
2024-04-12  [clang][NFC] Refactor `CXXSpecialMember`  Vlad Serebrennikov  1  -1/+1
In preparation for `SemaCUDA`, which requires this enum to be forward-declarable.
2024-03-08  [HIP] fix host-used external kernel (#83870)  Yaxun (Sam) Liu  1  -1/+4
In -fgpu-rdc mode, when an external kernel is used by a host function with weak_odr linkage (e.g. an explicitly instantiated template function), the kernel should not be marked as a host-used external kernel, since the host function may be dropped by the linker. Marking the external kernel as a host-used external kernel will force a reference to the external kernel, which the user may not define in another TU. Fixes: https://github.com/llvm/llvm-project/issues/83771
2023-12-01  [CUDA][HIP] allow trivial ctor/dtor in device var init (#73140)  Yaxun (Sam) Liu  1  -0/+9
Treat ctor/dtor in device var init as host device function so that they can be used to initialize file-scope device variables to match nvcc behavior. If they are non-trivial they will be diagnosed. We cannot add implicit host device attrs to non-trivial ctor/dtor since determining whether they are non-trivial needs to know whether they have a trivial body and all their member and base classes' ctor/dtor have trivial body, which is affected by where their bodies are defined or instantiated. Fixes: #72261 Fixes: SWDEV-432412
2023-11-22  Revert "[CUDA][HIP] make trivial ctor/dtor host device (#72394)"  Yaxun (Sam) Liu  1  -16/+0
This reverts commit 27e6e4a4d0e3296cebad8db577ec0469a286795e. This patch is reverted due to regression. A testcase is: `template <class T> struct ptr { ~ptr() { static int x = 1;} }; template <class T> struct Abc : ptr<T> { public: Abc(); ~Abc() {} }; template class Abc<int>; `
2023-11-16  [CUDA][HIP] make trivial ctor/dtor host device (#72394)  Yaxun (Sam) Liu  1  -0/+16
Make trivial ctor/dtor implicitly host device functions so that they can be used to initialize file-scope device variables to match nvcc behavior. Fixes: https://github.com/llvm/llvm-project/issues/72261 Fixes: SWDEV-432412
2023-11-09  [CUDA][HIP] Make template implicitly host device (#70369)  Yaxun (Sam) Liu  1  -1/+41
Added option -foffload-implicit-host-device-templates which is off by default. When the option is on, template functions and specializations without host/device attributes have implicit host device attributes. They can be overridden by device template functions with the same signature. They are emitted on the device side only if they are used on the device side. This feature is added as an extension. `__has_extension(cuda_implicit_host_device_templates)` can be used to check whether it is enabled. This is to facilitate using standard C++ headers for device. Fixes: https://github.com/llvm/llvm-project/issues/69956 Fixes: SWDEV-428314
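A minimal sketch of how user code might probe for the extension named above. The fallback definition of `__has_extension`, the constant `kImplicitHostDeviceTemplates`, and the template `twice` are assumptions added for illustration so the snippet also builds with compilers that lack the builtin.

```cpp
// Fallback (assumption): some compilers do not provide __has_extension.
#ifndef __has_extension
#define __has_extension(x) 0
#endif

#if __has_extension(cuda_implicit_host_device_templates)
constexpr bool kImplicitHostDeviceTemplates = true;
#else
constexpr bool kImplicitHostDeviceTemplates = false;
#endif

// An unannotated template: the kind of function the option is described as
// treating as implicitly host device when it is enabled.
template <typename T>
T twice(T value) { return value + value; }
```

Code that must work both with and without the option can branch on the check and add explicit `__host__ __device__` annotations only where the extension is unavailable.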
2023-10-17  [CUDA][HIP] Fix init var diag in temmplate (#69081)  Yaxun (Sam) Liu  1  -0/+7
Currently clang diagnoses the following code: (https://godbolt.org/z/s8zK3E5P5) but nvcc does not. ` struct A { constexpr A(){} }; struct B { A a; int b; }; template<typename T> __global__ void kernel( ) { __shared__ B x; } ` Clang generates an implicit trivial ctor for struct B, which should be allowed for initializing a shared variable. However, the body of the ctor is defined only if the template kernel is instantiated. Clang checks the initialization of variable in non-instantiated templates, where it cannot find the body of the ctor, therefore diagnoses it. This patch skips the check for non-instantiated templates.
2023-10-05  [Clang] Handle consteval expression in array bounds expressions (#66222)  cor3ntin  1  -1/+1
The bound of a C++ array is a _constant-expression_, but we also support VLAs, i.e. arrays with non-constant bounds. We need to take care to handle the case of a consteval function (which is specified to be immediately invoked only in non-constant contexts) that appears in array bounds. This introduces `Sema::isAlwaysConstantEvaluatedContext`, and a flag in ExpressionEvaluationContextRecord, such that immediate functions in array bounds are always immediately invoked. Sema had both `isConstantEvaluatedContext` and `isConstantEvaluated`, so I took the opportunity to clean that up. The change in `TimeProfilerTest.cpp` is an unfortunate manifestation of the problem that #66203 seeks to address. Fixes #65520
2023-10-03  [HIP][Clang][Sema] Add Sema support for `hipstdpar`  Alex Voicu  1  -1/+10
This patch adds the Sema changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. This change impacts the CUDA / HIP language specific checks, and only manifests if compiling in `hipstdpar` mode. In this case, we essentially do three things:

1. Allow device side callers to call host side callees - since the user visible HLL would be standard C++, with no annotations / restriction mechanisms, we cannot unambiguously establish that such a call is an error, so we conservatively allow all such calls, deferring actual cleanup to a subsequent pass over IR;
2. Allow host formed lambdas to capture by reference;
3. Allow device functions to use host global variables.

Reviewed by: yaxunl
Differential Revision: https://reviews.llvm.org/D155833
2023-09-07  Reland "[CUDA][HIP] Fix overloading resolution in global var init" (#65606)  Yaxun (Sam) Liu  1  -3/+21
https://reviews.llvm.org/D158247 caused regressions for HIP on Windows and was reverted. A reduced test case is:
```
typedef void (__stdcall* funcTy)();
void invoke(funcTy f);
static void __stdcall callee() noexcept { }
void foo() { invoke(callee); }
```
It is due to clang missing handling of host/device attributes for calling conventions in a few places. This patch fixes that.
2023-08-31  Revert "[CUDA][HIP] Fix overloading resolution in global variable initializer"  Yaxun (Sam) Liu  1  -21/+3
This reverts commit de0df639724b10001ea9a74539381ea494296be9. It was reverted due to regression in HIP unit test on Windows: In file included from C:\hip-tests\catch\unit\graph\hipGraphClone.cc:37: In file included from C:\hip-tests\catch\.\include\hip_test_common.hh:24: In file included from C:\hip-tests\catch\.\include/hip_test_context.hh:24: In file included from C:/install/native/Release/x64/hip/include\hip/hip_runtime.h:54: C:/dk/win\vc\14.31.31107\include\thread:76:70: error: cannot initialize a parameter of type '_beginthreadex_proc_type' (aka 'unsigned int (*)(void *) __attribute__((stdcall))') with an lvalue of type 'const unsigned int (*)(void *) noexcept __attribute__((stdcall))': different exception specifications 76 | reinterpret_cast<void*>(_CSTD _beginthreadex(nullptr, 0, _Invoker_proc, _Decay_copied.get(), 0, &_Thr._Id)); | ^~~~~~~~~~~~~ C:\hip-tests\catch\unit\graph\hipGraphClone.cc:290:21) &>' requested here 90 | _Start(_STD forward<_Fn>(_Fx), _STD forward<_Args>(_Ax)...); | ^ C:\hip-tests\catch\unit\graph\hipGraphClone.cc:290:21) &, 0>' requested here 311 | std::thread t(lambdaFunc); | ^ C:/dk/win\ms_wdk\e22621\Include\10.0.22621.0\ucrt\process.h:99:40: note: passing argument to parameter '_StartAddress' here 99 | _In_ _beginthreadex_proc_type _StartAddress, | ^ 1 error generated when compiling for gfx1030.
2023-08-29  [CUDA][HIP] Fix overloading resolution in global variable initializer  Yaxun (Sam) Liu  1  -3/+21
Currently, clang does not resolve certain overloaded functions correctly in the initializer of global variables, e.g.

    template<typename T1, typename U> T1 mypow(T1, U);
    __attribute__((device)) double mypow(double, int);
    double t_extent = mypow(1.0, 2);

In the above example, mypow is supposed to resolve to the host version but clang resolves it to the device version instead, and emits an error (https://godbolt.org/z/17xxzaa67). However, if the variable is assigned in a host function, there is no error.

The discrepancy in overloading resolution inside and outside of a function is due to clang not accounting for the host/device target when resolving functions called in the initializer of a global variable.

This patch introduces a global host/device target context for CUDA/HIP for functions called outside of functions. For global variable initialization, it is determined by the host/device attribute of the variable. For other situations, a default value of host_device is sufficient.

Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D158247
Fixes: SWDEV-416731
2023-01-14  [clang] Remove remaining uses of llvm::Optional (NFC)  Kazu Hirata  1  -1/+0
This patch removes several "using" declarations and #include "llvm/ADT/Optional.h". This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14  [clang] Use std::optional instead of llvm::Optional (NFC)  Kazu Hirata  1  -1/+1
This patch replaces (llvm::|)Optional< with std::optional<. I'll post a separate patch to remove #include "llvm/ADT/Optional.h". This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14  [clang] Add #include <optional> (NFC)  Kazu Hirata  1  -0/+1
This patch adds #include <optional> to those files containing llvm::Optional<...> or Optional<...>. I'll post a separate patch to actually replace llvm::Optional with std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-17  llvm::Optional::value => operator*/operator->  Fangrui Song  1  -6/+6
std::optional::value() has undesired exception checking semantics and is unavailable in older Xcode (see _LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS). The call sites block std::optional migration. This makes `ninja clang` work in the absence of llvm::Optional::value.
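The rewrite pattern can be illustrated with `std::optional` directly. This is a sketch with hypothetical function names: after an explicit emptiness check, `operator*` and `operator->` give access to the contained value without the checked `value()` call.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Length of the name, or 0 when absent.
std::size_t name_length(const std::optional<std::string> &name) {
  if (!name)
    return 0;
  return name->size();  // operator-> instead of name.value().size()
}

// The stored value, or a fallback when absent.
int value_or_default(const std::optional<int> &v, int fallback) {
  return v ? *v : fallback;  // operator* instead of v.value()
}
```

Note that `operator*` and `operator->` have undefined behavior on an empty optional, so the emptiness check before each use is what keeps the unchecked access safe.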
2022-11-21  Don't use Optional::getPointer (NFC)  Kazu Hirata  1  -4/+2
Since std::optional does not offer getPointer(), this patch replaces X.getPointer() with &*X to make the migration from llvm::Optional to std::optional easier. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Differential Revision: https://reviews.llvm.org/D138466
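As a small sketch of the `&*X` idiom with `std::optional` (the helper name `pointer_or_null` is hypothetical, and the emptiness check is an addition made here so the example is safe to call on an empty optional):

```cpp
#include <optional>

// Pointer to the contained value, or nullptr when the optional is empty.
const int *pointer_or_null(const std::optional<int> &x) {
  return x ? &*x : nullptr;  // &*x stands in for the old x.getPointer()
}
```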
2022-07-22  [CUDA] remove duplicate condition  VitalyR  1  -1/+1
Reviewed by: Yaxun Liu Differential Revision: https://reviews.llvm.org/D130168 Change-Id: Ia00c3dfa9ea20e61235817fd4bb61d33c7c98a60
2022-07-13  [clang] Use value instead of getValue (NFC)  Kazu Hirata  1  -7/+6
2022-06-25  [clang] Don't use Optional::hasValue (NFC)  Kazu Hirata  1  -1/+1
This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.
2022-06-25  Revert "Don't use Optional::hasValue (NFC)"  Kazu Hirata  1  -7/+10
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25  Don't use Optional::hasValue (NFC)  Kazu Hirata  1  -10/+7
2022-06-20  [clang] Don't use Optional::hasValue (NFC)  Kazu Hirata  1  -2/+2
2022-04-13  [CUDA][HIP] Fix host used external kernel in archive  Yaxun (Sam) Liu  1  -1/+6
For -fgpu-rdc, a host function may call an external kernel which is defined in an archive of bitcode. Since this external kernel is only referenced in the host function, the device bitcode does not contain a reference to it, so the linker will not try to resolve this external kernel in the archive. To fix this issue, host-used external kernels and device variables are tracked. A global array containing pointers to these external kernels and variables is emitted, which serves as an artificial reference to the external kernels and variables used by the host. Reviewed by: Artem Belevich, Differential Revision: https://reviews.llvm.org/D123441
2022-03-24  [CUDA][HIP] Fix hostness check with -fopenmp  Yaxun (Sam) Liu  1  -10/+10
CUDA/HIP determines whether a function can be called based on the device/host attributes of callee and caller. Clang assumes the caller is CurContext. This is correct in most cases, however, it is not correct in OpenMP parallel region when CUDA/HIP program is compiled with -fopenmp. This causes incorrect overloading resolution and missed diagnostics. To get the correct caller, clang needs to chase the parent chain of DeclContext starting from CurContext until a function decl or a lambda decl is reached. Sema API is adapted to achieve that and used to determine the caller in hostness check. Reviewed by: Artem Belevich, Richard Smith Differential Revision: https://reviews.llvm.org/D121765
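The parent-chain walk described above can be modeled in a few lines. This is a toy sketch, not Clang's `DeclContext` API: `ContextKind`, `ToyContext`, and `enclosing_callable` are hypothetical names, and the "captured region" kind merely stands in for whatever intermediate context an OpenMP parallel region introduces.

```cpp
// Toy model (assumption): just enough structure to show the outward walk.
enum class ContextKind { TranslationUnit, Function, Lambda, CapturedRegion };

struct ToyContext {
  ContextKind kind;
  const ToyContext *parent;  // nullptr for the outermost context
};

// Walk outward from `current` until a function or lambda is reached.
// Returns nullptr if there is no enclosing callable.
const ToyContext *enclosing_callable(const ToyContext *current) {
  while (current && current->kind != ContextKind::Function &&
         current->kind != ContextKind::Lambda)
    current = current->parent;
  return current;
}
```

Using the innermost context directly would treat the captured region itself as the caller; walking up to the enclosing function or lambda recovers the declaration whose host/device attributes actually matter.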
2022-03-18  Use llvm::append_range instead of push_back loops where applicable. NFCI.  Benjamin Kramer  1  -3/+1
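The kind of rewrite this commit title refers to can be sketched as follows. To stay free of LLVM headers, the snippet defines its own stand-in, `append_all` (a hypothetical helper, not the LLVM function), which inserts a whole range at the end of a container in one call; `concat` is likewise an illustrative name.

```cpp
#include <iterator>
#include <vector>

// Local stand-in (assumption) for a range-append helper.
template <typename Container, typename Range>
void append_all(Container &out, Range &&range) {
  out.insert(out.end(), std::begin(range), std::end(range));
}

std::vector<int> concat(const std::vector<int> &a, const std::vector<int> &b) {
  std::vector<int> result;
  // Before: for (int v : a) result.push_back(v);
  append_all(result, a);
  append_all(result, b);
  return result;
}
```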
2022-02-15  [CUDA][HIP] Do not promote constexpr var with non-constant initializer  Yaxun (Sam) Liu  1  -6/+8
constexpr var may be initialized with address of non-const variable. In this case the initializer is not constant in device compilation. This has been handled for const vars but not for constexpr vars. This patch makes handling of const var and constexpr var consistent. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D119615 Fixes: https://github.com/llvm/llvm-project/issues/53780
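The two kinds of initializer can be contrasted in a short sketch. All names are hypothetical, and the fallback macro is an assumption so the snippet also builds as plain C++; it illustrates the distinction the commit message draws, not the patch itself.

```cpp
// Fallback (assumption): let the sketch build outside a CUDA/HIP compile.
#if !defined(__CUDACC__) && !defined(__CUDA__) && !defined(__HIP__)
#define __device__
#endif

int host_counter = 0;

// Constant initializer: the value itself is known at compile time.
constexpr int kAnswer = 42;

// The initializer is the address of a non-const host variable; per the
// commit message, this is not a constant in device compilation.
constexpr int *kCounterAddr = &host_counter;

// Reads only the compile-time value of kAnswer.
__device__ int read_answer() { return kAnswer; }
```

On the host both declarations are valid `constexpr` variables; the difference only matters when the compiler decides whether the variable can be made available to device code.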