aboutsummaryrefslogtreecommitdiff
path: root/clang/lib/CodeGen/CodeGenModule.h
AgeCommit message (Collapse)AuthorFilesLines
2023-11-29[clang][CodeGen] Emit annotations for function declarations. (#66716)Brendan Dahl1-0/+4
Previously, annotations were only emitted for function definitions. With this change annotations are also emitted for declarations. Also, emitting function annotations is now deferred until the end so that the most up to date declaration is used which will have any inherited annotations.
2023-11-10 [OpenMP] Rework handling of global ctor/dtors in OpenMP (#71739)Joseph Huber1-7/+7
Summary: This patch reworks how we handle global constructors in OpenMP. Previously, we emitted individual kernels that were all registered and called individually. In order to provide more generic support, this patch moves all handling of this to the target backend and the runtime plugin. This has the benefit of supporting the GNU extensions for constructors an destructors, removing a class of failures related to shared library destruction order, and allows targets other than OpenMP to use the same support without needing to change the frontend. This is primarily done by calling kernels that the backend emits to iterate a list of ctor / dtor functions. For x64, this is automatic and we get it for free with the standard `dlopen` handling. For AMDGPU, we emit `amdgcn.device.init` and `amdgcn.device.fini` functions which handle everything atuomatically and simply need to be called. For NVPTX, a patch https://github.com/llvm/llvm-project/pull/71549 provides the kernels to call, but the runtime needs to set up the array manually by pulling out all the known constructor / destructor functions. One concession that this patch requires is the change that for GPU targets in OpenMP offloading we will use `llvm.global_dtors` instead of using `atexit`. This is because `atexit` is a separate runtime function that does not mesh well with the handling we're trying to do here. This should be equivalent in all cases except for cases where we would need to destruct manually such as: ``` struct S { ~S() { foo(); } }; void foo() { static S s; } ``` However this is broken in many other ways on the GPU, so it is not regressing any support, simply increasing the scope of what we can handle. This changes the handling of ctors / dtors. This patch now outputs a information message regarding the deprecation if the old format is used. This will be completely removed in a later release. Depends on: https://github.com/llvm/llvm-project/pull/71549
2023-10-26[OpenMP] Pass min/max thread and team count to the OMPIRBuilder (#70247)Johannes Doerfert1-3/+11
We now provide the information about the min/max thread and team count from to the OMPIRBuilder, no matter what the source was. That means we unify `thread_limit`, `num_teams`, `num_threads` handling with the target specific attriutes (`__launch_bounds__` and `amdgpu_flat_work_group_size`). This is in preparation to pass the values to the runtime, and to allow the middle-end (OpenMP-opt) to tighten the values if it seems appropriate. There is no "real" change after this commit.
2023-09-13Revert "[clang][CodeGen] Emit annotations for function declarations."Benjamin Kramer1-4/+0
This reverts commit c6a33ff49dfb3498dae15c718820ea3d9c19f3cb. Makes clang segfault. // clang t.cc class a; class c { public: [[clang::annotate("")]] c(const c *) {} }; class d { d(const c *, a *, a *); c e; }; d::d(const c *f, a *, a *) : e(f) {}
2023-09-12[clang][CodeGen] Emit annotations for function declarations.Brendan Dahl1-0/+4
Previously, annotations were only emitted for function definitions. With this change annotations are also emitted for declarations. Also, emitting function annotations is now deferred until the end so that the most up to date declaration is used which will have any inherited annotations. Differential Revision: https://reviews.llvm.org/D156172/new/
2023-09-08Take math-errno into account with '#pragma float_control(precise,on)' andZahira Ammarguellat1-0/+6
'attribute__((optnone)). Differential Revision: https://reviews.llvm.org/D151834
2023-09-05[NFC] Remove unneeded header includesBill Wendling1-0/+1
Use forward decls instead of #including the header files. Differential Revision: https://reviews.llvm.org/D159421
2023-08-31[NFC][Clang] Remove redundant function definitionsJuan Manuel MARTINEZ CAAMAÑO1-20/+0
There were 3 definitions of the mergeDefaultFunctionDefinitionAttributes function: A private implementation, a version exposed in CodeGen, a version exposed in CodeGenModule. This patch removes the private and the CodeGenModule versions and keeps a single definition in CodeGen. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D159256
2023-08-30[NFC][Clang] Remove unused function ↵Juan Manuel MARTINEZ CAAMAÑO1-1/+0
`CodeGenModule::addDefaultFunctionDefinitionAttributes` This patch deletes the unused `addDefaultFunctionDefinitionAttributes(llvm::Function);` function, while it still keeps `void addDefaultFunctionDefinitionAttributes(llvm::AttrBuilder &attrs);` which is being used. Differential Revision: https://reviews.llvm.org/D158990
2023-08-29[OpenMP][DeviceRTL][AMDGPU] Support code object version 5Saiyedul Islam1-5/+5
Update DeviceRTL and the AMDGPU plugin to support code object version 5. Default is code object version 4. CodeGen for __builtin_amdgpu_workgroup_size generates code for cov4 as well as cov5 if -mcode-object-version=none is specified. DeviceRTL compilation passes this argument via Xclang option to generate abi-agnostic code. Generated code for the above builtin uses a clang control constant "llvm.amdgcn.abi.version" to branch on the abi version, which is available during linking of user's OpenMP code. Load of this constant gets eliminated during linking. AMDGPU plugin queries the ELF for code object version and then prepares various implicitargs accordingly. Differential Revision: https://reviews.llvm.org/D139730 Reviewed By: jhuber6, yaxunl
2023-08-27[CodeGen] Modernize InstrProfStats (NFC)Kazu Hirata1-8/+6
2023-08-17[CodeGen] Restrict addEmittedDeferredDecl to incremental extensionsJonas Hahnfeld1-0/+4
Reemission is only needed in incremental mode. With this early return, we avoid overhead from addEmittedDeferredDecl in non-incremental mode. Differential Revision: https://reviews.llvm.org/D157379
2023-08-17[CodeGen] Clean up access to EmittedDeferredDecls, NFCI.Jonas Hahnfeld1-4/+9
GlobalDecls should only be added to EmittedDeferredDecls if they need reemission. This is checked in addEmittedDeferredDecl, which is called via addDeferredDeclToEmit. Extend these checks to also handle VarDecls (for lambdas, as tested in Interpreter/lambda.cpp) and remove the direct access of EmittedDeferredDecls in EmitGlobal that may actually end up duplicating FunctionDecls. Differential Revision: https://reviews.llvm.org/D156897
2023-08-17[CodeGen] Remove Constant arguments from linkage functions, NFCI.Jonas Hahnfeld1-3/+2
This was unused since commit dd2362a8ba last year. Differential Revision: https://reviews.llvm.org/D156891
2023-08-14Make globals with mutable members non-constant, even in custom sectionsDavid Blaikie1-2/+0
Turned out we were making overly simple assumptions about which sections (& section flags) would be used when emitting a global into a custom section. This lead to sections with read-only flags being used for globals of struct types with mutable members. Fixed by porting the codegen function with the more nuanced handling/checking for mutable members out of codegen for use in the sema code that does this initial checking/mapping to section flags. Differential Revision: https://reviews.llvm.org/D156726
2023-07-25[clang][CodeGenModule] remove declaration of GetAddrOfConstantStringNick Desaulniers1-5/+0
It looks like the definition was removed in cd21d541397e but the declaration was not. Surprisingly (to me), that doesn't seem to produce any kind of diagnostic. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D156182
2023-07-25[Support] Change SetVector's default template parameter to SmallVector<*, 0>Fangrui Song1-1/+1
Similar to D156016 for MapVector. This brings back commit fae7b98c221b5b28797f7b56b656b6b819d99f27 with a fix to llvm/unittests/Support/ThreadPool.cpp's `_WIN32` code path.
2023-07-25Reapply "[OpenMP] Add the `ompx_attribute` clause for target directives"Johannes Doerfert1-0/+15
This reverts commit 0d12683046ca75fb08e285f4622f2af5c82609dc and reapplies ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2 with an extension to fix the Flang build. Differential Revision: https://reviews.llvm.org/D156184
2023-07-25Revert "[OpenMP] Add the `ompx_attribute` clause for target directives"Aaron Ballman1-15/+0
This reverts commit ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2. The changes broke several bots: https://lab.llvm.org/buildbot/#/builders/176/builds/3408 https://lab.llvm.org/buildbot/#/builders/198/builds/4028 https://lab.llvm.org/buildbot/#/builders/197/builds/8491 https://lab.llvm.org/buildbot/#/builders/197/builds/8491
2023-07-25Revert rGfae7b98c221b5b28797f7b56b656b6b819d99f27 "[Support] Change ↵Simon Pilgrim1-1/+1
SetVector's default template parameter to SmallVector<*, 0>" This is failing on Windows MSVC builds: llvm\unittests\Support\ThreadPool.cpp(380): error C2440: 'return': cannot convert from 'Vector' to 'std::vector<llvm::BitVector,std::allocator<llvm::BitVector>>' with [ Vector=llvm::SmallVector<llvm::BitVector,0> ]
2023-07-25[Support] Change SetVector's default template parameter to SmallVector<*, 0>Fangrui Song1-1/+1
Similar to D156016 for MapVector.
2023-07-24[OpenMP] Add the `ompx_attribute` clause for target directivesJohannes Doerfert1-0/+15
CUDA and HIP have kernel attributes to tune the code generation (in the backend). To reuse this functionality for OpenMP target regions we introduce the `ompx_attribute` clause that takes these kernel attributes and emits code as if they had been attached to the kernel fuction (which is implicitly generated). To limit the impact, we only support three kernel attributes: `amdgpu_waves_per_eu`, for AMDGPU `amdgpu_flat_work_group_size`, for AMDGPU `launch_bounds`, for NVPTX The existing implementations of those attributes are used for error checking and code generation. `ompx_attribute` can be attached to any executable target region and it can hold more than one kernel attribute. Differential Revision: https://reviews.llvm.org/D156184
2023-07-22[CodeGen] Stabilize C2/D2 to C1/D1 replacement orderFangrui Song1-2/+2
The conversion iterates over CodeGenModule::Replacements (a StringMap) and replaces C2/D2 and moves C1/D1 ( commit 0196a1d98f8a206259a4b5ce93c21807243af92f in 2013, to make the output look nicer). The iteration order is not guaranteed to be deterministic, and may cause destructors.cpp to exhibit different function orders. Use a MapVector instead. While here, fix an IWYU issue by adding an explicit include, though MapVector is already used in CodeGenModule.h.
2023-07-21Optimize emission of `dynamic_cast` to final classes.Richard Smith1-0/+7
- When the destination is a final class type that does not derive from the source type, the cast always fails and is now emitted as a null pointer or call to __cxa_bad_cast. - When the destination is a final class type that does derive from the source type, emit a direct comparison against the corresponding base class vptr value(s). There may be more than one such value in the case of multiple inheritance; check them all. For now, this is supported only for the Itanium ABI. I expect the same thing is possible for the MS ABI too, but I don't know what guarantees are made about vfptr uniqueness. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D154658
2023-06-14[CodeGen] Remove unused function GetOrCreateRTTIProxyGlobalVariableKazu Hirata1-5/+0
The last use was removed by: commit 46f366494f3ca8cc98daa6fb4f29c7c446c176b6 Author: Fangrui Song <i@maskray.me> Date: Sat May 20 08:24:20 2023 -0700 This patch also removes RTTIProxyMap, which becomes unused once I remove GetOrCreateRTTIProxyGlobalVariable. Differential Revision: https://reviews.llvm.org/D152782
2023-06-07[clang][CodeGen] Fix GPU-specific attributes being dropped by bitcode linkingpvanhout1-1/+2
Device libs make use of patterns like this: ``` __attribute__((target("gfx11-insts"))) static unsigned do_intrin_stuff(void) { return __builtin_amdgcn_s_sendmsg_rtnl(0x0); } ``` For functions that are assumed to be eliminated if the currennt GPU target doesn't support them. At O0 such functions aren't eliminated by common optimizations but often by AMDGPURemoveIncompatibleFunctions instead, which sees the "+gfx11-insts" attribute on, say, GFX9 and knows it's not valid, so it removes the function. D142907 accidentally made it so such attributes were dropped during bitcode linking, making it impossible for RemoveIncompatibleFunctions to catch the functions and causing ISel to catch fire eventually. This fixes the issue and adds a new test to ensure we don't accidentally fall into this trap again. Fixes SWDEV-403642 Reviewed By: arsenm, yaxunl Differential Revision: https://reviews.llvm.org/D152251
2023-04-29LangRef: Add "dynamic" option to "denormal-fp-math"Matt Arsenault1-0/+8
This is stricter than the default "ieee", and should probably be the default. This patch leaves the default alone. I can change this in a future patch. There are non-reversible transforms I would like to perform which are legal under IEEE denormal handling, but illegal with flushing zero behavior. Namely, conversions between llvm.is.fpclass and fcmp with zeroes. Under "ieee" handling, it is legal to translate between llvm.is.fpclass(x, fcZero) and fcmp x, 0. Under "preserve-sign" handling, it is legal to translate between llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0. I would like to compile and distribute some math library functions in a mode where it's callable from code with and without denormals enabled, which requires not changing the compares with denormals or zeroes. If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0, it is no longer possible to call the function from code with denormals enabled, or write an optimization to move the function into a denormal flushing mode. For the original function, if x was a denormal, the class would evaluate to false. If the function compiled with denormal handling was converted to or called from a preserve-sign function, the fcmp now evaluates to true. This could also be of use for strictfp handling, where code may be changing the denormal mode. Alternative name could be "unknown". Replaces the old AMDGPU custom inlining logic with more conservative logic which tries to permit inlining for callees with dynamic handling and avoids inlining other mismatched modes.
2023-04-28[NFC][CLANG] Fix coverity remarks about large copy by valuesManna, Soumi1-1/+1
Reported By Static Analyzer Tool, Coverity: Big parameter passed by value Copying large values is inefficient, consider passing by reference; Low, medium, and high size thresholds for detection can be adjusted. 1. Inside "CodeGenModule.cpp" file, in clang::CodeGen::CodeGenModule::EmitBackendOptionsMetadata(clang::CodeGenOptions): A very large function call parameter exceeding the high threshold is passed by value. pass_by_value: Passing parameter CodeGenOpts of type clang::CodeGenOptions const (size 2168 bytes) by value, which exceeds the high threshold of 512 bytes. 2. Inside "SemaType.cpp" file, in IsNoDerefableChunk(clang::DeclaratorChunk): A large function call parameter exceeding the low threshold is passed by value. pass_by_value: Passing parameter Chunk of type clang::DeclaratorChunk (size 176 bytes) by value, which exceeds the low threshold of 128 bytes. 3. Inside "CGNonTrivialStruct.cpp" file, in <unnamed>::getParamAddrs<1ull, <0ull...>>(std::integer_sequence<unsigned long long, T2...>, std::array<clang::CharUnits, T1>, clang::CodeGen::FunctionArgList, clang::CodeGen::CodeGenFunction *): A large function call parameter exceeding the low threshold is passed by value. .i. pass_by_value: Passing parameter Args of type clang::CodeGen::FunctionArgList (size 144 bytes) by value, which exceeds the low threshold of 128 bytes. 4. Inside "CGGPUBuiltin.cpp" file, in <unnamed>::containsNonScalarVarargs(clang::CodeGen::CodeGenFunction *, clang::CodeGen::CallArgList): A very large function call parameter exceeding the high threshold is passed by value. i. pass_by_value: Passing parameter Args of type clang::CodeGen::CallArgList (size 1176 bytes) by value, which exceeds the high threshold of 512 bytes. Reviewed By: tahonermann Differential Revision: https://reviews.llvm.org/D149163
2023-03-16Emit const globals with constexpr destructor as constant LLVM valuesHans Wennborg1-1/+1
This follows 2b4fa53 which made Clang not emit destructor calls for such objects. However, they would still not get emitted as constants since CodeGenModule::isTypeConstant() returns false if the destructor is constexpr. This change adds a param to make isTypeConstant() ignore the dtor, allowing the caller to check it instead. Fixes Issue #61212 Differential revision: https://reviews.llvm.org/D145369
2023-01-14[clang] Use std::optional instead of llvm::Optional (NFC)Kazu Hirata1-1/+1
This patch replaces (llvm::|)Optional< with std::optional<. I'll post a separate patch to remove #include "llvm/ADT/Optional.h". This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14[clang] Add #include <optional> (NFC)Kazu Hirata1-0/+1
This patch adds #include <optional> to those files containing llvm::Optional<...> or Optional<...>. I'll post a separate patch to actually replace llvm::Optional with std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-13[clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignmentGuillaume Chatelet1-1/+1
2022-12-08[clang] Ensure correct metadata for relative vtablesLeonard Chan1-0/+6
Prior to this, metadata pertaining to the size or address point offsets into a relative vtable were twice the value they should be (treating component widths as pointer width rather than 4 bytes). This prevented some vtables from being devirtualized with D134320. This ensures the correct metadata is written so whole program devirtualization can catch these remaining devirt targets. Differential Revision: https://reviews.llvm.org/D134687
2022-12-04[NFC][CodeGen] Add const to a methodVitaly Buka1-1/+2
2022-12-03[CodeGen] Use std::nullopt instead of None (NFC)Kazu Hirata1-2/+4
This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03[clang-repl] Support statements on global scope in incremental mode.Vassil Vassilev1-0/+6
This patch teaches clang to parse statements on the global scope to allow: ``` ./bin/clang-repl clang-repl> int i = 12; clang-repl> ++i; clang-repl> extern "C" int printf(const char*,...); clang-repl> printf("%d\n", i); 13 clang-repl> %quit ``` Generally, disambiguating between statements and declarations is a non-trivial task for a C++ parser. The challenge is to allow both standard C++ to be translated as if this patch does not exist and in the cases where the user typed a statement to be executed as if it were in a function body. Clang's Parser does pretty well in disambiguating between declarations and expressions. We have added DisambiguatingWithExpression flag which allows us to preserve the existing and optimized behavior where needed and implement the extra rules for disambiguating. Only few cases require additional attention: * Constructors/destructors -- Parser::isConstructorDeclarator was used in to disambiguate between ctor-looking declarations and statements on the global scope(eg. `Ns::f()`). * The template keyword -- the template keyword can appear in both declarations and statements. This patch considers the template keyword to be a declaration starter which breaks a few cases in incremental mode which will be tackled later. * The inline (and similar) keyword -- looking at the first token in many cases allows us to classify what is a declaration. * Other language keywords and specifiers -- ObjC/ObjC++/OpenCL/OpenMP rely on pragmas or special tokens which will be handled in subsequent patches. The patch conceptually models a "top-level" statement into a TopLevelStmtDecl. The TopLevelStmtDecl is lowered into a void function with no arguments. We attach this function to the global initializer list to execute the statement blocks in the correct order. Differential revision: https://reviews.llvm.org/D127284
2022-08-24KCFI sanitizerSami Tolvanen1-0/+9
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Relands 67504c95494ff05be2a613129110c9bcf17f6c13 with a fix for 32-bit builds. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296
2022-08-24Revert "KCFI sanitizer"Sami Tolvanen1-9/+0
This reverts commit 67504c95494ff05be2a613129110c9bcf17f6c13 as using PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.
2022-08-24KCFI sanitizerSami Tolvanen1-0/+9
The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296
2022-08-22[CodeGen] Sort llvm.global_ctors by lexing order before emissionYuanfang Chen1-3/+7
Fixes https://github.com/llvm/llvm-project/issues/55804 The lexing order is already bookkept in DelayedCXXInitPosition but we were not using it based on the wrong assumption that inline variable is unordered. This patch fixes it by ordering entries in llvm.global_ctors by orders in DelayedCXXInitPosition. for llvm.global_ctors entries without a lexing order, ordering them by the insertion order. (This *mostly* orders the template instantiation in https://reviews.llvm.org/D126341 intuitively, minus one tweak for which I'll submit a separate patch.) Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D127233
2022-08-04[InstrProf][attempt 2] Add new format for -fprofile-list=Ellis Hoag1-4/+5
In D130807 we added the `skipprofile` attribute. This commit changes the format so we can either `forbid` or `skip` profiling functions by adding the `noprofile` or `skipprofile` attributes, respectively. The behavior of the original format remains unchanged. Also, add the `skipprofile` attribute when using `-fprofile-function-groups`. This was originally landed as https://reviews.llvm.org/D130808 but was reverted due to a Windows test failure. Differential Revision: https://reviews.llvm.org/D131195
2022-08-04Revert "[InstrProf] Add new format for -fprofile-list="Nico Weber1-5/+4
This reverts commit b692312ca432d9a379f67a8d83177a6f1722baaa. Breaks tests on Windows, see https://reviews.llvm.org/D130808#3699952
2022-08-04[InstrProf] Add new format for -fprofile-list=Ellis Hoag1-4/+5
In D130807 we added the `skipprofile` attribute. This commit changes the format so we can either `forbid` or `skip` profiling functions by adding the `noprofile` or `skipprofile` attributes, respectively. The behavior of the original format remains unchanged. Also, add the `skipprofile` attribute when using `-fprofile-function-groups`. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D130808
2022-07-26[CGDebugInfo] Access the current working directory from the `VFS`Argyrios Kyrtzidis1-1/+10
...instead of calling `llvm::sys::fs::current_path()` directly. Differential Revision: https://reviews.llvm.org/D130443
2022-07-23[NFC] Move function definition to cpp fileJun Zhang1-28/+1
Signed-off-by: Jun Zhang <jun@junz.org>
2022-07-22re-land [C++20][Modules] Build module static initializers per P1874R1.Iain Sandoe1-1/+7
The re-land fixes module map module dependencies seen on Greendragon, but not in the clang test suite. --- Currently we only implement this for the Itanium ABI since the correct mangling for the initializers in other ABIs is not yet known. Intended result: For a module interface [which includes partition interface and implementation units] (instead of the generic CXX initializer) we emit a module init that: - wraps the contained initializations in a control variable to ensure that the inits only happen once, even if a module is imported many times by imports of the main unit. - calls module initializers for imported modules first. Note that the order of module import is not significant, and therefore neither is the order of imported module initializers. - We then call initializers for the Global Module Fragment (if present) - We then call initializers for the current module. - We then call initializers for the Private Module Fragment (if present) For a module implementation unit, or a non-module TU that imports at least one module we emit a regular CXX init that: - Calls the initializers for any imported modules first. - Then proceeds as normal with remaining inits. For all module unit kinds we include a global constructor entry, this allows for the (in most cases unusual) possibility that a module object could be included in a final binary without a specific call to its initializer. Implementation: - We provide the module pointer in the AST Context so that CodeGen can act on it and its sub-modules. - We need to account for module build lines like this: ` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or ` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o` - in order to do this, we add to ParseAST to set the module pointer in the ASTContext, once we establish that this is a module build and we know the module pointer. To be able to do this, we make the query for current module public in Sema. - In CodeGen, we determine if the current build requires a CXX20-style module init and, if so, we defer any module initializers during the "Eagerly Emitted" phase. - We then walk the module initializers at the end of the TU but before emitting deferred inits (which adds any hidden and static ones, fixing https://github.com/llvm/llvm-project/issues/51873 ). - We then proceed to emit the deferred inits and continue to emit the CXX init function. Differential Revision: https://reviews.llvm.org/D126189
2022-07-14[InstrProf] Add options to profile function groupsEllis Hoag1-3/+9
Add two options, `-fprofile-function-groups=N` and `-fprofile-selected-function-group=i` used to partition functions into `N` groups and only instrument the functions in group `i`. Similar options were added to xray in https://reviews.llvm.org/D87953 and the goal is the same; to reduce instrumented size overhead by spreading the overhead across multiple builds. Raw profiles from different groups can be added like normal using the `llvm-profdata merge` command. Reviewed By: ianlevesque Differential Revision: https://reviews.llvm.org/D129594
2022-07-13[CodeGen] Keep track of decls that were deferred and have been emitted.Jun Zhang1-0/+19
This patch adds a new field called EmittedDeferredDecls in CodeGenModule that keeps track of decls that were deferred and have been emitted. The intention of this patch is to solve issues in the incremental c++, we'll lose info of decls that are lazily emitted when we undo their usage. See example below: clang-repl> inline int foo() { return 42;} clang-repl> int bar = foo(); clang-repl> %undo clang-repl> int baz = foo(); JIT session error: Symbols not found: [ _Z3foov ] error: Failed to materialize symbols: { (main, { baz, $.incr_module_2.inits.0, orc_init_func.incr_module_2 }) } Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D128782
2022-07-11Revert "[C++20][Modules] Build module static initializers per P1874R1."Iain Sandoe1-7/+1
This reverts commit ac507102d258b6fc0cb57eb60c9dfabd57ff562f. reverting while we figuere out why one of the green dragon lldb test fails.
2022-07-09[C++20][Modules] Build module static initializers per P1874R1.Iain Sandoe1-1/+7
Currently we only implement this for the Itanium ABI since the correct mangling for the initializers in other ABIs is not yet known. Intended result: For a module interface [which includes partition interface and implementation units] (instead of the generic CXX initializer) we emit a module init that: - wraps the contained initializations in a control variable to ensure that the inits only happen once, even if a module is imported many times by imports of the main unit. - calls module initializers for imported modules first. Note that the order of module import is not significant, and therefore neither is the order of imported module initializers. - We then call initializers for the Global Module Fragment (if present) - We then call initializers for the current module. - We then call initializers for the Private Module Fragment (if present) For a module implementation unit, or a non-module TU that imports at least one module we emit a regular CXX init that: - Calls the initializers for any imported modules first. - Then proceeds as normal with remaining inits. For all module unit kinds we include a global constructor entry, this allows for the (in most cases unusual) possibility that a module object could be included in a final binary without a specific call to its initializer. Implementation: - We provide the module pointer in the AST Context so that CodeGen can act on it and its sub-modules. - We need to account for module build lines like this: ` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or ` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o` - in order to do this, we add to ParseAST to set the module pointer in the ASTContext, once we establish that this is a module build and we know the module pointer. To be able to do this, we make the query for current module public in Sema. - In CodeGen, we determine if the current build requires a CXX20-style module init and, if so, we defer any module initializers during the "Eagerly Emitted" phase. - We then walk the module initializers at the end of the TU but before emitting deferred inits (which adds any hidden and static ones, fixing https://github.com/llvm/llvm-project/issues/51873 ). - We then proceed to emit the deferred inits and continue to emit the CXX init function. Differential Revision: https://reviews.llvm.org/D126189