aboutsummaryrefslogtreecommitdiff
path: root/llvm/lib/IR/Attributes.cpp
AgeCommit message (Collapse)AuthorFilesLines
2021-07-14[Verifier] Improve incompatible attribute type checkNikita Popov1-2/+3
A couple of attributes had explicit checks for incompatibility with pointer types. However, this is already handled generically by the typeIncompatible() check. We can drop these after adding SwiftError to typeIncompatible(). However, the previous implementation of the check prints out all attributes that are incompatible with a given type, even though those attributes aren't actually used. This has the annoying result that the error message changes every time a new attribute is added to the list. Improve this by explicitly finding which attribute isn't compatible and printing just that.
2021-07-12[Attributes] Determine attribute properties from TableGen dataNikita Popov1-0/+29
Continuing from D105763, this allows placing certain properties about attributes in the TableGen definition. In particular, we store whether an attribute applies to fn/param/ret (or a combination thereof). This information is used by the Verifier, as well as the ForceFunctionAttrs pass. I also plan to use this in LLParser, which also duplicates info on which attributes are valid where. This keeps metadata about attributes in one place, and makes it more likely that it stays in sync, rather than in various functions spread across the codebase. Differential Revision: https://reviews.llvm.org/D105780
2021-07-12[Attributes] Remove duplicate attribute in typeIncompatible() (NFC)Nikita Popov1-1/+0
InAlloca was listed twice, once as a normal attribute, once as a type attribute.
2021-07-12[Attributes] Replace doesAttrKindHaveArgument() (NFC)Nikita Popov1-9/+0
This is now the same as isIntAttrKind(), so use that instead, as it does not require manual maintenance. The naming is also more accurate in that both int and type attributes have an argument, but this method was only targeting int attributes. I initially wanted to tighten the AttrBuilder assertion, but we have some in-tree uses that would violate it.
2021-07-12[Attributes] Simplify attribute sorting (NFCI)Nikita Popov1-30/+12
It's not necessary to explicitly sort by enum/int/type attribute, as the attribute kinds are already sorted this way. We can directly sort by kind.
2021-07-12[Attributes] Assert correct attribute constructor is used (NFCI)Nikita Popov1-0/+5
Assert that enum/int/type attributes go through the constructor they are supposed to use. To make sure this can't happen via invalid bitcode, explicitly verify that the attribute kind if correct there.
2021-07-12[Attributes] Make type attribute handling more generic (NFCI)Nikita Popov1-31/+17
Followup to D105658 to make AttrBuilder automatically work with new type attributes. TableGen is tweaked to emit First/LastTypeAttr markers, based on which we can handle type attributes programmatically. Differential Revision: https://reviews.llvm.org/D105763
2021-07-09[AttrBuilder] Make handling of type attributes more generic (NFCI)Nikita Popov1-75/+47
While working on the elementtype attribute, I felt that the type attribute handling in AttrBuilder is overly repetitive. This patch converts the separate Type* members into an std::array<Type*>, so that all type attribute kinds can be handled generically. There's more room for improvement here (especially when it comes to converting the AttrBuilder to an Attribute), but this seems like a good starting point. Differential Revision: https://reviews.llvm.org/D105658
2021-07-07[IR] Simplify Attribute::getAsString() (NFC)Nikita Popov1-153/+4
Avoid enumerating all attributes here and instead use getNameFromAttrKind(), which is based on the tablegen data. This only leaves us with custom handling for int attributes, which don't have uniform printing.
2021-06-28[IR] remove assert since always_inline can appear on CallBaseNick Desaulniers1-10/+0
I added an assertion in D91816 (documenting behavior added in D93422) that callers and callees with mismatched fn attr's related to stack protectors should not occur unless the callee was attributed always_inline. This falls apart when a call, invoke, or callbr (any instruction inheriting from CallBase) itself has an always_inline attribute. Clang will emit such attributes on Instructions when __attribute__((flatten)) is used to recursively force inlining from a caller. Since these assertions only had the caller and callee Functions, and not the call site (CallBase derived classes), we would have to search the caller for such instructions to reconstruct the call site information. But at that point, inlining has already occurred; the call site has already been removed from the caller. Remove the assertions, add a unit test for always_inline call sites, and update the LangRef. Another curiosity is that the always_inline Attribute on Instructions is only expanded by the inline pass, not the always_inline pass. Thanks to @pcc on this report when building Android's RunTime (ART) interpreter. Reviewed By: pcc, MaskRay Differential Revision: https://reviews.llvm.org/D104944
2021-05-25[SanitizeCoverage] Add support for NoSanitizeCoverage function attributeMarco Elver1-0/+2
We really ought to support no_sanitize("coverage") in line with other sanitizers. This came up again in discussions on the Linux-kernel mailing lists, because we currently do workarounds using objtool to remove coverage instrumentation. Since that support is only on x86, to continue support coverage instrumentation on other architectures, we must support selectively disabling coverage instrumentation via function attributes. Unfortunately, for SanitizeCoverage, it has not been implemented as a sanitizer via fsanitize= and associated options in Sanitizers.def, but rolls its own option fsanitize-coverage. This meant that we never got "automatic" no_sanitize attribute support. Implement no_sanitize attribute support by special-casing the string "coverage" in the NoSanitizeAttr implementation. To keep the feature as unintrusive to existing IR generation as possible, define a new negative function attribute NoSanitizeCoverage to propagate the information through to the instrumentation pass. Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035 Reviewed By: vitalybuka, morehouse Differential Revision: https://reviews.llvm.org/D102772
2021-05-22[IR] Optimize no-op removal from AttributeList (NFC)Nikita Popov1-11/+6
When removing an AttrBuilder from an index of an AttributeList, directly return the original list if no attributes were actually removed.
2021-05-22[IR] Optimize no-op removal from AttributeSet (NFC)Nikita Popov1-1/+5
When removing an AttrBuilder from an AttributeSet, first check whether there is any overlap. If nothing is being removed, we can directly return the original set.
2021-05-14IR+AArch64: add a "swiftasync" argument attribute.Tim Northover1-0/+2
This extends any frame record created in the function to include that parameter, passed in X22. The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001 in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect of this is that tools walking the stack should expect to see one of three values there: * 0b0000 => a normal, non-extended record with just [FP, LR] * 0b0001 => the extended record [X22, FP, LR] * 0b1111 => kernel space, and a non-extended record. All other values are currently reserved. If compiling for arm64e this context pointer is address-discriminated with the discriminator 0xc31a and the DB (process-specific) key. There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing front-ends access to this slot (and forcing its creation initialized to nullptr if necessary).
2021-04-29Improve error messages for attributes in the wrong context.Nick Lewycky1-3/+15
verifyFunctionAttrs has a comment that the value V is printed in error messages. The recently added errors for attributes didn't print V. Make them print V. Change the stringification of AttributeList. Firstly they started with 'PAL[' which stood for ParamAttrsList. Change that to 'AttributeList[' matching its current name AttributeList. Print out semantic meaning of the index instead of the raw index value (i.e. 'return', 'function' or 'arg(n)'). Differential revision: https://reviews.llvm.org/D101484
2021-04-17Normalize interaction with boolean attributesSerge Guelton1-0/+12
Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalize the check, with the following behavior: no attributes or attribute set to "false" => return false attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299
2021-04-16Verify the LLVMContext that an Attribute belongs to.Nick Lewycky1-5/+31
Attributes don't know their parent Context, adding this would make Attribute larger. Instead, we add hasParentContext that answers whether this Attribute belongs to a particular LLVMContext by checking for itself inside the context's FoldingSet. Same with AttributeSet and AttributeList. The Verifier checks them with the Module context. Differential Revision: https://reviews.llvm.org/D99362
2021-04-15[clang][AArch64] Correctly align HFA arguments when passed on the stackMomchil Velikov1-0/+4
When we pass a AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules) Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e..g byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types, which naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role, previously perfomed by align, falling back to align if stackalign` is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794
2021-03-29Reapply "OpaquePtr: Turn inalloca into a type attribute"Matt Arsenault1-17/+61
This reverts commit 07e46367baeca96d84b03fa215b41775f69d5989.
2021-03-29Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute""Oliver Stannard1-61/+17
Reverting because test 'Bindings/Go/go.test' is failing on most buildbots. This reverts commit fc9df309917e57de704f3ce4372138a8d4a23d7a.
2021-03-28Reapply "OpaquePtr: Turn inalloca into a type attribute"Matt Arsenault1-17/+61
This reverts commit 20d5c42e0ef5d252b434bcb610b04f1cb79fe771.
2021-03-28Revert "OpaquePtr: Turn inalloca into a type attribute"Nico Weber1-61/+17
This reverts commit 4fefed65637ec46c8c2edad6b07b5569ac61e9e5. Broke check-clang everywhere.
2021-03-28OpaquePtr: Turn inalloca into a type attributeMatt Arsenault1-17/+61
I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inalloca, so catch this up to the others so that can happen.
2021-03-25[DAE] Adjust param/arg attributes when changing parameter to undefGuozhi Wei1-0/+11
In DeadArgumentElimination pass, if a function's argument is never used, corresponding caller's parameter can be changed to undef. If the param/arg has attribute noundef or other related attributes, LLVM LangRef(https://llvm.org/docs/LangRef.html#parameter-attributes) says its behavior is undefined. SimplifyCFG(D97244) takes advantage of this behavior and does bad transformation on valid code. To avoid this undefined behavior when change caller's parameter to undef, this patch removes noundef attribute and other attributes imply noundef on param/arg. Differential Revision: https://reviews.llvm.org/D98899
2021-03-22[IR] Add vscale_range IR function attributeBradley Smith1-2/+100
This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030
2021-03-16[NFC] Use SmallString instead of std::string for the AttrBuilderserge-sans-paille1-1/+1
This avoids a few unnecessary conversion from StringRef to std::string, and a bunch of extra allocation thanks to the SmallString. Differential Revision: https://reviews.llvm.org/D98190
2021-02-27[IR] Use range-based for loops (NFC)Kazu Hirata1-3/+2
2021-01-26Support for instrumenting only selected files or functionsPetr Hosek1-0/+2
This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820
2021-01-26Revert "Support for instrumenting only selected files or functions"Petr Hosek1-2/+0
This reverts commit 4edf35f11a9e20bd5df3cb47283715f0ff38b751 because the test fails on Windows bots.
2021-01-26Support for instrumenting only selected files or functionsPetr Hosek1-0/+2
This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820
2021-01-22[IR] Optimize adding attribute to AttributeList (NFC)Nikita Popov1-12/+17
When adding an enum attribute to an AttributeList, avoid going through an AttrBuilder and instead directly add the attribute to the correct set. Going through AttrBuilder is expensive, because it requires all string attributes to be reconstructed. This can be further improved by inserting the attribute at the right position and using the AttributeSetNode::getSorted() API. This recovers the small compile-time regression from D94633.
2021-01-06[llvm] Use llvm::append_range (NFC)Kazu Hirata1-1/+1
2020-12-31[IROutliner] Adding consistent function attribute mergingAndrew Litteken1-0/+18
When combining extracted functions, they may have different function attributes. We want to make sure that we do not make any assumptions, or lose any information. This attempts to make sure that we consolidate function attributes to their most general case. Tests: llvm/test/Transforms/IROutliner/outlining-compatible-and-attribute-transfer.ll llvm/test/Transforms/IROutliner/outlining-compatible-or-attribute-transfer.ll Reviewers: jdoefert, paquette Differential Revision: https://reviews.llvm.org/D87301
2020-12-17[IR][PGO] Add hot func attribute and use hot/cold attribute in func sectionRong Xu1-0/+2
Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR. This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently .hot and .unlikely suffix only are added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold. The changes are: (1) user annotated function attribute will used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrite user annotated cold attribute. The intention for these changes is to provide the user a way to mark certain function as hot in cases where training input is hard to cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493
2020-12-17Make LLVM build in C++20 modeBarry Revzin1-1/+1
Part of the <=> changes in C++20 make certain patterns of writing equality operators ambiguous with themselves (sorry!). This patch goes through and adjusts all the comparison operators such that they should work in both C++17 and C++20 modes. It also makes two other small C++20-specific changes (adding a constructor to a type that cases to be an aggregate, and adding casts from u8 literals which no longer have type const char*). There were four categories of errors that this review fixes. Here are canonical examples of them, ordered from most to least common: // 1) Missing const namespace missing_const { struct A { #ifndef FIXED bool operator==(A const&); #else bool operator==(A const&) const; #endif }; bool a = A{} == A{}; // error } // 2) Type mismatch on CRTP namespace crtp_mismatch { template <typename Derived> struct Base { #ifndef FIXED bool operator==(Derived const&) const; #else // in one case changed to taking Base const& friend bool operator==(Derived const&, Derived const&); #endif }; struct D : Base<D> { }; bool b = D{} == D{}; // error } // 3) iterator/const_iterator with only mixed comparison namespace iter_const_iter { template <bool Const> struct iterator { using const_iterator = iterator<true>; iterator(); template <bool B, std::enable_if_t<(Const && !B), int> = 0> iterator(iterator<B> const&); #ifndef FIXED bool operator==(const_iterator const&) const; #else friend bool operator==(iterator const&, iterator const&); #endif }; bool c = iterator<false>{} == iterator<false>{} // error || iterator<false>{} == iterator<true>{} || iterator<true>{} == iterator<false>{} || iterator<true>{} == iterator<true>{}; } // 4) Same-type comparison but only have mixed-type operator namespace ambiguous_choice { enum Color { Red }; struct C { C(); C(Color); operator Color() const; bool operator==(Color) const; friend bool operator==(C, C); }; bool c = C{} == C{}; // error bool d = C{} == Red; } Differential revision: https://reviews.llvm.org/D78938
2020-12-14[clang][IR] Add support for leaf attributeGulfem Savrun Yeniceri1-0/+2
This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275
2020-12-02[Inline] prevent inlining on stack protector mismatchNick Desaulniers1-0/+10
It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an attribute((no_stack_protector)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u SSP attributes can be ordered by strength. Weakest to strongest, they are: ssp, sspstrong, sspreq. Callees with differing SSP attributes may be inlined into each other, and the strongest attribute will be applied to the caller. (No change) After this change: * A callee with no SSP attributes will no longer be inlined into a caller with SSP attributes. * The reverse is also true: a callee with an SSP attribute will not be inlined into a caller with no SSP attributes. * The alwaysinline attribute overrides these rules. Functions that get synthesized by the compiler may not get inlined as a result if they are not created with the same stack protector function attribute as their callers. Alternative approach to https://reviews.llvm.org/D87956. Fixes pr/47479. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed By: rnk, MaskRay Differential Revision: https://reviews.llvm.org/D91816
2020-11-17Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch"Nick Desaulniers1-12/+2
This reverts commit b7926ce6d7a83cdf70c68d82bc3389c04009b841. Going with a simpler approach.
2020-10-23[IR] add fn attr for no_stack_protector; prevent inlining on mismatchNick Desaulniers1-2/+12
It's currently ambiguous in IR whether the source language explicitly did not want a stack a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956
2020-10-20[IR] Adds mustprogress as a LLVM IR attributeAtmn Patel1-0/+2
This adds the LLVM IR attribute `mustprogress` as defined in LangRef through D86233. This attribute will be applied to functions with in languages like C++ where forward progress is guaranteed. Functions without this attribute are not required to make progress. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D85393
2020-10-16Reapply "OpaquePtr: Add type to sret attribute"Matt Arsenault1-6/+44
This reverts commit eb9f7c28e5fe6d75fed3587023e17f2997c8024b. Previously this was incorrectly handling linking of the contained type, so this merges the fixes from D88973.
2020-10-05[AttributeFuncs] Consider `noundef` in `typeIncompatible`Johannes Doerfert1-0/+4
Drop `noundef` for return values that are replaced by void and make it illegal to put `noundef` on a void value. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87306
2020-10-05[AttributeFuncs] Consider `align` in `typeIncompatible`Johannes Doerfert1-0/+1
Alignment attributes need to be dropped for non-pointer values. This also introduces a check into the verifier to ensure you don't use `align` on anything but a pointer. Test needed to be adjusted accordingly. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D87304
2020-09-29Revert "OpaquePtr: Add type to sret attribute"Tres Popp1-44/+6
This reverts commit 55c4ff91bd820d72014f63dcf7f3d5a0d3397986. Issues were introduced as discussed in https://reviews.llvm.org/D88241 where this change made previous bugs in the linker and BitCodeWriter visible.
2020-09-25OpaquePtr: Add type to sret attributeMatt Arsenault1-6/+44
Make the corresponding change that was made for byval in b7141207a483d39b99c2b4da4eb3bb591eca9e1a. Like byval, this requires a bulk update of the test IR tests to include the type before this can be mandatory.
2020-08-29[IR] Inline AttrBuilder::addAttribute. It just sets 1 bit. NFC.Benjamin Kramer1-8/+0
2020-08-29[Attributes] Merge calls to getFnAttribute/hasFnAttribute using ↵Craig Topper1-23/+18
Attribute::isValid. NFC Rather than calling hasFnAttribute and then calling getFnAttribute if the attribute exists, its better to just call getFnAttribute and then check if we got a valid attribute back.
2020-08-28[Attributes] Add a method to check if an Attribute has AttrKind None. Use ↵Craig Topper1-4/+4
instead of hasAttribute(Attribute::None) There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None. So if we just want to check for None its sufficient to just check that pImpl is null. Which can even be done inline. This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations. Differential Revision: https://reviews.llvm.org/D86744
2020-07-20IR: Define byref parameter attributeMatt Arsenault1-5/+44
This allows tracking the in-memory type of a pointer argument to a function for ABI purposes. This is essentially a stripped down version of byval to remove some of the stack-copy implications in its definition. This includes the base IR changes, and some tests for places where it should be treated similarly to byval. Codegen support will be in a future patch. My original attempt at solving some of these problems was to repurpose byval with a different address space from the stack. However, it is technically permitted for the callee to introduce a write to the argument, although nothing does this in reality. There is also talk of removing and replacing the byval attribute, so a new attribute would need to take its place anyway. This is intended avoid some optimization issues with the current handling of aggregate arguments, as well as fixes inflexibilty in how frontends can specify the kernel ABI. The most honest representation of the amdgpu_kernel convention is to expose all kernel arguments as loads from constant memory. Today, these are raw, SSA Argument values and codegen is responsible for turning these into loads. Background: There currently isn't a satisfactory way to represent how arguments for the amdgpu_kernel calling convention are passed. In reality, arguments are passed in a single, flat, constant memory buffer implicitly passed to the function. It is also illegal to call this function in the IR, and this is only ever invoked by a driver of some kind. It does not make sense to have a stack passed parameter in this context as is implied by byval. It is never valid to write to the kernel arguments, as this would corrupt the inputs seen by other dispatches of the kernel. These argumets are also not in the same address space as the stack, so a copy is needed to an alloca. From a source C-like language, the kernel parameters are invisible. Semantically, a copy is always required from the constant argument memory to a mutable variable. The current clang calling convention lowering emits raw values, including aggregates into the function argument list, since using byval would not make sense. This has some unfortunate consequences for the optimizer. In the aggregate case, we end up with an aggregate store to alloca, which both SROA and instcombine turn into a store of each aggregate field. The optimizer never pieces this back together to see that this is really just a copy from constant memory, so we end up stuck with expensive stack usage. This also means the backend dictates the alignment of arguments, and arbitrarily picks the LLVM IR ABI type alignment. By allowing an explicit alignment, frontends can make better decisions. For example, there's real no advantage to an aligment higher than 4, so a frontend could choose to compact the argument layout. Similarly, there is a high penalty to using an alignment lower than 4, so a frontend could opt into more padding for small arguments. Another design consideration is when it is appropriate to expose the fact that these arguments are all really passed in adjacent memory. Currently we have a late IR optimization pass in codegen to rewrite the kernel argument values into explicit loads to enable vectorization. In most programs, unrelated argument loads can be merged together. However, exposing this property directly from the frontend has some disadvantages. We still need a way to track the original argument sizes and alignments to report to the driver. I find using some side-channel, metadata mechanism to track this unappealing. If the kernel arguments were exposed as a single buffer to begin with, alias analysis would be unaware that the padding bits betewen arguments are meaningless. Another family of problems is there are still some gaps in replacing all of the available parameter attributes with metadata equivalents once lowered to loads. The immediate plan is to start using this new attribute to handle all aggregate argumets for kernels. Long term, it makes sense to migrate all kernel arguments, including scalars, to be passed indirectly in the same manner. Additional context is in D79744.
2020-07-17[IR] Fix MSVC warning (NFC)Nikita Popov1-4/+2
As requested by Andrew Kaylor, rewrite this code in a way that does not warn on old MSVC versions. Avoid the buggy constexpr warning by just not using constexpr and removing the static_assert that depends on it.