path: root/clang/lib/CodeGen/CGCall.cpp
Age | Commit message | Author | Files | Lines (-/+)
2021-04-23Reland "[Clang] Propagate guaranteed alignment for malloc and others"Dávid Bolvanský1-0/+18
This relands commit 6914a0ed2b30924b188968e59a83efa07ac5fe57. Crash in InstCombine was fixed.
2021-04-23Revert "[Clang] Propagate guaranteed alignment for malloc and others"Dávid Bolvanský1-18/+0
This reverts commit c2297544c04764237cedc523083c7be2fb3833d4. Some buildbots are broken.
2021-04-23[Clang] Propagate guaranteed alignment for malloc and othersDávid Bolvanský1-0/+18
LLVM should be smarter about the *known* alignment of malloc, and this knowledge may enable other optimizations. This originally started as an LLVM patch (https://reviews.llvm.org/D100862), but the logic really belongs in Clang. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D100879
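A minimal sketch (not from the patch) of the effect, assuming a 64-bit target where Clang's guaranteed malloc alignment is 16 bytes; with this change the call site can carry that alignment in IR (roughly `call align 16 i8* @malloc(i64 64)`), which later passes can exploit:

    #include <stdlib.h>
    #include <string.h>

    void fill(size_t n) {
      /* result assumed 16-byte aligned on such a target */
      double *p = malloc(n * sizeof *p);
      if (p)
        memset(p, 0, n * sizeof *p);
      free(p);
    }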
2021-04-15Implemented [[clang::musttail]] attribute for guaranteed tail calls.Joshua Haberman1-2/+22
This is a Clang-only change and depends on the existing "musttail" support already implemented in LLVM. The [[clang::musttail]] attribute goes on a return statement, not a function definition. There are several constraints that the user must follow when using [[clang::musttail]], and these constraints are verified by Sema. Tail calls are supported on regular function calls, calls through a function pointer, member function calls, and even pointer-to-member calls. Future work would be to throw a warning if a user tries to pass a pointer or reference to a local variable through a musttail call. Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D99517
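A minimal example of the attribute in use (assuming C2x/C++ attribute syntax; the caller and callee must have compatible signatures, which Sema verifies):

    int gcd(int a, int b) {
      if (b == 0)
        return a;
      /* the call below is a guaranteed tail call: gcd's frame is reused */
      [[clang::musttail]] return gcd(b, a % b);
    }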
2021-04-15[clang][AArch64] Correctly align HFA arguments when passed on the stackMomchil Velikov1-0/+1
When we pass an AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules). Currently we don't have a way to express in LLVM IR the alignment requirements of function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e.g. byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types which naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role previously performed by align, falling back to align if stackalign is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), we can now optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794
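For illustration (not part of the patch), a signature where the third over-aligned HFA exhausts the eight AArch64 floating-point argument registers and is therefore passed in memory, where its stack slot must now respect the 16-byte alignment:

    struct S { __attribute__ ((__aligned__(16))) double v[4]; };

    double sum_first(struct S a, struct S b, struct S c) {
      /* a and b use v0-v7; c gets no registers and is passed on the stack */
      return a.v[0] + b.v[0] + c.v[0];
    }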
2021-04-10Temporarily revert "[CGCall] Annotate `this` argument with alignment"Roman Lebedev1-11/+5
As per @jyknight, "It seems like there's a bug with vtable thunks getting the wrong information." See https://reviews.llvm.org/D99790#2680857, https://godbolt.org/z/MxhYMe1q7 This reverts commit 0aa0458f1429372038ca6a4edc7e94c96cd9a753.
2021-04-07[CGCall] Annotate `this` argument with alignmentRoman Lebedev1-5/+11
As it is being noted in D99249, lack of alignment information on `this` has been preventing LICM from happening. For some time now, lack of alignment attribute does *not* imply natural alignment, but an alignment of `1`. Also, we used to treat dereferenceable as implying alignment, but we no longer do, so it's a bugfix. Differential Revision: https://reviews.llvm.org/D99790
2021-03-29Reapply "OpaquePtr: Turn inalloca into a type attribute"Matt Arsenault1-1/+1
This reverts commit 07e46367baeca96d84b03fa215b41775f69d5989.
2021-03-29Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute""Oliver Stannard1-1/+1
Reverting because test 'Bindings/Go/go.test' is failing on most buildbots. This reverts commit fc9df309917e57de704f3ce4372138a8d4a23d7a.
2021-03-28Reapply "OpaquePtr: Turn inalloca into a type attribute"Matt Arsenault1-1/+1
This reverts commit 20d5c42e0ef5d252b434bcb610b04f1cb79fe771.
2021-03-28Revert "OpaquePtr: Turn inalloca into a type attribute"Nico Weber1-1/+1
This reverts commit 4fefed65637ec46c8c2edad6b07b5569ac61e9e5. Broke check-clang everywhere.
2021-03-28OpaquePtr: Turn inalloca into a type attributeMatt Arsenault1-1/+1
I think byval/sret and the others are close to the point where the code supporting the missing-type case can be ripped out. A lot of this code is shared with inalloca, so bring inalloca up to parity with the others so that can happen.
2021-03-11[CGBuilder] Remove type-less CreateAlignedLoad() APIs (NFC)Nikita Popov1-1/+3
These are incompatible with opaque pointers. This is in preparation of dropping this API on the IRBuilder side as well. Instead explicitly pass the loaded type.
2021-03-10[NFC] Remove duplicate isNoBuiltinFunc methodserge-sans-paille1-2/+1
It's available both in CodeGenOptions and in LangOptions, and LangOptions implementation is slightly better as it uses a StringRef instead of a char pointer, so use it. Differential Revision: https://reviews.llvm.org/D98175
2021-03-05Refactor -funique-internal-linkage-names implementation.Sriraman Tallam1-0/+10
The option -funique-internal-linkage-names was added in D73307 and D78243 as an early LLVM pass that appends a unique suffix to internal-linkage functions and variables. The unique suffix was the hash of the module path. However, we found that this can be done more cleanly early in Clang, and the fixes that would otherwise be needed later can be completely avoided. In particular, those fixes involve modifying the DW_AT_linkage_name and finding the right place to insert the pass. This patch resurrects the original implementation proposed in D73307, which was reviewed and then ditched in favor of the pass-based approach. Differential Revision: https://reviews.llvm.org/D96109
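An illustrative example (the exact suffix spelling is an assumption here, not something stated in this log):

    /* With -funique-internal-linkage-names this internal-linkage function is
       emitted with a suffix derived from a hash of the module path, e.g.
       something like helper.__uniq.<hash>, now appended by Clang itself
       rather than by a late LLVM pass. */
    static int helper(void) { return 42; }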
2021-03-04[MS] Fix crash involving gnu stmt exprs and inallocaReid Kleckner1-1/+2
Use a WeakTrackingVH to cope with the stmt emission logic that cleans up unreachable blocks. This invalidates the reference to the deferred replacement placeholder. Cope with it. Fixes PR25102 (from 2015!)
2021-03-04Introduce noundef attribute at call sites for stricter poison analysisGui Andrade1-0/+94
This change adds a new IR noundef attribute, which denotes when a function call argument or return value may never contain uninitialized bits. In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang). I'll introduce the change allowing msan to take advantage of this information in a separate patch. Differential Revision: https://reviews.llvm.org/D81678
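A small sketch of where the attribute applies, assuming noundef emission is enabled: the scalar parameters and the return value below must be fully initialized, so all of them can be marked noundef in the IR (roughly `define noundef i32 @add(i32 noundef %a, i32 noundef %b)`):

    int add(int a, int b) {
      return a + b;   /* neither the arguments nor the result may carry undef bits */
    }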
2021-02-20Reduce the number of attributes attached to each functionDávid Bolvanský1-2/+2
This takes advantage of the implicit default behavior to reduce the number of attributes.
2021-02-18[clang] functions with the 'const' or 'pure' attribute must always return.Jeroen Dobbelaere1-0/+5
As described in
* https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute
* https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute
an `__attribute__((pure))` function must always return, as must an `__attribute__((const))` function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96960
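For illustration, the kind of declarations affected; since the optimizer may now assume these functions return, placing an infinite loop or a call to abort inside them would be a misuse of the attributes:

    /* const: the result depends only on the arguments, no memory is read */
    __attribute__((const)) int square(int x);

    /* pure: may read (but not write) global memory, and always returns */
    __attribute__((pure)) int lookup(const int *table, int i);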
2021-02-16Reduce the number of attributes attached to each functionserge-sans-paille1-15/+14
This takes advantage of the implicit default behavior to reduce the number of attributes, which in turn reduces compilation time. I've observed a ~3% reduction in instruction count when compiling the sqlite3 amalgamation with -O0. Differential Revision: https://reviews.llvm.org/D96400
2021-01-12[IR] move nomerge attribute from function declaration/definition to callsitesZequan Wu1-8/+8
Move the nomerge attribute from the function declaration/definition to call sites to allow virtual function calls to attach the attribute. Differential Revision: https://reviews.llvm.org/D94537
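A minimal sketch of the statement-attribute use (C2x/C++ attribute syntax assumed); with the attribute now attached at the call sites in IR, it also takes effect when the callee is reached through an indirect or virtual call:

    void log_failure(const char *why);

    void check(int a, int b) {
      if (!a)
        [[clang::nomerge]] log_failure("a was zero");   /* keep the two calls distinct */
      if (!b)
        [[clang::nomerge]] log_failure("b was zero");
    }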
2021-01-11[clang][AArch64][SVE] Avoid going through memory for coerced VLST return valuesJoe Ellis1-0/+15
VLST return values are coerced to VLATs in the function epilog for consistency with the VLAT ABI. Previously, this coercion was done through memory. It is preferable to use the llvm.experimental.vector.insert intrinsic to avoid going through memory here. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D94290
2021-01-05[clang][AArch64][SVE] Avoid going through memory for coerced VLST argumentsJoe Ellis1-0/+21
VLST arguments are coerced to VLATs at the function boundary for consistency with the VLAT ABI. They are then bitcast back to VLSTs in the function prolog. Previously, this conversion was done through memory. With the introduction of the llvm.vector.{insert,extract} intrinsics, we can avoid going through memory here. Depends on D92761. Differential Revision: https://reviews.llvm.org/D92762
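A sketch of a function with VLST parameters, assuming SVE is enabled and -msve-vector-bits=512; at the function boundary the fixed-length values are coerced to scalable vectors and, with this change, converted back using the vector insert/extract intrinsics rather than a stack round-trip:

    #include <arm_sve.h>

    typedef svint32_t fixed_i32 __attribute__((arm_sve_vector_bits(512)));

    fixed_i32 add_vectors(fixed_i32 a, fixed_i32 b) {
      /* implicit VLS <-> VLA conversions at the parameters and the return */
      return svadd_s32_x(svptrue_b32(), a, b);
    }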
2020-12-17[IR][PGO] Add hot func attribute and use hot/cold attribute in func sectionRong Xu1-0/+2
The Clang FE currently has hot/cold function attributes, but we only have the cold function attribute in LLVM IR. This patch adds support for the hot function attribute to LLVM IR. The attribute is used when setting the function section prefix/suffix. Currently the .hot and .unlikely suffixes are only added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, the function is marked as hot. Otherwise, (2) if the user annotates a function as cold or isFunctionColdInCallGraph is true, the function is marked as cold. The changes are: (1) the user-annotated function attribute is now used in setting the function section prefix/suffix; (2) the hot attribute overrides profile-count-based hotness; (3) profile-count-based hotness overrides a user-annotated cold attribute. The intention of these changes is to give the user a way to mark certain functions as hot in cases where the training input cannot easily cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493
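For illustration, the user annotations this change honors when choosing the .hot/.unlikely section suffixes (a user-marked hot function now wins over profile-count-based classification):

    __attribute__((hot))  void packet_fast_path(void);   /* forced hot */
    __attribute__((cold)) void report_fatal_error(void); /* cold unless profile counts say otherwise */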
2020-12-17[Clang] Make nomerge attribute a function attribute as well as a statement attribute.Zequan Wu1-5/+9
Differential Revision: https://reviews.llvm.org/D92800
2020-12-15[Clang][Attr] Introduce the `assume` function attributeJohannes Doerfert1-0/+13
The `assume` attribute is a way to provide additional, arbitrary information to the optimizer. For now, assumptions are restricted to strings, which will be accumulated for a function and emitted as a comma-separated string function attribute. The key of the LLVM-IR function attribute is `llvm.assume`. Similar to `llvm.assume` and `__builtin_assume`, the `assume` attribute provides a user-defined assumption to the compiler. A follow-up patch will introduce an LLVM-core API to query the assumptions attached to a function. We also expect to add more options, e.g., expression arguments, to the `assume` attribute later on. The `omp [begin] assumes` pragma will leverage this attribute and expose the functionality in the absence of OpenMP. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D91979
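A small sketch with a made-up assumption string ("no_locks_held" is purely illustrative); the strings attached to a function accumulate into the comma-separated value of its `llvm.assume` IR attribute:

    __attribute__((assume("no_locks_held")))
    void fast_path(void);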
2020-12-14[PGO] remove unintentional code in early commitRong Xu1-5/+0
Remove unintentional code in commit 54e03d [PGO] Verify BFI counts after loading profile data.
2020-12-14[PGO] Verify BFI counts after loading profile dataRong Xu1-0/+5
This patch adds the functionality to compare BFI counts with real profile counts right after reading the profile. It will print remarks under -Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo. Differential Revision: https://reviews.llvm.org/D91813
2020-12-14[clang][IR] Add support for leaf attributeGulfem Savrun Yeniceri1-0/+2
This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275
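An illustrative declaration; leaf tells the optimizer that the external callee returns to the current translation unit only by returning (or via exception handling), never by calling back into it, e.g. through a callback or longjmp:

    __attribute__((leaf)) extern int write_log(const char *msg);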
2020-12-09Don't setup inalloca for swiftcc on i686-windows-msvcReid Kleckner1-11/+35
Swiftcall does its own target-independent argument type classification, since it is not designed to be ABI compatible with anything local on the target that isn't LLVM-based. This means it never uses inalloca. However, we have duplicate logic for checking for inalloca parameters that runs before call argument setup. This logic needs to know ahead of time if inalloca will be used later, and we can't move the CGFunctionInfo calculation earlier. This change gets the calling convention from either the FunctionProtoType or ObjCMethodDecl, checks if it is swift, and if so skips the stackbase setup. Depends on D92883. Differential Revision: https://reviews.llvm.org/D92944
2020-12-09De-templatify EmitCallArgs argument type checking, NFCIReid Kleckner1-2/+68
This template exists to abstract over FunctionProtoType and ObjCMethodDecl, which have similar APIs for storing parameter types. In place of a template, use a PointerUnion with two cases to handle this. Hopefully this improves readability, since the type of the prototype is easier to discover. This allows me to sink this code, which is mostly assertions, out of the header file and into the cpp file. I can also simplify the overloaded methods for computing isGenericMethod, and get rid of the second EmitCallArgs overload. Differential Revision: https://reviews.llvm.org/D92883
2020-11-30[CodeGen] -fno-delete-null-pointer-checks: change dereferenceable to dereferenceable_or_nullFangrui Song1-5/+13
After D17993, with -fno-delete-null-pointer-checks we add the dereferenceable attribute to the `this` pointer. We have observed that one internal target which worked before fails even with -fno-delete-null-pointer-checks. Switching to dereferenceable_or_null fixes the problem. dereferenceable currently does not always respect NullPointerIsValid and may imply nonnull and lead to aggressive optimization. The optimization may be related to `CallBase::isReturnNonNull`, `Argument::hasNonNullAttr`, or `Value::getPointerDereferenceableBytes`. See D66664 and D66618 for some discussions. Reviewed By: bkramer, rsmith Differential Revision: https://reviews.llvm.org/D92297
2020-11-25CGCall.cpp - use castAs<> instead of getAs<> as we dereference the pointer directly. NFCI.Simon Pilgrim1-2/+2
castAs<> will assert the correct cast type instead of just returning null, which we then try to dereference immediately in the setUsedBits call.
2020-11-16[CodeGen] Apply 'nonnull' and 'dereferenceable(N)' to 'this' pointer argumentsCJ Johnson1-0/+22
* Adds 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments
* Gates 'nonnull' on -f(no-)delete-null-pointer-checks
* Introduces this-nonnull.cpp and microsoft-abi-this-nullable.cpp tests to explicitly test the behavior of this change
* Refactors hundreds of over-constrained clang tests to permit these attributes, where needed
* Updates Clang12 patch notes mentioning this change
Reviewed-by: rsmith, jdoerfert Differential Revision: https://reviews.llvm.org/D17993
2020-10-25[clang] Enable support for #pragma STDC FENV_ACCESSMelanie Blower1-2/+2
Reviewers: rjmccall, rsmith, sepavloff Differential Revision: https://reviews.llvm.org/D87528
2020-10-16Reapply "OpaquePtr: Add type to sret attribute"Matt Arsenault1-2/+2
This reverts commit eb9f7c28e5fe6d75fed3587023e17f2997c8024b. Previously this was incorrectly handling linking of the contained type, so this merges the fixes from D88973.
2020-10-13[AST] Change return type of getTypeInfoInChars to a proper struct instead of std::pair.Bevin Hansson1-2/+2
Followup to D85191. This changes getTypeInfoInChars to return a TypeInfoChars struct instead of a std::pair of CharUnits. This lets the interface match getTypeInfo more closely. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86447
2020-10-01[clang][opencl][codegen] Remove the insertion of `correctly-rounded-divide-sqrt-fp-math` fn-attr.Michael Liao1-5/+0
- `-cl-fp32-correctly-rounded-divide-sqrt` is already handled in a per-instruction manner by annotating the accuracy required. There's no need to add that fn-attr. So far, there's no in-tree backend handling that attr and that OpenCL-specific option.
- In case out-of-tree backends are broken, this change could be reverted if those backends cannot be fixed.
Differential Revision: https://reviews.llvm.org/D88424
2020-09-29Revert "OpaquePtr: Add type to sret attribute"Tres Popp1-2/+2
This reverts commit 55c4ff91bd820d72014f63dcf7f3d5a0d3397986. Issues were introduced as discussed in https://reviews.llvm.org/D88241 where this change made previous bugs in the linker and BitCodeWriter visible.
2020-09-28[ubsan] nullability-arg: Fix crash on C++ member pointersVedant Kumar1-4/+1
Extend -fsanitize=nullability-arg to handle call sites which accept C++ member pointers. rdar://62476022 Differential Revision: https://reviews.llvm.org/D88336
2020-09-28[clang][codegen] Annotate `correctly-rounded-divide-sqrt-fp-math` fn-attr for OpenCL only.Michael Liao1-3/+5
- `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option and `correctly-rounded-divide-sqrt-fp-math` should be added for OpenCL at most.
Differential revision: https://reviews.llvm.org/D88303
2020-09-25OpaquePtr: Add type to sret attributeMatt Arsenault1-2/+2
Make the corresponding change that was made for byval in b7141207a483d39b99c2b4da4eb3bb591eca9e1a. Like byval, this requires a bulk update of the IR tests to include the type before this can be mandatory.
2020-09-12[Clang] Add option to allow marking pass-by-value args as noalias.Florian Hahn1-0/+7
After the recent discussion on cfe-dev 'Can indirect class parameters be noalias?' [1], it seems like using noalias is problematic for current C++, but should be allowed for C-only code. This patch introduces a new option to let the user indicate that it is safe to mark indirect class parameters as noalias. Note that this also applies to external callers, e.g. it might not be safe to use this flag for C functions that are called by C++ functions. In targets that allocate indirect arguments in the called function, this enables more aggressive optimizations with respect to memory operations and brings a ~1% - 2% codesize reduction for some programs. [1] : http://lists.llvm.org/pipermail/cfe-dev/2020-July/066353.html Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D85473
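A sketch of the kind of parameter affected, assuming a target ABI that passes the aggregate indirectly; the review appears to spell the new option -fpass-by-value-is-noalias, but treat that name as an assumption rather than something stated in this log:

    struct Matrix { double m[16]; };

    /* 's' is passed through a hidden pointer on many ABIs; with the option
       enabled for C code, that implicit pointer can be marked noalias */
    double trace(struct Matrix s);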
2020-08-28Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"Cullen Rhodes1-17/+26
This relands D85743 with a fix for test CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass manager with '-fno-experimental-new-pass-manager'. Test was failing due to IR differences with the new pass manager which broke the Fuchsia builder [1]. Reverted in 2e7041f.
[1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375
Original summary:
This patch implements codegen for the 'arm_sve_vector_bits' type attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1]. The purpose of this attribute is to define vector-length-specific (VLS) versions of existing vector-length-agnostic (VLA) types. VLSTs are represented as VectorType in the AST and fixed-length vectors in the IR everywhere except in function args/return. Implemented in this patch is codegen support for the following:
* Implicit casting between VLA <-> VLS types.
* Coercion of VLS types in function args/return.
* Mangling of VLS types.
Casting is handled by the CK_BitCast operation, which has been extended to support the two new vector kinds for fixed-length SVE predicate and data vectors, where the cast is implemented through memory rather than a bitcast which is unsupported. Implementing this as a normal bitcast would require relaxing checks in LLVM to allow bitcasting between scalable and fixed types. Another option was adding target-specific intrinsics, although codegen support would need to be added for these intrinsics. Given this, casting through memory seemed like the best approach as it's supported today and existing optimisations may remove unnecessary loads/stores, although there is room for improvement here.
Coercion of VLSTs in function args/return from fixed to scalable is implemented through the AArch64 ABI in TargetInfo. The VLA and VLS types are defined by the ACLE to map to the same machine-level SVE vectors. VLS types are mangled in the same way as: __SVE_VLS<typename, unsigned> where the first argument is the underlying variable-length type and the second argument is the SVE vector length in bits. For example:
    #if __ARM_FEATURE_SVE_BITS==512
    // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
    typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
    // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
    typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
    #endif
The latest ACLE specification (00bet5) does not contain details of this mangling scheme, it will be specified in the next revision. The mangling scheme is otherwise defined in the appendices to the Procedure Call Standard for the Arm Architecture, see [2] for more information.
[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling
Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85743
2020-08-27Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute"Cullen Rhodes1-26/+17
Test CodeGen/attr-arm-sve-vector-bits-call.c is failing on some builders [1][2]. Reverting whilst I investigate. [1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375 [2] https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112 This reverts commit 42587345a3afc52c03c6e6095db773358a1b03e9.
2020-08-27[CodeGen][AArch64] Support arm_sve_vector_bits attributeCullen Rhodes1-17/+26
This patch implements codegen for the 'arm_sve_vector_bits' type attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1]. The purpose of this attribute is to define vector-length-specific (VLS) versions of existing vector-length-agnostic (VLA) types. VLSTs are represented as VectorType in the AST and fixed-length vectors in the IR everywhere except in function args/return. Implemented in this patch is codegen support for the following:
* Implicit casting between VLA <-> VLS types.
* Coercion of VLS types in function args/return.
* Mangling of VLS types.
Casting is handled by the CK_BitCast operation, which has been extended to support the two new vector kinds for fixed-length SVE predicate and data vectors, where the cast is implemented through memory rather than a bitcast which is unsupported. Implementing this as a normal bitcast would require relaxing checks in LLVM to allow bitcasting between scalable and fixed types. Another option was adding target-specific intrinsics, although codegen support would need to be added for these intrinsics. Given this, casting through memory seemed like the best approach as it's supported today and existing optimisations may remove unnecessary loads/stores, although there is room for improvement here.
Coercion of VLSTs in function args/return from fixed to scalable is implemented through the AArch64 ABI in TargetInfo. The VLA and VLS types are defined by the ACLE to map to the same machine-level SVE vectors. VLS types are mangled in the same way as: __SVE_VLS<typename, unsigned> where the first argument is the underlying variable-length type and the second argument is the SVE vector length in bits. For example:
    #if __ARM_FEATURE_SVE_BITS==512
    // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
    typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
    // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
    typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));
    #endif
The latest ACLE specification (00bet5) does not contain details of this mangling scheme, it will be specified in the next revision. The mangling scheme is otherwise defined in the appendices to the Procedure Call Standard for the Arm Architecture, see [2] for more information.
[1] https://developer.arm.com/documentation/100987/latest
[2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling
Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85743
2020-08-18[clang codegen] Use IR "align" attribute for static array arguments.Eli Friedman1-4/+12
Without the "align" attribute, marking the argument dereferenceable is basically useless. See also D80166. Fixes https://bugs.llvm.org/show_bug.cgi?id=46876 . Differential Revision: https://reviews.llvm.org/D84992
2020-08-06clang: Use byref for aggregate kernel argumentsMatt Arsenault1-8/+32
Add address space to indirect abi info and use it for kernels. Previously, indirect arguments assumed a stack-passed object in the alloca address space using byval. A stack pointer is unsuitable for kernel arguments, which are passed in a separate, constant buffer with a different address space. Start using the new byref for aggregate kernel arguments. Previously these were emitted as raw struct arguments, and turned into loads in the backend. These will lower identically, although with byref you now have the option of applying an explicit alignment. In the future, a reasonable implementation would use byref for all kernel arguments (this would be a practical problem at the moment due to losing things like noalias on pointer arguments). This is mostly to avoid fighting the optimizer's treatment of aggregate load/store. SROA and instcombine both turn aggregate loads and stores into a long sequence of element loads and stores, rather than the optimizable memcpy I would expect in this situation. Now an explicit memcpy will be introduced up-front, which is better understood and helps eliminate the alloca in more situations. This skips using byref in the case where HIP kernel pointer arguments in structs are promoted to global pointers. At minimum an additional patch is needed to allow coercion with indirect arguments. This also skips using it for OpenCL due to the current workaround used to support kernels calling kernels. Distinct function bodies would need to be generated up front instead of emitting an illegal call.
2020-07-15[CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwindAkira Hatanaka1-0/+4
This fixes cases where an invoke is emitted, despite the called llvm function being marked nounwind, because ConstructAttributeList failed to add the attribute to the attribute list. llvm optimization passes turn invokes into calls and optimize away the exception handling code, but it's better to avoid emitting the code in the front-end if the called function is known not to raise an exception. Differential Revision: https://reviews.llvm.org/D83906
2020-07-10Change behavior with zero-sized static array extentsAaron Ballman1-2/+4
Previously, Clang diagnosed this code by default:
    void f(int a[static 0]);
saying that "static has no effect on zero-length arrays", which was accurate. However, static array extents require that the caller of the function pass a nonnull pointer to an array of *at least* that number of elements, but it can pass more (see C17 6.7.6.3p6). Given that we allow zero-sized arrays as a GNU extension and that it's valid to pass more elements than specified by the static array extent, we now support zero-sized static array extents with the usual semantics, because they can be useful in cases like:
    void my_bzero(char p[static 0], int n);
    my_bzero(&c+1, 0);   //ok
    my_bzero(t+k, n-k);  //ok, pattern from actual code