path: root/clang/lib/CodeGen/TargetInfo.cpp
Age | Commit message | Author | Files | Lines
2023-01-27 | [SystemZ] Fix handling of vectors and their exposure of the vector ABI. | Jonas Paulsson | 1 | -29/+65
- Global vector variables expose the vector ABI through their alignments only if they are >=16 bytes in size.
- Vectors passed between functions expose the vector ABI only if they are <=16 bytes in size.
LLVM test suite builds with gcc and clang now emit the same GNU attributes.
Reviewed By: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D141409
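The two size rules in this commit message can be restated as a pair of predicates. This is only a sketch: the function names and the `kVectorRegBytes` constant are hypothetical and do not come from the Clang sources; only the 16-byte threshold and the direction of each comparison are taken from the message above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant: the 16-byte threshold quoted in the commit message.
constexpr std::size_t kVectorRegBytes = 16;

// A global vector variable exposes the vector ABI (through its alignment)
// only when it is at least 16 bytes in size.
constexpr bool globalExposesVectorABI(std::size_t VecBytes) {
  return VecBytes >= kVectorRegBytes;
}

// A vector passed between functions exposes the vector ABI only when it is
// at most 16 bytes in size.
constexpr bool argExposesVectorABI(std::size_t VecBytes) {
  return VecBytes <= kVectorRegBytes;
}
```

A 16-byte vector satisfies both predicates; smaller vectors matter only as arguments and larger ones only as globals.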
2023-01-24 | [clang][RISCV] Fix ABI mismatch between GCC and Clang (extension of integers on stack) | Alex Bradbury | 1 | -11/+3
See <https://github.com/llvm/llvm-project/issues/57261> for full details. Essentially, a previous version of the psABI indicated (by my reading) that integer scalars passed on the stack were anyext. A [later commit](https://github.com/riscv-non-isa/riscv-elf-psabi-doc/commit/cec39a064ee0e5b0129973fffab7e3ad1710498f) changed this to indicate that they are in fact signext/zeroext just as if they were passed in registers. This patch adds the change in the release notes but doesn't add a flag to retain the old behaviour. The hope is that it's sufficiently hard to trigger an issue due to this that it isn't worthwhile doing so.
Differential Revision: https://reviews.llvm.org/D140401
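To make the anyext versus signext/zeroext distinction concrete, the sketch below models a 64-bit stack slot as a plain integer. The helper names are hypothetical, and the model says nothing about how Clang emits the attributes; it only shows which upper bits a callee may rely on once the caller extends the value.

```cpp
#include <cassert>
#include <cstdint>

// signext: the caller fills the upper bits of the slot with copies of the
// sign bit, so the callee can read the whole slot as a signed value.
constexpr std::uint64_t signExtendToSlot(std::int32_t V) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(V));
}

// zeroext: the caller clears the upper bits of the slot.
constexpr std::uint64_t zeroExtendToSlot(std::uint8_t V) {
  return static_cast<std::uint64_t>(V);
}
```

With anyext, by contrast, the upper bits would be unspecified and the callee would have to extend the value itself before using the full slot.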
2023-01-13 | [clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment | Guillaume Chatelet | 1 | -5/+5
2023-01-11 | [clang][NFC] Use TypeSize::getXXXValue() instead of TypeSize::getXXXSize() | Guillaume Chatelet | 1 | -2/+2
This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.
2023-01-09 | Move from llvm::makeArrayRef to ArrayRef deduction guides - clang/ part | serge-sans-paille | 1 | -1/+1
This is a follow-up to https://reviews.llvm.org/D140896, split into several parts as it touches a lot of files.
Differential Revision: https://reviews.llvm.org/D141139
2023-01-07 | Revert "AMDGPU: Invert handling of enqueued block detection" | Matt Arsenault | 1 | -9/+0
This reverts commit 47288cc977fa31c44cc92b4e65044a5b75c2597e. The runtime is having trouble with this at -O0 when the inputs are always enabled.
2023-01-07 | clang/AMDGPU: Force disable block enqueue arguments for HIP | Matt Arsenault | 1 | -0/+9
This is a dirty, dirty hack to work around bot failures at -O0. Currently these fields are only used by OpenCL features and evidently the HIP runtime isn't expecting to see them in HIP programs. The code objects should be language agnostic, so just force optimize these out until the runtime is fixed.
2022-12-21 | [clang] Do not extend i8 return values to i16 on AVR. | Ben Shi | 1 | -0/+4
Reviewed By: Miss_Grape, aykevl
Differential Revision: https://reviews.llvm.org/D139908
2022-12-20 | [CodeGen][AArch64] Fix AArch64ABIInfo::EmitAAPCSVAArg crash with empty record type in variadic arg | yronglin | 1 | -0/+10
Fix AArch64ABIInfo::EmitAAPCSVAArg crash with empty record type in variadic arg
Open issue: https://github.com/llvm/llvm-project/issues/59034
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D138511
2022-12-06 | [SystemZ] Emit a .gnu_attribute for an externally visible vector abi. | Jonas Paulsson | 1 | -7/+89
On SystemZ, the vector ABI changes depending on the presence of hardware vector support. Therefore, each binary compiled with a visible vector ABI (e.g. one that calls an external function with a vector argument) should be marked with a .gnu_attribute describing this.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D105067
2022-11-30 | [clang][TargetInfo] Use LangAS for getPointer{Width,Align}() | Alex Richardson | 1 | -7/+8
Mixing LLVM and Clang address spaces can result in subtle bugs, and there is no need for this hook to use the LLVM IR level address spaces. Most of this change is just replacing zero with LangAS::Default, but it also allows us to remove a few calls to getTargetAddressSpace(). This also removes a stale comment+workaround in CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does return the expected size for ReferenceType (and handles address spaces).
Differential Revision: https://reviews.llvm.org/D138295
2022-11-19 | [CodeGen][ARM] Fix ARMABIInfo::EmitVAArg crash with empty record type variadic arg | yronglin | 1 | -4/+4
Fix ARMABIInfo::EmitVAArg crash with empty record type variadic arg
Open issue: https://github.com/llvm/llvm-project/issues/58794
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D138137
2022-11-17 | [clang] Fix wrong ABI of AVRTiny. | Ben Shi | 1 | -9/+10
On an AVRTiny device, a scalar larger than 4 bytes should be returned via a stack slot.
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D138125
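The rule above amounts to a size check on the return type. In the sketch below the function name is hypothetical, and the 8-byte limit used for non-Tiny AVR devices is an assumption based on the avr-gcc calling convention rather than something this commit message states; only the 4-byte AVRTiny limit comes from the text.

```cpp
#include <cassert>

// Hypothetical helper: true when a scalar return value of the given size is
// returned through a stack slot rather than in registers.
constexpr bool returnsViaStackSlot(unsigned SizeBytes, bool IsAVRTiny) {
  // 4 bytes on AVRTiny is from the commit message; 8 bytes for other AVR
  // devices is an assumption taken from the avr-gcc convention.
  const unsigned MaxRegisterBytes = IsAVRTiny ? 4u : 8u;
  return SizeBytes > MaxRegisterBytes;
}
```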
2022-11-08 | Fix duplicate word typos; NFC | Rageking8 | 1 | -2/+2
This revision fixes typos where there are 2 consecutive words which are duplicated. There should be no code changes in this revision (only changes to comments and docs). Do let me know if there are any undesirable changes in this revision. Thanks.
2022-10-20 | [HLSL] Disable integer promotion to avoid int16_t being promoted to int for HLSL. | Xiang Li | 1 | -2/+2
short will be promoted to int in UsualUnaryConversions. Disable it for HLSL to keep int16_t as 16-bit.
Reviewed By: aaron.ballman, rjmccall
Differential Revision: https://reviews.llvm.org/D133668
2022-10-16 | [clang][PowerPC] PPC64 VAArg fix right-alignment for aggregates fit in register | Ting Wang | 1 | -8/+29
The PPC64 ABI passes aggregates smaller than a register in the least significant bits of the register. In the case of variadic functions, they end up right-aligned in their argument slots in the argument area on big-endian targets. Apply right-alignment for these aggregates.
Fixes #55900.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D133338
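Right-alignment within an argument slot is a small offset calculation. The sketch below assumes 8-byte argument slots (the PPC64 register width); the names `kSlotBytes` and `offsetInSlot` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>

// Assumed slot size: one 64-bit register.
constexpr unsigned kSlotBytes = 8;

// Byte offset of a small aggregate inside its argument slot. On big-endian
// targets the least significant bytes sit at the end of the slot, so an
// aggregate smaller than the slot starts (slot size - aggregate size) bytes in.
constexpr unsigned offsetInSlot(unsigned AggBytes, bool BigEndian) {
  if (!BigEndian || AggBytes >= kSlotBytes)
    return 0;
  return kSlotBytes - AggBytes;
}
```

For a 3-byte struct this gives offset 5 on a big-endian target and offset 0 on a little-endian one.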
2022-10-16 | [clang] Use std::clamp (NFC) | Kazu Hirata | 1 | -3/+3
Note that the constructor of MipsABIInfo guarantees that MinABIStackAlignInBytes <= StackAlignInBytes, so we can use std::clamp safely.
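The safety remark matters because `std::clamp(v, lo, hi)` has undefined behavior when `hi` is less than `lo`, whereas the older `std::min(std::max(v, lo), hi)` idiom is always well-defined. A minimal sketch follows, with a hypothetical wrapper name and an assertion standing in for the guarantee that the message attributes to the MipsABIInfo constructor:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical wrapper: clamps an alignment into [Min, Max].
inline unsigned clampAlign(unsigned Align, unsigned Min, unsigned Max) {
  // std::clamp requires Min <= Max; check the precondition explicitly.
  assert(Min <= Max && "std::clamp requires lo <= hi");
  return std::clamp(Align, Min, Max);
}
```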
2022-10-06 | [OpenMP][AMDGPU] Add 'uniform-work-group' attribute to OpenMP kernels | Joseph Huber | 1 | -1/+5
The `cl-uniform-work-group` attribute asserts that the global work-size is a multiple of the specified work-group size. This should allow optimizations. It is already present by default in the AMD compiler and for HIP kernels, so it should be safe to allow this for OpenMP kernels by default.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135374
2022-10-01 | [Clang][AArch64] Support AArch64 target(..) attribute formats. | David Green | 1 | -9/+10
This adds support under AArch64 for the target("..") attributes. The current parsing is very X86-shaped; this patch attempts to bring it in line with the GCC implementation from https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes. The supported formats are:
- "arch=<arch>" strings, which specify the architecture features for a function as per the -march=arch+feature option.
- "cpu=<cpu>" strings, which specify the target-cpu and any implied attributes as per the -mcpu=cpu+feature option.
- "tune=<cpu>" strings, which specify the tune-cpu for a function as per -mtune.
- "+<feature>", "+no<feature>" enables/disables the specific feature, for compatibility with GCC target attributes.
- "<feature>", "no-<feature>" enables/disables the specific feature, for backward compatibility with previous releases.
To do this, the parsing of target attributes has been moved into TargetInfo to give the target the opportunity to override the existing parsing. The only non-aarch64 change should be a minor alteration to the error message, specifying using "CPU" to describe the cpu, not "architecture", and the DuplicateArch/Tune from ParsedTargetAttr have been combined into a single option.
Differential Revision: https://reviews.llvm.org/D133848
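The five formats listed above can be told apart by prefix alone. The classifier below is a simplified sketch and not the parsing code from the patch: `Kind`, `Item`, and `classify` are hypothetical names, and it handles a single comma-free item without validating architecture, CPU, or feature names.

```cpp
#include <cassert>
#include <string>
#include <string_view>

enum class Kind { Arch, Cpu, Tune, EnableFeature, DisableFeature };

struct Item {
  Kind K;
  std::string Value;  // text after the recognized prefix
};

inline bool startsWith(std::string_view S, std::string_view Prefix) {
  return S.substr(0, Prefix.size()) == Prefix;
}

// Classify one item of a target("...") string by its prefix.
inline Item classify(std::string_view S) {
  if (startsWith(S, "arch="))
    return {Kind::Arch, std::string(S.substr(5))};
  if (startsWith(S, "cpu="))
    return {Kind::Cpu, std::string(S.substr(4))};
  if (startsWith(S, "tune="))
    return {Kind::Tune, std::string(S.substr(5))};
  // "+no<feature>" must be tested before the plain "+<feature>" form.
  if (startsWith(S, "+no"))
    return {Kind::DisableFeature, std::string(S.substr(3))};
  if (startsWith(S, "+"))
    return {Kind::EnableFeature, std::string(S.substr(1))};
  if (startsWith(S, "no-"))
    return {Kind::DisableFeature, std::string(S.substr(3))};
  return {Kind::EnableFeature, std::string(S)};
}
```

For example, `classify("+nosve")` and `classify("no-sve")` both yield a `DisableFeature` item with value `"sve"`, covering the GCC-compatible spelling and the backward-compatible one.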
2022-09-20 | [X86][fastcall][vectorcall] Move capability check before free register update | Phoebe Wang | 1 | -12/+11
When passing arguments with `__fastcall` or `__vectorcall` in 32-bit MSVC, subsequent arguments may still be passed in registers if the current one could not be. `__regcall` from ICC behaves the opposite way: https://godbolt.org/z/4MPbzhaMG
None of the three calling conventions is supported in GCC.
Fixes: #57737
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D133920
2022-09-19 | [Clang][LoongArch] Implement ABI lowering | Weining Lu | 1 | -0/+443
Reuse most of RISCV's implementation with several exceptions:
1. Assign the signext/zeroext attribute to args passed on the stack. On RISCV, integer scalars passed in registers have signext/zeroext when promoted, but are anyext if passed on the stack. This is defined in an early RISCV ABI specification. But after this change [1], integers should also be signext/zeroext if passed on the stack, so I think RISCV's ABI lowering should be updated [2]. In the LoongArch ABI spec, integer scalars narrower than GRLEN bits are zero/sign-extended whether they are passed in registers or on the stack.
2. Zero-width bit fields are ignored. This matches GCC's behavior but it hasn't been documented in the ABI spec. See https://gcc.gnu.org/r12-8294.
3. `char` is signed by default. Another difference worth mentioning is that `char` is signed by default on LoongArch while it is unsigned on RISCV.
This patch also adds `_BitInt` type support to LoongArch and handles it in LoongArchABIInfo::classifyArgumentType.
[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/commit/cec39a064ee0e5b0129973fffab7e3ad1710498f
[2] https://github.com/llvm/llvm-project/issues/57261
Differential Revision: https://reviews.llvm.org/D132285
2022-08-27 | Use std::clamp (NFC) | Kazu Hirata | 1 | -1/+1
This patch replaces clamp idioms with std::clamp where the range is obviously valid from the source code (that is, low <= high) to avoid introducing undefined behavior.
2022-08-20 | Use llvm::drop_begin (NFC) | Kazu Hirata | 1 | -2/+2
2022-08-19 | [clang][RISCV] Fix incorrect ABI lowering for inherited structs under hard-float ABIs | Alex Bradbury | 1 | -1/+14
The hard float ABIs have a rule that if a flattened struct contains either a single fp value, or an int+fp, or fp+fp then it may be passed in a pair of registers (if sufficient GPRs+FPRs are available). detectFPCCEligibleStruct and the helper it calls, detectFPCCEligibleStructHelper, examine the type of the argument/return value to determine if it complies with the requirements for this ABI rule. As reported in bug #57084, this logic produces incorrect results for C++ structs that inherit from other structs. This is because only the fields of the struct were examined, but enumerating RD->fields misses any fields in inherited C++ structs. This patch corrects that issue by adding appropriate logic to enumerate any included base structs.
Differential Revision: https://reviews.llvm.org/D131677
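The bug described here comes down to a flattening routine that visits a record's own fields but not those of its bases. The sketch below uses a hypothetical `Record` model with field types stored as strings; it is not Clang's `RecordDecl` API, and it only shows why base classes have to be visited before the record's own fields.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record model: direct bases plus the record's own fields.
struct Record {
  std::vector<const Record*> Bases;
  std::vector<std::string> Fields;  // scalar field types, e.g. "float"
};

// Append the flattened field list of R to Out, visiting bases first, since
// base subobjects precede the derived class's own fields in the layout.
inline void flattenInto(const Record& R, std::vector<std::string>& Out) {
  for (const Record* Base : R.Bases)
    flattenInto(*Base, Out);
  Out.insert(Out.end(), R.Fields.begin(), R.Fields.end());
}

inline std::vector<std::string> flattenedFields(const Record& R) {
  std::vector<std::string> Out;
  flattenInto(R, Out);
  return Out;
}
```

A derived struct with one `int` field and a base holding one `float` flattens to `float, int`, which is an fp+int candidate; looking only at the derived struct's own fields would report a lone `int`.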
2022-08-18 | [Clang][BPF] Support record argument with direct values | Yonghong Song | 1 | -0/+36
Currently, record arguments are always passed by reference by allocating space for record values in the caller. This is less efficient for small records which may take one or two registers. For example, for x86_64 and aarch64, for a record size up to 16 bytes, the record values can be passed by value directly in registers. This patch adds BPF support for record arguments with direct values for record sizes up to 16 bytes. If the record size is 0, the record will not take any register, which is the same behavior as for x86_64 and aarch64. If the record size is greater than 16 bytes, the record argument will be passed by reference.
Differential Revision: https://reviews.llvm.org/D132144
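The three size cases in this message map onto a small classification function. The enum and function names below are hypothetical; the thresholds (0 bytes, up to 16 bytes, more than 16 bytes) are the ones stated above.

```cpp
#include <cassert>
#include <cstdint>

enum class PassKind { Ignore, Direct, Indirect };

// Classify a record argument by its size in bytes.
constexpr PassKind classifyRecordArg(std::uint64_t SizeBytes) {
  if (SizeBytes == 0)
    return PassKind::Ignore;    // empty record: takes no register
  if (SizeBytes <= 16)
    return PassKind::Direct;    // passed by value in one or two registers
  return PassKind::Indirect;    // passed by reference
}
```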
2022-08-16 | [Clang][BPF]: Force sign/zero extension for return values in caller | Yonghong Song | 1 | -0/+53
Currently bpf supports calling kernel functions (x86_64, arm64, etc.) in bpf programs. Tejun discovered a problem where the x86_64 func return value (an unsigned char type) is stored in the 8-bit subregister %al and the other 56 bits in %rax might be garbage. But based on the current bpf ABI, the bpf program assumes the whole %rax holds the correct value, as the callee is supposed to do the necessary sign/zero extension. This mismatch between bpf and x86_64 caused incorrect results.
To resolve this problem, this patch forces the caller to do the needed sign/zero extension for 8/16-bit return values as well. Note that 32-bit return values already had sign/zero extension even without this patch.
For example, for the test case attached to this patch:
    $ cat t.c
    _Bool bar_bool(void);
    unsigned char bar_char(void);
    short bar_short(void);
    int bar_int(void);
    int foo_bool(void) {
      if (bar_bool() != 1) return 0; else return 1;
    }
    int foo_char(void) {
      if (bar_char() != 10) return 0; else return 1;
    }
    int foo_short(void) {
      if (bar_short() != 10) return 0; else return 1;
    }
    int foo_int(void) {
      if (bar_int() != 10) return 0; else return 1;
    }
Without this patch, the generated call insns in IR look like:
    %call = call zeroext i1 @bar_bool()
    %call = call zeroext i8 @bar_char()
    %call = call signext i16 @bar_short()
    %call = call i32 @bar_int()
So it is assumed that zero extension has been done for the return values of bar_bool() and bar_char(). Sign extension has been done for the return value of bar_short(). The return value of bar_int() does not carry any assumption, so the caller needs to do the necessary shifting to get correct 32-bit values.
With this patch, the generated call insns in IR look like:
    %call = call i1 @bar_bool()
    %call = call i8 @bar_char()
    %call = call i16 @bar_short()
    %call = call i32 @bar_int()
There are no assumptions for the return values of the above four function calls, so shifting is necessary for all of them.
The following is the objdump difference for function foo_char().
Without this patch:
    0000000000000010 <foo_char>:
           2: 85 10 00 00 ff ff ff ff call -1
           3: bf 01 00 00 00 00 00 00 r1 = r0
           4: b7 00 00 00 01 00 00 00 r0 = 1
           5: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
           6: b7 00 00 00 00 00 00 00 r0 = 0
    0000000000000038 <LBB1_2>:
           7: 95 00 00 00 00 00 00 00 exit
With this patch:
    0000000000000018 <foo_char>:
           3: 85 10 00 00 ff ff ff ff call -1
           4: bf 01 00 00 00 00 00 00 r1 = r0
           5: 57 01 00 00 ff 00 00 00 r1 &= 255
           6: b7 00 00 00 01 00 00 00 r0 = 1
           7: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
           8: b7 00 00 00 00 00 00 00 r0 = 0
    0000000000000048 <LBB1_2>:
           9: 95 00 00 00 00 00 00 00 exit
The zero extension of the returned 'char' value is done by the `r1 &= 255` instruction.
Differential Revision: https://reviews.llvm.org/D131598
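The caller-side fix can be modelled by treating the 64-bit return register as a plain integer whose upper bits may hold garbage. The helper names below are hypothetical and the code is ordinary C++, not BPF; the mask corresponds to the `r1 &= 255` instruction in the listing above, and the 16-bit case shows the analogous sign extension.

```cpp
#include <cassert>
#include <cstdint>

// Caller-side zero extension of an 8-bit return value: keep only the low
// byte of the 64-bit register, discarding whatever the callee left above it.
constexpr std::uint64_t callerZeroExtend8(std::uint64_t Reg) {
  return Reg & 0xFF;
}

// Caller-side sign extension of a 16-bit return value: truncate to 16 bits,
// then widen as a signed value. The unsigned-to-signed narrowing is modular
// on mainstream compilers and guaranteed to be from C++20 onward.
constexpr std::int64_t callerSignExtend16(std::uint64_t Reg) {
  return static_cast<std::int16_t>(static_cast<std::uint16_t>(Reg));
}
```

With garbage in the upper bits, such as `0xDEADBEEF0000000A`, comparing the raw register against 10 fails, while the masked value compares equal; this is the mismatch the patch removes.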
2022-08-10 | [X86][BF16] Enable __bf16 for x86 targets. | Freddy Ye | 1 | -6/+7
The X86 psABI has been updated to support the __bf16 type, the ABI of which is the same as FP16. See https://discourse.llvm.org/t/patch-add-optional-bfloat16-support/63149
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D130964
2022-08-08 | [clang][CodeGen] Factor out Swift ABI hooks (NFCI) | Sergei Barannikov | 1 | -126/+103
Swift calling conventions stand out in that they are lowered in a mostly target-independent manner, with very few customization points. As such, swift-related methods of ABIInfo do not reference the rest of ABIInfo and vice versa. This change follows the interface segregation principle; it removes the dependency of SwiftABIInfo on ABIInfo. Targets must now implement SwiftABIInfo separately if they support Swift calling conventions. Almost all targets implemented `shouldPassIndirectly` the same way. This de-facto default implementation has been moved into the base class. `isSwiftErrorInRegister` used to be virtual, now it is not. It didn't accept any arguments which could have an effect on the returned value. This is now a static property of the target ABI.
Reviewed By: rusyaev-roman, inclyc
Differential Revision: https://reviews.llvm.org/D130394
2022-07-23 | Use llvm::sort instead of std::sort where possible | Dmitri Gribenko | 1 | -1/+1
llvm::sort is beneficial even when we use the iterator-based overload, since it can optionally shuffle the elements (to detect non-determinism). However llvm::sort is not usable everywhere, for example, in compiler-rt.
Reviewed By: nhaehnle
Differential Revision: https://reviews.llvm.org/D130406
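The idea of shuffling before sorting can be sketched with the standard library alone. This is not `llvm::sort` itself: `shuffledSort` is a hypothetical name, and the explicit seed parameter is an assumption made so the example stays reproducible. The point is that an unstable sort may order equal elements differently depending on the input order, so randomizing that order first makes any hidden reliance on it show up as differing output.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Shuffle the input, then sort it. Code whose result depends on the relative
// order of elements that compare equal will now vary with the seed.
template <typename T, typename Compare>
void shuffledSort(std::vector<T>& V, Compare Cmp, unsigned Seed) {
  std::mt19937 Rng(Seed);
  std::shuffle(V.begin(), V.end(), Rng);
  std::sort(V.begin(), V.end(), Cmp);
}
```

For a comparator that orders every pair of distinct elements, the result is the same for any seed; only comparators that leave ties can expose a difference.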
2022-07-22 | [CUDA/SPIR-V] Force passing aggregate type byval | Shangwu Yao | 1 | -0/+9
This patch forces aggregate types in kernel arguments to be copied by value when compiling CUDA targeting SPIR-V. The original behavior was not to pass by value when the user defined any of a destructor, copy constructor, or move constructor. This patch makes the behavior of SPIR-V generated from CUDA follow the CUDA spec (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-function-argument-processing) and match the NVPTX implementation (https://github.com/llvm/llvm-project/blob/41958f76d8a2c47484fa176cba1de565cfe84de7/clang/lib/CodeGen/TargetInfo.cpp#L7241).
Differential Revision: https://reviews.llvm.org/D130387
2022-07-22 | [clang][CodeGen] Only include ABIInfo.h where required (NFC) | Sergei Barannikov | 1 | -0/+3
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D130322
2022-06-20 | [clang] Don't use Optional::getValue (NFC) | Kazu Hirata | 1 | -1/+1
2022-06-10 | [ARM] Fix how size-0 bitfields affect homogeneous aggregates. | Simon Tatham | 1 | -2/+27
By both AAPCS32 and AAPCS64, the test for whether an aggregate qualifies as homogeneous (either HFA or HVA) is based on the data layout alone. So any logical member of the structure that does not affect the data layout also should not affect homogeneity. In particular, an empty bitfield ('int : 0') should make no difference. In fact, clang considered it to make a difference in C but not in C++, and justified that policy as compatible with gcc. But that's considered a bug in gcc as well (at least for Arm targets), and it's fixed in gcc 12.1. This fix mimics gcc's: zero-sized bitfields are now ignored in all languages for the Arm (32- and 64-bit) ABIs. But I've left the previous behaviour unchanged in other ABIs, by means of adding an ABIInfo::isZeroLengthBitfieldPermittedInHomogeneousAggregate query method which the Arm subclasses override.
Reviewed By: lenary
Differential Revision: https://reviews.llvm.org/D127197
2022-06-02 | [PS5] Make passing unions in registers match PS4 ABI | Paul Robinson | 1 | -1/+1
2022-06-02 | [PS5] Classify __m64 as integer, matching PS4 ABI | Paul Robinson | 1 | -1/+1
2022-05-31 | [Clang][CSKY] Add support about CSKYABIInfo | Zi Xuan Wu (Zeson) | 1 | -0/+167
Construct the ABIInfo according to the CSKY ABIv2 document (https://github.com/c-sky/csky-doc/blob/master/C-SKY_V2_CPU_Applications_Binary_Interface_Standards_Manual.pdf) to handle argument passing and return values for clang data types. It also covers how to emit and expand the VAArg intrinsic.
Differential Revision: https://reviews.llvm.org/D126451
2022-04-26 | [SystemZ] Fix C++ ABI for passing args of structs containing zero width bitfield. | Jonas Paulsson | 1 | -4/+1
A struct like { float a; int :0; } should per the SystemZ ABI be passed in a GPR, but to match a bug in GCC it has been passed in an FPR (see 759449c). GCC has now corrected the C++ ABI for this case, and this patch for clang follows suit.
Reviewed By: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D122388
2022-03-31 | Fix the build after cd26190a10fceb6e1472fabcd9e1736f62f078c4 | Aaron Ballman | 1 | -1/+1
These variables were being used uninitialized and it caused a significant number of test failures on Windows.
2022-03-29 | [X86][regcall] Support passing / returning structures | Phoebe Wang | 1 | -27/+50
Currently, the regcall calling convention in Clang doesn't match ICC when passing / returning structures. https://godbolt.org/z/axxKMKrW7 This patch tries to fix the problem to match ICC.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D122104
2022-03-24 | [clang][AVR] Implement standard calling convention for AVR and AVRTiny | Ben Shi | 1 | -16/+84
This patch implements avr-gcc's calling convention: https://gcc.gnu.org/wiki/avr-gcc#Calling_Convention
Reviewed By: aykevl
Differential Revision: https://reviews.llvm.org/D120720
2022-03-22 | [NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in TargetInfo.cpp | Akira Hatanaka | 1 | -42/+49
Differential Revision: https://reviews.llvm.org/D122199
2022-03-18 | Use llvm::append_range instead of push_back loops where applicable. NFCI. | Benjamin Kramer | 1 | -7/+3
2022-03-17 | [CodeGen] Avoid pointer element type access for blocks | Nikita Popov | 1 | -4/+3
Pass the block struct type down to the TargetInfo hooks.
2022-02-24 | [CUDA][SPIRV] Assign global address space to CUDA kernel arguments | Shangwu Yao | 1 | -3/+3
(resubmit https://reviews.llvm.org/D119207 after fixing the test for some build settings) This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types.
Differential Revision: https://reviews.llvm.org/D120366
2022-02-17 | [clang] Remove Address::deprecated() in emitVoidPtrDirectVAArg() | Arthur Eubanks | 1 | -3/+3
2022-02-17 | Revert "[CUDA][SPIRV] Assign global address space to CUDA kernel arguments" | Matthew Voss | 1 | -3/+3
This reverts commit 9de4fc0f2d3b60542956f7e5254951d049edeb1f. Reverting due to test failure: https://lab.llvm.org/buildbot/#/builders/139/builds/17199
2022-02-17 | [CUDA][SPIRV] Assign global address space to CUDA kernel arguments | Shangwu Yao | 1 | -3/+3
This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types.
Differential Revision: https://reviews.llvm.org/D119207
2022-02-17 | [CodeGen] Rename deprecated Address constructor | Nikita Popov | 1 | -64/+69
To make uses of the deprecated constructor easier to spot, and to ensure that no new uses are introduced, rename it to Address::deprecated(). While doing the rename, I've filled in element types in cases where it was relatively obvious, but we're still left with 135 calls to the deprecated constructor.
2022-02-14 | [CGBuilder] Remove CreateBitCast() method | Nikita Popov | 1 | -13/+10
Use CreateElementBitCast() instead, or don't work on Address where not necessary.
2022-02-04 | [clang][CodeGen] Use memory type representation in `va_arg` | Jan Svoboda | 1 | -1/+2
Some types (e.g. `_Bool`) have different scalar and memory representations. CodeGen for `va_arg` didn't take this into account, leading to assertion failures for such types. This patch makes sure we use the memory representation for `va_arg`.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D118904