riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
2023-01-27	[SystemZ] Fix handling of vectors and their exposure of the vector ABI.	Jonas Paulsson	1	-29/+65
	- Global vector variables expose the vector ABI through their alignments only if they are >=16 bytes in size.
	- Vectors passed between functions expose the vector ABI only if they are <=16 bytes in size.
	LLVM test suite builds with gcc and clang now emit the same gnu attributes.
	Reviewed By: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D141409
2023-01-24	[clang][RISCV] Fix ABI mismatch between GCC and Clang (extension of integers on stack)	Alex Bradbury	1	-11/+3
	See <https://github.com/llvm/llvm-project/issues/57261> for full details. Essentially, a previous version of the psABI indicated (by my reading) that integer scalars passed on the stack were anyext. A [later commit](https://github.com/riscv-non-isa/riscv-elf-psabi-doc/commit/cec39a064ee0e5b0129973fffab7e3ad1710498f) changed this to indicate that they are in fact signext/zeroext just as if they were passed in registers. This patch adds the change in the release notes but doesn't add a flag to retain the old behaviour. The hope is that it's sufficiently hard to trigger an issue due to this that it isn't worthwhile doing so. Differential Revision: https://reviews.llvm.org/D140401
2023-01-13	[clang][NFC] Remove dependency on DataLayout::getPrefTypeAlignment	Guillaume Chatelet	1	-5/+5

2023-01-11	[clang][NFC] Use the TypeSize::getXXXValue() instead of TypeSize::getXXXSize()	Guillaume Chatelet	1	-2/+2
	This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.
2023-01-09	Move from llvm::makeArrayRef to ArrayRef deduction guides - clang/ part	serge-sans-paille	1	-1/+1
	This is a follow-up to https://reviews.llvm.org/D140896, split into several parts as it touches a lot of files. Differential Revision: https://reviews.llvm.org/D141139
2023-01-07	Revert "AMDGPU: Invert handling of enqueued block detection"	Matt Arsenault	1	-9/+0
	This reverts commit 47288cc977fa31c44cc92b4e65044a5b75c2597e. The runtime is having trouble with this at -O0 when the inputs are always enabled.
2023-01-07	clang/AMDGPU: Force disable block enqueue arguments for HIP	Matt Arsenault	1	-0/+9
	This is a dirty, dirty hack to work around bot failures at -O0. Currently these fields are only used by OpenCL features and evidently the HIP runtime isn't expecting to see them in HIP programs. The code objects should be language agnostic, so just force optimize these out until the runtime is fixed.
2022-12-21	[clang] Do not extend i8 return values to i16 on AVR.	Ben Shi	1	-0/+4
	Reviewed By: Miss_Grape, aykevl Differential Revision: https://reviews.llvm.org/D139908
2022-12-20	[CodeGen][AArch64] Fix AArch64ABIInfo::EmitAAPCSVAArg crash with empty record type in variadic arg	yronglin	1	-0/+10
	Fix AArch64ABIInfo::EmitAAPCSVAArg crash with empty record type in variadic arg. Open issue: https://github.com/llvm/llvm-project/issues/59034 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D138511
2022-12-06	[SystemZ] Emit a .gnu_attribute for an externally visible vector abi.	Jonas Paulsson	1	-7/+89
	On SystemZ, the vector ABI changes depending on the presence of hardware vector support. Therefore, each binary compiled with a visible vector ABI (e.g. one that calls an external function with a vector argument) should be marked with a .gnu_attribute describing this. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D105067
2022-11-30	[clang][TargetInfo] Use LangAS for getPointer{Width,Align}()	Alex Richardson	1	-7/+8
	Mixing LLVM and Clang address spaces can result in subtle bugs, and there is no need for this hook to use the LLVM IR level address spaces. Most of this change is just replacing zero with LangAS::Default, but it also allows us to remove a few calls to getTargetAddressSpace(). This also removes a stale comment+workaround in CGDebugInfo::CreatePointerLikeType(): ASTContext::getTypeSize() does return the expected size for ReferenceType (and handles address spaces). Differential Revision: https://reviews.llvm.org/D138295
2022-11-19	[CodeGen][ARM] Fix ARMABIInfo::EmitVAArg crash with empty record type variadic arg	yronglin	1	-4/+4
	Fix ARMABIInfo::EmitVAArg crash with empty record type variadic arg. Open issue: https://github.com/llvm/llvm-project/issues/58794 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D138137
2022-11-17	[clang] Fix wrong ABI of AVRTiny.	Ben Shi	1	-9/+10
	A scalar which exceeds 4 bytes should be returned via a stack slot, on an AVRTiny device. Reviewed By: aykevl Differential Revision: https://reviews.llvm.org/D138125
2022-11-08	Fix duplicate word typos; NFC	Rageking8	1	-2/+2
	This revision fixes typos where there are 2 consecutive words which are duplicated. There should be no code changes in this revision (only changes to comments and docs). Do let me know if there are any undesirable changes in this revision. Thanks.
2022-10-20	[HLSL] Disable integer promotion to avoid int16_t being promoted to int for HLSL.	Xiang Li	1	-2/+2
	short will be promoted to int in UsualUnaryConversions. Disable it for HLSL to keep int16_t as 16bit. Reviewed By: aaron.ballman, rjmccall Differential Revision: https://reviews.llvm.org/D133668
2022-10-16	[clang][PowerPC] PPC64 VAArg fix right-alignment for aggregates fit in register	Ting Wang	1	-8/+29
	The PPC64 ABI passes aggregates smaller than a register in the least significant bits of the register. In the case of variadic functions, they end up right-aligned in their argument slots in the argument area on big-endian targets. Apply right-alignment for these aggregates. Fixes #55900. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D133338
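	The right-alignment rule described above can be illustrated with a small sketch. It assumes 8-byte argument slots, and `kSlotSize` and `offsetInSlot` are hypothetical names chosen for illustration; this is not the code from the patch.

```cpp
#include <cstddef>

// Assumption: argument slots are 8 bytes (one doubleword) wide.
constexpr std::size_t kSlotSize = 8;

// Hypothetical helper: byte offset of a small aggregate inside its slot.
// On big-endian targets the least significant bytes sit at the highest
// addresses of the slot, so a small aggregate starts at slot size - size.
// On little-endian targets it starts at offset 0.
constexpr std::size_t offsetInSlot(std::size_t aggregateSize, bool bigEndian) {
  return (bigEndian && aggregateSize > 0 && aggregateSize < kSlotSize)
             ? kSlotSize - aggregateSize
             : 0;
}

static_assert(offsetInSlot(3, true) == 5, "3-byte aggregate is right-aligned");
static_assert(offsetInSlot(3, false) == 0, "little-endian needs no adjustment");
```

	Aggregates that fill a whole slot or more need no adjustment under this model, which is why the helper returns 0 for them.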
2022-10-16	[clang] Use std::clamp (NFC)	Kazu Hirata	1	-3/+3
	Note that the constructor of MipsABIInfo guarantees that MinABIStackAlignInBytes <= StackAlignInBytes, so we can use std::clamp safely.
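	As a minimal sketch of why the ordering guarantee matters: `std::clamp(v, lo, hi)` has undefined behavior when `lo > hi`, so the bounds have to be ordered before the call. The function and parameter names below are hypothetical and only loosely modeled on the fields named in the commit message; the sketch needs C++17.

```cpp
#include <algorithm>

// Hypothetical helper, not the MipsABIInfo code.
// Precondition: minAlign <= maxAlign. If the bounds were reversed,
// std::clamp would have undefined behavior.
constexpr unsigned clampAlign(unsigned align, unsigned minAlign,
                              unsigned maxAlign) {
  return std::clamp(align, minAlign, maxAlign);
}

static_assert(clampAlign(2, 4, 8) == 4, "raised to the minimum");
static_assert(clampAlign(16, 4, 8) == 8, "lowered to the maximum");
```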
2022-10-06	[OpenMP][AMDGPU] Add 'uniform-work-group' attribute to OpenMP kernels	Joseph Huber	1	-1/+5
	The `cl-uniform-work-group` attribute asserts that the global work size is a multiple of the specified work-group size. This should allow optimizations. It is already present by default in the AMD compiler and for HIP kernels so it should be safe to allow this for OpenMP kernels by default. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D135374
2022-10-01	[Clang][AArch64] Support AArch64 target(..) attribute formats.	David Green	1	-9/+10
	This adds support under AArch64 for the target("..") attributes. The current parsing is very X86-shaped; this patch attempts to bring it in line with the GCC implementation from https://gcc.gnu.org/onlinedocs/gcc/AArch64-Function-Attributes.html#AArch64-Function-Attributes. The supported formats are:
	- "arch=<arch>" strings, that specify the architecture features for a function as per the -march=arch+feature option.
	- "cpu=<cpu>" strings, that specify the target-cpu and any implied attributes as per the -mcpu=cpu+feature option.
	- "tune=<cpu>" strings, that specify the tune-cpu for a function as per -mtune.
	- "+<feature>", "+no<feature>" enables/disables the specific feature, for compatibility with GCC target attributes.
	- "<feature>", "no-<feature>" enables/disables the specific feature, for backward compatibility with previous releases.
	To do this, the parsing of target attributes has been moved into TargetInfo to give the target the opportunity to override the existing parsing. The only non-aarch64 change should be a minor alteration to the error message, specifying using "CPU" to describe the cpu, not "architecture", and the DuplicateArch/Tune from ParsedTargetAttr have been combined into a single option. Differential Revision: https://reviews.llvm.org/D133848
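	The string formats listed above can be told apart by prefix. The following is only an illustrative sketch of that classification, not the parser added to TargetInfo; `TargetAttrKind`, `hasPrefix` and `classify` are hypothetical names, and the sample strings in the checks are examples. It needs C++17 for `std::string_view`.

```cpp
#include <string_view>

// Hypothetical categories mirroring the formats in the commit message.
enum class TargetAttrKind { Arch, Cpu, Tune, GccFeature, LegacyFeature };

constexpr bool hasPrefix(std::string_view s, std::string_view prefix) {
  return s.size() >= prefix.size() && s.substr(0, prefix.size()) == prefix;
}

// Classify a single target("...") entry by its prefix.
constexpr TargetAttrKind classify(std::string_view attr) {
  if (hasPrefix(attr, "arch=")) return TargetAttrKind::Arch;
  if (hasPrefix(attr, "cpu=")) return TargetAttrKind::Cpu;
  if (hasPrefix(attr, "tune=")) return TargetAttrKind::Tune;
  // "+<feature>" and "+no<feature>" are the GCC-compatible spellings.
  if (hasPrefix(attr, "+")) return TargetAttrKind::GccFeature;
  // "<feature>" and "no-<feature>" are the backward-compatible spellings.
  return TargetAttrKind::LegacyFeature;
}

static_assert(classify("arch=armv8.2-a+sve") == TargetAttrKind::Arch, "arch=");
static_assert(classify("+nosve") == TargetAttrKind::GccFeature, "+no<feature>");
static_assert(classify("no-sve") == TargetAttrKind::LegacyFeature, "no-<feature>");
```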
2022-09-20	[X86][fastcall][vectorcall] Move capability check before free register update	Phoebe Wang	1	-12/+11
	When passing arguments with `__fastcall` or `__vectorcall` in 32-bit MSVC, the following arguments still have a chance to be passed in a register if the current one fails. `__regcall` from ICC behaves the opposite way: https://godbolt.org/z/4MPbzhaMG None of the three calling conventions is supported in GCC. Fixes: #57737 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D133920
2022-09-19	[Clang][LoongArch] Implement ABI lowering	Weining Lu	1	-0/+443
	Reuse most of RISCV's implementation with several exceptions:
	1. Assign signext/zeroext attribute to args passed on the stack. On RISCV, integer scalars passed in registers have signext/zeroext when promoted, but are anyext if passed on the stack. This is defined in the early RISCV ABI specification. But after this change [1], integers should also be signext/zeroext if passed on the stack. So I think RISCV's ABI lowering should be updated [2]. In the LoongArch ABI spec, by contrast, integer scalars narrower than GRLEN bits are zero/sign-extended whether they are passed in registers or on the stack.
	2. Zero-width bit fields are ignored. This matches GCC's behavior but it hasn't been documented in the ABI spec. See https://gcc.gnu.org/r12-8294.
	3. `char` is signed by default. Another difference worth mentioning is that `char` is signed by default on LoongArch while it is unsigned on RISCV.
	This patch also adds `_BitInt` type support to LoongArch and handles it in LoongArchABIInfo::classifyArgumentType.
	[1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/commit/cec39a064ee0e5b0129973fffab7e3ad1710498f
	[2] https://github.com/llvm/llvm-project/issues/57261
	Differential Revision: https://reviews.llvm.org/D132285
2022-08-27	Use std::clamp (NFC)	Kazu Hirata	1	-1/+1
	This patch replaces clamp idioms with std::clamp where the range is obviously valid from the source code (that is, low <= high) to avoid introducing undefined behavior.
2022-08-20	Use llvm::drop_begin (NFC)	Kazu Hirata	1	-2/+2

2022-08-19	[clang][RISCV] Fix incorrect ABI lowering for inherited structs under hard-float ABIs	Alex Bradbury	1	-1/+14
	The hard float ABIs have a rule that if a flattened struct contains either a single fp value, or an int+fp, or fp+fp then it may be passed in a pair of registers (if sufficient GPRs+FPRs are available). detectFPCCEligibleStruct and the helper it calls, detectFPCCEligibleStructHelper, examine the type of the argument/return value to determine if it complies with the requirements for this ABI rule. As reported in bug #57084, this logic produces incorrect results for C++ structs that inherit from other structs. This is because only the fields of the struct were examined, but enumerating RD->fields misses any fields in inherited C++ structs. This patch corrects that issue by adding appropriate logic to enumerate any included base structs. Differential Revision: https://reviews.llvm.org/D131677
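	To make the failure mode concrete, here is a toy model of the difference between counting only a record's own fields and counting the flattened fields including base classes. `RecordModel` and both helper functions are hypothetical; they do not correspond to Clang's record representation or to the helper functions named in the commit message.

```cpp
// Toy record description: at most one base class plus a number of own fields.
struct RecordModel {
  const RecordModel *base; // nullptr when the record has no base class
  int ownFields;
};

// Looks only at the record's own fields, so inherited fields are missed.
constexpr int ownFieldCount(const RecordModel &r) { return r.ownFields; }

// Walks the base class chain as well as the record's own fields.
constexpr int flattenedFieldCount(const RecordModel &r) {
  return r.ownFields + (r.base ? flattenedFieldCount(*r.base) : 0);
}

// Models: struct A { float x; };  struct B : A { float y; };
constexpr RecordModel kA{nullptr, 1};
constexpr RecordModel kB{&kA, 1};

static_assert(ownFieldCount(kB) == 1, "looks like a single fp value");
static_assert(flattenedFieldCount(kB) == 2, "is really fp+fp once flattened");
```

	In this model, a check based on `ownFieldCount` would treat `B` as a single fp value, while the flattened view shows the fp+fp case.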
2022-08-18	[Clang][BPF] Support record argument with direct values	Yonghong Song	1	-0/+36
	Currently, record arguments are always passed by reference by allocating space for record values in the caller. This is less efficient for small records which may take one or two registers. For example, for x86_64 and aarch64, for a record size up to 16 bytes, the record values can be passed by value directly in registers. This patch adds BPF support for record arguments with direct values for record sizes up to 16 bytes. If the record size is 0, that record will not take any register, which is the same behavior as for x86_64 and aarch64. If the record size is greater than 16 bytes, the record argument will be passed by reference. Differential Revision: https://reviews.llvm.org/D132144
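	The three size cases above can be summarized in a few lines. This is a sketch of the decision as the commit message describes it, with hypothetical names (`RecordPassing`, `classifyRecord`); it is not the BPF ABI code itself.

```cpp
#include <cstdint>

// Hypothetical result type for the three cases in the commit message.
enum class RecordPassing { Ignore, Direct, Indirect };

// Size 0 takes no register, up to 16 bytes is passed by value,
// anything larger is passed by reference.
constexpr RecordPassing classifyRecord(std::uint64_t sizeInBytes) {
  return sizeInBytes == 0    ? RecordPassing::Ignore
         : sizeInBytes <= 16 ? RecordPassing::Direct
                             : RecordPassing::Indirect;
}

static_assert(classifyRecord(0) == RecordPassing::Ignore, "empty record");
static_assert(classifyRecord(16) == RecordPassing::Direct, "fits two registers");
static_assert(classifyRecord(17) == RecordPassing::Indirect, "too large");
```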
2022-08-16	[Clang][BPF]: Force sign/zero extension for return values in caller	Yonghong Song	1	-0/+53
	Currently bpf supports calling kernel functions (x86_64, arm64, etc.) in bpf programs. Tejun discovered a problem where the x86_64 func return value (an unsigned char type) is stored in the 8-bit subregister %al and the other 56 bits in %rax might be garbage. But based on the current bpf ABI, the bpf program assumes the whole %rax holds the correct value, as the callee is supposed to do the necessary sign/zero extension. This mismatch between bpf and x86_64 caused incorrect results.
	To resolve this problem, this patch forces the caller to do the needed sign/zero extension for 8/16-bit return values as well. Note that 32-bit return values already had sign/zero extension even without this patch.
	For example, for the test case attached to this patch:
	  $ cat t.c
	  _Bool bar_bool(void);
	  unsigned char bar_char(void);
	  short bar_short(void);
	  int bar_int(void);
	  int foo_bool(void) { if (bar_bool() != 1) return 0; else return 1; }
	  int foo_char(void) { if (bar_char() != 10) return 0; else return 1; }
	  int foo_short(void) { if (bar_short() != 10) return 0; else return 1; }
	  int foo_int(void) { if (bar_int() != 10) return 0; else return 1; }
	Without this patch, the generated call insns in IR look like:
	  %call = call zeroext i1 @bar_bool()
	  %call = call zeroext i8 @bar_char()
	  %call = call signext i16 @bar_short()
	  %call = call i32 @bar_int()
	So it is assumed that zero extension has been done for the return values of bar_bool() and bar_char(). Sign extension has been done for the return value of bar_short(). The return value of bar_int() does not have any assumption, so the caller needs to do the necessary shifting to get correct 32-bit values.
	With this patch, the generated call insns in IR look like:
	  %call = call i1 @bar_bool()
	  %call = call i8 @bar_char()
	  %call = call i16 @bar_short()
	  %call = call i32 @bar_int()
	There are no assumptions for the return values of the above four function calls, so the caller must do the necessary shifting for all of them.
	The following is the objdump file difference for function foo_char().
	Without this patch:
	  0000000000000010 <foo_char>:
	       2: 85 10 00 00 ff ff ff ff call -1
	       3: bf 01 00 00 00 00 00 00 r1 = r0
	       4: b7 00 00 00 01 00 00 00 r0 = 1
	       5: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
	       6: b7 00 00 00 00 00 00 00 r0 = 0
	  0000000000000038 <LBB1_2>:
	       7: 95 00 00 00 00 00 00 00 exit
	With this patch:
	  0000000000000018 <foo_char>:
	       3: 85 10 00 00 ff ff ff ff call -1
	       4: bf 01 00 00 00 00 00 00 r1 = r0
	       5: 57 01 00 00 ff 00 00 00 r1 &= 255
	       6: b7 00 00 00 01 00 00 00 r0 = 1
	       7: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2>
	       8: b7 00 00 00 00 00 00 00 r0 = 0
	  0000000000000048 <LBB1_2>:
	       9: 95 00 00 00 00 00 00 00 exit
	The zero extension of the return 'char' value is done by the added `r1 &= 255` instruction.
	Differential Revision: https://reviews.llvm.org/D131598
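	The caller-side fix can be modeled in plain C++ as follows. This is a sketch of the arithmetic only: `zeroExtend8` and `signExtend16` are hypothetical helpers, and the 64-bit `reg` parameter stands in for a return register whose upper bits may hold garbage.

```cpp
#include <cstdint>

// Keep only the low 8 bits, as the `r1 &= 255` instruction does.
constexpr std::uint64_t zeroExtend8(std::uint64_t reg) { return reg & 0xffu; }

// Sign-extend the low 16 bits: flip the sign bit, then subtract its weight.
constexpr std::int64_t signExtend16(std::uint64_t reg) {
  return static_cast<std::int64_t>((reg & 0xffffu) ^ 0x8000u) - 0x8000;
}

// Upper bits are garbage; only the low byte (10) is meaningful.
static_assert(zeroExtend8(0xdeadbeef0000000aULL) == 10, "upper bits dropped");
// Low 16 bits 0xfff6 encode -10 as a signed 16-bit value.
static_assert(signExtend16(0xdeadbeef0000fff6ULL) == -10, "sign restored");
```

	Comparing the unmasked register against 10 in the first case would fail, which is the kind of incorrect result the commit message describes.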
2022-08-10	[X86][BF16] Enable __bf16 for x86 targets.	Freddy Ye	1	-6/+7
	The X86 psABI has been updated to support the __bf16 type, the ABI of which is the same as FP16. See https://discourse.llvm.org/t/patch-add-optional-bfloat16-support/63149 Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D130964
2022-08-08	[clang][CodeGen] Factor out Swift ABI hooks (NFCI)	Sergei Barannikov	1	-126/+103
	Swift calling conventions stand out in that they are lowered in a mostly target-independent manner, with very few customization points. As such, swift-related methods of ABIInfo do not reference the rest of ABIInfo and vice versa. This change follows the interface segregation principle; it removes the dependency of SwiftABIInfo on ABIInfo. Targets must now implement SwiftABIInfo separately if they support Swift calling conventions. Almost all targets implemented `shouldPassIndirectly` the same way. This de-facto default implementation has been moved into the base class. `isSwiftErrorInRegister` used to be virtual, now it is not. It didn't accept any arguments which could have an effect on the returned value. This is now a static property of the target ABI. Reviewed By: rusyaev-roman, inclyc Differential Revision: https://reviews.llvm.org/D130394
2022-07-23	Use llvm::sort instead of std::sort where possible	Dmitri Gribenko	1	-1/+1
	llvm::sort is beneficial even when we use the iterator-based overload, since it can optionally shuffle the elements (to detect non-determinism). However llvm::sort is not usable everywhere, for example, in compiler-rt. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D130406
2022-07-22	[CUDA/SPIR-V] Force passing aggregate type byval	Shangwu Yao	1	-0/+9
	This patch forces aggregate types in kernel arguments to be copied by value when compiling CUDA targeting SPIR-V. The original behavior was to not pass by value when the user defines any of a destructor, a copy constructor or a move constructor. This patch makes the behavior of SPIR-V generated from CUDA follow the CUDA spec (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-function-argument-processing) and match the NVPTX implementation (https://github.com/llvm/llvm-project/blob/41958f76d8a2c47484fa176cba1de565cfe84de7/clang/lib/CodeGen/TargetInfo.cpp#L7241). Differential Revision: https://reviews.llvm.org/D130387
2022-07-22	[clang][CodeGen] Only include ABIInfo.h where required (NFC)	Sergei Barannikov	1	-0/+3
	Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D130322
2022-06-20	[clang] Don't use Optional::getValue (NFC)	Kazu Hirata	1	-1/+1

2022-06-10	[ARM] Fix how size-0 bitfields affect homogeneous aggregates.	Simon Tatham	1	-2/+27
	By both AAPCS32 and AAPCS64, the test for whether an aggregate qualifies as homogeneous (either HFA or HVA) is based on the data layout alone. So any logical member of the structure that does not affect the data layout also should not affect homogeneity. In particular, an empty bitfield ('int : 0') should make no difference. In fact, clang considered it to make a difference in C but not in C++, and justified that policy as compatible with gcc. But that's considered a bug in gcc as well (at least for Arm targets), and it's fixed in gcc 12.1. This fix mimics gcc's: zero-sized bitfields are now ignored in all languages for the Arm (32- and 64-bit) ABIs. But I've left the previous behaviour unchanged in other ABIs, by means of adding an ABIInfo::isZeroLengthBitfieldPermittedInHomogeneousAggregate query method which the Arm subclasses override. Reviewed By: lenary Differential Revision: https://reviews.llvm.org/D127197
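	The claim that an empty bitfield does not affect data layout can be checked directly. The sketch below assumes that `float` and `int` are both 4 bytes with 4-byte alignment, as on common targets; the struct names are made up for illustration.

```cpp
// Two floats, a candidate homogeneous aggregate.
struct Plain {
  float a;
  float b;
};

// Same members with a zero-width bitfield in between. Under the stated
// assumption, `b` is already int-aligned, so the bitfield changes nothing.
struct WithEmptyBitfield {
  float a;
  int : 0;
  float b;
};

static_assert(sizeof(WithEmptyBitfield) == sizeof(Plain), "same size");
static_assert(alignof(WithEmptyBitfield) == alignof(Plain), "same alignment");
```

	Since the two structs have the same layout here, a homogeneity test based on data layout alone would give the same answer for both.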
2022-06-02	[PS5] Make passing unions in registers match PS4 ABI	Paul Robinson	1	-1/+1

2022-06-02	[PS5] Classify __m64 as integer, matching PS4 ABI	Paul Robinson	1	-1/+1

2022-05-31	[Clang][CSKY] Add support about CSKYABIInfo	Zi Xuan Wu (Zeson)	1	-0/+167
	Construct the ABIInfo according to the CSKY ABIv2 document (https://github.com/c-sky/csky-doc/blob/master/C-SKY_V2_CPU_Applications_Binary_Interface_Standards_Manual.pdf) to handle argument passing and return values for clang data types. It also covers how to emit and expand the VAArg intrinsic. Differential Revision: https://reviews.llvm.org/D126451
2022-04-26	[SystemZ] Fix C++ ABI for passing args of structs containing zero width bitfield.	Jonas Paulsson	1	-4/+1
	A struct like { float a; int :0; } should per the SystemZ ABI be passed in a GPR, but to match a bug in GCC it has been passed in an FPR (see 759449c). GCC has now corrected the C++ ABI for this case, and this patch for clang follows suit. Reviewed By: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D122388
2022-03-31	Fix the build after cd26190a10fceb6e1472fabcd9e1736f62f078c4	Aaron Ballman	1	-1/+1
	These variables were being used uninitialized and it caused a significant number of test failures on Windows.
2022-03-29	[X86][regcall] Support passing / returning structures	Phoebe Wang	1	-27/+50
	Currently, the regcall calling convention in Clang doesn't match ICC when passing / returning structures. https://godbolt.org/z/axxKMKrW7 This patch tries to fix the problem to match ICC. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D122104
2022-03-24	[clang][AVR] Implement standard calling convention for AVR and AVRTiny	Ben Shi	1	-16/+84
	This patch implements avr-gcc's calling convention: https://gcc.gnu.org/wiki/avr-gcc#Calling_Convention Reviewed By: aykevl Differential Revision: https://reviews.llvm.org/D120720
2022-03-22	[NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in TargetInfo.cpp	Akira Hatanaka	1	-42/+49
	Differential Revision: https://reviews.llvm.org/D122199
2022-03-18	Use llvm::append_range instead of push_back loops where applicable. NFCI.	Benjamin Kramer	1	-7/+3

2022-03-17	[CodeGen] Avoid pointer element type access for blocks	Nikita Popov	1	-4/+3
	Pass the block struct type down to the TargetInfo hooks.
2022-02-24	[CUDA][SPIRV] Assign global address space to CUDA kernel arguments	Shangwu Yao	1	-3/+3
	(resubmit https://reviews.llvm.org/D119207 after fixing the test for some build settings) This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types. Differential revision: https://reviews.llvm.org/D120366
2022-02-17	[clang] Remove Address::deprecated() in emitVoidPtrDirectVAArg()	Arthur Eubanks	1	-3/+3

2022-02-17	Revert "[CUDA][SPIRV] Assign global address space to CUDA kernel arguments"	Matthew Voss	1	-3/+3
	This reverts commit 9de4fc0f2d3b60542956f7e5254951d049edeb1f. Reverting due to test failure: https://lab.llvm.org/buildbot/#/builders/139/builds/17199
2022-02-17	[CUDA][SPIRV] Assign global address space to CUDA kernel arguments	Shangwu Yao	1	-3/+3
	This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types. Differential Revision: https://reviews.llvm.org/D119207
2022-02-17	[CodeGen] Rename deprecated Address constructor	Nikita Popov	1	-64/+69
	To make uses of the deprecated constructor easier to spot, and to ensure that no new uses are introduced, rename it to Address::deprecated(). While doing the rename, I've filled in element types in cases where it was relatively obvious, but we're still left with 135 calls to the deprecated constructor.
2022-02-14	[CGBuilder] Remove CreateBitCast() method	Nikita Popov	1	-13/+10
	Use CreateElementBitCast() instead, or don't work on Address where not necessary.
2022-02-04	[clang][CodeGen] Use memory type representation in `va_arg`	Jan Svoboda	1	-1/+2
	Some types (e.g. `_Bool`) have different scalar and memory representations. CodeGen for `va_arg` didn't take this into account, leading to assertion failures with such types. This patch makes sure we use the memory representation for `va_arg`. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D118904