path: root/llvm/lib
Age | Commit message | Author | Files | Lines
2022-01-13 | Don't override __attribute__((no_stack_protector)) by inlining (PR52886) | Hans Wennborg | 2 files, -9/+6

Since 26c6a3e736d3, LLVM's inliner will "upgrade" the caller's stack protector attribute based on the callee. This led to surprising results with Clang's no_stack_protector attribute added in 4fbf84c1732f (D46300). Consider the following code compiled with clang -fstack-protector-strong -Os (https://godbolt.org/z/7s3rW7a1q):

  extern void h(int* p);
  inline __attribute__((always_inline)) int g() { return 0; }
  int __attribute__((__no_stack_protector__)) f() {
    int a[1];
    h(a);
    return g();
  }

LLVM will inline g() into f(), and f() would get a stack protector, against the user's explicit wishes, potentially breaking the program e.g. if h() changes the value of the stack cookie. That's a miscompile.

More recently, bc044a88ee3c (D91816) addressed this problem by preventing inlining when the stack protector is disabled in the caller and enabled in the callee or vice versa. However, the problem remained if the callee is marked always_inline, as in the example above. This affected users, see e.g. http://crbug.com/1274129 and http://llvm.org/pr52886.

One way to fix this would be to prevent inlining also in the always_inline case. Despite the name, always_inline does not guarantee inlining, so this would be legal but potentially surprising to users. However, I think the better fix is to not enable the stack protector in a caller based on the callee. The motivation for the old behaviour is unclear, it seems counter-intuitive, and causes real problems as we've seen. This commit implements that fix, which means in the example above, g() gets inlined into f() (also without always_inline), and f() is emitted without a stack protector. I think that matches most developers' expectations, and that's also what GCC does.

Another effect of this change is that a no_stack_protector function can now be inlined into a stack-protected function, e.g. (https://godbolt.org/z/hafP6W856):

  extern void h(int* p);
  inline int __attribute__((__no_stack_protector__))
  __attribute__((always_inline)) g() { return 0; }
  int f() {
    int a[1];
    h(a);
    return g();
  }

I think that's fine. Such code would be unusual since no_stack_protector is normally applied to a program entry point which sets up the stack canary. And even if such code exists, inlining doesn't change the semantics: there is still no stack cookie setup/check around entry/exit of the g() code region, but there may be in the surrounding context, as there was before inlining. This also matches GCC.

See also the discussion at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94722

Differential revision: https://reviews.llvm.org/D116589
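The change in merging behaviour can be sketched as follows — a hypothetical Python model of the attribute-merge rule, not LLVM's actual code; the level names mirror clang's -fstack-protector flag variants:

```python
# Hypothetical model of how the inliner merges the caller's and callee's
# stack-protector attributes; illustrative only, not LLVM's implementation.
LEVELS = {"none": 0, "ssp": 1, "sspstrong": 2, "sspreq": 3}

def old_merge(caller, callee):
    # Pre-fix behaviour: the caller was "upgraded" to the stronger setting.
    return caller if LEVELS[caller] >= LEVELS[callee] else callee

def new_merge(caller, callee):
    # Post-fix behaviour: the caller's own attribute always wins.
    return caller

# A no_stack_protector caller inlining a stack-protected callee:
assert old_merge("none", "sspstrong") == "sspstrong"  # surprising: cookie added
assert new_merge("none", "sspstrong") == "none"       # user's wish respected
```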
2022-01-13 | [Docs] Fix IR and TableGen grammar inconsistencies | Sebastian Neubauer | 1 file, -3/+3

IR:
- globals (and functions, ifuncs, aliases) can have a partition
- catchret has a `to` before the label
- the sint/int types do not exist
- signext comes after the type
- a variable was missing its type

TableGen:
- The second value after a `#` concatenation is optional. See e.g. llvm/lib/Target/X86/X86InstrAVX512.td:L3351
- IncludeDirective and PreprocessorDirective were never referenced in the grammar
- Add some missing `;`
- Parent classes of multiclasses can have generic arguments. Reuse the `ParentClassList` that is already used in other places.

MIR:
- liveins only allows physical registers, which start with a $

Differential Revision: https://reviews.llvm.org/D116674
2022-01-13 | [ARM] Fix a bug causing shrink-wrapping to not always be disabled when using PAC | Ties Stuij | 1 file, -1/+1

If you want to check for all uses of PAC, the SpillsLR argument to shouldSignReturnAddress should be true instead of false, as that value will be returned from the function if the other checks fall through.

Reviewed By: miyuki
Differential Revision: https://reviews.llvm.org/D116213
2022-01-13 | [GlobalOpt] Fix global to select transform under opaque pointers | Nikita Popov | 1 file, -1/+4

We need to check that the load/store type is also the same, as this is no longer implicitly checked through the pointer type.
2022-01-13 | [WebAssembly] Fix reftype load/store match with idx from call | Paulo Matos | 1 file, -2/+1

Implement support for matching an index from a WebAssembly CALL instruction. Add test.

Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D115327
2022-01-13 | [FileCheck] Allow literal '['s before "[[var...]]" | Jay Foad | 1 file, -4/+7

Change FileCheck to accept patterns like "[[[var...]]" and treat the excess open brackets at the start as literals. This makes the patterns for matching assembler output with literal brackets much cleaner. For example an AMDGPU pattern that used to be written like:

  buffer_store_dwordx2 v{{\[}}[[LO]]:[[HI]]{{\]}}

can now be:

  buffer_store_dwordx2 v[[[LO]]:[[HI]]]

(Even before this patch the final close bracket did not need to be wrapped in {{}}, but people tended to do it anyway for symmetry.)

This does not introduce any ambiguity since "[[" was always followed by an identifier or '@' or '#', so "[[[" was always an error.

I've included a few test updates in this patch just for illustration and testing. There are a couple of hundred tests that could be updated as a follow-up, mostly in test/CodeGen/.

Differential Revision: https://reviews.llvm.org/D117117
Change-Id: Ia6bc6f65cb69734821c911f54a43fe1c673bcca7
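A toy model of why the extra '[' is unambiguous — a hypothetical Python sketch, not FileCheck's actual parser: a variable reference always starts at exactly two consecutive brackets followed by an identifier, so any brackets before that point stay literal:

```python
import re

def expand(pattern, env):
    # "[[NAME]]" is a variable use; a third leading '[' cannot start one,
    # so the regex scan leaves it in place as a literal character.
    return re.sub(r"\[\[([A-Za-z_][A-Za-z0-9_]*)\]\]",
                  lambda m: env[m.group(1)], pattern)

# v[[[LO]]:[[HI]]] -> literal '[', two variable uses, literal ']'
assert expand("v[[[LO]]:[[HI]]]", {"LO": "0", "HI": "1"}) == "v[0:1]"
```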
2022-01-13 | [CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants | David Sherwood | 8 files, -7/+16

When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction.

New tests added here: CodeGen/AArch64/sve-vector-splat.ll

I believe we see some code quality improvements in these existing tests too:
- CodeGen/AArch64/dag-numsignbits.ll
- CodeGen/AArch64/reduce-and.ll
- CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll

The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding.

Differential Revision: https://reviews.llvm.org/D114357
2022-01-13 | [LV] Inline CreateSplatIV call for scalar VFs (NFC) | Florian Hahn | 1 file, -1/+20

This is a NFC change split off from D116123, as suggested there. D116123 will remove the last user of CreateSplatIV.
2022-01-13 | [AArch64] Fix incorrect use of MVT::getVectorNumElements in AArch64TTIImpl::getVectorInstrCost | David Sherwood | 1 file, -3/+6

If we are inserting into or extracting from a scalable vector we do not know the number of elements at runtime, so we can only let the index wrap for fixed-length vectors.

Tests added here: Analysis/CostModel/AArch64/sve-insert-extract.ll

Differential Revision: https://reviews.llvm.org/D117099
2022-01-13 | RuntimeDyldELF: Don't abort on R_AARCH64_NONE relocation | Vladislav Khmelevsky | 1 file, -0/+2

Do nothing on a R_AARCH64_NONE relocation. The relocation is used by BOLT when re-linking the final binary. It is used as a dummy relocation hack in order to stop RuntimeDyld from skipping the allocation of the section.

Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D117066
2022-01-13 | [JITLink] Add fixup value range check | luxufan | 1 file, -6/+28

This patch makes JITLink report an out-of-range error when the fixup value is out of range.

Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D107328
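The check amounts to verifying that the computed fixup value fits in the relocation's bit-field. A hypothetical sketch (function name and shape are illustrative, not JITLink's API):

```python
def check_fixup_range(value, bits, is_signed):
    # Raise if `value` cannot be encoded in a `bits`-wide
    # (signed or unsigned) relocation field.
    if is_signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    if not (lo <= value <= hi):
        raise ValueError(f"fixup value {value} is out of range "
                         f"for a {bits}-bit field")
    return value

assert check_fixup_range(127, 8, True) == 127   # fits in signed i8
raised = False
try:
    check_fixup_range(128, 8, True)             # does not fit
except ValueError:
    raised = True
assert raised
```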
2022-01-13 | [M68k][NFC] Use Register instead of unsigned int | Jim Lin | 3 files, -22/+22
2022-01-13 | [NVPTX] Lower fp16 fminnum, fmaxnum to native on sm_80 | Christian Sigg | 2 files, -4/+34

Reviewed By: bkramer, tra
Differential Revision: https://reviews.llvm.org/D117122
2022-01-12 | [CSKY] Ensure a newline at the end of a file (NFC) | Kazu Hirata | 1 file, -1/+1
2022-01-13 | Revert "[Inline] Attempt to delete any discardable if unused functions" | James Y Knight | 1 file, -35/+17

Somehow this ends up causing an infinite loop in the inliner.

This reverts commit d5be48c66d3e5e8be21805c3a33dc67a20e258be.
2022-01-13 | [RISCV] Add bfp and bfpw intrinsic in zbf extension | Lian Wang | 3 files, -0/+31

Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116994
2022-01-12 | [Attributor] Simplify how we handle required alignment during heap-to-stack [NFC] | Philip Reames | 1 file, -16/+14

The existing code duplicated the same concern in two places, and (weirdly) changed the inference of the allocation size based on whether we could meet the alignment requirement. Instead, just directly check the allocation requirement.
2022-01-12 | [Attributor] Generalize calloc handling in heap-to-stack for any init value [NFC] | Philip Reames | 1 file, -12/+12

Rewrite the calloc-specific handling in heap-to-stack to allow arbitrary init values. The basic problem being solved is that if an allocation is initialized to anything other than zero, this must be explicitly done for the formed alloca as well.

This covers the calloc case today, but once a couple of earlier guards are removed in this code, downstream allocators with other init values could also be handled.

Inspired by discussion on D116971
2022-01-12 | [Attributor] Reuse object size evaluation code [NFC] | Philip Reames | 1 file, -8/+8
2022-01-12 | [Attributor] Use getAllocAlignment where possible [NFC] | Philip Reames | 1 file, -5/+7

Inspired by D116971.
2022-01-12 | AMDGPU: Fix assert on function argument as loop condition | Matt Arsenault | 1 file, -0/+6
2022-01-12 | [AMDGPU] Fixed physreg asm constraint parsing | Stanislav Mekhanoshin | 2 files, -29/+41

We were always failing to parse the physreg constraint because we do not drop the trailing brace, thus getAsInteger() returns a non-empty string and we delegate reparsing to the TargetLowering. In addition, it did not parse register tuples. Fixing this allowed removing workarounds in the two places where we call it.

Differential Revision: https://reviews.llvm.org/D117055
2022-01-12 | GlobalISel: Always enable GISelKnownBits for InstructionSelect | Matt Arsenault | 1 file, -4/+4

This wasn't running at -O0, and was causing crashes for AMDGPU. AMDGPU needs this to match the addressing modes of stack access instructions, which is even more important at -O0 than with optimizations. It currently costs nothing to run ahead of time, so just always enable it.
2022-01-12 | RegScavenger: Remove used regs from scavenge candidates | Matt Arsenault | 1 file, -0/+22

In a future change, AMDGPU will have 2 emergency scavenging indexes in some situations. The secondary scavenging index ends up being used recursively when the scavenger calls eliminateFrameIndex for the emergency spill slot. Without this, it would end up seeing the same register which was just scavenged in the parent call as free, insert a second emergency spill to the same location, and return the same register when 2 unique free registers are required.

We need to only do this if the register is used. SystemZ uses 2 scavenging slots, but calls the scavenger twice in sequence and not recursively. In this case the previously scavenged register can be re-clobbered, but is still tracked in the scavenger until it sees the deferred restore instruction.
2022-01-12 | AMDGPU/GlobalISel: Fix assertions on legalize queries with huge align | Matt Arsenault | 1 file, -5/+5

For some reason we pass around the alignment in bits as uint64_t. Two places were truncating it to unsigned, and losing bits in extreme cases.
2022-01-12 | GlobalISel: Add G_ASSERT_ALIGN hint instruction | Matt Arsenault | 5 files, -15/+46

Insert it for call return values only for now, which is the only case the DAG handles also.
2022-01-12 | clang support for Armv8.8/9.3 HBC | Tomas Matheson | 1 file, -0/+2

This introduces clang command line support for the new Armv8.8-A and Armv9.3-A Hinted Conditional Branches feature, previously introduced into LLVM in https://reviews.llvm.org/D116156.

Patch by Tomas Matheson and Son Tuan Vu.

Differential Revision: https://reviews.llvm.org/D116939
2022-01-12 | [Demangle] Pass Ret parameter from decodeNumber by reference | Luís Ferreira | 1 file, -5/+5

Since the Ret parameter is never meant to be nullptr, let's pass it by reference instead of as a raw pointer.

Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D117046
2022-01-12 | [Demangle] Add support for D types back referencing | Luís Ferreira | 1 file, -2/+52

This patch adds support for type back referencing, allowing demangling of compressed mangled symbols with repetitive types.

Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111419
2022-01-12 | [Demangle] Add support for D symbols back referencing | Luís Ferreira | 1 file, -3/+137

This patch adds support for identifier back referencing, allowing mangled names to be compressed by avoiding repetition.

Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111417
2022-01-12 | [Demangle] Add minimal support for D simple basic types | Luís Ferreira | 1 file, -2/+36

This patch implements simple demangling of two basic types to add minimal type functionality. This will later be used in function type parsing. After that is implemented, we can add the rest of the types and test the resulting type names.

Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D111416
2022-01-12 | [InstSimplify] use knownbits to fold more udiv/urem | Sanjay Patel | 1 file, -0/+10

We could use knownbits on both operands for even more folds (and there are already tests in place for that), but this is enough to recover the example from:
https://github.com/llvm/llvm-project/issues/51934
(the tests are derived from the code in that example)

I am assuming no noticeable compile-time impact from this because udiv/urem are rare opcodes.

Differential Revision: https://reviews.llvm.org/D116616
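The fold follows from simple bounds reasoning, sketched here in Python (a model of the idea, not InstSimplify's code): if the known-zero bits prove the dividend is strictly smaller than the divisor, the quotient is 0 and the remainder is the dividend itself:

```python
def max_from_known_zeros(known_zero_mask, bits):
    # Bits known to be zero cap the largest value x can take.
    return ~known_zero_mask & ((1 << bits) - 1)

def simplify_udiv_urem(known_zero_of_x, y, bits=8):
    # If max(x) < y, then x // y == 0 and x % y == x for every possible x.
    if max_from_known_zeros(known_zero_of_x, bits) < y:
        return "udiv -> 0, urem -> x"
    return None  # cannot fold

# x has its top 4 bits known zero, so x <= 15 < 16:
assert simplify_udiv_urem(0b11110000, 16) == "udiv -> 0, urem -> x"
assert simplify_udiv_urem(0b11110000, 15) is None  # x could equal 15
```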
2022-01-12 | Revert "[JITLink][AArch64] Add support for splitting eh-frames on AArch64." | Nico Weber | 1 file, -7/+0

This reverts commit 253ce92844f72e3a6d0e423473f2765c2c5afd6a.

Breaks tests on Windows, see https://github.com/llvm/llvm-project/issues/52921#issuecomment-1011118896
2022-01-12 | [RISCV] Update recently ratified Zb{a,b,c,s} extensions to no longer be experimental | Alex Bradbury | 3 files, -10/+13

Agreed policy is that RISC-V extensions that have not yet been ratified should be marked as experimental, and enabling them requires the use of the -menable-experimental-extensions flag when using clang, alongside the version number. These extensions have now been ratified, so this is no longer necessary, and the target feature names can be renamed to no longer be prefixed with "experimental-".

Differential Revision: https://reviews.llvm.org/D117131
2022-01-12 | AMDGPU/GlobalISel: Do not use terminator copy before waterfall loops | Matt Arsenault | 1 file, -4/+7

Stop using the _term variants of the mov to save the initial exec value before the waterfall loop. This cannot be glued to the bottom of the block because we may need to spill the result register. Just use a regular mov, like the loops produced on the DAG path.

Fixes some verification errors with regalloc fast.
2022-01-12 | GlobalISel: Fix insert point in localizer | Matt Arsenault | 1 file, -3/+4

This was inserting the new G_CONSTANT after the use, and the later block scan would run off the end. Fix calling SkipPHIsAndLabels for no apparent reason.
2022-01-12 | [ModuleInliner] Properly delete dead functions | Arthur Eubanks | 1 file, -8/+1

Followup to D116964 where we only did this in the CGSCC inliner.

Fixes leaks reported in D116964.
2022-01-12 | [RISCV] Add RISCVProcFamilyEnum and add SiFive7 | Craig Topper | 3 files, -9/+31

Use it to remove explicit string compares from unrolling preferences.

I'm of two minds on this. Ideally, we would define things in terms of architectural or microarchitectural features, but it's hard to do that with things like unrolling preferences without just ending up with FeatureSiFive7UnrollingPreferences. Having a proc enum is consistent with ARM and AArch64. X86 only has a few and is trying to move away from it.

Reviewed By: asb, mcberg2021
Differential Revision: https://reviews.llvm.org/D117060
2022-01-12 | [NFC][MLGO] The regalloc reward is float, not int64_t | Mircea Trofin | 1 file, -1/+1
2022-01-12 | [NFC][MLGO] Prep a few files before the main ML Regalloc adviser | Mircea Trofin | 2 files, -8/+5

To avoid trivial changes.
2022-01-12 | GlobalISel: Fix fma combine when one of the operands comes from unmerge | Petar Avramovic | 1 file, -83/+89

The fma combine assumes that MRI.getVRegDef(Reg)->getOperand(0).getReg() == Reg, which is not true when Reg is defined by an instruction with multiple defs, e.g. G_UNMERGE_VALUES. The fix is to keep both the register and the instruction that defines it in DefinitionAndSourceRegister and use them where needed.

Differential Revision: https://reviews.llvm.org/D117032
2022-01-12 | [Inline] Attempt to delete any discardable if unused functions | Arthur Eubanks | 1 file, -17/+35

Previously we limited ourselves to only internal/private functions. We can also delete linkonce_odr functions.

Minor compile time wins: https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=instructions

Major memory wins on tramp3d: https://llvm-compile-time-tracker.com/compare.php?from=d51e3474e060cb0e90dc2e2487f778b0d3e6a8de&to=bccffe3f8d5dd4dda884c9ac1f93e51772519cad&stat=max-rss

Reviewed By: nikic, mtrofin
Differential Revision: https://reviews.llvm.org/D115545
2022-01-12 | [X86][AVX2] Add SimplifyDemandedVectorElts handling for avx2 per element shifts | Simon Pilgrim | 1 file, -1/+4

Noticed while investigating how to improve funnel shift codegen.
2022-01-12 | Revert "[llvm-readobj][XCOFF] dump auxiliary symbols." | Nico Weber | 1 file, -5/+1

This reverts commit aad49c8eb9849be57c562f8e2b7fbbe816183343.

Breaks tests on Windows, see comments on https://reviews.llvm.org/D113825
2022-01-12 | [MachO] Port call graph profile section and directive | Leonard Grey | 4 files, -1/+67

This ports the `.cg_profile` assembly directive and call graph profile section generation to MachO from COFF/ELF. Due to MachO section naming rules, the section is called `__LLVM,__cg_profile` rather than `.llvm.call-graph-profile` as in COFF/ELF. Support for llvm-readobj is included to facilitate testing.

Corresponding LLD change is D112164.

Differential Revision: https://reviews.llvm.org/D112160
2022-01-12 | [VPlan] Introduce and use BranchOnCount VPInstruction | Florian Hahn | 4 files, -24/+78

This patch adds a new BranchOnCount VPInstruction opcode with 2 operands. It first compares its 2 operands (increment of canonical induction and vector trip count), followed by a branch to either the exit block or back to the vector header. It must be the last recipe in the exit block of the topmost vector loop region.

This extracts parts from D113224 and was discussed in D113223.

Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D116479
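The described semantics can be modelled as a small Python sketch (illustrative only; the increment is shown explicitly for clarity even though the opcode receives the already-incremented induction value):

```python
def branch_on_count(canonical_iv, step, vector_trip_count):
    # Increment the canonical induction variable, then branch:
    # True means "exit the vector loop", False means "back to the header".
    iv_next = canonical_iv + step
    return iv_next, iv_next == vector_trip_count

# Trip count 8 with a vectorization factor of 4: two vector iterations.
iv, done, iterations = 0, False, 0
while not done:
    iv, done = branch_on_count(iv, 4, 8)
    iterations += 1
assert iterations == 2
```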
2022-01-12 | [LoopVectorize] Pass a vector type to isLegalMaskedGather/Scatter | Rosie Sumpter | 7 files, -74/+96

This is required to query the legality more precisely in the LoopVectorizer.

This adds another TTI function named 'forceScalarizeMaskedGather/Scatter' to work around the hack introduced for MVE, where isLegalMaskedGather/Scatter would return an answer by second-guessing where the function was called from, based on the Type passed in (vector vs scalar). The new interface makes this explicit. It is also used by X86 to check for vector widths where gather/scatters aren't profitable (or don't exist) for certain subtargets.

Differential Revision: https://reviews.llvm.org/D115329
2022-01-12 | [DebugInfo] Move flag for instr-ref to LLVM option, from TargetOptions | Jeremy Morse | 4 files, -19/+21

This feature was previously controlled by a TargetOptions flag, and I figured that codegen::InitTargetOptionsFromCodeGenFlags would default it to "on" for all frontends. Enabling by default was discussed here: https://lists.llvm.org/pipermail/llvm-dev/2021-November/153653.html and originally supposed to happen in 3c045070882f3, but it didn't actually take effect, as it turns out frontends initialize TargetOptions themselves.

This patch moves the flag from a TargetOptions flag to a global flag to CodeGen, where it isn't immediately affected by the frontend being used. Hopefully this will actually cause instr-ref to be on by default on x86_64 now!

This patch is easily reverted, and chances of turbulence are moderately high. If you need to revert, please consider instead commenting out the 'return true' part of llvm::debuginfoShouldUseDebugInstrRef to turn the feature off, and dropping me an email.

Differential Revision: https://reviews.llvm.org/D116821
2022-01-12 | [VP] llvm.vp.merge intrinsic and LangRef | Simon Moll | 1 file, -0/+1

llvm.vp.merge interprets the %evl operand differently than the other vp intrinsics: all lanes at positions greater than or equal to the %evl operand are passed through from the second vector input. Otherwise it behaves like llvm.vp.select.

Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116725
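The described lane semantics, modelled in Python (a sketch of the behaviour in the commit message, not an LLVM API):

```python
def vp_merge(mask, on_true, on_false, evl):
    # Lanes at positions >= evl always pass through the second vector
    # input; lanes below evl behave like llvm.vp.select.
    return [on_false[i] if i >= evl
            else (on_true[i] if mask[i] else on_false[i])
            for i in range(len(on_false))]

m = [True, False, True, True]
t = [10, 11, 12, 13]
f = [20, 21, 22, 23]
# Lanes 2 and 3 are past %evl, so they come from f regardless of the mask.
assert vp_merge(m, t, f, 2) == [10, 21, 22, 23]
```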
2022-01-12 | [X86][XOP] Add SimplifyDemandedVectorElts handling for xop shifts | Simon Pilgrim | 1 file, -0/+15

Noticed while investigating how to improve funnel shift codegen.