rocket-tools/riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
3 hours	[X86] Correct 32-bit immediate assertion and fix 64-bit lowering for huge frame offsets (#123872)	Wesley Wiser	1	-1/+1
	The assertion previously did not work correctly because the operand was being truncated to an `int` prior to comparison. Change the assertion into a reported error as suggested in https://github.com/llvm/llvm-project/pull/101840#issuecomment-2304992425 by @arsenm. Finally, fix the lowering on 64-bit targets so that offsets larger than 32 bits are correctly addressed and add tests for various reported issues.
14 hours	[DAG] Always use stack to promote bitcast when the source is vector (#151065)	Min-Yih Hsu	1	-2/+3
	The optimization introduced by #125637 tried to avoid using stacks to promote bitcasts with a vector result type. However, it wouldn't be correct if the input type is a vector. This patch limits that optimization to scalar-to-vector bitcasts only.
19 hours	[TargetLowering] Use getShiftAmountConstant in buildSDIVPow2WithCMov.	Craig Topper	1	-2/+2

39 hours	[SelectionDAG] Move sign pattern check from AArch64 and ARM to general SelectionDAG (#151736)	AZero13	1	-4/+18
	This works on all cases much like the XOR case above it in SelectionDAG.
2 days	[LLVM][DAGCombiner] fold (shl (X * vscale(C0)), C1) -> (X * vscale(C0 << C1)). (#150651)	Paul Walker	1	-0/+13

2 days	Add m_SelectCCLike matcher to match SELECT_CC or SELECT with SETCC (#149646)	黃國庭	1	-12/+11
	Fix #147282 and Follow-up to #148834 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2 days	[DAGCombiner] Add combine for vector interleave of splats (#151110)	David Sherwood	1	-0/+48
	This patch adds two DAG combines: 1. vector_interleave(splat, splat, ...) -> {splat,splat,...} 2. concat_vectors(splat, splat, ...) -> wide_splat where all the input splats are identical. Both of these together enable us to fold concat_vectors(vector_interleave(splat, splat, ...)) into a wide splat. Post-legalisation we must only do the concat_vector combine if the wider type and splat operation are legal. For fixed-width vectors the DAG combine only occurs for interleave factors of 3 or more; however, it's not currently safe to test this for AArch64 since there isn't any lowering support for fixed-width interleaves. I've only added fixed-width tests for RISCV.
2 days	[MachineScheduler] Make cluster check more efficient (#150884)	Ruiling, Song	1	-26/+40

2 days	[RegAlloc] Fix use-after-free in `RegAllocBase::cleanupFailedVReg` (#151435)	Shilei Tian	1	-3/+1
	#128400 introduced a use-after-free bug in `RegAllocBase::cleanupFailedVReg` when removing intervals from regunits. The issue comes from the `InterferenceCache` in `RAGreedy`, which holds `LiveRange*`. The current `InterferenceCache` APIs make it difficult to update, and there isn't a straightforward way to do that. Since #128400 already notes that it is unclear whether removing intervals from regunits is necessary, this PR avoids the issue by simply skipping that step. Fixes SWDEV-527146.
3 days	[llvm][AsmPrinter] Emit call graph section	Prabhu Rajasekaran	1	-0/+108
	Collect the necessary information for constructing the call graph section, and emit it to the .callgraph section of the binary. The MD5 hash of the callee_type metadata string is used as the numerical type id emitted. Reviewers: ilovepi Reviewed By: ilovepi Pull Request: https://github.com/llvm/llvm-project/pull/87576
3 days	[SelectionDAG] Improve the doxygen description for SDValue::isOperandOf. NFC (#151244)	Craig Topper	1	-1/+1
	SDValue::isOperandOf checks the result number in addition to the SDNode. SDNode::isOperandOf only checks the SDNode.
3 days	[TailDup] Delay aggressive computed-goto taildup to after RegAlloc. (#150911)	Florian Hahn	1	-6/+10
	https://github.com/llvm/llvm-project/pull/114990 allowed more aggressive tail duplication for computed-gotos in both pre- and post-regalloc tail duplication. In some cases, performing tail-duplication too early can lead to worse results, especially if we duplicate blocks with a number of phi nodes. This is causing a ~3% performance regression in some workloads using Python 3.12. This patch updates TailDup to delay aggressive tail-duplication for computed gotos to after register allocation. This means we can keep the non-duplicated version for a bit longer throughout the backend, which should reduce compile-time as well as allowing a number of optimizations and simplifications to trigger before drastically expanding the CFG. For the case in https://github.com/llvm/llvm-project/issues/106846, I get the same performance with and without this patch on Skylake. PR: https://github.com/llvm/llvm-project/pull/150911
3 days	MachineInstrBuilder: Introduce copyMIMetadata() function.	Peter Collingbourne	1	-1/+1
	This reduces the amount of boilerplate required when adding a new field to MIMetadata and reduces the chance of bugs like the one I fixed in TargetInstrInfo::reassociateOps. Reviewers: arsenm, nikic Reviewed By: nikic Pull Request: https://github.com/llvm/llvm-project/pull/133535
3 days	[MachineBB] Make sure there are successors in terminatorIsComputedGoto. (#151342)	Florian Hahn	1	-1/+1
	Currently terminatorIsComputedGoto will return true for blocks with an indirect branch terminator and no successors. If there are no successors, the terminator is likely not a computed goto, so return false in that case. Note that this is currently NFC, as the only use checks it only if there are successors, but it will be needed in https://github.com/llvm/llvm-project/pull/150911. PR: https://github.com/llvm/llvm-project/pull/151342
3 days	[MachineFunction] Move CallSiteInfo constructor out of header (#151520)	Prabhu Rajasekaran	1	-0/+20

3 days	[X86][APX] Do optimizeMemoryInst for v1X masked load/store (#151331)	Phoebe Wang	1	-0/+23
	Fix redundant LEA: https://godbolt.org/z/34xEYE818
4 days	[llvm] Extract and propagate callee_type metadata	Prabhu Rajasekaran	1	-1/+2
	Update MachineFunction::CallSiteInfo to extract numeric CalleeTypeIds from callee_type metadata attached to indirect call instructions. Reviewers: nikic, ilovepi Reviewed By: ilovepi Pull Request: https://github.com/llvm/llvm-project/pull/87575
4 days	[CodeGen] Remove an unnecessary cast (NFC) (#151280)	Kazu Hirata	1	-1/+1
	LoopValStage is already an int.
4 days	Reland "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" (#134408)	Sander de Smalen	1	-15/+164
	This tries to reland #123632 (previously reverted by commit 6b1db79887df19bc8e8c946108966aa6021c8b87). This PR aims to fix coalescing of SUBREG_TO_REG when sub-register liveness tracking is enabled and this is now the so-manieth reincarnation of this effort :) This change is needed in order to enable subreg liveness tracking for AArch64, because without the implicit-def, Machine Copy Propagation would remove a 'redundant' copy because it doesn't realise that the top 32 bits of the register are zeroed, which subsequent instructions rely on. Changes compared to previous PR: * Rather than updating all instructions that define the source register (SrcReg) of the SUBREG_TO_REG, this new approach only updates instructions that define SrcReg when they dominate the SUBREG_TO_REG. The live-ranges are updated accordingly.
4 days	[GISel] Introduce MIFlags::InBounds (#150900)	Fabian Ritter	6	-3/+19
	This flag applies to G_PTR_ADD instructions and indicates that the operation implements an inbounds getelementptr operation, i.e., the pointer operand is in bounds wrt. the allocated object it is based on, and the arithmetic does not change that. It is set when the IRTranslator lowers inbounds GEPs (currently only in some cases, to be extended with a future PR), and in the (build|materialize)ObjectPtrOffset functions. Inbounds information is useful in ISel when we have instructions that perform address computations whose intermediate steps must be in the same memory region as the final result. A follow-up patch will start using it for AMDGPU's flat memory instructions, where the immediate offset must not affect the memory aperture of the address. This is analogous to a concurrent effort in SDAG: #131862 (related: #140017, #141725). For SWDEV-516125.
4 days	[LLVM][SelectionDAG] Align poison/undef binop folds with IR. (#149334)	Paul Walker	1	-20/+61
	The "at construction" binop folds in SelectionDAG::getNode() behave differently from the equivalent LLVM IR folds. This PR makes the behaviour consistent while also extending the coverage to include signed/unsigned max/min operations.
4 days	[DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (#146054)	Pierre van Houtryve	1	-1/+88
	Fold sequences where we extract a bunch of contiguous bits from a value, merge them into the low bit and then check if the low bits are zero or not. Usually the and would be on the outside (the leaves) of the expression, but the DAG canonicalizes it to a single `and` at the root of the expression. The reason I put this in DAGCombiner instead of the target combiner is because this is a generic, valid transform that's also fairly niche, so there isn't much risk of a combine loop I think. See #136727
4 days	[TargetLowering] Use getShiftAmountConstant in CTTZTableLookup. NFC	Craig Topper	1	-1/+1

5 days	[ELF][AsmPrinter] Emit trailing dot for constant pool section when it has a hotness prefix (#150859)	Mingming Liu	1	-7/+7
	Currently, `TargetLoweringObjectFileELF::getSectionForConstant` produces `.<section>.hot` or `.<section>.unlikely` for a constant with a non-empty section prefix. This PR changes the implementation to add a trailing dot when the section prefix is not empty, to disambiguate `.hot` as a hotness prefix from `.hot` as a (pure C) variable name. Relevant discussions are in https://github.com/llvm/llvm-project/pull/148985#discussion_r2221141273 and https://github.com/llvm/llvm-project/pull/148985#discussion_r2233382641 and
5 days	[LLVM][Cygwin] Enable conditions that are shared with MinGW (#149638)	jeremyd2019	1	-1/+1
	Cygwin and MinGW share the auto import behavior that could result in __stack_check_guard being non-dso-local. Allow windres to assume a Cygwin target as well as a MinGW one, so defines like _WIN32 would not be present on Cygwin.
5 days	[DAG] Remove AssertZext if the input is masked (#146052)	Pierre van Houtryve	1	-13/+21
	Remove AssertZext if the input ensures the assert cannot fail.
5 days	[GISel] Introduce MachineIRBuilder::(build|materialize)ObjectPtrOffset (#150392)	Fabian Ritter	4	-16/+31
	These functions are for building G_PTR_ADDs when we know that the base pointer and the result are both valid pointers into (or just after) the same object. They are similar to SelectionDAG::getObjectPtrOffset. This PR also changes call sites of the generic (build|materialize)PtrAdd functions that implement pointer arithmetic to split large memory accesses to the new functions. Since memory accesses have to fit into an object in memory, pointer arithmetic to an offset into a large memory access also yields an address in that object. Currently, these (build|materialize)ObjectPtrOffset functions only add "nuw" to the generated G_PTR_ADD, but I intend to introduce an "inbounds" MIFlag in a later PR (analogous to a concurrent effort in SDAG: #131862, related: #140017, #141725) that will also be set in the (build|materialize)ObjectPtrOffset functions. Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where offsets are now folded into scratch instructions, and cases where the behavior of the check regeneration script changed, resulting, e.g., in better checks for "nusw G_PTR_ADD" instructions, matched empty lines, and the use of "CHECK-NEXT" in MIPS tests. For SWDEV-516125.
5 days	Fix build warnings after 6fbc397964340ebc9cb04a094fd04bef9a53abc3 (#151100)	David Sherwood	1	-7/+0

5 days	[BranchFolding] Follow up #149999 crash fix	Orlando Cazalet-Hyams	1	-2/+3
	fbf6271c7da20356d7b34583b3711b4126ca1dbb introduced an assertion failure as setDebugValueUndef was called on DBG_LABELs, which isn't allowed and doesn't make sense. Fix by skipping the call for DBG_LABELs and hoisting, in line with the original behaviour.
5 days	[IR][SDAG] Remove lifetime size handling from SDAG (#150944)	Nikita Popov	4	-19/+14
	Split out from https://github.com/llvm/llvm-project/pull/150248: Specify that the argument of lifetime.start/lifetime.end is ignored and will be removed in the future. Remove lifetime size handling from SDAG. The size was previously discarded during isel, so was always ignored for stack coloring anyway. Where necessary, obtain the size of the full frame index.
5 days	[IR] Add new CreateVectorInterleave interface (#150931)	David Sherwood	1	-10/+7
	This PR adds a new interface to IRBuilder called CreateVectorInterleave, which can be used to create vector.interleave intrinsics of factors 2-8. For convenience I have also moved getInterleaveIntrinsicID and getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp, where they can be used by IRBuilder.
5 days	[GlobalISel] Remove `UnsafeFPMath` references (#146319)	paperchalice	2	-5/+3
	This is the GlobalISel part of removing the `UnsafeFPMath` flag from the CodeGen pipeline.
5 days	[AMDGPU] Add NoaliasAddrSpace to AAMDnodes (#149247)	Shoreshen	4	-1/+12
	This is a follow-up to https://github.com/llvm/llvm-project/pull/136553, which calculates NoaliasAddrSpace. This PR carries the calculated info into MIR by adding it to AAMDnodes.
6 days	[SelectionDAG] Remove `UnsafeFPMath` in LegalizeDAG (#146316)	paperchalice	2	-2/+6
	These global flags hinder further improvements like [[RFC] Honor pragmas with -ffp-contract=fast](https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast) and pass concurrency support. Remove them incrementally.
6 days	Hot-patch __ref_* variables should be placed in .rdata, not .data (#151008)	sivadeilra	1	-0/+13
	This is a refinement of #145565. That PR added support for "Windows Secure Hot-patching". In this design, functions that are compiled for hot-patching need to be modified when they access mutable global variables. The modification is to insert a level of indirection, the so-called `__ref_*` variables. Ref variables are supposed to be inserted into the `.rdata` section, not `.data`. This provides a degree of protection against modification (accidental or malicious) of ref variables during program execution. When the Windows hot-patch subsystem loads a module as a hot-patch, it finds all ref variables and changes the page protections for the pages containing them to read/write. Then it sets the ref variables to point to the real variable locations within the base image. Then it changes page protections back to read-only. This relies on the variables being placed in the `.rdata` section, not `.data`. However, it is still important that the LLVM `GlobalVariable` that is created for the ref variable be created with `isConstant = false`. This prevents LLVM from optimizing accesses to the `GlobalVariable`, i.e. assuming that the variable can never change and thus inlining its value into expressions that would ordinarily dereference it. That optimization would defeat the purpose of hot-patching, so `isConstant = false` is still the correct value for these ref variables.
6 days	Reapply "[llvm] Add CalleeTypeIds field to CallSiteInfo" (#150335) (#150990)	Prabhu Rajasekaran	4	-7/+28
	This reverts commit 05e08cdb3e576cc0887d1507ebd2f756460c7db7. Adds the missing -mtriple flags to the MIR/X86 test files; their absence caused these tests to fail, which was the reason the patch was reverted.
6 days	Use F.hasOptSize() instead of checking optsize directly (#147348)	Ellis Hoag	1	-2/+1

6 days	Reapply (2) [BranchFolding] Kill common hoisted debug instructions (#149999)	Orlando Cazalet-Hyams	1	-7/+40
	Reapply #140091. branch-folder hoists common instructions from TBB and FBB into their pred. Without this patch it achieves this by splicing the instructions from TBB and deleting the common ones in FBB. That moves the debug locations and debug instructions from TBB into the pred without modification, which is not ideal. Debug locations are handled in #140063. This patch handles debug instructions - in the simplest way possible, which is to just kill (undef) them. We kill and hoist the ones in FBB as well as TBB because otherwise the fact there's an assignment on the code path is deleted (which might lead to a prior location extending further than it should). There's possibly something we could do to preserve some variable locations in some cases, but this is the easiest not-incorrect thing to do. Note I had to replace the constant DBG_VALUEs to use registers in the test- it turns out setDebugValueUndef doesn't undef constant DBG_VALUEs... which feels wrong to me, but isn't something I want to touch right now. --- Fix end-iterator-dereference and add test.
6 days	[CodeGen] More consistently expand float ops by default (#150597)	Nikita Popov	1	-17/+17
	These float operations were expanded for scalar f32/f64/f128, but not for f16 and more problematically, not for vectors. A small subset of them was separately set to expand for vectors. Change these to always expand by default, and adjust targets to mark these as legal where necessary instead. This is a much safer default, and avoids unnecessary legalization failures because a target failed to manually mark them as expand. Fixes https://github.com/llvm/llvm-project/issues/110753. Fixes https://github.com/llvm/llvm-project/issues/121390.
6 days	[COFF] Set .llvmbc and .llvmcmd to metadata section (#150879)	Haohai Wen	1	-1/+2
	These are metadata sections for ELF but were not properly set for COFF.
7 days	[AsmPrinter] Remove an unnecessary cast (NFC) (#150839)	Kazu Hirata	1	-4/+3
	getLabelAfterInsn() already returns MCSymbol *.
8 days	[IA] Fix a bug introduced by a recent refactoring	Philip Reames	1	-0/+6
	I had dropped the check for which intrinsics were supported. This is a quick fix to get the tree back into an unbroken state; a cleaner change may follow.
8 days	MCSectionXCOFF: Remove classof	Fangrui Song	2	-7/+9
	The object file format specific derived classes are used in contexts like MCStreamer and MCObjectTargetWriter where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSection::SectionVariant in the base class.
8 days	MCSectionCOFF: Remove classof	Fangrui Song	3	-8/+9
	The object file format specific derived classes are used in contexts like MCStreamer and MCObjectTargetWriter where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSection::SectionVariant in the base class.
8 days	DAG: Emit an error if trying to legalize read/write register with illegal types (#145197)	Matt Arsenault	2	-0/+58
	This is a starting point to have better legalization failure diagnostics.
9 days	MCSectionELF: Remove classof	Fangrui Song	1	-1/+1
	The object file format specific derived classes are used in contexts like MCStreamer and MCObjectTargetWriter where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSection::SectionVariant in the base class.
9 days	[CodeGenPrepare] Make sure that `AddOffset` is also a loop invariant (#150625)	Yingwei Zheng	1	-0/+4
	Closes https://github.com/llvm/llvm-project/issues/150611.
9 days	Revert "[BranchFolding] Kill common hoisted debug instructions" (#150632)	Orlando Cazalet-Hyams	1	-44/+6
	Reverts llvm/llvm-project#149999 https://lab.llvm.org/buildbot/#/builders/139/builds/17622
9 days	Reapply [BranchFolding] Kill common hoisted debug instructions (#149999)	Orlando Cazalet-Hyams	1	-6/+44
	Reapply #140091. branch-folder hoists common instructions from TBB and FBB into their pred. Without this patch it achieves this by splicing the instructions from TBB and deleting the common ones in FBB. That moves the debug locations and debug instructions from TBB into the pred without modification, which is not ideal. Debug locations are handled in #140063. This patch handles debug instructions - in the simplest way possible, which is to just kill (undef) them. We kill and hoist the ones in FBB as well as TBB because otherwise the fact there's an assignment on the code path is deleted (which might lead to a prior location extending further than it should). There's possibly something we could do to preserve some variable locations in some cases, but this is the easiest not-incorrect thing to do. Note I had to replace the constant DBG_VALUEs to use registers in the test- it turns out setDebugValueUndef doesn't undef constant DBG_VALUEs... which feels wrong to me, but isn't something I want to touch right now.
9 days	[IA] Recognize repeated masks which come from shuffle vectors (#150285)	Philip Reames	1	-0/+21
	This extends the fixed vector lowering to support the case where the mask is formed via the shufflevector idiom. --------- Co-authored-by: Luke Lau <luke_lau@icloud.com>