riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
9 hours	[VPlan] Compute cost of more replicating loads/stores in ::computeCost. (#160053)	Florian Hahn	4	-27/+130
	Update VPReplicateRecipe::computeCost to compute costs of more replicating loads/stores. There are 2 cases that require extra checks to match the legacy cost model: 1. If the pointer is based on an induction, the legacy cost model passes its SCEV to getAddressComputationCost. In those cases, still fall back to the legacy cost. SCEV computations will be added as a follow-up. 2. If a load is used as part of an address of another load, the legacy cost model skips the scalarization overhead. Those cases are currently handled by a usedByLoadOrStore helper. Note that getScalarizationOverhead also needs updating, because when the legacy cost model computes the scalarization overhead, scalars have not been collected yet, so we can't check for replicating recipes to skip their cost, except other loads. This again can be further improved by modeling inserts/extracts explicitly and consistently, and computing costs for those operations directly where needed. PR: https://github.com/llvm/llvm-project/pull/160053
10 hours	[DropUnnecessaryAssumes] Make the ephemeral value check more precise (#160700)	Nikita Popov	1	-7/+43
	The initial implementation used a very crude check where a value was considered ephemeral if it has only one use. This is insufficient if there are multiple assumes acting on the same value, or in more complex cases like cyclic phis. Generalize this to a more typical ephemeral value check, i.e. make sure that all transitive users are in assumes, while stopping at side-effecting instructions.
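A minimal sketch of the generalized check described above, assuming a toy use graph rather than LLVM IR. The function name `is_ephemeral`, the `users` dictionary, and the two sets are hypothetical illustration names, not the pass's API; terminators and stores are simply modeled as members of `side_effecting`.

```python
from collections import deque

def is_ephemeral(value, users, assumes, side_effecting):
    """Return True if all transitive users of `value` end up in assumes.

    users: dict mapping a value name to the names of values that use it.
    assumes / side_effecting: sets of value names (toy model, not LLVM API).
    """
    seen = {value}
    worklist = deque([value])
    while worklist:
        cur = worklist.popleft()
        for user in users.get(cur, []):
            if user in assumes:
                continue  # an assume is an acceptable end point
            if user in side_effecting:
                return False  # stop: this user keeps the value alive
            if user not in seen:  # the seen-set makes cyclic phis terminate
                seen.add(user)
                worklist.append(user)
    return True

# Two assumes acting on the same value: a one-use check would reject this.
users = {"x": ["cmp1", "cmp2"], "cmp1": ["assume1"], "cmp2": ["assume2"]}
print(is_ephemeral("x", users, {"assume1", "assume2"}, set()))

# Cyclic phis feeding an assume, and a variant with a store user.
cyclic = {"phi1": ["phi2", "cmp"], "phi2": ["phi1"], "cmp": ["assume"]}
print(is_ephemeral("phi1", cyclic, {"assume"}, set()))
stored = {"y": ["cmp", "store"], "cmp": ["assume"]}
print(is_ephemeral("y", stored, {"assume"}, {"store"}))
```

The worklist plus visited set is one common way to express "all transitive users"; the actual pass may structure the traversal differently.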
19 hours	[InstCombine] Transform `vector.reduce.add` and `splat` into multiplication (#161020)	Gábor Spaits	1	-0/+12
	Fixes #160066. Whenever we have a vector with all the same elements, created with `insertelement` and `shufflevector`, and we sum the vector, we have a multiplication.
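The arithmetic behind this fold can be checked with a small model. This is only an illustration in Python with wrapping fixed-width integers; `reduce_add`, `splat`, and `folded` are hypothetical helper names, and the 32-bit width is an assumption chosen for the example.

```python
def splat(x, n):
    """Model a vector whose n lanes all hold x."""
    return [x] * n

def reduce_add(vec, bits=32):
    """Model an integer add reduction with wrap-around at `bits` bits."""
    mask = (1 << bits) - 1
    total = 0
    for elem in vec:
        total = (total + elem) & mask
    return total

def folded(x, n, bits=32):
    """The folded form: a single multiplication by the lane count."""
    return (x * n) & ((1 << bits) - 1)

# The two forms agree, including when the sum wraps around.
for x in (0, 1, 7, 0x7FFFFFFF, 0xFFFFFFFF):
    assert reduce_add(splat(x, 4)) == folded(x, 4)
```

Because addition and multiplication wrap the same way modulo 2^bits, the equivalence holds for every lane value, not only for small ones.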
20 hours	[VPlan] Rewrite VPExpandSCEVExprs in replaceSymbolicStrides.	Florian Hahn	1	-0/+17
	Extend replaceSymbolicStrides to also replace SCEVUnknowns in VPExpandSCEVExprs using the information from StridesMaps. This results in simpler SCEV expansions in some cases.
24 hours	[VPlan] Remove dead code for scalar VFs in VPRegionBlock::cost (NFC).	Florian Hahn	1	-12/+3
	The VPlan cost model is not used to compute costs of scalar VFs currently, as conversion to replicate regions makes accurately computing the original scalar cost difficult. Remove left over, dead code.
31 hours	[VPlan] Move using VPlanPatternMatch to top in VPlanUtils.cpp (NFC).	Florian Hahn	1	-3/+1
	Only VPlan pattern matching is used in the file, move the using statement to the top level.
32 hours	[LV] Clarify nature of legacy CSE (NFC) (#160855)	Ramkumar Ramachandra	1	-3/+4
	In order to avoid conflating the legacy CSE with the VPlan-based one, rename the legacy CSE and insert a FIXME to clarify the nature of the legacy CSE.
44 hours	[VPlan] Allow multiple users of (broadcast %evl).	Florian Hahn	1	-1/+2
	CSE may replace multiple redundant broadcasts of EVL with a single broadcast which may have more than 1 user. Adjust the verifier to allow this. Fixes a crash when building llvm-test-suite with EVL: https://lab.llvm.org/buildbot/#/builders/210/builds/3303
45 hours	[VPlan] Mark VPInstruction::Broadcast as not reading/writing memory.	Florian Hahn	1	-0/+1
	This enables additional DCE/CSE opportunities and ensures that we don't end up with multiple redundant users of a VPInstruction using EVL. It fixes a verifier error in the added test_3_inductions test.
3 days	[InstCombine] Rotate transformation port from SelectionDAG to InstCombine (#160628)	Axel Sorenson	1	-0/+16
	The rotate transformation from https://github.com/llvm/llvm-project/blob/72c04bb882ad70230bce309c3013d9cc2c99e9a7/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L10312-L10337 has no middle-end equivalent in InstCombine. The following is a port of that transformation to InstCombine. Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
3 days	[profcheck][SimplifyCFG] Propagate !prof from `switch` to `select` (#159645)	Mircea Trofin	1	-11/+75
	Propagate `!prof` from `switch` instructions. Issue #147390
3 days	[ASan][RISCV] Teach AddressSanitizer to support indexed load/store. (#160443)	Hank Chang	1	-0/+19
	This patch is based on https://github.com/llvm/llvm-project/pull/159713. It extends AddressSanitizer to support indexed/segment instructions in RVV, enabling proper instrumentation for these memory operations. A new member, `MaybeOffset`, is added to `InterestingMemoryOperand` to describe the offset between the base pointer and the actual memory reference address. Co-authored-by: Yeting Kuo <yeting.kuo@sifive.com>
3 days	[VPlan] Run CSE closer to VPlan::execute. (#160572)	Florian Hahn	1	-1/+1
	Additional CSE opportunities are exposed after converting to concrete recipes/dissolving regions and materializing various expressions. Run CSE later, to capitalize on some of the late opportunities. PR: https://github.com/llvm/llvm-project/pull/160572
4 days	[profcheck] Add unknown branch weights for inlined strcmp/strncmp (#160455)	Jin Huang	1	-4/+13
	The strcmp/strncmp inliner creates new conditional branches but was failing to add profile metadata. This caused the ProfileVerifierPass to fail when profcheck is enabled. This patch fixes the issue by explicitly adding unknown branch weights to these branches. Issue #147390
4 days	[VPlan] Fix packed replication of struct types (#160274)	Luke Lau	1	-6/+17
	I ran into this crash when #158690 caused a loop with a struct call to be vectorized. If we have a replicate recipe in a branch-on-mask predicated region that's used by a widened recipe in another block then it will be packed together with the other lanes via a VPPredInstPHIRecipe. If we're replicating a call with a struct return type then we currently crash. The code that handles structs in packScalarIntoVectorizedValue seemed to be untested, at least in test/Transforms/LoopVectorize. There are two places that need to be fixed. The first is that the poison value that the scalar is packed into needs to use toVectorizedTy to correctly handle structs (not to be confused with toVectorTy!). The other is that VPPredInstPHIRecipe expects its operand to be an InsertElementInstr when stringing together the different lanes. For structs this will be an InsertValueInstr, and the value for the previous lane will be at the back of a chain of InsertValueInstrs.
4 days	[msan] Handle AVX512/AVX10 vrndscale (#160624)	Thurston Dang	1	-0/+56
	Uses the updated handleAVX512VectorGenericMaskedFP() from https://github.com/llvm/llvm-project/pull/159966
4 days	[SLP]Correctly set the insert point for insertelements with copyable arguments	Alexey Bataev	1	-2/+9
	Need to find the last insertelement instruction in the list for the copyable arguments; otherwise, a wrong def-use chain may be built. Fixes #160671
4 days	[LoopFusion] Detecting legal dependencies for fusion using DA info (#146383)	Alireza Torabian	1	-0/+42
	Loop fusion pass will use the information provided by the recent DA patch to fuse additional legal loops, including those with forward loop-carried dependencies.
4 days	[MemProf] Make sure call clones without callsite node clones get updated (#159861)	Teresa Johnson	1	-0/+115
	Because we may prune differing amounts of call context for different allocation contexts during matching (we only keep enough call context to distinguish cold from noncold paths), we can end up with different numbers of callsite node clones for different callsites in the same function. Any callsites that don't have node clones for all function clones should have their copies in those other function clones updated the same way as the version in the original function, which might be calling a clone of the callsite.
4 days	[llvm] Add `vfs::FileSystem` to `PassBuilder` (#160188)	Jan Svoboda	1	-6/+6
	Some LLVM passes need access to the filesystem to read configuration files and similar. In some places, this is achieved by grabbing the VFS from `PGOOptions`, but some passes don't have access to these and resort to just calling `vfs::getRealFileSystem()`. This PR allows setting the VFS directly on `PassBuilder` that's able to pass it down to all passes that need it.
4 days	[InstCombine] Remove redundant align 1 assumptions. (#160695)	Florian Hahn	1	-0/+4
	It seems like we have a bunch of align 1 assumptions in practice and unless I am missing something they should not add any value. See https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2861/files PR: https://github.com/llvm/llvm-project/pull/160695
4 days	[InstCombine] Skip replaceExtractElements for ConstantData (#160575)	Yingwei Zheng	1	-1/+5
	Closes https://github.com/llvm/llvm-project/issues/160507. Note: Replacing other users except for `ExtElt` is a bit strange to me. I tried to only replace `ExtElt` with a new extractelement, but it caused regressions on `widen_extract2/3`.
4 days	Reapply "[ControlHeightReduction] Drop lifetime annotations where necessary" (#160640)	Aiden Grossman	1	-8/+37
	Reapplies #159686. This reverts commit 4f33d7b7a9f39d733b7572f9afbf178bca8da127. The original landing of this patch had an issue where it would try to hoist allocas into the entry block that were already in the entry block. This would end up actually moving them lower in the block, potentially after users, resulting in invalid IR. This update fixes this by ensuring that we are only hoisting static allocas that have been sunk into a split basic block. A regression test has been added. Integration tested using a three stage build of clang with IRPGO enabled.
4 days	[LoopInterchange] Bail out when finding a dependency with all `*` elements (#149049)	Ryotaro Kasuga	1	-0/+11
	If a direction vector with all `*` elements, like `[* * *]`, is present, it indicates that none of the loop pairs are legal to interchange. In such cases, continuing the analysis is meaningless. This patch introduces a check to detect such direction vectors and exits early when one is found. This slightly reduces compile time.
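The early exit can be sketched as a scan over the dependency matrix. This is a toy model, not the pass's code: the list-of-lists representation and the name `has_all_star_vector` are assumptions made for illustration.

```python
def has_all_star_vector(dep_matrix):
    """Return True if some direction vector consists only of '*' entries.

    dep_matrix: list of direction vectors, each a list of strings such as
    '<', '=', '>' or '*' (toy representation, assumed for this sketch).
    """
    return any(
        len(row) > 0 and all(direction == "*" for direction in row)
        for row in dep_matrix
    )

# One all-'*' vector is enough to give up before any per-pair analysis.
blocked = [["<", "="], ["*", "*"]]
usable = [["*", "<"], ["=", "*"]]
print(has_all_star_vector(blocked), has_all_star_vector(usable))
```

A caller would run this check once up front and skip the more expensive legality analysis when it returns true.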
4 days	InstCombine: Check GEP operand is available (#160438)	Matt Arsenault	1	-2/+12
	Logic copied from the select case. Fixes #160302
4 days	[VPlan] Set correct flags when creating and cloning VPWidenCastRecipe.	Florian Hahn	3	-10/+15
	Make sure that we set the correct wrap flags when creating new VPWidenCastRecipes for truncs and preserve the flags from the recipe directly when cloning, to make sure they are not dropped. Fixes https://github.com/llvm/llvm-project/issues/160396
4 days	[DropUnnecessaryAssumes] Add support for operand bundles (#160311)	Nikita Popov	1	-12/+60
	This extends the DropUnnecessaryAssumes pass to also handle operand bundle assumes. For this purpose, export the affected value analysis for operand bundles from AssumptionCache. If the bundle only affects ephemeral values, drop it. If all bundles on an assume are dropped, drop the whole assume.
4 days	[VPlan] Create epilogue minimum iteration check in VPlan. (#157545)	Florian Hahn	3	-144/+187
	Move creation of the minimum iteration check for the epilogue vector loop to VPlan. This is a first step towards breaking up and moving skeleton creation for epilogue vectorization to VPlan. It moves most logic out of EpilogueVectorizerEpilogueLoop: the minimum iteration check is created directly in VPlan, connecting the check blocks from the main vector loop is done as post-processing. Next steps are to move connecting and updating the branches from the check blocks to VPlan, as well as updating the incoming values for phis. Test changes are improvements due to folding of live-ins. PR: https://github.com/llvm/llvm-project/pull/157545
5 days	[LV] Remove EVLIndVarSimplify pass (#160454)	Luke Lau	2	-301/+0
	Initially this was needed to replace the fixed-step canonical IV with the variable-step EVL IV, but this was eventually superseded by the loop vectorizer doing this transform itself in #147222. The pass was then removed from the RISC-V pipeline in #151483 and the loop vectorizer stopped emitting the metadata used by the pass in #155760, so now there are no users of it.
5 days	[msan][NFCI] Generalize handleAVX512VectorGenericMaskedFP() operands (#159966)	Thurston Dang	1	-16/+38
	This generalizes handleAVX512VectorGenericMaskedFP() (introduced in #158397), to potentially handle intrinsics that have A/WriteThru/Mask in an operand order that is different to AVX512/AVX10 rcp and rsqrt. Any operands other than A and WriteThru must be fully initialized. For example, the generalized handler could be applied in follow-up work to many of the AVX512 rndscale intrinsics:
	```
	<32 x half> @llvm.x86.avx512fp16.mask.rndscale.ph.512(<32 x half>, i32, <32 x half>, i32, i32)
	<16 x float> @llvm.x86.avx512.mask.rndscale.ps.512(<16 x float>, i32, <16 x float>, i16, i32)
	<8 x double> @llvm.x86.avx512.mask.rndscale.pd.512(<8 x double>, i32, <8 x double>, i8, i32)
	                                                   A             Imm  WriteThru  Mask  Rounding
	<8 x float> @llvm.x86.avx512.mask.rndscale.ps.256(<8 x float>, i32, <8 x float>, i8)
	<4 x float> @llvm.x86.avx512.mask.rndscale.ps.128(<4 x float>, i32, <4 x float>, i8)
	<4 x double> @llvm.x86.avx512.mask.rndscale.pd.256(<4 x double>, i32, <4 x double>, i8)
	<2 x double> @llvm.x86.avx512.mask.rndscale.pd.128(<2 x double>, i32, <2 x double>, i8)
	                                                   A             Imm  WriteThru  Mask
	```
5 days	[LV] Set extend kinds together with ExtOpTypes (NFC).	Florian Hahn	1	-10/+7
	Set extend kinds together with ExtOpTypes. This will make it easier to adjust the extend kind handling.
5 days	[profcheck] Option to inject distinct small weights (#159644)	Mircea Trofin	1	-29/+47
	There are cases where the easiest way to regression-test a profile change is to add `!prof` metadata, with small numbers so as to simplify manual verification. To ensure coverage, inserting this metadata by hand may become tedious. This patch makes `prof-inject` do that for us, if opted in. The list of weights used is a bunch of primes, used as a circular buffer. Issue #147390
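The "primes as a circular buffer" idea can be sketched as follows. The prime list, the function name `inject_weights`, and the per-branch arity input are all assumptions for illustration; the commit message does not state which primes or how many the pass uses.

```python
from itertools import cycle, islice

# Hypothetical weight list; the real values are not given in the source.
PRIMES = [2, 3, 5, 7, 11, 13]

def inject_weights(branch_arities, primes=PRIMES):
    """Assign small distinct weights to each branch, wrapping around.

    branch_arities: number of successors for each branch, in order.
    Returns one list of weights per branch.
    """
    source = cycle(primes)  # circular buffer over the prime list
    return [list(islice(source, arity)) for arity in branch_arities]

# Two-way branch, three-way switch, then another two-way branch.
print(inject_weights([2, 3, 2]))
```

Small, mutually distinct weights make it easy to see by eye which edge a weight came from after a transformation has moved or merged branches.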
5 days	[InstCombine] Fold selects into masked loads (#160522)	Matthew Devereau	1	-0/+10
	Selects can be folded into masked loads if the masks are identical.
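A small lane-wise model shows why such a fold is sound. The exact pattern matched by the commit is not spelled out here, so the shape below (a select on mask `m` between a masked load using the same `m` and another vector `y`) is an assumption, and `masked_load` and `select` are hypothetical Python stand-ins for the IR operations.

```python
def masked_load(memory, mask, passthru):
    """Lane i reads memory[i] when mask[i] is set, else takes passthru[i]."""
    return [m if bit else p for m, bit, p in zip(memory, mask, passthru)]

def select(mask, on_true, on_false):
    """Lane-wise select between two vectors."""
    return [t if bit else f for bit, t, f in zip(mask, on_true, on_false)]

memory = [10, 20, 30, 40]
mask = [True, False, True, False]
old_passthru = [0, 0, 0, 0]
y = [-1, -2, -3, -4]

# Before: load with some passthru, then select with the same mask.
before = select(mask, masked_load(memory, mask, old_passthru), y)
# After: one masked load whose passthru is the select's false operand.
after = masked_load(memory, mask, y)
assert before == after
```

With identical masks, the disabled lanes of the load are exactly the lanes the select replaces, so the load's original passthru never reaches the result.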
5 days	[LLVMContext] Add OB_align assume bundle op ID. (#158078)	Florian Hahn	1	-1/+1
	Assume operand bundles are emitted in a few more places now, including various places in libc++. Add a dedicated ID for them. PR: https://github.com/llvm/llvm-project/pull/158078
5 days	[LV] Don't create partial reductions if factor doesn't match accumulator (#158603)	Florian Hahn	4	-17/+29
	Check if the scale-factor of the accumulator is the same as the requested ScaleFactor in tryToCreatePartialReductions. This prevents creating partial reductions if not all instructions in the reduction chain form partial reductions, e.g. because we do not form a partial reduction for the loop exit instruction. Currently code-gen works fine, because the scale factor of VPPartialReduction is not used during ::execute, but it means we compute incorrect cost/register pressure, because the partial reduction won't reduce to the specified scaling factor. PR: https://github.com/llvm/llvm-project/pull/158603
5 days	[AssumptionCache] Don't use ResultElem for assumption list (NFC) (#160462)	Nikita Popov	1	-2/+2
	ResultElem stores a weak handle of an assume, plus an index for referring to a specific operand bundle. This makes sense for the results of assumptionsFor(), which refers to specific operands of assumes. However, assumptions() is a plain list of assumes. It does not contain separate entries for each operand bundle. The operand bundle index is always ExprResultIdx. As such, we should be directly using WeakVH for this case, without the additional wrapper.
5 days	[LV] Don't ignore invariant stores when costing (#158682)	Ramkumar Ramachandra	1	-14/+0
	Invariant stores of reductions are removed early in the VPlan construction, and there is no reason to ignore them while costing.
6 days	Reapply "[Coroutines] Add llvm.coro.is_in_ramp and drop return value of llvm.coro.end (#155339)" (#159278)	Weibo He	4	-19/+36
	As mentioned in #151067, the current design of llvm.coro.end mixes two functionalities: querying where we are and lowering to some code. This patch separates these functionalities into independent intrinsics by introducing a new intrinsic llvm.coro.is_in_ramp. Update a test in inline/ML, Reapply #155339
6 days	[LV] Check for hoisted safe-div selects in planContainsAdditionalSimp.	Florian Hahn	1	-9/+28
	In some cases, safe-divisor selects can be hoisted out of the vector loop. Catching all cases in the legacy cost model isn't possible, in particular checking if all conditions guarding a division are loop invariant. Instead, check in planContainsAdditionalSimplifications if there are any hoisted safe-divisor selects. If so, don't compare to the more inaccurate legacy cost model. Fixes https://github.com/llvm/llvm-project/issues/160354. Fixes https://github.com/llvm/llvm-project/issues/160356.
6 days	[SimplifyCFG] Avoid using isNonIntegralPointerType()	Alexander Richardson	1	-8/+19
	This is an overly broad check, the transformation made here can be done safely for pointers with index!=repr width. This fixes the codegen regression introduced by https://github.com/llvm/llvm-project/pull/105735 and should be beneficial for AMDGPU code-generation once the datalayout there no longer uses the overly strict `ni:` specifier. Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/159890
6 days	[SLPVectorizer] Move size checks (NFC) (#159361)	Mikhail Gudim	1	-15/+12
	Move size checks inside `isStridedLoad`. In the future we plan to possibly change the size and type of strided load there.
7 days	Revert "[ControlHeightReduction] Drop lifetime annotations where necessary (#159686)"	Aiden Grossman	1	-37/+8
	This reverts commit a00450944d2a91aba302954556c1c23ae049dfc7. Looks like this one is actually breaking the buildbots. Reverting the switch back to IRPGO did not fix things.
7 days	[TTI][ASan][RISCV] reland Move InterestingMemoryOperand to Analysis and embed in MemIntrinsicInfo #157863 (#159713)	Hank Chang	1	-6/+18
	[Previously reverted due to failures on asan-rvv-intrinsics.ll; the test case is RISC-V only, but it was triggered by other targets.] Reland [#157863](https://github.com/llvm/llvm-project/pull/157863), and add `; REQUIRES: riscv-registered-target` to the test case to skip configurations that do not register the RISC-V target. Previously ASan treated target intrinsics as black boxes, so it could not instrument accurate checks. This patch makes SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so that TTI lets targets describe their intrinsic information to ASan. Note: 1. This patch moves InterestingMemoryOperand from Transforms to Analysis. 2. It extends MemIntrinsicInfo by adding a SmallVector<InterestingMemoryOperand> member. 3. This patch does not support RVV indexed/segment load/store.
7 days	[LV][EVL] Remove metadata on EVL vectorized loops (#155760)	Shih-Po Hung	1	-20/+0
	This patch removes the metadata emission for EVL‑vectorized loops, since there is no current in-tree consumer: 1) after VPlan performs canonical IV replacement #147222 and 2) RISCV dropped EVLIndVarSimplifyPass #151483, which was the only user of this metadata.
7 days	[ControlHeightReduction] Drop lifetime annotations where necessary (#159686)	Aiden Grossman	1	-8/+37
	ControlHeightReduction will duplicate some blocks and insert phi nodes in exit blocks of regions that it operates on for any live values. This includes allocas. Having a lifetime annotation refer to a phi node was made illegal in 92c55a315eab455d5fed2625fe0f61f88cb25499, which causes the verifier to fail after CHR. There are some cases where we might not need to drop lifetime annotations (usually because we do not need the phi to begin with), but drop all annotations for now to be conservative. Fixes #159621.
7 days	[InferAlignment] Fix updating alignment when larger than i32 (#160109)	Joseph Huber	1	-1/+2
	Summary: The changes made in https://github.com/llvm/llvm-project/pull/156057 allow the alignment value to be increased. We assert effectively infinite alignment when the pointer argument is invalid / null. The problem is that, for whatever reason, the masked load / store functions use i32 for their alignment value, which means this gets truncated to zero. Add a special check for this; long term we probably want to just remove this argument entirely.
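The truncation is plain modular arithmetic and can be illustrated outside LLVM. In this sketch, the value `1 << 32` for the "effectively infinite" alignment is an assumption, and `trunc_to_i32` and `fits_in_i32` are hypothetical helpers; the guard shown is only one possible shape for the special check, which the commit message does not spell out.

```python
def trunc_to_i32(value):
    """Keep only the low 32 bits, as storing into an i32 would."""
    return value & 0xFFFFFFFF

def fits_in_i32(value):
    """Hypothetical guard: True if truncation would not change the value."""
    return trunc_to_i32(value) == value

HUGE_ALIGN = 1 << 32  # assumed stand-in for the "effectively infinite" alignment

print(trunc_to_i32(16))          # an ordinary alignment survives unchanged
print(trunc_to_i32(HUGE_ALIGN))  # the huge alignment wraps around
print(fits_in_i32(HUGE_ALIGN))   # so an update should be guarded
```

Any power of two at or above 2^32 has all of its low 32 bits clear, which is why the truncated alignment comes out as zero rather than as some smaller power of two.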
7 days	[VPlan] Avoid branching around State.get (NFC) (#159042)	Ramkumar Ramachandra	1	-9/+3

7 days	[VPlan] Add WidenGEP::getSourceElementType (NFC) (#159029)	Ramkumar Ramachandra	3	-17/+21

7 days	[Coroutines] Take byval param alignment into account when spilling to frame (#159765)	Hans Wennborg	1	-4/+8
	Fixes #159571
8 days	[LV] Set correct costs for interleave group members.	Florian Hahn	1	-3/+12
	This ensures each scalarized member has an accurate cost, matching the cost it would have had if it had not been considered for an interleave group.