Age | Commit message | Author | Files | Lines |
|
This optimization is performed as a separate pass over newly inserted PHI nodes to simplify and deduplicate them. By processing PHIs separately, we avoid the complexity of the reference-tracking bookkeeping needed to update BBValueInfo structures during insertion.
|
|
|
|
alternate DWARF SourceLanguage encoding (#162255)
This patch sets up `DICompileUnit` to support the DWARFv6
`DW_AT_language_name` and `DW_AT_language_version` attributes (which are
set to replace `DW_AT_language`). This patch changes the
`DICompileUnit::SourceLanguage` field type to a `DISourceLanguageName`
that encapsulates the notion of "versioned vs. unversioned name". A
"versioned" name is one that has an associated version stored separately
in `DISourceLanguageName::Version`.
This patch just changes all the clients of the `getSourceLanguage` API
to expect a `DISourceLanguageName`. Currently they all just `assert`
(via `DISourceLanguageName::getUnversionedName`) that we're dealing with
"unversioned names" (i.e., the pre-DWARFv6 language codes). In follow-up
patches (e.g., draft is at
https://github.com/llvm/llvm-project/pull/162261), when we start
emitting versioned language codes, the `getUnversionedName` calls can
then be adjusted to `getName`.
**Implementation considerations**
* We could have added a new member to `DICompileUnit` alongside the
existing `SourceLanguage` field. I don't think this would have made the
transition any simpler (clients would still need to be aware of
"versioned" vs. "unversioned" language names). I felt that encapsulating
this inside a `DISourceLanguageName` was easier to reason about for
maintainers.
* Currently `DISourceLanguageName` is a 12-byte structure. We could
probably pack all the info inside a `uint64_t` (16 bits for the name,
32 bits for the version, and 1 bit for answering `hasVersionedName`).
Just to keep the prototype simple I used a `std::optional`. But since
the guts of the structure are hidden, we can always change the layout to
a more compact representation instead.
**How to review**
* The new `DISourceLanguageName` structure is defined in
`DebugInfoMetadata.h`. All the other changes fall out from changing the
`DICompileUnit::SourceLanguage` from `unsigned` to
`DISourceLanguageName`.
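As a rough illustration of the encapsulation described above, here is a minimal standalone sketch (hypothetical class name, not the actual definition in `DebugInfoMetadata.h`) of a versioned-vs-unversioned source language name built on `std::optional`:

```cpp
// Minimal standalone sketch (hypothetical class name, not the actual LLVM
// definition) of the versioned-vs-unversioned encapsulation described above.
#include <cassert>
#include <cstdint>
#include <optional>

class SourceLanguageNameSketch {
  uint16_t Name = 0;               // DW_LNAME_* or legacy DW_LANG_* code.
  std::optional<uint32_t> Version; // Present only for versioned names.

public:
  // Pre-DWARFv6 style: a bare language code, no version.
  explicit SourceLanguageNameSketch(uint16_t LegacyCode) : Name(LegacyCode) {}
  // DWARFv6 style: a DW_LNAME_* name plus an associated version.
  SourceLanguageNameSketch(uint16_t Name, uint32_t Version)
      : Name(Name), Version(Version) {}

  bool hasVersionedName() const { return Version.has_value(); }
  uint16_t getName() const { return Name; }
  // Mirrors the assertion behaviour described for getUnversionedName():
  // callers that are not DWARFv6-aware insist on a legacy code.
  uint16_t getUnversionedName() const {
    assert(!hasVersionedName() && "expected a pre-DWARFv6 language code");
    return Name;
  }
  uint32_t getVersion() const {
    assert(hasVersionedName() && "unversioned names carry no version");
    return *Version;
  }
};
```

The compact `uint64_t` layout mentioned above could later replace the `std::optional` without changing this interface.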
|
|
For uniformity with other transforms.
|
|
llvm.coro.end (#155339)" (#159278)
As mentioned in #151067, the current design of llvm.coro.end mixes two functionalities: querying where we are and lowering to some code. This patch separates these functionalities into independent intrinsics by introducing a new intrinsic llvm.coro.is_in_ramp.
Update a test in inline/ML, Reapply #155339
|
|
Splitting out just the recipe finding code from #148626 into a utility
function (along with the extra pattern matchers). Hopefully this makes
reviewing a bit easier.
Added a gtest, since this isn't actually used anywhere yet.
|
|
of llvm.coro.end #153404"" (#159236)
Reverts llvm/llvm-project#155339 because of a CI failure
|
|
llvm.coro.end #153404" (#155339)
As mentioned in #151067, the current design of llvm.coro.end mixes two
functionalities: querying where we are and lowering to some code. This
patch separates these functionalities into independent intrinsics by
introducing a new intrinsic llvm.coro.is_in_ramp.
|
|
This patch adds a class that uses SSA construction, with debug values as
definitions, to determine whether and which debug values for a
particular variable are live at each point in an IR function. This will
be used by the IR reader of llvm-debuginfo-analyzer to compute variable
ranges and coverage, although it may be applicable to other debug info
IR analyses.
|
|
The m_GetElementPtr matcher is incorrect and incomplete. Fix it to match
all possible GEPs to avoid misleading users. It currently just has one
use, and the change is non-functional for that use.
|
|
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As the RFC explains, that metadata enables future patches, such as PR
#128785, to fix block frequency issues without losing estimated trip
counts.
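For illustration, here is a hedged sketch of how a transform might create a loop ID node carrying such an entry, following the generic `!llvm.loop` metadata conventions; the exact operand shape of `llvm.loop.estimated_trip_count` is defined by the RFC and follow-up patches, so the `i32` payload here is an assumption:

```cpp
// Hedged sketch: building a loop ID node that carries an estimated trip
// count, following the generic !llvm.loop metadata conventions. The exact
// operand shape of llvm.loop.estimated_trip_count is defined by the RFC and
// follow-up patches; the i32 payload here is an assumption for illustration.
#include "llvm/IR/Constants.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Metadata.h"
#include "llvm/IR/Type.h"
#include <cstdint>

using namespace llvm;

MDNode *makeLoopIDWithEstimatedTripCount(LLVMContext &Ctx, uint32_t Count) {
  Metadata *EstimatedTC[] = {
      MDString::get(Ctx, "llvm.loop.estimated_trip_count"),
      ConstantAsMetadata::get(ConstantInt::get(Type::getInt32Ty(Ctx), Count))};
  // The first operand of a loop ID node is a self-reference; it can only be
  // filled in after the (distinct) node has been created.
  Metadata *Ops[] = {nullptr, MDNode::get(Ctx, EstimatedTC)};
  MDNode *LoopID = MDNode::getDistinct(Ctx, Ops);
  LoopID->replaceOperandWith(0, LoopID);
  return LoopID;
}
```

The resulting node would then be attached to the loop latch's terminator as `!llvm.loop`, like other loop attributes.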
|
|
`buildBitSet` had a loop through the entire GlobalLayout to pick up matching
offsets.
The patch maps all offsets to the corresponding
`TypeId`, so we pass a prepared list of offsets into
`buildBitSet`.
On one large internal binary, `LowerTypeTests`
took 58% of ThinLTO link time before the patch.
After the patch it takes just 7% (an absolute saving of 200s).
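For reference, a generic sketch (simplified types, not the actual `LowerTypeTests` code) of the shape of the change: group offsets by type id in a single pass, so each bitset is built from a prepared list instead of re-scanning the whole layout:

```cpp
// Generic sketch (simplified types, not the actual LowerTypeTests code) of
// the shape of the change: group offsets by type id in a single pass, so
// each bitset is built from a prepared list instead of re-scanning the
// entire global layout for every type id.
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

struct LayoutEntry {
  std::string TypeId;
  uint64_t Offset;
};

std::unordered_map<std::string, std::vector<uint64_t>>
groupOffsetsByTypeId(const std::vector<LayoutEntry> &GlobalLayout) {
  std::unordered_map<std::string, std::vector<uint64_t>> OffsetsByTypeId;
  for (const LayoutEntry &E : GlobalLayout)
    OffsetsByTypeId[E.TypeId].push_back(E.Offset);
  // One O(N) pass overall, instead of an O(N) scan per type id inside
  // buildBitSet.
  return OffsetsByTypeId;
}
```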
|
|
Fixes #152348
SimplifyCFG collapses a raw buffer store from an if/else load into a
select.
This change prevents the TargetExtType dx.Rawbuffer from being replaced,
thus preserving the if/else blocks.
A further change was needed to eliminate the phi node before we process
Intrinsic::dx_resource_getpointer in DXILResourceAccess.cpp
|
|
Extend ExtractLane and FirstActiveLane to support scalable VFs. This
allows correct handling when interleaving with VF = 1.
Alive2 proofs:
- Fixed codegen with this patch: https://alive2.llvm.org/ce/z/8Y5_Vc
(verifies as correct)
- Original codegen: https://alive2.llvm.org/ce/z/twdg3X (doesn't
verify)
Fixes https://github.com/llvm/llvm-project/issues/154967.
|
|
llvm.coro.end (#153404)"
This reverts commit 19a4f520952c2b87de43e7176f34be9906384a33.
See test failure in https://github.com/llvm/llvm-project/pull/153404
|
|
(#153404)
As mentioned in #151067, the current design of `llvm.coro.end` mixes two
functionalities: querying where we are and lowering to some code. This
patch separates these functionalities into independent intrinsics by
introducing a new intrinsic `llvm.coro.is_in_ramp`.
|
|
types (#146171)
Up until now the seed collector could only collect seeds with the same
element type, for example `i32` and `<2 x i32>`.
This patch implements the collection of seeds with different types, like
`i32` and `i8`.
|
|
Move the logic to expand SCEVs directly to a late VPlan transform that
expands SCEVs in the entry block. This turns VPExpandSCEVRecipe into an
abstract recipe without execute, which clarifies how the recipe is
handled, i.e. it is not executed like regular recipes.
It also helps to simplify construction, as now scalar evolution isn't
required to be passed to the recipe.
|
|
Use VPIRMetadata for VPInterleaveRecipe to preserve noalias metadata
added by versioning.
This still uses InterleaveGroup's logic to preserve existing metadata
from IR. This can be migrated separately.
Fixes https://github.com/llvm/llvm-project/issues/153006.
PR: https://github.com/llvm/llvm-project/pull/153084
|
|
(#138472)
Add 3 new iterator ranges to VPPhiAccessors
* incoming_values(): returns a range over the incoming
values of a phi
* incoming_blocks(): returns a range over the incoming
blocks of a phi
* incoming_values_and_blocks(): returns a range over pairs of
incoming values and blocks.
Depends on https://github.com/llvm/llvm-project/pull/124838.
PR: https://github.com/llvm/llvm-project/pull/138472
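To illustrate the shape of the API, here is a standalone sketch (the range names come from the commit; the types are simplified stand-ins, not the real VPlan classes):

```cpp
// Standalone sketch of the API shape (the range names come from the commit;
// the types are simplified stand-ins, not the real VPlan classes).
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Value;
struct Block;

class PhiSketch {
  std::vector<Value *> Values;
  std::vector<Block *> Blocks;

public:
  void addIncoming(Value *V, Block *B) {
    Values.push_back(V);
    Blocks.push_back(B);
  }
  const std::vector<Value *> &incoming_values() const { return Values; }
  const std::vector<Block *> &incoming_blocks() const { return Blocks; }
  // Pairs of (incoming value, incoming block), kept in sync by construction.
  std::vector<std::pair<Value *, Block *>> incoming_values_and_blocks() const {
    assert(Values.size() == Blocks.size());
    std::vector<std::pair<Value *, Block *>> Pairs;
    for (std::size_t I = 0; I < Values.size(); ++I)
      Pairs.emplace_back(Values[I], Blocks[I]);
    return Pairs;
  }
};
```

A consumer could then write, e.g., `for (auto [V, B] : Phi.incoming_values_and_blocks()) ...` instead of indexing values and blocks separately.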
|
|
PredicateInfo needs some no-op to which the predicate can be attached.
Currently this is an ssa.copy intrinsic. This PR replaces it with a
no-op bitcast.
Using a bitcast is more efficient because we don't have the overhead of
an overloaded intrinsic. It also makes things slightly simpler overall.
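As a hedged sketch using the public IR API (the exact construction inside PredicateInfo may differ), a same-type bitcast acting as such a no-op anchor could be created like this:

```cpp
// Hedged sketch using the public IR API (the exact construction inside
// PredicateInfo may differ): a same-type bitcast acting as the no-op anchor
// that predicate information can be attached to.
#include "llvm/ADT/Twine.h"
#include "llvm/IR/Instructions.h"

llvm::Instruction *createPredicateAnchor(llvm::Value *Op,
                                         llvm::Instruction *InsertBefore) {
  // Source and destination types are identical, so this is semantically a
  // copy of Op, but it is a distinct instruction that can carry the
  // predicate bookkeeping.
  return new llvm::BitCastInst(Op, Op->getType(), Op->getName() + ".copy",
                               InsertBefore);
}
```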
|
|
Split up the unclearly named prepareForVectorization transform into
buildVPlan0, which adds the vector preheader, middle and scalar
preheader blocks, as well as the canonical induction recipes and sets
the trip count. The new transform is run directly after building the
plain CFG VPlan initially.
The remaining code handling early exits and adding the branch in the
middle block is renamed to handleEarlyExitsAndAddMiddleCheck and still
runs at the original position.
With the code movement, we only have to add the skeleton once to the
initial VPlan, and cloning will take care of the rest. It will also
enable moving other construction steps to work directly on VPlan0, like
adding resume phis.
PR: https://github.com/llvm/llvm-project/pull/150848
|
|
The CMake flag has been on by default for a month without any issues.
This makes the feature support in LLVM unconditional (but does not
enable the feature by default).
|
|
(#150847)
The initial VPlan closely reflects the original scalar loop, so using
VPWidenPHIRecipe here is premature. Widened phi recipes should only be
introduced together with other widened recipes.
PR: https://github.com/llvm/llvm-project/pull/150847
|
|
This patch extends the logic added in
https://github.com/llvm/llvm-project/pull/128061 to support
dereferenceability information from assumptions as well.
Unfortunately both assumption cache and the dominator tree need to be
threaded through multiple layers to make them available where needed.
PR: https://github.com/llvm/llvm-project/pull/147047
|
|
This fixes a LeakSanitizer failure on the sanitizer buildbots:
https://lab.llvm.org/buildbot/#/builders/52/builds/10088
|
|
Without dumping, the faulty recipe isn't printed, so account for that
as in the other tests. Fixes the buildbot failure at
https://lab.llvm.org/buildbot/#/builders/2/builds/30229
|
|
Noticed this when checking the invariant that all phis in the header
block must be header phis. I think there's a missing set of parentheses
here, since otherwise it only applies cast<VPInstruction> when RecipeI isn't a
VPInstruction.
|
|
This enables creating VPBlendRecipes without an underlying PHINode.
|
|
This is one of the final remaining debug-intrinsic-specific codepaths
out there, along with pieces of cross-LLVM infrastructure to do with
debug intrinsics.
|
|
At this stage I'm just opportunistically deleting any code using
debug-intrinsic types, largely adjacent to calls to findDbgUsers. I'll
get to deleting that in probably one or two more commits.
|
|
There are no longer debug-info instructions, thus we don't need this
skipping. Hooray!
|
|
Patch 1/4 adding bitcode support.
Store whether or not a function is using Key Instructions in its DISubprogram so
that we don't need to rely on the -mllvm flag -dwarf-use-key-instructions to
determine whether or not to interpret Key Instructions metadata to decide
is_stmt placement at DWARF emission time. This makes bitcode support simple and
enables well-defined mixing of non-key-instructions and key-instructions
functions in an LTO context.
This patch adds the bit (using DISubprogram::SubclassData1).
PR 144104 and 144103 use it during DWARF emission.
PR 44102 adds bitcode
support.
See pull request for overview of alternative attempts.
|
|
(#143206)
The `SeedCollector` class gets two new arguments: `CollectStores` and
`CollectLoads`. These replace the `sbvec-collect-seeds` cl::opt flag.
This is done to help with reusing the SeedCollector class in a future
pass. The cl::opt flag is moved to the seed collection pass:
Passes/SeedCollection.cpp
|
|
It should not be possible to construct one without a triple. It would
also be nice to delete TargetLibraryInfoWrapperPass, but that is more
difficult.
|
|
(#142284)
Add a new getNumOperandsForOpcode helper to determine the number of
operands from the opcode. For now, it is used to verify the number of
operands at VPInstruction construction.
It returns -1 for a few opcodes where the number of operands cannot be
determined (GEP, Switch, PHI, Call).
This can also be used in a follow-up to determine if a VPInstruction is
masked based on the number of arguments.
PR: https://github.com/llvm/llvm-project/pull/142284
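For illustration, a generic sketch of the idea (plain C++ stand-ins, not the actual VPlan opcode table): a switch mapping each opcode to its fixed operand count, returning -1 for the variadic cases:

```cpp
// Generic sketch of the idea (plain C++ stand-ins, not the actual VPlan
// opcode table): map each opcode to its fixed operand count, returning -1
// when the count cannot be derived from the opcode alone.
#include <cstdio>

enum class Opcode { Not, Add, ICmp, Select, GEP, Switch, PHI, Call };

int getNumOperandsForOpcode(Opcode Op) {
  switch (Op) {
  case Opcode::Not:
    return 1;
  case Opcode::Add:
  case Opcode::ICmp:
    return 2;
  case Opcode::Select:
    return 3;
  case Opcode::GEP:
  case Opcode::Switch:
  case Opcode::PHI:
  case Opcode::Call:
    return -1; // Variadic: operand count is not fixed by the opcode.
  }
  return -1;
}

int main() {
  std::printf("Select expects %d operands\n",
              getNumOperandsForOpcode(Opcode::Select));
  // A verifier-style check, and the planned "is this recipe masked?" query,
  // would compare the actual operand count against this value.
}
```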
|
|
We currently generate code like this on x86 for a jump table with 5 elements,
assuming the call target is in rbx:
lea global_addr(%rip), %rax # initialize temporary rax with base address
mov %rbx, %rcx # initialize another temporary rcx for index (rbx will be used for the call, so it is still live)
sub %rax, %rcx # compute `address - base`
ror $0x3, %rcx # compute `(address - base) ror 3` i.e. index
cmp $0x4, %rcx # check index <= 4
ja .Ltrap
[...]
.Ltrap:
ud1
A more efficient instruction sequence, which needs only one temporary
register and one fewer instruction, is possible by subtracting the
address we are testing from the fixed address instead of vice versa:
lea (global_addr + 4*8)(%rip), %rax # initialize temporary rax with address of last element
sub %rbx, %rax # compute `last element - address`
ror $0x3, %rax # compute `(last element - address) ror 3` i.e. 4 - index
cmp $0x4, %rax # check 4 - index <= 4 (same as above)
ja .Ltrap
[...]
.Ltrap:
ud1
Change LowerTypeTests to generate that sequence. As a consequence, the
order of bits in the bitsets is reversed. Because it doesn't matter how we
do the subtraction on other architectures (to the best of my knowledge),
do so unconditionally.
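As a sanity check of the equivalence, here is a small standalone program (plain C++, hypothetical base address and stride, not the LowerTypeTests code) verifying that both sequences accept exactly the same set of addresses for a 5-entry, 8-byte-stride table:

```cpp
// Small standalone check (plain C++, hypothetical base address and stride,
// not the LowerTypeTests code) that the old and new sequences accept exactly
// the same set of addresses for a 5-entry jump table with 8-byte stride.
#include <cassert>
#include <cstdint>

static uint64_t rotr64(uint64_t V, unsigned N) {
  return (V >> N) | (V << (64 - N));
}

int main() {
  const uint64_t Base = 0x1000;  // stand-in for global_addr
  const uint64_t Stride = 8;     // 8-byte entries => ror by 3
  const uint64_t NumEntries = 5; // valid indices 0..4
  const uint64_t Last = Base + (NumEntries - 1) * Stride;

  for (uint64_t Addr = Base - 64; Addr <= Base + 128; ++Addr) {
    // Original sequence: index = (Addr - Base) ror 3; pass if index <= 4.
    bool Old = rotr64(Addr - Base, 3) <= NumEntries - 1;
    // New sequence: (Last - Addr) ror 3 == 4 - index; pass if it is <= 4.
    bool New = rotr64(Last - Addr, 3) <= NumEntries - 1;
    assert(Old == New && "both checks accept exactly the same addresses");
  }
  return 0;
}
```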
Reviewers: fmayer, vitalybuka
Reviewed By: fmayer
Pull Request: https://github.com/llvm/llvm-project/pull/142887
|
|
This reverts commit 31abf0774232735ad7a7d45e531497305bf99fae.
|
|
Most of the recent development on the MemProfiler has been on the Use part. The instrumentation has been quite stable for a while. As the complexity of the use grows (with undrifting, diagnostics, etc.) I figured it would be good to separate these two implementations.
|
|
This reverts commit 1268352656f81ea173860a8002aadb88844137e7.
|
|
This patch implements a simple pass that tries to de-duplicate packs. If
there are two packing patterns inserting the exact same values in the
exact same order, then we will keep the top-most one of them. Even
though such patterns may be optimized away by subsequent passes, it is
still useful to do this within the vectorizer because otherwise the cost
estimation may be off, making the vectorizer overly conservative.
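For illustration, a generic sketch of the de-duplication idea (stand-in types, not the sandbox vectorizer's actual data structures): walk the packs top-down and map each later duplicate to the first pack that inserts the same values in the same order:

```cpp
// Generic sketch of the de-duplication idea (stand-in types, not the sandbox
// vectorizer's actual data structures): walking packs top-down, map each
// later duplicate to the first pack that inserts the same values in the same
// order.
#include <cstddef>
#include <vector>

struct Value;                      // stand-in for a packed scalar value
using Pack = std::vector<Value *>; // a pack is an ordered list of inserts

// For each pack, return the index of the pack it should be replaced with
// (its own index if it is the first occurrence of that value sequence).
std::vector<std::size_t>
deduplicatePacks(const std::vector<Pack> &PacksTopDown) {
  std::vector<std::size_t> ReplaceWith(PacksTopDown.size());
  for (std::size_t I = 0; I < PacksTopDown.size(); ++I) {
    ReplaceWith[I] = I;
    for (std::size_t J = 0; J < I; ++J)
      if (PacksTopDown[J] == PacksTopDown[I]) { // same values, same order
        ReplaceWith[I] = J; // reuse the top-most identical pack
        break;
      }
  }
  return ReplaceWith;
}
```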
|
|
Use regular VPPhi instead of a separate opcode for resume phis. This
removes an unneeded specialized opcode and unifies the code
(verification, printing, updating when CFG is changed).
Depends on https://github.com/llvm/llvm-project/pull/140132.
PR: https://github.com/llvm/llvm-project/pull/140405
|
|
Follow-up to https://github.com/llvm/llvm-project/pull/141428, to also
use EMIT-SCALAR for VPPhis that are single scalars.
|
|
Update initial construction to connect the Plan's entry to the scalar
preheader during initial construction. This moves a small part of the
skeleton creation out of ILV and will also enable replacing
VPInstruction::ResumePhi with regular VPPhi recipes.
Resume phis need 2 incoming values to start with, the second being the
bypass value from the scalar ph (and used to replicate the incoming
value for other bypass blocks). Adding the extra edge ensures the
incoming values for resume phis match the incoming blocks.
PR: https://github.com/llvm/llvm-project/pull/140132
|
|
This patch moves the logic to manage IR flags to a separate VPIRFlags
class. For now, VPRecipeWithIRFlags is the only class that inherits
VPIRFlags. The new class allows for simpler passing of flags when
constructing recipes, simplifying the constructors for various recipes
(VPInstruction in particular, which now just has 2 constructors, one
taking an extra VPIRFlags argument).
This mirrors the approach taken for VPIRMetadata and makes it easier to
extend in the future. The patch also adds a unified flagsValidForOpcode
to check if the flags in a VPIRFlags match the provided opcode.
PR: https://github.com/llvm/llvm-project/pull/140621
|
|
This reverts commit 204252e2df80876702616518a5154dccacf3ebac.
Recommit with a fix for the leak in a unit test.
|
|
This reverts commit 793bb6b257fa4d9f4af169a4366cab3da01f2e1f.
The recommitted version contains a fix to make sure only the original
phis are processed in convertPhisToBlends by collecting them in a vector
first. This fixes a crash when no mask is needed, because there is only
a single incoming value.
Original message:
This patch moves the logic to predicate and linearize a VPlan to a
dedicated VPlan transform. It mostly ports the existing logic directly.
There are a number of follow-ups planned in the near future to
further improve on the implementation:
* Edge and block masks are cached in VPPredicator, but the block masks
are still made available to VPRecipeBuilder, so they can be accessed
during recipe construction. As a follow-up, this should be replaced by
adding mask operands to all VPInstructions that need them and use that
during recipe construction.
* The mask caching in a map also means that this map needs updating each
time a new recipe replaces a VPInstruction; this would also be handled
by adding mask operands.
PR: https://github.com/llvm/llvm-project/pull/128420
|
|
This reverts commit b263c08e1a0b54a871915930aa9a1a6ba205b099.
Looks like this triggers a crash in one of the Fortran tests. Reverting
while I investigate
https://lab.llvm.org/buildbot/#/builders/41/builds/6825
|
|
This patch moves the logic to predicate and linearize a VPlan to a
dedicated VPlan transform. It mostly ports the existing logic directly.
There are a number of follow-ups planned in the near future to
further improve on the implementation:
* Edge and block masks are cached in VPPredicator, but the block masks
are still made available to VPRecipeBuilder, so they can be accessed
during recipe construction. As a follow-up, this should be replaced by
adding mask operands to all VPInstructions that need them and use that
during recipe construction.
* The mask caching in a map also means that this map needs updating each
time a new recipe replaces a VPInstruction; this would also be handled
by adding mask operands.
PR: https://github.com/llvm/llvm-project/pull/128420
|
|
Of the 128 bits of a buffer descriptor only 48 bits are address bits, so
following the discussion on https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54,
the logical conclusion is to set the index width to 48 bits instead of
the current value of 128.
Most of the test changes are mechanical datalayout updates, but there
is one actual change: the ptrmask test now uses .i48 instead of .i128
and I had to update SelectionDAGBuilder to correctly extend the mask.
Reviewed By: krzysz00
Pull Request: https://github.com/llvm/llvm-project/pull/139419
|