riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
13 hours	[llvm][clang] Use the VFS in `GCOVProfilerPass` (#161260)	Jan Svoboda	1	-12/+18
	This PR starts using the correct VFS in `GCOVProfilerPass` instead of using the real FS directly. This matches compiler's behavior for other input files.
4 days	[ASan][RISCV] Teach AddressSanitizer to support indexed load/store. (#160443)	Hank Chang	1	-0/+19
	This patch is based on https://github.com/llvm/llvm-project/pull/159713 This patch extends AddressSanitizer to support indexed/segment instructions in RVV. It enables proper instrumentation for these memory operations. A new member, `MaybeOffset`, is added to `InterestingMemoryOperand` to describe the offset between the base pointer and the actual memory reference address. Co-authored-by: Yeting Kuo <yeting.kuo@sifive.com>
4 days	[msan] Handle AVX512/AVX10 vrndscale (#160624)	Thurston Dang	1	-0/+56
	Uses the updated handleAVX512VectorGenericMaskedFP() from https://github.com/llvm/llvm-project/pull/159966
5 days	[llvm] Add `vfs::FileSystem` to `PassBuilder` (#160188)	Jan Svoboda	1	-6/+6
	Some LLVM passes need access to the filesystem to read configuration files and similar. In some places, this is achieved by grabbing the VFS from `PGOOptions`, but some passes don't have access to these and resort to just calling `vfs::getRealFileSystem()`. This PR allows setting the VFS directly on `PassBuilder` that's able to pass it down to all passes that need it.
5 days	Reapply "[ControlHeightReduction] Drop lifetime annotations where necessary" ↵	Aiden Grossman	1	-8/+37
	(#160640) Reapplies #159686 This reverts commit 4f33d7b7a9f39d733b7572f9afbf178bca8da127. The original landing of this patch had an issue where it would try and hoist allocas into the entry block that were in the entry block. This would end up actually moving them lower in the block potentially after users, resulting in invalid IR. This update fixes this by ensuring that we are only hoisting static allocas that have been sunk into a split basic block. A regression test has been added. Integration tested using a three stage build of clang with IRPGO enabled.
5 days	[msan][NFCI] Generalize handleAVX512VectorGenericMaskedFP() operands (#159966)	Thurston Dang	1	-16/+38
	This generalizes handleAVX512VectorGenericMaskedFP() (introduced in #158397), to potentially handle intrinsics that have A/WriteThru/Mask in an operand order that is different to AVX512/AVX10 rcp and rsqrt. Any operands other than A and WriteThru must be fully initialized. For example, the generalized handler could be applied in follow-up work to many of the AVX512 rndscale intrinsics: ``` <32 x half> @llvm.x86.avx512fp16.mask.rndscale.ph.512(<32 x half>, i32, <32 x half>, i32, i32) <16 x float> @llvm.x86.avx512.mask.rndscale.ps.512(<16 x float>, i32, <16 x float>, i16, i32) <8 x double> @llvm.x86.avx512.mask.rndscale.pd.512(<8 x double>, i32, <8 x double>, i8, i32) A Imm WriteThru Mask Rounding <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256(<8 x float>, i32, <8 x float>, i8) <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128(<4 x float>, i32, <4 x float>, i8) <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256(<4 x double>, i32, <4 x double>, i8) <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128(<2 x double>, i32, <2 x double>, i8) A Imm WriteThru Mask ```
7 days	Revert "[ControlHeightReduction] Drop lifetime annotations where necessary ↵	Aiden Grossman	1	-37/+8
	(#159686)" This reverts commit a00450944d2a91aba302954556c1c23ae049dfc7. Looks like this one is actually breaking the buildbots. Reverting the switch back to IRPGO did not fix things.
7 days	[TTI][ASan][RISCV] reland Move InterestingMemoryOperand to Analysis and ↵	Hank Chang	1	-6/+18
	embed in MemIntrinsicInfo #157863 (#159713) [Previously reverted due to failures on asan-rvv-intrinsics.ll, the test case is riscv only and it is triggered by other target] Reland [#157863](https://github.com/llvm/llvm-project/pull/157863), and add `; REQUIRES: riscv-registered-target` in test case to skip the configuration that doesn't register riscv target. Previously asan considers target intrinsics as black boxes, so asan could not instrument accurate check. This patch make SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so that TTI can make targets describe their intrinsic informations to asan. Note, 1. This patch move InterestingMemoryOperand from Transforms to Analysis. 2. Extend MemIntrinsicInfo by adding a SmallVector<InterestingMemoryOperand> member. 3. This patch does not support RVV indexed/segment load/store.
8 days	[ControlHeightReduction] Drop lifetime annotations where necessary (#159686)	Aiden Grossman	1	-8/+37
	ControlHeightReduction will duplicate some blocks and insert phi nodes in exit blocks of regions that it operates on for any live values. This includes allocas. Having a lifetime annotation refer to a phi node was made illegal in 92c55a315eab455d5fed2625fe0f61f88cb25499, which causes the verifier to fail after CHR. There are some cases where we might not need to drop lifetime annotations (usually because we do not need the phi to begin with), but drop all annotations for now to be conservative. Fixes #159621.
11 days	Fix NDEBUG Wundef warning; NFC (#159539)	Sven van Haastregt	1	-1/+1
	The `NDEBUG` macro is tested for defined-ness everywhere else. The instance here triggers a warning when compiling with `-Wundef`.
11 days	Revert "[TTI][ASan][RISCV] Move InterestingMemoryOperand to Analysis and ↵	Florian Mayer	1	-18/+6
	embed in MemIntrinsicInfo" (#159700) Reverts llvm/llvm-project#157863
11 days	[TTI][ASan][RISCV] Move InterestingMemoryOperand to Analysis and embed in ↵	Hank Chang	1	-6/+18
	MemIntrinsicInfo (#157863) Previously asan considers target intrinsics as black boxes, so asan could not instrument accurate check. This patch make SmallVector<InterestingMemoryOperand> a member of MemIntrinsicInfo so that TTI can make targets describe their intrinsic informations to asan. Note, 1. This patch move InterestingMemoryOperand from Transforms to Analysis. 2. Extend MemIntrinsicInfo by adding a SmallVector<InterestingMemoryOperand> member. 3. This patch does not support RVV indexed/segment load/store.
14 days	Re-apply "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional ↵	Mingming Liu	1	-3/+2
	update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159161) This is a reland of https://github.com/llvm/llvm-project/pull/158460 Test failures are gone once I undo the changes in codegenprepare.
14 days	Revert "[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional ↵	Mingming Liu	1	-2/+3
	update if existing prefix is not equivalent to the new one. Returns whether prefix changed." (#159159) Reverts llvm/llvm-project#158460 due to buildbot failures
14 days	[NFCI][Globals] In GlobalObjects::setSectionPrefix, do conditional update if ↵	Mingming Liu	1	-3/+2
	existing prefix is not equivalent to the new one. Returns whether prefix changed. (#158460) Before this change, `setSectionPrefix` overwrites existing section prefix with new one unconditionally. After this change, `setSectionPrefix` checks for equivalences, updates conditionally and returns whether an update happens. Update the existing callers to make use of the return value. [PR 155337](https://github.com/llvm/llvm-project/pull/155337/files#diff-cc0c67ac89807f4453f0cfea9164944a4650cd6873a468a0f907e7158818eae9) is a motivating use case whether the 'update' semantic is needed.
2025-09-15	[msan] Handle AVX512/AVX10 rcp and rsqrt (#158397)	Thurston Dang	1	-0/+165
	Adds a new handler, handleAVX512VectorGenericMaskedFP(), and applies it to AVX512/AVX10 rcp and rsqrt
2025-09-11	[msan] Handle AVX512 pack with saturation intrinsics (#157984)	Thurston Dang	1	-0/+21
	Approximately handle avx512_{packssdw/packsswb/packusdw/packuswb} with the existing handleVectorPackIntrinsic(), instead of relying on the default (strict) handler.
2025-09-10	[Instrumentation] Fix formatting of MemorySanitizer.cpp	Kazu Hirata	1	-1/+2

2025-09-10	Mark variable as maybe unused (only used in debug mode) (#157875)	Karlo Basioli	1	-1/+1

2025-09-10	[x86][AVX-VNNI] Fix VPDPBUSD Argument Types (#155194)	BaiXilin	1	-21/+21
	Fixed intrinsic VPDPBUSD[,S]_128/256/512's argument types to match with the ISA. Fixes part of #97271
2025-09-02	[msan] Fix multiply-add-accumulate (#153927) to use ReductionFactor (#155748)	Thurston Dang	1	-4/+6
	https://github.com/llvm/llvm-project/pull/153927 incorrectly cast using a hardcoded reduction factor of two, rather than using the parameter. This caused false negatives but not false positives. (The only incorrect case was a reduction factor of four; if four values {A,B,C,D} are being reduced, the result is fully zero iff {A,B} and {C,D} are both zero after pairwise reduction. If only one of those reduced pairs is zero, then the quadwise reduction is non-zero.)
2025-08-27	[MemProf] Extend MemProfUse pass to make use of data access profiles to ↵	Mingming Liu	1	-3/+117
	partition data (#151238) https://github.com/llvm/llvm-project/commit/f3f28323adbb9d01372d81b4c78ed94683e58757 introduces the data access profile format as a payload inside [memprof](https://llvm.org/docs/InstrProfileFormat.html#memprof-profile-data), and the MemProfUse pass reads the memprof payload. This change extends the MemProfUse pass to read the data access profiles to annotate global variables' section prefix. 1. If there are samples for a global variable, it's annotated as hot. 2. If a global variable is seen in the profiled binary file but doesn't have access samples, it's annotated as unlikely. Introduce an option `annotate-static-data-prefix` to flag-gate the global-variable annotation path, and make it false by default. https://github.com/llvm/llvm-project/pull/155337 is the (WIP) draft change to "reconcile" two sources of hotness.
2025-08-27	[ASan] Prevent assert from scalable vectors in FunctionStackPoisoner. (#155357)	David Green	1	-1/+3
	This has recently started causing 'Invalid size request on a scalable vector.'
2025-08-26	[hwasan] Add hwasan-static-linking option (#154529)	Vadim Marchenko	1	-14/+34
	Discarding the `.note.hwasan.globals` section in ldscript causes a linker error, since `hwasan_globals` refers to the discarded section. The issue comes from `hwasan.dummy.global` being associated via metadata with `.note.hwasan.globals`. Add a new `-hwasan-static-linking` option to skip inserting `.note.hwasan.globals` for static binaries, as it is only needed for instrumenting globals from dynamic libraries. In static binaries, the global variables section can be accessed directly via the `__start_hwasan_globals` and `__stop_hwasan_globals` symbols inserted by the linker.
2025-08-25	[msan][NFCI] Refactor visitIntrinsicInst() into instruction families (#154878)	Thurston Dang	1	-49/+88
	Currently visitIntrinsicInst() is a long, partly unsorted list. This patch groups them into cross-platform, X86 SIMD, and Arm SIMD families, making the overall intent of visitIntrinsicInst() clearer: ``` void visitIntrinsicInst(IntrinsicInst &I) { if (maybeHandleCrossPlatformIntrinsic(I)) return; if (maybeHandleX86SIMDIntrinsic(I)) return; if (maybeHandleArmSIMDIntrinsic(I)) return; if (maybeHandleUnknownIntrinsic(I)) return; visitInstruction(I); } ``` There is one disadvantage: the compiler will not tell us if the switch statements in the handlers have overlapping coverage.
2025-08-21	[msan] Handle AVX512 VCVTPS2PH (#154460)	Thurston Dang	1	-30/+102
	This extends handleAVX512VectorConvertFPToInt() from 556c8467d15a131552e3c84478d768bafd95d4e6 (https://github.com/llvm/llvm-project/pull/147377) to handle AVX512 VCVTPS2PH.
2025-08-21	[hwasan] Port "[Asan] Skip pre-split coroutine and noop coroutine frame ↵	Thurston Dang	1	-0/+3
	(#99415)" (#154803) Originally suggested by rnk@ (this is the simplified function-level skip version, to unblock builds ASAP)
2025-08-18	[msan] Handle multiply-add-accumulate; apply to AVX Vector Neural Network ↵	Thurston Dang	1	-13/+174
	Instructions (VNNI) (#153927) This extends the pmadd handler (recently improved in https://github.com/llvm/llvm-project/pull/153353) to three-operand intrinsics (multiply-add-accumulate), and applies it to the AVX Vector Neural Network Instructions. Updates the tests from https://github.com/llvm/llvm-project/pull/153135
2025-08-18	[msan] Add Instrumentation for Avx512 Instructions: pmaddw, pmaddubs (#153919)	Thurston Dang	1	-0/+18
	This applies the pmadd handler (recently improved in https://github.com/llvm/llvm-project/pull/153353) to the Avx512 equivalent of the pmaddw and pmaddubs intrinsics: <16 x i32> @llvm.x86.avx512.pmaddw.d.512(<32 x i16>, <32 x i16>) <32 x i16> @llvm.x86.avx512.pmaddubs.w.512(<64 x i8>, <64 x i8>)
2025-08-15	[msan] Handle SSE/AVX pshuf intrinsic by applying to shadow (#153895)	Thurston Dang	1	-0/+22
	llvm.x86.sse.pshuf.w(<1 x i64>, i8) and llvm.x86.avx512.pshuf.b.512(<64 x i8>, <64 x i8>) are currently handled strictly, which is suboptimal. llvm.x86.ssse3.pshuf.b(<1 x i64>, <1 x i64>) llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>) and llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>) are currently heuristically handled using maybeHandleSimpleNomemIntrinsic, which is incorrect. Since the second argument is the shuffle order, we instrument all these intrinsics using `handleIntrinsicByApplyingToShadow(..., /trailingVerbatimArgs=/1)` (https://github.com/llvm/llvm-project/pull/114490).
2025-08-15	[msan] Reland with even more improvement: Improve packed multiply-add ↵	Thurston Dang	1	-19/+118
	instrumentation (#153353) This reverts commit cf002847a464c004a57ca4777251b1aafc33d958 i.e., relands ba603b5e4d44f1a25207a2a00196471d2ba93424. It was reverted because it was subtly wrong: multiplying an uninitialized zero should not result in an initialized zero. This reland fixes the issue by using instrumentation analogous to visitAnd (bitwise AND of an initialized zero and an uninitialized value results in an initialized value). Additionally, this reland expands a test case; fixes the commit message; and optimizes the change to avoid the need for horizontalReduce. The current instrumentation has false positives: it does not take into account that multiplying an initialized zero value with an uninitialized value results in an initialized zero value This change fixes the issue during the multiplication step. The horizontal add step is modeled using bitwise OR. Future work can apply this improved handler to the AVX512 equivalent intrinsics (x86_avx512_pmaddw_d_512, x86_avx512_pmaddubs_w_512.) and AVX VNNI intrinsics.
2025-08-14	[NFC][PGO] Factor downscaling of branch weights out of `Instrumentation` ↵	Mircea Trofin	1	-6/+2
	into `ProfileData` (#153735) The logic isn’t instrumentation-specific, and the refactoring allows users avoid a dependency on `Instrumentation` and just take one on `ProfileData` (which a fairly low-level dependency)
2025-08-14	[NFC][PGO] Drop unused `Module` parameter in `setProfMetadata` (#153733)	Mircea Trofin	3	-7/+7

2025-08-12	Revert "[msan] Improve packed multiply-add instrumentation" (#153343)	Thurston Dang	1	-87/+19
	Reverts llvm/llvm-project#152941 Buildbot breakage: https://lab.llvm.org/buildbot/#/builders/66/builds/17843
2025-08-12	[msan] Improve packed multiply-add instrumentation (#152941)	Thurston Dang	1	-19/+87
	The current instrumentation has false positives: if there is a single uninitialized bit in any of the operands, the entire output is poisoned. This does not take into account that multiplying an uninitialized value with zero results in an initialized zero value. This step allows elements that are zero to clear the corresponding shadow during the multiplication step. The horizontal add step and accumulation step (if any) are modeled using bitwise OR. Future work can apply this improved handler to the AVX512 equivalent intrinsics (x86_avx512_pmaddw_d_512, x86_avx512_pmaddubs_w_512.) and AVX VNNI intrinsics.
2025-08-12	[MemorySanitizer] Fix an unused-variable warning (NFC)	Jie Fu	1	-1/+1
	/llvm-project/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:2752:22: error: unused variable 'ParamType' [-Werror,-Wunused-variable] FixedVectorType *ParamType = ^ 1 error generated.
2025-08-11	[NFCI][msan] Refactor into 'horizontalReduce' (#152961)	Thurston Dang	1	-43/+60
	The functionality is used by two helper functions, and will be used even more in the future (e.g., https://github.com/llvm/llvm-project/pull/152941).
2025-08-08	[IR] Remove size argument from lifetime intrinsics (#150248)	Nikita Popov	4	-31/+11
	Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).
2025-08-07	[TypeSanitizer] Use alloca size for lifetime markers (#152154)	Nikita Popov	1	-5/+13
	Split out from https://github.com/llvm/llvm-project/pull/150248: Use the size of the alloca instead of the size passed to the lifetime intrinsic. As a bonus, this handles dynamic allocas correctly (see the added test) instead of doing a memset with size -1...
2025-08-06	[TSan] Add option to ignore capturing behavior when instrumenting (#148156)	Yussur Mustafa Oraji	1	-1/+6
	While not needed for most applications, some tools such as [MUST](https://www.i12.rwth-aachen.de/cms/i12/forschung/forschungsschwerpunkte/lehrstuhl-fuer-hochleistungsrechnen/~nrbe/must/) depend on the instrumentation being present. MUST uses the ThreadSanitizer annotation interface to detect data races in MPI programs, where the capture tracking is detrimental as it has no bearing on MPI data races, leading to missed races.
2025-08-04	[LLVM][NumericalStabilitySanitizer] Add support for vector ConstantFPs. ↵	Paul Walker	1	-1/+2
	(#151739)
2025-08-04	[IR] Allow poison argument to lifetime markers (#151148)	Nikita Popov	2	-4/+5
	This slightly relaxes the invariant established in #149310, by also allowing the lifetime argument to be poison. This is to support the typical pattern of RAUWing with poison when removing an instruction. It's worth noting that this does not require any conservative assumptions, lifetimes with poison arguments can simply be skipped. Fixes https://github.com/llvm/llvm-project/issues/151119.
2025-07-31	Revert "[PGO] Add `llvm.loop.estimated_trip_count` metadata" (#151585)	Joel E. Denny	2	-46/+0
	Reverts llvm/llvm-project#148758 [As requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31	[hwasan] Add hwasan-all-globals option (#149621)	shuffle2	1	-7/+23
	hwasan-globals does not instrument globals with custom sections, because existing code may use `__start_`/`__stop_` symbols to iterate over globals in such a way which will cause hwasan assertions. Introduce new hwasan-all-globals option, which instruments all user-defined globals (but not those globals which are generated by the hwasan instrumentation itself), including those with custom sections. fixes #142442
2025-07-31	[PGO] Add `llvm.loop.estimated_trip_count` metadata (#148758)	Joel E. Denny	2	-0/+46
	This patch implements the `llvm.loop.estimated_trip_count` metadata discussed in [[RFC] Fix Loop Transformations to Preserve Block Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785). As [suggested in the RFC comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4), it adds the new metadata to all loops at the time of profile ingestion and estimates each trip count from the loop's `branch_weights` metadata. As [suggested in the PR #128785 review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036), it does so via a new `PGOEstimateTripCountsPass` pass, which creates the new metadata for each loop but omits the value if it cannot estimate a trip count due to the loop's form. An important observation not previously discussed is that `PGOEstimateTripCountsPass` often cannot estimate a loop's trip count, but later passes can sometimes transform the loop in a way that makes it possible. Currently, such passes do not necessarily update the metadata, but eventually that should be fixed. Until then, if the new metadata has no value, `llvm::getLoopEstimatedTripCount` disregards it and tries again to estimate the trip count from the loop's current `branch_weights` metadata.
2025-07-30	[msan] Approximately handle AVX Galois Field Affine Transformation (#150794)	Thurston Dang	1	-0/+80
	e.g., <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8>, <16 x i8>, i8) <32 x i8> @llvm.x86.vgf2p8affineqb.256(<32 x i8>, <32 x i8>, i8) <64 x i8> @llvm.x86.vgf2p8affineqb.512(<64 x i8>, <64 x i8>, i8) Out A x b where A and x are packed matrices, b is a vector, Out = A * x + b in GF(2) Multiplication in GF(2) is equivalent to bitwise AND. However, the matrix computation also includes a parity calculation. For the bitwise AND of bits V1 and V2, the exact shadow is: Out_Shadow = (V1_Shadow & V2_Shadow) \| (V1 & V2_Shadow) \| (V1_Shadow & V2) We approximate the shadow of gf2p8affine using: Out_Shadow = _mm512_gf2p8affine_epi64_epi8(x_Shadow, A_shadow, 0) \| _mm512_gf2p8affine_epi64_epi8(x, A_shadow, 0) \| _mm512_gf2p8affine_epi64_epi8(x_Shadow, A, 0) \| _mm512_set1_epi8(b_Shadow) This approximation has false negatives: if an intermediate dot-product contains an even number of 1's, the parity is 0. It has no false positives. Updates the test from https://github.com/llvm/llvm-project/pull/149258
2025-07-28	Use F.hasOptSize() instead of checking optsize directly (#147348)	Ellis Hoag	1	-1/+1

2025-07-25	[NFC] [HWASan] remove unneeded pointer cast (#150510)	Florian Mayer	1	-5/+2
	The first argument to a lifetime intrinsic now has to be an alloca
2025-07-24	[NFC] [HWASan] remove unnecessary bool return in instrumentLandingPads	Florian Mayer	1	-3/+2

2025-07-24	[NFC] [HWASan] remove unused bool return value (#150516)	Florian Mayer	1	-3/+2