rocket-tools/riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
2024-04-03	[𝘀𝗽𝗿] changes introduced through rebase users/vitalybuka/spr/main.clangubsan-switch-ubsan-optimization-to-llvmallowruntimeubsancheck	Vitaly Buka	441	-13254/+36744
	Created using spr 1.3.4 [skip ci]
2024-04-03	[GlobalISel] Fix the infinite loop issue in `commute_int_constant_to_rhs`	darkbuck	2	-8/+37
	- When both operands are constant, the matcher runs into an infinite loop as the commutation should be applied only when LHS is a constant and RHS is not. Reviewers: arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/87426
2024-04-03	[clang] Init fields added by #87357	Vitaly Buka	1	-1/+3

2024-04-04	[RISCV][TTI] Scale the cost of intrinsic stepvector with LMUL (#87301)	Shih-Po Hung	2	-91/+61
	Use the return type to measure the LMUL size for latency/throughput cost
2024-04-03	[RISCV] Remove G_TRUNC/ZEXT/SEXT/ANYEXT from the first switch in ↵	Craig Topper	1	-10/+0
	RISCVRegisterBankInfo::getInstrMapping. This removes the special case for vectors. The default case in the second switch can handle GPR in addition to vectors. We just won't use the static ValueMapping entry.
2024-04-03	[mlir][vector] Skip 0D vectors in vector linearization. (#87577)	Han-Chung Wang	2	-0/+13

2024-04-03	[lldb] Set static Module's load addresses via ObjectFile (#87439)	Jason Molenda	1	-24/+16
	This is a followup to https://github.com/llvm/llvm-project/pull/86359 "[lldb] [ObjectFileMachO] LLVM_COV is not mapped into firmware memory (#86359)" where I treat LLVM_COV segments in a Mach-O binary as non-loadable. There is another codepath in `DynamicLoaderStatic::LoadAllImagesAtFileAddresses` which is called to set the load addresses for a Module to the file addresses. It has no logic to detect a segment that is not loaded in virtual memory (ObjectFileMachO::SectionIsLoadable), so it would set the load address for this LLVM_COV segment to the file address and shadow actual code, breaking lldb behavior. This method currently sets the load address for any section that doesn't have a load address set already. This presumes that a Module was added to the Target, some mechanism set the correct load address for SOME segments, and then this method is going to set the other segments to a no-slide value, assuming they were forgotten. ObjectFile base class doesn't, today, vend a SectionIsLoadable method, but we do have ObjectFile::SetLoadAddress and at a higher level, Module::SetLoadAddress, when we're setting the same slide to all segments. That's the behavior we want in this method. If any section has a load address, we don't touch this Module. Otherwise we set all sections to have a load address that is the same as the file address. I also audited the other parts of lldb that are calling SectionList::SectionLoadAddress and looked if they should be more correctly using Module::SetLoadAddress for the entire binary. But in most cases, we have the potential for different slides for different sections so this section-by-section approach must be taken. rdar://125800290
2024-04-03	[BoundsSafety] Minor fixes on counted_by (#87559)	Yeoul Na	2	-3/+3
	A DeclRef to a field must be marked as an LValue to be consistent with how the field decl will be evaluated. Calling T->desugar() before ->isArrayType() is unnecessary.
2024-04-03	Revert "DebugInfoD issues, take 2" (#87583)	Chelsea Cassanova	10	-506/+17
	Reverts llvm/llvm-project#86812. This commit caused a regression on the x86_64 MacOS buildbot: https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/784/
2024-04-03	[Bounds-Safety][NFC] Clean up leading space emission for CountAttributedType ↵	Dan Liew	1	-4/+5
	(#87582) Previously the leading space was added in each string constant. This patch moves the leading space out of the string constants and adds it explicitly instead, to make the code clearer.
2024-04-03	[mlir][vector] Update `castAwayContractionLeadingOneDim` to omit transposes ↵	Kojo Acquah	2	-3/+30
	solely on leading unit dims. (#85694)
	Updates `castAwayContractionLeadingOneDim` to check for leading unit dimensions before inserting `vector.transpose` ops. Currently `castAwayContractionLeadingOneDim` removes all leading unit dims based on the accumulator and transposes any subsequent operands to match the accumulator indexing. This does not take into account whether the transpose is strictly necessary, for instance when given this vector-matrix contract:
	```mlir
	%result = vector.contract {indexing_maps = [affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>, affine_map<(d0, d1, d2, d3) -> (d0, d2, d3)>, affine_map<(d0, d1, d2, d3) -> (d1, d2)>], iterator_types = ["parallel", "parallel", "parallel", "reduction"], kind = #vector.kind<add>} %lhs, %rhs, %acc : vector<1x1x8xi32>, vector<1x8x8xi32> into vector<1x8xi32>
	```
	Passing this through the `castAwayContractionLeadingOneDim` pattern produces the following:
	```mlir
	%0 = vector.transpose %arg0, [1, 0, 2] : vector<1x1x8xi32> to vector<1x1x8xi32>
	%1 = vector.extract %0[0] : vector<1x8xi32> from vector<1x1x8xi32>
	%2 = vector.extract %arg2[0] : vector<8xi32> from vector<1x8xi32>
	%3 = vector.contract {indexing_maps = [affine_map<(d0, d1, d2) -> (d0, d2)>, affine_map<(d0, d1, d2) -> (d0, d1, d2)>, affine_map<(d0, d1, d2) -> (d1)>], iterator_types = ["parallel", "parallel", "reduction"], kind = #vector.kind<add>} %1, %arg1, %2 : vector<1x8xi32>, vector<1x8x8xi32> into vector<8xi32>
	%4 = vector.broadcast %3 : vector<8xi32> to vector<1x8xi32>
	```
	The `vector.transpose` introduced does not affect the underlying data layout (effectively a no-op), but it cannot be folded automatically. This change avoids inserting transposes when only leading unit dimensions are involved. Fixes #85691
2024-04-03	[mlir][ArmNeon] Updates LowerContractionToSMMLAPattern with vecmat unroll ↵	Kojo Acquah	2	-31/+191
	patterns (#86005) Updates smmla unrolling patterns to handle vecmat contracts where `dimM=1`. This includes explicit vecmats in the form: `<1x8xi8> x <8x8xi8> --> <1x8xi32>` or implied with the leading dim folded: `<8xi8> x <8x8xi8> --> <8xi32>` Since the smmla operates on two `<2x8xi8>` input vectors to produce `<2x2xi32>` accumulators, half of each 2x2 accumulator tile is dummy data not pertinent to the computation, resulting in half throughput.
2024-04-03	Revert "dsymutil: Re-add missing -latomic (#85380)"	Gulfem Savrun Yeniceri	1	-1/+1
	This reverts commit 23616c65e7d632e750ddb67d55cc39098a69a8a6 because it breaks Fuchsia Clang toolchain builders. https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8751656876289840849/overview
2024-04-03	[RISCV][GISEL] Instruction selection for G_ZEXT, G_SEXT, and G_ANYEXT with ↵	Michael Maitland	3	-0/+2702
	scalable vector type
2024-04-03	[RISCV][GISEL] Regbankselect for G_ZEXT, G_SEXT, and G_ANYEXT with scalable ↵	Michael Maitland	4	-3/+2469
	vector type
2024-04-03	[RISCV][GISEL] Instruction selection for G_ICMP	Michael Maitland	1	-0/+534

2024-04-03	[RISCV][GISEL] Regbank select for scalable vector G_ICMP	Michael Maitland	2	-1/+679

2024-04-03	[RISCV][GISEL] Legalize G_ZEXT, G_SEXT, and G_ANYEXT, G_SPLAT_VECTOR, and ↵	Michael Maitland	14	-16/+7436
	G_ICMP for scalable vector types This patch legalizes G_ZEXT, G_SEXT, and G_ANYEXT. If the type is a legal mask type, then the instruction is legalized as the element-wise select, where the condition on the select is the mask typed source operand, and the true and false values are 1 or -1 (for zero/any-extension and sign extension) and zero. If the type is a legal integer or vector integer type, then the instruction is marked as legal. The legalization of the extends may introduce a G_SPLAT_VECTOR, which needs to be legalized in this patch for the extend test cases to pass. A G_SPLAT_VECTOR is legal if the vector type is a legal integer or floating point vector type and the source operand is sXLen type. This is because the SelectionDAG patterns only support sXLen typed ISD::SPLAT_VECTORS, and we'd like to reuse those patterns. A G_SPLAT_VECTOR is custom legalized if it has a legal s1 element vector type and s1 scalar operand. It is legalized to G_VMSET_VL or G_VMCLR_VL if the splat is all ones or all zeros respectively. In the case of a non-constant mask splat, we legalize by promoting the scalar value to s8. In order to get the s8 element vector back into s1 vector, we use a G_ICMP. In order for the splat vector and extend tests to pass, we also need to legalize G_ICMP in this patch. A G_ICMP is legal if the destination type is a legal bool vector and the LHS and RHS are legal integer vector types.
2024-04-03	Revert "Revert "Revert "[clang][UBSan] Add implicit conversion check for ↵	Vitaly Buka	11	-493/+73
	bitfields""" (#87562) Reverts llvm/llvm-project#87529 Reverts #87518 https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken
2024-04-03	[libc++] Fix copy/pasta error in atomic tests for ↵	Damien L-G	2	-4/+4
	`atomic_compare_exchange_{weak,strong}` (#87135) Spotted this minor mistake in the tests as I was looking into testing `atomic_ref` more thoroughly. The two-argument overloads are tested just above. The names of the lambdas clearly indicate that the intent was to test the one-argument overload.
2024-04-03	[flang][runtime] Enable I/O APIs in F18 runtime offload builds. (#87543)	Slava Zakharin	8	-213/+235

2024-04-03	[VectorCombine][X86] shuffle-of-casts.ll - adjust zext nneg tests to improve ↵	Simon Pilgrim	1	-16/+16
	costs for testing Improves SSE vs AVX test results for #87510
2024-04-03	[SLP]Improve minbitwidth analysis for operands of IToFP and ICmp instructions.	Alexey Bataev	3	-16/+50
	Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/85966
2024-04-03	[libc] Added transitive bindings for OffsetType (#87397)	Shourya Goel	7	-12/+30
	Adds OffTType to the Macro lists of fcntl.h and stdio.h in libc/spec/posix.td, as mentioned here: #87266
2024-04-03	fully qualifies use of `detail` namespace (#87536)	Christopher Di Bella	1	-4/+6
	Some TUs apparently end up with an ambiguity between `::llvm::detail` and `support::detail`, so we close that gap at the source.
2024-04-03	Revert "[SLP]Improve minbitwidth analysis for operands of IToFP and ICmp ↵	Alexey Bataev	3	-48/+16
	instructions." This reverts commit 899855d2b11856a44e530fffe854d76be69b9008 to fix the issue reported in https://lab.llvm.org/buildbot/#/builders/165/builds/51659.
2024-04-03	[SLP]Improve minbitwidth analysis for operands of IToFP and ICmp instructions.	Alexey Bataev	3	-16/+48
	Compiler can improve analysis for operands of UIToFP/SIToFP instructions and operands of ICmp instruction. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/85966
2024-04-03	[AMDGPU] Add a missing COV6 case to getAMDHSACodeObjectVersion() (#87492)	Emma Pilkington	2	-0/+9

2024-04-03	DebugInfoD issues, take 2 (#86812)	Kevin Frei	10	-17/+506
	The previous diff (and its subsequent fix) were reverted as the tests didn't work properly on the AArch64 & ARM LLDB buildbots. I made a couple more minor changes to tests (from @clayborg's feedback) and disabled them for non Linux-x86(_64) builds, as I don't have the ability to do anything about an ARM64 Linux failure. If I had to guess, I'd say the toolchain on the buildbots isn't respecting the `-Wl,--build-id` flag. Maybe, one day, when I have a Linux AArch64 system I'll dig into it. From the reverted PR: I've migrated the tests in my https://github.com/llvm/llvm-project/pull/79181 from shell to API (at @JDevlieghere's suggestion) and addressed a couple issues that were exposed during testing. The tests first test the "normal" situation (no DebugInfoD involvement, just normal debug files sitting around), then the "no debug info" situation (to make sure the test is seeing failure properly), then it tests to validate that when DebugInfoD returns the symbols, things work properly. This is duplicated for DWP/split-dwarf scenarios. --------- Co-authored-by: Kevin Frei <freik@meta.com>
2024-04-03	[AMDGPU][MC] Allow VOP3C dpp src1 to be imm or SGPR (#87418)	Joe Nash	14	-86/+3218
	Allows src1 of VOP3 encoded VOPC to be an SGPR or inline immediate on GFX1150Plus. The w32 and w64 _e64_dpp assembler-only real instructions were unused, and erroneously constructed in a way that broke parsing of the new instructions. They are removed. This patch is a follow-up to PR https://github.com/llvm/llvm-project/pull/87382
2024-04-03	AMDGPU: Use PseudoInstr to name SIMCInstr for DSDIR and SOPs, NFC (#87537)	Changpeng Fang	2	-40/+40
	We should consistently use PseudoInstr instead of Mnemonic to name SIMCInstr, even though they may be the same in most cases
2024-04-03	[AArch64] Add a test for non-temporal masked loads / stores. NFC	David Green	1	-0/+75

2024-04-03	[VectorCombine][X86] Add additional tests for #87510	Simon Pilgrim	1	-0/+42
	Add zext nneg tests and check we don't fold casts with different src types
2024-04-03	[SLP]Add support for commutative intrinsics.	Alexey Bataev	6	-27/+54
	Implemented long-standing TODO to support commutative intrinsics. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/86316
2024-04-03	[PseudoProbe] Mix block and call probe ID in lexical order (#75092)	Lei Wang	13	-79/+71
	Previously, all call probe IDs came after the block IDs. This change mixes call probes and block probes by ordering them lexically (by line number). For example:
	```
	main():
	BB1
	if(...)
	BB2 foo(..);
	else
	BB3 bar(...);
	BB4
	```
	Before, the profile was
	```
	main
	1: ..
	2: ..
	3: ...
	4: ...
	5: foo ...
	6: bar ...
	```
	Now the new order is
	```
	main
	1: ..
	2: ..
	3: foo ...
	4: ...
	5: bar ...
	6: ...
	```
	This can potentially make the profile more tolerant of mismatch, whether from a stale profile or a frontend change. For example, before this change, adding one block, even as the last one, shifted all the call probes and caused them to mismatch. Moreover, this makes better use of call-anchor-based stale profile matching: blocks are matched based on the closest anchor, so more anchors are available for matching, which reduces the mismatch scope.
2024-04-03	[AArch64][ARM] Make neon fp16 generic intrinsics always available. (#87467)	David Green	6	-676/+1100
	By generic intrinsics this means things like dup, ext, zip and bsl that can always be executed with integer s16 operations and do not require fullfp16. This makes them always available, and brings them in line with GCC (https://godbolt.org/z/azs8eMv54). The relevant test cases have been moved into their own files, to allow them to be tested with armv8-a and armv8.2-a+fp16.
2024-04-03	Revert "Revert "[clang][UBSan] Add implicit conversion check for bitfields"" ↵	Vitaly Buka	11	-73/+493
	(#87529) Reverts llvm/llvm-project#87518 The revert is not needed as the regression was fixed with 1189e87951e59a81ee097eae847c06008276fef1. I assumed the crash and warning are different issues, but according to https://lab.llvm.org/buildbot/#/builders/240/builds/26629 fixing the warning resolves the crash.
2024-04-03	Always check the function attribute to determine checksum mismatch for ↵	Lei Wang	2	-10/+25
	available_externally functions (#87279) This is to fix an assertion error. Apparently, `pseudo_probe_desc` could still be available for import functions, and its checksum mismatch state can be different from the import function's `profile-checksum-mismatch` attr. This happens when an unstable IR or ODR violation issue occurs: the definitions of the same function across different translation units could be different and result in different checksums. During link-time deduplication, the internal function definition (on which the checksum in the desc is computed) is substituted by the `available_externally` definition, which causes the inconsistency. Hence, we fix it by always checking the state for the new `available_externally` definition, which is saved in the function attribute.
2024-04-03	[mlir] Initialize DefaultTimingManager::out. (#87522)	Chenguang Wang	1	-1/+2
	`DefaultTimingManager::clear()` uses `out` to initialize `TimerImpl`, but the `out` is `nullptr` by default. This means if `DefaultTimingManager::setOutput()` is never called, `DefaultTimingManager` destructor may generate SIGSEGV.
2024-04-03	[libc++] Mark some recent LWG issues and papers as done (#87502)	Louis Dionne	4	-7/+8
	Justifications:
	- LWG3950: Done in #66206
	- LWG3975: Wording changes only
	- LWG4011: Wording changes only
	- LWG4030: Wording changes only
	- LWG4043: Wording changes only
	- LWG3036 and P2875R4: We implemented neither, but the latter reverts the former, so now we implement both without doing anything!
2024-04-03	Updates to 'tosa.reshape' verifier (#87416)	Rafael Ubal	2	-17/+58
	This addition catches common cases of malformed `tosa.reshape` ops. This prevents the `--tosa-to-tensor` pass from asserting when fed invalid operations, as these will be caught ahead of time by the verifier. Closes #87396
2024-04-03	[SLP]Fix PR87133: crash because of different altopcodes for cmps after ↵	Alexey Bataev	3	-23/+104
	reordering. If the node has cmp instructions with 3 or more different but swappable predicates, we need to keep the same kind of main/alternate opcodes to avoid incorrect detection of opcodes after reordering. Reordering changes the order, and we may erroneously treat swappable opcodes as non-compatible/alternate, which may lead to a later compiler crash. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/87267
2024-04-03	dsymutil: Re-add missing -latomic (#85380)	maflcko	1	-1/+1
	This was accidentally removed in https://reviews.llvm.org/D137799#4657404 / https://reviews.llvm.org/D137799#C3933303OL44, and downstream projects are forced to add it back. For example, https://git.savannah.gnu.org/cgit/guix.git/commit/?id=4e26331a5ee87928a16888c36d51e270f0f10f90 Fix this by re-adding it. Co-authored-by: MarcoFalke <*~=`'#}+{/-\|&$^_@721217.xyz>
2024-04-03	[RISCV][GISEL] Run update_mir_test_checks on ↵	Michael Maitland	1	-44/+44
	llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-xor.mir
2024-04-03	[C23] Remove WG14 N2416 from the C status page	Aaron Ballman	1	-5/+0
	This paper did not add any normative changes for us to check conformance against. It added a note describing a potential behavioral difference between compile-time and runtime evaluation of negative floating-point values in the presence of rounding modes.
2024-04-03	[clang] Precommit test for `llvm.allow.ubsan.check()` (#87435)	Vitaly Buka	1	-0/+207

2024-04-03	Revert "[clang][UBSan] Add implicit conversion check for bitfields" (#87518)	Vitaly Buka	11	-493/+73
	Reverts llvm/llvm-project#75481 Breaks multiple bots, see #75481
2024-04-03	[flang] Fixed MODULO(x, inf) to produce NaN. (#86145)	Slava Zakharin	5	-15/+105
	Straightforward computation of `A − FLOOR (A / P) * P` should produce NaN when P is infinity. The -menable-no-infs lowering can still use the relaxed operations sequence.
2024-04-03	[SLP]Fix PR87477: fix alternate node cast cost/codegen.	Alexey Bataev	2	-25/+74
	Have to compare actual type size to pick up proper cast operation opcode.
2024-04-03	[Offload][NFC] Add offload subfolder and README (#77154)	Johannes Doerfert	1	-0/+20
	The readme only states the goal and has links to further information, e.g., our meetings. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>
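
Worked illustration for the `[flang] Fixed MODULO(x, inf) to produce NaN` entry above: the commit message says that evaluating `A − FLOOR (A / P) * P` literally yields NaN when P is infinity. The sketch below is not flang code; it is a Python stand-in that assumes IEEE-754 double arithmetic, and `naive_modulo` is a hypothetical helper name. For a finite A, `A / inf` is zero, its floor is zero, and `0 * inf` is NaN, so the subtraction is NaN as well.

```python
import math


def naive_modulo(a: float, p: float) -> float:
    """Evaluate A - FLOOR(A / P) * P literally (hypothetical helper, not flang's lowering)."""
    # For p == inf and finite a: a / p is +/-0.0, floor gives 0, and 0 * inf is NaN.
    return a - math.floor(a / p) * p


if __name__ == "__main__":
    # Ordinary finite operands behave like a floored modulo.
    print(naive_modulo(7.0, 3.0))
    print(naive_modulo(-7.0, 3.0))
    # An infinite P propagates NaN through the 0 * inf product.
    print(math.isnan(naive_modulo(5.0, math.inf)))
```

The sign of A does not matter for the infinite case: both `5.0 / inf` and `-5.0 / inf` floor to zero, so the product with infinity is NaN either way.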