riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
2024-08-02	[AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (users/cdevadas/combine-constrained-buffer-loads)	Christudasan Devadasan	3	-82/+613
	Use the constrained buffer load opcodes while combining under-aligned load for XNACK enabled subtargets.
2024-08-02	[AMDGPU] Auto-generate lit pattern for test ↵	Christudasan Devadasan	1	-34/+231
	CodeGen/AMDGPU/merge-sbuffer-load.mir.
2024-08-01	[Driver] Include crt0.o in the baremetal link (#101258)	Petr Hosek	4	-0/+13
	The common baremetal libc implementations already provide crt0.o and GCC automatically links it so this improves parity.
2024-08-01	[CMake][Fuchsia] Use standard spelling for Arm baremetal targets (#101302)	Petr Hosek	1	-2/+2
	It's more common to use `none` rather than `unknown` for the OS component in Arm baremetal targets.
2024-08-02	Revert "[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new ↵	Phoebe Wang	49	-1442/+43
	instructions" (#101612) Reverts llvm/llvm-project#101452. Several buildbots failed, so revert first.
2024-08-01	[clang-format] Fix a misannotation of PointerOrReference (#101291)	Owen Pan	2	-14/+21
	Fixes #101138.
2024-08-01	[TableGen] Use std::move. NFC	Craig Topper	1	-1/+1
	Fixes #101408.
2024-08-02	[X86][AVX10.2] Support AVX10.2 option and VMPSADBW/VADDP[D,H,S] new ↵	Phoebe Wang	49	-43/+1442
	instructions (#101452) Ref.: https://cdrdv2.intel.com/v1/dl/getContent/828965
2024-08-02	[HLSL] cleanup builtin names elementwise usage (#101543)	Farzon Lotfi	7	-90/+105
	Remove elementwise description for builtins that don't perform elementwise operations.
2024-08-02	[SPARC][IAS] Add v8plus feature bit (#101367)	Koakuma	9	-16/+53
	Implement handling for `v8plus` feature bit to allow the user to switch between V8 and V8+ mode with 32-bit code. Currently this only sets the appropriate ELF machine type and flags; codegen changes will be done in future patches. This is done as a prerequisite for `-mv8plus` flag on clang (#98713).
2024-08-02	[LoongArch] Align stack objects passed to memory intrinsics (#101309)	hev	3	-117/+39
	Memcpy, and other memory intrinsics, typically try to use wider load/store if the source and destination addresses are aligned. In CodeGenPrepare, look for calls to memory intrinsics and, if the object is on the stack, align it to 4-byte (32-bit) or 8-byte (64-bit) boundaries if it is large enough that we expect memcpy to use wider load/store instructions to copy it. Fixes #101295
2024-08-01	[RISCV] Use Zvfhmin instead of Zvfh on RUN lines for some intrinsic tests. ↵	Craig Topper	544	-3981/+3835
	NFC (#101540) Loads/stores/reinterpret/vfncvt.f.f.w/vfwcvt.f.f.v/vmerge/vmv.v.v are all expected to work for f16 vectors with Zvfhmin. Remove the handcrafted Zvfhmin test that partially tested this. Splits the vfwcvt.f.f.v and vfncvt.f.f.w tests into their own file so we can have a separate RUN line from the float<->int conversions.
2024-08-01	[Attributor] Use `getPointerAddressSpace` to replace a cast followed by a ↵	Shilei Tian	1	-2/+1
	`getAddressSpace`
2024-08-01	Fix attr-nomerge.cpp with fixed triple	Zequan Wu	1	-1/+1

2024-08-01	[Attributor] Indicate optimistic fixed point if an instruction already has ↵	Shilei Tian	2	-2/+7
	non-zero address space (#101589)
2024-08-01	[mlir][spirv] Add definitions and (de)serialization for FPRoundingMode (#101546)	Andrea Faulds	4	-0/+42

2024-08-02	[VPlan][NFC] Make VPValue pointer const. (#101334)	Mel Chen	2	-4/+4

2024-08-02	[mlir][bufferization] Improve performance of DropEquivalentBufferResultsPass ↵	Longsheng Mou	1	-6/+10
	(#101281) Use a DenseMap to minimize the traversal time of callOps, which greatly improves the efficiency of this pass.
2024-08-02	[X86_32][C++] fix 0 sized struct case in vaarg. (#86388)	Longsheng Mou	3	-1/+27
	struct SuperEmpty { struct{ int a[0];} b;}; Such zero-sized structs cannot be ignored on i386 in C++ mode, because C++ fields are never empty. But in EmitVAArg the size is 0, so the va_list does not advance. Maybe we can just ignore this kind of argument, as X86_64 does. Fixes #86385.
2024-08-01	[test] Fix attr-nomerge.cpp after ae6dc64ec670891cb15049277e43133d4df7fb4b	Fangrui Song	1	-10/+10

2024-08-01	[Bazel] Port f3bfc56327df821801caa4ae20995f67f8589a19	Fangrui Song	1	-0/+1

2024-08-01	[lldb] Change Module to have a concrete UnwindTable, update (#101130)	Jason Molenda	4	-74/+41
	Currently a Module has a std::optional<UnwindTable> which is created when the UnwindTable is requested from outside the Module. The idea is to delay its creation until the Module has an ObjectFile initialized, which will have been done by the time we're doing an unwind. However, Module::GetUnwindTable wasn't doing any locking, so it was possible for two threads to ask for the UnwindTable for the first time, one would be created and returned while another thread would create one, destroy the first in the process of emplacing it. It was an uncommon crash, but it was possible. Grabbing the Module's mutex would be one way to address it, but when loading ELF binaries, we start creating the SymbolTable on one thread (ObjectFileELF) grabbing the Module's mutex, and then spin up worker threads to parse the individual DWARF compilation units, which then try to also get the UnwindTable and deadlock if they try to get the Module's mutex. This changes Module to have a concrete UnwindTable as an ivar, and when it adds an ObjectFile or SymbolFileVendor, it will call the Update method on it, which will re-evaluate which sections exist in the ObjectFile/SymbolFile. UnwindTable used to have an Initialize method which set all the sections, and an Update method which would set some of them if they weren't set. I unified these with the Initialize method taking a `force` option to re-initialize the section pointers even if they had been done already before. This is addressing a rare crash report we've received, and also a failure Adrian spotted on the -fsanitize=address CI bot last week, it's still uncommon with ASAN but it can happen with the standard testsuite. rdar://128876433
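	The design change described above can be sketched roughly as follows. This is a minimal illustration, not the lldb code: `ModuleSketch`, `UnwindTableSketch`, `AddSections`, and the section-name strings are hypothetical stand-ins. The point it shows is that the getter no longer creates anything lazily (so two threads cannot race to emplace the table), and that a forced `Initialize` re-evaluates the available sections each time a file is added. The sketch does not model the locking around adding files.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the unwind table; not the real lldb class.
class UnwindTableSketch {
public:
  // Re-evaluate which unwind sections exist. With force == true the cached
  // state is recomputed even if it was initialized before.
  void Initialize(const std::vector<std::string> &section_names, bool force) {
    if (m_initialized && !force)
      return;
    m_has_eh_frame = false;
    m_has_debug_frame = false;
    for (const std::string &name : section_names) {
      if (name == ".eh_frame")
        m_has_eh_frame = true;
      else if (name == ".debug_frame")
        m_has_debug_frame = true;
    }
    m_initialized = true;
  }

  bool HasEHFrame() const { return m_has_eh_frame; }
  bool HasDebugFrame() const { return m_has_debug_frame; }

private:
  bool m_initialized = false;
  bool m_has_eh_frame = false;
  bool m_has_debug_frame = false;
};

// Hypothetical stand-in for Module: the table is a concrete member, so
// GetUnwindTable() only returns a reference and never constructs anything.
class ModuleSketch {
public:
  // Called when an object file or symbol file contributes sections.
  void AddSections(const std::vector<std::string> &names) {
    m_section_names.insert(m_section_names.end(), names.begin(), names.end());
    m_unwind_table.Initialize(m_section_names, /*force=*/true);
  }

  UnwindTableSketch &GetUnwindTable() { return m_unwind_table; }

private:
  std::vector<std::string> m_section_names;
  UnwindTableSketch m_unwind_table;
};
```

	Because the table always exists, callers on worker threads can ask for it without taking the module's mutex in the getter, which is the deadlock the message describes.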
2024-08-01	[asan] Avoid global ~DenseMap()	Vitaly Buka	1	-4/+14
	Follow up to #100923
2024-08-01	[M68k] Fix compilation pipeline check	Michael Liao	1	-1/+0
	- After 'lowerConstantIntrinsics' is merged into pre-isel lowering
2024-08-01	[SandboxIR] Implement UnaryInstruction class (#101541)	vporpo	2	-20/+67
	This patch implements sandboxir::UnaryInstruction class and updates sandboxir::LoadInst and sandboxir::CastInst to inherit from it instead of sandboxir::Instruction.
2024-08-01	Add a tutorial on mlir-opt (#96105)	Jeremy Kun	7	-1/+416
	This tutorial gives an introduction to the `mlir-opt` tool, focusing on how to run basic passes with and without options, run pass pipelines from the CLI, and point out particularly useful flags. --------- Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com> Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2024-08-02	[mlir][emitc] Fix EmitC dialect's operations' descriptions (#101523)	Andrey Timonin	1	-40/+40
	- Added the dialect's prefix to operations' descriptions to follow the same style inside the TableGen file. - Minor changes in the 'emitc.yield' operation's description.
2024-08-01	[libc] created tan function fuzzer (#101570)	RoseZhang03	4	-2/+51
	Also edited file header formatting on sin_fuzz and cos_fuzz
2024-08-01	Add support for verifying local type units in .debug_names. (#101133)	Greg Clayton	4	-15/+264
	This patch adds support for verifying local type units in .debug_names section. It adds a test to test if the TU index is valid, and a test that tests that an error is found inside the name entry for a type unit. We don't need to test all other errors in the name entry because these are essentially identical to compile unit entries, they just use a different DWARF unit offset index.
2024-08-01	Fix codegen of consteval functions returning an empty class, and related ↵	Eli Friedman	15	-287/+320
	issues (#93115) If a class is empty, don't store it to memory: the store might overwrite useful data. Similarly, if a class has tail padding that might overlap other fields, don't store the tail padding to memory. The problem here turned out to be a bit more general than I initially thought: basically all uses of EmitAggregateStore were broken. Call lowering had a method that did mostly the right thing, though: CreateCoercedStore. Adapt CreateCoercedStore so it always does the conservatively right thing, and use it for both calls and ConstantExpr. Also, along the way, fix the "overlap" bit in AggValueSlot: the bit was set incorrectly for empty classes in some cases. Fixes #93040.
2024-08-01	Reapply "[Clang] Fix nomerge attribute not working with __builtin_trap(), ↵	Zequan Wu	5	-7/+111
	__debugbreak(), __builtin_verbose_trap() (#101549)" This reverts commit 667598d84b16d1789ce90b231565e9e7bfdbe77d and fixes failed tests: llvm/test/CodeGen/X86/nomerge.ll and llvm/test/MC/AArch64/local-bounds-single-trap.ll.
2024-08-01	[Offload][OpenMP] Prettify error messages by "demangling" the kernel name ↵	Johannes Doerfert	15	-24/+145
	(#101400) The kernel names for OpenMP are manually mangled and not ideal when we report something to the user. We demangle them now, providing the function and line number of the target region, together with the actual kernel name.
2024-08-01	[libc][math][C23] removing daddl from arm32 (#101567)	aaryanshukla	1	-1/+0

2024-08-01	[libc] added cos function fuzzing test (#101556)	RoseZhang03	3	-4/+53

2024-08-01	Revert "[Clang] Fix nomerge attribute not working with __builtin_trap(), ↵	Haowei Wu	4	-104/+2
	__debugbreak(), __builtin_verbose_trap() (#101549)" This reverts commit 5e84646982d1ec9bc94e48dde4b47f03c044a156, which broke 'nomerge.ll' test on llvm bots.
2024-08-01	[libc++] Add status page consistency change to git-blame-ignore-revs	Louis Dionne	1	-0/+3
	To avoid breaking searchability of when a paper was implemented.
2024-08-01	[libc++][NFC] Fix inconsistent quoting and spacing in our CSV files	Louis Dionne	2	-156/+156
	There were a few places where we didn't properly quote entries in the CSV status pages, or where we followed inconsistent spacing. This causes issues when trying to synchronize status pages with GitHub issues.
2024-08-01	[libc++] Improve code gen for string's operator== (#100926)	Nikolas Klauser	1	-5/+11
	If the string is too long for a short string, we can simply check for the long bit. If that's false we can do an early return. This improves the code gen slightly.
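	The early return described above can be illustrated with a toy small-string type. This is only a sketch under loud assumptions: `ToyString`, `kToyShortCapacity`, and `Equals` are hypothetical, the capacity of 15 is made up, and the real libc++ representation and check differ in detail. It shows the idea only: if the right-hand side is too long to fit in the inline buffer, a left-hand side that is not in long mode cannot be equal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical inline-buffer capacity; real values depend on the library.
constexpr std::size_t kToyShortCapacity = 15;

// Toy stand-in for a string with a "long bit". std::string is used as the
// backing store purely to keep the sketch short.
struct ToyString {
  bool is_long;        // true when the contents exceed the inline capacity
  std::string storage; // stand-in for both storage modes

  explicit ToyString(std::string s)
      : is_long(s.size() > kToyShortCapacity), storage(std::move(s)) {}
};

bool Equals(const ToyString &lhs, const char *rhs) {
  const std::size_t rhs_len = std::strlen(rhs);
  // Early return: rhs cannot fit in the short representation, so a short
  // lhs can never compare equal. Only the long bit is inspected here.
  if (rhs_len > kToyShortCapacity && !lhs.is_long)
    return false;
  if (rhs_len != lhs.storage.size())
    return false;
  return std::memcmp(lhs.storage.data(), rhs, rhs_len) == 0;
}
```

	For example, `Equals(ToyString("abc"), "a literal well over fifteen characters")` returns false after checking only the long bit, without comparing sizes or contents.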
2024-08-01	Simplify hot-path size computations in BumpPtrAllocator. (#101467)	Owen Anderson	1	-8/+10
	~0.1% instruction count improvements https://llvm-compile-time-tracker.com/compare.php?from=07d2709a17860a202d91781769a88837e4fb5f2a&to=d5cc47831ecd9f0a2b164b16da67f74b94e9aafc&stat=instructions:u
2024-08-01	[asan] Speed up ASan ODR indicator-based checking (#100923)	Artem Pianykh	2	-12/+95
	Summary: When ASan checks for a potential ODR violation on a global it loops over a linked list of all globals to find those with the matching value of an indicator. With the default setting 'detect_odr_violation=1', ASan doesn't report violations on same-size globals but it still has to traverse the list. For larger binaries with a ton of shared libs and globals (and a non-trivial volume of same-sized duplicates) this gets extremely expensive. This patch adds an indicator indexed (multi-)map of globals to speed up the search. > Note: asan used to use a map to store globals a while ago which was replaced with a list when the codebase [moved off of STL](https://github.com/llvm/llvm-project/commit/e4bada2c946e5399fc37bd67421de01c0047ad38). Internally we see many examples where ODR checking takes seconds (even double digits). With this patch it's practically free and `__asan_register_globals` doesn't show up prominently in the perf profile anymore. There are several high-level questions: 1. I understand that the intent is that we hit the slow path rarely, ideally once before the process dies with an error. But in practice we hit the slow path a lot. It feels reasonable to keep the amount of work bounded even in the worst case, even if it requires a bit of extra memory. But if not, it'd be great to learn about the tradeoffs. 2. Poisoning based ODR checking remains on the slow path. Internally we build everything with `-fsanitize-address-use-odr-indicator` so I'm not sure if poisoning-based check would exhibit the same behavior (looking at the code, the shape looks very similar, so it might?). 3. Globals with an ODR indicator of `-1` need to be skipped for the purposes of ODR checking (cf. https://github.com/llvm/llvm-project/commit/a257639a6935a2c63377784c5c9c3b73864a2582). But they are still getting added to the list of globals and hence take up space and slow down the iteration over the list of globals. 
	It would be a good saving if we could avoid adding them to the globals list. 4. Any reason to use a linked list instead of e.g. a vector to store globals? Test Plan: * `cmake --build build --target check-asan` looks good * Perf-wise things look good when linking against this version of compiler-rt. --------- Co-authored-by: Vitaly Buka <vitalybuka@google.com>
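	The indicator-indexed lookup described in this message can be sketched as below. This is a simplified illustration, not the compiler-rt code: `GlobalSketch`, `OdrIndexSketch`, and `RegisterAndCheck` are hypothetical names, and `std::unordered_multimap` is used only for brevity (the message notes the runtime itself moved off the STL). It mirrors two behaviours stated above: globals whose indicator is `-1` are skipped, and same-size duplicates are not reported under the default setting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical, simplified description of a registered global.
struct GlobalSketch {
  std::uintptr_t odr_indicator; // key shared by duplicate definitions
  std::size_t size;
  const char *name;
};

// Indicator value meaning "skip ODR checking for this global".
constexpr std::uintptr_t kSkipIndicator = static_cast<std::uintptr_t>(-1);

class OdrIndexSketch {
public:
  // Registers g and returns true if an earlier global with the same
  // indicator has a different size. Only the matching bucket is visited,
  // instead of walking a list of every registered global.
  bool RegisterAndCheck(const GlobalSketch &g) {
    if (g.odr_indicator == kSkipIndicator)
      return false; // never indexed, never reported
    bool violation = false;
    auto range = m_by_indicator.equal_range(g.odr_indicator);
    for (auto it = range.first; it != range.second; ++it) {
      if (it->second.size != g.size)
        violation = true; // same-size duplicates are tolerated
    }
    m_by_indicator.emplace(g.odr_indicator, g);
    return violation;
  }

private:
  std::unordered_multimap<std::uintptr_t, GlobalSketch> m_by_indicator;
};
```

	The trade-off is the one raised in the message: some extra memory for the index in exchange for keeping the work per registration bounded by the number of globals that share an indicator.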
2024-08-01	[libc][math][c23] Add dadd{l,f128} and ddiv{l,f128} C23 math functions (#100456)	aaryanshukla	24	-2/+282
	- fadd removed because I need to add for different input types - finishing rest of basic operations - noticed duplicates will remove --------- Co-authored-by: OverMighty <its.overmighty@gmail.com>
2024-08-01	[libc] Fix 'vasprintf' not working in non-fullbuild mode	Joseph Huber	2	-13/+14

2024-08-01	[SCEV] Prove no-self-wrap from negative power of two step (#101416)	Philip Reames	3	-21/+28
	We have existing code which reasons that a step evenly dividing the iteration space in a finite loop with a single exit implies no-self-wrap. The sign of the step doesn't affect this. --------- Co-authored-by: Nikita Popov <github@npopov.com>
2024-08-01	[flang][runtime] Added missing RT_API_ATTRS. (#101536)	Slava Zakharin	1	-4/+4

2024-08-01	[Clang] Fix nomerge attribute not working with __builtin_trap(), ↵	Zequan Wu	4	-2/+104
	__debugbreak(), __builtin_verbose_trap() (#101549) 1. It fixes the problem of llvm.trap() not getting the nomerge attribute. 2. It sets the nomerge flag for the node if the instruction has the nomerge attribute. This is a copy of https://reviews.llvm.org/D146164. This only attempts to fix `nomerge` for `__builtin_trap()`, `__debugbreak()`, and `__builtin_verbose_trap()`; it does not address `nomerge` not working for non-trap builtins. Fixes #53011
2024-08-01	[libc++] Revert "Check correctly ref-qualified __is_callable in algorithms ↵	Louis Dionne	23	-227/+84
	(#73451)" This reverts commit 8d151f804ff43aaed1edf810bb2a07607b8bba14, which broke some build bots. I think that is caused by an invalid argument order when checking __is_comparable in upper_bound.
2024-08-01	[flang] Add allocator_idx attribute on fir.embox and fircg.ext_embox (#101212)	Valentin Clement (バレンタインクレメン)	8	-14/+66
	#100690 introduces allocator registry with the ability to store allocator index in the descriptor. This patch adds an attribute to fir.embox and fircg.ext_embox to be able to set the allocator index while populating the descriptor fields.
2024-08-01	[Clang][NFC] Improve generation of GEP and RecordDecl loop (#101434)	Bill Wendling	4	-29/+28
	As with other loops, we need only look at a RecordDecl's FieldDecls. Convert to using them. In the meantime, we can improve the generation of the 'counted_by' FieldDecl's GEP by creating one GEP instead of a series of GEPs.
2024-08-01	[clang] fix classification of a string literal expression used as ↵	Matheus Izvekov	3	-7/+51
	initializer (#101447)
2024-08-01	[clang-format] Rename variable more sensitively (#100943)	Nathan Sidwell	1	-2/+2
	Renaming to `Disallowed`.