riscv-gnu-toolchain/llvm.git

Age	Commit message	Author	Files	Lines
40 hours	AMDGPU: Use ELF mangling in data layout (#163011)	Matt Arsenault	1	-1/+1
	Closes #95219
4 days	[clang][CodeGen] Remove "unsafe-fp-math" attribute support (#162779)	paperchalice	1	-6/+0
	These global flags block further improvements for clang; users should always use fast-math flags. See also https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797. Remove them incrementally; this is the clang part.
6 days	[clang][SPIR][SPIRV] Materialize non-generic null pointers via addrspacecast (#161773)	Wenju He	1	-3/+3
	LLVM models ConstantPointerNull as all-zero, but some GPUs (e.g. AMDGPU and our downstream GPU target) use a non-zero sentinel for null in private / local address spaces. SPIR-V is a supported input for our GPU target. This PR preserves a canonical zero form in the generic AS while allowing later lowering to substitute the target’s real sentinel.
8 days	[NFC] Change spelling of cluster feature to "clusters" (#162103)	Shilei Tian	1	-2/+2

9 days	[AMDGPU] Make cluster a target feature (#162040)	Shilei Tian	1	-2/+2
	This replaces the original arch check.
2025-09-29	[AMDGPU][SPIRV] Use SPIR-V syncscopes for some AMDGCN BIs (#154867)	Alex Voicu	3	-22/+30
	AMDGCN flavoured SPIR-V allows AMDGCN specific builtins, including those for scoped fences and some specific RMWs. However, at present we don't map syncscopes to their SPIR-V equivalents, but rather use the AMDGCN ones. This ends up pessimising the resulting code as system scope is used instead of device (agent) or subgroup (wavefront), so we correct the behaviour, to ensure that we do the right thing during reverse translation.
2025-09-24	[AMDGPU] Add the support for 45-bit buffer resource (#159702)	Shilei Tian	1	-10/+18
	On new targets like `gfx1250`, the buffer resource (V#) now uses this format: base (57-bit): `resource[56:0]`; num_records (45-bit): `resource[101:57]`; reserved (6-bit): `resource[107:102]`; stride (14-bit): `resource[121:108]`. This PR changes the type of `num_records` from `i32` to `i64` in both builtin and intrinsic, and also adds support for lowering the new format. Fixes SWDEV-554034. --------- Co-authored-by: Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>
2025-09-17	[AMDGPU] Add gfx1251 runlines to cooperative atomics tests. NFC (#159437)	Stanislav Mekhanoshin	1	-0/+1

2025-09-17	[AMDGPU] Add gfx1251 subtarget (#159430)	Stanislav Mekhanoshin	1	-0/+2

2025-09-16	[AMDGPU] Add s_cluster_barrier on gfx1250 (#159175)	Stanislav Mekhanoshin	1	-0/+10

2025-09-16	[AMDGPU] Add missing bf16-pk-insts feature to gfx1250 (#159167)	Stanislav Mekhanoshin	1	-1/+1

2025-09-15	[Dwarf] Support heterogeneous DW_{OP,AT}s needed for AMDGPU CFI (#153883)	Scott Linder	2	-50/+51
	These are defined in the user range until standard versions of them get adopted into dwarf, which is expected in DWARF6. Some of these amount to reservations currently as no code to use them is included. It would be very helpful to get them committed to avoid conflicts necessitating encoding changes while we are in the process of upstreaming. --------- Co-authored-by: Juan Martinez Fernandez <juamarti@amd.com> Co-authored-by: Emma Pilkington <Emma.Pilkington@amd.com>
2025-09-15	[Clang] [Sema] Make `-Wincompatible-pointer-types` an error by default (#157364)	Sirraide	1	-1/+1
	GCC 14 also made this an error by default, so we’re following suit. Fixes #74605
2025-09-15	Revert "[HIP][Clang] Remove __AMDGCN_WAVEFRONT_SIZE macros" (#158566)	Fabian Ritter	2	-1/+9
	Reverts llvm/llvm-project#157463. The PR breaks buildbots with old ROCm versions, so revert it and reapply when buildbots are updated.
2025-09-15	[HIP][Clang] Remove __AMDGCN_WAVEFRONT_SIZE macros (#157463)	Fabian Ritter	2	-9/+1
	Remove definitions, test uses, and documentation of the macros, which were deprecated in November 2024 with PR #112849 / #115507. Where required, the wavefront size should instead be queried via means provided by the HIP runtime: the (non-constexpr) `warpSize` variable in device code, or `hipGetDeviceProperties` in host code. This change passed AMD-internal testing. Implements SWDEV-522062.
2025-09-12	[clang] Regenerate test checks including TBAA semantics (NFC)	Antonio Frighetto	12	-455/+574
	Tests exercising TBAA metadata (both purposefully and not), and previously generated via UTC, have been regenerated and updated to version 6.
2025-09-10	[AMDGPU] Add builtins and intrinsics for cluster attributes (#157877)	Shilei Tian	1	-0/+168
	Co-authored-by: Ivan Kosarev <ivan.kosarev@amd.com>
2025-09-10	[AMDGPU] Add builtins for wave reduction intrinsics (#150170)	Aaditya	1	-0/+378

2025-09-04	[AMDGPU][gfx1250] Add 128B cooperative atomics (#156418)	Pierre van Houtryve	1	-0/+104
	Add clang built-ins + sema/codegen; add IR intrinsic + verifier; add DAG/GlobalISel codegen for the intrinsics; add lowering in SIMemoryLegalizer using an MMO flag.
2025-09-03	[AMDGPU] Support cluster_load_async_to_lds instructions on gfx1250 (#156595)	Changpeng Fang	1	-0/+40

2025-09-02	[AMDGPU] Support cluster load instructions for gfx1250 (#156548)	Changpeng Fang	1	-0/+36

2025-08-27	clang/AMDGPU: Add __builtin_amdgcn_inverse_ballot_w{32,64} (#155724)	Nicolai Hähnle	2	-0/+14
	Add builtins that expose the underlying llvm.amdgcn.inverse.ballot intrinsic that we've had for a while. This allows more explicitly writing code that selects or branches in terms of lane masks, which can lead to better code quality.
2025-08-27	[AMDGPU] Refactor insertWaveSizeFeature (#154850)	Stanislav Mekhanoshin	1	-0/+2
	If wavefrontsize32 or wavefrontsize64 is the only possible value, insert it into the feature list by default and use that value as an indication that the other wavefront size is not legal.
2025-08-26	[AMDGCN] Add missing gfx1250 clang tests. NFC. (#155478)	Stanislav Mekhanoshin	2	-0/+192

2025-08-21	[clang][CodeGen] cast addr space of ReturnValue if needed (#154380)	macurtis-amd	2	-0/+2
	Fixes a bug on AMDGPU targets where a pointer was stored as address space 5, but then loaded as address space 0. Issue found as part of [Kokkos](https://github.com/kokkos/kokkos) testing, specifically `hip.atomics` (see [core/unit_test/TestAtomics.hpp](https://github.com/kokkos/kokkos/blob/develop/core/unit_test/TestAtomics.hpp)). Issue was introduced by commit [39ec9de7c230](https://github.com/llvm/llvm-project/commit/39ec9de7c230) - [clang][CodeGen] sret args should always point to the alloca AS, so use that (https://github.com/llvm/llvm-project/pull/114062).
2025-08-12	[AMDGPU] Add s_barrier_init|join|leave instructions (#153296)	Stanislav Mekhanoshin	2	-0/+51

2025-08-12	[VectorCombine] Shrink loads used in shufflevector rebroadcasts. (#153138)	Leon Clark	1	-11/+11
	Reopen #128938. Attempt to shrink the size of vector loads where only some of the incoming lanes are used for rebroadcasts in shufflevector instructions. --------- Co-authored-by: Leon Clark <leoclark@amd.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-08	[IR] Remove size argument from lifetime intrinsics (#150248)	Nikita Popov	4	-34/+34
	Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).
2025-08-05	[AMDGPU] Add gfx1250 wmma_scale[16]_f32_32x16x128_f4 instructions (#152194)	Stanislav Mekhanoshin	1	-0/+22

2025-08-05	[Clang][AMDGPU] Add builtins for some buffer resource atomics (#149216)	zGoldthorpe	4	-39/+102
	This patch exposes builtins for atomic `add`, `max`, and `min` operations that operate over buffer resource pointers.
2025-08-04	[AMDGPU] gfx1250 v_wmma_scale[16]_f32_16x16x128_f8f6f4 codegen (#152036)	Stanislav Mekhanoshin	1	-0/+24

2025-08-04	Revert "[VectorCombine] Shrink loads used in shufflevector rebroadcasts" (#151960)	Simon Pilgrim	1	-11/+11
	Reverts llvm/llvm-project#128938 while a crash regression is investigated.
2025-08-04	[VectorCombine] Shrink loads used in shufflevector rebroadcasts (#128938)	Leon Clark	1	-11/+11
	Attempt to shrink the size of vector loads where only some of the incoming lanes are used for rebroadcasts in shufflevector instructions. --------- Co-authored-by: Leon Clark <leoclark@amd.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-02	[AMDGPU] v_cvt_scalef32_sr_pk16_* gfx1250 instructions (#151810)	Stanislav Mekhanoshin	1	-0/+42

2025-08-02	[AMDGPU] v_cvt_scalef32_pk16_* gfx1250 instructions (#151807)	Stanislav Mekhanoshin	1	-0/+36

2025-08-02	[AMDGPU] v_cvt_scale_pk16 gfx1250 instructions (#151804)	Stanislav Mekhanoshin	1	-0/+36

2025-08-01	[AMDGPU] gfx1250 v_perm_pk16_* instructions (#151773)	Stanislav Mekhanoshin	2	-1/+56

2025-08-01	[AMDGPU] gfx1250 v_cvt_scalef32_sr_pk8_* instructions (#151765)	Stanislav Mekhanoshin	1	-0/+106

2025-08-01	[AMDGPU] gfx1250 v_cvt_scalef32_pk8_* instructions (#151758)	Stanislav Mekhanoshin	1	-0/+94

2025-08-01	[AMDGPU] gfx1250 v_permlane_* instructions (#151749)	Stanislav Mekhanoshin	1	-0/+126

2025-08-01	[AMDGPU] Update tests (#151688)	Piotr Sobczak	1	-1/+1
	Fix two minor issues: add a double quote and remove an unused prefix.
2025-07-31	[AMDGPU] Add v_cvt_scale_pk8_* gfx1250 instructions (#151616)	Stanislav Mekhanoshin	1	-0/+111

2025-07-31	[AMDGPU] Add gfx1250 cvt_pk|sr_fp8|bf8_f32 instructions (#151595)	Stanislav Mekhanoshin	1	-0/+54

2025-07-31	[AMDGPU] v_cvt_sr_pk_f16_f32 gfx1250 instruction (#151482)	Stanislav Mekhanoshin	1	-0/+27

2025-07-30	[AMDGPU] Add v_cvt_sr|pk_bf8|fp8_f16 gfx1250 instructions (#151415)	Stanislav Mekhanoshin	1	-0/+138

2025-07-30	[AMDGPU] Add gfx1250 v_cvt_sr_pk_bf16_f32 instruction (#151385)	Stanislav Mekhanoshin	2	-1/+29

2025-07-29	[AMDGPU] Support builtin/intrinsics for async loads/stores on gfx1250 (#151058)	Changpeng Fang	1	-0/+83

2025-07-29	[AMDGPU] Allow readonly features to be written to IR when there is no target (#148141)	Changpeng Fang	2	-1/+17
	Fixes: SWDEV-541399
2025-07-24	[AMDGPU] Support builtin/intrinsics for load monitors on gfx1250 (#150540)	Changpeng Fang	1	-0/+66

2025-07-24	[AMDGPU] gfx1250 vmem prefetch target intrinsics and builtins (#150466)	Stanislav Mekhanoshin	2	-1/+20
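
The buffer resource (V#) layout quoted in the message for #159702 can be illustrated with a small bit-packing sketch. This is not the compiler's lowering code: the field positions and widths come from that commit message, while the names `FIELDS`, `pack_resource`, and `unpack_field` are hypothetical helpers invented for illustration. Bits above 121 are not described in the message and are left as zero here, which is an assumption.

```python
# Field layout as quoted in the commit message for #159702:
# name -> (lowest bit, width in bits). Bits [127:122] are not
# described there, so this sketch leaves them as zero (assumption).
FIELDS = {
    "base": (0, 57),
    "num_records": (57, 45),
    "reserved": (102, 6),
    "stride": (108, 14),
}


def pack_resource(base, num_records, stride):
    """Pack the fields into one integer (hypothetical helper)."""
    values = {
        "base": base,
        "num_records": num_records,
        "reserved": 0,
        "stride": stride,
    }
    word = 0
    for name, (low, width) in FIELDS.items():
        value = values[name]
        # Reject values that do not fit in the field's width.
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << low
    return word


def unpack_field(word, name):
    """Extract one field from a packed integer (hypothetical helper)."""
    low, width = FIELDS[name]
    return (word >> low) & ((1 << width) - 1)


# A num_records value above 2**32 fits in the 45-bit field, which is
# consistent with the commit widening num_records from i32 to i64.
word = pack_resource(base=0x1000, num_records=(1 << 40) + 5, stride=16)
```

Calling `unpack_field(word, "num_records")` on the packed value recovers the original count, which is a quick way to check that the fields do not overlap.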