riscv-gnu-toolchain/llvm.git

author	Sander de Smalen <sander.desmalen@arm.com>	2025-10-03 11:07:07 +0200
committer	GitHub <noreply@github.com>	2025-10-03 10:07:07 +0100
commit	cc9c64d525ece2167a6fae657578a7379541ac6e (patch)
tree	edee05beeaaf73c55ba74b92fd160282dede4650 /libcxx/include/__algorithm/comp.h
parent	5cd3db3bed62c07790c17bf1947e98bc903472a9 (diff)
download	llvm-cc9c64d525ece2167a6fae657578a7379541ac6e.zip llvm-cc9c64d525ece2167a6fae657578a7379541ac6e.tar.gz llvm-cc9c64d525ece2167a6fae657578a7379541ac6e.tar.bz2

[AArch64] Refactor and refine cost-model for partial reductions (#158641)

This cost-model takes into account any type-legalisation that would happen on vectors such as splitting and promotion. This results in wider VFs being chosen for loops that can use partial reductions.

The cost-model now also assumes that when SVE is available, the SVE dot instructions for i16 -> i64 dot products can be used for fixed-length vectors. In practice this means that loops with non-scalable VFs are vectorized using partial reductions where they wouldn't before, e.g.

```
int64_t foo2(int8_t *src1, int8_t *src2, int N) {
  int64_t sum = 0;
  for (int i=0; i<N; ++i)
    sum += (int64_t)src1[i] * (int64_t)src2[i];
  return sum;
}
```

These changes also fix an issue where previously a partial reduction would be used for mixed sign/zero-extends (USDOT), even when +i8mm was not available.
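
For illustration only, a loop of the mixed sign/zero-extend shape mentioned above might look like the sketch below. The function name `dot_mixed` and the i8 -> i32 types are assumptions chosen for the example and are not taken from this patch or its tests. One operand is zero-extended and the other sign-extended, which is the pattern that maps onto USDOT and therefore needs +i8mm.

```c
#include <stdint.h>

/* Hypothetical example, not part of the patch: a reduction whose
 * multiply has one zero-extended operand (uint8_t) and one
 * sign-extended operand (int8_t). */
int32_t dot_mixed(const uint8_t *a, const int8_t *b, int n) {
  int32_t sum = 0;
  for (int i = 0; i < n; ++i)
    sum += (int32_t)a[i] * (int32_t)b[i]; /* zext(a[i]) * sext(b[i]) */
  return sum;
}
```

Whether a partial reduction is actually chosen for such a loop depends on the target features and on the costs computed by the vectorizer, so this is only a shape to look for, not a guaranteed outcome.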

Diffstat (limited to 'libcxx/include/__algorithm/comp.h')

0 files changed, 0 insertions, 0 deletions

