author: Pengxuan Zheng <quic_pzheng@quicinc.com> 2024-06-12 18:23:13 -0700
committer: Pengxuan Zheng <quic_pzheng@quicinc.com> 2024-07-02 16:06:48 -0700
commit: 895bbc08d38c2aca3cbbab273a247021fea73930
tree: d4da22b7e4a092598b82a79b9c8078cdba32ddf4
parent: a7ad9cb813063ddf51269910f33b56116c10462c
aarch64: Add vector popcount besides QImode [PR113859]

This patch improves GCC’s vectorization of __builtin_popcount for the aarch64
target by adding popcount patterns for vector modes besides QImode, i.e.,
HImode, SImode and DImode.

With this patch, we now generate the following for V8HI:
  cnt     v1.16b, v0.16b
  uaddlp  v2.8h, v1.16b

For V4HI, we generate:
  cnt     v1.8b, v0.8b
  uaddlp  v2.4h, v1.8b

For V4SI, we generate:
  cnt     v1.16b, v0.16b
  uaddlp  v2.8h, v1.16b
  uaddlp  v3.4s, v2.8h

For V4SI with TARGET_DOTPROD, we generate the following instead:
  movi    v0.4s, #0
  movi    v1.16b, #1
  cnt     v3.16b, v2.16b
  udot    v0.4s, v3.16b, v1.16b

For V2SI, we generate:
  cnt     v1.8b, v0.8b
  uaddlp  v2.4h, v1.8b
  uaddlp  v3.2s, v2.4h

For V2SI with TARGET_DOTPROD, we generate the following instead:
  movi    v0.8b, #0
  movi    v1.8b, #1
  cnt     v3.8b, v2.8b
  udot    v0.2s, v3.8b, v1.8b

For V2DI, we generate:
  cnt     v1.16b, v0.16b
  uaddlp  v2.8h, v1.16b
  uaddlp  v3.4s, v2.8h
  uaddlp  v4.2d, v3.4s

For V2DI with TARGET_DOTPROD, we generate the following instead:
  movi    v0.4s, #0
  movi    v1.16b, #1
  cnt     v3.16b, v2.16b
  udot    v0.4s, v3.16b, v1.16b
  uaddlp  v0.2d, v0.4s

    PR target/113859

gcc/ChangeLog:

    * config/aarch64/aarch64-simd.md (aarch64_<su>addlp<mode>): Rename to...
    (@aarch64_<su>addlp<mode>): ... This.
    (popcount<mode>2): New define_expand.

gcc/testsuite/ChangeLog:

    * gcc.target/aarch64/popcnt-udot.c: New test.
    * gcc.target/aarch64/popcnt-vec.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
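
The instruction sequences in the commit message rely on two ways of combining per-byte bit counts into wider lanes. The following is a minimal scalar sketch of that arithmetic for a single 32-bit element, not the patch's implementation and not the contents of the new tests. All function names (`cnt_byte`, `popcount32_uaddlp`, `popcount32_udot`, `popcount_array`, `popcount_selfcheck`) are hypothetical and chosen for illustration. The loop in `popcount_array` is only an assumed example of the kind of code the vectorizer might target; whether it is actually vectorized depends on compiler flags and target options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one byte: the scalar analogue of one CNT lane. */
static uint8_t cnt_byte(uint8_t b)
{
    uint8_t n = 0;
    while (b) {
        b &= (uint8_t)(b - 1);   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Models CNT followed by two pairwise widening adds (UADDLP):
   4 byte counts -> 2 halfword sums -> 1 word sum. */
static uint32_t popcount32_uaddlp(uint32_t x)
{
    uint8_t b[4];
    for (int i = 0; i < 4; i++)
        b[i] = cnt_byte((uint8_t)(x >> (8 * i)));

    uint16_t h0 = (uint16_t)(b[0] + b[1]);   /* first UADDLP step  */
    uint16_t h1 = (uint16_t)(b[2] + b[3]);
    return (uint32_t)h0 + (uint32_t)h1;      /* second UADDLP step */
}

/* Models CNT followed by a dot product against an all-ones byte vector
   accumulated into a zeroed word: the four byte counts are summed at once. */
static uint32_t popcount32_udot(uint32_t x)
{
    uint32_t acc = 0;                        /* movi ..., #0 */
    for (int i = 0; i < 4; i++)
        acc += (uint32_t)cnt_byte((uint8_t)(x >> (8 * i))) * 1u;
    return acc;
}

/* Assumed shape of a loop that an auto-vectorizer could turn into
   vector popcount code; uses the GCC/Clang builtin. */
void popcount_array(uint32_t *restrict dst, const uint32_t *restrict src,
                    size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint32_t)__builtin_popcount(src[i]);
}

/* Cross-check both models against the builtin on a few sample values. */
static void popcount_selfcheck(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0xFFu, 0x80000001u, 0xDEADBEEFu, 0xFFFFFFFFu
    };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint32_t expected = (uint32_t)__builtin_popcount(samples[i]);
        assert(popcount32_uaddlp(samples[i]) == expected);
        assert(popcount32_udot(samples[i]) == expected);
    }
}
```

Calling `popcount_selfcheck()` from a `main` function exercises both models. The pairwise form needs one widening add per doubling of the element width, which is why the V2DI sequence has three `uaddlp` instructions, while the dot-product form collapses the byte-to-word step into a single `udot`.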