riscv-gnu-toolchain/llvm.git

author	Simon Pilgrim <llvm-dev@redking.me.uk>	2016-07-29 10:23:10 +0000
committer	Simon Pilgrim <llvm-dev@redking.me.uk>	2016-07-29 10:23:10 +0000
commit	cb780b32a3024857a9b2d297c47b8a4b221e16bf (patch)
tree	3f3c6a6a8dac92403d3bba296913d7ff197170c9 /lldb/source/Plugins/Process/gdb-remote/GDBRemoteClientBase.cpp
parent	83d5d5680fe1d3b1442d621f56c510a8c8367c43 (diff)

[X86][SSE] Optimize the truncation of vector comparison results with PACKSS

We currently default to using either generic shuffles or MASK+PACKUS/PACKSS to truncate all integer vectors. For vector comparisons, we know that the result will be either all or zero bits in every element, which can be efficiently truncated by directly using PACKSS to repeatedly halve the size of each element.

Due to the limited input values (-1 or 0) we don't need to account for vector element size, so for simplicity we just use the PACKSS(vXi16,vXi16) implementation in all cases.

Additionally for AVX2 PACKSS of 256bit data we must perform a PERMQ shuffle to reorder the data into the correct order. I did investigate performing a single shuffle after all the PACKSS calls but the need to cross 128bit lanes makes this difficult to achieve efficiently.

We avoid performing this on AVX512 as it should have better alternative truncation instructions.

Differential Revision: https://reviews.llvm.org/D22814

llvm-svn: 277132
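The reasoning in the message above can be illustrated with a small scalar model. This is only a sketch, not the code from the patch: the types V128/V256 and the helpers packss_wb, packss_wb_256, permq and cmp_mask_i32 are hypothetical names that imitate the i16-to-i8 signed-saturating pack (PACKSSWB), its per-128-bit-lane AVX2 form, and a 64-bit element permute. It shows that a comparison mask of i32 elements, reinterpreted as i16 elements, packs down to a valid i16 mask, and that the 256-bit form needs a {0,2,1,3} reorder afterwards.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical byte-level models of 128-bit and 256-bit vector registers.
using V128 = std::array<std::uint8_t, 16>;
using V256 = std::array<std::uint8_t, 32>;

// Read one i16 element and narrow it to i8 with signed saturation.
// For the only inputs a comparison mask can hold (-1 or 0) the value
// is preserved exactly, whatever the host byte order.
static std::uint8_t sat_i16_to_i8(const std::uint8_t *p) {
  std::int16_t v;
  std::memcpy(&v, p, sizeof v);
  int c = v < -128 ? -128 : (v > 127 ? 127 : v);
  return static_cast<std::uint8_t>(static_cast<std::int8_t>(c));
}

// Model of a 128-bit PACKSSWB: low half from a, high half from b.
V128 packss_wb(const V128 &a, const V128 &b) {
  V128 out{};
  for (std::size_t i = 0; i < 8; ++i) {
    out[i] = sat_i16_to_i8(&a[2 * i]);
    out[8 + i] = sat_i16_to_i8(&b[2 * i]);
  }
  return out;
}

// Model of the 256-bit form: the pack is done independently in each
// 128-bit lane, so the 64-bit chunks come out as a.lo, b.lo, a.hi, b.hi.
V256 packss_wb_256(const V256 &a, const V256 &b) {
  V256 out{};
  for (std::size_t lane = 0; lane < 2; ++lane) {
    for (std::size_t i = 0; i < 8; ++i) {
      out[16 * lane + i] = sat_i16_to_i8(&a[16 * lane + 2 * i]);
      out[16 * lane + 8 + i] = sat_i16_to_i8(&b[16 * lane + 2 * i]);
    }
  }
  return out;
}

// Model of a 64-bit element permute: output chunk i is input chunk idx[i].
V256 permq(const V256 &v, const std::array<std::size_t, 4> &idx) {
  V256 out{};
  for (std::size_t i = 0; i < 4; ++i)
    std::memcpy(&out[8 * i], &v[8 * idx[i]], 8);
  return out;
}

// Build a comparison-style mask of i32 elements: all ones or all zeros.
template <std::size_t N>
std::array<std::uint8_t, N> cmp_mask_i32(const std::array<bool, N / 4> &bits) {
  std::array<std::uint8_t, N> out{};
  for (std::size_t i = 0; i < N / 4; ++i)
    std::memset(&out[4 * i], bits[i] ? 0xFF : 0x00, 4);
  return out;
}
```

In this model, packss_wb(lo, hi) on two 4 x i32 masks yields an 8 x i16 mask directly, while the 256-bit case is permq(packss_wb_256(a, b), {0, 2, 1, 3}). Applying the same pack again would halve the element size once more, which is the "repeatedly halve" step the message describes.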


