riscv-gnu-toolchain/llvm.git - Unnamed repository; edit this file 'description' to name the repository.


author	Craig Topper <craig.topper@intel.com>	2020-01-08 09:52:37 -0800
committer	Craig Topper <craig.topper@intel.com>	2020-01-08 10:06:01 -0800
commit	3811417f39a7d0a370fac2923060f5ef8dacd8d7 (patch)
tree	a6e162aaa21ca7ca568700250ac808f96a9e678c /lldb/scripts/Python/python-extensions.swig
parent	d60b3b4817cb9346b682bb75371c41642c273b13 (diff)
download	llvm-3811417f39a7d0a370fac2923060f5ef8dacd8d7.zip llvm-3811417f39a7d0a370fac2923060f5ef8dacd8d7.tar.gz llvm-3811417f39a7d0a370fac2923060f5ef8dacd8d7.tar.bz2

[X86] Custom type legalize v4i64->v4f32 uint_to_fp on sse4.1 targets in 64-bit mode

For v4i64->v4f32 uint_to_fp on pre-AVX targets where v4i64 isn't legal, we create two v2i64->v2f32 uint_to_fp operations whose results need to be shuffled together. Our codegen for v2i64->v2f32 involves detecting whether the number is larger than (2^63 - 1), i.e. too large for a signed i64. If so, we do a special division by 2 so that we can do a signed conversion, which we need to scalarize, then do a multiply by 2 at the end if we divided earlier.

When v4i64 isn't legal we need to split the check for a large number and the division by 2 across two v2i64 vectors. The scalar part can extract the 4 i64 values from those splits. But we can reassemble the 4 scalar f32 results directly into a single v4f32 vector. Then we just need to combine the fixup indications from the 2 halves, and we can do the final multiply by 2 fixup on all 4 values at once, if needed, using a single v4f32 blend and v4f32 fadd.

Differential Revision: https://reviews.llvm.org/D72368
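The scheme described above can be sketched in plain scalar C. This is an illustration only, not the code from the patch: the function name `u64x4_to_f32x4` is hypothetical, arrays stand in for vector registers, and the "special division by 2" is assumed to be the usual shift right by one that ORs the dropped low bit back in so rounding is preserved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the lowering described in the commit message.
 * Lanes 0-1 and 2-3 play the role of the two v2i64 halves. */
void u64x4_to_f32x4(const uint64_t in[4], float out[4]) {
    uint64_t src[4]; /* values that are safe to convert as signed */
    int fixup[4];    /* per-lane "we halved this value" indication */
    float conv[4];   /* reassembled v4f32 of signed conversions */

    /* Step 1: per half, detect large values and halve them. */
    for (int half = 0; half < 2; ++half) {
        for (int lane = 0; lane < 2; ++lane) {
            int i = 2 * half + lane;
            uint64_t x = in[i];
            fixup[i] = (x >> 63) != 0; /* larger than 2^63 - 1 */
            /* Assumed form of the special divide: keep the dropped bit
             * as a sticky bit so the final rounding stays correct. */
            src[i] = fixup[i] ? ((x >> 1) | (x & 1)) : x;
        }
    }

    /* Step 2: four scalar signed conversions, placed directly into
     * one 4-element result instead of two 2-element ones. */
    for (int i = 0; i < 4; ++i) {
        assert(src[i] <= (uint64_t)INT64_MAX); /* fits in signed i64 */
        conv[i] = (float)(int64_t)src[i];
    }

    /* Step 3: one 4-wide add (x + x is the multiply by 2) and one
     * 4-wide blend driven by the combined fixup indications. */
    for (int i = 0; i < 4; ++i) {
        float doubled = conv[i] + conv[i];
        out[i] = fixup[i] ? doubled : conv[i];
    }
}
```

Under these assumptions each lane should match a direct unsigned conversion, including inputs such as `0x8000008000000001` where the sticky bit decides the rounding direction.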

Diffstat (limited to 'lldb/scripts/Python/python-extensions.swig')

0 files changed, 0 insertions, 0 deletions

