tcg: Add INDEX_op_dupm_vec

Allow the backend to expand dup from memory directly, instead of
forcing the value into a temp first.  This is especially important
if integer/vector register moves do not exist.

Note that officially tcg_out_dupm_vec is allowed to fail.
If it did, we could fix this up relatively easily:

  VECE == 32/64:
    Load the value into a vector register, then dup.
    Both of these must work.

  VECE == 8/16:
    If the value happens to be at an offset such that an aligned
    load would place the desired value in the least significant
    end of the register, go ahead and load w/garbage in high bits.

    Load the value w/INDEX_op_ld{8,16}_i32.
    Attempt a move directly to vector reg, which may fail.
    Store the value into the backing store for OTS.
    Load the value into the vector reg w/TCG_TYPE_I32, which must work.
    Duplicate from the vector reg into itself, which must work.

All of which is well and good, except that all supported
hosts can support dupm for all vece, so all of the failure
paths would be dead code and untestable.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
author: Richard Henderson <richard.henderson@linaro.org> 2019-03-17 01:55:22 +0000
committer: Richard Henderson <richard.henderson@linaro.org> 2019-05-13 22:52:08 +0000
commit: 37ee55a081b7863ffab2151068dd1b2f11376914 (patch)
tree: fb44f76c0e0b814f5d408d5007213c0a7605cbff /tcg/i386
parent: f23e5e15edfd49d5dd72cab2ed2d85ac354b2eeb (diff)
1 files changed, 4 insertions, 0 deletions
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index f4bd00e..5b33bbd 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -2829,6 +2829,9 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_st_vec:
         tcg_out_st(s, type, a0, a1, a2);
         break;
+    case INDEX_op_dupm_vec:
+        tcg_out_dupm_vec(s, type, vece, a0, a1, a2);
+        break;
 
     case INDEX_op_x86_shufps_vec:
         insn = OPC_SHUFPS;
@@ -3115,6 +3118,7 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
 
     case INDEX_op_ld_vec:
     case INDEX_op_st_vec:
+    case INDEX_op_dupm_vec:
         return &x_r;
 
     case INDEX_op_add_vec:
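
For readers unfamiliar with the operation, the following is a minimal reference model of what "dup from memory" computes, written in portable C. It is a sketch for illustration only and is not code from this commit or from QEMU: the name dupm_model and its parameters are hypothetical, and the only convention borrowed from TCG is that vece is the log2 of the element size in bytes (0, 1, 2, 3 for 8-, 16-, 32- and 64-bit elements).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical reference model (not QEMU code): read one element of
 * (1 << vece) bytes from base + offset and replicate it across a
 * vector of vec_bytes bytes.
 */
static void dupm_model(uint8_t *dst, size_t vec_bytes, unsigned vece,
                       const uint8_t *base, ptrdiff_t offset)
{
    uint8_t elem[8];
    size_t esize;
    size_t i;

    assert(vece <= 3);                 /* 8-, 16-, 32- or 64-bit elements */
    esize = (size_t)1 << vece;

    /* Copy the element out first, in case dst overlaps the source. */
    memcpy(elem, base + offset, esize);

    /* Fill every lane of the destination with the same element. */
    for (i = 0; i + esize <= vec_bytes; i += esize) {
        memcpy(dst + i, elem, esize);
    }
}
```

A backend's tcg_out_dupm_vec is expected to emit host instructions with this effect, taking the element straight from memory rather than through an intermediate temporary.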