RS6000 add 128-bit Integer Operations part 1

2021-06-07 Carl Love <cel@us.ibm.com> gcc/ChangeLog * config/rs6000/altivec.h (vec_dive, vec_mod): Add define for new builtins. * config/rs6000/altivec.md (UNSPEC_VMULEUD, UNSPEC_VMULESD, UNSPEC_VMULOUD, UNSPEC_VMULOSD): New unspecs. (altivec_eqv1ti, altivec_gtv1ti, altivec_gtuv1ti, altivec_vmuleud, altivec_vmuloud, altivec_vmulesd, altivec_vmulosd, altivec_vrlq, altivec_vrlqmi, altivec_vrlqmi_inst, altivec_vrlqnm, altivec_vrlqnm_inst, altivec_vslq, altivec_vsrq, altivec_vsraq, altivec_vcmpequt_p, altivec_vcmpgtst_p, altivec_vcmpgtut_p): New define_insn. (vec_widen_umult_even_v2di, vec_widen_smult_even_v2di, vec_widen_umult_odd_v2di, vec_widen_smult_odd_v2di, altivec_vrlqmi, altivec_vrlqnm): New define_expands. * config/rs6000/rs6000-builtin.def (VCMPEQUT_P, VCMPGTST_P, VCMPGTUT_P): Add macro expansions. (BU_P10V_AV_P): Add builtin predicate definition. (VCMPGTUT, VCMPGTST, VCMPEQUT, CMPNET, CMPGE_1TI, CMPGE_U1TI, CMPLE_1TI, CMPLE_U1TI, VNOR_V1TI_UNS, VNOR_V1TI, VCMPNET_P, VCMPAET_P, VMULEUD, VMULESD, VMULOUD, VMULOSD, VRLQ, VSLQ, VSRQ, VSRAQ, VRLQNM, DIV_V1TI, UDIV_V1TI, DIVES_V1TI, DIVEU_V1TI, MODS_V1TI, MODU_V1TI, VRLQMI): New macro expansions. (VRLQ, VSLQ, VSRQ, VSRAQ, DIVE, MOD): New overload expansions. * config/rs6000/rs6000-call.c (P10_BUILTIN_VCMPEQUT, P10V_BUILTIN_CMPGE_1TI, P10V_BUILTIN_CMPGE_U1TI, P10V_BUILTIN_VCMPGTUT, P10V_BUILTIN_VCMPGTST, P10V_BUILTIN_CMPLE_1TI, P10V_BUILTIN_VCMPLE_U1TI, P10V_BUILTIN_DIV_V1TI, P10V_BUILTIN_UDIV_V1TI, P10V_BUILTIN_VMULESD, P10V_BUILTIN_VMULEUD, P10V_BUILTIN_VMULOSD, P10V_BUILTIN_VMULOUD, P10V_BUILTIN_VNOR_V1TI, P10V_BUILTIN_VNOR_V1TI_UNS, P10V_BUILTIN_VRLQ, P10V_BUILTIN_VRLQMI, P10V_BUILTIN_VRLQNM, P10V_BUILTIN_VSLQ, P10V_BUILTIN_VSRQ, P10V_BUILTIN_VSRAQ, P10V_BUILTIN_VCMPGTUT_P, P10V_BUILTIN_VCMPGTST_P, P10V_BUILTIN_VCMPEQUT_P, P10V_BUILTIN_VCMPGTUT_P, P10V_BUILTIN_VCMPGTST_P, P10V_BUILTIN_CMPNET, P10V_BUILTIN_VCMPNET_P, P10V_BUILTIN_VCMPAET_P, P10V_BUILTIN_DIVES_V1TI, P10V_BUILTIN_MODS_V1TI, P10V_BUILTIN_MODU_V1TI): New overloaded definitions. (rs6000_gimple_fold_builtin) [P10V_BUILTIN_VCMPEQUT, P10V_BUILTIN_CMPNET, P10V_BUILTIN_CMPGE_1TI, P10V_BUILTIN_CMPGE_U1TI, P10V_BUILTIN_VCMPGTUT, P10V_BUILTIN_VCMPGTST, P10V_BUILTIN_CMPLE_1TI, P10V_BUILTIN_CMPLE_U1TI]: New case statements. (rs6000_init_builtins) [bool_V1TI_type_node, int_ftype_int_v1ti_v1ti]: New assignments. (altivec_init_builtins): New E_V1TImode case statement. (builtin_function_type)[P10_BUILTIN_128BIT_VMULEUD, P10_BUILTIN_128BIT_VMULOUD, P10_BUILTIN_128BIT_DIVEU_V1TI, P10_BUILTIN_128BIT_MODU_V1TI, P10_BUILTIN_CMPGE_U1TI, P10_BUILTIN_VCMPGTUT, P10_BUILTIN_VCMPEQUT]: New case statements. * config/rs6000/rs6000.c (rs6000_handle_altivec_attribute) [E_TImode, E_V1TImode]: New case statements. * config/rs6000/rs6000.h (rs6000_builtin_type_index): New enum value RS6000_BTI_bool_V1TI. * config/rs6000/vector.md (vector_gtv1ti,vector_nltv1ti, vector_gtuv1ti, vector_nltuv1ti, vector_ngtv1ti, vector_ngtuv1ti, vector_eq_v1ti_p, vector_ne_v1ti_p, vector_ae_v1ti_p, vector_gt_v1ti_p, vector_gtu_v1ti_p, vrotlv1ti3, vashlv1ti3, vlshrv1ti3, vashrv1ti3): New define_expands. * config/rs6000/vsx.md (UNSPEC_VSX_DIVSQ, UNSPEC_VSX_DIVUQ, UNSPEC_VSX_DIVESQ, UNSPEC_VSX_DIVEUQ, UNSPEC_VSX_MODSQ, UNSPEC_VSX_MODUQ): New unspecs. (mulv2di3, vsx_div_v1ti, vsx_udiv_v1ti, vsx_dives_v1ti, vsx_diveu_v1ti, vsx_mods_v1ti, vsx_modu_v1ti, xxswapd_v1ti): New define_insns. (vcmpnet): New define_expand. * doc/extend.texi: Add documentation for the new builtins vec_rl, vec_rlmi, vec_rlnm, vec_sl, vec_sr, vec_sra, vec_mule, vec_mulo, vec_div, vec_dive, vec_mod, vec_cmpeq, vec_cmpne, vec_cmpgt, vec_cmplt, vec_cmpge, vec_cmple, vec_all_eq, vec_all_ne, vec_all_gt, vec_all_lt, vec_all_ge, vec_all_le, vec_any_eq, vec_any_ne, vec_any_gt, vec_any_lt, vec_any_ge, vec_any_le. gcc/testsuite/ChangeLog * gcc.target/powerpc/int_128bit-runnable.c: New test file.
author: Carl Love <cel@us.ibm.com> 2020-06-06 16:56:08 -0500
committer: Carl Love <cel@us.ibm.com> 2021-06-09 11:10:56 -0500
commit: f03122f2a7626772fe13ab77f677141377104502 (patch)
tree: 3e7b70eb56a1f9cac0f928b03854f51764bd7d6d /gcc/doc
parent: 2142e34340523e1553c0dc131f657893f307e291 (diff)
download: gcc-f03122f2a7626772fe13ab77f677141377104502.zip
gcc-f03122f2a7626772fe13ab77f677141377104502.tar.gz
gcc-f03122f2a7626772fe13ab77f677141377104502.tar.bz2
1 files changed, 171 insertions, 0 deletions
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 22f9e93..47b919c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -20174,6 +20174,177 @@ Generate PCV from specified Mask size, as if implemented by the
 immediate value is either 0, 1, 2 or 3.
 @findex vec_genpcvm
 
+@smallexample
+@exdent vector unsigned __int128 vec_rl (vector unsigned __int128 A,
+                                         vector unsigned __int128 B);
+@exdent vector signed __int128 vec_rl (vector signed __int128 A,
+                                       vector unsigned __int128 B);
+@end smallexample
+
+Result value: Each element of R is obtained by rotating the corresponding element
+of A left by the number of bits specified by the corresponding element of B.
+
+
+@smallexample
+@exdent vector unsigned __int128 vec_rlmi (vector unsigned __int128,
+                                           vector unsigned __int128,
+                                           vector unsigned __int128);
+@exdent vector signed __int128 vec_rlmi (vector signed __int128,
+                                         vector signed __int128,
+                                         vector unsigned __int128);
+@end smallexample
+
+Returns the result of rotating the first input and inserting it under mask
+into the second input.  The first bit in the mask, the last bit in the mask are
+obtained from the two 7-bit fields bits [108:115] and bits [117:123]
+respectively of the second input.  The shift is obtained from the third input
+in the 7-bit field [125:131] where all bits counted from zero at the left.
+
+@smallexample
+@exdent vector unsigned __int128 vec_rlnm (vector unsigned __int128,
+                                           vector unsigned __int128,
+                                           vector unsigned __int128);
+@exdent vector signed __int128 vec_rlnm (vector signed __int128,
+                                         vector unsigned __int128,
+                                         vector unsigned __int128);
+@end smallexample
+
+Returns the result of rotating the first input and ANDing it with a mask.  The
+first bit in the mask and the last bit in the mask are obtained from the two
+7-bit fields bits [117:123] and bits [125:131] respectively of the second
+input.  The shift is obtained from the third input in the 7-bit field bits
+[125:131] where all bits counted from zero at the left.
+
+@smallexample
+@exdent vector unsigned __int128 vec_sl(vector unsigned __int128 A, vector unsigned __int128 B);
+@exdent vector signed __int128 vec_sl(vector signed __int128 A, vector unsigned __int128 B);
+@end smallexample
+
+Result value: Each element of R is obtained by shifting the corresponding element of
+A left by the number of bits specified by the corresponding element of B.
+
+@smallexample
+@exdent vector unsigned __int128 vec_sr(vector unsigned __int128 A, vector unsigned __int128 B);
+@exdent vector signed __int128 vec_sr(vector signed __int128 A, vector unsigned __int128 B);
+@end smallexample
+
+Result value: Each element of R is obtained by shifting the corresponding element of
+A right by the number of bits specified by the corresponding element of B.
+
+@smallexample
+@exdent vector unsigned __int128 vec_sra(vector unsigned __int128 A, vector unsigned __int128 B);
+@exdent vector signed __int128 vec_sra(vector signed __int128 A, vector unsigned __int128 B);
+@end smallexample
+
+Result value: Each element of R is obtained by arithmetic shifting the corresponding
+element of A right by the number of bits specified by the corresponding element of B.
+
+@smallexample
+@exdent vector unsigned __int128 vec_mule (vector unsigned long long,
+                                           vector unsigned long long);
+@exdent vector signed __int128 vec_mule (vector signed long long,
+                                         vector signed long long);
+@end smallexample
+
+Returns a vector containing a 128-bit integer result of multiplying the even
+doubleword elements of the two inputs.
+
+@smallexample
+@exdent vector unsigned __int128 vec_mulo (vector unsigned long long,
+                                           vector unsigned long long);
+@exdent vector signed __int128 vec_mulo (vector signed long long,
+                                         vector signed long long);
+@end smallexample
+
+Returns a vector containing a 128-bit integer result of multiplying the odd
+doubleword elements of the two inputs.
+
+@smallexample
+@exdent vector unsigned __int128 vec_div (vector unsigned __int128,
+                                          vector unsigned __int128);
+@exdent vector signed __int128 vec_div (vector signed __int128,
+                                        vector signed __int128);
+@end smallexample
+
+Returns the result of dividing the first operand by the second operand. An
+attempt to divide any value by zero or to divide the most negative signed
+128-bit integer by negative one results in an undefined value.
+
+@smallexample
+@exdent vector unsigned __int128 vec_dive (vector unsigned __int128,
+                                           vector unsigned __int128);
+@exdent vector signed __int128 vec_dive (vector signed __int128,
+                                         vector signed __int128);
+@end smallexample
+
+The result is produced by shifting the first input left by 128 bits and
+dividing by the second.  If an attempt is made to divide by zero or the result
+is larger than 128 bits, the result is undefined.
+
+@smallexample
+@exdent vector unsigned __int128 vec_mod (vector unsigned __int128,
+                                          vector unsigned __int128);
+@exdent vector signed __int128 vec_mod (vector signed __int128,
+                                        vector signed __int128);
+@end smallexample
+
+The result is the modulo result of dividing the first input  by the second
+input.
+
+The following builtins perform 128-bit vector comparisons.  The
+@code{vec_all_xx}, @code{vec_any_xx}, and @code{vec_cmpxx}, where @code{xx} is
+one of the operations @code{eq, ne, gt, lt, ge, le} perform pairwise
+comparisons between the elements at the same positions within their two vector
+arguments.  The @code{vec_all_xx}function returns a non-zero value if and only
+if all pairwise comparisons are true.  The @code{vec_any_xx} function returns
+a non-zero value if and only if at least one pairwise comparison is true.  The
+@code{vec_cmpxx}function returns a vector of the same type as its two
+arguments, within which each element consists of all ones to denote that
+specified logical comparison of the corresponding elements was true.
+Otherwise, the element of the returned vector contains all zeros.
+
+@smallexample
+vector bool __int128 vec_cmpeq (vector signed __int128, vector signed __int128);
+vector bool __int128 vec_cmpeq (vector unsigned __int128, vector unsigned __int128);
+vector bool __int128 vec_cmpne (vector signed __int128, vector signed __int128);
+vector bool __int128 vec_cmpne (vector unsigned __int128, vector unsigned __int128);
+vector bool __int128 vec_cmpgt (vector signed __int128, vector signed __int128);
+vector bool __int128 vec_cmpgt (vector unsigned __int128, vector unsigned __int128);
+vector bool __int128 vec_cmplt (vector signed __int128, vector signed __int128);
+vector bool __int128 vec_cmplt (vector unsigned __int128, vector unsigned __int128);
+vector bool __int128 vec_cmpge (vector signed __int128, vector signed __int128);
+vector bool __int128 vec_cmpge (vector unsigned __int128, vector unsigned __int128);
+vector bool __int128 vec_cmple (vector signed __int128, vector signed __int128);
+vector bool __int128 vec_cmple (vector unsigned __int128, vector unsigned __int128);
+
+int vec_all_eq (vector signed __int128, vector signed __int128);
+int vec_all_eq (vector unsigned __int128, vector unsigned __int128);
+int vec_all_ne (vector signed __int128, vector signed __int128);
+int vec_all_ne (vector unsigned __int128, vector unsigned __int128);
+int vec_all_gt (vector signed __int128, vector signed __int128);
+int vec_all_gt (vector unsigned __int128, vector unsigned __int128);
+int vec_all_lt (vector signed __int128, vector signed __int128);
+int vec_all_lt (vector unsigned __int128, vector unsigned __int128);
+int vec_all_ge (vector signed __int128, vector signed __int128);
+int vec_all_ge (vector unsigned __int128, vector unsigned __int128);
+int vec_all_le (vector signed __int128, vector signed __int128);
+int vec_all_le (vector unsigned __int128, vector unsigned __int128);
+
+int vec_any_eq (vector signed __int128, vector signed __int128);
+int vec_any_eq (vector unsigned __int128, vector unsigned __int128);
+int vec_any_ne (vector signed __int128, vector signed __int128);
+int vec_any_ne (vector unsigned __int128, vector unsigned __int128);
+int vec_any_gt (vector signed __int128, vector signed __int128);
+int vec_any_gt (vector unsigned __int128, vector unsigned __int128);
+int vec_any_lt (vector signed __int128, vector signed __int128);
+int vec_any_lt (vector unsigned __int128, vector unsigned __int128);
+int vec_any_ge (vector signed __int128, vector signed __int128);
+int vec_any_ge (vector unsigned __int128, vector unsigned __int128);
+int vec_any_le (vector signed __int128, vector signed __int128);
+int vec_any_le (vector unsigned __int128, vector unsigned __int128);
+@end smallexample
+
+
 @node PowerPC Hardware Transactional Memory Built-in Functions
 @subsection PowerPC Hardware Transactional Memory Built-in Functions
 GCC provides two interfaces for accessing the Hardware Transactional
author	Carl Love <cel@us.ibm.com>	2020-06-06 16:56:08 -0500
committer	Carl Love <cel@us.ibm.com>	2021-06-09 11:10:56 -0500
commit	f03122f2a7626772fe13ab77f677141377104502 (patch)
tree	3e7b70eb56a1f9cac0f928b03854f51764bd7d6d /gcc/doc
parent	2142e34340523e1553c0dc131f657893f307e291 (diff)
download	gcc-f03122f2a7626772fe13ab77f677141377104502.zip gcc-f03122f2a7626772fe13ab77f677141377104502.tar.gz gcc-f03122f2a7626772fe13ab77f677141377104502.tar.bz2