aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-09-22Allow -mno-evex512 usagedevel/ix86/evex512Haochen Jiang4-1/+40
gcc/ChangeLog: * config/i386/i386.opt: Allow -mno-evex512. gcc/testsuite/ChangeLog: * gcc.target/i386/noevex512-1.c: New test. * gcc.target/i386/noevex512-2.c: Ditto. * gcc.target/i386/noevex512-3.c: Ditto.
2023-09-22Support -mevex512 for AVX512FP16 intrinsHaochen Jiang1-23/+21
gcc/ChangeLog: * config/i386/sse.md (V48H_AVX512VL): Add TARGET_EVEX512. (VFH): Ditto. (VF2H): Ditto. (VFH_AVX512VL): Ditto. (VHFBF): Ditto. (VHF_AVX512VL): Ditto. (VI2H_AVX512VL): Ditto. (VI2F_256_512): Ditto. (VF48_I1248): Remove unused iterator. (VF48H_AVX512VL): Add TARGET_EVEX512. (VF_AVX512): Remove unused iterator. (REDUC_PLUS_MODE): Add TARGET_EVEX512. (REDUC_SMINMAX_MODE): Ditto. (FMAMODEM): Ditto. (VFH_SF_AVX512VL): Ditto. (VEC_PERM_AVX2): Ditto. Co-authored-by: Hu, Lin1 <lin1.hu@intel.com>
2023-09-22Support -mevex512 for ↵Haochen Jiang2-27/+31
AVX512{IFMA,VBMI,VNNI,BF16,VPOPCNTDQ,VBMI2,BITALG,VP2INTERSECT},VAES,GFNI,VPCLMULQDQ intrins gcc/ChangeLog: * config/i386/sse.md (VI1_AVX512VL): Add TARGET_EVEX512. (VI8_FVL): Ditto. (VI1_AVX512F): Ditto. (VI1_AVX512VNNI): Ditto. (VI1_AVX512VL_F): Ditto. (VI12_VI48F_AVX512VL): Ditto. (*avx512f_permvar_truncv32hiv32qi_1): Ditto. (sdot_prod<mode>): Ditto. (VEC_PERM_AVX2): Ditto. (VPERMI2): Ditto. (VPERMI2I): Ditto. (vpmadd52<vpmadd52type>v8di): Ditto. (usdot_prod<mode>): Ditto. (vpdpbusd_v16si): Ditto. (vpdpbusds_v16si): Ditto. (vpdpwssd_v16si): Ditto. (vpdpwssds_v16si): Ditto. (VI48_AVX512VP2VL): Ditto. (avx512vp2intersect_2intersectv16si): Ditto. (VF_AVX512BF16VL): Ditto. (VF1_AVX512_256): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr90096.c: Adjust error message. Co-authored-by: Hu, Lin1 <lin1.hu@intel.com>
2023-09-22Support -mevex512 for AVX512BW intrinsHaochen Jiang4-126/+128
gcc/Changelog: * config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate): Make sure there is EVEX512 enabled. (ix86_expand_vecop_qihi2): Refuse V32QI->V32HI when no EVEX512. * config/i386/i386.cc (ix86_hard_regno_mode_ok): Disable 64 bit mask when !TARGET_EVEX512. * config/i386/i386.md (avx512bw_512): New. (SWI1248_AVX512BWDQ_64): Add TARGET_EVEX512. (*zero_extendsidi2): Change isa to avx512bw_512. (kmov_isa): Ditto. (*anddi_1): Ditto. (*andn<mode>_1): Change isa to kmov_isa. (*<code><mode>_1): Ditto. (*notxor<mode>_1): Ditto. (*one_cmpl<mode>2_1): Ditto. (*one_cmplsi2_1_zext): Change isa to avx512bw_512. (*ashl<mode>3_1): Change isa to kmov_isa. (*lshr<mode>3_1): Ditto. * config/i386/sse.md (VI12HFBF_AVX512VL): Add TARGET_EVEX512. (VI1248_AVX512VLBW): Ditto. (VHFBF_AVX512VL): Ditto. (VI): Ditto. (VIHFBF): Ditto. (VI_AVX2): Ditto. (VI1_AVX512): Ditto. (VI12_256_512_AVX512VL): Ditto. (VI2_AVX2_AVX512BW): Ditto. (VI2_AVX512VNNIBW): Ditto. (VI2_AVX512VL): Ditto. (VI2HFBF_AVX512VL): Ditto. (VI8_AVX2_AVX512BW): Ditto. (VIMAX_AVX2_AVX512BW): Ditto. (VIMAX_AVX512VL): Ditto. (VI12_AVX2_AVX512BW): Ditto. (VI124_AVX2_24_AVX512F_1_AVX512BW): Ditto. (VI248_AVX512VL): Ditto. (VI248_AVX512VLBW): Ditto. (VI248_AVX2_8_AVX512F_24_AVX512BW): Ditto. (VI248_AVX512BW): Ditto. (VI248_AVX512BW_AVX512VL): Ditto. (VI248_512): Ditto. (VI124_256_AVX512F_AVX512BW): Ditto. (VI_AVX512BW): Ditto. (VIHFBF_AVX512BW): Ditto. (SWI1248_AVX512BWDQ): Ditto. (SWI1248_AVX512BW): Ditto. (SWI1248_AVX512BWDQ2): Ditto. (*knotsi_1_zext): Ditto. (define_split for zero_extend + not): Ditto. (kunpckdi): Ditto. (REDUC_SMINMAX_MODE): Ditto. (VEC_EXTRACT_MODE): Ditto. (*avx512bw_permvar_truncv16siv16hi_1): Ditto. (*avx512bw_permvar_truncv16siv16hi_1_hf): Ditto. (truncv32hiv32qi2): Ditto. (avx512bw_<code>v32hiv32qi2): Ditto. (avx512bw_<code>v32hiv32qi2_mask): Ditto. (avx512bw_<code>v32hiv32qi2_mask_store): Ditto. (usadv64qi): Ditto. (VEC_PERM_AVX2): Ditto. (AVX512ZEXTMASK): Ditto. (SWI24_MASK): New. (vec_pack_trunc_<mode>): Change iterator to SWI24_MASK. (avx512bw_packsswb<mask_name>): Add TARGET_EVEX512. (avx512bw_packssdw<mask_name>): Ditto. (avx512bw_interleave_highv64qi<mask_name>): Ditto. (avx512bw_interleave_lowv64qi<mask_name>): Ditto. (<mask_codefor>avx512bw_pshuflwv32hi<mask_name>): Ditto. (<mask_codefor>avx512bw_pshufhwv32hi<mask_name>): Ditto. (vec_unpacks_lo_di): Ditto. (SWI48x_MASK): New. (vec_unpacks_hi_<mode>): Change iterator to SWI48x_MASK. (avx512bw_umulhrswv32hi3<mask_name>): Add TARGET_EVEX512. (VI1248_AVX512VL_AVX512BW): Ditto. (avx512bw_<code>v32qiv32hi2<mask_name>): Ditto. (*avx512bw_zero_extendv32qiv32hi2_1): Ditto. (*avx512bw_zero_extendv32qiv32hi2_2): Ditto. (<insn>v32qiv32hi2): Ditto. (pbroadcast_evex_isa): Change isa attribute to avx512bw_512. (VPERMI2): Add TARGET_EVEX512. (VPERMI2I): Ditto.
2023-09-22Support -mevex512 for AVX512DQ intrinsHaochen Jiang3-17/+31
gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_sse2_mulvxdi3): Add TARGET_EVEX512 for 512 bit usage. * config/i386/i386.cc (standard_sse_constant_opcode): Ditto. * config/i386/sse.md (VF1_VF2_AVX512DQ): Ditto. (VF1_128_256VL): Ditto. (VF2_AVX512VL): Ditto. (VI8_256_512): Ditto. (<mask_codefor>fixuns_trunc<mode><sseintvecmodelower>2<mask_name>): Ditto. (AVX512_VEC): Ditto. (AVX512_VEC_2): Ditto. (VI4F_BRCST32x2): Ditto. (VI8F_BRCST64x2): Ditto.
2023-09-22Support -mevex512 for AVX512F intrinsHaochen Jiang9-335/+445
gcc/ChangeLog: * config/i386/i386-builtins.cc (ix86_vectorize_builtin_gather): Disable 512 bit gather when !TARGET_EVEX512. * config/i386/i386-expand.cc (ix86_valid_mask_cmp_mode): Add TARGET_EVEX512. (ix86_expand_int_sse_cmp): Ditto. (ix86_expand_vector_init_one_nonzero): Disable subroutine when !TARGET_EVEX512. (ix86_emit_swsqrtsf): Add TARGET_EVEX512. (ix86_vectorize_vec_perm_const): Disable subroutine when !TARGET_EVEX512. * config/i386/i386.cc (standard_sse_constant_p): Add TARGET_EVEX512. (standard_sse_constant_opcode): Ditto. (ix86_get_ssemov): Ditto. (ix86_legitimate_constant_p): Ditto. (ix86_vectorize_builtin_scatter): Diable 512 bit scatter when !TARGET_EVEX512. * config/i386/i386.md (avx512f_512): New. (movxi): Add TARGET_EVEX512. (*movxi_internal_avx512f): Ditto. (*movdi_internal): Change alternative 12 to ?Yv. Adjust mode for alternative 13. (*movsi_internal): Change alternative 8 to ?Yv. Adjust mode for alternative 9. (*movhi_internal): Change alternative 11 to *Yv. (*movdf_internal): Change alternative 12 to Yv. (*movsf_internal): Change alternative 5 to Yv. Adjust mode for alternative 5 and 6. (*mov<mode>_internal): Change alternative 4 to Yv. (define_split for convert SF to DF): Add TARGET_EVEX512. (extendbfsf2_1): Ditto. * config/i386/predicates.md (bcst_mem_operand): Disable predicate for 512 bit when !TARGET_EVEX512. * config/i386/sse.md (VMOVE): Add TARGET_EVEX512. (V48_AVX512VL): Ditto. (V48_256_512_AVX512VL): Ditto. (V48H_AVX512VL): Ditto. (VI12_AVX512VL): Ditto. (V): Ditto. (V_512): Ditto. (V_256_512): Ditto. (VF): Ditto. (VF1_VF2_AVX512DQ): Ditto. (VFH): Ditto. (VFB): Ditto. (VF1): Ditto. (VF1_AVX2): Ditto. (VF2): Ditto. (VF2H): Ditto. (VF2_512_256): Ditto. (VF2_512_256VL): Ditto. (VF_512): Ditto. (VFB_512): Ditto. (VI48_AVX512VL): Ditto. (VI1248_AVX512VLBW): Ditto. (VF_AVX512VL): Ditto. (VFH_AVX512VL): Ditto. (VF1_AVX512VL): Ditto. (VI): Ditto. (VIHFBF): Ditto. (VI_AVX2): Ditto. (VI8): Ditto. (VI8_AVX512VL): Ditto. (VI2_AVX512F): Ditto. (VI4_AVX512F): Ditto. (VI4_AVX512VL): Ditto. (VI48_AVX512F_AVX512VL): Ditto. (VI8_AVX2_AVX512F): Ditto. (VI8_AVX_AVX512F): Ditto. (V8FI): Ditto. (V16FI): Ditto. (VI124_AVX2_24_AVX512F_1_AVX512BW): Ditto. (VI248_AVX512VLBW): Ditto. (VI248_AVX2_8_AVX512F_24_AVX512BW): Ditto. (VI248_AVX512BW): Ditto. (VI248_AVX512BW_AVX512VL): Ditto. (VI48_AVX512F): Ditto. (VI48_AVX_AVX512F): Ditto. (VI12_AVX_AVX512F): Ditto. (VI148_512): Ditto. (VI124_256_AVX512F_AVX512BW): Ditto. (VI48_512): Ditto. (VI_AVX512BW): Ditto. (VIHFBF_AVX512BW): Ditto. (VI4F_256_512): Ditto. (VI48F_256_512): Ditto. (VI48F): Ditto. (VI12_VI48F_AVX512VL): Ditto. (V32_512): Ditto. (AVX512MODE2P): Ditto. (STORENT_MODE): Ditto. (REDUC_PLUS_MODE): Ditto. (REDUC_SMINMAX_MODE): Ditto. (*andnot<mode>3): Change isa attribute to avx512f_512. (*andnot<mode>3): Ditto. (<code><mode>3): Ditto. (<code>tf3): Ditto. (FMAMODEM): Add TARGET_EVEX512. (FMAMODE_AVX512): Ditto. (VFH_SF_AVX512VL): Ditto. (avx512f_fix_notruncv16sfv16si<mask_name><round_name>): Ditto. (fix<fixunssuffix>_truncv16sfv16si2<mask_name><round_saeonly_name>): Ditto. (avx512f_cvtdq2pd512_2): Ditto. (avx512f_cvtpd2dq512<mask_name><round_name>): Ditto. (fix<fixunssuffix>_truncv8dfv8si2<mask_name><round_saeonly_name>): Ditto. (<mask_codefor>avx512f_cvtpd2ps512<mask_name><round_name>): Ditto. (vec_unpacks_lo_v16sf): Ditto. (vec_unpacks_hi_v16sf): Ditto. (vec_unpacks_float_hi_v16si): Ditto. (vec_unpacks_float_lo_v16si): Ditto. (vec_unpacku_float_hi_v16si): Ditto. (vec_unpacku_float_lo_v16si): Ditto. (vec_pack_sfix_trunc_v8df): Ditto. (avx512f_vec_pack_sfix_v8df): Ditto. (<mask_codefor>avx512f_unpckhps512<mask_name>): Ditto. (<mask_codefor>avx512f_unpcklps512<mask_name>): Ditto. (<mask_codefor>avx512f_movshdup512<mask_name>): Ditto. (<mask_codefor>avx512f_movsldup512<mask_name>): Ditto. (AVX512_VEC): Ditto. (AVX512_VEC_2): Ditto. (vec_extract_lo_v64qi): Ditto. (vec_extract_hi_v64qi): Ditto. (VEC_EXTRACT_MODE): Ditto. (<mask_codefor>avx512f_unpckhpd512<mask_name>): Ditto. (avx512f_movddup512<mask_name>): Ditto. (avx512f_unpcklpd512<mask_name>): Ditto. (*<avx512>_vternlog<mode>_all): Ditto. (*<avx512>_vpternlog<mode>_1): Ditto. (*<avx512>_vpternlog<mode>_2): Ditto. (*<avx512>_vpternlog<mode>_3): Ditto. (avx512f_shufps512_mask): Ditto. (avx512f_shufps512_1<mask_name>): Ditto. (avx512f_shufpd512_mask): Ditto. (avx512f_shufpd512_1<mask_name>): Ditto. (<mask_codefor>avx512f_interleave_highv8di<mask_name>): Ditto. (<mask_codefor>avx512f_interleave_lowv8di<mask_name>): Ditto. (vec_dupv2df<mask_name>): Ditto. (trunc<pmov_src_lower><mode>2): Ditto. (*avx512f_<code><pmov_src_lower><mode>2): Ditto. (*avx512f_vpermvar_truncv8div8si_1): Ditto. (avx512f_<code><pmov_src_lower><mode>2_mask): Ditto. (avx512f_<code><pmov_src_lower><mode>2_mask_store): Ditto. (truncv8div8qi2): Ditto. (avx512f_<code>v8div16qi2): Ditto. (*avx512f_<code>v8div16qi2_store_1): Ditto. (*avx512f_<code>v8div16qi2_store_2): Ditto. (avx512f_<code>v8div16qi2_mask): Ditto. (*avx512f_<code>v8div16qi2_mask_1): Ditto. (*avx512f_<code>v8div16qi2_mask_store_1): Ditto. (avx512f_<code>v8div16qi2_mask_store_2): Ditto. (vec_widen_umult_even_v16si<mask_name>): Ditto. (*vec_widen_umult_even_v16si<mask_name>): Ditto. (vec_widen_smult_even_v16si<mask_name>): Ditto. (*vec_widen_smult_even_v16si<mask_name>): Ditto. (VEC_PERM_AVX2): Ditto. (one_cmpl<mode>2): Ditto. (<mask_codefor>one_cmpl<mode>2<mask_name>): Ditto. (*one_cmpl<mode>2_pternlog_false_dep): Ditto. (define_split to xor): Ditto. (*andnot<mode>3): Ditto. (define_split for ior): Ditto. (*iornot<mode>3): Ditto. (*xnor<mode>3): Ditto. (*<nlogic><mode>3): Ditto. (<mask_codefor>avx512f_interleave_highv16si<mask_name>): Ditto. (<mask_codefor>avx512f_interleave_lowv16si<mask_name>): Ditto. (avx512f_pshufdv3_mask): Ditto. (avx512f_pshufd_1<mask_name>): Ditto. (*vec_extractv4ti): Ditto. (VEXTRACTI128_MODE): Ditto. (define_split to vec_extract): Ditto. (VI1248_AVX512VL_AVX512BW): Ditto. (<mask_codefor>avx512f_<code>v16qiv16si2<mask_name>): Ditto. (<insn>v16qiv16si2): Ditto. (avx512f_<code>v16hiv16si2<mask_name>): Ditto. (<insn>v16hiv16si2): Ditto. (avx512f_zero_extendv16hiv16si2_1): Ditto. (avx512f_<code>v8qiv8di2<mask_name>): Ditto. (*avx512f_<code>v8qiv8di2<mask_name>_1): Ditto. (*avx512f_<code>v8qiv8di2<mask_name>_2): Ditto. (<insn>v8qiv8di2): Ditto. (avx512f_<code>v8hiv8di2<mask_name>): Ditto. (<insn>v8hiv8di2): Ditto. (avx512f_<code>v8siv8di2<mask_name>): Ditto. (*avx512f_zero_extendv8siv8di2_1): Ditto. (*avx512f_zero_extendv8siv8di2_2): Ditto. (<insn>v8siv8di2): Ditto. (avx512f_roundps512_sfix): Ditto. (vashrv8di3): Ditto. (vashrv16si3): Ditto. (pbroadcast_evex_isa): Change isa attribute to avx512f_512. (vec_dupv4sf): Add TARGET_EVEX512. (*vec_dupv4si): Ditto. (*vec_dupv2di): Ditto. (vec_dup<mode>): Change isa attribute to avx512f_512. (VPERMI2): Add TARGET_EVEX512. (VPERMI2I): Ditto. (VEC_INIT_MODE): Ditto. (VEC_INIT_HALF_MODE): Ditto. (<mask_codefor>avx512f_vcvtph2ps512<mask_name><round_saeonly_name>): Ditto. (avx512f_vcvtps2ph512_mask_sae): Ditto. (<mask_codefor>avx512f_vcvtps2ph512<mask_name><round_saeonly_name>): Ditto. (*avx512f_vcvtps2ph512<merge_mask_name>): Ditto. (INT_BROADCAST_MODE): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr89229-5b.c: Modify message of scan-assembler. * gcc.target/i386/pr89229-6b.c: Ditto. * gcc.target/i386/pr89229-7b.c: Ditto.
2023-09-22Disable zmm register and 512 bit libmvec call when !TARGET_EVEX512Haochen Jiang4-33/+42
gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Disable zmm broadcast for !TARGET_EVEX512. * config/i386/i386-options.cc (ix86_option_override_internal): Do not use PVW_512 when no-evex512. (ix86_simd_clone_adjust): Add evex512 target into string. * config/i386/i386.cc (type_natural_mode): Report ABI warning when using zmm register w/o evex512. (ix86_return_in_memory): Do not allow zmm when !TARGET_EVEX512. (ix86_hard_regno_mode_ok): Ditto. (ix86_set_reg_reg_cost): Ditto. (ix86_rtx_costs): Ditto. (ix86_vector_mode_supported_p): Ditto. (ix86_preferred_simd_mode): Ditto. (ix86_get_mask_mode): Ditto. (ix86_simd_clone_compute_vecsize_and_simdlen): Disable 512 bit libmvec call when !TARGET_EVEX512. (ix86_simd_clone_usable): Ditto. * config/i386/i386.h (BIGGEST_ALIGNMENT): Disable 512 alignment when !TARGET_EVEX512 (MOVE_MAX): Do not use PVW_512 when !TARGET_EVEX512. (STORE_MAX_PIECES): Ditto.
2023-09-22Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtinsHaochen Jiang1-78/+78
gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512.
2023-09-22Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtinsHaochen Jiang1-94/+94
gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512.
2023-09-22Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtinsHaochen Jiang1-113/+113
gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512.
2023-09-22Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtinsHaochen Jiang1-47/+47
gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512.
2023-09-22Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtinsHaochen Jiang2-348/+372
gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512. * config/i386/i386-builtins.cc (ix86_init_mmx_sse_builtins): Ditto.
2023-09-22Push evex512 target for 512 bit intrinsHaochen Jiang1-2678/+2705
gcc/Changelog: * config/i386/avx512fp16intrin.h: Add evex512 target for 512 bit intrins. Co-authored-by: Hu, Lin1 <lin1.hu@intel.com>
2023-09-22Push evex512 target for 512 bit intrinsHaochen Jiang18-221/+282
gcc/ChangeLog: * config.gcc: Add avx512bitalgvlintrin.h. * config/i386/avx5124fmapsintrin.h: Add evex512 target for 512 bit intrins. * config/i386/avx5124vnniwintrin.h: Ditto. * config/i386/avx512bf16intrin.h: Ditto. * config/i386/avx512bitalgintrin.h: Add evex512 target for 512 bit intrins. Split 128/256 bit intrins to avx512bitalgvlintrin.h. * config/i386/avx512erintrin.h: Add evex512 target for 512 bit intrins * config/i386/avx512ifmaintrin.h: Ditto * config/i386/avx512pfintrin.h: Ditto * config/i386/avx512vbmi2intrin.h: Ditto. * config/i386/avx512vbmiintrin.h: Ditto. * config/i386/avx512vnniintrin.h: Ditto. * config/i386/avx512vp2intersectintrin.h: Ditto. * config/i386/avx512vpopcntdqintrin.h: Ditto. * config/i386/gfniintrin.h: Ditto. * config/i386/immintrin.h: Add avx512bitalgvlintrin.h. * config/i386/vaesintrin.h: Add evex512 target for 512 bit intrins. * config/i386/vpclmulqdqintrin.h: Ditto. * config/i386/avx512bitalgvlintrin.h: New.
2023-09-22Push evex512 target for 512 bit intrinsHaochen Jiang1-138/+153
gcc/ChangeLog: * config/i386/avx512bwintrin.h: Add evex512 target for 512 bit intrins.
2023-09-22Push evex512 target for 512 bit intrinsHaochen Jiang1-455/+467
gcc/ChangeLog: * config/i386/avx512dqintrin.h: Add evex512 target for 512 bit intrins.
2023-09-22Push evex512 target for 512 bit intrinsHaochen Jiang1-3666/+3745
gcc/ChangeLog: * config/i386/avx512fintrin.h: Add evex512 target for 512 bit intrins.
2023-09-22Initial support for -mevex512Haochen Jiang4-1/+39
gcc/ChangeLog: * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_EVEX512_SET): New. (OPTION_MASK_ISA2_EVEX512_UNSET): Ditto. (ix86_handle_option): Handle EVEX512. * config/i386/i386-c.cc (ix86_target_macros_internal): Ditto. * config/i386/i386-options.cc: (isa2_opts): Ditto. (ix86_valid_target_attribute_inner_p): Ditto. (ix86_option_override_internal): Set EVEX512 target if it is not explicitly set when AVX512 is enabled. Disable AVX512{PF,ER,4VNNIW,4FAMPS} for -mno-evex512. * config/i386/i386.opt: Add mevex512. Temporaily RejectNegative.
2023-09-21RISC-V: Add more VLS unary testsJuzhe-Zhong3-0/+173
Notice we are missing these tests. Committed. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/abs-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/not-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/sqrt-1.c: New test.
2023-09-21RISC-V: Support VLS mult highJuzhe-Zhong3-0/+159
Regression passed. Committed. gcc/ChangeLog: * config/riscv/vector-iterators.md: Extend VLS modes. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/def.h: Add VLS mult high. * gcc.target/riscv/rvv/autovec/vls/mulh-1.c: New test.
2023-09-21RISC-V: Adjusting the comments of the ↵Lehua Ding1-12/+21
emit_vlmax_insn/emit_vlmax_insn_lra/emit_nonvlmax_insn functions V2 Change: Use Robin's comments. This patch adjusts the comments of the emit_vlmax_insn/emit_vlmax_insn_lra/emit_nonvlmax_insn functions. The purpose of the adjustment is to make it clear that vlmax here is not VLMAX as defined inside the RVV ISA. This is because this function is used by RVV mode (e.g. RVVM1SImode) in addition to VLS mode (V16QI). For RVV mode, it means the same thing, for VLS mode, it indicates setting the vl to the number of units of the mode. Changed the comment because I didn't think of a better name. If there is a suitable name, feel free to discuss it. gcc/ChangeLog: * config/riscv/riscv-v.cc (emit_vlmax_insn): Adjust comments. (emit_nonvlmax_insn): Adjust comments. (emit_vlmax_insn_lra): Adjust comments. Co-Authored-By: Robin Dapp <rdapp.gcc@gmail.com>
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-*linux*.Iain Buclaw3-0/+63
gcc/ChangeLog: * config.gcc (*linux*): Set rust target_objs, and target_has_targetrustm, * config/t-linux (linux-rust.o): New rule. * config/linux-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for i[34567]86-*-mingw* and x86_64-*-mingw*.Iain Buclaw3-0/+46
gcc/ChangeLog: * config.gcc (i[34567]86-*-mingw* | x86_64-*-mingw*): Set rust_target_objs and target_has_targetrustm. * config/t-winnt (winnt-rust.o): New rule. * config/winnt-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-fuchsia*.Iain Buclaw3-0/+64
gcc/ChangeLog: * config.gcc (*-*-fuchsia): Set tmake_rule, rust_target_objs, and target_has_targetrustm. * config/fuchsia-rust.cc: New file. * config/t-fuchsia: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-vxworks*Iain Buclaw3-0/+47
gcc/ChangeLog: * config.gcc (*-*-vxworks*): Set rust_target_objs and target_has_targetrustm. * config/t-vxworks (vxworks-rust.o): New rule. * config/vxworks-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-dragonfly*Iain Buclaw3-0/+46
gcc/ChangeLog: * config.gcc (*-*-dragonfly*): Set rust_target_objs and target_has_targetrustm. * config/t-dragonfly (dragonfly-rust.o): New rule. * config/dragonfly-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-solaris2*.Iain Buclaw3-0/+47
gcc/ChangeLog: * config.gcc (*-*-solaris2*): Set rust_target_objs and target_has_targetrustm. * config/t-sol2 (sol2-rust.o): New rule. * config/sol2-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-openbsd*Iain Buclaw3-0/+47
gcc/ChangeLog: * config.gcc (*-*-openbsd*): Set rust_target_objs and target_has_targetrustm. * config/t-openbsd (openbsd-rust.o): New rule. * config/openbsd-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-netbsd*Iain Buclaw3-0/+46
gcc/ChangeLog: * config.gcc (*-*-netbsd*): Set rust_target_objs and target_has_targetrustm. * config/t-netbsd (netbsd-rust.o): New rule. * config/netbsd-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-freebsd*Iain Buclaw3-0/+46
gcc/ChangeLog: * config.gcc (*-*-freebsd*): Set rust_target_objs and target_has_targetrustm. * config/t-freebsd (freebsd-rust.o): New rule. * config/freebsd-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_OS_INFO for *-*-darwin*Iain Buclaw3-0/+50
gcc/ChangeLog: * config.gcc (*-*-darwin*): Set rust_target_objs and target_has_targetrustm. * config/t-darwin (darwin-rust.o): New rule. * config/darwin-rust.cc: New file.
2023-09-21rust: Implement TARGET_RUST_CPU_INFO for i[34567]86-*-* and x86_64-*-*Iain Buclaw3-0/+155
There are still quite a lot of the previously reverted i386-rust.cc missing, so it's only a partial reimplementation. gcc/ChangeLog: * config/i386/t-i386 (i386-rust.o): New rule. * config/i386/i386-rust.cc: New file. * config/i386/i386-rust.h: New file.
2023-09-21rust: Reintroduce TARGET_RUST_OS_INFO hookIain Buclaw4-0/+16
gcc/ChangeLog: * doc/tm.texi: Regenerate. * doc/tm.texi.in: Document TARGET_RUST_OS_INFO. gcc/rust/ChangeLog: * rust-session-manager.cc (Session::init): Call targetrustm.rust_os_info. * rust-target.def (rust_os_info): New hook.
2023-09-21rust: Reintroduce TARGET_RUST_CPU_INFO hookIain Buclaw6-3/+42
gcc/ChangeLog: * doc/tm.texi: Regenerate. * doc/tm.texi.in: Add @node for Rust language and ABI, and document TARGET_RUST_CPU_INFO. gcc/rust/ChangeLog: * rust-lang.cc (rust_add_target_info): Remove sorry. * rust-session-manager.cc: Replace include of target.h with include of tm.h and rust-target.h. (Session::init): Call targetrustm.rust_cpu_info. * rust-target.def (rust_cpu_info): New hook. * rust-target.h (rust_add_target_info): Declare.
2023-09-21rust: Add skeleton support and documentation for targetrustm hooks.Iain Buclaw11-3/+220
gcc/ChangeLog: * Makefile.in (tm_rust_file_list, tm_rust_include_list, TM_RUST_H, RUST_TARGET_DEF, RUST_TARGET_H, RUST_TARGET_OBJS): New variables. (tm_rust.h, cs-tm_rust.h, default-rust.o, rust/rust-target-hooks-def.h, s-rust-target-hooks-def-h): New rules. (s-tm-texi): Also check timestamp on rust-target.def. (generated_files): Add TM_RUST_H and rust-target-hooks-def.h. (build/genhooks.o): Also depend on RUST_TARGET_DEF. * config.gcc (tm_rust_file, rust_target_objs, target_has_targetrustm): New variables. * configure: Regenerate. * configure.ac (tm_rust_file_list, tm_rust_include_list, rust_target_objs): Add substitutes. * doc/tm.texi: Regenerate. * doc/tm.texi.in (targetrustm): Document. (target_has_targetrustm): Document. * genhooks.cc: Include rust/rust-target.def. * config/default-rust.cc: New file. gcc/rust/ChangeLog: * rust-target-def.h: New file. * rust-target.def: New file. * rust-target.h: New file.
2023-09-21RISC-V: Enable undefined support for RVV auto-vectorization[PR110751]Juzhe-Zhong22-53/+105
Now GCC middle-end can support undefined value which is traslated into (scratch:mode). This patch is to enable RISC-V backend undefine value in ELSE value of COND_LEN_xxx/COND_xxx. Consider this following case: __attribute__((noipa)) void vrem_int8_t (int8_t * __restrict dst, int8_t * __restrict a, int8_t * __restrict b, int n) { for (int i = 0; i < n; i++) dst[i] = a[i] % b[i]; } Before this patch: vrem_int8_t: ble a3,zero,.L5 vsetvli a5,zero,e8,m1,ta,ma vmv.v.i v4,0 ---> redundant. .L3: vsetvli a5,a3,e8,m1,tu,ma ---> should be TA. vmv1r.v v1,v4 ---> redudant. vle8.v v3,0(a1) vle8.v v2,0(a2) sub a3,a3,a5 vrem.vv v1,v3,v2 vse8.v v1,0(a0) add a1,a1,a5 add a2,a2,a5 add a0,a0,a5 bne a3,zero,.L3 .L5: ret After this patch: vrem_int8_t: ble a3,zero,.L5 .L3: vsetvli a5,a3,e8,m1,ta,ma vle8.v v1,0(a1) vle8.v v2,0(a2) sub a3,a3,a5 vrem.vv v1,v1,v2 vse8.v v1,0(a0) add a1,a1,a5 add a2,a2,a5 add a0,a0,a5 bne a3,zero,.L3 .L5: ret PR target/110751 gcc/ChangeLog: * config/riscv/autovec.md: Enable scratch rtx in ELSE operand. * config/riscv/predicates.md (autovec_else_operand): New predicate. * config/riscv/riscv-v.cc (get_else_operand): New function. (expand_cond_len_unop): Adapt ELSE value. (expand_cond_len_binop): Ditto. (expand_cond_len_ternop): Ditto. * config/riscv/riscv.cc (riscv_preferred_else_value): New function. (TARGET_PREFERRED_ELSE_VALUE): New targethook. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv-nofm.c: Adapt test. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv-nofm.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vdiv-rv64gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vrem-rv32gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/binop/vrem-rv64gcv.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-1.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-10.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-11.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-12.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-2.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-3.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-4.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-5.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-6.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-7.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-8.c: Ditto. * gcc.target/riscv/rvv/autovec/ternop/ternop_nofm-9.c: Ditto.
2023-09-21RISC-V: Fix SUBREG move of VLS mode[PR111486]Juzhe-Zhong2-1/+13
This patch fixes this bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111486 Before this patch, we can only handle (subreg:DI (reg:V8QI)) The PR ICE: during RTL pass: reload testcase.c: In function 'foo': testcase.c:8:1: internal compiler error: in require, at machmode.h:313 8 | } | ^ 0xa40cd2 opt_mode<machine_mode>::require() const /repo/gcc-trunk/gcc/machmode.h:313 0xa47091 opt_mode<machine_mode>::require() const /repo/gcc-trunk/gcc/config/riscv/riscv.cc:2546 0xa47091 riscv_legitimize_move(machine_mode, rtx_def*, rtx_def*) /repo/gcc-trunk/gcc/config/riscv/riscv.cc:2543 0x1f1df10 gen_movdi(rtx_def*, rtx_def*) /repo/gcc-trunk/gcc/config/riscv/riscv.md:2024 0x10f1423 rtx_insn* insn_gen_fn::operator()<rtx_def*, rtx_def*>(rtx_def*, rtx_def*) const /repo/gcc-trunk/gcc/recog.h:411 0x10f1423 emit_move_insn_1(rtx_def*, rtx_def*) /repo/gcc-trunk/gcc/expr.cc:4164 0x10f183d emit_move_insn(rtx_def*, rtx_def*) /repo/gcc-trunk/gcc/expr.cc:4334 0x13070ec lra_emit_move(rtx_def*, rtx_def*) /repo/gcc-trunk/gcc/lra.cc:509 0x132295b curr_insn_transform /repo/gcc-trunk/gcc/lra-constraints.cc:4748 0x1324335 lra_constraints(bool) /repo/gcc-trunk/gcc/lra-constraints.cc:5488 0x130a3d4 lra(_IO_FILE*) /repo/gcc-trunk/gcc/lra.cc:2419 0x12bb629 do_reload /repo/gcc-trunk/gcc/ira.cc:5970 0x12bb629 execute /repo/gcc-trunk/gcc/ira.cc:6156 Because of (subreg:DI (reg:V2QI)) PR target/111486 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_legitimize_move): Fix bug. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr111486.c: New test.
2023-09-21check undefine_p for one more vrJiufu Guo2-1/+9
The root cause of PR111355 and PR111482 is missing to check if vr0 is undefined_p before call vr0.lower_bound. In the pattern "(X + C) / N", (if (INTEGRAL_TYPE_P (type) && get_range_query (cfun)->range_of_expr (vr0, @0)) (if (...) (plus (op @0 @2) { wide_int_to_tree (type, plus_op1 (c)); }) (if (TYPE_UNSIGNED (type) && c.sign_mask () < 0 ... && wi::geu_p (vr0.lower_bound (), -c)) In "(if (...)", there is code to prevent vr0's undefined_p, But in the "else" part, vr0's undefined_p is not checked before "wi::geu_p (vr0.lower_bound (), -c)". PR tree-optimization/111355 gcc/ChangeLog: * match.pd ((X + C) / N): Update pattern. gcc/testsuite/ChangeLog: * gcc.dg/pr111355.c: New test.
2023-09-21using overflow_free_p to simplify patternJiufu Guo1-30/+6
In r14-3582, an "overflow_free_p" interface is added. The pattern of "(t * 2) / 2" in match.pd can be simplified by using this interface. gcc/ChangeLog: * match.pd ((t * 2) / 2): Update to use overflow_free_p.
2023-09-21RISC-V: Optimized for strided load/store with stride == element width[PR111450]xuli5-17/+250
When stride == element width, vlsse should be optimized into vle.v. vsse should be optimized into vse.v. PR target/111450 gcc/ChangeLog: * config/riscv/constraints.md (c01): const_int 1. (c02): const_int 2. (c04): const_int 4. (c08): const_int 8. * config/riscv/predicates.md (vector_eew8_stride_operand): New predicate for stride operand. (vector_eew16_stride_operand): Ditto. (vector_eew32_stride_operand): Ditto. (vector_eew64_stride_operand): Ditto. * config/riscv/vector-iterators.md: New iterator for stride operand. * config/riscv/vector.md: Add stride = element width constraint. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr111450.c: New test.
2023-09-21RISC-V: Rename predicate vector_gs_scale_operand_16/32 to more generic namesLehua Ding2-16/+16
This little rename vector_gs_scale_operand_16/32 to more generic names const_1_or_2/4_operand. So it's a little better understood when offered for use elsewhere. gcc/ChangeLog: * config/riscv/predicates.md (const_1_or_2_operand): Rename. (const_1_or_4_operand): Ditto. (vector_gs_scale_operand_16): Ditto. (vector_gs_scale_operand_32): Ditto. * config/riscv/vector-iterators.md: Adjust.
2023-09-21RISC-V: Support VLS INT <-> FP conversionsJuzhe-Zhong15-16/+882
Support INT <-> FP VLS auto-vectorization patterns. Regression passed. Committed. gcc/ChangeLog: * config/riscv/autovec.md: Extend VLS modes. * config/riscv/vector-iterators.md: Ditto. * config/riscv/vector.md: Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/convert-1.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-10.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-11.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-12.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-2.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-3.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-4.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-5.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-6.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-7.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-8.c: New test. * gcc.target/riscv/rvv/autovec/vls/convert-9.c: New test.
2023-09-21Daily bump.GCC Administrator9-1/+421
2023-09-20testsuite: Add test for already-fixed issue with _Pragma expansion [PR90400]Lewis Hyatt1-0/+14
The PR was fixed by r12-5454. Since the fix was somewhat incidental, although related, add a testcase from PR90400 too before closing it out. gcc/testsuite/ChangeLog: PR preprocessor/90400 * c-c++-common/cpp/pr90400.c: New test.
2023-09-20libcpp: Fix ICE on #include after a line marker directive [PR61474]Lewis Hyatt4-2/+21
As noted in the PR, GCC will segfault if a file name is first seen in a linemarker directive, and then later seen in a normal #include. This is because the fake include process adds the file to the cache with a null PATH member. The normal #include finds this file in the cache and then attempts to use the null PATH. Resolve by adding the file to the cache with a unique starting directory, so that the fake entry will only be found by a subsequent fake include, not by a real one. libcpp/ChangeLog: PR preprocessor/61474 * files.cc (_cpp_find_file): Set DONT_READ to TRUE for fake include files. (_cpp_fake_include): Pass a unique cpp_dir* address so the fake file will not be found when looked up for real. gcc/testsuite/ChangeLog: PR preprocessor/61474 * c-c++-common/cpp/pr61474-2.h: New test. * c-c++-common/cpp/pr61474.c: New test. * c-c++-common/cpp/pr61474.h: New test.
2023-09-20Tweak merge_range API.Andrew MacLeod1-24/+15
merge_range use to return TRUE if there was already a range. Now it returns TRUE if a new range is added, OR updates and existing range with a new value. FALSE is returned when the range already matches. * gimple-range-cache.cc (ssa_cache::merge_range): Change meaning of the return value. (ssa_cache::dump): Don't print GLOBAL RANGE header. (ssa_lazy_cache::merge_range): Adjust return value meaning. (ranger_cache::dump): Print GLOBAL RANGE header.
2023-09-20aarch64: Ensure const and sign correctnessPekka Seppänen1-2/+3
Be const and sign correct by using a matching CIE augmentation type. Use a builtin instead of relying <string.h> being included. libgcc/ChangeLog: * config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key): Use const unsigned type and a builtin. Signed-off-by: Pekka Seppänen <pexu@gcc.mail.kapsi.fi>
2023-09-20RISC-V: Remove math.h import to resolve missing stubs failuresPatrick O'Neill1-1/+0
Resolves some of the missing stubs failures: fatal error: gnu/stubs-lp64d.h: No such file or directory compilation terminated. 2023-09-20 Juzhe Zhong <juzhe.zhong@rivai.ai> gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/def.h: Remove unneeded math.h import. Tested-by: Patrick O'Neill <patrick@rivosinc.com>
2023-09-20[frange] Remove special casing from unordered operators.Aldy Hernandez3-16/+112
In coming up with testcases for the unordered folders, I realized that we were already handling them correctly, even in the absence of my work in this area lately. All of the unordered fold_range() methods try to fold with the ordered variants first, and if they return TRUE, we are guaranteed to be able to fold, even in the presence of NANs. For example: if (x_5 >= y_8) if (x_5 __UNLE y_8) On the true side of the first conditional we know that either x_5 < y_8 or that one or more operands is a NAN. Since UNLE_EXPR returns true for precisely this scenario, we can fold as true. This is handled in the fold_range() methods as follows: if (!range_op_handler (LE_EXPR).fold_range (r, type, op1_no_nan, op2_no_nan, trio)) return false; // The result is the same as the ordered version when the // comparison is true or when the operands cannot be NANs. if (!maybe_isnan (op1, op2) || r == range_true (type)) return true; This code has been there since the last release, and makes the special casing I am deleting obsolete. I have added tests to make sure we keep track of this behavior. gcc/ChangeLog: * range-op-float.cc (foperator_unordered_ge::fold_range): Remove special casing. (foperator_unordered_gt::fold_range): Same. (foperator_unordered_lt::fold_range): Same. (foperator_unordered_le::fold_range): Same. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/vrp-float-relations-5.c: New test. * gcc.dg/tree-ssa/vrp-float-relations-6.c: New test.
2023-09-20c, c++: Accept __builtin_classify_type (typename)Jakub Jelinek10-2/+387
As mentioned in my stdckdint.h mail, __builtin_classify_type has a problem that argument promotion (the argument is passed to ... prototyped builtin function) means that certain type classes will simply never appear. I think it is too late to change how it behaves, lots of code in the wild might rely on the current behavior. So, the following patch adds option to use a typename rather than expression as the operand to the builtin, making it behave similarly to sizeof, typeof or say the clang _Generic extension where the first argument can be there not just expression, but also typename. I think we have other prior art here, e.g. __builtin_va_arg also expects typename. I've added this to both C and C++, because it would be weird if it supported it only in C and not in C++. 2023-09-20 Jakub Jelinek <jakub@redhat.com> gcc/ * builtins.h (type_to_class): Declare. * builtins.cc (type_to_class): No longer static. Return int rather than enum. * doc/extend.texi (__builtin_classify_type): Document. gcc/c/ * c-parser.cc (c_parser_postfix_expression_after_primary): Parse __builtin_classify_type call with typename as argument. gcc/cp/ * parser.cc (cp_parser_postfix_expression): Parse __builtin_classify_type call with typename as argument. * pt.cc (tsubst_copy_and_build): Handle __builtin_classify_type with dependent typename as argument. gcc/testsuite/ * c-c++-common/builtin-classify-type-1.c: New test. * g++.dg/ext/builtin-classify-type-1.C: New test. * g++.dg/ext/builtin-classify-type-2.C: New test. * gcc.dg/builtin-classify-type-1.c: New test.