Age | Commit message (Collapse) | Author | Files | Lines |
|
The gfx803 "Fiji" device was deprecated in GCC 14, removed from LLVM 18, and
hasn't worked properly with the drivers since about ROCm 4.
This patch removes the device from GCC options and documentation, and removes
the direct mentions from the internals.
The TARGET_GCN3 support in the back-end is now unused and can be removed (in a
follow-up patch).
gcc/ChangeLog:
* config.gcc (amdgcn-*-*): Remove "fiji" from with_arch checks.
* config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): Remove fiji alternative.
(NO_XNACK): Likewise.
(NO_SRAM_ECC): Likewise.
(ASM_SPEC): Remove "%{}" around ABI_VERSION_SPEC.
* config/gcn/gcn-opts.h (enum processor_type): Remove PROCESSOR_FIJI.
(TARGET_FIJI): Delete.
* config/gcn/gcn.cc (gcn_option_override): Remove Fiji.
(gcn_omp_device_kind_arch_isa): Likewise.
(output_file_start): Likewise.
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Likewise.
* config/gcn/gcn.opt (gpu_type): Likewise.
(march, mtune): Change default to PROCESSOR_VEGA10.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX803): Delete.
(copy_early_debug_info): Remove elf_flags_actual.
Use ELFABIVERSION_AMDGPU_HSA_V4 unconditionally.
(get_arch): Remove Fiji.
(main): Remove gfx803.
* config/gcn/t-omp-device
(omp-device-properties-gcn): Remove fiji and gfx803.
* doc/install.texi (amdgcn*-*-*): Remove fiji and special instructions.
* doc/invoke.texi: Remove fiji.
libgomp/ChangeLog:
* libgomp.texi: Remove fiji and gfx803.
* testsuite/libgomp.c/declare-variant-4.h: Remove fiji and gfx803.
* testsuite/libgomp.c/declare-variant-4-fiji.c: Removed.
* testsuite/libgomp.c/declare-variant-4-gfx803.c: Removed.
|
|
gcc/
* config.gcc (extra_objs) [target=avr]: Add avr-passes.o.
* config/avr/t-avr (avr-passes.o): New rule to make it.
* config/avr/avr.cc (#define INCLUDE_VECTOR): Remove.
(cfganal.h, cfgrtl.h, context.h, tree-pass.h, print-rtl.h): Don't
include them.
(avr_strict_signed_p, avr_strict_unsigned_p, avr_2comparisons_rhs)
(make_avr_pass_recompute_notes, make_avr_pass_casesi)
(make_avr_pass_ifelse, make_avr_pass_pre_proep, avr_split_tiny_move)
(emit_move_ccc, emit_move_ccc_after, reg_seen_between_p)
(avr_maybe_adjust_cfa, avr_redundant_compare_regs)
(avr_parallel_insn_from_insns, avr_is_casesi_sequence)
(avr_optimize_casesi, avr_redundant_compare, make_avr_pass_fuse_add)
(avr_optimize_2ifelse, avr_rest_of_handle_ifelse)
(avr_casei_sequence_check_operands)
Move functions...
(avr_pass_data_fuse_add, avr_pass_data_ifelse)
(avr_pass_data_casesi, avr_pass_data_recompute_notes)
(avr_pass_data_pre_proep): Move objects...
(avr_pass_fuse_add, avr_pass_pre_proep, avr_pass_recompute_notes)
(avr_pass_ifelse, avr_pass_casesi, AVR_LdSt_Props): Move classes...
* config/avr/avr-passes.cc: ... to this new C++ module.
(struct Ranges): Move to...
* config/avr/ranges.h: ...this new file.
* config/avr/avr-protos.h: Adjust comments.
|
|
gcc/ChangeLog:
* config.gcc: Add avx10_2copyintrin.h.
* config/i386/i386.md (avx10_2): New isa attribute.
* config/i386/immintrin.h: Include avx10_2copyintrin.h.
* config/i386/sse.md
(sse_movss_<mode>): Add new constraints to handle AVX10.2.
(vec_set<mode>_0): Ditto.
(@vec_set<mode>_0): Ditto.
(vec_set<mode>_0): Ditto.
(avx512fp16_mov<mode>): Ditto.
(*vec_set<mode>_0_1): New split.
* config/i386/avx10_2copyintrin.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx10_2-vmovd-1.c: New test.
* gcc.target/i386/avx10_2-vmovd-2.c: Ditto.
* gcc.target/i386/avx10_2-vmovw-1.c: Ditto.
* gcc.target/i386/avx10_2-vmovw-2.c: Ditto.
|
|
gcc/ChangeLog:
* config.gcc: Add avx10_2-512minmaxintrin.h and
avx10_2minmaxintrin.h.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (V8BF, V8BF, V8BF, INT, V8BF, UQI),
(V16BF, V16BF, V16BF, INT, V16BF, UHI),
(V32BF, V32BF, V32BF, INT, V32BF, USI),
(V8HF, V8HF, V8HF, INT, V8HF, UQI),
(V8DF, V8DF, V8DF, INT, V8DF, UQI, INT),
(V32HF, V32HF, V32HF, INT, V32HF, USI, INT),
(V16HF, V16HF, V16HF, INT, V16HF, UHI, INT),
(V16SF, V16SF, V16SF, INT, V16SF, UHI, INT).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc
(ix86_expand_args_builtin): Handle V8BF_FTYPE_V8BF_V8BF_INT_V8BF_UQI,
V16BF_FTYPE_V16BF_V16BF_INT_V16BF_UHI,
V32BF_FTYPE_V32BF_V32BF_INT_V32BF_USI,
V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI,
(ix86_expand_round_builtin): Handle V8DF_FTYPE_V8DF_V8DF_INT_V8DF_UQI_INT,
V32HF_FTYPE_V32HF_V32HF_INT_V32HF_USI_INT,
V16HF_FTYPE_V16HF_V16HF_INT_V16HF_UHI_INT.
V16SF_FTYPE_V16SF_V16SF_INT_V16SF_UHI_INT.
* config/i386/immintrin.h: Include avx10_2-512mixmaxintrin.h and
avx10_2minmaxintrin.h.
* config/i386/sse.md (VFH_AVX10_2): New.
(avx10_2_vminmaxnepbf16_<mode><mask_name>): New define_insn.
(avx10_2_minmaxp<mode><mask_name><round_saeonly_name>): Ditto.
(avx10_2_minmaxs<mode><mask_scalar_name><round_saeonly_scalar_name>): Ditto.
* config/i386/avx10_2-512minmaxintrin.h: New file.
* config/i386/avx10_2minmaxintrin.h: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add macros.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/avx512f-helper.h: Add helper function.
* gcc.target/i386/avx10-minmax-helper.h: New helper file.
* gcc.target/i386/avx10_2-512-minmax-1.c: New test.
* gcc.target/i386/avx10_2-512-vminmaxnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxpd-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxph-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminmaxps-2.c: Ditto.
* gcc.target/i386/avx10_2-minmax-1.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxsd-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxsh-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxss-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxpd-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxph-2.c: Ditto.
* gcc.target/i386/avx10_2-vminmaxps-2.c: Ditto.
Co-authored-by: Hu, Lin1 <lin1.hu@intel.com>
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
gcc/ChangeLog:
* config.gcc: Add avx10_2satcvtintrin.h and
avx10_2-512satcvtintrin.h.
* config/i386/i386-builtin-types.def:
Add DEF_FUNCTION_TYPE (V8HI, V8BF, V8HI, UQI),
(V16HI, V16BF, V16HI, UHI), (V32HI, V32BF, V32HI, USI),
(V16SI, V16SF, V16SI, UHI, INT), (V16HI, V16BF, V16HI, UHI, INT),
(V32HI, V32BF, V32HI, USI, INT).
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle
V32HI_FTYPE_V32BF_V32HI_USI, V16HI_FTYPE_V16BF_V16HI_UHI,
V8HI_FTYPE_V8BF_V8HI_UQI.
(ix86_expand_round_builtin): Handle V32HI_FTYPE_V32BF_V32HI_USI_INT,
V16SI_FTYPE_V16SF_V16SI_UHI_INT, V16HI_FTYPE_V16BF_V16HI_UHI_INT.
* config/i386/immintrin.h: Include avx10_2satcvtintrin.h and
avx10_2-512savcvtintrin.h.
* config/i386/sse.md:
(UNSPEC_CVTNE_BF16_IBS_ITER): New iterator.
(sat_cvt_sign_prefix): Ditto.
(sat_cvt_trunc_prefix): Ditto.
(UNSPEC_CVT_PH_IBS_ITER): Ditto.
(UNSPEC_CVTT_PH_IBS_ITER): Ditto.
(UNSPEC_CVT_PS_IBS_ITER): Ditto.
(UNSPEC_CVTT_PS_IBS_ITER): Ditto.
(avx10_2_cvt<sat_cvt_trunc_prefix>nebf162i<sat_cvt_sign_prefix>bs<mode><mask_name>):
New define_insn.
(avx10_2_cvtph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>):
Ditto.
(avx10_2_cvttph2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>):
Ditto.
(avx10_2_cvtps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_name>):
Ditto.
(avx10_2_cvttps2i<sat_cvt_sign_prefix>bs<mode><mask_name><round_saeonly_name>):
Ditto.
* config/i386/avx10_2-512satcvtintrin.h: New file.
* config/i386/avx10_2satcvtintrin.h: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add macros.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/avx512f-helper.h: Add new test macro.
* gcc.target/i386/m512-check.h: Add new type.
* gcc.target/i386/avx10_2-512-satcvt-1.c: New test.
* gcc.target/i386/avx10_2-512-vcvtnebf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtnebf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtps2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttnebf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttnebf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvttps2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-satcvt-1.c: Ditto.
* gcc.target/i386/avx10_2-vcvtnebf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtnebf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttnebf162ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttnebf162iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttph2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttph2iubs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2ibs-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvttps2iubs-2.c: Ditto.
|
|
gcc/ChangeLog:
* config.gcc: Add avx10_2-512bf16intrin.h and avx10_2bf16intrin.h.
* config/i386/i386-builtin-types.def : Add new
DEF_FUNCTION_TYPE for V32BF_FTYPE_V32BF_V32BF,
V16BF_FTYPE_V16BF_V16BF, V8BF_FTYPE_V8BF_V8BF,
V8BF_FTYPE_V8BF_V8BF_UQI, V16BF_FTYPE_V16BF_V16BF_UHI,
V32BF_FTYPE_V32BF_V32BF_USI, V32BF_FTYPE_V32BF_V32BF_V32BF_USI,
V8BF_FTYPE_V8BF_V8BF_V8BF_UQI and V16BF_FTYPE_V16BF_V16BF_V16BF_UHI.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_args_builtin):
Handle new DEF_FUNCTION_TYPE.
* config/i386/immintrin.h: Include avx10_2-512bf16intrin.h and
avx10_2bf16intrin.h.
* config/i386/sse.md
(VBF_AVX10_2): New iterator.
(avx10_2_scalefpbf16_<mode><mask_name>): New define_insn.
(avx10_2_<code>nepbf16_<mode><mask_name>): Ditto.
(avx10_2_<insn>nepbf16_<mode><mask_name>): Ditto.
(avx10_2_fmaddnepbf16_<mode>_maskz): New expander.
(avx10_2_fnmaddnepbf16_<mode>_maskz): Ditto.
(avx10_2_fmsubnepbf16_<mode>_maskz): Ditto.
(avx10_2_fnmsubnepbf16_<mode>_maskz): Ditto.
(avx10_2_fmaddnepbf16_<mode><sd_maskz_name>): New define_insn.
(avx10_2_fmaddnepbf16_<mode>_mask): Ditto.
(avx10_2_fmaddnepbf16_<mode>_mask3): Ditto.
(avx10_2_fnmaddnepbf16_<mode><sd_maskz_name>): Ditto.
(avx10_2_fnmaddnepbf16_<mode>_mask): Ditto.
(avx10_2_fnmaddnepbf16_<mode>_mask3): Ditto.
(avx10_2_fmsubnepbf16_<mode><sd_maskz_name>): Ditto.
(avx10_2_fmsubnepbf16_<mode>_mask): Ditto.
(avx10_2_fmsubnepbf16_<mode>_mask3): Ditto.
(avx10_2_fnmsubnepbf16_<mode><sd_maskz_name>): Ditto.
(avx10_2_fnmsubnepbf16_<mode>_mask): Ditto.
(avx10_2_fnmsubnepbf16_<mode>_mask3): Ditto.
* config/i386/avx10_2-512bf16intrin.h: New file.
* config/i386/avx10_2bf16intrin.h: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512f-helper.h: Add MAKE_MASK_MERGE and MAKE_MASK_ZERO
for bf16_uw.
* gcc.target/i386/m512-check.h: Add union512bf16_uw, union256bf16_uw,
union128bf16_uw and CHECK_EXP for them.
* gcc.target/i386/avx10-helper.h: New file.
* gcc.target/i386/avx10_2-512-bf16-1.c: New test.
* gcc.target/i386/avx10_2-512-vaddnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vdivnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfmaddXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfmsubXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfnmaddXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vfnmsubXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vmaxpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vminpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vscalefpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vsubnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-bf16-1.c: Ditto.
* gcc.target/i386/avx10_2-vaddnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vdivnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfmaddXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfmsubXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfnmaddXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vfnmsubXXXnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vmaxpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vminpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vmulnepbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vscalefpbf16-2.c: Ditto.
* gcc.target/i386/avx10_2-vsubnepbf16-2.c: Ditto.
Co-authored-by: Levy Hsu <admin@levyhsu.com>
|
|
gcc/ChangeLog:
* config.gcc: Add avx10_2-512convertintrin.h and
avx10_2convertintrin.h.
* config/i386/i386-builtin-types.def: Add new DEF_POINTER_TYPE
and DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_args_builtin):
Handle AVX10.2.
(ix86_expand_round_builtin): Ditto.
* config/i386/immintrin.h: Include avx10_2-512convertintrin.h,
avx10_2convertintrin.h.
* config/i386/sse.md (VHF_AVX10_2): New iterator.
(bf16_ph): Add 512 bit mode.
(avx10_2_cvt2ps2phx_<mode><mask_name<round_name>): New define_insn.
(ssebvecmode): New iterator.
(UNSPEC_NECONVERTFP8_PACK): Ditto.
(neconvertfp8_pack): Ditto.
(vcvt<neconvertfp8_pack><mode><mask_name>): New define_insn.
(ssebvecmode_2): New iterator.
(UNSPEC_VCVTBIASPH2FP8_PACK): Ditto.
(biasph2fp8_pack): Ditto.
(vcvt<biasph2fp8_pack>v8hf): New expander.
(vcvt<biasph2fp8_pack>v8hf_mask): Ditto.
(*vcvt<biasph2bf8_pack>v8hf): New define_insn.
(*vcvt<biasph2fp8_pack>v8hf_mask): Ditto.
(VHF_AVX10_2_2): New iterator.
(vcvt<biasph2fp8_pack><mode><mask_name>): New define_insn.
(VHF_256_512): New iterator.
(ph2fp8suff): Ditto.
(UNSPEC_NECONVERTPH2FP8_PACK): Ditto.
(neconvertph2fp8): Ditto.
(vcvt<neconvertph2fp8>v8hf_mask): New expander.
(*vcvt<neconvertph2fp8>v8hf): New define_insn.
(*vcvt<neconvertph2fp8>v8hf_mask): Ditto.
(vcvt<neconvertph2fp8><mode><mask_name>): Ditto.
(vcvthf82ph<mode><mask_name>): Ditto.
* config/i386/avx10_2-512convertintrin.h: New file.
* config/i386/avx10_2convertintrin.h: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add macros for const.
* gcc.target/i386/avx-2.c: Ditto.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/avx10_2-512-convert-1.c: New test.
* gcc.target/i386/avx10_2-512-vcvt2ps2phx-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvthf82ph-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtne2ph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtne2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtne2ph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtne2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtneph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-convert-1.c: Ditto.
* gcc.target/i386/avx10_2-vcvt2ps2phx-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvthf82ph-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtne2ph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtne2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtne2ph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtne2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtneph2bf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtneph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtneph2hf8-2.c: Ditto.
* gcc.target/i386/avx10_2-vcvtneph2hf8s-2.c: Ditto.
* gcc.target/i386/fp8-helper.h: New helper file.
Co-authored-by: Levy Hsu <admin@levyhsu.com>
Co-authored-by: Kong Lingling <lingling.kong@intel.com>
|
|
gcc/ChangeLog
* config.gcc: Add avx10_2mediaintrin.h and
avx10_2-512mediaintrin.h.
* config/i386/i386-builtin.def: Add new builtins.
* config/i386/i386-builtins.cc (def_builtin): Handle shared
builtins between AVXVNNIINT8 and AVX10.2.
* config/i386/i386-expand.cc (ix86_check_builtin_isa_match):
Ditto.
* config/i386/immintrin.h: Include avx10_2mediaintrin.h and
avx10_2-512mediaintrin.h
* config/i386/sse.md: (VI4_AVX10_2): New.
(vpdp<vpdotprodtype>_<mode>): Add AVX10_2_256.
(vpdp<vpdotprodtype>_v16si): New define_insn.
(vpdp<vpdotprodtype>_<mode>_mask): Ditto.
(*vpdp<vpdotprodtype>_<mode>_maskz): Ditto.
(vpdp<vpdotprodtype>_<mode>_maskz): New expander.
* config/i386/avx10_2-512mediaintrin.h: New file.
* config/i386/avx10_2mediaintrin.h: Ditto.
gcc/testsuite/ChangeLog
* gcc.target/i386/avx512f-helper.h: Reuse AVX512F macros
for AVX10.
* gcc.target/i386/funcspec-56.inc: Add new target attribute.
* lib/target-supports.exp
(check_effective_target_avx10_2): New.
(check_effective_target_avx10_2_512): Ditto.
* gcc.target/i386/avx10-check.h: New test file.
* gcc.target/i386/avx10-helper.h: Ditto.
* gcc.target/i386/avx10_2-builtin-1.c: Ditto.
* gcc.target/i386/avx10_2-512-media-1.c: Ditto.
* gcc.target/i386/avx10_2-media-1.c: Ditto..
* gcc.target/i386/avxvnniint8-builtin.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbssd-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbssds-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbsud-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbsuds-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vpdpbuuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbssd-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbssds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbsud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbsuds-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbuud-2.c: Ditto.
* gcc.target/i386/avx10_2-vpdpbuuds-2.c: Ditto.
Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
|
|
gcc/ChangeLog:
* config.gcc: Add avx10_2roundingintrin.h.
* config/i386/i386-builtin-types.def: Add new DEF_FUNCTION_TYPE.
* config/i386/i386-builtin.def (BDESC): Add new builtins.
* config/i386/i386-expand.cc (ix86_expand_round_builtin): Handle
V4DF_FTYPE_V4DF_V4DF_V4DF_UQI_INT, V8SF_FTYPE_V8SF_V8SF_V8SF_UQI_INT,
V16HF_FTYPE_V16HF_V16HF_V16HF_UHI_INT, UQI_FTYPE_V4DF_V4DF_INT_UQI_INT,
UHI_FTYPE_V16HF_V16HF_INT_UHI_INT, UQI_FTYPE_V8SF_V8SF_INT_UQI_INT.
* config/i386/immintrin.h: Include avx10_2roundingintrin.h.
* config/i386/sse.md: Change subst_attr name due to renaming.
* config/i386/subst.md:
(<round_mode512bit_condition>): Add condition check for avx10.2
rounding control 256bit intrins and renamed to ...
(<round_mode_condition>): ...this.
(round_saeonly_mode512bit_condition): Add condition check for
avx10.2 rounding control 256 bit intris and renamed to ...
(round_saeonly_mode_condition): ...this.
* config/i386/avx10_2roundingintrin.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx-1.c: Add -mavx10.2 and new builtin test.
* gcc.target/i386/avx-2.c: Ditto.
* gcc.target/i386/sse-13.c: Add new tests.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/avx10_2-rounding-1.c: New test.
|
|
The ACLE declares several helper types and functions to facilitate construction
of `fpm` arguments. These are available when one of the arm_neon.h, arm_sve.h,
or arm_sme.h headers is included. These helpers don't map to specific FP8
instructions and there's no expectation that they will produce a given code
sequence, they're just an abstraction and an aid to the programmer. Thus they are
implemented in a new header file arm_private_fp8.h
Users are not expected to include this file, as it is a mere implementation detail,
subject to change. A check is included to guard against direct inclusion.
gcc/ChangeLog:
* config.gcc (extra_headers): Install arm_private_fp8.h.
* config/aarch64/arm_neon.h: Include arm_private_fp8.h.
* config/aarch64/arm_sve.h: Likewise.
* config/aarch64/arm_private_fp8.h: New file
(fpm_t): New type representing fpmr values.
(enum __ARM_FPM_FORMAT): New enum representing valid fp8 formats.
(enum __ARM_FPM_OVERFLOW): New enum representing how some fp8
calculations work.
(__arm_fpm_init): New.
(__arm_set_fpm_src1_format): Likewise.
(__arm_set_fpm_src2_format): Likewise.
(__arm_set_fpm_dst_format): Likewise.
(__arm_set_fpm_overflow_cvt): Likewise.
(__arm_set_fpm_overflow_mul): Likewise.
(__arm_set_fpm_lscale): Likewise.
(__arm_set_fpm_lscale2): Likewise.
(__arm_set_fpm_nscale): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/acle/fp8-helpers-neon.c: New test of fpmr helper
functions.
* gcc.target/aarch64/acle/fp8-helpers-sve.c: New test of fpmr helper
functions presence.
* gcc.target/aarch64/acle/fp8-helpers-sme.c: New test of fpmr helper
functions presence.
|
|
This patch adds the power11 option to the -mcpu= and -mtune= switches.
This patch treats the power11 like a power10 in terms of costs and reassociation
width.
This patch issues a ".machine power11" to the assembly file if you use
-mcpu=power11.
This patch defines _ARCH_PWR11 if the user uses -mcpu=power11.
This patch allows GCC to be configured with the --with-cpu=power11 and
--with-tune=power11 options.
This patch passes -mpwr11 to the assembler if the user uses -mcpu=power11.
This patch adds support for using "power11" in the __builtin_cpu_is built-in
function.
2024-07-22 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config.gcc (powerpc*-*-*): Add support for power11.
* config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=power11.
* config/rs6000/aix72.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/aix73.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/driver-rs6000.cc (asm_names): Likewise.
* config/rs6000/ppc-auxv.h (PPC_PLATFORM_POWER11): New define.
* config/rs6000/rs6000-builtin.cc (cpu_is_info): Add power11.
* config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Define
_ARCH_PWR11 if -mcpu=power11.
* config/rs6000/rs6000-cpus.def (POWER11_MASKS_SERVER): New define.
(POWERPC_MASKS): Add power11.
(power11 cpu): Add power11 definition.
* config/rs6000/rs6000-opts.h (PROCESSOR_POWER11): Add power11 processor.
* config/rs6000/rs6000-string.cc (expand_compare_loop): Likewise.
* config/rs6000/rs6000-tables.opt: Regenerate.
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Add power11
support.
(rs6000_machine_from_flags): Likewise.
(rs6000_reassociation_width): Likewise.
(rs6000_adjust_cost): Likewise.
(rs6000_issue_rate): Likewise.
(rs6000_sched_reorder): Likewise.
(rs6000_sched_reorder2): Likewise.
(rs6000_register_move_cost): Likewise.
(rs6000_opt_masks): Likewise.
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Likewise.
* config/rs6000/rs6000.md (cpu attribute): Add power11.
* config/rs6000/rs6000.opt (-mpower11): Add internal power11 flag.
* doc/invoke.texi (RS/6000 and PowerPC Options): Document -mcpu=power11.
* config/rs6000/power10.md (all reservations): Add power11 support.
gcc/testsuite/
* gcc.target/powerpc/power11-1.c: New test.
* gcc.target/powerpc/power11-2.c: Likewise.
* gcc.target/powerpc/power11-3.c: Likewise.
|
|
This patch reuses the MinGW implementation to enable DLL import/export
functionality for the aarch64-w64-mingw32 target. It also modifies
environment configurations for MinGW.
2024-06-08 Evgeny Karpov <Evgeny.Karpov@microsoft.com>
gcc/ChangeLog:
* config.gcc: Add winnt-dll.o, which contains the DLL
import/export implementation.
* config/aarch64/aarch64.cc (aarch64_load_symref_appropriately):
Add dllimport implementation.
(aarch64_expand_call): Likewise.
(aarch64_legitimize_address): Likewise.
* config/aarch64/cygming.h (SYMBOL_FLAG_DLLIMPORT): Modify MinGW
environment to support DLL import/export.
(SYMBOL_FLAG_DLLEXPORT): Likewise.
(SYMBOL_REF_DLLIMPORT_P): Likewise.
(SYMBOL_FLAG_STUBVAR): Likewise.
(SYMBOL_REF_STUBVAR_P): Likewise.
(TARGET_VALID_DLLIMPORT_ATTRIBUTE_P): Likewise.
(TARGET_ASM_FILE_END): Likewise.
(SUB_TARGET_RECORD_STUB): Likewise.
(GOT_ALIAS_SET): Likewise.
(PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED): Likewise.
(HAVE_64BIT_POINTERS): Likewise.
|
|
This patch extracts the ix86 implementation for expanding a SYMBOL
into its corresponding dllimport, far-address, or refptr symbol. It
will be reused in the aarch64-w64-mingw32 target. The implementation
is copied as is from i386/i386.cc with minor changes to follow to the
code style.
Also this patch replaces the original DLL import/export implementation
in ix86 with mingw.
2024-06-08 Evgeny Karpov <Evgeny.Karpov@microsoft.com>
gcc/ChangeLog:
* config.gcc: Add winnt-dll.o, which contains the DLL
import/export implementation.
* config/i386/cygming.h (SUB_TARGET_RECORD_STUB): Remove the
old implementation. Rename the required function to MinGW.
Use MinGW implementation for COFF and nothing otherwise.
(GOT_ALIAS_SET): Likewise.
* config/i386/i386-expand.cc (ix86_expand_move): Likewise.
* config/i386/i386-expand.h (ix86_GOT_alias_set): Likewise.
(legitimize_pe_coff_symbol): Likewise.
* config/i386/i386-protos.h (i386_pe_record_stub): Likewise.
* config/i386/i386.cc (is_imported_p): Likewise.
(legitimate_pic_address_disp_p): Likewise.
(ix86_GOT_alias_set): Likewise.
(legitimize_pic_address): Likewise.
(legitimize_tls_address): Likewise.
(struct dllimport_hasher): Likewise.
(GTY): Likewise.
(get_dllimport_decl): Likewise.
(legitimize_pe_coff_extern_decl): Likewise.
(legitimize_dllimport_symbol): Likewise.
(legitimize_pe_coff_symbol): Likewise.
(ix86_legitimize_address): Likewise.
* config/i386/i386.h (GOT_ALIAS_SET): Likewise.
* config/mingw/winnt.cc (i386_pe_record_stub): Likewise.
(mingw_pe_record_stub): Likewise.
* config/mingw/winnt.h (mingw_pe_record_stub): Likewise.
* config/mingw/t-cygming: Add the winnt-dll.o compilation.
* config/mingw/winnt-dll.cc: New file.
* config/mingw/winnt-dll.h: New file.
|
|
This patch refactors recent changes to move mingw-related
functionality to the mingw folder. More renamings to the mingw_ prefix
will be done in follow-up commits.
This is the first commit in the second patch series to add DLL
import/export implementation to AArch64.
Coauthors: Zac Walker <zacwalker@microsoft.com>,
Mark Harmstone <mark@harmstone.com> and
Ron Riddle <ron.riddle@microsoft.com>
Refactored, prepared, and validated by
Radek Barton <radek.barton@microsoft.com> and
Evgeny Karpov <evgeny.karpov@microsoft.com>
2024-06-08 Evgeny Karpov <evgeny.karpov@microsoft.com>
gcc/ChangeLog:
* config.gcc: Move mingw_* declations to mingw.
* config/aarch64/aarch64-protos.h
(mingw_pe_maybe_record_exported_symbol): Likewise.
(mingw_pe_section_type_flags): Likewise.
(mingw_pe_unique_section): Likewise.
(mingw_pe_encode_section_info): Likewise.
* config/aarch64/cygming.h
(mingw_pe_asm_named_section): Likewise.
(mingw_pe_declare_function_type): Likewise.
* config/i386/i386-protos.h
(mingw_pe_unique_section): Likewise.
(mingw_pe_declare_function_type): Likewise.
(mingw_pe_maybe_record_exported_symbol): Likewise.
(mingw_pe_encode_section_info): Likewise.
(mingw_pe_section_type_flags): Likewise.
(mingw_pe_asm_named_section): Likewise.
* config/mingw/winnt.h: New file.
|
|
This patch enables -march/-mtune=shijidadao, costs and tunings are set
according to the characteristics of the processor.
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_zhaoxin_cpu): Recognize shijidadao.
* common/config/i386/i386-common.cc: Add shijidadao.
* common/config/i386/i386-cpuinfo.h (enum processor_subtypes):
Add ZHAOXIN_FAM7H_SHIJIDADAO.
* config.gcc: Add shijidadao.
* config/i386/driver-i386.cc (host_detect_local_cpu):
Let -march=native recognize shijidadao processors.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add shijidadao.
* config/i386/i386-options.cc (m_ZHAOXIN): Add m_SHIJIDADAO.
(m_SHIJIDADAO): New definition.
* config/i386/i386.h (enum processor_type): Add PROCESSOR_SHIJIDADAO.
* config/i386/x86-tune-costs.h (struct processor_costs):
Add shijidadao_cost.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add shijidadao.
(ix86_adjust_cost): Ditto.
* config/i386/x86-tune.def (X86_TUNE_USE_GATHER_2PARTS): Add m_SHIJIDADAO.
(X86_TUNE_USE_GATHER_4PARTS): Ditto.
(X86_TUNE_USE_GATHER_8PARTS): Ditto.
(X86_TUNE_AVOID_128FMA_CHAINS): Ditto.
* doc/extend.texi: Add details about shijidadao.
* doc/invoke.texi: Ditto.
gcc/testsuite/ChangeLog:
* g++.target/i386/mv32.C: Handle new -march
* gcc.target/i386/funcspec-56.inc: Ditto.
|
|
It has been pointed out to me that when moving Solaris 11.3 from
config.gcc's obsolete to unsupported list, I'd forgotten to also move
the minor version info, leading to confusing
*** Configuration i386-pc-solaris2.11 not supported
instead of the correct
*** Configuration i386-pc-solaris2.11.3 not supported
This patch fixes this oversight.
Tested on i386-pc-solaris2.11 (11.3 and 11.4).
2024-05-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
gcc:
* config.gcc: Move ${target_min} from obsolete to unsupported
message.
|
|
gcc/ChangeLog:
* common/config/i386/cpuinfo.h
(get_intel_cpu): Remove Xeon Phi cpus.
(get_available_features): Remove Xeon Phi ISAs.
* common/config/i386/i386-common.cc
(OPTION_MASK_ISA_AVX512PF_SET): Removed.
(OPTION_MASK_ISA_AVX512ER_SET): Ditto.
(OPTION_MASK_ISA2_AVX5124FMAPS_SET): Ditto.
(OPTION_MASK_ISA2_AVX5124VNNIW_SET): Ditto.
(OPTION_MASK_ISA_PREFETCHWT1_SET): Ditto.
(OPTION_MASK_ISA_AVX512F_UNSET): Remove AVX512PF and AVX512ER.
(OPTION_MASK_ISA_AVX512PF_UNSET): Removed.
(OPTION_MASK_ISA_AVX512ER_UNSET): Ditto.
(OPTION_MASK_ISA2_AVX5124FMAPS_UNSET): Ditto.
(OPTION_MASK_ISA2_AVX5124VNNIW_UNSET): Ditto.
(OPTION_MASK_ISA_PREFETCHWT1_UNSET): Ditto.
(OPTION_MASK_ISA2_AVX512F_UNSET): Remove AVX5124FMAPS and
AVX5125VNNIW.
(ix86_handle_option): Remove Xeon Phi options.
(processor_names): Remove Xeon Phi cpus.
(processor_alias_table): Ditto.
* common/config/i386/i386-cpuinfo.h
(enum processor_types): Ditto.
(enum processor_features): Remove Xeon Phi ISAs.
* common/config/i386/i386-isas.h: Ditto.
* config.gcc: Remove Xeon Phi cpus and ISAs.
* config/i386/avx5124fmapsintrin.h: Remove intrin support.
* config/i386/avx5124vnniwintrin.h: Ditto.
* config/i386/avx512erintrin.h: Ditto.
* config/i386/avx512pfintrin.h: Ditto.
* config/i386/cpuid.h (bit_AVX512PF): Removed.
(bit_AVX512ER): Ditto.
(bit_PREFETCHWT1): Ditto.
(bit_AVX5124VNNIW): Ditto.
(bit_AVX5124FMAPS): Ditto.
* config/i386/driver-i386.cc
(host_detect_local_cpu): Remove Xeon Phi.
* config/i386/i386-builtin-types.def: Remove unused types.
* config/i386/i386-builtin.def (BDESC): Remove builtins.
* config/i386/i386-builtins.cc (ix86_init_mmx_sse_builtins): Ditto.
* config/i386/i386-c.cc (ix86_target_macros_internal): Remove Xeon
Phi cpus and ISAs.
* config/i386/i386-expand.cc (ix86_expand_builtin): Remove Xeon Phi
related handlers.
(ix86_emit_swdivsf): Ditto.
(ix86_emit_swsqrtsf): Ditto.
* config/i386/i386-isa.def: Remove Xeon Phi ISAs.
* config/i386/i386-options.cc (m_KNL): Removed.
(m_KNM): Ditto.
(isa2_opts): Remove Xeon Phi ISAs.
(isa_opts): Ditto.
(processor_cost_table): Remove Xeon Phi cpus.
(ix86_valid_target_attribute_inner_p): Remove Xeon Phi ISAs.
(ix86_option_override_internal): Remove Xeon Phi related handlers.
* config/i386/i386-rust.cc (ix86_rust_target_cpu_info): Remove Xeon
Phi ISAs.
* config/i386/i386.cc (ix86_hard_regno_mode_ok): Remove Xeon Phi
related handler.
* config/i386/i386.h (TARGET_EMIT_VZEROUPPER): Removed.
(enum processor_type): Remove Xeon Phi cpus.
* config/i386/i386.md (prefetch): Remove PREFETCHWT1.
(*prefetch_3dnow): Ditto.
(*prefetch_prefetchwt1): Removed.
* config/i386/i386.opt: Remove Xeon Phi ISAs.
* config/i386/immintrin.h: Ditto.
* config/i386/sse.md (VF1_AVX512ER_128_256): Removed.
(rsqrt<mode>2): Change iterator from VF1_AVX512ER_128_256 to
VF1_128_256.
(GATHER_SCATTER_SF_MEM_MODE): Removed.
(avx512pf_gatherpf<mode>sf): Ditto.
(*avx512pf_gatherpf<VI48_512:mode>sf_mask): Ditto.
(avx512pf_gatherpf<mode>df): Ditto.
(*avx512pf_gatherpf<VI4_256_8_512:mode>df_mask): Ditto.
(avx512pf_scatterpf<mode>sf): Ditto.
(*avx512pf_scatterpf<VI48_512:mode>sf_mask): Ditto.
(avx512pf_scatterpf<mode>df): Ditto.
(*avx512pf_scatterpf<VI4_256_8_512:mode>df_mask): Ditto.
(exp2<mode>2): Ditto.
(avx512er_exp2<mode><mask_name><round_saeonly_name>): Ditto.
(<mask_codefor>avx512er_rcp28<mode><mask_name><round_saeonly_name>):
Ditto.
(avx512er_vmrcp28<mode><mask_name><round_saeonly_name>): Ditto.
(<mask_codefor>avx512er_rsqrt28<mode><mask_name><round_saeonly_name>):
Ditto.
(avx512er_vmrsqrt28<mode><mask_name><round_saeonly_name>): Ditto.
(IMOD4): Ditto.
(imod4_narrow): Ditto.
(mov<mode>): Ditto.
(*mov<mode>_internal): Ditto.
(avx5124fmaddps_4fmaddps): Ditto.
(avx5124fmaddps_4fmaddps_mask): Ditto.
(avx5124fmaddps_4fmaddps_maskz): Ditto.
(avx5124fmaddps_4fmaddss): Ditto.
(avx5124fmaddps_4fmaddss_mask): Ditto.
(avx5124fmaddps_4fmaddss_maskz): Ditto.
(avx5124fmaddps_4fnmaddps): Ditto.
(avx5124fmaddps_4fnmaddps_mask): Ditto.
(avx5124fmaddps_4fnmaddps_maskz): Ditto.
(avx5124fmaddps_4fnmaddss): Ditto.
(avx5124fmaddps_4fnmaddss_mask): Ditto.
(avx5124fmaddps_4fnmaddss_maskz): Ditto.
(avx5124vnniw_vp4dpwssd): Ditto.
(avx5124vnniw_vp4dpwssd_mask): Ditto.
(avx5124vnniw_vp4dpwssd_maskz): Ditto.
(avx5124vnniw_vp4dpwssds): Ditto.
(avx5124vnniw_vp4dpwssds_mask): Ditto.
(avx5124vnniw_vp4dpwssds_maskz): Ditto.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Remove Xeon Phi cpus.
(ix86_adjust_cost): Ditto.
* config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Ditto.
(X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto.
(X86_TUNE_MOVX): Ditto.
(X86_TUNE_MEMORY_MISMATCH_STALL): Ditto.
(X86_TUNE_ACCUMULATE_OUTGOING_ARGS): Ditto.
(X86_TUNE_FOUR_JUMP_LIMIT): Ditto.
(X86_TUNE_USE_INCDEC): Ditto.
(X86_TUNE_INTEGER_DFMODE_MOVES): Ditto.
(X86_TUNE_OPT_AGU): Ditto.
(X86_TUNE_AVOID_LEA_FOR_ADDR): Ditto.
(X86_TUNE_AVOID_MEM_OPND_FOR_CMOVE): Ditto.
(X86_TUNE_USE_SAHF): Ditto.
(X86_TUNE_USE_CLTD): Ditto.
(X86_TUNE_USE_BT): Ditto.
(X86_TUNE_ONE_IF_CONV_INSN): Ditto.
(X86_TUNE_EXPAND_ABS): Ditto.
(X86_TUNE_USE_SIMODE_FIOP): Ditto.
(X86_TUNE_EXT_80387_CONSTANTS): Ditto.
(X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto.
(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto.
(X86_TUNE_SPLIT_MEM_OPND_FOR_FP_CONVERTS): Ditto.
(X86_TUNE_SLOW_PSHUFB): Ditto.
(X86_TUNE_EMIT_VZEROUPPER): Removed.
* config/i386/xmmintrin.h (enum _mm_hint): Remove _MM_HINT_ET1.
* doc/extend.texi: Remove Xeon Phi.
* doc/invoke.texi: Ditto.
gcc/testsuite/ChangeLog:
* g++.dg/other/i386-2.C: Remove Xeon Phi ISAs.
* g++.dg/other/i386-3.C: Ditto.
* g++.target/i386/mv28.C: Ditto.
* gcc.target/i386/builtin_target.c: Ditto.
* gcc.target/i386/sse-12.c: Ditto.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* gcc.target/i386/sse-26.c: Ditto.
* gcc.target/i386/avx5124fmadd-v4fmaddps-1.c: Removed.
* gcc.target/i386/avx5124fmadd-v4fmaddps-2.c: Ditto.
* gcc.target/i386/avx5124fmadd-v4fmaddss-1.c: Ditto.
* gcc.target/i386/avx5124fmadd-v4fnmaddps-1.c: Ditto.
* gcc.target/i386/avx5124fmadd-v4fnmaddps-2.c: Ditto.
* gcc.target/i386/avx5124fmadd-v4fnmaddss-1.c: Ditto.
* gcc.target/i386/avx5124vnniw-vp4dpwssd-1.c: Ditto.
* gcc.target/i386/avx5124vnniw-vp4dpwssd-2.c: Ditto.
* gcc.target/i386/avx5124vnniw-vp4dpwssds-1.c: Ditto.
* gcc.target/i386/avx5124vnniw-vp4dpwssds-2.c: Ditto.
* gcc.target/i386/avx512er-check.h: Ditto.
* gcc.target/i386/avx512er-vexp2pd-1.c: Ditto.
* gcc.target/i386/avx512er-vexp2pd-2.c: Ditto.
* gcc.target/i386/avx512er-vexp2ps-1.c: Ditto.
* gcc.target/i386/avx512er-vexp2ps-2.c: Ditto.
* gcc.target/i386/avx512er-vrcp28pd-1.c: Ditto.
* gcc.target/i386/avx512er-vrcp28pd-2.c: Ditto.
* gcc.target/i386/avx512er-vrcp28ps-1.c: Ditto.
* gcc.target/i386/avx512er-vrcp28ps-2.c: Ditto.
* gcc.target/i386/avx512er-vrcp28ps-3.c: Ditto.
* gcc.target/i386/avx512er-vrcp28ps-4.c: Ditto.
* gcc.target/i386/avx512er-vrcp28sd-1.c: Ditto.
* gcc.target/i386/avx512er-vrcp28sd-2.c: Ditto.
* gcc.target/i386/avx512er-vrcp28ss-1.c: Ditto.
* gcc.target/i386/avx512er-vrcp28ss-2.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28pd-1.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28pd-2.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28ps-1.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28ps-2.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28ps-4.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28ps-6.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28sd-1.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28sd-2.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28ss-1.c: Ditto.
* gcc.target/i386/avx512er-vrsqrt28ss-2.c: Ditto.
* gcc.target/i386/avx512pf-vgatherpf0dpd-1.c: Ditto.
* gcc.target/i386/avx512pf-vgatherpf0dps-1.c: Ditto.
* gcc.target/i386/avx512pf-vgatherpf0qpd-1.c: Ditto.
* gcc.target/i386/avx512pf-vgatherpf0qps-1.c: Ditto.
* gcc.target/i386/avx512pf-vgatherpf1dpd-1.c: Ditto.
* gcc.target/i386/avx512pf-vgatherpf1dps-1.c: Ditto.
* gcc.target/i386/avx512pf-vgatherpf1qpd-1.c: Ditto.
* gcc.target/i386/avx512pf-vgatherpf1qps-1.c: Ditto.
* gcc.target/i386/avx512pf-vscatterpf0dpd-1.c: Ditto.
* gcc.target/i386/avx512pf-vscatterpf0dps-1.c: Ditto.
* gcc.target/i386/avx512pf-vscatterpf0qpd-1.c: Ditto.
* gcc.target/i386/avx512pf-vscatterpf0qps-1.c: Ditto.
* gcc.target/i386/avx512pf-vscatterpf1dpd-1.c: Ditto.
* gcc.target/i386/avx512pf-vscatterpf1dps-1.c: Ditto.
* gcc.target/i386/avx512pf-vscatterpf1qpd-1.c: Ditto.
* gcc.target/i386/avx512pf-vscatterpf1qps-1.c: Ditto.
* gcc.target/i386/pr104448.c: Ditto.
* gcc.target/i386/pr82941-2.c: Ditto.
* gcc.target/i386/pr82942-2.c: Ditto.
* gcc.target/i386/pr82990-1.c: Ditto.
* gcc.target/i386/pr82990-3.c: Ditto.
* gcc.target/i386/pr82990-6.c: Ditto.
* gcc.target/i386/pr82990-7.c: Ditto.
* gcc.target/i386/pr89523-5.c: Ditto.
* gcc.target/i386/pr89523-6.c: Ditto.
* gcc.target/i386/pr91033.c: Ditto.
* gcc.target/i386/prefetchwt1-1.c: Ditto.
|
|
gcc/ChangeLog:
* config.gcc: Build and add objects for Cygwin and MinGW. Add Cygwin
and MinGW options to the target.
|
|
Define Cygwin and MinGW environment such as types, SEH definitions,
shared libraries, etc.
gcc/ChangeLog:
* config.gcc: Add Cygwin and MinGW difinitions.
* config/aarch64/aarch64-protos.h
(mingw_pe_maybe_record_exported_symbol): Declare functions
which are used in Cygwin and MinGW environment.
(mingw_pe_section_type_flags): Likewise.
(mingw_pe_unique_section): Likewise.
(mingw_pe_encode_section_info): Likewise.
* config/aarch64/cygming.h: New file.
|
|
This patch defines TARGET_AARCH64_MS_ABI in config.gcc and uses it to
exclude i386 functionality from aarch64 build and adjust MinGW headers
for AArch64 MS ABI.
gcc/ChangeLog:
* config.gcc: Define TARGET_AARCH64_MS_ABI.
* config/mingw/mingw-stdint.h (INTPTR_TYPE): Use
TARGET_AARCH64_MS_ABI to adjust MinGW headers for
AArch64 MS ABI.
(UINTPTR_TYPE): Likewise.
(defined): Likewise.
* config/mingw/mingw32.h (DEFAULT_ABI): Likewise.
(defined): Likewise.
* config/mingw/winnt.cc (defined): Use TARGET_ARM64_MS_ABI to
exclude ix86_get_callcvt.
(i386_pe_maybe_mangle_decl_assembler_name): Likewise.
(i386_pe_mangle_decl_assembler_name): Likewise.
|
|
This patch creates a new config/mingw directory to share MinGW
related definitions, and moves there the corresponding existing files
from config/i386.
gcc/ChangeLog:
* config.gcc: Adjust targets after moving MinGW related files
from i386 to mingw folder.
* config/i386/cygming.opt: Move to...
* config/mingw/cygming.opt: ...here.
* config/i386/cygming.opt.urls: Move to...
* config/mingw/cygming.opt.urls: ...here.
* config/i386/cygwin-d.cc: Move to...
* config/mingw/cygwin-d.cc: ...here.
* config/i386/mingw-stdint.h: Move to...
* config/mingw/mingw-stdint.h: ...here.
* config/i386/mingw.opt: Move to...
* config/mingw/mingw.opt: ...here.
* config/i386/mingw.opt.urls: Move to...
* config/mingw/mingw.opt.urls: ...here.
* config/i386/mingw32.h: Move to...
* config/mingw/mingw32.h: ...here.
* config/i386/msformat-c.cc: Move to...
* config/mingw/msformat-c.cc: ...here.
* config/i386/t-cygming: Move to...
* config/mingw/t-cygming: ...here and updated.
* config/i386/winnt-cxx.cc: Move to...
* config/mingw/winnt-cxx.cc: ...here.
* config/i386/winnt-d.cc: Move to...
* config/mingw/winnt-d.cc: ...here.
* config/i386/winnt-stubs.cc: Move to...
* config/mingw/winnt-stubs.cc: ...here.
* config/i386/winnt.cc: Move to...
* config/mingw/winnt.cc: ...here.
|
|
Define ASM specific for COFF format on AArch64.
gcc/ChangeLog:
* config.gcc: Add COFF format support definitions.
* config/aarch64/aarch64-coff.h: New file.
|
|
Define the MS ABI for aarch64-w64-mingw32.
Adjust FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
STATIC_CHAIN_REGNUM for AArch64 MS ABI.
The X18 register is reserved on Windows for the TEB.
gcc/ChangeLog:
* config.gcc: Define TARGET_AARCH64_MS_ABI when
AArch64 MS ABI is used.
* config/aarch64/aarch64.h (FIXED_X18): Adjust
FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
STATIC_CHAIN_REGNUM for AArch64 MS ABI.
(CALL_USED_X18): Likewise.
(FIXED_REGISTERS): Likewise.
* config/aarch64/aarch64-abi-ms.h: New file.
|
|
Add the initial aarch64-w64-mingw32 target for gcc.
This is the first commit in a sequence of patch series to add
new aarch64-w64-mingw32 target.
Coauthors: Zac Walker <zacwalker@microsoft.com>,
Mark Harmstone <mark@harmstone.com> and
Ron Riddle <ron.riddle@microsoft.com>
Refactored, prepared, and validated by
Radek Barton <radek.barton@microsoft.com> and
Evgeny Karpov <evgeny.karpov@microsoft.com>
fixincludes/ChangeLog:
* mkfixinc.sh: Extend for *-mingw32* targets.
gcc/ChangeLog:
* config.gcc: Add aarch64-w64-mingw32 target.
|
|
Support for Solaris 11.3 had already been obsoleted in GCC 13. However,
since the only Solaris system in the cfarm was running 11.3, I've kept
it in tree until now when both Solaris 11.4/SPARC and x86 systems have
been added.
This patch actually removes the Solaris 11.3 support. Apart from
several minor simplifications, there are two more widespread changes:
* In Solaris 11.4, libsocket and libnsl were folded into libc, so
there's no longer a need to link them explictly.
* Since Solaris 11.4, Solaris includes all crts needed by gcc (like
crt1.o and gcrt1.o) with the base system. All workarounds to provide
fallbacks can thus go.
Bootstrapped without regressions on i386-pc-solaris2.11 and
sparc-sun-solaris2.11 (as/ld, gas/ld, and gas/gld) as well as Solaris
11.3/x86 to ascertain that version is actually rejected.
2024-04-30 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
c++tools:
* configure.ac (ax_lib_socket_nsl.m4): Don't sinclude.
(AX_LIB_SOCKET_NSL): Don't call.
(NETLIBS): Remove.
* configure: Regenerate.
* Makefile.in (NETLIBS): Remove.
(g++-mapper-server$(exeext)): Remove $(NETLIBS).
gcc:
* config.gcc: Move *-*-solaris2.11.[0-3]* to unsupported list.
<*-*-solaris2*> (default_use_cxa_atexit): Set unconditionally.
* configure.ac (AX_LIB_SOCKET_NSL): Don't call.
(NETLIBS): Remove.
(gcc_cv_ld_aligned_shf_merge): Remove.
(hidden_linkonce) <i?86-*-solaris2* | x86_64-*-solaris2*>: Remove.
(gcc_cv_target_dl_iterate_phdr) <*-*-solaris2*>: Always set to yes.
* Makefile.in (NETLIBS): Remove.
* configure, config.in, aclocal.m4: Regenerate.
* config/sol2.h: Don't check HAVE_SOLARIS_CRTS.
(STARTFILE_SPEC): Remove !HAVE_SOLARIS_CRTS case.
[USE_GLD] (LINK_EH_SPEC): Remove TARGET_DL_ITERATE_PHDR guard.
* config/i386/i386.cc (USE_HIDDEN_LINKONCE): Remove guard.
* varasm.cc (mergeable_string_section): Remove
HAVE_LD_ALIGNED_SHF_MERGE handling.
(mergeable_constant_section): Likewise.
* doc/install.texi (Specific,i?86-*-solaris2*): Reference Solaris
11.4 only.
(Specific, *-*-solaris2*): Document Solaris 11.3 removal. Remove
11.3 references and caveats. Update for 11.4.
gcc/cp:
* Make-lang.in (cc1plus$(exeext)): Remove $(NETLIBS).
gcc/objcp:
* Make-lang.in (cc1objplus$(exeext)): Remove $(NETLIBS).
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_pie): Always
enable on *-*-solaris2*.
libgcc:
* configure.ac <*-*-solaris2*> (libgcc_cv_solaris_crts): Remove.
* config.host <*-*-solaris2*>: Remove !libgcc_cv_solaris_crts
support.
* configure, config.in: Regenerate.
* config/sol2/gmon.c (internal_mcount) [!HAVE_SOLARIS_CRTS]: Remove.
* config/i386/sol2-c1.S, config/sparc/sol2-c1.S: Remove.
* config/sol2/t-sol2 (crt1.o, gcrt1.o): Remove.
libstdc++-v3:
* testsuite/lib/dg-options.exp (add_options_for_net_ts)
<*-*-solaris2*>: Don't link with -lsocket -lnsl.
|
|
Add support for gfx90c GCN5 APU integrated graphics devices.
The LLVM AMDGPU documentation does not list those devices as supported
by rocm-amdhsa, but it passes most libgomp offloading tests.
Although they are constrainted compared to dGPUs, they might be
interesting for learning, experimentation, and testing.
gcc/ChangeLog:
* config.gcc: Add gfx90c.
* config/gcn/gcn-hsa.h (NO_SRAM_ECC): Likewise.
* config/gcn/gcn-opts.h (enum processor_type): Likewise.
(TARGET_GFX90c): New macro.
* config/gcn/gcn.cc (gcn_option_override): Handle gfx90c.
(gcn_omp_device_kind_arch_isa): Likewise.
(output_file_start): Likewise.
* config/gcn/gcn.h: Add gfx90c.
* config/gcn/gcn.opt: Likewise.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX90c): New macro.
(get_arch): Handle gfx90c.
(main): Handle EF_AMDGPU_MACH_AMDGCN_GFX90c
* config/gcn/t-omp-device: Add gfx90c.
* doc/install.texi: Likewise.
* doc/invoke.texi: Likewise.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (isa_hsa_name): Handle EF_AMDGPU_MACH_AMDGCN_GFX90c.
(isa_code): Handle gfx90c.
(max_isa_vgprs): Handle EF_AMDGPU_MACH_AMDGCN_GFX90c.
Signed-off-by: Frederik Harwath <frederik@harwath.name>
|
|
This commit makes the BPF backend to define the following macros for
c-family languages:
__BPF_CPU_VERSION__
This is a numeric value identifying the version of the BPF "cpu"
for which GCC is generating code.
__BPF_FEATURE_ALU32
__BPF_FEATURE_JMP32
__BPF_FEATURE_JMP_EXT
__BPF_FEATURE_BSWAP
__BPF_FEATURE_SDIV_SMOD
__BPF_FEATURE_MOVSX
__BPF_FEATURE_LDSX
__BPF_FEATURE_GOTOL
__BPF_FEATURE_ST
These are defines if the corresponding "feature" is enabled. The
features are implicitly enabled by the BPF CPU version enabled,
and most of them can also be enabled/disabled using
target-specific -m[no-]FEATURE command line switches.
Note that this patch moves the definition of bpf_target_macros, that
implements TARGET_CPU_CPP_BUILTINS in the BPF backend, to a bpf-c.cc
file. This is because we are now using facilities from c-family/* and
these features are not available in compilers like lto1.
A couple of tests are also added.
Tested in target bpf-unknown-none-gcc and host x86_64-linux-gnu.
No regressions.
gcc/ChangeLog
* config.gcc: Add bpf-c.o as a target object for C and C++.
* config/bpf/bpf.cc (bpf_target_macros): Move to bpf-c.cc.
* config/bpf/bpf-c.cc: New file.
(bpf_target_macros): Move from bpf.cc and define BPF CPU
feature macros.
* config/bpf/t-bpf: Add rules to build bpf-c.o.
gcc/testsuite/ChangeLog
* gcc.target/bpf/feature-macro-1.c: New test.
* gcc.target/bpf/feature-macro-2.c: Likewise.
|
|
Detailed description of these definitions can be found at
https://github.com/loongson/la-toolchain-conventions, which
the LoongArch GCC port aims to conform to.
gcc/ChangeLog:
* config.gcc: Add loongarch-evolution.o.
* config/loongarch/genopts/genstr.sh: Enable generation of
loongarch-evolution.[cc,h].
* config/loongarch/t-loongarch: Likewise.
* config/loongarch/genopts/gen-evolution.awk: New file.
* config/loongarch/genopts/isa-evolution.in: Mark ISA version
of introduction for each ISA evolution feature.
* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins):
Define builtin macros for enabled ISA evolutions and the ISA
version.
* config/loongarch/loongarch-cpu.cc: Use loongarch-evolution.h.
* config/loongarch/loongarch.h: Likewise.
* config/loongarch/loongarch-cpucfg-map.h: Delete.
* config/loongarch/loongarch-evolution.cc: New file.
* config/loongarch/loongarch-evolution.h: New file.
* config/loongarch/loongarch-opts.h (ISA_HAS_FRECIPE): Define.
(ISA_HAS_DIV32): Likewise.
(ISA_HAS_LAM_BH): Likewise.
(ISA_HAS_LAMCAS): Likewise.
(ISA_HAS_LD_SEQ_SA): Likewise.
|
|
These ISA versions are defined as -march= parameters and
are recommended for building binaries for distribution.
Detailed description of these definitions can be found at
https://github.com/loongson/la-toolchain-conventions, which
the LoongArch GCC port aims to conform to.
gcc/ChangeLog:
* config.gcc: Make la64v1.0 the default ISA preset of the lp64d ABI.
* config/loongarch/genopts/loongarch-strings: Define la64v1.0, la64v1.1.
* config/loongarch/genopts/loongarch.opt.in: Likewise.
* config/loongarch/loongarch-c.cc (LARCH_CPP_SET_PROCESSOR): Likewise.
(loongarch_cpu_cpp_builtins): Likewise.
* config/loongarch/loongarch-cpu.cc (get_native_prid): Likewise.
(fill_native_cpu_config): Likewise.
* config/loongarch/loongarch-def.cc (array_tune): Likewise.
* config/loongarch/loongarch-def.h: Likewise.
* config/loongarch/loongarch-driver.cc (driver_set_m_parm): Likewise.
(driver_get_normalized_m_opts): Likewise.
* config/loongarch/loongarch-opts.cc (default_tune_for_arch): Likewise.
(TUNE_FOR_ARCH): Likewise.
(arch_str): Likewise.
(loongarch_target_option_override): Likewise.
* config/loongarch/loongarch-opts.h (TARGET_uARCH_LA464): Likewise.
(TARGET_uARCH_LA664): Likewise.
* config/loongarch/loongarch-str.h (STR_CPU_ABI_DEFAULT): Likewise.
(STR_ARCH_ABI_DEFAULT): Likewise.
(STR_TUNE_GENERIC): Likewise.
(STR_ARCH_LA64V1_0): Likewise.
(STR_ARCH_LA64V1_1): Likewise.
* config/loongarch/loongarch.cc (loongarch_cpu_sched_reassociation_width): Likewise.
(loongarch_asm_code_end): Likewise.
* config/loongarch/loongarch.opt: Likewise.
* doc/invoke.texi: Likewise.
|
|
This patch marks the nios2*-*-* targets obsolete in GCC 14. Intel has
EOL'ed this architecture and the maintainers no longer have access to
hardware for testing. While the port is still in reasonably good
shape at this time, no further testing or updates are planned.
gcc/
* config.gcc: Add nios2*-*-* to the list of obsoleted targets.
contrib/
* config-list.mk (LIST): --enable-obsolete for nios2*-*-*.
|
|
Coupled with a corresponding binutils patch, this produces a toolchain that can
sucessfully build working binaries targeting aarch64-gnu.
gcc/Changelog:
* config.gcc: Recognize aarch64*-*-gnu* targets.
* config/aarch64/aarch64-gnu.h: New file.
Signed-off-by: Sergey Bugaev <bugaevc@gmail.com>
|
|
Add a multilib with workarounds for Cortex-A53 errata.
gcc/ChangeLog:
* config.gcc (aarch64-*-rtems*): Add target makefile fragment
t-aarch64-rtems.
* config/aarch64/t-aarch64-rtems: New file.
|
|
This implements TLS Descriptors (TLSDESC) as specified in [1].
The 4-instruction sequence is implemented as a single RTX insn for
simplicity, but this can be revisited later if instruction scheduling or
more flexible RA is desired.
The default remains to be the traditional TLS model, but can be configured
with --with-tls={trad,desc}. The choice can be revisited once toolchain
and libc support ships.
[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373.
gcc/ChangeLog:
* config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
* config.gcc: Add --with-tls configuration option to change the
default TLS flavor.
* config/riscv/riscv.h: Add TARGET_TLSDESC determined from
-mtls-dialect and with_tls defaults.
* config/riscv/riscv-opts.h: Define enum riscv_tls_type for the
two TLS flavors.
* config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
* config/riscv/riscv.md: Add instruction sequence for TLSDESC.
* config/riscv/riscv.cc (riscv_symbol_insns): Add instruction
sequence length data for TLSDESC.
(riscv_legitimize_tls_address): Add lowering of TLSDESC.
* doc/install.texi: Document --with-tls for RISC-V.
* doc/invoke.texi: Document -mtls-dialect for RISC-V.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/tls_1.x: Add TLSDESC GD test case.
* gcc.target/riscv/tlsdesc.c: Same as above.
|
|
Add support for TLS descriptors on normal code model and extreme
code model.
Normal code model instruction sequence:
-mno-explicit-relocs:
la.tls.desc $r4, s
add.d $r12, $r4, $r2
-mexplicit-relocs:
pcalau12i $r4,%desc_pc_hi20(s)
addi.d $r4,$r4,%desc_pc_lo12(s)
ld.d $r1,$r4,%desc_ld(s)
jirl $r1,$r1,%desc_call(s)
add.d $r12, $r4, $r2
Extreme code model instruction sequence:
-mno-explicit-relocs:
la.tls.desc $r4, $r12, s
add.d $r12, $r4, $r2
-mexplicit-relocs:
pcalau12i $r4,%desc_pc_hi20(s)
addi.d $r12,$r0,%desc_pc_lo12(s)
lu32i.d $r12,%desc64_pc_lo20(s)
lu52i.d $r12,$r12,%desc64_pc_hi12(s)
add.d $r4,$r4,$r12
ld.d $r1,$r4,%desc_ld(s)
jirl $r1,$r1,%desc_call(s)
add.d $r12, $r4, $r2
The default is still traditional TLS model, but can be configured with
--with-tls={trad,desc}. The default can change to TLS descriptors once
libc and LLVM support this.
gcc/ChangeLog:
* config.gcc: Add --with-tls option to change TLS flavor.
* config/loongarch/genopts/loongarch.opt.in: Add -mtls-dialect to
configure TLS flavor.
* config/loongarch/loongarch-def.h (struct loongarch_target): Add
tls_dialect.
* config/loongarch/loongarch-driver.cc (la_driver_init): Add tls
flavor.
* config/loongarch/loongarch-opts.cc (loongarch_init_target): Add
tls_dialect.
(loongarch_config_target): Ditto.
(loongarch_update_gcc_opt_status): Ditto.
* config/loongarch/loongarch-opts.h (loongarch_init_target): Ditto.
(TARGET_TLS_DESC): New define.
* config/loongarch/loongarch.cc (loongarch_symbol_insns): Add TLS
DESC instructions sequence length.
(loongarch_legitimize_tls_address): New TLS DESC instruction sequence.
(loongarch_option_override_internal): Add la_opt_tls_dialect.
(loongarch_option_restore): Add la_target.tls_dialect.
* config/loongarch/loongarch.md (@got_load_tls_desc<mode>): Normal
code model for TLS DESC.
(got_load_tls_desc_off64): Extreme cmode model for TLS DESC.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.opt.urls: Ditto.
* doc/invoke.texi: Add a description of the compilation option
'-mtls-dialect={trad,desc}'.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/cmodel-extreme-1.c: Add -mtls-dialect=trad.
* gcc.target/loongarch/cmodel-extreme-2.c: Ditto.
* gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c: Ditto.
* gcc.target/loongarch/explicit-relocs-medium-call36-auto-tls-ld-gd.c:
Ditto.
* gcc.target/loongarch/func-call-medium-1.c: Ditto.
* gcc.target/loongarch/func-call-medium-2.c: Ditto.
* gcc.target/loongarch/func-call-medium-3.c: Ditto.
* gcc.target/loongarch/func-call-medium-4.c: Ditto.
* gcc.target/loongarch/tls-extreme-macro.c: Ditto.
* gcc.target/loongarch/tls-gd-noplt.c: Ditto.
* gcc.target/loongarch/explicit-relocs-auto-extreme-tls-desc.c: New test.
* gcc.target/loongarch/explicit-relocs-auto-tls-desc.c: New test.
* gcc.target/loongarch/explicit-relocs-extreme-tls-desc.c: New test.
* gcc.target/loongarch/explicit-relocs-tls-desc.c: New test.
Co-authored-by: Lulu Cheng <chenglulu@loongson.cn>
Co-authored-by: Xi Ruoyao <xry111@xry111.site>
|
|
Add support for the gfx1036 RDNA2 APU integrated graphics devices. The ROCm
documentation warns that these may not be supported, but it seems to work
at least partially.
gcc/ChangeLog:
* config.gcc (amdgcn): Add gfx1036 entries.
* config/gcn/gcn-hsa.h (NO_XNACK): Likewise.
(gcn_local_sym_hash): Likewise.
* config/gcn/gcn-opts.h (enum processor_type): Likewise.
(TARGET_GFX1036): New macro.
* config/gcn/gcn.cc (gcn_option_override): Handle gfx1036.
(gcn_omp_device_kind_arch_isa): Likewise.
(output_file_start): Likewise.
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __gfx1036__.
(TARGET_CPU_CPP_BUILTINS): Rename __gfx1030 to __gfx1030__.
* config/gcn/gcn.opt: Add gfx1036.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1036): New.
(main): Handle gfx1036.
* config/gcn/t-omp-device: Add gfx1036 isa.
* doc/install.texi (amdgcn): Add gfx1036.
* doc/invoke.texi (-march): Likewise.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (EF_AMDGPU_MACH): GFX1036.
(gcn_gfx1103_s): New.
(isa_hsa_name): Handle gfx1036.
(isa_code): Likewise.
(max_isa_vgprs): Likewise.
|
|
Add support for the gfx1103 RDNA3 APU integrated graphics devices. The ROCm
documentation warns that these may not be supported, but it seems to work
at least partially.
gcc/ChangeLog:
* config.gcc (amdgcn): Add gfx1103 entries.
* config/gcn/gcn-hsa.h (NO_XNACK): Likewise.
(gcn_local_sym_hash): Likewise.
* config/gcn/gcn-opts.h (enum processor_type): Likewise.
(TARGET_GFX1103): New macro.
* config/gcn/gcn.cc (gcn_option_override): Handle gfx1103.
(gcn_omp_device_kind_arch_isa): Likewise.
(output_file_start): Likewise.
(gcn_hsa_declare_function_name): Use TARGET_RDNA3, not just gfx1100.
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __gfx1103__.
* config/gcn/gcn.opt: Add gfx1103.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1103): New.
(main): Handle gfx1103.
* config/gcn/t-omp-device: Add gfx1103 isa.
* doc/install.texi (amdgcn): Add gfx1103.
* doc/invoke.texi (-march): Likewise.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (EF_AMDGPU_MACH): GFX1103.
(gcn_gfx1103_s): New.
(isa_hsa_name): Handle gfx1103.
(isa_code): Likewise.
(max_isa_vgprs): Likewise.
|
|
2024-02-14 Jan Hubicka <jh@suse.cz>
Karthiban Anbazhagan <Karthiban.Anbazhagan@amd.com>
gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver5.
* common/config/i386/i386-common.cc (processor_names): Add znver5.
(processor_alias_table): Likewise.
* common/config/i386/i386-cpuinfo.h (processor_types): Add new zen
family.
(processor_subtypes): Add znver5.
* config.gcc (x86_64-*-* |...): Likewise.
* config/i386/driver-i386.cc (host_detect_local_cpu): Let
march=native detect znver5 cpu's.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add
znver5.
* config/i386/i386-options.cc (m_ZNVER5): New definition
(processor_cost_table): Add znver5.
* config/i386/i386.cc (ix86_reassociation_width): Likewise.
* config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER5
(PTA_ZNVER5): New definition.
* config/i386/i386.md (define_attr "cpu"): Add znver5.
(Scheduling descriptions) Add znver5.md.
* config/i386/x86-tune-costs.h (znver5_cost): New definition.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver5.
(ix86_adjust_cost): Likewise.
* config/i386/x86-tune.def (avx512_move_by_pieces): Add m_ZNVER5.
(avx512_store_by_pieces): Add m_ZNVER5.
* doc/extend.texi: Add znver5.
* doc/invoke.texi: Likewise.
* config/i386/znver4.md: Rename to zn4zn5.md; combine znver4 and znver5 Scheduler.
gcc/testsuite/ChangeLog:
* g++.target/i386/mv29.C: Handle znver5 arch.
* gcc.target/i386/funcspec-56.inc:Likewise.
|
|
gcc/ChangeLog:
* config.gcc: Add a case for loongarch*-*-linux-musl*.
* config/loongarch/linux.h: Disable the multilib-compatible
treatment for *musl* targets.
* config/loongarch/musl.h: New file.
|
|
gcc/ChangeLog:
* config.gcc (target_gtfiles): Change coreout to btfext-out.
(extra_objs): Change coreout to btfext-out.
* config/bpf/coreout.cc: Rename to btfext-out.cc.
* config/bpf/btfext-out.cc: Add.
* config/bpf/coreout.h: Rename to btfext-out.h.
* config/bpf/btfext-out.h: Add.
* config/bpf/core-builtins.cc: Change include.
* config/bpf/core-builtins.h: Change include.
* config/bpf/t-bpf: Accomodate renamed files.
|
|
The following deprecates ia64*-*-* for GCC 14. Since we plan to
force LRA for GCC 15 and the target only has slim chances of getting
updated this notifies people in advance. Given both Linux and
glibc have axed the target further development is also made difficult.
There is no listed maintainer for ia64 either.
PR target/90785
gcc/
* config.gcc: Add ia64*-*-* to the list of obsoleted targets.
contrib/
* config-list.mk (LIST): --enable-obsolete for ia64*-*-*.
|
|
gcc/ChangeLog:
* config.gcc (amdgcn-*-*): Add gfx1030 and gfx1100 to
TM_MULTILIB_CONFIG.
* doc/install.texi (Configuration amdgcn-*-*): Mention gfx1030/gfx1100.
* doc/invoke.texi (AMD GCN Options): Add gfx1030 and gfx1100 to
-march/-mtune.
libgomp/ChangeLog:
* testsuite/libgomp.c/declare-variant-4.h: Add variant functions
for gfx1030 and gfx1100.
* testsuite/libgomp.c/declare-variant-4-gfx1030.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1100.c: New test.
Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
|
|
This patch is to handle the differences in instruction generation
between Vector and XTheadVector. In this version, we only support
partial xtheadvector instructions that leverage directly from current
RVV1.0 with simple adding "th." prefix. For different name xtheadvector
instructions but share same patterns as RVV1.0 instructions, we will
use ASM targethook to rewrite the whole string of the instructions in
the following patches.
For some vector patterns that cannot be avoided, we use
"!TARGET_XTHEADVECTOR" to disable them in vector.md in order
not to generate instructions that xtheadvector does not support,
like vmv1r.
gcc/ChangeLog:
* config.gcc: Add files for XTheadVector intrinsics.
* config/riscv/autovec.md: Guard XTheadVector.
* config/riscv/predicates.md: Disable immediate vl
for XTheadVector.
* config/riscv/riscv-c.cc (riscv_pragma_intrinsic):
Add pragma for XTheadVector.
* config/riscv/riscv-string.cc (riscv_expand_block_move):
Guard XTheadVector.
* config/riscv/riscv-v.cc (vls_mode_valid_p):
Avoid autovec.
* config/riscv/riscv-vector-builtins-bases.cc:
Do not normalize vsetvl instructions for XTheadVector.
* config/riscv/riscv-vector-builtins-shapes.cc (check_type):
New check type function.
(build_one): Adjust for XTheadVector.
* config/riscv/riscv-vector-switch.def (ENTRY):
Disable fractional mode for the XTheadVector extension.
(TUPLE_ENTRY): Likewise.
* config/riscv/riscv.cc (riscv_v_adjust_bytesize):
Guard XTheadVector.
(riscv_preferred_simd_mode): Likewsie.
(riscv_autovectorize_vector_modes): Likewise.
(riscv_vector_mode_supported_any_target_p): Likewise.
(TARGET_VECTOR_MODE_SUPPORTED_ANY_TARGET_P): Likewise.
* config/riscv/thead.cc (th_asm_output_opcode):
Rewrite vsetvl instructions.
* config/riscv/vector.md:
Include thead-vector.md and change fractional LMUL
into 1 for vbool.
* config/riscv/riscv_th_vector.h: New file.
* config/riscv/thead-vector.md: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pragma-1.c: Add XTheadVector.
* gcc.target/riscv/rvv/base/abi-1.c: Exclude XTheadVector.
* lib/target-supports.exp: Add target for XTheadVector.
Co-authored-by: Jin Ma <jinma@linux.alibaba.com>
Co-authored-by: Xianmiao Qu <cooper.qu@linux.alibaba.com>
Co-authored-by: Christoph Müllner <christoph.muellner@vrull.eu>
|
|
This patch adds C intrinsics for Bitmanip Extension.
RISCV_BUILTIN_NO_PREFIX is a new riscv_builtin_description like RISCV_BUILTIN.
But it uses CODE_FOR_##INSN rather than CODE_FOR_riscv_##INSN.
Changed orcb, clmul, brev8 pattern's mode form X to GPR because orcbsi, clmul_si,
brev8_si are both included in rv32 and rv64. Test them in scalar_bitmanip_intrinsic-64-emulated.c.
gcc/ChangeLog:
* config.gcc: Include riscv_bitmanip.h.
* config/riscv/bitmanip.md: Changed mode form X to GPR in orcb and clmul pattern.
* config/riscv/crypto.md: Changed mode form X to GPR in brev8 pattern.
* config/riscv/riscv-builtins.cc (AVAIL): Adding new bitmanip builtins.
(RISCV_BUILTIN_NO_PREFIX): New helper macro.
* config/riscv/riscv-cmo.def (RISCV_BUILTIN): Add '_32'/'_64' postfix to builtins.
* config/riscv/riscv-ftypes.def (2): New ftypes.
* config/riscv/riscv-scalar-crypto.def (RISCV_BUILTIN): New builtins.
(RISCV_BUILTIN_NO_PREFIX): Likewise.
* config/riscv/riscv_bitmanip.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/scalar_bitmanip_intrinsic-32.c: New test.
* gcc.target/riscv/scalar_bitmanip_intrinsic-64-emulated.c: New test.
* gcc.target/riscv/scalar_bitmanip_intrinsic-64.c: New test.
|
|
This patch adds C intrinsics for Scalar Crypto Extension.
gcc/ChangeLog:
* config.gcc: Include riscv_crypto.h.
* config/riscv/riscv_crypto.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/scalar_crypto_intrinsic-32.c: New test.
* gcc.target/riscv/scalar_crypto_intrinsic-64.c: New test.
|
|
The r14-6524 changes created aarch64-builtins.h header and moved
struct aarch64_simd_type_info definition in there.
Unfortunately, the new header wasn't added to target_gtfiles, so the
trees and const char * pointer elements in the aarch64_simd_types
array aren't marked as GC roots anymore. That breaks e.g. PCH, when
the array elements then can refer to ggc_freed memory instead of the expected
types, but also any other GC collection could free them and further uses would
not work correctly.
Unfortunately, just adding the new header to target_gtfiles doesn't fix this,
because non-static variable definitions marked with GTY(()) aren't considered
by gengtype, it looks in those cases for an extern GTY(()) declaration, and
there was none - the aarch64-builtins.h header contains an extern declaration
without GTY(()). Adding GTY(()) to that extern declaration doesn't work, because
then gengtype attempts to emit the aarch64_simd_types GC roots in gtype-desc.cc
but the corresponding header isn't included there.
So, the patch instead adds another extern declaration in aarch64-builtins.cc
right before the actual definition, which makes sure the GC roots are registered
correctly in gt-aarch64-builtins.h (where we want them).
2024-01-09 Jakub Jelinek <jakub@redhat.com>
PR target/113270
* config.gcc (aarch64*-*-*): Add aarch64-builtins.h to target_gtfiles.
* config/aarch64/aarch64-builtins.cc (aarch64_simd_types): Add extern
GTY(()) declaration before the definition, drop GTY(()) drom the
definition.
|
|
ROCm since 5.7.1 supports gfx1100 (RDNA3) cards. This commit adds support
for it, mostly by assuming gfx1100 behaves identical to gfx1030. Like gfx1030,
gfx1100 support is neither documented nor the build of the multilib enabled by
default.
But contrary to gfx1030, gfx1100 has a known issue causing some libraries not
to build, including newlib: The sdwa variant of v_mov_b32_sdwa is not supported
by the hardware but GCC current does generates this instruction.
This will be addressed in a later commit.
gcc/ChangeLog:
* config.gcc (amdgcn-*-amdhsa): Accept --with-arch=gfx1100.
* config/gcn/gcn-hsa.h (NO_XNACK): Add gfx1100:
(ASM_SPEC): Handle gfx1100.
* config/gcn/gcn-opts.h (enum processor_type): Add PROCESSOR_GFX1100.
(enum gcn_isa): Add ISA_RDNA3.
(TARGET_GFX1100, TARGET_RDNA2_PLUS, TARGET_RDNA3): Define.
* config/gcn/gcn-valu.md: Change TARGET_RDNA2 to TARGET_RDNA2_PLUS.
* config/gcn/gcn.cc (gcn_option_override,
gcn_omp_device_kind_arch_isa, output_file_start): Handle gfx1100.
(gcn_global_address_p, gcn_addr_space_legitimate_address_p): Change
TARGET_RDNA2 to TARGET_RDNA2_PLUS.
(gcn_hsa_declare_function_name): Don't use '.amdhsa_reserve_flat_scratch'
with gfx1100.
* config/gcn/gcn.h (ASSEMBLER_DIALECT): Likewise.
(TARGET_CPU_CPP_BUILTINS): Define __RDNA3__, __gfx1030__ and
__gfx1100__.
* config/gcn/gcn.md: Change TARGET_RDNA2 to TARGET_RDNA2_PLUS.
* config/gcn/gcn.opt (Enum gpu_type): Add gfx1100.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1100): Define.
(isa_has_combined_avgprs, main): Handle gfx1100.
* config/gcn/t-omp-device (isa): Add gfx1100.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (gcn_gfx1100_s): New const string.
(gcn_isa_name_len): Fix length.
(isa_hsa_name, isa_code, max_isa_vgprs): Handle gfx1100.
|
|
|
|
gcc/ChangeLog:
* config.gcc: Add loongarch-d.o to d_target_objs for LoongArch
architecture.
* config/loongarch/t-loongarch: Add object target for loongarch-d.cc.
* config/loongarch/loongarch-d.cc
(loongarch_d_target_versions): add interface function to define builtin
D versions for LoongArch architecture.
(loongarch_d_handle_target_float_abi): add interface function to define
builtin D traits for LoongArch architecture.
(loongarch_d_register_target_info): add interface function to register
loongarch_d_handle_target_float_abi function.
* config/loongarch/loongarch-d.h
(loongarch_d_target_versions): add function prototype.
(loongarch_d_register_target_info): Likewise.
libphobos/ChangeLog:
* configure.tgt: Enable libphobos for LoongArch architecture.
* libdruntime/gcc/sections/elf.d: Add TLS_DTV_OFFSET constant for
LoongArch64.
* libdruntime/gcc/unwind/generic.d: Add __aligned__ constant for
LoongArch64.
|
|
This adds a new aarch64-specific RTL-SSA pass dedicated to forming load
and store pairs (LDPs and STPs).
As a motivating example for the kind of thing this improves, take the
following testcase:
extern double c[20];
double f(double x)
{
double y = x*x;
y += c[16];
y += c[17];
y += c[18];
y += c[19];
return y;
}
for which we currently generate (at -O2):
f:
adrp x0, c
add x0, x0, :lo12:c
ldp d31, d29, [x0, 128]
ldr d30, [x0, 144]
fmadd d0, d0, d0, d31
ldr d31, [x0, 152]
fadd d0, d0, d29
fadd d0, d0, d30
fadd d0, d0, d31
ret
but with the pass, we generate:
f:
.LFB0:
adrp x0, c
add x0, x0, :lo12:c
ldp d31, d29, [x0, 128]
fmadd d0, d0, d0, d31
ldp d30, d31, [x0, 144]
fadd d0, d0, d29
fadd d0, d0, d30
fadd d0, d0, d31
ret
The pass is local (only considers a BB at a time). In theory, it should
be possible to extend it to run over EBBs, at least in the case of pure
(MEM_READONLY_P) loads, but this is left for future work.
The pass works by identifying two kinds of bases: tree decls obtained
via MEM_EXPR, and RTL register bases in the form of RTL-SSA def_infos.
If a candidate memory access has a MEM_EXPR base, then we track it via
this base, and otherwise if it is of a simple reg + <imm> form, we track
it via the RTL-SSA def_info for the register.
For each BB, for a given kind of base, we build up a hash table mapping
the base to an access_group. The access_group data structure holds a
list of accesses at each offset relative to the same base. It uses a
splay tree to support efficient insertion (while walking the bb), and
the nodes are chained using a linked list to support efficient
iteration (while doing the transformation).
For each base, we then iterate over the access_group to identify
adjacent accesses, and try to form load/store pairs for those insns that
access adjacent memory.
The pass is currently run twice, both before and after register
allocation. The first copy of the pass is run late in the pre-RA RTL
pipeline, immediately after sched1, since it was found that sched1 was
increasing register pressure when the pass was run before. The second
copy of the pass runs immediately before peephole2, so as to get any
opportunities that the existing ldp/stp peepholes can handle.
There are some cases that we punt on before RA, e.g.
accesses relative to eliminable regs (such as the soft frame pointer).
We do this since we can't know the elimination offset before RA, and we
want to avoid the RA reloading the offset (due to being out of ldp/stp
immediate range) as this can generate worse code.
The post-RA copy of the pass is there to pick up the crumbs that were
left behind / things we punted on in the pre-RA pass. Among other
things, it's needed to handle accesses relative to the stack pointer.
It can also handle code that didn't exist at the time the pre-RA pass
was run (spill code, prologue/epilogue code).
This is an initial implementation, and there are (among other possible
improvements) the following notable caveats / missing features that are
left for future work, but could give further improvements:
- Moving accesses between BBs within in an EBB, see above.
- Out-of-range opportunities: currently the pass refuses to form pairs
if there isn't a suitable base register with an immediate in range
for ldp/stp, but it can be profitable to emit anchor addresses in the
case that there are four or more out-of-range nearby accesses that can
be formed into pairs. This is handled by the current ldp/stp
peepholes, so it would be good to support this in the future.
- Discovery: currently we prioritize MEM_EXPR bases over RTL bases, which can
lead to us missing opportunities in the case that two accesses have distinct
MEM_EXPR bases (i.e. different DECLs) but they are still adjacent in memory
(e.g. adjacent variables on the stack). I hope to address this for GCC 15,
hopefully getting to the point where we can remove the ldp/stp peepholes and
scheduling hooks. Furthermore it would be nice to make the pass aware of
section anchors (adding these as a third kind of base) allowing merging
accesses to adjacent variables within the same section.
gcc/ChangeLog:
* config.gcc: Add aarch64-ldp-fusion.o to extra_objs for aarch64.
* config/aarch64/aarch64-passes.def: Add copies of pass_ldp_fusion
before and after RA.
* config/aarch64/aarch64-protos.h (make_pass_ldp_fusion): Declare.
* config/aarch64/aarch64.opt (-mearly-ldp-fusion): New.
(-mlate-ldp-fusion): New.
(--param=aarch64-ldp-alias-check-limit): New.
(--param=aarch64-ldp-writeback): New.
* config/aarch64/t-aarch64: Add rule for aarch64-ldp-fusion.o.
* config/aarch64/aarch64-ldp-fusion.cc: New file.
* doc/invoke.texi (AArch64 Options): Document new
-m{early,late}-ldp-fusion options.
|
|
ACLE has added intrinsics to bridge between SVE and Neon.
The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and
SVE vectors.
This patch adds support to GCC for the following 3 intrinsics:
svset_neonq, svget_neonq and svdup_neonq
gcc/ChangeLog:
* config.gcc: Adds new header to config.
* config/aarch64/aarch64-builtins.cc (enum aarch64_type_qualifiers):
Moved to header file.
(ENTRY): Likewise.
(enum aarch64_simd_type): Likewise.
(struct aarch64_simd_type_info): Remove static.
(GTY): Likewise.
* config/aarch64/aarch64-c.cc (aarch64_pragma_aarch64):
Defines pragma for arm_neon_sve_bridge.h.
* config/aarch64/aarch64-protos.h:
Add handle_arm_neon_sve_bridge_h
* config/aarch64/aarch64-sve-builtins-base.h: New intrinsics.
* config/aarch64/aarch64-sve-builtins-base.cc
(class svget_neonq_impl): New intrinsic implementation.
(class svset_neonq_impl): Likewise.
(class svdup_neonq_impl): Likewise.
(NEON_SVE_BRIDGE_FUNCTION): New intrinsics.
* config/aarch64/aarch64-sve-builtins-functions.h
(NEON_SVE_BRIDGE_FUNCTION): Defines macro for NEON_SVE_BRIDGE
functions.
* config/aarch64/aarch64-sve-builtins-shapes.h: New shapes.
* config/aarch64/aarch64-sve-builtins-shapes.cc
(parse_element_type): Add NEON element types.
(parse_type): Likewise.
(struct get_neonq_def): Defines function shape for get_neonq.
(struct set_neonq_def): Defines function shape for set_neonq.
(struct dup_neonq_def): Defines function shape for dup_neonq.
* config/aarch64/aarch64-sve-builtins.cc
(DEF_SVE_TYPE_SUFFIX): Changed to be called through
SVE_NEON macro.
(DEF_SVE_NEON_TYPE_SUFFIX): Defines
macro for NEON_SVE_BRIDGE type suffixes.
(DEF_NEON_SVE_FUNCTION): Defines
macro for NEON_SVE_BRIDGE functions.
(function_resolver::infer_neon128_vector_type): Infers type suffix
for overloaded functions.
(handle_arm_neon_sve_bridge_h): Handles #pragma arm_neon_sve_bridge.h.
* config/aarch64/aarch64-sve-builtins.def
(DEF_SVE_NEON_TYPE_SUFFIX): Macro for handling neon_sve type suffixes.
(bf16): Replace entry with neon-sve entry.
(f16): Likewise.
(f32): Likewise.
(f64): Likewise.
(s8): Likewise.
(s16): Likewise.
(s32): Likewise.
(s64): Likewise.
(u8): Likewise.
(u16): Likewise.
(u32): Likewise.
(u64): Likewise.
* config/aarch64/aarch64-sve-builtins.h
(GCC_AARCH64_SVE_BUILTINS_H): Include aarch64-builtins.h.
(ENTRY): Add aarch64_simd_type definiton.
(enum aarch64_simd_type): Add neon information to type_suffix_info.
(struct type_suffix_info): New function.
* config/aarch64/aarch64-sve.md
(@aarch64_sve_get_neonq_<mode>): New intrinsic insn for big endian.
(@aarch64_sve_set_neonq_<mode>): Likewise.
* config/aarch64/iterators.md: Add UNSPEC_SET_NEONQ.
* config/aarch64/aarch64-builtins.h: New file.
* config/aarch64/aarch64-neon-sve-bridge-builtins.def: New file.
* config/aarch64/arm_neon_sve_bridge.h: New file.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Add include
arm_neon_sve_bridge header file
* gcc.dg/torture/neon-sve-bridge.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_bf16.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_f16.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_f32.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_f64.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_s16.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_s64.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_s8.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_u16.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_u32.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_u64.c: New test.
* gcc.target/aarch64/sve/acle/asm/dup_neonq_u8.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_bf16.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_f16.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_f32.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_f64.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_s16.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_s64.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_s8.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_u16.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_u32.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_u64.c: New test.
* gcc.target/aarch64/sve/acle/asm/get_neonq_u8.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_bf16.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_f16.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_f32.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_f64.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_s16.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_s32.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_s64.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_s8.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_u16.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_u32.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_u64.c: New test.
* gcc.target/aarch64/sve/acle/asm/set_neonq_u8.c: New test.
* gcc.target/aarch64/sve/acle/general-c/dup_neonq_1.c: New test.
* gcc.target/aarch64/sve/acle/general-c/get_neonq_1.c: New test.
* gcc.target/aarch64/sve/acle/general-c/set_neonq_1.c: New test.
|