aboutsummaryrefslogtreecommitdiff
path: root/gcc/config
AgeCommit message (Collapse)AuthorFilesLines
2022-11-08i386: Improve vector [GL]E{,U} comparison against vector constants [PR107546]Jakub Jelinek3-11/+94
For integer vector comparisons without XOP before AVX512{F,VL} we are constrained by only GT and EQ being supported in HW. For GTU we play tricks to implement it using GT or unsigned saturating subtraction, for LT/LTU we swap the operands and thus turn it into GT/GTU. For LE/LEU we handle it by using GT/GTU and negating the result and for GE/GEU by using GT/GTU on swapped operands and negating the result. If the second operand is a CONST_VECTOR, we can usually do better though, we can avoid the negation. For LE/LEU cst by doing LT/LTU cst+1 (and then cst+1 GT/GTU x) and for GE/GEU cst by doing GT/GTU cst-1, provided there is no wrap-around on those cst+1 or cst-1. GIMPLE canonicalizes x < cst to x <= cst-1 etc. (the rule is smaller absolute value on constant), but only for scalars or uniform vectors, so in some cases this undoes that canonicalization in order to avoid the extra negation, but it handles also non-uniform constants. E.g. with -mavx2 the testcase assembly difference is: - movl $47, %eax + movl $48, %eax vmovdqa %xmm0, %xmm1 vmovd %eax, %xmm0 vpbroadcastb %xmm0, %xmm0 - vpminsb %xmm0, %xmm1, %xmm0 - vpcmpeqb %xmm1, %xmm0, %xmm0 + vpcmpgtb %xmm1, %xmm0, %xmm0 and - vmovdqa %xmm0, %xmm1 - vmovdqa .LC1(%rip), %xmm0 - vpminsb %xmm1, %xmm0, %xmm1 - vpcmpeqb %xmm1, %xmm0, %xmm0 + vpcmpgtb .LC1(%rip), %xmm0, %xmm0 while with just SSE2: - pcmpgtb .LC0(%rip), %xmm0 - pxor %xmm1, %xmm1 - pcmpeqb %xmm1, %xmm0 + movdqa %xmm0, %xmm1 + movdqa .LC0(%rip), %xmm0 + pcmpgtb %xmm1, %xmm0 and - movdqa %xmm0, %xmm1 - movdqa .LC1(%rip), %xmm0 - pcmpgtb %xmm1, %xmm0 - pxor %xmm1, %xmm1 - pcmpeqb %xmm1, %xmm0 + pcmpgtb .LC1(%rip), %xmm0 2022-11-08 Jakub Jelinek <jakub@redhat.com> PR target/107546 * config/i386/predicates.md (vector_or_const_vector_operand): New predicate. * config/i386/sse.md (vec_cmp<mode><sseintvecmodelower>, vec_cmpv2div2di, vec_cmpu<mode><sseintvecmodelower>, vec_cmpuv2div2di): Use nonimmediate_or_const_vector_operand predicate instead of nonimmediate_operand and vector_or_const_vector_operand instead of vector_operand. * config/i386/i386-expand.cc (ix86_expand_int_sse_cmp): For LE/LEU or GE/GEU with CONST_VECTOR cop1 try to transform those into LE/LEU or GT/GTU with larger or smaller by one cop1 if there is no wrap-around. Force CONST_VECTOR cop0 or cop1 into REG. Formatting fix. * gcc.target/i386/pr107546.c: New test.
2022-11-08Revert "i386: Prefer remote atomic insn for atomic_fetch{add, and, or, xor}"konglin12-28/+3
This reverts commit 48fa4131e419942efc9dd762694fdc7e819de392.
2022-11-08Add m_CORE_ATOM for atom coresHaochen Jiang2-31/+41
gcc/ChangeLog: * config/i386/i386-options.cc (m_CORE_ATOM): New. * config/i386/x86-tune.def (X86_TUNE_SCHEDULE): Initial tune for CORE_ATOM. (X86_TUNE_PARTIAL_REG_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY): Ditto. (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY): Ditto. (X86_TUNE_DEST_FALSE_DEP_FOR_GLC): Ditto. (X86_TUNE_MEMORY_MISMATCH_STALL): Ditto. (X86_TUNE_USE_LEAVE): Ditto. (X86_TUNE_PUSH_MEMORY): Ditto. (X86_TUNE_USE_INCDEC): Ditto. (X86_TUNE_INTEGER_DFMODE_MOVES): Ditto. (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB): Ditto. (X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES): Ditto. (X86_TUNE_USE_SAHF): Ditto. (X86_TUNE_USE_BT): Ditto. (X86_TUNE_AVOID_FALSE_DEP_FOR_BMI): Ditto. (X86_TUNE_ONE_IF_CONV_INSN): Ditto. (X86_TUNE_AVOID_MFENCE): Ditto. (X86_TUNE_USE_SIMODE_FIOP): Ditto. (X86_TUNE_EXT_80387_CONSTANTS): Ditto. (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Ditto. (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Ditto. (X86_TUNE_SSE_TYPELESS_STORES): Ditto. (X86_TUNE_SSE_LOAD0_BY_PXOR): Ditto. (X86_TUNE_AVOID_4BYTE_PREFIXES): Ditto. (X86_TUNE_USE_GATHER_2PARTS): Ditto. (X86_TUNE_USE_GATHER_4PARTS): Ditto. (X86_TUNE_USE_GATHER): Ditto.
2022-11-07bpf: cleanup missed refactorDavid Faust1-23/+1
Commit 068baae1864 "bpf: add preserve_field_info builtin" factored out some repeated code to a new function maybe_make_core_relo (), but missed using it in one place. Clean that up. gcc/ * config/bpf/bpf.cc (handle_attr_preserve): Use maybe_make_core_relo().
2022-11-07Initial Grand Ridge supportHu, Lin14-1/+15
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Handle Grand Ridge. * common/config/i386/i386-common.cc (processor_names): Add grandridge. (processor_alias_table): Ditto. * common/config/i386/i386-cpuinfo.h: (enum processor_types): Add INTEL_GRANDRIDGE. * config.gcc: Add -march=grandridge. * config/i386/driver-i386.cc (host_detect_local_cpu): Handle grandridge. * config/i386/i386-c.cc (ix86_target_macros_internal): Ditto. * config/i386/i386-options.cc (m_GRANDRIDGE): New define. (processor_cost_table): Add grandridge. * config/i386/i386.h (enum processor_type): Add PROCESSOR_GRANDRIDGE. (PTA_GRANDRIDGE): Ditto. * doc/extend.texi: Add grandridge. * doc/invoke.texi: Ditto. gcc/testsuite/ChangeLog: * g++.target/i386/mv16.C: Add grandridge. * gcc.target/i386/funcspec-56.inc: Handle new march.
2022-11-07i386: Prefer remote atomic insn for atomic_fetch{add, and, or, xor}konglin12-3/+28
Add flag -mprefer-remote-atomic to control whether to generate raoint insn for atomic operations. gcc/ChangeLog: * config/i386/i386.opt:Add -mprefer-remote-atomic. * config/i386/sync.md (atomic_<plus_logic><mode>): New define_expand. (atomic_add<mode>): Rename to below one. (atomic_add<mode>_1): To this. (atomic_<logic><mode>): Ditto. (atomic_<logic><mode>_1): Ditto. * doc/invoke.texi: Add -mprefer-remote-atomic. gcc/testsuite/ChangeLog: * gcc.target/i386/raoint-atomic-fetch.c: New test.
2022-11-07Support Intel RAO-INTkonglin19-1/+139
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect raoint. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_RAOINT_SET, OPTION_MASK_ISA2_RAOINT_UNSET): New. (ix86_handle_option): Handle -mraoint. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_RAOINT. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for raoint. * config.gcc: Add raointintrin.h * config/i386/cpuid.h (bit_RAOINT): New. * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-c.cc (ix86_target_macros_internal): Define __RAOINT__. * config/i386/i386-isa.def (RAOINT): Add DEF_PTA(RAOINT). * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p): Add -mraoint. * config/i386/sync.md (rao_a<raointop><mode>): New define insn. * config/i386/i386.opt: Add option -mraoint. * config/i386/x86gprintrin.h: Include raointintrin.h. * doc/extend.texi: Document raoint. * doc/invoke.texi: Document -mraoint. * doc/sourcebuild.texi: Document target raoint. * config/i386/raointintrin.h: New file. gcc/testsuite/ChangeLog: * g++.dg/other/i386-2.C: Add -mraoint. * g++.dg/other/i386-3.C: Ditto. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/sse-12.c: Add -mraoint. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Add raoint target. * gcc.target/i386/sse-23.c: Ditto. * lib/target-supports.exp: Add check_effective_target_raoint. * gcc.target/i386/rao-helper.h: New test. * gcc.target/i386/raoint-1.c: Ditto. * gcc.target/i386/raoint-aadd-2.c: Ditto. * gcc.target/i386/raoint-aand-2.c: Ditto. * gcc.target/i386/raoint-aor-2.c: Ditto. * gcc.target/i386/raoint-axor-2.c: Ditto. * gcc.target/i386/x86gprintrin-1.c: Ditto. * gcc.target/i386/x86gprintrin-2.c: Ditto. * gcc.target/i386/x86gprintrin-3.c: Ditto. * gcc.target/i386/x86gprintrin-4.c: Ditto. * gcc.target/i386/x86gprintrin-5.c: Ditto.
2022-11-07Initial Granite Rapids SupportHaochen Jiang4-2/+17
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Handle Granite Rapids. * common/config/i386/i386-common.cc: (processor_names): Add graniterapids. (processor_alias_table): Ditto. * common/config/i386/i386-cpuinfo.h (enum processor_subtypes): Add INTEL_GRANTIERAPIDS. * config.gcc: Add -march=graniterapids. * config/i386/driver-i386.cc (host_detect_local_cpu): Handle graniterapids. * config/i386/i386-c.cc (ix86_target_macros_internal): Ditto. * config/i386/i386-options.cc (m_GRANITERAPIDS): New. (processor_cost_table): Add graniterapids. * config/i386/i386.h (enum processor_type): Add PROCESSOR_GRANITERAPIDS. (PTA_GRANITERAPIDS): Ditto. * doc/extend.texi: Add graniterapids. * doc/invoke.texi: Ditto. gcc/testsuite/ChangeLog: * g++.target/i386/mv16.C: Add graniterapids. * gcc.target/i386/funcspec-56.inc: Handle new march.
2022-11-07Support Intel prefetchit0/t1Haochen Jiang13-3/+190
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect PREFETCHI. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_PREFETCHI_SET, OPTION_MASK_ISA2_PREFETCHI_UNSET): New. (ix86_handle_option): Handle -mprefetchi. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_PREFETCHI. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for prefetchi. * config.gcc: Add prfchiintrin.h. * config/i386/cpuid.h (bit_PREFETCHI): New. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (VOID, PCVOID, INT) and DEF_FUNCTION_TYPE (VOID, PCVOID, INT, INT, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-c.cc (ix86_target_macros_internal): Define __PREFETCHI__. * config/i386/i386-expand.cc: Handle new builtins. * config/i386/i386-isa.def (PREFETCHI): Add DEF_PTA(PREFETCHI). * config/i386/i386-options.cc (ix86_valid_target_attribute_inner_p): Handle prefetchi. * config/i386/i386.md (prefetchi): New define_insn. * config/i386/i386.opt: Add option -mprefetchi. * config/i386/predicates.md (local_func_symbolic_operand): New predicates. * config/i386/x86gprintrin.h: Include prfchiintrin.h. * config/i386/xmmintrin.h (enum _mm_hint): New enum for prefetchi. (_mm_prefetch): Handle the highest bit of enum. * doc/extend.texi: Document prefetchi. * doc/invoke.texi: Document -mprefetchi. * doc/sourcebuild.texi: Document target prefetchi. * config/i386/prfchiintrin.h: New file. gcc/testsuite/ChangeLog: * g++.dg/other/i386-2.C: Add -mprefetchi. * g++.dg/other/i386-3.C: Ditto. * gcc.target/i386/avx-1.c: Ditto. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/sse-13.c: Add -mprefetchi. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/x86gprintrin-1.c: Ditto. * gcc.target/i386/x86gprintrin-2.c: Ditto. * gcc.target/i386/x86gprintrin-3.c: Ditto. * gcc.target/i386/x86gprintrin-4.c: Ditto. * gcc.target/i386/x86gprintrin-5.c: Ditto. * gcc.target/i386/prefetchi-1.c: New test. * gcc.target/i386/prefetchi-2.c: Ditto. * gcc.target/i386/prefetchi-3.c: Ditto. * gcc.target/i386/prefetchi-4.c: Ditto. Co-authored-by: Hongtao Liu <hongtao.liu@intel.com>
2022-11-06LoongArch: Add fcopysign instructionsXi Ruoyao1-1/+21
Add fcopysign.{s,d} with the names copysign{sf,df}3 so GCC will expand __builtin_copysign{f,} to a single instruction. Link: https://sourceware.org/pipermail/libc-alpha/2022-November/143177.html gcc/ChangeLog: * config/loongarch/loongarch.md (UNSPEC_FCOPYSIGN): New unspec. (type): Add fcopysign. (copysign<mode>3): New instruction template. gcc/testsuite/ChangeLog: * gcc.target/loongarch/fcopysign.c: New test.
2022-11-04aarch64: Fix typo in aarch64-sve.md commentKyrylo Tkachov1-2/+2
gcc/ChangeLog: * config/aarch64/aarch64-sve2.md: Fix typo in Cryptographic extensions comment.
2022-11-04Remove support for Intel MIC offloadingThomas Schwinge5-783/+0
... after its deprecation in GCC 12. * Makefile.def: Remove module 'liboffloadmic'. * Makefile.in: Regenerate. * configure.ac: Remove 'liboffloadmic' handling. * configure: Regenerate. contrib/ * gcc-changelog/git_commit.py (default_changelog_locations): Remove 'liboffloadmic'. * gcc_update (files_and_dependencies): Remove 'liboffloadmic' files. * update-copyright.py (GCCCmdLine): Remove 'liboffloadmic' comment. gcc/ * config.gcc [target *-intelmic-* | *-intelmicemul-*]: Remove. * config/i386/i386-options.cc (ix86_omp_device_kind_arch_isa) [ACCEL_COMPILER]: Remove. * config/i386/intelmic-mkoffload.cc: Remove. * config/i386/intelmic-offload.h: Likewise. * config/i386/t-intelmic: Likewise. * config/i386/t-omp-device: Likewise. * configure.ac [target *-intelmic-* | *-intelmicemul-*]: Remove. * configure: Regenerate. * doc/install.texi (--enable-offload-targets=[...]): Update. * doc/sourcebuild.texi: Remove 'liboffloadmic' documentation. include/ * gomp-constants.h (GOMP_DEVICE_INTEL_MIC): Comment out. (GOMP_VERSION_INTEL_MIC): Remove. libgomp/ * libgomp-plugin.h (OFFLOAD_TARGET_TYPE_INTEL_MIC): Remove. * libgomp.texi (OpenMP Context Selectors): Remove Intel MIC documentation. * plugin/configfrag.ac <enable_offload_targets> [*-intelmic-* | *-intelmicemul-*]: Remove. * configure: Regenerate. * testsuite/lib/libgomp.exp (libgomp_init): Remove 'liboffloadmic' handling. (offload_target_to_openacc_device_type) [$offload_target = *-intelmic*]: Remove. (check_effective_target_offload_device_intel_mic) (check_effective_target_offload_device_any_intel_mic): Remove. * testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_intel_mic, on_device_arch_intel_mic, any_device_arch) (any_device_arch_intel_mic): Remove. * testsuite/libgomp.c-c++-common/target-45.c: Remove 'offload_device_any_intel_mic' XFAIL. * testsuite/libgomp.fortran/target10.f90: Likewise. liboffloadmic/ * ChangeLog: Remove. * Makefile.am: Likewise. * Makefile.in: Likewise. * aclocal.m4: Likewise. * configure: Likewise. * configure.ac: Likewise. * configure.tgt: Likewise. * doc/doxygen/config: Likewise. * doc/doxygen/header.tex: Likewise. * include/coi/common/COIEngine_common.h: Likewise. * include/coi/common/COIEvent_common.h: Likewise. * include/coi/common/COIMacros_common.h: Likewise. * include/coi/common/COIPerf_common.h: Likewise. * include/coi/common/COIResult_common.h: Likewise. * include/coi/common/COISysInfo_common.h: Likewise. * include/coi/common/COITypes_common.h: Likewise. * include/coi/sink/COIBuffer_sink.h: Likewise. * include/coi/sink/COIPipeline_sink.h: Likewise. * include/coi/sink/COIProcess_sink.h: Likewise. * include/coi/source/COIBuffer_source.h: Likewise. * include/coi/source/COIEngine_source.h: Likewise. * include/coi/source/COIEvent_source.h: Likewise. * include/coi/source/COIPipeline_source.h: Likewise. * include/coi/source/COIProcess_source.h: Likewise. * liboffloadmic_host.spec.in: Likewise. * liboffloadmic_target.spec.in: Likewise. * plugin/Makefile.am: Likewise. * plugin/Makefile.in: Likewise. * plugin/aclocal.m4: Likewise. * plugin/configure: Likewise. * plugin/configure.ac: Likewise. * plugin/libgomp-plugin-intelmic.cpp: Likewise. * plugin/offload_target_main.cpp: Likewise. * runtime/cean_util.cpp: Likewise. * runtime/cean_util.h: Likewise. * runtime/coi/coi_client.cpp: Likewise. * runtime/coi/coi_client.h: Likewise. * runtime/coi/coi_server.cpp: Likewise. * runtime/coi/coi_server.h: Likewise. * runtime/compiler_if_host.cpp: Likewise. * runtime/compiler_if_host.h: Likewise. * runtime/compiler_if_target.cpp: Likewise. * runtime/compiler_if_target.h: Likewise. * runtime/dv_util.cpp: Likewise. * runtime/dv_util.h: Likewise. * runtime/emulator/coi_common.h: Likewise. * runtime/emulator/coi_device.cpp: Likewise. * runtime/emulator/coi_device.h: Likewise. * runtime/emulator/coi_host.cpp: Likewise. * runtime/emulator/coi_host.h: Likewise. * runtime/emulator/coi_version_asm.h: Likewise. * runtime/emulator/coi_version_linker_script.map: Likewise. * runtime/liboffload_error.c: Likewise. * runtime/liboffload_error_codes.h: Likewise. * runtime/liboffload_msg.c: Likewise. * runtime/liboffload_msg.h: Likewise. * runtime/mic_lib.f90: Likewise. * runtime/offload.h: Likewise. * runtime/offload_common.cpp: Likewise. * runtime/offload_common.h: Likewise. * runtime/offload_engine.cpp: Likewise. * runtime/offload_engine.h: Likewise. * runtime/offload_env.cpp: Likewise. * runtime/offload_env.h: Likewise. * runtime/offload_host.cpp: Likewise. * runtime/offload_host.h: Likewise. * runtime/offload_iterator.h: Likewise. * runtime/offload_omp_host.cpp: Likewise. * runtime/offload_omp_target.cpp: Likewise. * runtime/offload_orsl.cpp: Likewise. * runtime/offload_orsl.h: Likewise. * runtime/offload_table.cpp: Likewise. * runtime/offload_table.h: Likewise. * runtime/offload_target.cpp: Likewise. * runtime/offload_target.h: Likewise. * runtime/offload_target_main.cpp: Likewise. * runtime/offload_timer.h: Likewise. * runtime/offload_timer_host.cpp: Likewise. * runtime/offload_timer_target.cpp: Likewise. * runtime/offload_trace.cpp: Likewise. * runtime/offload_trace.h: Likewise. * runtime/offload_util.cpp: Likewise. * runtime/offload_util.h: Likewise. * runtime/ofldbegin.cpp: Likewise. * runtime/ofldend.cpp: Likewise. * runtime/orsl-lite/include/orsl-lite.h: Likewise. * runtime/orsl-lite/lib/orsl-lite.c: Likewise. * runtime/orsl-lite/version.txt: Likewise.
2022-11-04Better integrate default 'sorry' 'TARGET_ASM_CONSTRUCTOR', ↵Thomas Schwinge1-1/+0
'TARGET_ASM_DESTRUCTOR' ... after commit 4ee35c11fd328728c12f3e086ae016ca94624bf8 "Restore default 'sorry' 'TARGET_ASM_CONSTRUCTOR', 'TARGET_ASM_DESTRUCTOR'". No functional change. gcc/ * Makefile.in (OBJS): Remove 'dbxout.o'. * config/nvptx/nvptx.cc: Don't '#include "dbxout.h"'. * dbxout.cc: Remove. * dbxout.h: Likewise. * target-def.h (TARGET_ASM_CONSTRUCTOR, TARGET_ASM_DESTRUCTOR): Default to 'default_asm_out_constructor', 'default_asm_out_destructor'. * targhooks.cc (default_asm_out_constructor) (default_asm_out_destructor): New. * targhooks.h (default_asm_out_constructor) (default_asm_out_destructor): Declare.
2022-11-04Restore default 'sorry' 'TARGET_ASM_CONSTRUCTOR', 'TARGET_ASM_DESTRUCTOR'Thomas Schwinge1-0/+1
... that got lost in commit 7e0db0cdf01e9c885a29cb37415f5bc00d90c029 "STABS: remove -gstabs and -gxcoff functionality". Previously, if a back end was not 'USE_COLLECT2', nor manually defined 'TARGET_ASM_CONSTRUCTOR', 'TARGET_ASM_DESTRUCTOR', or got pointed to the respective 'default_[...]' functions due to 'CTORS_SECTION_ASM_OP', 'DTORS_SECTION_ASM_OP', or 'TARGET_ASM_NAMED_SECTION', it got pointed to 'default_stabs_asm_out_constructor', 'default_stabs_asm_out_destructor'. These would emit 'sorry' for any global constructor/destructor they're run into. This is now gone, and thus in such a back end configuration case 'TARGET_ASM_CONSTRUCTOR', 'TARGET_ASM_DESTRUCTOR' don't get defined anymore, and thus the subsequently following: #if !defined(TARGET_HAVE_CTORS_DTORS) # if defined(TARGET_ASM_CONSTRUCTOR) && defined(TARGET_ASM_DESTRUCTOR) # define TARGET_HAVE_CTORS_DTORS true # endif #endif ... doesn't define 'TARGET_HAVE_CTORS_DTORS' anymore, and thus per my understanding, 'gcc/final.cc:rest_of_handle_final': if (DECL_STATIC_CONSTRUCTOR (current_function_decl) && targetm.have_ctors_dtors) targetm.asm_out.constructor (XEXP (DECL_RTL (current_function_decl), 0), decl_init_priority_lookup (current_function_decl)); if (DECL_STATIC_DESTRUCTOR (current_function_decl) && targetm.have_ctors_dtors) targetm.asm_out.destructor (XEXP (DECL_RTL (current_function_decl), 0), decl_fini_priority_lookup (current_function_decl)); ... simply does nothing anymore for a 'DECL_STATIC_CONSTRUCTOR', 'DECL_STATIC_DESTRUCTOR'. This, effectively, means that GCC/nvptx now suddenly appears to "support" global constructors/destructors, which means that a ton of test cases now erroneously PASS that previously used to FAIL: sorry, unimplemented: global constructors not supported on this target Of course, such support didn't magically happen due to "STABS: remove -gstabs and -gxcoff functionality", so this is bad. And, corresponding execution testing then regularly FAILs (due to the global constructor/destructor functions never being invoked), for example: [-UNSUPPORTED:-]{+PASS:+} gcc.dg/initpri1.c {+(test for excess errors)+} {+FAIL: gcc.dg/initpri1.c execution test+} [-UNSUPPORTED:-]{+PASS:+} g++.dg/special/conpr-1.C {+(test for excess errors)+} {+FAIL: g++.dg/special/conpr-1.C execution test+} To restore the previous GCC/nvptx behavior, for traceability, this simply restores the previous code, stripped down to the bare minimum. gcc/ * Makefile.in (OBJS): Add 'dbxout.o'. * config/nvptx/nvptx.cc: '#include "dbxout.h"'. * dbxout.cc: New. * dbxout.h: Likewise. * target-def.h (TARGET_ASM_CONSTRUCTOR, TARGET_ASM_DESTRUCTOR): Default to 'default_stabs_asm_out_constructor', 'default_stabs_asm_out_destructor'.
2022-11-04Support Intel AMX-FP16 ISAHongyu Wang7-1/+59
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect amx-fp16. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_AMX_FP16_SET, OPTION_MASK_ISA2_AMX_FP16_UNSET): New macros. (ix86_handle_option): Handle -mamx-fp16. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_AMX_FP16. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for amx-fp16. * config.gcc: Add amxfp16intrin.h. * config/i386/cpuid.h (bit_AMX_FP16): New. * config/i386/i386-c.cc (ix86_target_macros_internal): Define __AMX_FP16__. * config/i386/i386-isa.def: Add DEF_PTA for AMX_FP16. * config/i386/i386-options.cc (isa2_opts): Add -mamx-fp16. (ix86_valid_target_attribute_inner_p): Add new ATTR. (ix86_option_override_internal): Handle AMX-FP16. * config/i386/i386.opt: Add -mamx-fp16. * config/i386/immintrin.h: Include amxfp16intrin.h. * doc/extend.texi: Document -mamx-fp16. * doc/invoke.texi: Document amx-fp16. * doc/sourcebuild.texi: Document amx_fp16. * config/i386/amxfp16intrin.h: New file. gcc/testsuite/ChangeLog: * g++.dg/other/i386-2.C: Add -mamx-fp16. * g++.dg/other/i386-3.C: Ditto. * gcc.target/i386/sse-12.c: Ditto. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * lib/target-supports.exp: (check_effective_target_amx_fp16): New proc. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/amx-check.h: Add AMX_FP16. * gcc.target/i386/amx-helper.h: New file to support amx-fp16. * gcc.target/i386/amxfp16-asmatt-1.c: New test. * gcc.target/i386/amxfp16-asmintel-1.c: Ditto. * gcc.target/i386/amxfp16-dpfp16ps-2.c: Ditto. Co-authored-by: Haochen Jiang <haochen.jiang@intel.com>
2022-11-04Initial Sierra Forest SupportHaochen Jiang4-1/+16
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Add Sierra Forest. * common/config/i386/i386-common.cc (processor_names): Add Sierra Forest. (processor_alias_table): Ditto. * common/config/i386/i386-cpuinfo.h (enum processor_types): Add INTEL_SIERRAFOREST. * config.gcc: Add -march=sierraforest. * config/i386/driver-i386.cc (host_detect_local_cpu): Handle Sierra Forest. * config/i386/i386-c.cc (ix86_target_macros_internal): Ditto. * config/i386/i386-options.cc (m_SIERRAFOREST): New define. (processor_cost_table): Add sierra forest. * config/i386/i386.h (enum processor_type): Add PROCESSOR_SIERRA_FOREST. (PTA_SIERRAFOREST): Ditto. * doc/extend.texi: Add sierra forest. * doc/invoke.texi: Ditto. gcc/testsuite/ChangeLog: * g++.target/i386/mv16.C: Add sierra forest. * gcc.target/i386/funcspec-56.inc: Handle new march.
2022-11-04Support Intel CMPccXADDHaochen Jiang11-2/+160
gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect cmpccxadd. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_CMPCCXADD_SET, OPTION_MASK_ISA2_CMPCCXADD_UNSET): New. (ix86_handle_option): Handle -mcmpccxadd. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_CMPCCXADD. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for cmpccxadd. * config.gcc: Add cmpccxaddintrin.h. * config/i386/cpuid.h (bit_CMPCCXADD): New. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE(INT, PINT, INT, INT, INT) and DEF_FUNCTION_TYPE(LONGLONG, PLONGLONG, LONGLONG, LONGLONG, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-c.cc (ix86_target_macros_internal): Define __CMPCCXADD__. * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Add new parameter to indicate constant position. Handle INT_FTYPE_PINT_INT_INT_INT and LONGLONG_FTYPE_PLONGLONG_LONGLONG_LONGLONG_INT. * config/i386/i386-isa.def (CMPCCXADD): Add DEF_PTA(CMPCCXADD). * config/i386/i386-options.cc (isa2_opts): Add -mcmpccxadd. (ix86_valid_target_attribute_inner_p): Handle cmpccxadd. * config/i386/i386.opt: Add option -mcmpccxadd. * config/i386/sync.md (cmpccxadd_<mode>): New define insn. * config/i386/x86gprintrin.h: Include cmpccxaddintrin.h. * doc/extend.texi: Document cmpccxadd. * doc/invoke.texi: Document -mcmpccxadd. * doc/sourcebuild.texi: Document target cmpccxadd. * config/i386/cmpccxaddintrin.h: New file. gcc/testsuite/ChangeLog: * g++.dg/other/i386-2.C: Add -mcmpccxadd. * g++.dg/other/i386-3.C: Ditto. * gcc.target/i386/avx-1.c: Ditto. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/sse-13.c: Add -mcmpccxadd. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/x86gprintrin-1.c: Ditto. * gcc.target/i386/x86gprintrin-2.c: Ditto. * gcc.target/i386/x86gprintrin-3.c: Ditto. * gcc.target/i386/x86gprintrin-4.c: Ditto. * gcc.target/i386/x86gprintrin-5.c: Ditto. * lib/target-supports.exp (check_effective_target_cmpccxadd): New. * gcc.target/i386/cmpccxadd-1.c: New test. * gcc.target/i386/cmpccxadd-2.c: Ditto.
2022-11-03amdgcn: Fix instruction generation for exp2 and log2 operationsKwok Cheung Yeung1-6/+14
The GCN instructions for the exp2 and log2 operations are v_exp_* and v_log_* respectively, which unfortunately do not line up with the RTL naming convention. To deal with this, a new set of int attributes is now used when generating the assembly for these instructions. 2022-11-03 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn-valu.md (math_unop_insn): New attribute. (<math_unop><mode>2, <math_unop><mode>2<exec>, <math_unop><mode>2, <math_unop><mode>2<exec>, *<math_unop><mode>2_insn, *<math_unop><mode>2<exec>_insn): Use math_unop_insn to generate assembler output. gcc/testsuite/ * gcc.target/gcn/unsafe-math-1.c: New.
2022-11-03i386: Fix uninitialized register after peephole2 conversion [PR107404]Uros Bizjak1-1/+2
The eliminate reg-reg move by inverting the condition of a cmove #2 peephole2 converts the following sequence: 473: bx:DI=[r14:DI*0x8+r12:DI] 960: r15:DI=r8:DI 485: {flags:CCC=cmp(r15:DI+bx:DI,bx:DI);r15:DI=r15:DI+bx:DI;} 737: r15:DI={(geu(flags:CCC,0))?r15:DI:bx:DI} to: 1110: {flags:CCC=cmp(r8:DI+bx:DI,bx:DI);r8:DI=r8:DI+bx:DI;} 1111: r15:DI=[r14:DI*0x8+r12:DI] 1112: r15:DI={(geu(flags:CCC,0))?r8:DI:r15:DI} Please note that(insn 1110) uses register BX, but its initialization was eliminated. Avoid conversion if eliminated move intialized a register, used in the moved instruction. 2022-11-03 Uroš Bizjak <ubizjak@gmail.com> gcc/ChangeLog: PR target/107404 * config/i386/i386.md (eliminate reg-reg move by inverting the condition of a cmove #2 peephole2): Check if eliminated move initialized a register, used in the moved instruction. gcc/testsuite/ChangeLog: PR target/107404 * g++.target/i386/pr107404.C: New test.
2022-11-03amdgcn: Fix duplicate conditionals [PR107510]Andrew Stubbs1-2/+0
Just a harmless cut-and-paste issue. PR target/107510 gcc/ChangeLog: * config/gcn/gcn.cc (gcn_expand_reduc_scalar): Remove duplicate UNSPEC_SMIN_DPP_SHR conditionals.
2022-11-02RISC-V: Add Zawrs ISA extension supportChristoph Müllner2-0/+6
This patch adds support for the Zawrs ISA extension. Zawrs has been ratified by the RISC-V BoD on Oct 20th, 2022. Binutils support has been merged as: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=eb668e50036e979fb0a74821df4eee0307b44e66 gcc/ChangeLog: * common/config/riscv/riscv-common.cc: Add zawrs extension. * config/riscv/riscv-opts.h (MASK_ZAWRS): New. (TARGET_ZAWRS): New. * config/riscv/riscv.opt: New. gcc/testsuite/ChangeLog: * gcc.target/riscv/zawrs.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
2022-11-02rs6000: Byte reverse V8HI on Power8 by vector rotation.Xionghu Luo2-7/+16
gcc/ PR target/100866 * config/rs6000/altivec.md: (*altivec_vrl<VI_char>): Named to... (altivec_vrl<VI_char>): ...this. * config/rs6000/vsx.md (revb_<mode>): Call vspltish and vrlh when target is Power8 and mode is V8HI. gcc/testsuite/ PR target/100866 * gcc.target/powerpc/pr100866-2.c: New.
2022-11-01i386: correct integer division modeling in znver.mdAlexander Monakov1-18/+21
In znver.md, division instructions have descriptions like (define_insn_reservation "znver1_idiv_DI" 41 (and (eq_attr "cpu" "znver1,znver2") (and (eq_attr "type" "idiv") (and (eq_attr "mode" "DI") (eq_attr "memory" "none")))) "znver1-double,znver1-ieu2*41") which says that DImode idiv has latency 41 (which is correct) and that it occupies 2nd integer execution unit for 41 consecutive cycles, but that is not correct: 1) the division instruction is partially pipelined, and has throughput 1/14, not 1/41; 2) for the most part it occupies a separate division unit, not the general arithmetic unit. Evidently, interaction of such 41-cycle paths with the rest of reservations causes a combinatorial explosion in the automaton. Fix this by modeling the integer division unit properly, and correcting reservations to use the measured reciprocal throughput of those instructions (available from uops.info). A similar correction for floating-point divisions is left for a followup patch. Top 5 znver table sizes, before: 68692 r znver1_ieu_check 68692 r znver1_ieu_transitions 99792 r znver1_ieu_min_issue_delay 428108 r znver1_fp_min_issue_delay 856216 r znver1_fp_transitions After: 1454 r znver1_ieu_translate 1454 r znver1_translate 2304 r znver1_ieu_transitions 428108 r znver1_fp_min_issue_delay 856216 r znver1_fp_transitions gcc/ChangeLog: PR target/87832 * config/i386/znver.md (znver1_idiv): New automaton. (znver1-idiv): New unit. (znver1_idiv_DI): Correct unit and cycles in the reservation. (znver1_idiv_SI): Ditto. (znver1_idiv_HI): Ditto. (znver1_idiv_QI): Ditto. (znver1_idiv_mem_DI): Ditto. (znver1_idiv_mem_SI): Ditto. (znver1_idiv_mem_HI): Ditto. (znver1_idiv_mem_QI): Ditto. (znver3_idiv_DI): Ditto. (znver3_idiv_SI): Ditto. (znver3_idiv_HI): Ditto. (znver3_idiv_QI): Ditto. (znver3_idiv_mem_DI): Ditto. (znver3_idiv_mem_SI): Ditto. (znver3_idiv_mem_HI): Ditto. (znver3_idiv_mem_QI): Ditto.
2022-11-01Fix incorrect digit constraintliuhongt2-84/+58
Matching constraints are used in these circumstances. More precisely, the two operands that match must include one input-only operand and one output-only operand. Moreover, the digit must be a smaller number than the number of the operand that uses it in the constraint. In pr107057, the 2 operands in the pattern are both input operands. gcc/ChangeLog: PR target/107057 * config/i386/sse.md (*vec_interleave_highv2df): Remove constraint 1. (*vec_interleave_lowv2df): Ditto. (vec_concatv2df): Ditto. (*avx512f_unpcklpd512<mask_name>): Ditto and renamed to .. (avx512f_unpcklpd512<mask_name>): .. this. (avx512f_movddup512<mask_name>): Change to define_insn. (avx_movddup256<mask_name>): Ditto. (*avx_unpcklpd256<mask_name>): Remove constraint 1 and renamed to .. (avx_unpcklpd256<mask_name>): .. this. * config/i386/i386.cc (ix86_vec_interleave_v2df_operator_ok): Disallow MEM_P (op1) && MEM_P (op2). gcc/testsuite/ChangeLog: * gcc.target/i386/pr107057.c: New test.
2022-11-01Enable more optimization for 32-bit/64-bit shrd/shld with imm shift count.liuhongt1-4/+146
This patch doens't handle variable count since it require 5 insns to be combined to get wanted pattern, but current pass_combine only supports at most 4. This patch doesn't handle 16-bit shrd/shld either. gcc/ChangeLog: PR target/55583 * config/i386/i386.md (*x86_64_shld_1): Rename to .. (x86_64_shld_1): .. this. (*x86_shld_1): Rename to .. (x86_shld_1): .. this. (*x86_64_shrd_1): Rename to .. (x86_64_shrd_1): .. this. (*x86_shrd_1): Rename to .. (x86_shrd_1): .. this. (*x86_64_shld_shrd_1_nozext): New pre_reload splitter. (*x86_shld_shrd_1_nozext): Ditto. (*x86_64_shrd_shld_1_nozext): Ditto. (*x86_shrd_shld_1_nozext): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr55583.c: New test.
2022-10-31RISC-V: Change constexpr back to CONSTEXPRJu-Zhe Zhong4-11/+11
According to https://github.com/gcc-mirror/gcc/commit/f95d3d5de72a1c43e8d529bad3ef59afc3214705. Since GCC 4.8.6 doesn't support constexpr, we should change it back to CONSTEXPR. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Change constexpr back to CONSTEXPR. * config/riscv/riscv-vector-builtins-shapes.cc (SHAPE): Ditto. * config/riscv/riscv-vector-builtins.cc (struct registered_function_hasher): Ditto. * config/riscv/riscv-vector-builtins.h (struct rvv_arg_type_info): Ditto.
2022-10-31amdgcn: add fmin/fmax patternsAndrew Stubbs2-0/+32
Add fmin/fmax for scalar, vector, and reductions. The smin/smax patterns are already using the IEEE compliant hardware instructions anyway, so we can just expand to use those insns. gcc/ChangeLog: * config/gcn/gcn-valu.md (fminmaxop): New iterator. (<fexpander><mode>3): New define_expand. (<fexpander><mode>3<exec>): Likewise. (reduc_<fexpander>_scal_<mode>): Likewise. * config/gcn/gcn.md (fexpander): New attribute.
2022-10-31amdgcn: multi-size vector reductionsAndrew Stubbs3-94/+45
Add support for vector reductions for any vector width by switching iterators and generalising the code slightly. There's no one-instruction way to move an item from lane 31 to lane 0 (63, 15, 7, 3, and 1 are all fine though), and vec_extract is probably fewer cycles anyway, so now we always reduce to an SGPR. gcc/ChangeLog: * config/gcn/gcn-valu.md (V64_SI): Delete iterator. (V64_DI): Likewise. (V64_1REG): Likewise. (V64_INT_1REG): Likewise. (V64_2REG): Likewise. (V64_ALL): Likewise. (V64_FP): Likewise. (reduc_<reduc_op>_scal_<mode>): Use V_ALL. Use gen_vec_extract. (fold_left_plus_<mode>): Use V_FP. (*<reduc_op>_dpp_shr_<mode>): Use V_1REG. (*<reduc_op>_dpp_shr_<mode>): Use V_DI. (*plus_carry_dpp_shr_<mode>): Use V_INT_1REG. (*plus_carry_in_dpp_shr_<mode>): Use V_SI. (*plus_carry_dpp_shr_<mode>): Use V_DI. (mov_from_lane63_<mode>): Delete. (mov_from_lane63_<mode>): Delete. * config/gcn/gcn.cc (gcn_expand_reduc_scalar): Support partial vectors. * config/gcn/gcn.md (unspec): Remove UNSPEC_MOV_FROM_LANE63.
2022-10-31amdgcn: Silence unused parameter warningAndrew Stubbs1-1/+1
gcc/ChangeLog: * config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen): Set base_type as ARG_UNUSED.
2022-10-31Support Intel AVX-NE-CONVERTkonglin113-25/+316
gcc/ChangeLog: * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_AVXNECONVERT_SET, OPTION_MASK_ISA2_AVXNECONVERT_UNSET): New. (ix86_handle_option): Handle -mavxneconvert, unset avxneconvert when avx2 is disabled. * common/config/i386/i386-cpuinfo.h (processor_types): Add FEATURE_AVXNECONVERT. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for avxneconvert. * common/config/i386/cpuinfo.h (get_available_features): Detect avxneconvert. * config.gcc: Add avxneconvertintrin.h * config/i386/avxneconvertintrin.h: New. * config/i386/avx512bf16vlintrin.h (_mm256_cvtneps_pbh): Unified builtin with avxneconvert. (_mm_cvtneps_pbh): Ditto. * config/i386/cpuid.h (bit_AVXNECONVERT): New. * config/i386/i386-builtin-types.def: Add DEF_POINTER_TYPE (PCV8HF, V8HF, CONST), DEF_POINTER_TYPE (PCV8BF, V8BF, CONST), DEF_POINTER_TYPE (PCV16HF, V16HF, CONST), DEF_POINTER_TYPE (PCV16BF, V16BF, CONST), DEF_FUNCTION_TYPE (V4SF, PCBFLOAT16), DEF_FUNCTION_TYPE (V4SF, PCFLOAT16), DEF_FUNCTION_TYPE (V8SF, PCBFLOAT16), DEF_FUNCTION_TYPE (V8SF, PCFLOAT16), DEF_FUNCTION_TYPE (V4SF, PCV8BF), DEF_FUNCTION_TYPE (V4SF, PCV8HF), DEF_FUNCTION_TYPE (V8SF, PCV16HF), DEF_FUNCTION_TYPE (V8SF, PCV16BF), * config/i386/i386-builtin.def: Add new builtins. * config/i386/i386-c.cc (ix86_target_macros_internal): Define __AVXNECONVERT__. * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Handle V4SF_FTYPE_PCBFLOAT16,V8SF_FTYPE_PCBFLOAT16, V4SF_FTYPE_PCFLOAT16, V8SF_FTYPE_PCFLOAT16,V4SF_FTYPE_PCV8BF, V4SF_FTYPE_PCV8HF,V8SF_FTYPE_PCV16BF,V8SF_FTYPE_PCV16HF. * config/i386/i386-isa.def : Add DEF_PTA(AVXNECONVERT) New. * config/i386/i386-options.cc (isa2_opts): Add -mavxneconvert. (ix86_valid_target_attribute_inner_p): Handle avxneconvert. * config/i386/i386.md: Add attr avx512bf16vl and avxneconvert. * config/i386/i386.opt: Add option -mavxneconvert. * config/i386/immintrin.h: Inculde avxneconvertintrin.h. * config/i386/sse.md (vbcstnebf162ps_<mode>): New define_insn. (vbcstnesh2ps_<mode>): Ditto. (vcvtnee<bf16_ph>2ps_<mode>):Ditto. (vcvtneo<bf16_ph>2ps_<mode>):Ditto. (vcvtneps2bf16_v4sf): Ditto. (*vcvtneps2bf16_v4sf): Ditto. (vcvtneps2bf16_v8sf): Ditto. * doc/invoke.texi: Document -mavxneconvert. * doc/extend.texi: Document avxneconvert. * doc/sourcebuild.texi: Document target avxneconvert. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-check.h: Add avxneconvert check. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/sse-12.c: Add -mavxneconvert. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * g++.dg/other/i386-2.C: Ditto. * g++.dg/other/i386-3.C: Ditto. * lib/target-supports.exp:add check_effective_target_avxneconvert. * gcc.target/i386/avx-ne-convert-1.c: New test. * gcc.target/i386/avx-ne-convert-vbcstnebf162ps-2.c: Ditto. * gcc.target/i386/avx-ne-convert-vbcstnesh2ps-2.c: Ditto. * gcc.target/i386/avx-ne-convert-vcvtneebf162ps-2.c: Ditto. * gcc.target/i386/avx-ne-convert-vcvtneeph2ps-2.c: Ditto. * gcc.target/i386/avx-ne-convert-vcvtneobf162ps-2.c: Ditto. * gcc.target/i386/avx-ne-convert-vcvtneoph2ps-2.c: Ditto. * gcc.target/i386/avx-ne-convert-vcvtneps2bf16-2.c: Ditto. * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1.c: Rename.. * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1a.c: To this. * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1b.c: New test.
2022-10-31i386:: using __bf16 for AVX512BF16 intrinsicskonglin17-117/+180
gcc/ChangeLog: * config/i386/avx512bf16intrin.h (__attribute__): Change short to bf16. (_mm_cvtsbh_ss): Ditto. (_mm512_cvtne2ps_pbh): Ditto. (_mm512_mask_cvtne2ps_pbh): Ditto. (_mm512_maskz_cvtne2ps_pbh): Ditto. * config/i386/avx512bf16vlintrin.h (__attribute__): Ditto. (_mm256_cvtne2ps_pbh): Ditto. (_mm256_mask_cvtne2ps_pbh): Ditto. (_mm256_maskz_cvtne2ps_pbh): Ditto. (_mm_cvtne2ps_pbh): Ditto. (_mm_mask_cvtne2ps_pbh): Ditto. (_mm_maskz_cvtne2ps_pbh): Ditto. (_mm_cvtness_sbh): Ditto. * config/i386/i386-builtin-types.def (V8BF): Add new DEF_VECTOR_TYPE for BFmode. (V16BF): Ditto. (V32BF): Ditto. * config/i386/i386-builtin.def (BDESC): Fixed builtins. * config/i386/i386-expand.cc (ix86_expand_args_builtin): Changed avx512bf16 ix86_builtin_func_type included HI to BF. * config/i386/immintrin.h: Add SSE2 depend for avx512bf16. * config/i386/sse.md (TARGET_AVX512VL): Changed HI vector to BF vector. (avx512f_cvtneps2bf16_v4sf): New define_expand. (*avx512f_cvtneps2bf16_v4sf): New define_insn. (avx512f_cvtneps2bf16_v4sf_maskz):Ditto. (avx512f_cvtneps2bf16_v4sf_mask): Ditto. (avx512f_cvtneps2bf16_v4sf_mask_1): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: Add fpmath option. * gcc.target/i386/avx512bf16-vdpbf16ps-2.c: Fixed scan-assembler. * gcc.target/i386/avx512bf16vl-cvtness2sbh-1.c: Add x/y suffix for vcvtneps2bf16. * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1.c: Ditto.
2022-10-31Enable V4BFmode and V2BFmode.liuhongt5-17/+30
Enable V4BFmode and V2BFmode with the same ABI as V4HFmode and V2HFmode. No real operation is supported for them except for movement. This should solve PR target/107261. Also I notice there's redundancy in VALID_AVX512FP16_REG_MODE, and remove V2BFmode remove it. gcc/ChangeLog: PR target/107261 * config/i386/i386-modes.def (VECTOR_MODE): Support V2BFmode. * config/i386/i386.cc (classify_argument): Handle V4BFmode and V2BFmode. (ix86_convert_const_vector_to_integer): Ditto. * config/i386/i386.h (VALID_AVX512FP16_REG_MODE): Remove V2BFmode. (VALID_SSE2_REG_MODE): Add V4BFmode and V2BFmode. (VALID_MMX_REG_MODE): Add V4BFmode. * config/i386/i386.md (mode): Add V4BF and V2BF. (MODE_SIZE): Ditto. * config/i386/mmx.md (MMXMODE) Add V4BF. (V_32): Add V2BF. (V_16_32_64): Add V4BF and V2BF. (mmxinsnmode): Add V4BF and V2BF. (*mov<mode>_internal): Hanlde V4BFmode and V2BFmode. gcc/testsuite/ChangeLog: * gcc.target/i386/pr107261.c: New test.
2022-10-29d: Make TARGET_D_MINFO_SECTION hooks in elfos.h the language default.Iain Buclaw4-32/+4
Removes the last of all TARGET_D_* macro definitions in common target headers. Now everything is either defined in the D language front-end, or D-specific target headers. gcc/ChangeLog: * config/darwin-d.cc (TARGET_D_MINFO_START_NAME): Rename to ... (TARGET_D_MINFO_SECTION_START): ...this. (TARGET_D_MINFO_END_NAME): Rename to ... (TARGET_D_MINFO_SECTION_END): ... this. * config/elfos.h (TARGET_D_MINFO_SECTION): Remove. (TARGET_D_MINFO_START_NAME): Remove. (TARGET_D_MINFO_END_NAME): Remove. * config/i386/cygwin-d.cc (TARGET_D_MINFO_SECTION): Remove. (TARGET_D_MINFO_START_NAME): Remove. (TARGET_D_MINFO_END_NAME): Remove. * config/i386/winnt-d.cc (TARGET_D_MINFO_SECTION): Remove. (TARGET_D_MINFO_START_NAME): Remove. (TARGET_D_MINFO_END_NAME): Remove. * doc/tm.texi: Regenerate. * doc/tm.texi.in (TARGET_D_MINFO_START_NAME): Rename to ... (TARGET_D_MINFO_SECTION_START): ...this. (TARGET_D_MINFO_END_NAME): Rename to ... (TARGET_D_MINFO_SECTION_END): ...this. gcc/d/ChangeLog: * d-target.def (d_minfo_section): Expand documentation of hook. Default initialize to "minfo". (d_minfo_start_name): Rename to ... (d_minfo_section_start): ... this. Default initialize to "__start_minfo". (d_minfo_end_name): Rename to ... (d_minfo_section_end): ... this. Default initialize to "__stop_minfo". * modules.cc (register_moduleinfo): Use new targetdm hook names.
2022-10-29d: Remove D-specific version definitions from target headersIain Buclaw19-91/+325
This splits up the targetdm sources so that each file only handles one target platform. Having all logic kept in the headers means that they could become out of sync when a new target is added (loongarch*-*-linux*) or accidentally broken if some headers in tm_file are changed about. gcc/ChangeLog: * config.gcc: Split out glibc-d.o into linux-d.o, kfreebsd-d.o, kopensolaris-d.o, and gnu-d.o. Split out cygwin-d.o from winnt-d.o. * config/arm/linux-eabi.h (EXTRA_TARGET_D_OS_VERSIONS): Remove. * config/gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Remove. * config/i386/cygwin.h (EXTRA_TARGET_D_OS_VERSIONS): Remove. * config/i386/linux-common.h (EXTRA_TARGET_D_OS_VERSIONS): Remove. * config/i386/mingw32.h (EXTRA_TARGET_D_OS_VERSIONS): Remove. * config/i386/t-cygming: Add cygwin-d.o. * config/i386/winnt-d.cc (winnt_d_os_builtins): Only add MinGW-specific version condition. * config/kfreebsd-gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Remove. * config/kopensolaris-gnu.h (GNU_USER_TARGET_D_OS_VERSIONS): Remove. * config/linux-android.h (ANDROID_TARGET_D_OS_VERSIONS): Remove. * config/linux.h (GNU_USER_TARGET_D_OS_VERSIONS): Remove. * config/mips/linux-common.h (EXTRA_TARGET_D_OS_VERSIONS): Remove. * config/t-glibc: Remove glibc-d.o, add gnu-d.o, kfreebsd-d.o, kopensolaris-d.o. * config/t-linux: Add linux-d.o. * config/glibc-d.cc: Remove file. * config/gnu-d.cc: New file. * config/i386/cygwin-d.cc: New file. * config/kfreebsd-d.cc: New file. * config/kopensolaris-d.cc: New file. * config/linux-d.cc: New file.
2022-10-28Fix signed vs unsigned issue in H8 portJeff Law2-2/+2
gcc/ * config/h8300/h8300.cc (pre_incdec_with_reg): Make reg argument an unsigned int * config/h8300/h8300-protos.h (pre_incdec_with_reg): Adjust prototype.
2022-10-28c: tree: target: C2x (...) function prototypes and va_start relaxationJoseph Myers23-49/+79
C2x allows function prototypes to be given as (...), a prototype meaning a variable-argument function with no named arguments. To allow such functions to access their arguments, requirements for va_start calls are relaxed so it ignores all but its first argument (i.e. subsequent arguments, if any, can be arbitrary pp-token sequences). Implement this feature accordingly. The va_start relaxation in <stdarg.h> is itself easy: __builtin_va_start already supports a second argument of 0 instead of a parameter name, and calls get converted internally to the form using 0 for that argument, so <stdarg.h> just needs changing to use a variadic macro that passes 0 as the second argument of __builtin_va_start. (This is done only in C2x mode, on the expectation that users of older standard would expect unsupported uses of va_start to be diagnosed.) For the (...) functions, it's necessary to distinguish these from unprototyped functions, whereas previously C++ (...) functions and unprototyped functions both used NULL TYPE_ARG_TYPES. A flag is added to tree_type_common to mark the (...) functions; as discussed on gcc@, doing things this way is likely to be safer for unchanged code in GCC than adding a different form of representation in TYPE_ARG_TYPES, or adding a flag that instead signals that the function is unprototyped. There was previously an option -fallow-parameterless-variadic-functions to enable support for (...) prototypes. The support was incomplete - it treated the functions as unprototyped, and only parsed some declarations, not e.g. "int g (int (...));". This option is changed into a no-op ignored option; (...) is always accepted syntactically, with a pedwarn_c11 call to given required diagnostics when appropriate. The peculiarity of a parameter list with __attribute__ followed by '...' being accepted with that option is removed. Interfaces in tree.cc that create function types are adjusted to set this flag as appropriate. It is of course possible that some existing users of the functions to create variable-argument functions actually wanted unprototyped functions in the no-named-argument case, rather than functions with a (...) prototype; some such cases in c-common.cc (for built-in functions and implicit function declarations) turn out to need updating for that reason. I didn't do anything to change how the C++ front end creates (...) function types. It's very likely there are unchanged places in the compiler that in fact turn out to need changes to work properly with (...) function prototypes. Target setup_incoming_varargs hooks, where they used the information passed about the last named argument, needed updating to avoid using that information in the (...) case. Note that apart from the x86 changes, I haven't done any testing of those target changes beyond building cc1 to check for syntax errors. It's possible further target-specific fixes will be needed; target maintainers should watch out for failures of c2x-stdarg-4.c or c2x-stdarg-split-1a.c, the execution tests, which would indicate that this feature is not working correctly. Those tests also verify the case where there are named arguments but the last named argument has a declaration that results in undefined behavior in previous C standard versions, such as a type changed by the default argument promotions. Bootstrapped with no regressions for x86_64-pc-linux-gnu. gcc/ * config/aarch64/aarch64.cc (aarch64_setup_incoming_varargs): Check TYPE_NO_NAMED_ARGS_STDARG_P. * config/alpha/alpha.cc (alpha_setup_incoming_varargs): Likewise. * config/arc/arc.cc (arc_setup_incoming_varargs): Likewise. * config/arm/arm.cc (arm_setup_incoming_varargs): Likewise. * config/csky/csky.cc (csky_setup_incoming_varargs): Likewise. * config/epiphany/epiphany.cc (epiphany_setup_incoming_varargs): Likewise. * config/fr30/fr30.cc (fr30_setup_incoming_varargs): Likewise. * config/frv/frv.cc (frv_setup_incoming_varargs): Likewise. * config/ft32/ft32.cc (ft32_setup_incoming_varargs): Likewise. * config/i386/i386.cc (ix86_setup_incoming_varargs): Likewise. * config/ia64/ia64.cc (ia64_setup_incoming_varargs): Likewise. * config/loongarch/loongarch.cc (loongarch_setup_incoming_varargs): Likewise. * config/m32r/m32r.cc (m32r_setup_incoming_varargs): Likewise. * config/mcore/mcore.cc (mcore_setup_incoming_varargs): Likewise. * config/mips/mips.cc (mips_setup_incoming_varargs): Likewise. * config/mmix/mmix.cc (mmix_setup_incoming_varargs): Likewise. * config/nds32/nds32.cc (nds32_setup_incoming_varargs): Likewise. * config/nios2/nios2.cc (nios2_setup_incoming_varargs): Likewise. * config/riscv/riscv.cc (riscv_setup_incoming_varargs): Likewise. * config/rs6000/rs6000-call.cc (setup_incoming_varargs): Likewise. * config/sh/sh.cc (sh_setup_incoming_varargs): Likewise. * config/visium/visium.cc (visium_setup_incoming_varargs): Likewise. * config/vms/vms-c.cc (vms_c_common_override_options): Do not set flag_allow_parameterless_variadic_functions. * doc/invoke.texi (-fallow-parameterless-variadic-functions): Do not document option. * function.cc (assign_parms): Call assign_parms_setup_varargs for TYPE_NO_NAMED_ARGS_STDARG_P case. * ginclude/stdarg.h [__STDC_VERSION__ > 201710L] (va_start): Make variadic macro. Pass second argument of 0 to __builtin_va_start. * target.def (setup_incoming_varargs): Update documentation. * doc/tm.texi: Regenerate. * tree-core.h (struct tree_type_common): Add no_named_args_stdarg_p. * tree-streamer-in.cc (unpack_ts_type_common_value_fields): Unpack TYPE_NO_NAMED_ARGS_STDARG_P. * tree-streamer-out.cc (pack_ts_type_common_value_fields): Pack TYPE_NO_NAMED_ARGS_STDARG_P. * tree.cc (type_cache_hasher::equal): Compare TYPE_NO_NAMED_ARGS_STDARG_P. (build_function_type): Add argument no_named_args_stdarg_p. (build_function_type_list_1, build_function_type_array_1) (reconstruct_complex_type): Update calls to build_function_type. (stdarg_p, prototype_p): Return true for (...) functions. (gimple_canonical_types_compatible_p): Compare TYPE_NO_NAMED_ARGS_STDARG_P. * tree.h (TYPE_NO_NAMED_ARGS_STDARG_P): New. (build_function_type): Update prototype. gcc/c-family/ * c-common.cc (def_fn_type): Call build_function_type for zero-argument variable-argument function. (c_common_nodes_and_builtins): Build default_function_type with build_function_type. * c.opt (fallow-parameterless-variadic-functions): Mark as ignored option. gcc/c/ * c-decl.cc (grokdeclarator): Pass arg_info->no_named_args_stdarg_p to build_function_type. (grokparms): Check arg_info->no_named_args_stdarg_p before converting () to (void). (build_arg_info): Initialize no_named_args_stdarg_p. (get_parm_info): Set no_named_args_stdarg_p. (start_function): Pass TYPE_NO_NAMED_ARGS_STDARG_P to build_function_type. (store_parm_decls): Count (...) functions as prototyped. * c-parser.cc (c_parser_direct_declarator): Allow '...' after open parenthesis to start parameter list. (c_parser_parms_list_declarator): Always allow '...' with no arguments, call pedwarn_c11 and set no_named_args_stdarg_p. * c-tree.h (struct c_arg_info): Add field no_named_args_stdarg_p. * c-typeck.cc (composite_type): Handle TYPE_NO_NAMED_ARGS_STDARG_P. (function_types_compatible_p): Compare TYPE_NO_NAMED_ARGS_STDARG_P. gcc/fortran/ * trans-types.cc (gfc_get_function_type): Do not use build_varargs_function_type_vec for unprototyped function. gcc/lto/ * lto-common.cc (compare_tree_sccs_1): Compare TYPE_NO_NAMED_ARGS_STDARG_P. gcc/objc/ * objc-next-runtime-abi-01.cc (build_next_objc_exception_stuff): Use build_function_type to build type of objc_setjmp_decl. gcc/testsuite/ * gcc.dg/c11-stdarg-1.c, gcc.dg/c11-stdarg-2.c, gcc.dg/c11-stdarg-3.c, gcc.dg/c2x-stdarg-1.c, gcc.dg/c2x-stdarg-2.c, gcc.dg/c2x-stdarg-3.c, gcc.dg/c2x-stdarg-4.c, gcc.dg/gnu2x-stdarg-1.c, gcc.dg/torture/c2x-stdarg-split-1a.c, gcc.dg/torture/c2x-stdarg-split-1b.c: New tests. * gcc.dg/Wold-style-definition-2.c, gcc.dg/format/sentinel-1.c: Update expected diagnostics. * gcc.dg/c2x-nullptr-1.c (test5): Cast unused parameter to (void). * gcc.dg/diagnostic-token-ranges.c: Use -pedantic. Expect warning in place of error.
2022-10-28Aarch64: Do not define DONT_USE_BUILTIN_SETJMPEric Botcazou1-4/+0
The setting looks obsolete at this point. gcc/ * config/aarch64/aarch64.h (DONT_USE_BUILTIN_SETJMP): Delete.
2022-10-27x86: Replace ne:CCC/ne:CCO with UNSPEC_CC_NE in neg patternsH.J. Lu1-20/+25
In i386.md, neg patterns which set MODE_CC register like (set (reg:CCC FLAGS_REG) (ne:CCC (match_operand:SWI48 1 "general_reg_operand") (const_int 0))) can lead to errors when operand 1 is a constant value. If FLAGS_REG in (set (reg:CCC FLAGS_REG) (ne:CCC (const_int 2) (const_int 0))) is set to 1, RTX simplifiers may simplify (set (reg:SI 93) (neg:SI (ltu:SI (reg:CCC 17 flags) (const_int 0 [0])))) as (set (reg:SI 93) (neg:SI (ltu:SI (const_int 1) (const_int 0 [0])))) which leads to incorrect results since LTU on MODE_CC register isn't the same as "unsigned less than" in x86 backend. To prevent RTL optimizers from setting MODE_CC register to a constant, use UNSPEC_CC_NE to replace ne:CCC/ne:CCO when setting FLAGS_REG in neg patterns. gcc/ PR target/107172 * config/i386/i386.md (UNSPEC_CC_NE): New. Replace ne:CCC/ne:CCO with UNSPEC_CC_NE in neg patterns. gcc/testsuite/ PR target/107172 * gcc.target/i386/pr107172.c: New test.
2022-10-27aarch64: Reinstate some uses of CONSTEXPRRichard Sandiford8-62/+62
In 9482a5e4eac8d696129ec2854b331e1bb5dbab42 I'd replaced uses of CONSTEXPR with direct uses of constexpr. However, it turns out that we still have CONSTEXPR for a reason: GCC 4.8 doesn't implement constexpr properly, and for example rejects things like: extern const int x; constexpr int x = 1; This patch partially reverts the previous one. To make things more complicated, there are still some things that need to be constexpr rather than CONSTEXPR, since they are used to initialise scalar constants. The patch therefore doesn't change anything in aarch64-feature-deps.h. gcc/ * config/aarch64/aarch64-protos.h: Replace constexpr with CONSTEXPR. * config/aarch64/aarch64-sve-builtins-base.cc: Likewise. * config/aarch64/aarch64-sve-builtins-functions.h: Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise. * config/aarch64/aarch64-sve-builtins-sve2.cc: Likewise. * config/aarch64/aarch64-sve-builtins.cc: Likewise. * config/aarch64/aarch64.cc: Likewise. * config/aarch64/driver-aarch64.cc: Likewise
2022-10-27RISC-V: Limit regs use for z*inx extension.Jiawei2-6/+20
Limit z*inx abi support with 'ilp32','ilp32e','lp64' only. Use GPR instead FPR when 'zfinx' enable, Only use even registers in RV32 when 'zdinx' enable. Enable FLOAT16 when Zhinx/Zhinxmin enabled. Co-Authored-By: Sinan Lin <sinan@isrc.iscas.ac.cn> gcc/ChangeLog: * config/riscv/constraints.md (TARGET_ZFINX ? GR_REGS): Set GPRS use while Zfinx is enable. * config/riscv/riscv.cc (riscv_hard_regno_mode_ok): Limit odd registers use when Zdinx enable in RV32 cases. (riscv_option_override): New target enable MASK_FDIV. (riscv_libgcc_floating_mode_supported_p): New error info when use incompatible arch&abi. (riscv_excess_precision): New target enable FLOAT16.
2022-10-27RISC-V: Target support for z*inx extension.Jiawei4-44/+46
Support 'TARGET_ZFINX' with float instruction pattern and builtin function. Reuse 'TARGET_HADR_FLOAT', 'TARGET_DOUBLE_FLOAT' and 'TARGET_ZHINX' patterns. gcc/ChangeLog: * config/riscv/iterators.md (TARGET_ZFINX):New target. (TARGET_ZDINX): Ditto. (TARGET_ZHINX): Ditto. * config/riscv/riscv-builtins.cc (AVAIL): Ditto. (riscv_atomic_assign_expand_fenv): Ditto. * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Ditto. * config/riscv/riscv.md: Ditto.
2022-10-27RISC-V: Minimal support of z*inx extension.Jiawei3-0/+18
Minimal support of z*inx extension, include 'zfinx', 'zdinx' and 'zhinx/zhinxmin' corresponding to 'f', 'd' and 'zfh/zfhmin', the 'zdinx' will imply 'zfinx' same as 'd' imply 'f', 'zhinx' will aslo imply 'zfinx', all zfinx extension imply 'zicsr'. Co-Authored-By: Sinan Lin <sinan@isrc.iscas.ac.cn> gcc/ChangeLog: * common/config/riscv/riscv-common.cc: New extensions. * config/riscv/arch-canonicalize: New imply relations. * config/riscv/riscv-opts.h (MASK_ZFINX): New mask. (MASK_ZDINX): Ditto. (MASK_ZHINX): Ditto. (MASK_ZHINXMIN): Ditto. (TARGET_ZFINX): New target. (TARGET_ZDINX): Ditto. (TARGET_ZHINX): Ditto. (TARGET_ZHINXMIN): Ditto. * config/riscv/riscv.opt: New target variable.
2022-10-26bpf: add preserve_field_info builtinDavid Faust3-75/+334
Add BPF __builtin_preserve_field_info. This builtin is used to extract information to facilitate struct and union relocations performed by the BPF loader, especially for bitfields. The builtin has the following signature: unsigned int __builtin_preserve_field_info (EXPR, unsigned int KIND); Where EXPR is an expression accessing a field of a struct or union. Depending on KIND, different information is returned to the program. The supported values for KIND are as follows: enum { FIELD_BYTE_OFFSET = 0, FIELD_BYTE_SIZE, FIELD_EXISTENCE, FIELD_SIGNEDNESS, FIELD_LSHIFT_U64, FIELD_RSHIFT_U64 }; If -mco-re is in effect (explicitly or implicitly specified), a CO-RE relocation is added for the access in EXPR recording the relevant information according to KIND. gcc/ * config/bpf/bpf.cc: Support __builtin_preserve_field_info. (enum bpf_builtins): Add new builtin. (bpf_init_builtins): Likewise. (bpf_core_field_info): New function. (bpf_expand_builtin): Accomodate new builtin. Refactor adding new relocation to... (maybe_make_core_relo): ... here. New function. (bpf_resolve_overloaded_builtin): Accomodate new builtin. (bpf_core_newdecl): Likewise. (bpf_core_walk): Likewise. (bpf_core_is_maybe_aggregate_access): Improve logic. (struct core_walk_data): New. * config/bpf/coreout.cc (bpf_core_reloc_add): Allow adding different relocation kinds. * config/bpf/coreout.h: Analogous change. * doc/extend.texi: Document BPF __builtin_preserve_field_info. gcc/testsuite/ * gcc.target/bpf/core-builtin-fieldinfo-errors-1.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-errors-2.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-existence-1.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-lshift-1-be.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-lshift-1-le.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-lshift-2.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-offset-1.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-rshift-1.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-rshift-2.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-sign-1.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-sign-2.c: New test. * gcc.target/bpf/core-builtin-fieldinfo-size-1.c: New test.
2022-10-26xtensa: Fix out-of-bounds array access in the movdi patternTakayuki 'January June' Suwa1-3/+4
The following new warnings were introduced in the commit 4f3f0296acbb ("xtensa: Prepare the transition from Reload to LRA"): gcc/config/xtensa/xtensa.md:945:26: error: array subscript 3 is above array bounds of 'rtx_def* [2]' [-Werror=array-bounds] 945 | emit_move_insn (operands[2], operands[3]); gcc/config/xtensa/xtensa.md:945:26: error: array subscript 2 is above array bounds of 'rtx_def* [2]' [-Werror=array-bounds] 945 | emit_move_insn (operands[2], operands[3]); From gcc/insn-emit.cc (generated by building): > /* ../../gcc/config/xtensa/xtensa.md:932 */ > rtx > gen_movdi (rtx operand0, > rtx operand1) > { > rtx_insn *_val = 0; > start_sequence (); > { > rtx operands[2]; // only 2 elements > operands[0] = operand0; > operands[1] = operand1; > #define FAIL return (end_sequence (), _val) > #define DONE return (_val = get_insns (), end_sequence (), _val) > #line 936 "../../gcc/config/xtensa/xtensa.md" > { > if (CONSTANT_P (operands[1])) > { > /* Split in halves if 64-bit Const-to-Reg moves > because of offering further optimization opportunities. */ > if (register_operand (operands[0], DImode)) > { > xtensa_split_DI_reg_imm (operands); // out-of-bounds! > emit_move_insn (operands[0], operands[1]); > emit_move_insn (operands[2], operands[3]); // out-of-bounds! > DONE; > } gcc/ChangeLog: * config/xtensa/xtensa.md (movdi): Copy operands[0...1] to ops[0...3] and then use the latter before calling xtensa_split_DI_reg_imm() and emitting insns.
2022-10-26RISC-V: Fix epilogue generation for barrier.Ju-Zhe Zhong1-2/+2
I noticed that I have made a mistake in previous patch: https://patchwork.sourceware.org/project/gcc/patch/20220817071950.271762-1-juzhe.zhong@rivai.ai/ The previous statement before this patch: bool need_barrier_p = (get_frame_size () + cfun->machine->frame.arg_pointer_offset) != 0; However, I changed it in the previous patch: bool need_barrier_p = known_ne (get_frame_size (), cfun->machine->frame.arg_pointer_offset); This is incorrect. Now, I correct this statement in this patch. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_expand_epilogue): Fix statement.
2022-10-26RISC-V: ADJUST_NUNITS according to -march.Ju-Zhe Zhong5-53/+50
This patch fixed PR107357: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107357 gcc/ChangeLog: PR target/107357 * config/riscv/riscv-modes.def (VECTOR_BOOL_MODE): Set to minimum size. (ADJUST_NUNITS): Adjust according to -march. (ADJUST_BYTESIZE): Ditto. * config/riscv/riscv-protos.h (riscv_v_ext_enabled_vector_mode_p): Remove. (riscv_v_ext_vector_mode_p): Change function implementation. * config/riscv/riscv-vector-builtins.cc (rvv_switcher::rvv_switcher): Change to riscv_v_ext_vector_mode_p. (register_builtin_type): Ditto. * config/riscv/riscv.cc (riscv_v_ext_vector_mode_p): Change to enabled modes. (ENTRY): Ditto. (riscv_v_ext_enabled_vector_mode_p): Remove. (riscv_v_adjust_nunits): New function. (riscv_vector_mode_supported_p): Use riscv_v_ext_vector_mode_p instead. * config/riscv/riscv.h (riscv_v_adjust_nunits): New function.
2022-10-26RISC-V: Support load/store in mov<mode> pattern for RVV modes.Ju-Zhe Zhong11-27/+645
gcc/ChangeLog: * config.gcc (riscv*): Add riscv-v.o to extra_objs. * config/riscv/constraints.md (vu): New constraint. (vi): Ditto. (Wc0): Ditto. (Wc1): Ditto. * config/riscv/predicates.md (vector_length_operand): New. (reg_or_mem_operand): Ditto. (vector_move_operand): Ditto. (vector_mask_operand): Ditto. (vector_merge_operand): Ditto. * config/riscv/riscv-protos.h (riscv_regmode_natural_size) New. (riscv_vector::const_vec_all_same_in_range_p): Ditto. (riscv_vector::legitimize_move): Ditto. (tail_policy): Ditto. (mask_policy): Ditto. * config/riscv/riscv-v.cc: New. * config/riscv/riscv-vector-builtins-bases.cc (vsetvl::expand): Refactor how LMUL encoding. * config/riscv/riscv.cc (riscv_print_operand): Update how LMUL print and mask operand print. (riscv_regmode_natural_size): New. * config/riscv/riscv.h (REGMODE_NATURAL_SIZE): New. * config/riscv/riscv.md (mode): Add vector modes. * config/riscv/t-riscv (riscv-v.o) New. * config/riscv/vector-iterators.md: New. * config/riscv/vector.md (vundefined<mode>): New. (mov<mode>): New. (*mov<mode>): New. (@vsetvl<mode>_no_side_effects): New. (@pred_mov<mode>): New. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/mov-1.c: New. * gcc.target/riscv/rvv/base/mov-10.c: New. * gcc.target/riscv/rvv/base/mov-11.c: New. * gcc.target/riscv/rvv/base/mov-12.c: New. * gcc.target/riscv/rvv/base/mov-13.c: New. * gcc.target/riscv/rvv/base/mov-2.c: New. * gcc.target/riscv/rvv/base/mov-3.c: New. * gcc.target/riscv/rvv/base/mov-4.c: New. * gcc.target/riscv/rvv/base/mov-5.c: New. * gcc.target/riscv/rvv/base/mov-6.c: New. * gcc.target/riscv/rvv/base/mov-7.c: New. * gcc.target/riscv/rvv/base/mov-8.c: New. * gcc.target/riscv/rvv/base/mov-9.c: New.
2022-10-26RISC-V: Recognized Svinval and Svnapot extensionsMonk Chiang2-0/+9
gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_ext_version_table): Add svinval and svnapot extension. (riscv_ext_flag_table): Ditto. * config/riscv/riscv-opts.h (MASK_SVINVAL): New. (MASK_SVNAPOT): Ditto. (TARGET_SVINVAL): Ditto. (TARGET_SVNAPOT): Ditto. * config/riscv/riscv.opt (riscv_sv_subext): New. gcc/testsuite/ChangeLog: * gcc.target/riscv/predef-24.c:New. * gcc.target/riscv/predef-25.c:New.
2022-10-26RISC-V: Adjust table indentation in commnet for riscv-modes.defJu-Zhe Zhong1-23/+23
gcc/ChangeLog: * config/riscv/riscv-modes.def: Adjust table indentation in commnet.
2022-10-26rs6000: cannot_force_const_mem for HIGH code rtx[PR106460]Jiufu Guo1-2/+5
As the issue in PR106460, a rtx 'high:DI (symbol_ref:DI ("var_48")' is tried to store into constant pool and ICE occur. But actually, this rtx represents partial incomplete address and can not be put into a .rodata section. This patch updates rs6000_cannot_force_const_mem to return true for rtx(s) with HIGH code, because these rtx(s) indicate part of address and are not ok for constant pool. Below are some examples: (high:DI (const:DI (plus:DI (symbol_ref:DI ("xx") (const_int 12 [0xc]))))) (high:DI (symbol_ref:DI ("var_1")..))) PR target/106460 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_cannot_force_const_mem): Return true for HIGH code rtx. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr106460.c: New test.