Age Commit message Author Files Lines
2021-09-08AVX512FP16: Add testcase for vector init and broadcast intrinsics.liuhongt23-1/+909
gcc/testsuite/ChangeLog: * gcc.target/i386/m512-check.h: Add union128h, union256h, union512h. * gcc.target/i386/avx512fp16-10a.c: New test. * gcc.target/i386/avx512fp16-10b.c: Ditto. * gcc.target/i386/avx512fp16-1a.c: Ditto. * gcc.target/i386/avx512fp16-1b.c: Ditto. * gcc.target/i386/avx512fp16-1c.c: Ditto. * gcc.target/i386/avx512fp16-1d.c: Ditto. * gcc.target/i386/avx512fp16-1e.c: Ditto. * gcc.target/i386/avx512fp16-2a.c: Ditto. * gcc.target/i386/avx512fp16-2b.c: Ditto. * gcc.target/i386/avx512fp16-2c.c: Ditto. * gcc.target/i386/avx512fp16-3a.c: Ditto. * gcc.target/i386/avx512fp16-3b.c: Ditto. * gcc.target/i386/avx512fp16-3c.c: Ditto. * gcc.target/i386/avx512fp16-4.c: Ditto. * gcc.target/i386/avx512fp16-5.c: Ditto. * gcc.target/i386/avx512fp16-6.c: Ditto. * gcc.target/i386/avx512fp16-7.c: Ditto. * gcc.target/i386/avx512fp16-8.c: Ditto. * gcc.target/i386/avx512fp16-9a.c: Ditto. * gcc.target/i386/avx512fp16-9b.c: Ditto. * gcc.target/i386/pr54855-13.c: Ditto. * gcc.target/i386/avx512fp16-vec_set_var.c: Ditto.
2021-09-08AVX512FP16: Support vector init/broadcast/set/extract for FP16.liuhongt8-132/+658
gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_set_ph): New intrinsic. (_mm256_set_ph): Likewise. (_mm512_set_ph): Likewise. (_mm_setr_ph): Likewise. (_mm256_setr_ph): Likewise. (_mm512_setr_ph): Likewise. (_mm_set1_ph): Likewise. (_mm256_set1_ph): Likewise. (_mm512_set1_ph): Likewise. (_mm_setzero_ph): Likewise. (_mm256_setzero_ph): Likewise. (_mm512_setzero_ph): Likewise. (_mm_set_sh): Likewise. (_mm_load_sh): Likewise. (_mm_store_sh): Likewise. * config/i386/i386-builtin-types.def (V8HF): New type. (DEF_FUNCTION_TYPE (V8HF, V8HI)): New builtin function type. * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate): Support vector HFmodes. (ix86_expand_vector_init_one_nonzero): Likewise. (ix86_expand_vector_init_one_var): Likewise. (ix86_expand_vector_init_interleave): Likewise. (ix86_expand_vector_init_general): Likewise. (ix86_expand_vector_set): Likewise. (ix86_expand_vector_extract): Likewise. (ix86_expand_vector_init_concat): Likewise. (ix86_expand_sse_movcc): Handle vector HFmodes. (ix86_expand_vector_set_var): Ditto. * config/i386/i386-modes.def: Add HF vector modes in comment. * config/i386/i386.c (classify_argument): Add HF vector modes. (ix86_hard_regno_mode_ok): Allow HF vector modes for AVX512FP16. (ix86_vector_mode_supported_p): Likewise. (ix86_set_reg_reg_cost): Handle vector HFmode. (ix86_get_ssemov): Handle vector HFmode. (function_arg_advance_64): Pass unnamed V16HFmode and V32HFmode on the stack. (function_arg_advance_32): Pass V8HF/V16HF/V32HF by sse reg for 32bit mode. (function_arg_advance_32): Ditto. * config/i386/i386.h (VALID_AVX512FP16_REG_MODE): New. (VALID_AVX256_REG_OR_OI_MODE): Rename to .. (VALID_AVX256_REG_OR_OI_VHF_MODE): .. this, and add V16HF. (VALID_SSE2_REG_VHF_MODE): New. (VALID_AVX512VL_128_REG_MODE): Add V8HF and TImode. (SSE_REG_MODE_P): Add vector HFmode. * config/i386/i386.md (mode): Add HF vector modes. (MODE_SIZE): Likewise. (ssemodesuffix): Add ph suffix for HF vector modes. 
* config/i386/sse.md (VFH_128): New mode iterator. (VMOVE): Adjust for HF vector modes. (V): Likewise. (V_256_512): Likewise. (avx512): Likewise. (avx512fmaskmode): Likewise. (shuffletype): Likewise. (sseinsnmode): Likewise. (ssedoublevecmode): Likewise. (ssehalfvecmode): Likewise. (ssehalfvecmodelower): Likewise. (ssePScmode): Likewise. (ssescalarmode): Likewise. (ssescalarmodelower): Likewise. (sseintprefix): Likewise. (i128): Likewise. (bcstscalarsuff): Likewise. (xtg_mode): Likewise. (VI12HF_AVX512VL): New mode_iterator. (VF_AVX512FP16): Likewise. (VIHF): Likewise. (VIHF_256): Likewise. (VIHF_AVX512BW): Likewise. (V16_256): Likewise. (V32_512): Likewise. (sseintmodesuffix): New mode_attr. (sse): Add scalar and vector HFmodes. (ssescalarmode): Add vector HFmode mapping. (ssescalarmodesuffix): Add sh suffix for HFmode. (*<sse>_vm<insn><mode>3): Use VFH_128. (*<sse>_vm<multdiv_mnemonic><mode>3): Likewise. (*ieee_<ieee_maxmin><mode>3): Likewise. (<avx512>_blendm<mode>): New define_insn. (vec_setv8hf): New define_expand. (vec_set<mode>_0): New define_insn for HF vector set. (*avx512fp16_movsh): Likewise. (avx512fp16_movsh): Likewise. (vec_extract_lo_v32hi): Rename to ... (vec_extract_lo_<mode>): ... this, and adjust to allow HF vector modes. (vec_extract_hi_v32hi): Likewise. (vec_extract_hi_<mode>): Likewise. (vec_extract_lo_v16hi): Likewise. (vec_extract_lo_<mode>): Likewise. (vec_extract_hi_v16hi): Likewise. (vec_extract_hi_<mode>): Likewise. (vec_set_hi_v16hi): Likewise. (vec_set_hi_<mode>): Likewise. (vec_set_lo_v16hi): Likewise. (vec_set_lo_<mode>): Likewise. (*vec_extract<mode>_0): New define_insn_and_split for HF vector extract. (*vec_extracthf): New define_insn. (VEC_EXTRACT_MODE): Add HF vector modes. (PINSR_MODE): Add V8HF. (sse2p4_1): Likewise. (pinsr_evex_isa): Likewise. (<sse2p4_1>_pinsr<ssemodesuffix>): Adjust to support insert for V8HFmode. (pbroadcast_evex_isa): Add HF vector modes. (AVX2_VEC_DUP_MODE): Likewise. (VEC_INIT_MODE): Likewise. 
(VEC_INIT_HALF_MODE): Likewise. (avx2_pbroadcast<mode>): Adjust to support HF vector mode broadcast. (avx2_pbroadcast<mode>_1): Likewise. (<avx512>_vec_dup<mode>_1): Likewise. (<avx512>_vec_dup<mode><mask_name>): Likewise. (<mask_codefor><avx512>_vec_dup_gpr<mode><mask_name>): Likewise.
2021-09-08AVX512FP16: Initial support for AVX512FP16 feature and scalar _Float16 ↵Guo, Xuepeng41-76/+561
instructions. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect FEATURE_AVX512FP16. * common/config/i386/i386-common.c (OPTION_MASK_ISA_AVX512FP16_SET, OPTION_MASK_ISA_AVX512FP16_UNSET, OPTION_MASK_ISA2_AVX512FP16_SET, OPTION_MASK_ISA2_AVX512FP16_UNSET): New. (OPTION_MASK_ISA2_AVX512BW_UNSET, OPTION_MASK_ISA2_AVX512BF16_UNSET): Add AVX512FP16. (ix86_handle_option): Handle -mavx512fp16. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_AVX512FP16. * common/config/i386/i386-isas.h: Add entry for AVX512FP16. * config.gcc: Add avx512fp16intrin.h. * config/i386/avx512fp16intrin.h: New intrinsic header. * config/i386/cpuid.h: Add bit_AVX512FP16. * config/i386/i386-builtin-types.def: (FLOAT16): New primitive type. * config/i386/i386-builtins.c: Support _Float16 type for i386 backend. (ix86_register_float16_builtin_type): New function. (ix86_float16_type_node): New. * config/i386/i386-c.c (ix86_target_macros_internal): Define __AVX512FP16__. * config/i386/i386-expand.c (ix86_expand_branch): Support HFmode. (ix86_prepare_fp_compare_args): Adjust TARGET_SSE_MATH && SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P. (ix86_expand_fp_movcc): Ditto. * config/i386/i386-isa.def: Add PTA define for AVX512FP16. * config/i386/i386-options.c (isa2_opts): Add -mavx512fp16. (ix86_valid_target_attribute_inner_p): Add avx512fp16 attribute. * config/i386/i386.c (ix86_get_ssemov): Use vmovdqu16/vmovw/vmovsh for HFmode/HImode scalar or vector. (ix86_get_excess_precision): Use FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when TARGET_AVX512FP16 is enabled. (sse_store_index): Use SFmode cost for HFmode cost. (inline_memory_move_cost): Add HFmode, and prefer SSE cost over GPR cost for HFmode. (ix86_hard_regno_mode_ok): Allow HImode in sse register. (ix86_mangle_type): Add mangling for _Float16 type. (inline_secondary_memory_needed): No memory is needed for 16bit movement between gpr and sse reg under TARGET_AVX512FP16. 
(ix86_multiplication_cost): Adjust TARGET_SSE_MATH && SSE_FLOAT_MODE_P to SSE_FLOAT_MODE_SSEMATH_OR_HF_P. (ix86_division_cost): Ditto. (ix86_rtx_costs): Ditto. (ix86_add_stmt_cost): Ditto. (ix86_optab_supported_p): Ditto. * config/i386/i386.h (VALID_AVX512F_SCALAR_MODE): Add HFmode. (SSE_FLOAT_MODE_SSEMATH_OR_HF_P): Add HFmode. (PTA_SAPPHIRERAPIDS): Add PTA_AVX512FP16. * config/i386/i386.md (mode): Add HFmode. (MODE_SIZE): Add HFmode. (isa): Add avx512fp16. (enabled): Handle avx512fp16. (ssemodesuffix): Add sh suffix for HFmode. (comm): Add mult, div. (plusminusmultdiv): New code iterator. (insn): Add mult, div. (*movhf_internal): Adjust for avx512fp16 instruction. (*movhi_internal): Ditto. (*cmpi<unord>hf): New define_insn for HFmode. (*ieee_s<ieee_maxmin>hf3): Likewise. (extendhf<mode>2): Likewise. (trunc<mode>hf2): Likewise. (float<floatunssuffix><mode>hf2): Likewise. (*<insn>hf): Likewise. (cbranchhf4): New expander. (movhfcc): Likewise. (<insn>hf3): Likewise. (mulhf3): Likewise. (divhf3): Likewise. * config/i386/i386.opt: Add mavx512fp16. * config/i386/immintrin.h: Include avx512fp16intrin.h. * doc/invoke.texi: Add mavx512fp16. * doc/extend.texi: Add avx512fp16 Usage Notes. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add -mavx512fp16 in dg-options. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/avx512-check.h: Check cpuid for AVX512FP16. * gcc.target/i386/funcspec-56.inc: Add new target attribute check. * gcc.target/i386/sse-13.c: Add -mavx512fp16. * gcc.target/i386/sse-14.c: Ditto. * gcc.target/i386/sse-22.c: Ditto. * gcc.target/i386/sse-23.c: Ditto. * lib/target-supports.exp: (check_effective_target_avx512fp16): New. * g++.target/i386/float16-1.C: New test. * g++.target/i386/float16-2.C: Ditto. * g++.target/i386/float16-3.C: Ditto. * gcc.target/i386/avx512fp16-12a.c: Ditto. * gcc.target/i386/avx512fp16-12b.c: Ditto. * gcc.target/i386/float16-3a.c: Ditto. * gcc.target/i386/float16-3b.c: Ditto. * gcc.target/i386/float16-4a.c: Ditto. 
* gcc.target/i386/float16-4b.c: Ditto. * gcc.target/i386/pr54855-12.c: Ditto. * g++.dg/other/i386-2.C: Ditto. * g++.dg/other/i386-3.C: Ditto. Co-Authored-By: H.J. Lu <hongjiu.lu@intel.com> Co-Authored-By: Liu Hongtao <hongtao.liu@intel.com> Co-Authored-By: Wang Hongyu <hongyu.wang@intel.com> Co-Authored-By: Xu Dianhong <dianhong.xu@intel.com>
2021-09-08Support -fexcess-precision=16 which will enable ↵liuhongt19-17/+76
FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when backend supports _Float16. gcc/ada/ChangeLog: * gcc-interface/misc.c (gnat_post_options): Issue an error for -fexcess-precision=16. gcc/c-family/ChangeLog: * c-common.c (excess_precision_mode_join): Update below comments. (c_ts18661_flt_eval_method): Set excess_precision_type to EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16. * c-cppbuiltin.c (cpp_atomic_builtins): Update below comments. (c_cpp_flt_eval_method_iec_559): Set excess_precision_type to EXCESS_PRECISION_TYPE_FLOAT16 when -fexcess-precision=16. gcc/ChangeLog: * common.opt: Support -fexcess-precision=16. * config/aarch64/aarch64.c (aarch64_excess_precision): Return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16 when EXCESS_PRECISION_TYPE_FLOAT16. * config/arm/arm.c (arm_excess_precision): Ditto. * config/i386/i386.c (ix86_get_excess_precision): Ditto. * config/m68k/m68k.c (m68k_excess_precision): Issue an error when EXCESS_PRECISION_TYPE_FLOAT16. * config/s390/s390.c (s390_excess_precision): Ditto. * coretypes.h (enum excess_precision_type): Add EXCESS_PRECISION_TYPE_FLOAT16. * doc/tm.texi (TARGET_C_EXCESS_PRECISION): Update documents. * doc/tm.texi.in (TARGET_C_EXCESS_PRECISION): Ditto. * doc/extend.texi (Half-Precision): Document -fexcess-precision=16. * flag-types.h (enum excess_precision): Add EXCESS_PRECISION_FLOAT16. * target.def (excess_precision): Update document. * tree.c (excess_precision_type): Set excess_precision_type to EXCESS_PRECISION_FLOAT16 when -fexcess-precision=16. gcc/fortran/ChangeLog: * options.c (gfc_post_options): Issue an error for -fexcess-precision=16. gcc/testsuite/ChangeLog: * gcc.target/i386/float16-6.c: New test. * gcc.target/i386/float16-7.c: New test.
2021-09-08Adjust the wording for x86 _Float16 type.liuhongt1-14/+13
gcc/ChangeLog: * doc/extend.texi: (@node Floating Types): Adjust the wording. (@node Half-Precision): Ditto.
2021-09-08Daily bump.GCC Administrator9-1/+282
2021-09-07gcc: xtensa: fix PR target/102115Max Filippov1-1/+2
2021-09-07 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp> gcc/ PR target/102115 * config/xtensa/xtensa.c (xtensa_emit_move_sequence): Add 'CONST_INT_P (src)' to the condition of the block that tries to eliminate literal when loading integer constant.
2021-09-07runtime: use hash32, not hash64, for amd64p32, mips64p32, mips64p32leIan Lance Taylor3-5/+5
Fixes PR go/102102 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/348015
2021-09-07doc: BPF CO-RE documentationDavid Faust2-1/+28
Document the new command line options (-mco-re and -mno-co-re), the new BPF target builtin (__builtin_preserve_access_index), and the new BPF target attribute (preserve_access_index) introduced with BPF CO-RE. gcc/ChangeLog: * doc/extend.texi (BPF Type Attributes) New node. Document new preserve_access_index attribute. Document new preserve_access_index builtin. * doc/invoke.texi: Document -mco-re and -mno-co-re options.
2021-09-07bpf testsuite: Add BPF CO-RE testsDavid Faust8-0/+274
This commit adds several tests for the new BPF CO-RE functionality to the BPF target testsuite. gcc/testsuite/ChangeLog: * gcc.target/bpf/core-attr-1.c: New test. * gcc.target/bpf/core-attr-2.c: Likewise. * gcc.target/bpf/core-attr-3.c: Likewise. * gcc.target/bpf/core-attr-4.c: Likewise. * gcc.target/bpf/core-builtin-1.c: Likewise. * gcc.target/bpf/core-builtin-2.c: Likewise. * gcc.target/bpf/core-builtin-3.c: Likewise. * gcc.target/bpf/core-section-1.c: Likewise.
2021-09-07bpf: BPF CO-RE supportDavid Faust7-0/+1094
This commit introduces support for BPF Compile Once - Run Everywhere (CO-RE) in GCC. gcc/ChangeLog: * config/bpf/bpf.c: Adjust includes. (bpf_handle_preserve_access_index_attribute): New function. (bpf_attribute_table): Use it here. (bpf_builtins): Add BPF_BUILTIN_PRESERVE_ACCESS_INDEX. (bpf_option_override): Handle "-mco-re" option. (bpf_asm_init_sections): New. (TARGET_ASM_INIT_SECTIONS): Redefine. (bpf_file_end): New. (TARGET_ASM_FILE_END): Redefine. (bpf_init_builtins): Add "__builtin_preserve_access_index". (bpf_core_compute, bpf_core_get_index): New. (is_attr_preserve_access): New. (bpf_expand_builtin): Handle new builtins. (bpf_core_newdecl, bpf_core_is_maybe_aggregate_access): New. (bpf_core_walk): New. (bpf_resolve_overloaded_builtin): New. (TARGET_RESOLVE_OVERLOADED_BUILTIN): Redefine. (handle_attr): New. (pass_bpf_core_attr): New RTL pass. * config/bpf/bpf-passes.def: New file. * config/bpf/bpf-protos.h (make_pass_bpf_core_attr): New. * config/bpf/coreout.c: New file. * config/bpf/coreout.h: Likewise. * config/bpf/t-bpf (TM_H): Add $(srcdir)/config/bpf/coreout.h. (coreout.o): New rule. (PASSES_EXTRA): Add $(srcdir)/config/bpf/bpf-passes.def. * config.gcc (bpf): Add coreout.h to extra_headers. Add coreout.o to extra_objs. Add $(srcdir)/config/bpf/coreout.c to target_gtfiles.
2021-09-07btf: expose get_btf_idDavid Faust2-1/+2
Expose the function get_btf_id, so that it may be used by the BPF backend. This enables the BPF CO-RE machinery in the BPF backend to look up BTF type IDs, in order to create CO-RE relocation records. A prototype is added in ctfc.h. gcc/ChangeLog: * btfout.c (get_btf_id): Function is no longer static. * ctfc.h: Expose it here.
2021-09-07ctfc: add function to lookup CTF ID of a TREE typeDavid Faust2-0/+18
Add a new function, ctf_lookup_tree_type, to return the CTF type ID associated with a type via its TREE node. The function is exposed via a prototype in ctfc.h. gcc/ChangeLog: * ctfc.c (ctf_lookup_tree_type): New function. * ctfc.h: Likewise.
2021-09-07ctfc: externalize ctf_dtd_lookupDavid Faust2-2/+5
Expose the function ctf_dtd_lookup, so that it can be used by the BPF CO-RE machinery. The function is no longer static, and an extern prototype is added in ctfc.h. gcc/ChangeLog: * ctfc.c (ctf_dtd_lookup): Function is no longer static. * ctfc.h: Analogous change.
2021-09-07dwarf: externalize lookup_type_dieDavid Faust2-2/+2
Expose the function lookup_type_die in dwarf2out, so that it can be used by CTF/BTF when adding BPF CO-RE information. The function is now non-static, and an extern prototype is added in dwarf2out.h. gcc/ChangeLog: * dwarf2out.c (lookup_type_die): Function is no longer static. * dwarf2out.h: Expose it here.
2021-09-07Fix fatal typo in gcc.dg/no_profile_instrument_function-attr-2.cHans-Peter Nilsson1-1/+1
Dejagnu is unfortunately brittle: a syntax error in a directive can abort the test-run for the current "tool" (gcc, g++, gfortran), and if you don't check for this condition or actually read the stdout log yourself, your tools may make you believe the test was successful without regressions. At the very least, always grep for ^ERROR: in the stdout log! With r12-3379, the testsuite got such a fatal syntax error, causing the gcc test-run to abort at (e.g.): ... FAIL: gcc.dg/memchr.c (test for excess errors) FAIL: gcc.dg/memcmp-3.c (test for excess errors) ERROR: (DejaGnu) proc "scan-tree-dump-not\" = foo {\(\)"} optimized" does not exist. The error code is TCL LOOKUP COMMAND scan-tree-dump-not\" The info on the error is: invalid command name "scan-tree-dump-not"" while executing "::tcl_unknown scan-tree-dump-not\" = foo {\(\)"} optimized" ("uplevel" body line 1) invoked from within "uplevel 1 ::tcl_unknown $args" === gcc Summary === # of expected passes 63740 # of unexpected failures 38 # of unexpected successes 2 # of expected failures 351 # of unresolved testcases 3 # of unsupported tests 662 x/cris-elf/gccobj/gcc/xgcc version 12.0.0 20210907 (experimental)\ [master r12-3391-g849d5f5929fc] (GCC) testsuite: * gcc.dg/no_profile_instrument_function-attr-2.c: Fix typo in last change.
2021-09-07Fortran - improve error recovery determining array element from constructorHarald Anlauf2-4/+14
gcc/fortran/ChangeLog: PR fortran/101327 * expr.c (find_array_element): When bounds cannot be determined as constant, return error instead of aborting. gcc/testsuite/ChangeLog: PR fortran/101327 * gfortran.dg/pr101327.f90: New test.
2021-09-07dwarf2out: Emit BTF in dwarf2out_finish for BPF CO-RE usecaseIndu Bhagat3-16/+51
DWARF generation is split between early and late phases when LTO is in effect. This poses challenges for CTF/BTF generation, especially if late debug info generation is desirable, as turns out to be the case for BPF CO-RE. The approach taken in this patch is: 1. LTO is disabled for BPF CO-RE. The reason to disable LTO for BPF CO-RE is that if LTO is in effect, BPF CO-RE relocations need to be generated in the LTO link phase _after_ the optimizations are done. This means we need to devise a way to combine early and late BTF. At this time, in the absence of linker support for BTF sections, it makes sense to steer clear of LTO for BPF CO-RE and bypass the issue. 2. The BPF backend updates write_symbols with BTF_WITH_CORE_DEBUG to convey that BTF with CO-RE support needs to be generated. This information is used by the debug info emission routines to defer the emission of BTF/CO-RE until dwarf2out_finish. So, in other words, dwarf2out_early_finish - Always emit CTF here. - if (BTF && !BTF_WITH_CORE), emit BTF now. dwarf2out_finish - if (BTF_WITH_CORE) emit BTF now. gcc/ChangeLog: * dwarf2ctf.c (ctf_debug_finalize): Make it static. (ctf_debug_early_finish): New definition. (ctf_debug_finish): Likewise. * dwarf2ctf.h (ctf_debug_finalize): Remove declaration. (ctf_debug_early_finish): New declaration. (ctf_debug_finish): Likewise. * dwarf2out.c (dwarf2out_finish): Invoke ctf_debug_finish. (dwarf2out_early_finish): Invoke ctf_debug_early_finish.
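The early/late split described above can be sketched in C as follows. This is only an illustration of the decision logic: the enum values and the function names emit_early and emit_late are hypothetical stand-ins, not the definitions from flag-types.h or dwarf2ctf.c, and "always emit CTF" is modelled here as "emit CTF whenever it was requested".

```c
/* Hypothetical stand-ins for the debug-format bits; the real
   bitmasks live in GCC's flag-types.h and differ in detail.  */
enum sketch_debug_bits {
    SKETCH_CTF           = 1 << 0,
    SKETCH_BTF           = 1 << 1,
    SKETCH_BTF_WITH_CORE = 1 << 2
};

/* What the early phase (dwarf2out_early_finish in the text) emits:
   CTF whenever requested, and BTF only when CO-RE is not in use.  */
unsigned emit_early(unsigned write_symbols)
{
    unsigned emitted = 0;
    if (write_symbols & SKETCH_CTF)
        emitted |= SKETCH_CTF;
    if ((write_symbols & SKETCH_BTF)
        && !(write_symbols & SKETCH_BTF_WITH_CORE))
        emitted |= SKETCH_BTF;
    return emitted;
}

/* What the late phase (dwarf2out_finish in the text) emits:
   BTF, but only when it was deferred because of CO-RE.  */
unsigned emit_late(unsigned write_symbols)
{
    return (write_symbols & SKETCH_BTF_WITH_CORE) ? SKETCH_BTF : 0;
}
```

In this model BTF is emitted exactly once for any combination of flags: either early (plain BTF) or late (BTF with CO-RE).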
2021-09-07bpf: Add new -mco-re option for BPF CO-REIndu Bhagat3-0/+38
-mco-re in the BPF backend enables code generation for the CO-RE usecase. LTO is disabled for CO-RE compilations. gcc/ChangeLog: * config/bpf/bpf.c (bpf_option_override): For BPF backend, disable LTO support when compiling for CO-RE. * config/bpf/bpf.opt: Add new command line option -mco-re. gcc/testsuite/ChangeLog: * gcc.target/bpf/core-lto-1.c: New test.
2021-09-07debug: Add BTF_WITH_CORE_DEBUG debug formatIndu Bhagat3-1/+17
To best handle BTF/CO-RE in GCC, a distinct BTF_WITH_CORE_DEBUG debug format is being added. This helps the compiler detect whether BTF with CO-RE relocations needs to be emitted. gcc/ChangeLog: * flag-types.h (enum debug_info_type): Add new enum DINFO_TYPE_BTF_WITH_CORE. (BTF_WITH_CORE_DEBUG): New bitmask. * flags.h (btf_with_core_debuginfo_p): New declaration. * opts.c (btf_with_core_debuginfo_p): New definition.
2021-09-07tree: Change error_operand_p to an inline functionJason Merrill1-4/+7
I've thought for a while that many of the macros in tree.h and such should become inline functions. This one in particular was confusing Coverity; the null check in the macro made it think that all code guarded by error_operand_p would also need null checks. gcc/ChangeLog: * tree.h (error_operand_p): Change to inline function.
2021-09-07c++: Fix up constexpr evaluation of deleting dtors [PR100495]Jakub Jelinek2-2/+19
We do not save bodies of constexpr clones and instead evaluate the bodies of the constexpr functions they were cloned from. I believe that is just fine for constructors because complete vs. base ctors differ only in classes that have virtual bases and such constructors aren't constexpr, similarly complete/base destructors. But as the testcase below shows, for deleting destructors it is not fine, deleting dtors while marked as clones in fact are just artificial functions with synthetized body which calls the user destructor and deallocation. So, either we'd need to evaluate the destructor and afterwards synthetize and evaluate the deallocation, or we can just save and use the deleting dtors bodies. The latter seems much easier to me. 2021-09-07 Jakub Jelinek <jakub@redhat.com> PR c++/100495 * constexpr.c (maybe_save_constexpr_fundef): Save body even for constexpr deleting dtors. (cxx_eval_call_expression): Don't use DECL_CLONED_FUNCTION for deleting dtors. * g++.dg/cpp2a/constexpr-new21.C: New test.
2021-09-07libgomp.texi: Extend OpenMP 5.0 Implementation StatusTobias Burnus1-3/+93
libgomp/ * libgomp.texi (OpenMP Implementation Status): Extend OpenMP 5.0 section. (OpenACC Profiling Interface): Fix typo.
2021-09-07Rename forwarder_block_p in threading code to empty_block_with_phis_p.Aldy Hernandez1-5/+4
gcc/ChangeLog: * tree-ssa-threadedge.c (forwarder_block_p): Rename to... (empty_block_with_phis_p): ...this. (potentially_threadable_block): Same. (jump_threader::thread_through_normal_block): Same.
2021-09-07libgfortran: Makefile fix for ISO_Fortran_binding.hTobias Burnus2-12/+8
libgfortran/ChangeLog: * Makefile.am (gfor_built_src): Depend on include/ISO_Fortran_binding.h not on ISO_Fortran_binding.h. (ISO_Fortran_binding.h): Rename make target to ... (include/ISO_Fortran_binding.h): ... this. * Makefile.in: Regenerate.
2021-09-07Fix PR debug/101947Eric Botcazou1-8/+44
This is the recent LTO bootstrap failure with Ada enabled. The compiler now generates DW_OP_deref_type for a unit of the Ada front-end, which means that the offset of base types in the CU must be computed during early DWARF too. gcc/ PR debug/101947 * dwarf2out.c (mark_base_types): New overloaded function. (dwarf2out_early_finish): Invoke it on the COMDAT type list as well as the compilation unit, and call move_marked_base_types afterward.
2021-09-07x86: Enable FMA in unsigned SI to SF expandersH.J. Lu7-12/+94
Enable FMA in scalar/vector unsigned SI to SF expanders. Don't check TARGET_AVX512F which has vcvtusi2ss and vcvtudq2ps instructions. gcc/ PR target/85819 * config/i386/i386-expand.c (ix86_expand_convert_uns_sisf_sse): Enable FMA. (ix86_expand_vector_convert_uns_vsivsf): Likewise. gcc/testsuite/ PR target/85819 * gcc.target/i386/pr85819-1a.c: New test. * gcc.target/i386/pr85819-1b.c: Likewise. * gcc.target/i386/pr85819-2a.c: Likewise. * gcc.target/i386/pr85819-2b.c: Likewise. * gcc.target/i386/pr85819-2c.c: Likewise. * gcc.target/i386/pr85819-3.c: Likewise.
2021-09-07tree-optimization/102226 - fix epilogue vector re-useRichard Biener2-2/+31
This fixes re-use of the reduction value in epilogue vectorization when a conversion from/to variable length vectors is required. 2021-09-07 Richard Biener <rguenther@suse.de> PR tree-optimization/102226 * tree-vect-loop.c (vect_transform_cycle_phi): Record the converted value for the epilogue PHI use. * g++.dg/vect/pr102226.cc: New testcase.
2021-09-07C, C++, Fortran, OpenMP: Add support for 'flush seq_cst' construct.Marcel Vollweiler12-16/+56
This patch adds support for the 'seq_cst' memory order clause on the 'flush' directive which was introduced in OpenMP 5.1. gcc/c-family/ChangeLog: * c-omp.c (c_finish_omp_flush): Handle MEMMODEL_SEQ_CST. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_flush): Parse 'seq_cst' clause on 'flush' directive. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_flush): Parse 'seq_cst' clause on 'flush' directive. * semantics.c (finish_omp_flush): Handle MEMMODEL_SEQ_CST. gcc/fortran/ChangeLog: * openmp.c (gfc_match_omp_flush): Parse 'seq_cst' clause on 'flush' directive. * trans-openmp.c (gfc_trans_omp_flush): Handle OMP_MEMORDER_SEQ_CST. gcc/testsuite/ChangeLog: * c-c++-common/gomp/flush-1.c: Add test case for 'seq_cst'. * c-c++-common/gomp/flush-2.c: Add test case for 'seq_cst'. * g++.dg/gomp/attrs-1.C: Adapt test to handle all flush clauses. * g++.dg/gomp/attrs-2.C: Adapt test to handle all flush clauses. * gfortran.dg/gomp/flush-1.f90: Add test case for 'seq_cst'. * gfortran.dg/gomp/flush-2.f90: Add test case for 'seq_cst'.
2021-09-07inline: do not einline when no_profile_instrument_function is differentMartin Liska2-0/+32
PR gcov-profile/80223 gcc/ChangeLog: * ipa-inline.c (can_inline_edge_p): Similarly to sanitizer options, do not inline when no_profile_instrument_function attributes are different in early inliner. It's fine to inline it after PGO instrumentation. gcc/testsuite/ChangeLog: * gcc.dg/no_profile_instrument_function-attr-2.c: New test.
2021-09-07tree-optimization/101555 - avoid redundant alias queries in PRERichard Biener1-60/+37
This avoids doing redundant work during PHI translation to invalidate mems when translating their corresponding VUSE through the block's virtual PHI node. All the invalidation work is already done by prune_clobbered_mems. This speeds up the compile of the testcase from 275s with PRE taking 91% of the compile-time down to 43s with PRE taking 16% of the compile-time. 2021-09-07 Richard Biener <rguenther@suse.de> PR tree-optimization/101555 * tree-ssa-pre.c (translate_vuse_through_block): Do not perform an alias walk to determine the validity of the mem at the start of the block which is already guaranteed by means of prune_clobbered_mems. (phi_translate_1): Pass edge to translate_vuse_through_block.
2021-09-07libgomp.texi: Add OpenMP Implementation StatusTobias Burnus1-3/+108
libgomp/ * libgomp.texi (Enabling OpenMP): Refer to OMP spec in general not to 4.5; link to new section. (OpenMP Implementation Status): New.
2021-09-06Fortran: Revert to non-multilib-specific ISO_Fortran_binding.hSandra Loosemore6-86/+87
Commit fef67987cf502fe322e92ddce22eea7ac46b4d75 changed the libgfortran build process to generate multilib-specific versions of ISO_Fortran_binding.h from a template, by running gfortran to identify the values of the Fortran kind constants C_LONG_DOUBLE, C_FLOAT128, and C_INT128_T. This caused multiple problems with search paths, both for build-tree testing and installed-tree use, not all of which have been fixed. This patch reverts to a non-multilib-specific .h file that uses GCC's predefined preprocessor symbols to detect the supported types and map them to kind values in the same way as the Fortran front end. 2021-09-06 Sandra Loosemore <sandra@codesourcery.com> libgfortran/ * ISO_Fortran_binding-1-tmpl.h: Deleted. * ISO_Fortran_binding-2-tmpl.h: Deleted. * ISO_Fortran_binding-3-tmpl.h: Deleted. * ISO_Fortran_binding.h: New file to replace the above. * Makefile.am (gfor_cdir): Remove MULTISUBDIR. (ISO_Fortran_binding.h): Simplify to just copy the file. * Makefile.in: Regenerated. * mk-kinds-h.sh: Revert pieces no longer needed for ISO_Fortran_binding.h.
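As a rough illustration of detecting types through predefined preprocessor symbols, the C fragment below maps long double and __int128 to Fortran kind values. It is a sketch under assumptions, not the contents of the new ISO_Fortran_binding.h: the SKETCH_* macro names and the two helper functions are hypothetical, and the mantissa-width-to-kind mapping (53 -> 8, 64 -> 10, 106 or 113 -> 16) is an assumption about how gfortran numbers its real kinds. Only __LDBL_MANT_DIG__ and __SIZEOF_INT128__ are symbols that GCC itself predefines.

```c
/* Map the long double format to a Fortran real kind, keyed on the
   mantissa width that GCC predefines.  -1 means "not supported".  */
#if defined(__LDBL_MANT_DIG__) && __LDBL_MANT_DIG__ == 53
# define SKETCH_C_LONG_DOUBLE 8      /* long double is plain double */
#elif defined(__LDBL_MANT_DIG__) && __LDBL_MANT_DIG__ == 64
# define SKETCH_C_LONG_DOUBLE 10     /* x87 80-bit extended */
#elif defined(__LDBL_MANT_DIG__) \
      && (__LDBL_MANT_DIG__ == 106 || __LDBL_MANT_DIG__ == 113)
# define SKETCH_C_LONG_DOUBLE 16     /* IBM double-double or binary128 */
#else
# define SKETCH_C_LONG_DOUBLE (-1)
#endif

/* A 128-bit integer kind exists only where the compiler has __int128.  */
#ifdef __SIZEOF_INT128__
# define SKETCH_C_INT128_T 16
#else
# define SKETCH_C_INT128_T (-1)
#endif

/* Small accessors so the values can be inspected at run time.  */
int sketch_long_double_kind(void) { return SKETCH_C_LONG_DOUBLE; }
int sketch_int128_kind(void) { return SKETCH_C_INT128_T; }
```

Because everything is decided by the preprocessor of the compiler that includes the header, a single file can serve every multilib variant without being regenerated.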
2021-09-06rs6000: Expand fmod and remainder when built with fast-math [PR97142]Xionghu Luo2-0/+71
fmod/fmodf and remainder/remainderf can be expanded inline instead of calling the library when building with fast-math, which is much faster. fmodf: fdivs f0,f1,f2 friz f0,f0 fnmsubs f1,f2,f0,f1 remainderf: fdivs f0,f1,f2 frin f0,f0 fnmsubs f1,f2,f0,f1 SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72% gcc/ChangeLog: 2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com> PR target/97142 * config/rs6000/rs6000.md (fmod<mode>3): New define_expand. (remainder<mode>3): Likewise. gcc/testsuite/ChangeLog: 2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com> PR target/97142 * gcc.target/powerpc/pr97142.c: New test.
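The three-instruction sequences above amount to "divide, round the quotient to an integer, subtract quotient times divisor". A minimal C sketch of that arithmetic follows. The names sketch_fmodf and sketch_remainderf are hypothetical, rounding is done with an integer cast so the code needs no math library (which assumes the quotient fits in long long), and ties in the remainder case are rounded away from zero, which is assumed to match frin. Like the fast-math expansion, it ignores NaN, infinity and zero-divisor cases.

```c
/* x - y * trunc(x / y): quotient rounded toward zero (friz).  */
float sketch_fmodf(float x, float y)
{
    float q = (float)(long long)(x / y);   /* truncate toward zero */
    return x - y * q;                      /* fnmsubs: -(y*q - x) */
}

/* x - y * round(x / y): quotient rounded to nearest (frin).  */
float sketch_remainderf(float x, float y)
{
    float q = x / y;
    /* Round to nearest by biasing half a unit away from zero and
       then truncating.  */
    q = (float)(long long)(q < 0.0f ? q - 0.5f : q + 0.5f);
    return x - y * q;
}
```

For example, with x = 7.5 and y = 2 the quotient is 3.75; truncation gives 3 and a result of 1.5, while rounding to nearest gives 4 and a result of -0.5.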
2021-09-07MIPS: add .module arch and ase to all output asmYunQiang Su1-0/+37
Currently, the asm output file for MIPS carries no revision info. This can cause trouble: for example, the assembler defaults to mips1 while gcc defaults to fpxx, so to assemble the output of gcc -S we have to pass -mips2 to the assembler. The same applies to CPUs that have extension instructions; Octeon is an example. So we can just add ".set arch=octeon". If an ASE is enabled, .module ase will also be used. gcc/ChangeLog: * config/mips/mips.c (mips_file_start): add .module for arch and ase.
2021-09-07Daily bump.GCC Administrator6-1/+96
2021-09-06Correct implementation of wi::clzRoger Sayle1-3/+4
As diagnosed with Jakub and Richard in the analysis of PR 102134, the current implementation of wi::clz has incorrect/inconsistent behaviour. As mentioned by Richard in comment #7, clz should (always) return zero for negative values, but the current implementation can only return 0 when precision is a multiple of HOST_BITS_PER_WIDE_INT. The fix is simply to reorder/shuffle the existing tests. 2021-09-06 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * wide-int.cc (wi::clz): Reorder tests to ensure the result is zero for all negative values.
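The intended behaviour can be illustrated with a small C model. This is not the wide-int.cc code: sketch_clz is a hypothetical helper that handles a single 64-bit host word and a precision between 1 and 64. It shows the property the fix is after, namely that any value whose top bit within the given precision is set (a negative value) yields zero, whether or not the precision is a multiple of the host word width.

```c
#include <stdint.h>

/* Count leading zero bits of VAL viewed as a PRECISION-bit number
   (1 <= PRECISION <= 64).  Bits above PRECISION are ignored.  */
unsigned sketch_clz(int64_t val, unsigned precision)
{
    uint64_t bits = (uint64_t)val;
    unsigned count = 0;

    /* Drop any sign-extension bits above the requested precision.  */
    if (precision < 64)
        bits &= (UINT64_C(1) << precision) - 1;

    /* Walk down from the most significant bit of the precision.  */
    for (unsigned i = precision; i-- > 0; ) {
        if (bits & (UINT64_C(1) << i))
            break;
        count++;
    }
    return count;
}
```

With a precision of 5, the value -1 has bit 4 set, so the result is 0 even though 5 is not a multiple of 64.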
2021-09-06invoke.texi: Fix @opindex for -foffload-optionsTobias Burnus1-1/+1
gcc/ * doc/invoke.texi (-foffload-options): Fix @opindex.
2021-09-06gcc_update: use human readable name for revision string in gcc/REVISIONSerge Belyshev1-2/+17
contrib/ChangeLog: * gcc_update: Derive human readable name for HEAD using git describe like "git gcc-descr" with short commit hash. Drop "revision" from gcc/REVISION.
2021-09-06x86: Add non-destructive source to @xorsign<mode>3_1H.J. Lu5-10/+36
Add non-destructive source alternative to @xorsign<mode>3_1 for AVX. gcc/ PR target/89984 * config/i386/i386-expand.c (ix86_split_xorsign): Use operands[2]. * config/i386/i386.md (@xorsign<mode>3_1): Add non-destructive source alternative for AVX. gcc/testsuite/ PR target/89984 * gcc.target/i386/pr89984-1.c: New test. * gcc.target/i386/pr89984-2.c: Likewise. * gcc.target/i386/xorsign-avx.c: Likewise.
2021-09-06Avoid FROM being overwritten in expand_fix.liuhongt2-5/+24
For the conversion from _Float16 to int, if the corresponding optab does not exist, the compiler will try the wider mode (SFmode here), but when floatsfsi exists but FAILs, FROM will be rewritten, which leads to a PR runtime error. gcc/ChangeLog: PR middle-end/102182 * optabs.c (expand_fix): Add from1 to avoid from being overwritten. gcc/testsuite/ChangeLog: PR middle-end/102182 * gcc.target/i386/pr101282.c: New test.
2021-09-06'libgomp.c/target-43.c': '-latomic' for nvptx offloadingThomas Schwinge1-0/+2
... to avoid a regression with recent commit 090f0d78f194e3cda23fe904016db77ea36c38fa "openmp: Improve expand_omp_atomic_pipeline": unresolved symbol __atomic_compare_exchange_1 collect2: error: ld returned 1 exit status mkoffload: fatal error: [...]/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status libgomp/ * testsuite/libgomp.c/target-43.c: '-latomic' for nvptx offloading.
2021-09-06Fix debug info for packed array types in AdaEric Botcazou1-8/+14
Packed array types are sometimes represented with integer types under the hood in Ada, but we nevertheless need to emit them as array types in the debug info so we have the types.get_array_descr_info langhook for this purpose; but it is not invoked from modified_type_die, which causes: FAIL: gdb.ada/arrayptr.exp: scenario=minimal: print pa_ptr.all FAIL: gdb.ada/arrayptr.exp: scenario=minimal: print pa_ptr.all(3) in the GDB testsuite. gcc/ * dwarf2out.c (modified_type_die): Deal with all array types earlier and use local variable consistently throughout the function.
2021-09-06match.pd: Fix up __builtin_*_overflow arg demotion [PR102207]Jakub Jelinek2-2/+28
My earlier patch to demote arguments of __builtin_*_overflow unfortunately caused a wrong-code regression. The builtins operate on infinite precision arguments, outer_prec > inner_prec signed -> signed, unsigned -> unsigned promotions there are just repeating the sign or 0s and can be demoted, similarly unsigned -> signed which also is repeating 0s, but as the testcase shows, signed -> unsigned promotions need to be preserved (unless we'd know the inner arguments can't be negative), because for negative numbers such promotion sets the outer_prec -> inner_prec bits to 1 but the bits above that to 0 in the infinite precision. So, the following patch avoids the demotions for the signed -> unsigned promotions. 2021-09-06 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/102207 * match.pd: Don't demote operands of IFN_{ADD,SUB,MUL}_OVERFLOW if they were promoted from signed to wider unsigned type. * gcc.dg/pr102207.c: New test.
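The difference between the two kinds of promotion can be seen in a small sketch. It is not the testcase from the commit; the function names are hypothetical, and it assumes `unsigned long long` is wider than `int`. Casting a negative `int` to the wider unsigned type yields a large positive value in infinite precision, so the overflow check must see the promoted operand and not the original signed one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: promotes both operands from signed int to a wider
   unsigned type before the overflow check.  For x == -1 the promoted value
   is ULLONG_MAX, so the infinite-precision sum does not fit in an int. */
bool add_promoted_overflows(int x, int y, int *res)
{
    unsigned long long ux = (unsigned long long) x;
    unsigned long long uy = (unsigned long long) y;
    return __builtin_add_overflow(ux, uy, res);
}

/* Hypothetical helper: the same addition on the original signed operands.
   Demoting the arguments above to this form would change the result for
   negative inputs. */
bool add_signed_overflows(int x, int y, int *res)
{
    return __builtin_add_overflow(x, y, res);
}

/* Small self-check showing where the two forms disagree. */
void check_promotion_matters(void)
{
    int r;
    assert(add_promoted_overflows(-1, 0, &r));   /* overflow reported */
    assert(!add_signed_overflows(-1, 0, &r));    /* no overflow */
    assert(r == -1);
}
```

For non-negative inputs the two helpers agree, which is why the demotion stays valid when the inner arguments are known not to be negative.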
2021-09-06Fix PR tree-optimization/63184: add simplification of (& + A) != (& + B)Andrew Pinski3-6/+19
These two testcases have been failing since GCC 5 but things have improved such that adding a simplification to match.pd for this case is easier than before. In the end we have the following IR: .... _5 = &a[1] + _4; _7 = &a + _13; if (_5 != _7) So we can fold the _5 != _7 into: (&a[1] - &a) + _4 != _13 The subtraction is folded into a constant by ptr_difference_const. In this case, the full expression gets folded into a constant and we are able to remove the if statement. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/63184 * match.pd: Add simplification of pointer_diff of two pointer_plus with addr_expr in the first operand of each pointer_plus. Add simplification of ne/eq of two pointer_plus with addr_expr in the first operand of each pointer_plus. gcc/testsuite/ChangeLog: PR tree-optimization/63184 * c-c++-common/pr19807-2.c: Enable for all targets and remove the xfail. * c-c++-common/pr19807-3.c: Likewise.
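At the source level, the comparison being simplified looks roughly like the sketch below. It is an illustration only, not one of the pr19807 testcases; the array `a` and the function `same_element` are hypothetical names. Both pointers are formed from the address of the same object plus an offset, so the comparison reduces to a comparison of the offsets once the constant distance between `&a[1]` and `&a[0]` is accounted for. Whether a given GCC version performs the fold depends on the optimization level and is not something this code can show by itself.

```c
#include <stddef.h>

static int a[16];

/* Compares &a[1] + i against &a[0] + j.  Both sides are pointer_plus
   expressions based on addresses within the same array, so the result is
   equivalent to comparing (i + 1) with j.  Callers must keep i <= 15 and
   j <= 16 so that both pointers stay within the array or one past its end. */
int same_element(size_t i, size_t j)
{
    int *p = &a[1] + i;
    int *q = &a[0] + j;
    return p == q;
}
```

For example, `same_element(2, 3)` compares `&a[3]` with `&a[3]` and returns 1, while `same_element(2, 2)` returns 0.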
2021-09-06Explicitly add -msse2 to compile HF related libgcc source file.liuhongt5-2/+35
For 32-bit libgcc configure w/o sse2, there would be an error since GCC only supports _Float16 under sse2. Explicitly add -msse2 for those HF related libgcc functions, so users can still link them w/ the upper configuration. libgcc/ChangeLog: * Makefile.in: Adjust to support specific CFLAGS for each libgcc source file. * config/i386/64/t-softfp: Explicitly add -msse2 for HF related libgcc source files. * config/i386/t-softfp: Ditto. * config/i386/_divhc3.c: New file. * config/i386/_mulhc3.c: New file.
2021-09-06tree-optimization/102176 - locally compute participating SLP stmtsRichard Biener1-5/+64
This performs local re-computation of participating scalar stmts in BB vectorization subgraphs to allow precise computation of liveness of scalar stmts after vectorization and thus precise costing. This treats all extern defs as live but continues to optimistically handle scalar defs that we think we can handle by lane-extraction even though that can still fail late during code-generation. 2021-09-02 Richard Biener <rguenther@suse.de> PR tree-optimization/102176 * tree-vect-slp.c (vect_slp_gather_vectorized_scalar_stmts): New function. (vect_bb_slp_scalar_cost): Use the computed set of vectorized scalar stmts instead of relying on the out-of-date and not accurate PURE_SLP_STMT. (vect_bb_vectorization_profitable_p): Compute the set of vectorized scalar stmts.
2021-09-06Daily bump.GCC Administrator2-1/+35
2021-09-05libgo: update to final Go 1.17 releaseIan Lance Taylor13-66/+216
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/343729
2021-09-05Make the path solver's range_of_stmt() handle all statements.Aldy Hernandez1-5/+3
The path solver's range_of_stmt() was handcuffed to only fold GIMPLE_COND statements, since those were the only statements the backward threader needed to resolve. However, there is no need for this restriction, as the folding code is perfectly capable of folding any statement. This can be the case when trying to fold other statements in the final block of a path (for instance, in the forward threader as it tries to fold candidate statements along a path). Tested on x86-64 Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::range_of_stmt): Remove GIMPLE_COND special casing. (path_range_query::range_defined_in_block): Use range_of_stmt instead of calling fold_range directly.