Age  Commit message  Author  Files  Lines
2021-09-07bpf testsuite: Add BPF CO-RE testsDavid Faust8-0/+274
This commit adds several tests for the new BPF CO-RE functionality to the BPF target testsuite. gcc/testsuite/ChangeLog: * gcc.target/bpf/core-attr-1.c: New test. * gcc.target/bpf/core-attr-2.c: Likewise. * gcc.target/bpf/core-attr-3.c: Likewise. * gcc.target/bpf/core-attr-4.c: Likewise * gcc.target/bpf/core-builtin-1.c: Likewise * gcc.target/bpf/core-builtin-2.c: Likewise. * gcc.target/bpf/core-builtin-3.c: Likewise. * gcc.target/bpf/core-section-1.c: Likewise.
2021-09-07bpf: BPF CO-RE supportDavid Faust7-0/+1094
This commit introduces support for BPF Compile Once - Run Everywhere (CO-RE) in GCC. gcc/ChangeLog: * config/bpf/bpf.c: Adjust includes. (bpf_handle_preserve_access_index_attribute): New function. (bpf_attribute_table): Use it here. (bpf_builtins): Add BPF_BUILTIN_PRESERVE_ACCESS_INDEX. (bpf_option_override): Handle "-mco-re" option. (bpf_asm_init_sections): New. (TARGET_ASM_INIT_SECTIONS): Redefine. (bpf_file_end): New. (TARGET_ASM_FILE_END): Redefine. (bpf_init_builtins): Add "__builtin_preserve_access_index". (bpf_core_compute, bpf_core_get_index): New. (is_attr_preserve_access): New. (bpf_expand_builtin): Handle new builtins. (bpf_core_newdecl, bpf_core_is_maybe_aggregate_access): New. (bpf_core_walk): New. (bpf_resolve_overloaded_builtin): New. (TARGET_RESOLVE_OVERLOADED_BUILTIN): Redefine. (handle_attr): New. (pass_bpf_core_attr): New RTL pass. * config/bpf/bpf-passes.def: New file. * config/bpf/bpf-protos.h (make_pass_bpf_core_attr): New. * config/bpf/coreout.c: New file. * config/bpf/coreout.h: Likewise. * config/bpf/t-bpf (TM_H): Add $(srcdir)/config/bpf/coreout.h. (coreout.o): New rule. (PASSES_EXTRA): Add $(srcdir)/config/bpf/bpf-passes.def. * config.gcc (bpf): Add coreout.h to extra_headers. Add coreout.o to extra_objs. Add $(srcdir)/config/bpf/coreout.c to target_gtfiles.
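As a rough illustration of what the attribute and builtin named in this ChangeLog look like from user code, here is a minimal sketch. The struct `task_info`, the function `read_tgid`, and the macros `CORE_ATTR` and `CORE_READ` are hypothetical names invented for this example. The fallback branch exists only so the file also builds on a non-BPF host, where no CO-RE relocation is recorded. On a BPF target the code is assumed to be compiled with -mco-re.

```c
/* CORE_ATTR and CORE_READ are hypothetical helper macros for this sketch.
   On a BPF target they map to the preserve_access_index attribute and to
   __builtin_preserve_access_index; elsewhere they expand to plain C.  */
#if defined(__BPF__) || defined(__bpf__)
# define CORE_ATTR __attribute__((preserve_access_index))
# define CORE_READ(expr) __builtin_preserve_access_index(expr)
#else
# define CORE_ATTR
# define CORE_READ(expr) (expr)
#endif

/* Hypothetical stand-in for a kernel structure whose field offsets may
   differ between kernel versions.  */
struct task_info {
  int pid;
  int tgid;
} CORE_ATTR;

/* Read one field.  On a BPF target the access is wrapped in the builtin
   so that the field access can be recorded for relocation at load time;
   on other targets this is an ordinary member read.  */
int read_tgid(const struct task_info *t)
{
  return CORE_READ(t->tgid);
}
```

Marking the whole struct with the attribute and wrapping a single expression in the builtin are shown together here only for compactness; either one alone is enough to request access-index preservation for that access.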
2021-09-07btf: expose get_btf_idDavid Faust2-1/+2
Expose the function get_btf_id, so that it may be used by the BPF backend. This enables the BPF CO-RE machinery in the BPF backend to look up BTF type IDs, in order to create CO-RE relocation records. A prototype is added in ctfc.h. gcc/ChangeLog: * btfout.c (get_btf_id): Function is no longer static. * ctfc.h: Expose it here.
2021-09-07ctfc: add function to lookup CTF ID of a TREE typeDavid Faust2-0/+18
Add a new function, ctf_lookup_tree_type, to return the CTF type ID associated with a type via its TREE node. The function is exposed via a prototype in ctfc.h. gcc/ChangeLog: * ctfc.c (ctf_lookup_tree_type): New function. * ctfc.h: Likewise.
2021-09-07ctfc: externalize ctf_dtd_lookupDavid Faust2-2/+5
Expose the function ctf_dtd_lookup, so that it can be used by the BPF CO-RE machinery. The function is no longer static, and an extern prototype is added in ctfc.h. gcc/ChangeLog: * ctfc.c (ctf_dtd_lookup): Function is no longer static. * ctfc.h: Analogous change.
2021-09-07dwarf: externalize lookup_type_dieDavid Faust2-2/+2
Expose the function lookup_type_die in dwarf2out, so that it can be used by CTF/BTF when adding BPF CO-RE information. The function is now non-static, and an extern prototype is added in dwarf2out.h. gcc/ChangeLog: * dwarf2out.c (lookup_type_die): Function is no longer static. * dwarf2out.h: Expose it here.
2021-09-07Fix fatal typo in gcc.dg/no_profile_instrument_function-attr-2.cHans-Peter Nilsson1-1/+1
Dejagnu is unfortunately brittle: a syntax error in a directive can abort the test-run for the current "tool" (gcc, g++, gfortran), and if you don't check for this condition or actually read the stdout log yourself, your tools may make you believe the test was successful without regressions. At the very least, always grep for ^ERROR: in the stdout log! With r12-3379, the testsuite got such a fatal syntax error, causing the gcc test-run to abort at (e.g.): ... FAIL: gcc.dg/memchr.c (test for excess errors) FAIL: gcc.dg/memcmp-3.c (test for excess errors) ERROR: (DejaGnu) proc "scan-tree-dump-not\" = foo {\(\)"} optimized" does not exist. The error code is TCL LOOKUP COMMAND scan-tree-dump-not\" The info on the error is: invalid command name "scan-tree-dump-not"" while executing "::tcl_unknown scan-tree-dump-not\" = foo {\(\)"} optimized" ("uplevel" body line 1) invoked from within "uplevel 1 ::tcl_unknown $args" === gcc Summary === # of expected passes 63740 # of unexpected failures 38 # of unexpected successes 2 # of expected failures 351 # of unresolved testcases 3 # of unsupported tests 662 x/cris-elf/gccobj/gcc/xgcc version 12.0.0 20210907 (experimental)\ [master r12-3391-g849d5f5929fc] (GCC) testsuite: * gcc.dg/no_profile_instrument_function-attr-2.c: Fix typo in last change.
2021-09-07Fortran - improve error recovery determining array element from constructorHarald Anlauf2-4/+14
gcc/fortran/ChangeLog: PR fortran/101327 * expr.c (find_array_element): When bounds cannot be determined as constant, return error instead of aborting. gcc/testsuite/ChangeLog: PR fortran/101327 * gfortran.dg/pr101327.f90: New test.
2021-09-07dwarf2out: Emit BTF in dwarf2out_finish for BPF CO-RE usecaseIndu Bhagat3-16/+51
DWARF generation is split between early and late phases when LTO is in effect. This poses challenges for CTF/BTF generation, especially if late debug info generation is desirable, as turns out to be the case for BPF CO-RE. The approach taken in this patch is: 1. LTO is disabled for BPF CO-RE. The reason to disable LTO for BPF CO-RE is that if LTO is in effect, BPF CO-RE relocations need to be generated in the LTO link phase _after_ the optimizations are done. This means we need to devise a way to combine early and late BTF. At this time, in the absence of linker support for BTF sections, it makes sense to steer clear of LTO for BPF CO-RE and bypass the issue. 2. The BPF backend updates write_symbols with BTF_WITH_CORE_DEBUG to convey that BTF with CO-RE support needs to be generated. This information is used by the debug info emission routines to defer the emission of BTF/CO-RE until dwarf2out_finish. So, in other words: dwarf2out_early_finish - Always emit CTF here. - if (BTF && !BTF_WITH_CORE), emit BTF now. dwarf2out_finish - if (BTF_WITH_CORE) emit BTF now. gcc/ChangeLog: * dwarf2ctf.c (ctf_debug_finalize): Make it static. (ctf_debug_early_finish): New definition. (ctf_debug_finish): Likewise. * dwarf2ctf.h (ctf_debug_finalize): Remove declaration. (ctf_debug_early_finish): New declaration. (ctf_debug_finish): Likewise. * dwarf2out.c (dwarf2out_finish): Invoke ctf_debug_finish. (dwarf2out_early_finish): Invoke ctf_debug_early_finish.
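The early/late emission rule described in this message can be modelled with a small decision function. This is only a sketch of the stated rule, not the code in dwarf2out.c: the `SKETCH_*` flag names, `struct emitted`, and both functions are hypothetical, and the real write_symbols bits have different names and values.

```c
#include <stdbool.h>

/* Hypothetical bit flags standing in for the debug-format bits that the
   real compiler keeps in write_symbols.  */
enum {
  SKETCH_CTF_DEBUG           = 1 << 0,
  SKETCH_BTF_DEBUG           = 1 << 1,
  SKETCH_BTF_WITH_CORE_DEBUG = 1 << 2
};

/* Which formats a given phase emits.  */
struct emitted {
  bool ctf;
  bool btf;
};

/* Early phase: CTF is never deferred; BTF is emitted now only when
   CO-RE has not been requested.  */
struct emitted sketch_early_finish(unsigned write_symbols)
{
  struct emitted out;
  bool btf  = (write_symbols & SKETCH_BTF_DEBUG) != 0;
  bool core = (write_symbols & SKETCH_BTF_WITH_CORE_DEBUG) != 0;
  out.ctf = (write_symbols & SKETCH_CTF_DEBUG) != 0;
  out.btf = btf && !core;
  return out;
}

/* Late phase: only BTF with CO-RE is emitted here.  */
struct emitted sketch_finish(unsigned write_symbols)
{
  struct emitted out;
  out.ctf = false;
  out.btf = (write_symbols & SKETCH_BTF_WITH_CORE_DEBUG) != 0;
  return out;
}
```

For any combination of flags, BTF is emitted in at most one of the two phases, which is the property the deferral is meant to guarantee.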
2021-09-07bpf: Add new -mco-re option for BPF CO-REIndu Bhagat3-0/+38
-mco-re in the BPF backend enables code generation for the CO-RE usecase. LTO is disabled for CO-RE compilations. gcc/ChangeLog: * config/bpf/bpf.c (bpf_option_override): For BPF backend, disable LTO support when compiling for CO-RE. * config/bpf/bpf.opt: Add new command line option -mco-re. gcc/testsuite/ChangeLog: * gcc.target/bpf/core-lto-1.c: New test.
2021-09-07debug: Add BTF_WITH_CORE_DEBUG debug formatIndu Bhagat3-1/+17
To best handle BTF/CO-RE in GCC, a distinct BTF_WITH_CORE_DEBUG debug format is being added. This helps the compiler detect whether BTF with CO-RE relocations needs to be emitted. gcc/ChangeLog: * flag-types.h (enum debug_info_type): Add new enum DINFO_TYPE_BTF_WITH_CORE. (BTF_WITH_CORE_DEBUG): New bitmask. * flags.h (btf_with_core_debuginfo_p): New declaration. * opts.c (btf_with_core_debuginfo_p): New definition.
2021-09-07tree: Change error_operand_p to an inline functionJason Merrill1-4/+7
I've thought for a while that many of the macros in tree.h and such should become inline functions. This one in particular was confusing Coverity; the null check in the macro made it think that all code guarded by error_operand_p would also need null checks. gcc/ChangeLog: * tree.h (error_operand_p): Change to inline function.
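To make the macro-versus-inline distinction concrete, here is a toy sketch. The types and names (`toy_node`, `toy_error_mark`, `TOY_ERROR_OPERAND_P_MACRO`, `toy_error_operand_p`) are hypothetical stand-ins and are not the definitions in tree.h. The point is only that a macro pastes its null check into every caller, while an inline function keeps the check inside one function body.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a tree node; not GCC's real data structure.  */
struct toy_node {
  int code;
  struct toy_node *type;
};

/* Toy stand-in for the unique error marker node.  */
static struct toy_node toy_error_mark = { 0, NULL };

/* Macro form: expanded textually at each use, so the "(NODE) &&" null
   test appears in the caller's own code, and NODE is evaluated more
   than once.  */
#define TOY_ERROR_OPERAND_P_MACRO(NODE) \
  ((NODE) == &toy_error_mark \
   || ((NODE) && (NODE)->type == &toy_error_mark))

/* Inline-function form: same result, but the null test stays inside
   the function and the argument is evaluated exactly once.  */
static inline bool toy_error_operand_p(const struct toy_node *node)
{
  return node == &toy_error_mark
         || (node != NULL && node->type == &toy_error_mark);
}
```

Both forms return the same answer for every input; the difference is where the null test is visible to an analysis tool and how often the argument expression is evaluated.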
2021-09-07c++: Fix up constexpr evaluation of deleting dtors [PR100495]Jakub Jelinek2-2/+19
We do not save bodies of constexpr clones and instead evaluate the bodies of the constexpr functions they were cloned from. I believe that is just fine for constructors because complete vs. base ctors differ only in classes that have virtual bases and such constructors aren't constexpr, similarly complete/base destructors. But as the testcase below shows, for deleting destructors it is not fine, deleting dtors while marked as clones in fact are just artificial functions with synthetized body which calls the user destructor and deallocation. So, either we'd need to evaluate the destructor and afterwards synthetize and evaluate the deallocation, or we can just save and use the deleting dtors bodies. The latter seems much easier to me. 2021-09-07 Jakub Jelinek <jakub@redhat.com> PR c++/100495 * constexpr.c (maybe_save_constexpr_fundef): Save body even for constexpr deleting dtors. (cxx_eval_call_expression): Don't use DECL_CLONED_FUNCTION for deleting dtors. * g++.dg/cpp2a/constexpr-new21.C: New test.
2021-09-07libgomp.texi: Extend OpenMP 5.0 Implementation StatusTobias Burnus1-3/+93
libgomp/ * libgomp.texi (OpenMP Implementation Status): Extend OpenMP 5.0 section. (OpenACC Profiling Interface): Fix typo.
2021-09-07Rename forwarder_block_p in threading code to empty_block_with_phis_p.Aldy Hernandez1-5/+4
gcc/ChangeLog: * tree-ssa-threadedge.c (forwarder_block_p): Rename to... (empty_block_with_phis_p): ...this. (potentially_threadable_block): Same. (jump_threader::thread_through_normal_block): Same.
2021-09-07libgfortran: Makefile fix for ISO_Fortran_binding.hTobias Burnus2-12/+8
libgfortran/ChangeLog: * Makefile.am (gfor_built_src): Depend on include/ISO_Fortran_binding.h not on ISO_Fortran_binding.h. (ISO_Fortran_binding.h): Rename make target to ... (include/ISO_Fortran_binding.h): ... this. * Makefile.in: Regenerate.
2021-09-07Fix PR debug/101947Eric Botcazou1-8/+44
This is the recent LTO bootstrap failure with Ada enabled. The compiler now generates DW_OP_deref_type for a unit of the Ada front-end, which means that the offset of base types in the CU must be computed during early DWARF too. gcc/ PR debug/101947 * dwarf2out.c (mark_base_types): New overloaded function. (dwarf2out_early_finish): Invoke it on the COMDAT type list as well as the compilation unit, and call move_marked_base_types afterward.
2021-09-07x86: Enable FMA in unsigned SI to SF expandersH.J. Lu7-12/+94
Enable FMA in scalar/vector unsigned SI to SF expanders. Don't check TARGET_AVX512F which has vcvtusi2ss and vcvtudq2ps instructions. gcc/ PR target/85819 * config/i386/i386-expand.c (ix86_expand_convert_uns_sisf_sse): Enable FMA. (ix86_expand_vector_convert_uns_vsivsf): Likewise. gcc/testsuite/ PR target/85819 * gcc.target/i386/pr85819-1a.c: New test. * gcc.target/i386/pr85819-1b.c: Likewise. * gcc.target/i386/pr85819-2a.c: Likewise. * gcc.target/i386/pr85819-2b.c: Likewise. * gcc.target/i386/pr85819-2c.c: Likewise. * gcc.target/i386/pr85819-3.c: Likewise.
2021-09-07tree-optimization/102226 - fix epilogue vector re-useRichard Biener2-2/+31
This fixes re-use of the reduction value in epilogue vectorization when a conversion from/to variable length vectors is required. 2021-09-07 Richard Biener <rguenther@suse.de> PR tree-optimization/102226 * tree-vect-loop.c (vect_transform_cycle_phi): Record the converted value for the epilogue PHI use. * g++.dg/vect/pr102226.cc: New testcase.
2021-09-07C, C++, Fortran, OpenMP: Add support for 'flush seq_cst' construct.Marcel Vollweiler12-16/+56
This patch adds support for the 'seq_cst' memory order clause on the 'flush' directive which was introduced in OpenMP 5.1. gcc/c-family/ChangeLog: * c-omp.c (c_finish_omp_flush): Handle MEMMODEL_SEQ_CST. gcc/c/ChangeLog: * c-parser.c (c_parser_omp_flush): Parse 'seq_cst' clause on 'flush' directive. gcc/cp/ChangeLog: * parser.c (cp_parser_omp_flush): Parse 'seq_cst' clause on 'flush' directive. * semantics.c (finish_omp_flush): Handle MEMMODEL_SEQ_CST. gcc/fortran/ChangeLog: * openmp.c (gfc_match_omp_flush): Parse 'seq_cst' clause on 'flush' directive. * trans-openmp.c (gfc_trans_omp_flush): Handle OMP_MEMORDER_SEQ_CST. gcc/testsuite/ChangeLog: * c-c++-common/gomp/flush-1.c: Add test case for 'seq_cst'. * c-c++-common/gomp/flush-2.c: Add test case for 'seq_cst'. * g++.dg/gomp/attrs-1.C: Adapt test to handle all flush clauses. * g++.dg/gomp/attrs-2.C: Adapt test to handle all flush clauses. * gfortran.dg/gomp/flush-1.f90: Add test case for 'seq_cst'. * gfortran.dg/gomp/flush-2.f90: Add test case for 'seq_cst'.
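For readers unfamiliar with the directive, a minimal use of the clause might look like the sketch below. The variables `payload` and `ready` and the function `publish` are hypothetical. The pragma is guarded by `_OPENMP` so the file also builds without -fopenmp, in which case the function is plain sequential C. When OpenMP is enabled, a compiler that accepts the 'seq_cst' clause on 'flush' is assumed.

```c
/* Hypothetical shared state: a payload and a flag announcing it.  */
static int payload;
static int ready;

/* Store the payload, then flush with sequentially consistent ordering
   before raising the flag.  Returns the flag value.  */
int publish(int value)
{
  payload = value;
#ifdef _OPENMP
  /* 'seq_cst' is the memory-order clause this patch teaches the parsers
     to accept on the flush directive.  */
  #pragma omp flush seq_cst
#endif
  ready = 1;
  return ready;
}

/* Accessor for the stored payload.  */
int published_value(void)
{
  return payload;
}
```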
2021-09-07inline: do not einline when no_profile_instrument_function is differentMartin Liska2-0/+32
PR gcov-profile/80223 gcc/ChangeLog: * ipa-inline.c (can_inline_edge_p): Similarly to sanitizer options, do not inline when no_profile_instrument_function attributes are different in early inliner. It's fine to inline it after PGO instrumentation. gcc/testsuite/ChangeLog: * gcc.dg/no_profile_instrument_function-attr-2.c: New test.
2021-09-07tree-optimization/101555 - avoid redundant alias queries in PRERichard Biener1-60/+37
This avoids doing redundant work during PHI translation to invalidate mems when translating their corresponding VUSE through the block's virtual PHI node. All the invalidation work is already done by prune_clobbered_mems. This speeds up the compile of the testcase from 275s with PRE taking 91% of the compile-time down to 43s with PRE taking 16% of the compile-time. 2021-09-07 Richard Biener <rguenther@suse.de> PR tree-optimization/101555 * tree-ssa-pre.c (translate_vuse_through_block): Do not perform an alias walk to determine the validity of the mem at the start of the block which is already guaranteed by means of prune_clobbered_mems. (phi_translate_1): Pass edge to translate_vuse_through_block.
2021-09-07libgomp.texi: Add OpenMP Implementation StatusTobias Burnus1-3/+108
libgomp/ * libgomp.texi (Enabling OpenMP): Refer to OMP spec in general not to 4.5; link to new section. (OpenMP Implementation Status): New.
2021-09-06Fortran: Revert to non-multilib-specific ISO_Fortran_binding.hSandra Loosemore6-86/+87
Commit fef67987cf502fe322e92ddce22eea7ac46b4d75 changed the libgfortran build process to generate multilib-specific versions of ISO_Fortran_binding.h from a template, by running gfortran to identify the values of the Fortran kind constants C_LONG_DOUBLE, C_FLOAT128, and C_INT128_T. This caused multiple problems with search paths, both for build-tree testing and installed-tree use, not all of which have been fixed. This patch reverts to a non-multilib-specific .h file that uses GCC's predefined preprocessor symbols to detect the supported types and map them to kind values in the same way as the Fortran front end. 2021-09-06 Sandra Loosemore <sandra@codesourcery.com> libgfortran/ * ISO_Fortran_binding-1-tmpl.h: Deleted. * ISO_Fortran_binding-2-tmpl.h: Deleted. * ISO_Fortran_binding-3-tmpl.h: Deleted. * ISO_Fortran_binding.h: New file to replace the above. * Makefile.am (gfor_cdir): Remove MULTISUBDIR. (ISO_Fortran_binding.h): Simplify to just copy the file. * Makefile.in: Regenerated. * mk-kinds-h.sh: Revert pieces no longer needed for ISO_Fortran_binding.h.
2021-09-06rs6000: Expand fmod and remainder when built with fast-math [PR97142]Xionghu Luo2-0/+71
fmod/fmodf and remainder/remainderf can be expanded inline instead of calling the library when building with fast-math, which is much faster. fmodf: fdivs f0,f1,f2 friz f0,f0 fnmsubs f1,f2,f0,f1 remainderf: fdivs f0,f1,f2 frin f0,f0 fnmsubs f1,f2,f0,f1 SPEC2017 Ofast P8LE: 511.povray_r +1.14%, 526.blender_r +1.72% gcc/ChangeLog: 2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com> PR target/97142 * config/rs6000/rs6000.md (fmod<mode>3): New define_expand. (remainder<mode>3): Likewise. gcc/testsuite/ChangeLog: 2021-09-07 Xionghu Luo <luoxhu@linux.ibm.com> PR target/97142 * gcc.target/powerpc/pr97142.c: New test.
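The two instruction sequences above compute x - y * round(x / y) with different rounding of the quotient. A portable sketch of the same arithmetic follows. The function names are hypothetical, and the integer casts stand in for friz (round toward zero) and frin (round to nearest), so the sketch assumes the quotient fits in a long. It also ignores NaN, infinity and signed-zero handling, which is the kind of corner case fast-math permits the expansion to skip.

```c
#include <assert.h>

/* Stand-in for friz: round toward zero.  Valid only while q fits in
   a long.  */
static float sketch_round_toward_zero(float q)
{
  return (float)(long)q;
}

/* Stand-in for frin: round to nearest, halfway cases away from zero.
   Valid only while q fits in a long.  */
static float sketch_round_to_nearest(float q)
{
  return (float)(long)(q < 0.0f ? q - 0.5f : q + 0.5f);
}

/* fmodf analogue: divide, truncate the quotient, then x - y * n
   (the fnmsubs step).  */
float sketch_fmodf(float x, float y)
{
  assert(y != 0.0f);
  float n = sketch_round_toward_zero(x / y);
  return x - y * n;
}

/* remainderf analogue: same shape, but the quotient is rounded to the
   nearest integer instead of truncated.  */
float sketch_remainderf(float x, float y)
{
  assert(y != 0.0f);
  float n = sketch_round_to_nearest(x / y);
  return x - y * n;
}
```

For example, with x = 7.5 and y = 2 the quotient is 3.75: truncation gives 3 and a result of 1.5, while rounding to nearest gives 4 and a result of -0.5.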
2021-09-07MIPS: add .module arch and ase to all output asmYunQiang Su1-0/+37
Currently, the asm output file for MIPS carries no architecture revision info. This can cause trouble: for example, the assembler defaults to mips1 while gcc defaults to fpxx, so to assemble the output of gcc -S we have to pass -mips2 to the assembler. The same applies to CPUs that have extension instructions, Octeon being one example, so we can just add ".set arch=octeon". If an ASE is enabled, .module ase will also be used. gcc/ChangeLog: * config/mips/mips.c (mips_file_start): add .module for arch and ase.
2021-09-07Daily bump.GCC Administrator6-1/+96
2021-09-06Correct implementation of wi::clzRoger Sayle1-3/+4
As diagnosed with Jakub and Richard in the analysis of PR 102134, the current implementation of wi::clz has incorrect/inconsistent behaviour. As mentioned by Richard in comment #7, clz should (always) return zero for negative values, but the current implementation can only return 0 when precision is a multiple of HOST_BITS_PER_WIDE_INT. The fix is simply to reorder/shuffle the existing tests. 2021-09-06 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * wide-int.cc (wi::clz): Reorder tests to ensure the result is zero for all negative values.
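The intended behaviour can be stated compactly for a single 64-bit word. The sketch below is not the wide-int.cc implementation: `sketch_clz` is a hypothetical function, it handles precisions of at most 64 bits only, and returning the precision for a zero input is an assumption of this example. What it illustrates is the ordering the fix relies on: the sign bit at position precision - 1 is tested first, so a negative value yields 0 whether or not the precision is a multiple of the host word size.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of the low `precision` bits of `value`, where bit
   precision - 1 is the sign bit.  */
unsigned sketch_clz(uint64_t value, unsigned precision)
{
  assert(precision >= 1 && precision <= 64);

  /* Ignore any bits above the stated precision.  */
  if (precision < 64)
    value &= (UINT64_C(1) << precision) - 1;

  /* Negative values have the top bit set: no leading zeros.  */
  if (value >> (precision - 1))
    return 0;

  /* Otherwise count zero bits from the top until the first set bit.  */
  unsigned count = 0;
  for (unsigned bit = precision; bit-- > 0; ) {
    if (value & (UINT64_C(1) << bit))
      break;
    count++;
  }
  return count;
}
```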
2021-09-06invoke.texi: Fix @opindex for -foffload-optionsTobias Burnus1-1/+1
gcc/ * doc/invoke.texi (-foffload-options): Fix @opindex.
2021-09-06gcc_update: use human readable name for revision string in gcc/REVISIONSerge Belyshev1-2/+17
contrib/Changelog: * gcc_update: Derive human readable name for HEAD using git describe like "git gcc-descr" with short commit hash. Drop "revision" from gcc/REVISION.
2021-09-06x86: Add non-destructive source to @xorsign<mode>3_1H.J. Lu5-10/+36
Add non-destructive source alternative to @xorsign<mode>3_1 for AVX. gcc/ PR target/89984 * config/i386/i386-expand.c (ix86_split_xorsign): Use operands[2]. * config/i386/i386.md (@xorsign<mode>3_1): Add non-destructive source alternative for AVX. gcc/testsuite/ PR target/89984 * gcc.target/i386/pr89984-1.c: New test. * gcc.target/i386/pr89984-2.c: Likewise. * gcc.target/i386/xorsign-avx.c: Likewise.
2021-09-06Avoid FROM being overwritten in expand_fix.liuhongt2-5/+24
For the conversion from _Float16 to int, if the corresponding optab does not exist, the compiler will try the wider mode (SFmode here), but when floatsfsi exists but FAILs, FROM is overwritten, which leads to the runtime error reported in the PR. gcc/ChangeLog: PR middle-end/102182 * optabs.c (expand_fix): Add from1 to avoid from being overwritten. gcc/testsuite/ChangeLog: PR middle-end/102182 * gcc.target/i386/pr101282.c: New test.
2021-09-06'libgomp.c/target-43.c': '-latomic' for nvptx offloadingThomas Schwinge1-0/+2
... to avoid a regression with recent commit 090f0d78f194e3cda23fe904016db77ea36c38fa "openmp: Improve expand_omp_atomic_pipeline": unresolved symbol __atomic_compare_exchange_1 collect2: error: ld returned 1 exit status mkoffload: fatal error: [...]/gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status libgomp/ * testsuite/libgomp.c/target-43.c: '-latomic' for nvptx offloading.
2021-09-06Fix debug info for packed array types in AdaEric Botcazou1-8/+14
Packed array types are sometimes represented with integer types under the hood in Ada, but we nevertheless need to emit them as array types in the debug info so we have the types.get_array_descr_info langhook for this purpose; but it is not invoked from modified_type_die, which causes: FAIL: gdb.ada/arrayptr.exp: scenario=minimal: print pa_ptr.all FAIL: gdb.ada/arrayptr.exp: scenario=minimal: print pa_ptr.all(3) in the GDB testsuite. gcc/ * dwarf2out.c (modified_type_die): Deal with all array types earlier and use local variable consistently throughout the function.
2021-09-06match.pd: Fix up __builtin_*_overflow arg demotion [PR102207]Jakub Jelinek2-2/+28
My earlier patch to demote arguments of __builtin_*_overflow unfortunately caused a wrong-code regression. The builtins operate on infinite precision arguments, outer_prec > inner_prec signed -> signed, unsigned -> unsigned promotions there are just repeating the sign or 0s and can be demoted, similarly unsigned -> signed which also is repeating 0s, but as the testcase shows, signed -> unsigned promotions need to be preserved (unless we'd know the inner arguments can't be negative), because for negative numbers such promotion sets the outer_prec -> inner_prec bits to 1 but the bits above that to 0 in the infinite precision. So, the following patch avoids the demotions for the signed -> unsigned promotions. 2021-09-06 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/102207 * match.pd: Don't demote operands of IFN_{ADD,SUB,MUL}_OVERFLOW if they were promoted from signed to wider unsigned type. * gcc.dg/pr102207.c: New test.
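A small example shows why the signed -> unsigned promotion cannot be dropped. The function names below are hypothetical and this is not the testcase from the PR. It assumes, as on common targets, that int is narrower than unsigned long long. Because the builtin computes in infinite precision, -1 converted to unsigned long long is the large positive value ULLONG_MAX, which does not fit in an int, whereas the undemoted int value -1 does.

```c
#include <stdbool.h>

/* Add 0 to x after promoting it from int to a wider unsigned type and
   report whether the infinite-precision result fits in an int.  For
   negative x the promoted value is a huge positive number, so this
   reports overflow.  */
bool widened_add_overflows(int x)
{
  unsigned long long wide = (unsigned long long) x;
  int result;
  return __builtin_add_overflow(wide, 0, &result);
}

/* The same operation on the narrow signed value never overflows.  This
   is what an incorrect demotion of the operand would compute.  */
bool narrow_add_overflows(int x)
{
  int result;
  return __builtin_add_overflow(x, 0, &result);
}
```

The two functions agree for non-negative x and disagree for negative x, which is exactly the case where the demotion has to be avoided.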
2021-09-06Fix PR tree-optimization/63184: add simplification of (& + A) != (& + B)Andrew Pinski3-6/+19
These two testcases have been failing since GCC 5 but things have improved such that adding a simplification to match.pd for this case is easier than before. In the end we have the following IR: .... _5 = &a[1] + _4; _7 = &a + _13; if (_5 != _7) So we can fold the _5 != _7 into: (&a[1] - &a) + _4 != _13 The subtraction is folded into a constant by ptr_difference_const. In this case, the full expression gets folded into a constant and we are able to remove the if statement. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/63184 * match.pd: Add simplification of pointer_diff of two pointer_plus with addr_expr in the first operand of each pointer_plus. Add simplification of ne/eq of two pointer_plus with addr_expr in the first operand of each pointer_plus. gcc/testsuite/ChangeLog: PR tree-optimization/63184 * c-c++-common/pr19807-2.c: Enable for all targets and remove the xfail. * c-c++-common/pr19807-3.c: Likewise.
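The identity behind this fold can be checked at the source level. The sketch below uses a hypothetical array `sketch_a` and hypothetical function names, and it works in element units as C pointer arithmetic does, whereas the IR quoted above uses byte offsets. Both functions must return the same answer for any offsets that keep the pointers inside the array or one past its end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SKETCH_LEN = 8 };
static int sketch_a[SKETCH_LEN];

/* Original form: compare two pointers built from different bases.  */
bool ptr_form_ne(ptrdiff_t i, ptrdiff_t j)
{
  /* Keep both pointers within the array or one past its end.  */
  assert(i >= 0 && i <= SKETCH_LEN - 1);
  assert(j >= 0 && j <= SKETCH_LEN);
  return &sketch_a[1] + i != &sketch_a[0] + j;
}

/* Folded form: the difference of the two bases is a constant, so only
   the integer offsets remain to be compared.  */
bool folded_form_ne(ptrdiff_t i, ptrdiff_t j)
{
  return (&sketch_a[1] - &sketch_a[0]) + i != j;
}
```

When i and j are themselves known at compile time, the folded comparison reduces to a constant, which is what allows the if statement in the testcases to be removed.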
2021-09-06Explicitly add -msse2 to compile HF related libgcc source file.liuhongt5-2/+35
For a 32-bit libgcc configured without sse2, there would be an error since GCC only supports _Float16 under sse2. Explicitly add -msse2 for those HF related libgcc functions, so users can still link them with the above configuration. libgcc/ChangeLog: * Makefile.in: Adjust to support specific CFLAGS for each libgcc source file. * config/i386/64/t-softfp: Explicitly add -msse2 for HF related libgcc source files. * config/i386/t-softfp: Ditto. * config/i386/_divhc3.c: New file. * config/i386/_mulhc3.c: New file.
2021-09-06tree-optimization/102176 - locally compute participating SLP stmtsRichard Biener1-5/+64
This performs local re-computation of participating scalar stmts in BB vectorization subgraphs to allow precise computation of liveness of scalar stmts after vectorization and thus precise costing. This treats all extern defs as live but continues to optimistically handle scalar defs that we think we can handle by lane-extraction even though that can still fail late during code-generation. 2021-09-02 Richard Biener <rguenther@suse.de> PR tree-optimization/102176 * tree-vect-slp.c (vect_slp_gather_vectorized_scalar_stmts): New function. (vect_bb_slp_scalar_cost): Use the computed set of vectorized scalar stmts instead of relying on the out-of-date and not accurate PURE_SLP_STMT. (vect_bb_vectorization_profitable_p): Compute the set of vectorized scalar stmts.
2021-09-06Daily bump.GCC Administrator2-1/+35
2021-09-05libgo: update to final Go 1.17 releaseIan Lance Taylor13-66/+216
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/343729
2021-09-05Make the path solver's range_of_stmt() handle all statements.Aldy Hernandez1-5/+3
The path solver's range_of_stmt() was handcuffed to only fold GIMPLE_COND statements, since those were the only statements the backward threader needed to resolve. However, there is no need for this restriction, as the folding code is perfectly capable of folding any statement. This can be the case when trying to fold other statements in the final block of a path (for instance, in the forward threader as it tries to fold candidate statements along a path). Tested on x86-64 Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::range_of_stmt): Remove GIMPLE_COND special casing. (path_range_query::range_defined_in_block): Use range_of_stmt instead of calling fold_range directly.
2021-09-05Add an unreachable_path_p method to path_range_query.Aldy Hernandez3-11/+31
Keeping track of unreachable calculations while traversing a path is useful to determine edge reachability, among other things. We've been doing this ad-hoc in the backwards threader, so this provides a cleaner way of accessing the information. This patch also makes it easier to compare different threading implementations, in some upcoming work. For example, it's currently difficult to gauge how well we're doing compared to the forward threader, because it can thread paths that are obviously unreachable. This provides a way of discarding those paths. Note that I've opted to keep unreachable_path_p() out-of-line, because I have local changes that will enhance this method. Tested on x86-64 Linux. gcc/ChangeLog: * gimple-range-path.cc (path_range_query::range_of_expr): Set m_undefined_path when appropriate. (path_range_query::internal_range_of_expr): Copy from range_of_expr. (path_range_query::unreachable_path_p): New. (path_range_query::precompute_ranges): Set m_undefined_path. * gimple-range-path.h (path_range_query::unreachable_path_p): New. (path_range_query::internal_range_of_expr): New. * tree-ssa-threadbackward.c (back_threader::find_taken_edge_cond): Use unreachable_path_p.
2021-09-05Clean up registering of paths in backwards threader.Aldy Hernandez1-21/+23
All callers to maybe_register_path() call find_taken_edge() beforehand and pass the edge as an argument. There's no reason to repeat this at each call site. This is a clean-up in preparation for some other enhancements to the backwards threader. Tested on x86-64 Linux. gcc/ChangeLog: * tree-ssa-threadbackward.c (back_threader::maybe_register_path): Remove argument and call find_taken_edge. (back_threader::resolve_phi): Do not calculate taken edge before calling maybe_register_path. (back_threader::find_paths_to_names): Same.
2021-09-05Improve handling of C bit for setcc insnsJeff Law2-3/+120
gcc/ * config/h8300/h8300.md (QHSI2 mode iterator): New mode iterator. * config/h8300/testcompare.md (store_c): Update name, use new QHSI2 iterator. (store_neg_c, store_shifted_c): New patterns.
2021-09-05Daily bump.GCC Administrator1-1/+1
2021-09-04Daily bump.GCC Administrator9-1/+277
2021-09-03rs6000: Don't use r12 for CR save on ELFv2 (PR102107)Segher Boessenkool1-4/+7
CR is saved and/or restored on some paths where GPR12 is already live since it has a meaning in the calling convention in the ELFv2 ABI. It is not completely clear to me that we can always use r11 here, but it does seem safe, there is checking code (to detect conflicts here), and it is stage 1. So here goes. 2021-09-03 Segher Boessenkool <segher@kernel.crashing.org> PR target/102107 * config/rs6000/rs6000-logue.c (rs6000_emit_prologue): On ELFv2 use r11 instead of r12 for CR save, in all cases.
2021-09-03coroutines: Support for debugging implementation state.Iain Sandoe1-5/+11
Some of the state that is associated with the implementation is of interest to a user debugging a coroutine. In particular items such as the suspend point, promise object, and current suspend point. These variables live in the coroutine frame, but we can inject proxies for them into the outermost bind expression of the coroutine. Such variables are automatically moved into the coroutine frame (if they need to persist across a suspend expression). Placing the proxies thus allows the user to inspect them by name in the debugger. To implement this, we ensure that (at the outermost scope) the frame entries are not mangled (coroutine frame variables are usually mangled with scope nesting information so that they do not clash). We can safely avoid doing this for the outermost scope so that we can map frame entries directly to the variables. This is a partial contribution to debug support (PR 99215). Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/cp/ChangeLog: * coroutines.cc (register_local_var_uses): Do not mangle frame entries for the outermost scope. Record the outer scope as nesting depth 0.
2021-09-03coroutines: Add a helper for creating local vars.Iain Sandoe1-26/+45
This is primarily code factoring, but we take this opportunity to rename some of the implementation variables (which we intend to expose to debugging) so that they are in the implementation namespace. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/cp/ChangeLog: * coroutines.cc (coro_build_artificial_var): New. (build_actor_fn): Use var builder, rename vars to use implementation namespace. (coro_rewrite_function_body): Likewise. (morph_fn_to_coro): Likewise.
2021-09-03coroutines: Use DECL_VALUE_EXPR instead of rewriting vars.Iain Sandoe1-100/+5
Variables that need to persist over suspension expressions must be preserved by being copied into the coroutine frame. The initial implementations do this manually in the transform code. However, that has various disadvantages - including that the debug connections are lost between the original var and the frame copy. The revised implementation makes use of DECL_VALUE_EXPRs to contain the frame offset expressions, so that the original var names are preserved in the code. This process is also applied to the function parms which are always copied to the frame. In this case the decls need to be copied since they are used in two different contexts during the re-write (in the building of the ramp function, and in the actor function itself). This will assist in improvement of debugging (PR 99215). Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/cp/ChangeLog: * coroutines.cc (transform_local_var_uses): Record frame offset expressions as DECL_VALUE_EXPRs instead of rewriting them.