Age  Commit message  Author  Files  Lines
2020-10-28dump reason for throwing away SLP instanceRichard Biener1-1/+7
This adds dumping to vect_slp_analyze_node_alignment when it fails an SLP instance due to shared vector type conflicts. 2020-10-28 Richard Biener <rguenther@suse.de> * tree-vect-data-refs.c (vect_slp_analyze_node_alignment): Dump when vect_update_shared_vectype fails.
2020-10-28libstdc++: Fix name clash with _Cosh in QNX headers [PR 95592]Jonathan Wakely2-37/+66
This replaces unqualified names like _Cosh with struct std::_Cosh to ensure there is no ambiguity with other entities with the same name. libstdc++-v3/ChangeLog: PR libstdc++/95592 * include/bits/valarray_after.h (_DEFINE_EXPR_UNARY_OPERATOR) (_DEFINE_EXPR_BINARY_OPERATOR, _DEFINE_EXPR_BINARY_FUNCTION): Use elaborated-type-specifier and qualified-id to avoid ambiguities with QNX system headers. * testsuite/26_numerics/valarray/95592.cc: New test.
2020-10-28libstdc++: Make std::span layout-compatible with struct iovec [PR 95609]Jonathan Wakely2-6/+54
This change reorders the data members of std::span so that span<byte> is layout-compatible with common implementations of struct iovec. This will allow span<byte> to be used directly in places that use a struct iovec to do scatter-gather I/O. It's important to note that POSIX doesn't specify the order of members in iovec. Also the equivalent type on Windows has members in the other order, and uses type ULONG (which is always 32-bit whereas size_t is 64-bit for Win64). So this change will only help for certain targets and an indirection between std::span and I/O system calls will still be needed for the general case. libstdc++-v3/ChangeLog: PR libstdc++/95609 * include/std/span (span): Reorder data members to match common implementations of struct iovec. * testsuite/23_containers/span/layout_compat.cc: New test.
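A minimal sketch of what "layout-compatible" means for this change. It uses two hypothetical stand-in structs, `io_vec_like` and `span_like`, not the real `struct iovec` or `std::span`. The pointer-first member order is an assumption taken from the "common implementations" the commit message mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a common struct iovec layout: pointer, then length.
struct io_vec_like {
  void*       base;
  std::size_t len;
};

// Hypothetical stand-in for span<byte> after the reordering: pointer first,
// extent second.
struct span_like {
  unsigned char* ptr;
  std::size_t    extent;
};

// The two types agree member by member in size and offset.
static_assert(sizeof(io_vec_like) == sizeof(span_like), "same size");
static_assert(offsetof(io_vec_like, base) == offsetof(span_like, ptr),
              "pointer members at the same offset");
static_assert(offsetof(io_vec_like, len) == offsetof(span_like, extent),
              "length members at the same offset");

// Copy the object representation across; this relies on the matching layout
// checked above.
inline io_vec_like as_io_vec(const span_like& s) {
  io_vec_like v;
  std::memcpy(&v, &s, sizeof v);
  assert(v.len == s.extent);
  return v;
}
```

On a target where the I/O structure uses the other member order or a narrower length type, the static assertions would fail, which is why the commit message says an indirection is still needed in general.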
2020-10-28aarch64: Add vstN_lane_bf16 + vstNq_lane_bf16 intrinsicsAndrea Corallo10-49/+440
gcc/ChangeLog 2020-10-19 Andrea Corallo <andrea.corallo@arm.com> * config/aarch64/arm_neon.h (__ST2_LANE_FUNC, __ST3_LANE_FUNC) (__ST4_LANE_FUNC): Rename the macro generating the 'q' variants into __ST2Q_LANE_FUNC, __ST3Q_LANE_FUNC, __ST4Q_LANE_FUNC so they all can be undefed at the end of the file. (vst2_lane_bf16, vst2q_lane_bf16, vst3_lane_bf16, vst3q_lane_bf16) (vst4_lane_bf16, vst4q_lane_bf16): Add new intrinsics. gcc/testsuite/ChangeLog 2020-10-19 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/aarch64/advsimd-intrinsics/arm-neon-ref.h (hbfloat16_t): Define type. (CHECK_FP): Make it work for bfloat types. * gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_1.c: New file. * gcc.target/aarch64/advsimd-intrinsics/bf16_vstN_lane_2.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst2_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst2q_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst3_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst3q_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst4_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst4q_lane_bf16_indices_1.c: Likewise.
2020-10-28aarch64: Add bfloat16 vldN_lane_bf16 + vldNq_lane_bf16 intrinsicsAndrea Corallo9-57/+289
gcc/ChangeLog 2020-10-15 Andrea Corallo <andrea.corallo@arm.com> * config/aarch64/arm_neon.h (__LD2_LANE_FUNC, __LD3_LANE_FUNC) (__LD4_LANE_FUNC): Rename the macro generating the 'q' variants into __LD2Q_LANE_FUNC, __LD3Q_LANE_FUNC, __LD4Q_LANE_FUNC so they all can be undefed at the end of the file. (vld2_lane_bf16, vld2q_lane_bf16, vld3_lane_bf16, vld3q_lane_bf16) (vld4_lane_bf16, vld4q_lane_bf16): Add new intrinsics. gcc/testsuite/ChangeLog 2020-10-15 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_1.c: New testcase. * gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_2.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld2_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld2q_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld3_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld3q_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld4_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld4q_lane_bf16_indices_1.c: Likewise.
2020-10-28[PR97504] riscv needs wraplf for aux_long_long_float tooAlexandre Oliva1-0/+1
riscv is another platform on which GNAT maps Long_Long_Float to double rather than long double, so we have to explicitly avoid the long double intrinsics. for gcc/ada/ChangeLog PR ada/97504 * Makefile.rtl (LIBGNAT_TARGET_PAIRS) <riscv*-*-*>: Use wraplf version of Aux_Long_Long_Float.
2020-10-28openmp: Parsing and some semantic analysis of OpenMP allocate clauseJakub Jelinek17-57/+739
This patch adds parsing of OpenMP allocate clause, but still ignores it during OpenMP lowering where we should for privatized variables with allocate clause use the corresponding allocators rather than allocating them on the stack. 2020-10-28 Jakub Jelinek <jakub@redhat.com> gcc/ * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_ALLOCATE. * tree.h (OMP_CLAUSE_ALLOCATE_ALLOCATOR, OMP_CLAUSE_ALLOCATE_COMBINED): Define. * tree.c (omp_clause_num_ops, omp_clause_code_name): Add allocate clause. (walk_tree_1): Handle OMP_CLAUSE_ALLOCATE. * tree-pretty-print.c (dump_omp_clause): Likewise. * gimplify.c (gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses, gimplify_omp_for): Likewise. * tree-nested.c (convert_nonlocal_omp_clauses, convert_local_omp_clauses): Likewise. * omp-low.c (scan_sharing_clauses): Likewise. gcc/c-family/ * c-pragma.h (enum pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_ALLOCATE. * c-omp.c: Include bitmap.h. (c_omp_split_clauses): Handle OMP_CLAUSE_ALLOCATE. gcc/c/ * c-parser.c (c_parser_omp_clause_name): Handle allocate. (c_parser_omp_clause_allocate): New function. (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_ALLOCATE. (OMP_FOR_CLAUSE_MASK, OMP_SECTIONS_CLAUSE_MASK, OMP_PARALLEL_CLAUSE_MASK, OMP_SINGLE_CLAUSE_MASK, OMP_TASK_CLAUSE_MASK, OMP_TASKGROUP_CLAUSE_MASK, OMP_DISTRIBUTE_CLAUSE_MASK, OMP_TEAMS_CLAUSE_MASK, OMP_TARGET_CLAUSE_MASK, OMP_TASKLOOP_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_ALLOCATE. * c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_ALLOCATE. gcc/cp/ * parser.c (cp_parser_omp_clause_name): Handle allocate. (cp_parser_omp_clause_allocate): New function. (cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_ALLOCATE. (OMP_FOR_CLAUSE_MASK, OMP_SECTIONS_CLAUSE_MASK, OMP_PARALLEL_CLAUSE_MASK, OMP_SINGLE_CLAUSE_MASK, OMP_TASK_CLAUSE_MASK, OMP_TASKGROUP_CLAUSE_MASK, OMP_DISTRIBUTE_CLAUSE_MASK, OMP_TEAMS_CLAUSE_MASK, OMP_TARGET_CLAUSE_MASK, OMP_TASKLOOP_CLAUSE_MASK): Add PRAGMA_OMP_CLAUSE_ALLOCATE. 
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_ALLOCATE. * pt.c (tsubst_omp_clauses): Likewise. gcc/testsuite/ * c-c++-common/gomp/allocate-1.c: New test. * c-c++-common/gomp/allocate-2.c: New test. * c-c++-common/gomp/clauses-1.c (omp_allocator_handle_t): New typedef. (foo, bar, baz): Add allocate clauses where allowed.
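For readers unfamiliar with the clause, here is a hedged sketch of what source using it might look like. The function name is hypothetical. `allocate(x)` is written without an explicit allocator, so the default allocator would apply. Per the commit message, this commit only parses and checks the clause, so the private copies are still placed on the stack.

```cpp
#include <cassert>

// Returns 1 if every thread saw its firstprivate copy of x initialized to 42.
// Compiled without OpenMP support, the pragmas are ignored and the block runs
// once on the original variable.
int check_firstprivate_allocate() {
  int x = 42;
  int all_ok = 1;
  // allocate(x) requests that the privatized copies of x come from an OpenMP
  // allocator; x must also appear in a privatizing clause on this construct.
  #pragma omp parallel firstprivate(x) allocate(x) shared(all_ok)
  {
    if (x != 42) {
      #pragma omp atomic write
      all_ok = 0;
    }
  }
  assert(x == 42);  // the region only reads x, so the original is unchanged
  return all_ok;
}
```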
2020-10-28openmp: Implicitly discover declare target for variants of declare variant callsJakub Jelinek2-2/+63
This marks all variants of declare variant also declare target if the base functions are called directly in target regions or declare target functions. 2020-10-28 Jakub Jelinek <jakub@redhat.com> gcc/ * omp-offload.c (omp_declare_target_tgt_fn_r): Handle direct calls to declare variant base functions. libgomp/ * testsuite/libgomp.c/target-42.c: New test.
2020-10-28xfail and improve some failing libgomp tests [PR81690]Jakub Jelinek3-5/+31
With the patch I've posted today to fix up declare variant LTO handling, Tobias reported the patch still doesn't work, and there are two reasons for that. One is that when the base function is marked implicitly as declare target, we don't mark also implicitly the variants. I'll need to ask on omp-lang about details for that, but generally the compiler should do it some way. The other one is that the way base_delay is written, it will always call the usleep function, which is undesirable for nvptx. While the compiler will replace all direct calls to base_delay to nvptx_delay, the base_delay definition which calls usleep stays. 2020-10-28 Jakub Jelinek <jakub@redhat.com> Tom de Vries <tdevries@suse.de> PR testsuite/81690 * testsuite/libgomp.c/usleep.h: New file. * testsuite/libgomp.c/target-32.c: Include usleep.h. (main): Use tgt_usleep instead of usleep. * testsuite/libgomp.c/thread-limit-2.c: Include usleep.h. (main): Use tgt_usleep instead of usleep.
2020-10-28lto: LTO cgraph support for late declare variant resolution [PR96680]Jakub Jelinek9-7/+196
> I've tried to add the saving/restoring next to ipa refs saving/restoring, as > the declare variant alt stuff is kind of extension of those, unfortunately > following doesn't compile, because I need to also write or read a tree there > (ctx is a portion of DECL_ATTRIBUTES of the base function), but the ipa refs > write/read back functions don't have arguments that can be used for that. This patch adds the streaming out and in of those omp_declare_variant_alt hash table on the side data for the declare_variant_alt cgraph_nodes and treats for LTO purposes the declare_variant_alt nodes (which have no body) as if they contained a body that calls all the possible variants. After IPA all the calls to these magic declare_variant_alt calls are replaced with a call to one of the variants depending on which one has the highest score in the context. 2020-10-28 Jakub Jelinek <jakub@redhat.com> PR lto/96680 gcc/ * lto-streamer.h (omp_lto_output_declare_variant_alt, omp_lto_input_declare_variant_alt): Declare variant. * symtab.c (symtab_node::get_partitioning_class): Return SYMBOL_DUPLICATE for declare_variant_alt nodes. * passes.c (ipa_write_summaries): Add declare_variant_alt to partition. * lto-cgraph.c (output_refs): Call omp_lto_output_declare_variant_alt on declare_variant_alt nodes. (input_refs): Call omp_lto_input_declare_variant_alt on declare_variant_alt nodes. * lto-streamer-out.c (output_function): Don't call collect_block_tree_leafs if DECL_INITIAL is error_mark_node. (lto_output): Call output_function even for declare_variant_alt nodes. * omp-general.c (omp_lto_output_declare_variant_alt, omp_lto_input_declare_variant_alt): New functions. gcc/lto/ * lto-common.c (lto_fixup_prevailing_decls): Don't use LTO_NO_PREVAIL on TREE_LIST's TREE_PURPOSE. * lto-partition.c (lto_balanced_map): Treat declare_variant_alt nodes like definitions. libgomp/ * testsuite/libgomp.c/declare-variant-1.c: New test.
2020-10-28wide-int: Fix up set_bit_largeJakub Jelinek1-2/+5
> >> wide_int new_lb = wi::set_bit (r.lower_bound (0), 127)
> >>
> >> and creates the value:
> >>
> >> p new_lb
> >> {<wide_int_storage> = {val = {-65535, -1, 0}, len = 2, precision = 128},
> >> static is_sign_extended = true}
> >
> > This is non-canonical and so invalid, if the low HWI has the MSB set
> > and the high HWI is -1, it should have been just
> > val = {-65535}, len = 1, precision = 128}
> >
> > I guess the bug is that wi::set_bit_large doesn't call canonize.
>
> Yeah, looks like a micro-optimisation gone wrong.

2020-10-28 Jakub Jelinek <jakub@redhat.com>

* wide-int.cc (wi::set_bit_large): Call canonize unless setting msb bit and clearing bits above it.
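The canonical-form rule discussed in the quoted thread can be sketched as follows. This is a simplified illustration, not GCC's actual `canonize`: the function name `canonical_len` is hypothetical, blocks are assumed to be signed 64-bit words stored least-significant first, and the precision is ignored.

```cpp
#include <cassert>
#include <cstdint>

// A top block is redundant when it is exactly the sign extension of the block
// below it (0 for a non-negative block, -1 for a negative one). Drop such
// blocks until the representation is as short as possible.
unsigned canonical_len(const std::int64_t* val, unsigned len) {
  assert(len >= 1);
  while (len > 1) {
    const std::int64_t sign_ext = val[len - 2] < 0 ? -1 : 0;
    if (val[len - 1] != sign_ext)
      break;  // the top block carries information; keep it
    --len;
  }
  return len;
}
```

Applied to the value from the report, the blocks `{-65535, -1}` with length 2 shrink to length 1, because the low block already has its most significant bit set and the high block is all ones.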
2020-10-28[RS6000] power10 scan-assembler testsAlan Modra8-8/+8
On power10 these are "dg-do run" tests, so need -save-temps for the assembler scanning. * gcc.target/powerpc/vsx-load-element-extend-char.c: Add -save-temps. * gcc.target/powerpc/vsx-load-element-extend-int.c: Likewise. * gcc.target/powerpc/vsx-load-element-extend-longlong.c: Likewise. * gcc.target/powerpc/vsx-load-element-extend-short.c: Likewise. * gcc.target/powerpc/vsx-store-element-truncate-char.c: Likewise. * gcc.target/powerpc/vsx-store-element-truncate-int.c: Likewise. * gcc.target/powerpc/vsx-store-element-truncate-longlong.c: Likewise. * gcc.target/powerpc/vsx-store-element-truncate-short.c: Likewise.
2020-10-28[RS6000] dg-do !compile and scan-assemblerAlan Modra12-16/+14
These tests never checked assembly, because .s files were not produced. One was looking for the wrong instructions. A typical error log PASS: gcc.target/powerpc/vec-permute-ext-runnable.c (test for excess errors) gcc.target/powerpc/vec-permute-ext-runnable.c output file does not exist UNRESOLVED: gcc.target/powerpc/vec-permute-ext-runnable.c scan-assembler-times \\mpermx\\M 10 * gcc.target/powerpc/vec-blend-runnable.c: Add save-temps. * gcc.target/powerpc/vec-insert-word-runnable.c: Likewise. * gcc.target/powerpc/vec-permute-ext-runnable.c: Likewise. * gcc.target/powerpc/vec-replace-word-runnable.c: Likewise. * gcc.target/powerpc/vec-splati-runnable.c: Likewise. * gcc.target/powerpc/vec-ternarylogic-3.c: Likewise. * gcc.target/powerpc/vec-ternarylogic-9.c: Likewise. * gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise. * gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise. * gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise. * gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise. * gcc.target/powerpc/vec-shift-double-runnable.c: Likewise, and correct assembly match.
2020-10-27Tweaks to ranger API routines.Andrew MacLeod2-44/+50
Remove the gcc_assert wrappers that contain statements that need to be executed. Audit routines to ensure range is set to UNDEFINED when false is returned. * gimple-range-gori.cc (gori_compute_cache::cache_stmt): Accumulate return values and only set cache when everything returned true. * gimple-range.cc (get_tree_range): Set the return range to UNDEFINED when the range isn't supported. (gimple_ranger::calc_stmt): Return varying if the type is supported, even if the stmt processing failed. False otherwise. (range_of_builtin_ubsan_call): Don't use gcc_assert. (range_of_builtin_call): Ditto. (gimple_ranger::range_of_cond_expr): Ditto. (gimple_ranger::range_of_expr): Ditto. (gimple_ranger::range_on_entry): Ditto. (gimple_ranger::range_on_exit): Ditto. (gimple_ranger::range_on_edge): Ditto. (gimple_ranger::range_of_stmt): Don't use gcc_assert, and initialize return value to UNDEFINED.
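A small sketch of why wrapping a needed call in an assertion is risky. `SKETCH_ASSERT_DISABLED` and `compute_range` are hypothetical names. The macro stands in for an assertion that discards its argument when checking is turned off, and is not presented as GCC's actual `gcc_assert` definition.

```cpp
#include <cassert>

// Hypothetical assertion macro in its "checking disabled" form: the
// expression is parsed but never evaluated.
#define SKETCH_ASSERT_DISABLED(expr) ((void)(0 && (expr)))

// Hypothetical routine whose side effect (filling `out`) is required.
inline bool compute_range(int& out) {
  out = 7;
  return true;
}

// Problematic shape: the required call lives inside the assertion.
int wrapped_call() {
  int r = -1;
  SKETCH_ASSERT_DISABLED(compute_range(r));  // the call never runs
  return r;                                  // r keeps its initial value
}

// Safer shape: run the call unconditionally, then check its result.
int unwrapped_call() {
  int r = -1;
  const bool ok = compute_range(r);
  assert(ok);
  (void)ok;  // avoid an unused-variable warning when assert is compiled out
  return r;
}
```

With the macro in its disabled form, `wrapped_call()` returns the initial `-1`, while `unwrapped_call()` returns the computed value.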
2020-10-28Daily bump.GCC Administrator12-1/+646
2020-10-27c: Allow duplicate C2x standard attributesJoseph Myers4-71/+14
N2557, accepted into C2x at the October WG14 meeting, removes the requirement that duplicates of standard attributes cannot appear within an attribute list (so allowing e.g. [[deprecated, deprecated]], where previously that was disallowed but [[deprecated]] [[deprecated]] was OK). Remove the code checking for this (standard attributes aren't in any released version of the C standard) and update tests accordingly. Bootstrapped with no regressions on x86_64-pc-linux-gnu. gcc/c/ 2020-10-27 Joseph Myers <joseph@codesourcery.com> * c-parser.c (c_parser_std_attribute_specifier): Allow duplicate standard attributes. gcc/testsuite/ 2020-10-27 Joseph Myers <joseph@codesourcery.com> * gcc.dg/c2x-attr-deprecated-4.c, gcc.dg/c2x-attr-fallthrough-4.c, gcc.dg/c2x-attr-maybe_unused-4.c: Allow duplicate attributes.
2020-10-27libgo: update to Go 1.15.3 releaseIan Lance Taylor27-188/+366
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/265717
2020-10-27Fix PR97497Andreas Krebbel2-0/+43
This works around a limitation of gcse with handling of partially clobbered registers. With this patch our GOT pointer register r12 is not marked as partially clobbered anymore for the -m31 -mzarch -fpic combination. This is correct since all the bits in r12 we actually care about are in fact preserved. gcc/ChangeLog: PR rtl-optimization/97497 * config/s390/s390.c (s390_hard_regno_call_part_clobbered): Do not return true for r12 when -fpic is used. gcc/testsuite/ChangeLog: * gcc.target/s390/pr97497.c: New test.
2020-10-27PR fortran/97491 - Wrong restriction for VALUE arguments of pure proceduresHarald Anlauf2-0/+17
A dummy argument with the VALUE attribute may be redefined in a PURE or ELEMENTAL procedure. Adjust the associated purity check. gcc/fortran/ChangeLog: * resolve.c (gfc_impure_variable): A dummy argument with the VALUE attribute may be redefined without making a procedure impure. gcc/testsuite/ChangeLog: * gfortran.dg/value_8.f90: New test.
2020-10-27PPC testsuite fixesCarl Love5-11/+15
2020-10-27 Carl Love <cel@us.ibm.com> gcc/testsuite * gcc.target/powerpc/vec-blend-runnable.c: Change #ifdef DEBUG to #if DEBUG. Fix printf line so it is less than 80 characters long. * gcc.target/powerpc/vec-insert-word-runnable.c: Change #ifdef DEBUG to #if DEBUG. * gcc.target/powerpc/vec-permute-ext-runnable.c: Change #ifdef DEBUG to #if DEBUG. * gcc.target/powerpc/vec-replace-word-runnable.c: Change #ifdef DEBUG to #if DEBUG. Fix printf lines so they are less than 80 characters long. * gcc.target/powerpc/vec-shift-double-runnable.c: Change #ifdef DEBUG to #if DEBUG.
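The difference between the two preprocessor forms can be sketched as below. `SKETCH_DEBUG` is a hypothetical macro standing in for the tests' `DEBUG`, chosen to avoid clashing with anything defined on the command line.

```cpp
#include <cassert>

#define SKETCH_DEBUG 0  // defined, but set to "off"

// #ifdef only asks whether the macro exists, so a value of 0 still enables
// the guarded code.
int enabled_by_ifdef() {
#ifdef SKETCH_DEBUG
  return 1;
#else
  return 0;
#endif
}

// #if evaluates the macro's value, so 0 disables the guarded code.
int enabled_by_if() {
#if SKETCH_DEBUG
  return 1;
#else
  return 0;
#endif
}
```

With the macro defined as 0, `enabled_by_ifdef()` returns 1 and `enabled_by_if()` returns 0.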
2020-10-27compiler, go/internal/gccgoimporter: export notinheap annotationIan Lance Taylor6-9/+35
This is the gofrontend version of https://golang.org/cl/259297. This is required now because that change is in the 1.15.3 release. This requires changing the go/internal/gccgoimporter package, to skip the new annotation. This change will need to be ported to the gc and x/tools repos. For golang/go#41761 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/265258
2020-10-27compiler: remove unused Type::in_heap_ member variableIan Lance Taylor2-3/+1
This member variable was added in https://golang.org/cl/46490, but it was never used. The code uses Named_type::in_heap_ instead. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/265257
2020-10-27c++: Kill nested_udtsNathan Sidwell6-298/+25
During the implementation of modules I left myself a note to implement nested_udt handling. It wasn't obvious to me what they were for and nothing seemed to be broken in ignoring them. I figured something would eventually pop up and I'd add support. Nothing popped up. Investigating on trunk, I found 3 places where we look at the nested-udts. I couldn't figure how the one in lookup_field_r was needed -- surely the regular lookup would find the type. It turned out that code was unreachable. So we can delete it. Next in do_type_instantiation, we walk the nested-udt table instantiating types. But those types are also on the TYPE_FIELDS list, which we've just iterated over. So I can move the handling into that loop. The final use is in handling structs that have a typedef name for linkage purposes. Again, we can just iterate over TYPE_FIELDS. (As commented, we probably don't need to do even that, as a DR, whose number I forget, requires such structs to only have C-like things in them. But I didn't go that far.) Having removed all the uses of nested-udts, I can remove their creation from name-lookup, and as the only instance of a binding_table object, we can remove all that code too. gcc/cp/ * cp-tree.h (struct lang_type): Delete nested_udts field. (CLASSTYPE_NESTED_UTDS): Delete. * name-lookup.h (binding_table, binding_entry): Delete typedefs. (bt_foreach_proc): Likewise. (struct binding_entry_s): Delete. (SCOPE_DEFAULT_HT_SIZE, CLASS_SCOPE_HT_SIZE) (NAMESPACE_ORDINARY_HT_SIZE, NAMESPACE_STD_HT_SIZE) (GLOBAL_SCOPE_HT_SIZE): Delete. (binding_table_foreach, binding_table_find): Delete declarations. * name-lookup.c (ENTRY_INDEX): Delete. (free_binding_entry): Delete. (binding_entry_make, binding_entry_free): Delete. (struct binding_table_s): Delete. (binding_table_construct, binding_table_free): Delete. (binding_table_new, binding_table_expand): Delete. (binding_table_insert, binding_table_find): Delete. (binding_table_foreach): Delete. 
(maybe_process_template_type_declaration): Delete CLASSTYPE_NESTED_UTDS insertion. (do_pushtag): Likewise. * decl2.c (bt_reset_linkage_1): Fold into reset_type_linkage_1. (reset_type_linkage_2, bt_reset_linkage_2): Fold into reset_type_linkage. * pt.c (instantiate_class_template_1): Delete NESTED_UTDs comment. (bt_instantiate_type_proc): Delete. (do_type_instantiation): Instantiate implicit typedef fields. Delete NESTED_UTD walk. * search.c (lookup_field_r): Delete unreachable NESTED_UTD search.
2020-10-27c++: Small cleanup for do_type_instantiationNathan Sidwell2-56/+41
In working on a bigger cleanup I noticed some opportunities to make do_type_instantiation's control flow simpler. gcc/cp/ * parser.c (cp_parser_explicit_instantiation): Refactor some RAII. * pt.c (bt_instantiate_type_proc): DATA is the tree, pass type to do_type_instantiation. (do_type_instantiation): Require T to be a type. Refactor for some RAII.
2020-10-27AArch64: Fix overflow in memcopy expansion on aarch64.Tamar Christina2-4/+25
Currently the inline memcpy expansion code for AArch64 is using a signed int to hold the number of elements to copy. When you give it a value larger than INT_MAX it will overflow. The overflow causes the check on the maximum number of instructions we want to expand to fail, since this assumes an unsigned number. This patch changes the maximum insns arithmetic to be unsigned HOST_WIDE_INT. Note that the calculation *must* remain signed as the memcopy issues overlapping unaligned copies. This means the pointer must be moved back and so you need signed arithmetic. gcc/ChangeLog: PR target/97535 * config/aarch64/aarch64.c (aarch64_expand_cpymem): Use unsigned arithmetic in check. gcc/testsuite/ChangeLog: PR target/97535 * gcc.target/aarch64/pr97535.c: New test.
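One way a size check can go wrong when the count is narrowed to a signed int is sketched below. This is an illustration under stated assumptions, not the code in `aarch64_expand_cpymem`: the names, the 16-byte step, and the limit `kMaxOps` are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cap on the number of 16-byte copy operations to expand inline.
constexpr std::uint64_t kMaxOps = 8;

// Buggy shape: the byte count is narrowed to int before the limit check.
// A count above INT_MAX wraps (modular since C++20, implementation-defined
// before that), can become negative, and then passes the check.
bool should_inline_narrowed(std::uint64_t bytes) {
  const int n = static_cast<int>(bytes);
  return n / 16 <= static_cast<int>(kMaxOps);
}

// Fixed shape: the limit check is done on the full-width unsigned value.
bool should_inline_unsigned(std::uint64_t bytes) {
  return bytes / 16 <= kMaxOps;
}
```

For a small size such as 64 bytes both checks agree. For a size of 2^31 bytes the narrowed version typically accepts the copy (on implementations where the conversion wraps to a negative value), while the unsigned version rejects it.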
2020-10-27aarch64: Add vcopy(q)_lane(q)_bf16 intrinsicsAndrea Corallo10-0/+206
gcc/ChangeLog 2020-10-20 Andrea Corallo <andrea.corallo@arm.com> * config/aarch64/arm_neon.h (vcopy_lane_bf16, vcopyq_lane_bf16) (vcopyq_laneq_bf16, vcopy_laneq_bf16): New intrinsics. gcc/testsuite/ChangeLog 2020-10-20 Andrea Corallo <andrea.corallo@arm.com> * gcc.target/aarch64/advsimd-intrinsics/bf16_vect_copy_lane_1.c: New test. * gcc.target/aarch64/advsimd-intrinsics/vcopy_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vcopy_lane_bf16_indices_2.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vcopy_laneq_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vcopy_laneq_bf16_indices_2.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vcopyq_lane_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vcopyq_lane_bf16_indices_2.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vcopyq_laneq_bf16_indices_1.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vcopyq_laneq_bf16_indices_2.c: Likewise.
2020-10-27libstdc++: Fix ODR violations caused by <tr1/functional>Jonathan Wakely2-82/+66
The placeholders for std::tr1::bind are defined in an anonymous namespace, which means they have internal linkage. This will cause ODR violations when used in function templates (such as std::tr1::bind) from multiple translation units. Although probably harmless (every definition will generate identical code, even if technically ill-formed) we can avoid the ODR violations by reusing the std::placeholders objects as the std::tr1::placeholders objects. To make this work, the std::_Placeholder type needs to be defined for C++98 mode, so that <tr1/functional> can use it. The members of the std::placeholders namespace must not be defined by <functional> in C++98 mode, because "placeholders", "_1", "_2" etc. are not reserved names in C++98. Instead they can be declared in <tr1/functional>, because those names *are* reserved in that header. With the std::placeholders objects declared, a simple using-directive suffices to redeclare them in namespace std::tr1::placeholders. This means any use of the TR1 placeholders actually refers to the C++11 placeholders, which are defined with external linkage and exported from the library, so don't cause ODR violations. libstdc++-v3/ChangeLog: * include/std/functional (std::_Placeholder): Define for C++98 as well as later standards. * include/tr1/functional (std::placeholders::_1 etc): Declare for C++98. (tr1::_Placeholder): Replace with using-declaration for std::_Placeholder. (tr1::placeholders::_1 etc.): Replace with using-directive for std::placeholders.
2020-10-27libstdc++: Remove unused variables in special functionsJonathan Wakely2-11/+1
libstdc++-v3/ChangeLog: * include/tr1/ell_integral.tcc (__ellint_rf, __ellint_rd) (__ellint_rc, __ellint_rj): Remove unused variables. * include/tr1/modified_bessel_func.tcc (__airy): Likewise.
2020-10-27libstdc++: Fix -Wsign-compare warnings in headersJonathan Wakely4-5/+5
libstdc++-v3/ChangeLog: * include/bits/locale_conv.h (__str_codecvt_out_all): Add cast to compare operands of the same signedness. * include/bits/locale_facets_nonio.tcc (time_get::_M_extract_wday_or_month): Likewise. * include/bits/sstream.tcc (basic_stringbuf::overflow): Likewise. * include/tr1/legendre_function.tcc (__sph_legendre): Use unsigned for loop variable.
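The kind of fix described here, casting so that both comparison operands have the same signedness, can be sketched as below. The function `shorter_than` is a hypothetical example and is not taken from the libstdc++ headers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// s.size() is unsigned while limit is a signed int. Comparing them directly
// would convert limit to unsigned implicitly and trigger -Wsign-compare.
// Once limit is known to be non-negative, an explicit cast states the intent
// and makes both operands unsigned.
bool shorter_than(const std::string& s, int limit) {
  assert(limit >= 0);
  return s.size() < static_cast<std::size_t>(limit);
}
```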
2020-10-27Extract VX_CPU_PREFIX up into config/vxworks.hOlivier Hainque2-12/+12
Move VX_CPU_PREFIX to a place where it can be reused by multiple target ports. 2020-10-21 Olivier Hainque <hainque@adacore.com> gcc/ * config/vxworks.h (VX_CPU_PREFIX): #define here. * config/rs6000/vxworks.h: Remove #definition.
2020-10-27Fix glitch on VX_CPU selection for E6500Olivier Hainque1-1/+1
Proper macro name is PPCE6500, not E6500. Introduced accidentally during a pre-commit minor rearrangement. 2020-10-27 Olivier Hainque <hainque@adacore.com> gcc/ * config/rs6000/vxworks.h (CPP_SPEC): Fix macro definition for -mcpu=e6500.
2020-10-27Fix BB store group splitting group size computeRichard Biener2-1/+17
This fixes a mistake in the previous change in this area to what was desired - figure the largest power-of-two group size fitting in the matching area. 2020-10-27 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_build_slp_instance): Use ceil_log2 to compute maximum group-size. * gcc.dg/vect/bb-slp-67.c: New testcase.
2020-10-27Fix ipa-modref signature updatesJan Hubicka2-39/+59
PR ipa/97586 * ipa-modref-tree.h (modref_tree::remap_params): New member function. * ipa-modref.c (modref_summaries_lto::duplicate): Check that optimization summaries are not duplicated. (remap_arguments): Remove. (modref_transform): Rename to ... (update_signature): ... this one; handle also lto summary. (pass_ipa_modref::execute): Update signatures here rather than in transform hook.
2020-10-27libstdc++: Add missing noexcept to std::from_chars declarationsJonathan Wakely1-3/+3
libstdc++-v3/ChangeLog: * include/std/charconv (from_chars): Add noexcept to match definitions in src/c++17/floating_from_chars.cc
2020-10-27libstdc++: Fix directory_iterator exception specificationJonathan Wakely1-5/+1
libstdc++-v3/ChangeLog: * src/c++17/fs_dir.cc (fs::directory_iterator::operator*): Add noexcept. Do not throw on precondition violation.
2020-10-27libstdc++: Add noexcept to declaration of path::_List membersJonathan Wakely1-4/+4
libstdc++-v3/ChangeLog: * include/bits/fs_path.h (path::_List::begin, path::_List::end): Add noexcept to match definitions in src/c++17/fs_path.cc.
2020-10-27Add tests for PR92942 - missing -Wstringop-overflow for allocations with a negative lower bound size.Martin Sebor2-0/+254
gcc/testsuite/ChangeLog: PR middle-end/92942 * gcc.dg/Wstringop-overflow-56.c: New test. * gcc.dg/Wstringop-overflow-57.c: Same.
2020-10-27Remove .s file.Martin Sebor1-271/+0
gcc/testsuite/ChangeLog: * gcc.dg/Wstringop-overflow-44.s: Remove.
2020-10-27Combine logical OR ranges properly. pr97567Andrew MacLeod1-2/+2
Update the testcase to work on 32-bit targets. gcc/testsuite * gcc.dg/pr97567.c: Update to work with 32-bit targets.
2020-10-27Adjust BB vectorization function splittingRichard Biener3-21/+23
This adjusts the condition when to split at control altering stmts, only when there's a definition. It also removes the only use of --param slp-max-insns-in-bb, which a previous change left doing nothing (apart from repeatedly printing a message for each successive instruction...). 2020-10-27 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_slp_bbs): Remove no-op slp-max-insns-in-bb check. (vect_slp_function): Dump when splitting the function. Adjust the split condition for control altering stmts. * params.opt (-param=slp-max-insns-in-bb): Remove. * doc/invoke.texi (-param=slp-max-insns-in-bb): Likewise.
2020-10-27analyzer: don't assume extern const vars are zero-initialized [PR97568]David Malcolm3-2/+35
gcc/analyzer/ChangeLog: PR analyzer/97568 * region-model.cc (region_model::get_initial_value_for_global): Move check that !DECL_EXTERNAL from here to... * region.cc (decl_region::get_svalue_for_initializer): ...here, using it to reject zero initialization. gcc/testsuite/ChangeLog: PR analyzer/97568 * gcc.dg/analyzer/pr97568.c: New test.
2020-10-27analyzer: Change cast from long to intptr_t [PR96608]Markus Böck1-1/+1
Casting to intptr_t states the intent of an integer to pointer cast more clearly and ensures that the cast causes no loss of precision on any platform. On LLP64 platforms, for example, long is 4 bytes while pointers are 8 bytes, which may even cause compiler errors. gcc/analyzer/ChangeLog: PR analyzer/96608 * store.h (hash): Cast to intptr_t instead of long.
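A minimal sketch of hashing a pointer through `intptr_t` rather than `long`. The helper name `hash_pointer` is hypothetical and this is not the code in store.h. It assumes the platform provides `std::intptr_t`, which is optional in the standard but available on common targets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// intptr_t is defined to be wide enough to hold a pointer value, unlike long
// on LLP64 targets.
static_assert(sizeof(std::intptr_t) >= sizeof(void*),
              "intptr_t can hold a pointer value");

// Turn a pointer into a hash value without truncating it on the way.
std::size_t hash_pointer(const void* p) {
  const std::intptr_t bits = reinterpret_cast<std::intptr_t>(p);
  return static_cast<std::size_t>(bits);
}
```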
2020-10-27analyzer: eliminate non-deterministic behaviorDavid Malcolm5-31/+102
This patch is a followup to the previous one, eliminating non-determinism in the behavior of the analyzer (rather than just in the logs), by sorting whenever the result previously depended on pointer values. Tested as per the previous patch. gcc/analyzer/ChangeLog: * constraint-manager.cc (svalue_cmp_by_ptr): Delete. (equiv_class::canonicalize): Use svalue::cmp_ptr_ptr instead. (equiv_class_cmp): Eliminate pointer comparison. * diagnostic-manager.cc (dedupe_key::comparator): If they are at the same location, also compare epath length and pending_diagnostic kind. * engine.cc (readability_comparator): If two path_vars have the same readability, then impose an arbitrary ordering on them. (worklist::key_t::cmp): If two points have the same plan ordering, continue the comparison. Call sm_state_map::cmp rather than comparing hash values. * program-state.cc (sm_state_map::entry_t::cmp): New. (sm_state_map::cmp): New. * program-state.h (sm_state_map::entry_t::cmp): New decl. (sm_state_map::elements): New. (sm_state_map::cmp): New.
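The general idea of replacing address-based ordering with a stable key can be sketched as follows. The `node` type and its `uid` field are hypothetical and do not correspond to any analyzer class.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical object with a stable identifier assigned at creation time.
struct node {
  int uid;
};

// Sorting by address would give an order that changes from run to run under
// address space randomization. Sorting by uid gives the same order every run.
std::vector<const node*> sorted_by_uid(std::vector<const node*> nodes) {
  std::sort(nodes.begin(), nodes.end(),
            [](const node* a, const node* b) {
              assert(a && b);
              return a->uid < b->uid;
            });
  return nodes;
}
```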
2020-10-27analyzer: eliminate non-determinism in logsDavid Malcolm13-65/+493
This patch and the followup eliminate various forms of non-determinism in the analyzer due to changing pointer values. This patch fixes churn seen when diffing analyzer logs. The patch avoids embedding pointers in various places, and adds sorting when dumping hash_set and hash_map for various analyzer types. Doing so requires implementing a way to sort svalue instances, and assigning UIDs to gimple statements. Tested both patches together via a script that runs a testcase 100 times, and then using diff and md5sum to verify that the results are consistent in the face of address space randomization:

  FILENAME=$1
  rm $FILENAME.*
  for i in `seq 1 100`; do
    echo "iteration: $i"
    ./xgcc -B. -fanalyzer -c ../../src/gcc/testsuite/gcc.dg/analyzer/$FILENAME \
      --Wanalyzer-too-complex \
      -fdump-analyzer-supergraph \
      -fdump-analyzer-exploded-graph \
      -fdump-analyzer \
      -fdump-noaddr \
      -fdump-analyzer-exploded-nodes-2
    mv $FILENAME.supergraph.dot $FILENAME.$i.supergraph.dot
    mv $FILENAME.analyzer.txt $FILENAME.$i.analyzer.txt
    mv $FILENAME.supergraph-eg.dot $FILENAME.$i.supergraph-eg.dot
    mv $FILENAME.eg.txt $FILENAME.$i.eg.txt
    mv $FILENAME.eg.dot $FILENAME.$i.eg.dot
  done

gcc/analyzer/ChangeLog: * engine.cc (setjmp_record::cmp): New. (supernode_cluster::dump_dot): Avoid embedding pointer in cluster name. (supernode_cluster::cmp_ptr_ptr): New. (function_call_string_cluster::dump_dot): Avoid embedding pointer in cluster name. Sort m_map when dumping child clusters. (function_call_string_cluster::cmp_ptr_ptr): New. (root_cluster::dump_dot): Sort m_map when dumping child clusters. * program-point.cc (function_point::cmp): New. (function_point::cmp_ptr): New. * program-point.h (function_point::cmp): New decl. (function_point::cmp_ptr): New decl. * program-state.cc (sm_state_map::print): Sort the values. Guard the printing of pointers with !flag_dump_noaddr. (program_state::prune_for_point): Sort the regions. (log_set_of_svalues): Sort the values. 
Guard the printing of pointers with !flag_dump_noaddr. * region-model-manager.cc (log_uniq_map): Sort the values. * region-model-reachability.cc (dump_set): New function template. (reachable_regions::dump_to_pp): Use it. * region-model.h (svalue::cmp_ptr): New decl. (svalue::cmp_ptr_ptr): New decl. (setjmp_record::cmp): New decl. (placeholder_svalue::get_name): New accessor. (widening_svalue::get_point): New accessor. (compound_svalue::get_map): New accessor. (conjured_svalue::get_stmt): New accessor. (conjured_svalue::get_id_region): New accessor. (region::cmp_ptrs): Rename to... (region::cmp_ptr_ptr): ...this. * region.cc (region::cmp_ptrs): Rename to... (region::cmp_ptr_ptr): ...this. * state-purge.cc (state_purge_per_ssa_name::state_purge_per_ssa_name): Sort m_points_needing_name when dumping. * store.cc (concrete_binding::cmp_ptr_ptr): New. (symbolic_binding::cmp_ptr_ptr): New. (binding_map::cmp): New. (get_sorted_parent_regions): Update for renaming of region::cmp_ptrs to region::cmp_ptr_ptr. (store::dump_to_pp): Likewise. (store::to_json): Likewise. (store::can_merge_p): Sort the base regions before considering them. * store.h (concrete_binding::cmp_ptr_ptr): New decl. (symbolic_binding::cmp_ptr_ptr): New decl. (binding_map::cmp): New decl. * supergraph.cc (supergraph::supergraph): Assign UIDs to the gimple stmts. * svalue.cc (cmp_cst): New. (svalue::cmp_ptr): New. (svalue::cmp_ptr_ptr): New.
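The sorting approach described in this commit can be illustrated with a minimal sketch. It is not the analyzer's code: `item`, `cmp_by_uid`, and `dump_sorted` are hypothetical names, and the sketch assumes every object carries a unique UID assigned in creation order. The point is that a pointer-keyed hash container iterates in an order that can change from run to run, so the dump copies the elements out and sorts them on the stable UID instead of the address.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an analyzer object; not GCC's svalue class.
struct item
{
  int uid;           // assumed unique, assigned in creation order
  std::string name;
};

// Order by stable UID, never by address.
static bool
cmp_by_uid (const item *a, const item *b)
{
  return a->uid < b->uid;
}

// Dump the set in UID order so that the text does not depend on hash
// iteration order or on where the objects happen to be allocated.
std::string
dump_sorted (const std::unordered_set<const item *> &s)
{
  std::vector<const item *> v (s.begin (), s.end ());
  std::sort (v.begin (), v.end (), cmp_by_uid);

  std::ostringstream out;
  for (const item *p : v)
    {
      assert (p != nullptr);
      // Print the UID and name only; no pointer value is embedded.
      out << p->uid << ":" << p->name << "\n";
    }
  return out.str ();
}
```

Calling `dump_sorted` on the same logical set gives the same text regardless of insertion order or address space randomization, which is what makes diffing two logs meaningful.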
2020-10-27analyzer: fix param "analyzer-max-enodes-per-program-point"David Malcolm1-1/+1
This was effectively checking for one beyond the limit, rather than the limit itself. Seen when fixing PR analyzer/97514. gcc/analyzer/ChangeLog: * engine.cc (exploded_graph::get_or_create_node): Fix off-by-one when imposing param_analyzer_max_enodes_per_program_point limit.
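A small sketch of this kind of off-by-one follows. It assumes the check compared the current node count against the limit with a strict `>` where `>=` was intended; the commit message does not show the actual comparison, and `nodes_created` is a hypothetical helper rather than code from engine.cc.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a per-program-point limit: make REQUESTS
// attempts to create a node and return how many nodes exist afterwards.
// With STRICT false the check mimics the assumed buggy form
// (count > limit); with STRICT true it uses the fixed form
// (count >= limit).
std::size_t
nodes_created (std::size_t requests, std::size_t limit, bool strict)
{
  assert (limit > 0);
  std::size_t count = 0;
  for (std::size_t i = 0; i < requests; ++i)
    {
      bool at_limit = strict ? (count >= limit) : (count > limit);
      if (at_limit)
        continue;  // reject this request
      ++count;     // create the node
    }
  return count;
}
```

With a limit of 8, the non-strict form lets a ninth node through before it starts rejecting, while the strict form stops at exactly 8.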
2020-10-27libstdc++: Include <cstdint> in tests that use std::uintptr_tJonathan Wakely2-0/+2
libstdc++-v3/ChangeLog: * testsuite/experimental/memory_resource/new_delete_resource.cc: Add missing <cstdint> header. * testsuite/experimental/memory_resource/resource_adaptor.cc: Likewise.
2020-10-27analyzer: implement region_model::get_representative_path_var for labelsDavid Malcolm2-1/+6
This fixes an ICE seen e.g. with gcc.dg/analyzer/data-model-16.c when enabling -fdump-analyzer. gcc/analyzer/ChangeLog: * region-model.cc (region_model::get_representative_path_var): Implement case RK_LABEL. * region-model.h (label_region::get_label): New accessor.
2020-10-27testsuite: restrict test to c++11 and later [PR97590]Jakub Jelinek1-1/+2
2020-10-27 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/97560 PR testsuite/97590 * g++.dg/pr97560.C: Require c++11 effective target and add comment with PR number.
2020-10-27Refactor array descriptor field accessRichard Biener1-128/+56
This refactors the array descriptor component access tree building to commonize code into new helpers to provide a single place to fix correctness issues with respect to TBAA. The only interesting part is the gfc_conv_descriptor_data_get change to drop broken special-casing of REFERENCE_TYPE desc which, when hit, would build invalid GENERIC trees, missing an INDIRECT_REF before subsetting the descriptor with a COMPONENT_REF. 2020-10-16 Richard Biener <rguenther@suse.de> gcc/fortran/ChangeLog: * trans-array.c (gfc_get_descriptor_field): New helper. (gfc_conv_descriptor_data_get): Use it - drop strange REFERENCE_TYPE handling and make sure we don't trigger it. (gfc_conv_descriptor_data_addr): Use gfc_get_descriptor_field. (gfc_conv_descriptor_data_set): Likewise. (gfc_conv_descriptor_offset): Likewise. (gfc_conv_descriptor_dtype): Likewise. (gfc_conv_descriptor_span): Likewise. (gfc_get_descriptor_dimension): Likewise. (gfc_conv_descriptor_token): Likewise. (gfc_conv_descriptor_subfield): New helper. (gfc_conv_descriptor_stride): Use it. (gfc_conv_descriptor_lbound): Likewise. (gfc_conv_descriptor_ubound): Likewise.
2020-10-27SLP vectorize across PHI nodesRichard Biener24-235/+1031
This makes SLP discovery detect backedges by seeding the bst_map with the node to be analyzed so it can be picked up from recursive calls. This removes the need to discover backedges in a separate walk. This enables SLP build to handle PHI nodes in full, continuing the SLP build to non-backedges. For loop vectorization this enables outer loop vectorization of nested SLP cycles and for BB vectorization this enables vectorization of PHIs at CFG merges.

It also turns code generation into an SCC discovery walk to handle irreducible regions and nodes only reachable via backedges, where we now also fill in vectorized backedge defs. This requires sanitizing the SLP tree for SLP reduction chains even more, manually filling the backedge SLP def.

This also exposes the fact that CFG copying (and edge splitting, until I fixed that) ends up with a different edge order in the copy, which doesn't play well with the desired 1:1 mapping of SLP PHI node children and edges for epilogue vectorization. I tried to fix up CFG copying here, but that really looks like a dead (or expensive) end, so I've done the fixup in slpeel_tree_duplicate_loop_to_edge_cfg instead for the cases we can run into.

There are still NULLs in the SLP_TREE_CHILDREN vectors and I'm not sure it's possible to eliminate them all during this stage1, so the patch has quite a few checks for this case all over the place.

Bootstrapped and tested on x86_64-unknown-linux-gnu. SPEC CPU 2017 and SPEC CPU 2006 successfully built and tested.

2020-10-27 Richard Biener <rguenther@suse.de> * gimple.h (gimple_expr_type): For PHIs return the type of the result. * tree-vect-loop-manip.c (slpeel_tree_duplicate_loop_to_edge_cfg): Make sure edge order into copied loop headers line up with the originals. * tree-vect-loop.c (vect_transform_cycle_phi): Handle nested loops with SLP. (vectorizable_phi): New function. (vectorizable_live_operation): For BB vectorization compute insert location here. 
* tree-vect-slp.c (vect_free_slp_tree): Deal with NULL SLP_TREE_CHILDREN entries. (vect_create_new_slp_node): Add overloads with pre-existing node argument. (vect_print_slp_graph): Likewise. (vect_mark_slp_stmts): Likewise. (vect_mark_slp_stmts_relevant): Likewise. (vect_gather_slp_loads): Likewise. (vect_optimize_slp): Likewise. (vect_slp_analyze_node_operations): Likewise. (vect_bb_slp_scalar_cost): Likewise. (vect_remove_slp_scalar_calls): Likewise. (vect_get_and_check_slp_defs): Handle PHIs. (vect_build_slp_tree_1): Handle PHIs. (vect_build_slp_tree_2): Continue SLP build, following PHI arguments. Fix memory leak. (vect_build_slp_tree): Put stub node into the hash-map so we can discover cycles directly. (vect_build_slp_instance): Set the backedge SLP def for reduction chains. (vect_analyze_slp_backedges): Remove. (vect_analyze_slp): Do not call it. (vect_slp_convert_to_external): Release SLP_TREE_LOAD_PERMUTATION. (vect_slp_analyze_node_operations): Handle stray failed backedge defs by failing. (vect_slp_build_vertices): Adjust leaf condition. (vect_bb_slp_mark_live_stmts): Handle PHIs, use visited hash-set to handle cycles. (vect_slp_analyze_operations): Adjust. (vect_bb_partition_graph_r): Likewise. (vect_slp_function): Adjust split condition to allow CFG merges. (vect_schedule_slp_instance): Rename to ... (vect_schedule_slp_node): ... this. Move DFS walk to ... (vect_schedule_scc): ... this new function. (vect_schedule_slp): Call it. Remove ad-hoc vectorized backedge fill code. * tree-vect-stmts.c (vect_analyze_stmt): Call vectorizable_phi. (vect_transform_stmt): Likewise. (vect_is_simple_use): Handle vect_backedge_def. * tree-vectorizer.c (vec_info::new_stmt_vec_info): Only set loop header PHIs to vect_unknown_def_type for loop vectorization. * tree-vectorizer.h (enum vect_def_type): Add vect_backedge_def. (enum stmt_vec_info_type): Add phi_info_type. (vectorizable_phi): Declare. * gcc.dg/vect/bb-slp-54.c: New test. * gcc.dg/vect/bb-slp-55.c: Likewise. 
* gcc.dg/vect/bb-slp-56.c: Likewise. * gcc.dg/vect/bb-slp-57.c: Likewise. * gcc.dg/vect/bb-slp-58.c: Likewise. * gcc.dg/vect/bb-slp-59.c: Likewise. * gcc.dg/vect/bb-slp-60.c: Likewise. * gcc.dg/vect/bb-slp-61.c: Likewise. * gcc.dg/vect/bb-slp-62.c: Likewise. * gcc.dg/vect/bb-slp-63.c: Likewise. * gcc.dg/vect/bb-slp-64.c: Likewise. * gcc.dg/vect/bb-slp-65.c: Likewise. * gcc.dg/vect/bb-slp-66.c: Likewise. * gcc.dg/vect/vect-outer-slp-1.c: Likewise. * gfortran.dg/vect/O3-bb-slp-1.f: Likewise. * gfortran.dg/vect/O3-bb-slp-2.f: Likewise. * g++.dg/vect/simd-11.cc: Likewise.
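The idea of seeding the map with the node under construction, as described in the SLP commit message above, can be sketched in isolation. This is not the vectorizer's code: `node`, `builder`, and the integer keys are hypothetical stand-ins, and the sketch only shows the general technique of placing a stub into the memo map before recursing, so that a recursive call which reaches the same key again finds the in-progress node and closes the cycle instead of recursing forever.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical graph node; not the vectorizer's slp_tree.
struct node
{
  int key;
  std::vector<node *> children;
};

class builder
{
public:
  // DEPS maps each key to the keys of its operands; cycles are allowed.
  explicit builder (const std::map<int, std::vector<int>> &deps)
    : m_deps (deps) {}

  node *build (int key)
  {
    auto it = m_map.find (key);
    if (it != m_map.end ())
      // Either fully built, or still being built further up the
      // recursion. The second case is a backedge.
      return it->second.get ();

    // Seed the map with a stub before visiting the operands.
    auto owned = std::make_unique<node> ();
    node *n = owned.get ();
    n->key = key;
    m_map.emplace (key, std::move (owned));

    auto d = m_deps.find (key);
    if (d != m_deps.end ())
      for (int child : d->second)
        n->children.push_back (build (child));
    return n;
  }

private:
  std::map<int, std::vector<int>> m_deps;
  std::map<int, std::unique_ptr<node>> m_map;
};
```

For a dependency map where 1 uses 2 and 2 uses 1, `build (1)` returns a node whose child's only child is that same node, so the cycle is discovered during the single build walk rather than in a separate pass.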