path: root/gcc
2022-08-31  c: C2x attributes fixes and updates  (Joseph Myers; 11 files changed, -18/+60)

Implement some changes to the currently supported C2x standard attributes that have been made to the specification since they were first implemented in GCC, and some consequent changes:

* maybe_unused is now supported on labels. In fact that was already accidentally supported in GCC as a result of sharing the implementation with __attribute__ ((unused)), but needed to be covered in the tests.
* As part of the support for maybe_unused on labels, its __has_c_attribute value changed.
* The issue of maybe_unused accidentally being already supported on labels showed up the lack of tests for other standard attributes being incorrectly applied to labels; add such tests.
* Use of fallthrough or nodiscard attributes on labels already properly resulted in a pedwarn. For the deprecated attribute, however, there was only a warning, and the wording "'deprecated' attribute ignored for 'void'" included an unhelpful "for 'void'". Arrange for the case of the deprecated attribute on a label to be checked for separately and result in a pedwarn. As with inappropriate uses of fallthrough (see commit 6c80b1b56dec2691436f3e2676e3d1b105b01b89), it seems reasonable for this pedwarn to apply regardless of whether [[]] or __attribute__ was used and regardless of whether C or C++ is being compiled.
* Attributes on case or default labels (the standard syntax supports attributes on all kinds of labels) were quietly ignored, whether or not appropriate for use in such a context, because they weren't passed to decl_attributes at all. (Note where I'm changing the do_case prototype that such a function is actually only defined in the C front end, not for C++, despite the declaration being in c-common.h.)
* A recent change as part of the editorial review in preparation for the C2x CD ballot has changed the __has_c_attribute value for fallthrough to 201910 to reflect when that attribute was actually voted into the working draft.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c-family/
	* c-attribs.cc (handle_deprecated_attribute): Check and pedwarn for LABEL_DECL.
	* c-common.cc (c_add_case_label): Add argument ATTRS. Call decl_attributes.
	* c-common.h (do_case, c_add_case_label): Update declarations.
	* c-lex.cc (c_common_has_attribute): For C, produce a result of 201910 for fallthrough and 202106 for maybe_unused.

gcc/c/
	* c-parser.cc (c_parser_label): Pass attributes to do_case.
	* c-typeck.cc (do_case): Add argument ATTRS. Pass it to c_add_case_label.

gcc/testsuite/
	* gcc.dg/c2x-attr-deprecated-2.c, gcc.dg/c2x-attr-fallthrough-2.c, gcc.dg/c2x-attr-maybe_unused-1.c, gcc.dg/c2x-attr-nodiscard-2.c: Add tests of attributes on labels.
	* gcc.dg/c2x-has-c-attribute-2.c: Update expected results for maybe_unused and fallthrough.
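
A hedged sketch of the label cases described above (illustrative only, not one of the new tests; compile with -std=c2x -Wall -pedantic):

int
f (int x)
{
  /* maybe_unused is now accepted on a label and suppresses
     -Wunused-label, mirroring __attribute__ ((unused)).  */
 [[maybe_unused]] skipped: ;

  /* deprecated on a label now draws a pedwarn instead of the old
     "'deprecated' attribute ignored for 'void'" warning.  */
 [[deprecated]] old_label: ;

  return x;
}

/* The __has_c_attribute values mentioned above.  */
#if __has_c_attribute (maybe_unused) != 202106 \
    || __has_c_attribute (fallthrough) != 201910
#error "unexpected __has_c_attribute values"
#endif
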
2022-08-31  32-bit PA-RISC with HP-UX: remove deprecated ports  (Martin Liska; 12 files changed, -319/+8)

ChangeLog:
	* configure: Regenerate.
	* configure.ac: Delete hpux9 and hpux10.

config/ChangeLog:
	* mh-pa-hpux10: Removed.

contrib/ChangeLog:
	* config-list.mk: Remove deprecated ports.

contrib/header-tools/ChangeLog:
	* README: Remove deprecated ports.
	* reduce-headers: Likewise.

gcc/ChangeLog:
	* config.build: Remove deprecated ports.
	* config.gcc: Likewise.
	* config.host: Likewise.
	* configure.ac: Likewise.
	* configure: Regenerate.
	* config/pa/pa-hpux10.h: Removed.
	* config/pa/pa-hpux10.opt: Removed.
	* config/pa/t-dce-thr: Removed.

gnattools/ChangeLog:
	* configure.ac: Remove deprecated ports.
	* configure: Regenerate.

libstdc++-v3/ChangeLog:
	* configure: Regenerate.
	* crossconfig.m4: Remove deprecated ports.

gcc/testsuite/ChangeLog:
	* g++.dg/cpp0x/lambda/lambda-conv.C: Remove useless test.
	* gcc.c-torture/execute/ieee/hugeval.x: Likewise.
	* gcc.dg/torture/pr47917.c: Likewise.
	* lib/target-supports.exp: Likewise.

libgcc/ChangeLog:
	* config.host: Remove hppa.

libitm/ChangeLog:
	* configure: Regenerate.

fixincludes/ChangeLog:
	* configure: Regenerate.
2022-08-31  testsuite: Fix warning regression due to std::string changes [PR106795]  (Jonathan Wakely; 1 file changed, -1/+1)

std::string now has [[nodiscard]] attributes on most members, causing -Wunused-result warnings for this test.

gcc/testsuite/ChangeLog:
	PR testsuite/106795
	* g++.dg/tree-ssa/empty-loop.C: Use -Wno-unused-result.
2022-08-31  Support --disable-fixincludes.  (Martin Liska; 3 files changed, -19/+27)

Always install the limits.h and syslimits.h header files to the include folder. When --disable-fixincludes is used, no system header files are fixed by the tools in fixincludes. Moreover, the fixincludes tools are no longer built.

gcc/ChangeLog:
	* Makefile.in: Always install limits.h and syslimits.h to include folder.
	* configure.ac: Assign STMP_FIXINC blank if --disable-fixincludes is used.
	* configure: Regenerate.
2022-08-31  aarch64: Update sizeless tests for recent GNU C changes  (Richard Sandiford; 4 files changed, -8/+8)

The tests for sizeless SVE types include checks that the types are handled for initialisation purposes in the same way as scalars. GNU C and C2x now allow scalars to be initialised using empty braces, so this patch updates the SVE tests to match.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general-c/gnu_vectors_1.c: Update tests for empty initializers.
	* gcc.target/aarch64/sve/acle/general-c/gnu_vectors_2.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/sizeless-1.c: Likewise.
	* gcc.target/aarch64/sve/acle/general-c/sizeless-2.c: Likewise.
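
A minimal sketch of the kind of initialization the updated tests now expect to be accepted (illustrative only, not one of the testsuite files; assumes an SVE-enabled compiler, e.g. -march=armv8.2-a+sve):

#include <arm_sve.h>

svint32_t
f (void)
{
  svint32_t v = {};   /* empty-brace initialization, now handled like the scalar case */
  return v;
}
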
2022-08-31  Avoid fatal fails in predicate::init_from_control_deps  (Richard Biener; 1 file changed, -64/+55)

When processing USE predicates we can drop from the AND chain; when processing DEF predicates we can drop from the OR chain. Do that instead of giving up completely. This also removes cases that should never trigger.

	* gimple-predicate-analysis.cc (predicate::init_from_control_deps): Assert the guard_bb isn't empty and has more than one successor. Drop appropriate parts of the predicate when an edge fails to register a predicate.
	(predicate::dump): Dump empty predicate as TRUE.
2022-08-31  tree-optimization/90994 - fix uninit diagnostics with EH  (Richard Biener; 2 files changed, -5/+48)

r12-3640-g94c12ffac234b2 sneaked in a hack to avoid the diagnostic for the testcase in PR90994, which sees non-call EH control flow confusing predicate analysis. The following patch instead adjusts the existing code handling EH to handle non-calls and do what I think was intended.

	PR tree-optimization/90994
	* gimple-predicate-analysis.cc (predicate::init_from_control_deps): Ignore exceptional control flow and skip the edge for the purpose of predicate generation also for non-calls.
	* g++.dg/torture/pr90994.C: New testcase.
2022-08-31  Stream out endpoints for frange.  (Aldy Hernandez; 2 files changed, -14/+12)

We only stream out the FP properties for global float ranges (currently only NAN). The following patch adds the endpoints as well.

gcc/ChangeLog:
	* value-range-storage.cc (frange_storage_slot::set_frange): Save endpoints.
	(frange_storage_slot::get_frange): Restore endpoints.
	* value-range-storage.h (class frange_storage_slot): Add endpoint fields.
2022-08-31  remove unused function  (Martin Liska; 1 file changed, -16/+0)

	PR tree-optimization/106789

gcc/ChangeLog:
	* range-op-float.cc (default_frelop_fold_range): Remove the function.
2022-08-31  fix clang warnings (-Winconsistent-missing-override)  (Martin Liska; 1 file changed, -4/+4)

gcc/ChangeLog:
	* value-range.h: Add more override keywords.
2022-08-31  fix -Winconsistent-missing-override clang warning  (Martin Liska; 1 file changed, -1/+1)

Fixes:
gcc/value-range.h:357:16: warning: 'set_nonnegative' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]

gcc/ChangeLog:
	* value-range.h: Add override.
2022-08-31  tree-optimization/65244 - include asserts in predicates for uninit  (Richard Biener; 3 files changed, -7/+32)

When uninit computes the actual predicates from the control dependence edges, it currently skips those that are assert-like (where one edge leads to a block which ends in a noreturn call). That leads to bogus uninit diagnostics when applied on the USE side.

	PR tree-optimization/65244
	* gimple-predicate-analysis.h (predicate::init_from_control_deps): Add argument to specify whether the predicate is for the USE.
	* gimple-predicate-analysis.cc (predicate::init_from_control_deps): Also include predicates for effective fallthru control edges when the predicate is for the USE.
	* gcc.dg/uninit-pr65244-2.c: New testcase.
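
A hedged sketch of the "assert-like" shape being described (illustrative, not the new testcase): one successor of the guard ends in a noreturn call, and the predicate for the fallthru edge is now taken into account on the USE side.

extern void fail (void) __attribute__ ((noreturn));

int
f (int flag)
{
  int x;
  if (flag)
    x = 1;
  if (!flag)
    fail ();   /* assert-like guard: this path never falls through */
  return x;    /* reachable only when flag is nonzero, so x is set */
}
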
2022-08-31  tree-optimization/73550 - more switch handling improvements for uninit  (Richard Biener; 1 file changed, -27/+50)

The following makes predicate analysis handle case labels with a non-singleton contiguous range.

	PR tree-optimization/73550
	* gimple-predicate-analysis.cc (predicate::init_from_control_deps): Sanitize debug dumping. Handle case labels with a CASE_HIGH.
	(predicate::dump): Adjust for better readability.
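
For reference, CASE_HIGH is what represents a contiguous case range such as the GNU `case 1 ... 3:` extension. A hedged sketch of the kind of guard involved (illustrative, not the PR's testcase):

int
f (int v)
{
  int x;
  switch (v)
    {
    case 1 ... 3:      /* non-singleton contiguous range -> CASE_HIGH */
      x = v;
      break;
    }
  if (v >= 1 && v <= 3)
    return x;          /* use guarded by the same contiguous range */
  return 0;
}
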
2022-08-31  uninit testcase for PR65244  (Richard Biener; 1 file changed, -0/+20)

PR65244 has an issue with code in init_from_control_deps for which there's no direct testcase. The following adds one.

	PR tree-optimization/65244
	* gcc.dg/uninit-pr65244-1.c: New testcase.
2022-08-31  omp-simd-clone: Unbreak bootstrap  (Jakub Jelinek; 1 file changed, -3/+3)

This patch fixes -Werror=sign-compare errors during stage2/stage3.

2022-08-30  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
	    Jakub Jelinek  <jakub@redhat.com>

	* omp-simd-clone.cc (simd_clone_adjust_return_type, simd_clone_adjust_argument_types): Use known_eq (veclen, 0U) instead of known_eq (veclen, 0) to avoid -Wsign-compare warnings.
2022-08-31  vect: Fix stray argument in call to dump_printf_loc  (Richard Sandiford; 1 file changed, -1/+1)

One call to dump_printf_loc had a stray left-over argument from an earlier version of the patch. This went unnoticed on aarch64-linux-gnu and x86_64-linux-gnu since the parameters that actually mattered were passed in FPRs rather than GPRs, but I assume this is the reason for the i686-linux-gnu failures that Jakub hit.

gcc/
	* tree-vect-slp.cc (vect_optimize_slp_pass::dump): Remove bogus argument.
2022-08-31  middle-end: Fix unexpected warnings for RISC-V port.  (zhongjuzhe; 1 file changed, -1/+2)

gcc/ChangeLog:
	* tree-vect-loop-manip.cc (vect_gen_vector_loop_niters): Simply initialize const_vf to 0.
2022-08-31  cr16: remove leftover in config.gcc  (Martin Liska; 1 file changed, -7/+1)

gcc/ChangeLog:
	* config.gcc: Remove cr16.
2022-08-31  Daily bump.  (GCC Administrator; 5 files changed, -1/+350)
2022-08-30  Update gcc sv.po  (Joseph Myers; 1 file changed, -12/+9)

	* sv.po: Update.
2022-08-30  vec: Add array_slice constructors from non-const and gc vectors  (Martin Jambor; 1 file changed, -0/+12)

This patch adds constructors of array_slice that are required to create them from non-const (heap or auto) vectors or from GC vectors.

gcc/ChangeLog:

2022-08-08  Martin Jambor  <mjambor@suse.cz>

	* vec.h (array_slice): Add constructors for non-const reference to heap vector and pointers to heap vectors.
2022-08-30  Improve union of ranges containing NAN.  (Aldy Hernandez; 1 file changed, -10/+34)

Previously [5,6] U NAN would just drop to VARYING. With this patch, the resulting range becomes [5,6] with the NAN bit set to unknown.

[I still have yet to decide what to do with intersections. ISTM, the intersection of a known NAN with anything else should be a NAN, but it could also be undefined (the empty set). I'll have to run some tests and see. Currently, we drop to VARYING cause well... it's always safe to give up ;-).]

gcc/ChangeLog:
	* value-range.cc (early_nan_resolve): Change comment.
	(frange::union_): Handle union when one side is a NAN.
	(range_tests_nan): Add tests for NAN union.
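
A hedged sketch of where such a union shows up (illustrative only): merging a known-NAN value with a [5.0, 6.0] value at a PHI.

double
f (int pick_nan, int pick_six)
{
  double r;
  if (pick_nan)
    r = __builtin_nan ("");        /* known NAN */
  else
    r = pick_six ? 6.0 : 5.0;      /* [5.0, 6.0], no NAN */
  /* The union at the PHI used to become VARYING; now it stays
     [5.0, 6.0] with the NAN bit unknown.  */
  return r;
}
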
2022-08-30  amdgcn: OpenMP SIMD routine support  (Andrew Stubbs; 7 files changed, -0/+72)

Enable and configure SIMD clones for amdgcn. This affects both the __simd__ function attribute and the OpenMP "declare simd" directive. Note that the masked SIMD variants are generated, but the middle end doesn't actually support calling them yet.

gcc/ChangeLog:
	* config/gcn/gcn.cc (gcn_simd_clone_compute_vecsize_and_simdlen): New.
	(gcn_simd_clone_adjust): New.
	(gcn_simd_clone_usable): New.
	(TARGET_SIMD_CLONE_ADJUST): New.
	(TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN): New.
	(TARGET_SIMD_CLONE_USABLE): New.

gcc/testsuite/ChangeLog:
	* gcc.dg/vect/vect-simd-clone-1.c: Add dg-warning.
	* gcc.dg/vect/vect-simd-clone-2.c: Add dg-warning.
	* gcc.dg/vect/vect-simd-clone-3.c: Add dg-warning.
	* gcc.dg/vect/vect-simd-clone-4.c: Add dg-warning.
	* gcc.dg/vect/vect-simd-clone-5.c: Add dg-warning.
	* gcc.dg/vect/vect-simd-clone-8.c: Add dg-warning.
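
A minimal sketch of the kind of SIMD routine this enables clones for (the function itself is illustrative; the pragma form needs -fopenmp or -fopenmp-simd):

/* OpenMP "declare simd" spelling.  */
#pragma omp declare simd
float
scale (float x, float factor)
{
  return x * factor;
}

/* Equivalent spelling using the function attribute mentioned above.  */
__attribute__ ((simd))
float
scale2 (float x, float factor)
{
  return x * factor;
}
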
2022-08-30  omp-simd-clone: Allow fixed-lane vectors  (Andrew Stubbs; 3 files changed, -5/+21)

The vecsize_int/vecsize_float fields have an assumption that all arguments will use the same bitsize, and vary the number of lanes according to the element size, but this is inappropriate on targets where the number of lanes is fixed and the bitsize varies (i.e. amdgcn). With this change the vecsize can be left zero and the vectorization factor will be the same for all types.

gcc/ChangeLog:
	* doc/tm.texi: Regenerate.
	* omp-simd-clone.cc (simd_clone_adjust_return_type): Allow zero vecsize.
	(simd_clone_adjust_argument_types): Likewise.
	* target.def (compute_vecsize_and_simdlen): Document the new vecsize_int and vecsize_float semantics.
2022-08-30  expmed: Fix store_bit_field_1 subreg offset  (Richard Sandiford; 1 file changed, -6/+6)

store_bit_field_1 tries to convert a field assignment into a subreg assignment. Normally it must check that the field occupies a full word (or more specifically, a full REGMODE_NATURAL_SIZE chunk), so that writing to the subreg doesn't clobber any other fields. But it can skip that check if the structure is known to be in an undefined state.

The idea was that, in the undefined case, we could rely on simplify_gen_subreg to do the check for a valid subreg, rather than having to repeat the required endianness logic in the caller.

Before the addition of the undefined case, the code could use regnum * regsize to get the byte offset, where regnum came from checking that the start was word-aligned. In the undefined case we need to calculate the byte offset explicitly.

gcc/
	* expmed.cc (store_bit_field_1): Fix byte offset calculation for undefined structures.
2022-08-30  Extend SLP permutation optimisations  (Richard Sandiford; 23 files changed, -404/+1975)

Currently SLP tries to force permute operations "down" the graph from loads in the hope of reducing the total number of permutations needed or (in the best case) removing the need for the permutations entirely. This patch tries to extend it as follows:

- Allow loads to take a different permutation from the one they started with, rather than choosing between "original permutation" and "no permutation".
- Allow changes in both directions, if the target supports the reverse permutation.
- Treat the placement of permutations as a two-way dataflow problem: after propagating information from leaves to roots (as now), propagate information back up the graph.
- Take execution frequency into account when optimising for speed, so that (for example) permutations inside loops have a higher cost than permutations outside loops.
- Try to reduce the total number of permutations when optimising for size, even if that increases the number of permutations on a given execution path.

See the big block comment above vect_optimize_slp_pass for a detailed description.

The original motivation for doing this was to add a framework that would allow other layout differences in future. The two main ones are:

- Make it easier to represent predicated operations, including predicated operations with gaps. E.g.:

    a[0] += 1;
    a[1] += 1;
    a[3] += 1;

  could be a single load/add/store for SVE. We could handle this by representing a layout such as { 0, 1, _, 2 } or { 0, 1, _, 3 } (depending on what's being counted). We might need to move elements between lanes at various points, like with permutes. (This would first mean adding support for stores with gaps.)

- Make it easier to switch between an even/odd and unpermuted layout when switching between wide and narrow elements. E.g. if a widening operation produces an even vector and an odd vector, we should try to keep operations on the wide elements in that order rather than force them to be permuted back "in order".

To give some examples of what the patch does:

int f1(int *__restrict a, int *__restrict b, int *__restrict c, int *__restrict d)
{
  a[0] = (b[1] << c[3]) - d[1];
  a[1] = (b[0] << c[2]) - d[0];
  a[2] = (b[3] << c[1]) - d[3];
  a[3] = (b[2] << c[0]) - d[2];
}

continues to produce the same code as before when optimising for speed: b, c and d are permuted at load time. But when optimising for size we instead permute c into the same order as b+d and then permute the result of the arithmetic into the same order as a:

        ldr     q1, [x2]
        ldr     q0, [x1]
        ext     v1.16b, v1.16b, v1.16b, #8    // <------
        sshl    v0.4s, v0.4s, v1.4s
        ldr     q1, [x3]
        sub     v0.4s, v0.4s, v1.4s
        rev64   v0.4s, v0.4s                  // <------
        str     q0, [x0]
        ret

The following function:

int f2(int *__restrict a, int *__restrict b, int *__restrict c, int *__restrict d)
{
  a[0] = (b[3] << c[3]) - d[3];
  a[1] = (b[2] << c[2]) - d[2];
  a[2] = (b[1] << c[1]) - d[1];
  a[3] = (b[0] << c[0]) - d[0];
}

continues to push the reverse down to just before the store, like the previous code did.

In:

int f3(int *__restrict a, int *__restrict b, int *__restrict c, int *__restrict d)
{
  for (int i = 0; i < 100; ++i)
    {
      a[0] = (a[0] + c[3]);
      a[1] = (a[1] + c[2]);
      a[2] = (a[2] + c[1]);
      a[3] = (a[3] + c[0]);
      c += 4;
    }
}

the loads of a are hoisted and the stores of a are sunk, so that only the load from c happens in the loop. When optimising for speed, we prefer to have the loop operate on the reversed layout, changing on entry and exit from the loop:

        mov     x3, x0
        adrp    x0, .LC0
        add     x1, x2, 1600
        ldr     q2, [x0, #:lo12:.LC0]
        ldr     q0, [x3]
        mov     v1.16b, v0.16b
        tbl     v0.16b, {v0.16b - v1.16b}, v2.16b    // <--------
        .p2align 3,,7
.L6:
        ldr     q1, [x2], 16
        add     v0.4s, v0.4s, v1.4s
        cmp     x2, x1
        bne     .L6
        mov     v1.16b, v0.16b
        adrp    x0, .LC0
        ldr     q2, [x0, #:lo12:.LC0]
        tbl     v0.16b, {v0.16b - v1.16b}, v2.16b    // <--------
        str     q0, [x3]
        ret

Similarly, for the very artificial testcase:

int f4(int *__restrict a, int *__restrict b, int *__restrict c, int *__restrict d)
{
  int a0 = a[0];
  int a1 = a[1];
  int a2 = a[2];
  int a3 = a[3];
  for (int i = 0; i < 100; ++i)
    {
      a0 ^= c[0];
      a1 ^= c[1];
      a2 ^= c[2];
      a3 ^= c[3];
      c += 4;
      for (int j = 0; j < 100; ++j)
        {
          a0 += d[1];
          a1 += d[0];
          a2 += d[3];
          a3 += d[2];
          d += 4;
        }
      b[0] = a0;
      b[1] = a1;
      b[2] = a2;
      b[3] = a3;
      b += 4;
    }
  a[0] = a0;
  a[1] = a1;
  a[2] = a2;
  a[3] = a3;
}

the a vector in the inner loop maintains the order { 1, 0, 3, 2 }, even though it's part of an SCC that includes the outer loop. In other words, this is a motivating case for not assigning permutes at SCC granularity. The code we get is:

        ldr     q0, [x0]
        mov     x4, x1
        mov     x5, x0
        add     x1, x3, 1600
        add     x3, x4, 1600
        .p2align 3,,7
.L11:
        ldr     q1, [x2], 16
        sub     x0, x1, #1600
        eor     v0.16b, v1.16b, v0.16b
        rev64   v0.4s, v0.4s                  // <---
        .p2align 3,,7
.L10:
        ldr     q1, [x0], 16
        add     v0.4s, v0.4s, v1.4s
        cmp     x0, x1
        bne     .L10
        rev64   v0.4s, v0.4s                  // <---
        add     x1, x0, 1600
        str     q0, [x4], 16
        cmp     x3, x4
        bne     .L11
        str     q0, [x5]
        ret

bb-slp-layout-17.c is a collection of compile tests for problems I hit with earlier versions of the patch. The same problems might show up elsewhere, but it seemed worth having the test anyway.

In slp-11b.c we previously pushed the permutation of the in[i*4] group down from the load to just before the store. That didn't reduce the number or frequency of the permutations (or increase them either). But separating the permute from the load meant that we could no longer use load/store lanes. Whether load/store lanes are a good idea here is another question. If there were two sets of loads, and if we could use a single permutation instead of one per load, then avoiding load/store lanes should be a good thing even under the current abstract cost model. But I think under the current model we should try to avoid splitting up potential load/store lanes groups if there is no specific benefit to the split. Preferring load/store lanes is still a source of missed optimisations that we should fix one day...

gcc/
	* params.opt (-param=vect-max-layout-candidates=): New parameter.
	* doc/invoke.texi (vect-max-layout-candidates): Document it.
	* tree-vectorizer.h (auto_lane_permutation_t): New typedef.
	(auto_load_permutation_t): Likewise.
	* tree-vect-slp.cc (vect_slp_node_weight): New function.
	(slpg_layout_cost): New class.
	(slpg_vertex): Replace perm_in and perm_out with partition, out_degree, weight and out_weight.
	(slpg_partition_info, slpg_partition_layout_costs): New classes.
	(vect_optimize_slp_pass): Likewise, cannibalizing some part of the previous vect_optimize_slp.
	(vect_optimize_slp): Use it.

gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_vect_var_shift): Return true for aarch64.
	* gcc.dg/vect/bb-slp-layout-1.c: New test.
	* gcc.dg/vect/bb-slp-layout-2.c: New test.
	* gcc.dg/vect/bb-slp-layout-3.c: New test.
	* gcc.dg/vect/bb-slp-layout-4.c: New test.
	* gcc.dg/vect/bb-slp-layout-5.c: New test.
	* gcc.dg/vect/bb-slp-layout-6.c: New test.
	* gcc.dg/vect/bb-slp-layout-7.c: New test.
	* gcc.dg/vect/bb-slp-layout-8.c: New test.
	* gcc.dg/vect/bb-slp-layout-9.c: New test.
	* gcc.dg/vect/bb-slp-layout-10.c: New test.
	* gcc.dg/vect/bb-slp-layout-11.c: New test.
	* gcc.dg/vect/bb-slp-layout-13.c: New test.
	* gcc.dg/vect/bb-slp-layout-14.c: New test.
	* gcc.dg/vect/bb-slp-layout-15.c: New test.
	* gcc.dg/vect/bb-slp-layout-16.c: New test.
	* gcc.dg/vect/bb-slp-layout-17.c: New test.
	* gcc.dg/vect/slp-11b.c: XFAIL SLP test for load-lanes targets.
2022-08-30  Add base hash traits for vectors  (Richard Sandiford; 1 file changed, -0/+55)

This patch adds a class that provides basic hash/equal functions for vectors, based on corresponding traits for the element type.

gcc/
	* hash-traits.h (vec_hash_base): New class.
	(vec_free_hash_base): Likewise.
2022-08-30  Rearrange unbounded_hashmap_traits  (Richard Sandiford; 2 files changed, -51/+65)

int_hash combines two kinds of operation:

(1) hashing and equality of integers
(2) using spare integer encodings to represent empty and deleted slots

(1) is really independent of (2), and could be useful in cases where no spare integer encodings are available. This patch adds a base class (int_hash_base) for (1) and makes int_hash inherit from it.

If we follow a similar style for future hashes, we can make unbounded_hashmap_traits take the "base" hash for the key as a template parameter, rather than requiring every type of key to have a separate derivative of unbounded_hashmap_traits. A later patch applies this to vector keys.

No functional change intended.

gcc/
	* hash-traits.h (int_hash_base): New struct, split out from...
	(int_hash): ...this class, which now inherits from int_hash_base.
	* hash-map-traits.h (unbounded_hashmap_traits): Take a template parameter for the key that provides hash and equality functions.
	(unbounded_int_hashmap_traits): Turn into a type alias of unbounded_hashmap_traits.
2022-08-30  Make graphds_scc pass the node order back to callers  (Richard Sandiford; 2 files changed, -4/+12)

As a side-effect, graphds_scc constructs a vector in which all nodes in an SCC are listed consecutively. This can be useful information, so this patch adds an optional pass-back parameter for it. The interface is similar to the one for graphds_dfs.

gcc/
	* graphds.cc (graphds_scc): Add a pass-back parameter for the final node order.
	* graphds.h (graphds_scc): Update prototype accordingly.
2022-08-30  Split code out of vect_transform_slp_perm_load  (Richard Sandiford; 1 file changed, -17/+37)

Similarly to the previous vectorizable_slp_permutation patch, this one splits out the main part of vect_transform_slp_perm_load so that a later patch can test a permutation without constructing a node for it. Also fixes a lingering use of STMT_VINFO_VECTYPE.

gcc/
	* tree-vect-slp.cc (vect_transform_slp_perm_load_1): Split out from...
	(vect_transform_slp_perm_load): ...here. Use SLP_TREE_VECTYPE instead of STMT_VINFO_VECTYPE.
2022-08-30  Split code out of vectorizable_slp_permutation  (Richard Sandiford; 1 file changed, -32/+66)

A later patch needs to test whether the target supports a lane_permutation_t without having to construct a full SLP node to test that. This patch splits out most of the work of vectorizable_slp_permutation into a subroutine, so that properties of the permutation can be passed explicitly without disturbing the main interface.

The new subroutine still uses an slp_tree argument to get things like the number of lanes and the vector type. That's a bit clunky, but it seemed like the least worst option.

gcc/
	* tree-vect-slp.cc (vectorizable_slp_permutation_1): Split out from...
	(vectorizable_slp_permutation): ...here.
2022-08-30  vect: Tighten get_related_vectype_for_scalar_type  (Richard Sandiford; 3 files changed, -3/+28)

Builds of glibc with SVE enabled have been failing since V1DI was added to the aarch64 port. The problem is that BB SLP starts the (hopeless) attempt to use variable-length modes to vectorise a single-element vector, and that now gets further than it did before.

Initially we tried getting a vector mode with 1 + 1X DI elements (i.e. 1 DI per 128-bit vector chunk). We don't provide such a mode -- it would be VNx1DI -- because it isn't a native SVE format. We then try just 1 DI, which previously failed but now succeeds.

There are numerous ways we could fix this. Perhaps the most obvious would be to skip variable-length modes for BB SLP. However, I think that'd just be kicking the can down the road, since eventually we want to support BB SLP and VLA vectors using predication.

However, if we do use VLA vectors for BB SLP, the vector modes we use should actually be variable length. We don't want to use variable-length vectors for some element types/group sizes and fixed-length vectors for others, since it would be difficult to handle the seams. The same principle applies during loop vectorisation. We can't use a mixture of variable-length and fixed-length vectors for the same loop because the relative unroll/vectorisation factors would not be constant (compile-time) multiples of each other.

This patch therefore makes get_related_vectype_for_scalar_type check that the provided number of units is interoperable with the provided prevailing mode. The function is generally quite forgiving -- it does basic things like checking for scalarness itself rather than expecting callers to do them -- so the new check feels in keeping with that.

This seems to subsume the fix for PR96974. I'm not sure it's worth reverting that code to an assert though, so the patch just drops the scan for the associated message.

gcc/
	* tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Check that the requested number of units is interoperable with the requested prevailing mode.

gcc/testsuite/
	* gcc.target/aarch64/sve/slp_15.c: New test.
	* g++.target/aarch64/sve/pr96974.C: Remove scan test.
2022-08-30  Change get_std_name_hint to use generated hash table  (Ulrich Drepper; 4 files changed, -224/+978)

The get_std_name_hint function so far uses linear search to locate matching entries. After adding more hint entries this might not be appropriate anymore. Therefore this patch also replaces the linear array with a gperf-generated hash table.

contrib/ChangeLog
	* gcc_update (files_and_dependencies): Add rule for gcc/cp/std-name-hint.h.

gcc/cp/ChangeLog
	* Make-lang.in: Add rule to rebuild std-name-hint.h from std-name-hint.gperf.
	* name-lookup.cc (get_std_name_hint): Remove hints array. Use gperf-generated class std_name_hint_lookup. Include "std-name-hint.h".
	* std-name-hint.gperf: New file.
	* std-name-hint.h: New file. Generated from the .gperf file.
2022-08-30  m32c-rtems: remove obsoleted port  (Martin Liska; 2 files changed, -44/+0)

contrib/ChangeLog:
	* config-list.mk: Remove the port.

gcc/ChangeLog:
	* config.gcc: Remove the port.
	* config/m32c/rtems.h: Removed.

libgcc/ChangeLog:
	* config.host: Remove the port.
2022-08-30  tree-optimization/73550 - apply MAX_NUM_CHAINS consistently  (Richard Biener; 1 file changed, -7/+0)

MAX_NUM_CHAINS is applied once with <= and once with <, which results in the chains not being limited but the analysis being dropped completely. That's one issue in the PR.

	PR tree-optimization/73550
	* gimple-predicate-analysis.cc (predicate::init_from_control_deps): Do not apply MAX_NUM_CHAINS again.
2022-08-30  Improve uninit pass dumping  (Richard Biener; 1 file changed, -30/+4)

This produces less redundancy and more complete info when dumping the control dependence chains.

	* gimple-predicate-analysis.cc (format_edge_vec): Dump both source and destination.
	(dump_dep_chains): Remove.
	(uninit_analysis::init_use_preds): Remove redundant dumping of chains.
2022-08-30  c++: __has_builtin gives the wrong answer [PR106759]  (Marek Polacek; 2 files changed, -0/+135)

We've supported __is_nothrow_constructible since r11-4386, but names_builtin_p didn't know about it, so it gave the wrong answer for

  #if __has_builtin(__is_nothrow_constructible)
  ...
  #endif

I've tested all C++-only built-ins and only two were missing.

	PR c++/106759

gcc/cp/ChangeLog:
	* cp-objcp-common.cc (names_builtin_p): Handle RID_IS_NOTHROW_ASSIGNABLE and RID_IS_NOTHROW_CONSTRUCTIBLE.

gcc/testsuite/ChangeLog:
	* g++.dg/ext/has-builtin-1.C: New test.
2022-08-30  Force a [NAN, NAN] range when the definite NAN property is set.  (Aldy Hernandez; 3 files changed, -34/+51)

Setting the definite NAN property should also force a [NAN, NAN] range, otherwise we'd have two ways of representing a NAN: with the endpoints or with the property. In the ranger world we avoid at all costs having more than one representation for a range.

In doing this, I removed the FRANGE_PROP_ACCESSOR macro, since it looks like setting a property may have repercussions in the range itself, so it's best for the client to define its own setter.

gcc/ChangeLog:
	* value-range-storage.cc (frange_storage_slot::get_frange): Use frange_nan.
	* value-range.cc (frange::set_nan): New.
	(frange_nan): Move to header file.
	(range_tests_nan): Adjust frange_nan callers to pass type. New test.
	* value-range.h (FRANGE_PROP_ACCESSOR): Remove.
	(frange_nan): New.
2022-08-30  tree-optimization/67196 - normalize use predicates earlier  (Richard Biener; 2 files changed, -4/+4)

The following makes sure to have use predicates simplified and normalized before doing uninit_analysis::overlap, because that otherwise cannot pick up all flag-setting cases. This fixes half of the issue in PR67196 and conveniently resolves the XFAIL in gcc.dg/uninit-pred-7_a.c.

	PR tree-optimization/67196
	* gimple-predicate-analysis.cc (uninit_analysis::is_use_guarded): Simplify and normalize use predicates before first use.
	* gcc.dg/uninit-pred-7_a.c: Un-XFAIL.
2022-08-30  Remove GENERIC expr building from predicate analysis, improve dumps  (Richard Biener; 1 file changed, -67/+10)

The following removes duplicate dumping and makes the predicate dumping more readable. That makes the GENERIC predicate build routines unused, which is also nice.

	* gimple-predicate-analysis.cc (dump_pred_chain): Fix parenthesizing and AND prepending.
	(predicate::dump): Do not dump the GENERIC expanded predicate; properly parenthesize and prepend ORs to the piecewise predicate dump.
	(build_pred_expr): Remove.
2022-08-30  Implement relational operators for frange with endpoints.  (Aldy Hernandez; 2 files changed, -54/+309)

This is the implementation of the relational range operators for frange. These are the core operations that require specific FP domain knowledge.

gcc/ChangeLog:
	* range-op-float.cc (finite_operand_p): New.
	(build_le): New.
	(build_lt): New.
	(build_ge): New.
	(build_gt): New.
	(foperator_equal::fold_range): New implementation with endpoints.
	(foperator_equal::op1_range): Same.
	(foperator_not_equal::fold_range): Same.
	(foperator_not_equal::op1_range): Same.
	(foperator_lt::fold_range): Same.
	(foperator_lt::op1_range): Same.
	(foperator_lt::op2_range): Same.
	(foperator_le::fold_range): Same.
	(foperator_le::op1_range): Same.
	(foperator_le::op2_range): Same.
	(foperator_gt::fold_range): Same.
	(foperator_gt::op1_range): Same.
	(foperator_gt::op2_range): Same.
	(foperator_ge::fold_range): Same.
	(foperator_ge::op1_range): Same.
	(foperator_ge::op2_range): Same.

gcc/testsuite/ChangeLog:
	* gcc.dg/tree-ssa/recip-3.c: Avoid premature optimization so test has a chance to succeed.
2022-08-30  Add support for floating point endpoints to frange.  (Aldy Hernandez; 6 files changed, -79/+585)

The current implementation of frange is just a type with some bits to represent NAN and INF. We can do better and represent endpoints to ultimately solve longstanding PRs such as PR24021. This patch adds these endpoints. In follow-up patches I will add support for a bare-bones PLUS_EXPR range-op-float entry to solve the PR.

I have chosen to use REAL_VALUE_TYPEs for the endpoints, since that's what we use underneath the trees. This will be somewhat analogous to our eventual use of wide-ints in the irange. No sense going through added levels of indirection if we can avoid it. That, plus real.* already has a nice API for dealing with floats.

With this patch, ranges will be closed floating-point intervals, which makes the implementation simpler, since we don't have to keep track of open/closed intervals. This is conservative enough for use in the ranger world, as we'd rather err on the side of more elements in a range than fewer. For example, even though we cannot precisely represent the open interval (3.0, 5.0) with this approach, it is perfectly reasonable to represent it as [3.0, 5.0], since the closed interval is a superset of the open one. In the VRP/ranger world, it is always better to err on the side of more information in a range than less. After all, when we don't know anything about a range, we just use VARYING, which is a fancy term for a range spanning the entire domain.

Since REAL_VALUE_TYPEs have properly defined infinity and NAN semantics, all the math can be made to work:

  [-INF, 3.0] !NAN  => Numbers <= 3.0 (NAN cannot happen)
  [3.0, 3.0]        => 3.0 or NAN.
  [3.0, +INF]       => Numbers >= 3.0 (NAN is possible)
  [-INF, +INF]      => VARYING (NAN is possible)
  [-INF, +INF] !NAN => Entire domain. NAN cannot happen.

Also, since REAL_VALUE_TYPEs can represent the minimum and maximum representable values of a TYPE_MODE, we can disambiguate between them and negative and positive infinity (see get_max_float in real.cc). This also makes the math all work. For example, suppose we know nothing about x and y (VARYING). On the TRUE side of x > y, we can deduce that:

  (a) x cannot be NAN
  (b) y cannot be NAN
  (c) y cannot be +INF.

(c) means that we can drop the upper bound of "y" from +INF to the maximum representable value for its type.

Having endpoints with different representation for infinity and the maximum representable values means we can drop the +-INF properties we currently have in the frange.

gcc/ChangeLog:
	* range-op-float.cc (frange_set_nan): New.
	(frange_drop_inf): New.
	(frange_drop_ninf): New.
	(foperator_equal::op1_range): Adjust for endpoints.
	(foperator_lt::op1_range): Same.
	(foperator_lt::op2_range): Same.
	(foperator_gt::op1_range): Same.
	(foperator_gt::op2_range): Same.
	(foperator_unordered::op1_range): Same.
	* value-query.cc (range_query::get_tree_range): Same.
	* value-range-pretty-print.cc (vrange_printer::visit): Same.
	* value-range-storage.cc (frange_storage_slot::get_frange): Same.
	* value-range.cc (frange::set): Same.
	(frange::normalize_kind): Same.
	(frange::union_): Same.
	(frange::intersect): Same.
	(frange::operator=): Same.
	(early_nan_resolve): New.
	(frange::contains_p): New.
	(frange::singleton_p): New.
	(frange::set_nonzero): New.
	(frange::nonzero_p): New.
	(frange::set_zero): New.
	(frange::zero_p): New.
	(frange::set_nonnegative): New.
	(frange_float): New.
	(frange_nan): New.
	(range_tests_nan): New.
	(range_tests_signed_zeros): New.
	(range_tests_floats): New.
	(range_tests): New.
	* value-range.h (frange::lower_bound): New.
	(frange::upper_bound): New.
	(vrp_val_min): Use real_inf with a sign instead of negating inf.
	(frange::frange): New.
	(frange::set_varying): Adjust for endpoints.
	(real_max_representable): New.
	(real_min_representable): New.
2022-08-30  A == 0 ? A : -A same as -A (when A is 0.0)  (Aldy Hernandez; 1 file changed, -1/+1)

The upcoming work for frange triggers a regression in gcc.dg/tree-ssa/phi-opt-24.c.

For -O2 -fno-signed-zeros, we fail to transform the following into -A:

float f0(float A)
{
  // A == 0? A : -A    same as -A
  if (A == 0)
    return A;
  return -A;
}

This is because the abs/negative match.pd pattern here:

/* abs/negative simplifications moved from fold_cond_expr_with_comparison,
   Need to handle (A - B) case as fold_cond_expr_with_comparison does.
   Need to handle UN* comparisons.
   ...
   ...

sees IL that has the 0.0 propagated. Instead of:

  <bb 2> [local count: 1073741824]:
  if (A_2(D) == 0.0)
    goto <bb 4>; [34.00%]
  else
    goto <bb 3>; [66.00%]

  <bb 3> [local count: 708669601]:
  _3 = -A_2(D);

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <A_2(D)(2), _3(3)>

it now sees:

  <bb 4> [local count: 1073741824]:
  # _1 = PHI <0.0(2), _3(3)>

which it leaves untouched, causing the if conditional to survive.

Changing integer_zerop to zerop fixes the problem.

I did not include a testcase, as it's just phi-opt-24.c, which will get triggered when I commit the frange with endpoints work.

gcc/ChangeLog:
	* match.pd ((cmp @0 zerop) real_zerop (negate@1 @0)): Add variant for real zero.
2022-08-30  s390: fix build on 32-bit hosts  (Martin Liska; 1 file changed, -1/+1)

Fixes build on i686:

gcc/config/s390/s390.cc: In function 'bool s390_rtx_costs(rtx, machine_mode, int, int, int*, bool)':
gcc/config/s390/s390.cc:3728:63: error: cannot convert 'long int*' to 'long long int*'

gcc/ChangeLog:
	* config/s390/s390.cc (s390_rtx_costs): Use proper type as argument.
2022-08-30  Use reachability analysis to improve uninit diagnostic  (Richard Biener; 1 file changed, -9/+38)

This patch does what the comment in the uninit diagnostic code suggests: when the value-numbering run done without optimizing figures out that there's a fallthru path, consider blocks on it as always executed.

	* tree-ssa-uninit.cc (warn_uninitialized_vars): Pre-compute the set of fallthru reachable blocks from function entry and use that to determine wlims.always_executed.
2022-08-30  tree-optimization/63660 - testcase for fixed PR  (Richard Biener; 1 file changed, -0/+58)

This adds a testcase for the PR, which was fixed with r13-2155-gbaa3ffb19c54fa.

	PR tree-optimization/63660
	* gcc.dg/uninit-pr63660.c: New testcase.
2022-08-30  tree-optimization/56654 - sort uninit candidates after RPO  (Richard Biener; 1 file changed, -31/+61)

The following sorts the immediate uses of a possibly uninitialized SSA variable after their RPO order, so we prefer warning for an earlier occurring use rather than issuing the diagnostic for the first uninitialized immediate use. The sorting will inevitably be imperfect, but it also allows us to optimize the expensive predicate check for the case where there are multiple uses in the same basic block, which is a nice side effect.

	PR tree-optimization/56654
	* tree-ssa-uninit.cc (cand_cmp): New.
	(find_uninit_use): First process all PHIs and collect candidate stmts, then sort those after RPO.
	(warn_uninitialized_phi): Pass on bb_to_rpo.
	(execute_late_warn_uninitialized): Compute and pass on reverse lookup of RPO number from basic block index.
2022-08-30  Make uninit PHI processing more consistent  (Richard Biener; 5 files changed, -121/+176)

Currently the main working of the maybe-uninit pass is to scan over all PHIs with possibly undefined arguments, diagnosing whether there's a direct not guarded use. Not guarded uses in PHIs are queued for later processing and, to make the uninit analysis PHI def handling work, the PHI def is marked as possibly uninitialized. But this happens only for those PHI uses that happen to be seen before a direct not guarded use, and whether all arguments of a PHI node which are defined by a PHI are properly marked as maybe uninitialized depends on the processing order.

The following changes the uninit pass to perform an RPO walk over the function, ensuring that PHI argument defs are visited before the PHI node (besides backedge uses, which we ignore already), getting rid of the worklist. It also makes sure to process all PHI uses, but recording those that are properly guarded so they are not treated as maybe undefined when processing the PHI use later.

Overall this should make behavior more consistent, avoid some false negatives because of the previous early out and order issue, and avoid some false positives because of the missed recording of guarded PHI uses.

The patch correctly diagnoses an uninitialized use of 'regnum' in store_bit_field_1 and also diagnoses an uninitialized use of best_match::m_best_candidate_len in c-decl.cc, which I've chosen to silence by initializing m_best_candidate_len. The warning is a false positive, but GCC cannot see that m_best_candidate_len is initialized when m_best_candidate is not NULL, so from this perspective this was a false negative.

I've added g++.dg/uninit-pred-5.C with a reduced testcase that nicely shows how the previous behavior missed the diagnostic because the worklist ended up visiting the PHI with the dependent uninit value before visiting the PHIs producing it.

	* gimple-predicate-analysis.h (uninit_analysis::operator()): Remove.
	* gimple-predicate-analysis.cc (uninit_analysis::collect_phi_def_edges): Use phi_arg_set, simplify a bit.
	* tree-ssa-uninit.cc (defined_args): New global.
	(compute_uninit_opnds_pos): Mask with the recorded set of guarded maybe-uninitialized uses.
	(uninit_undef_val_t::operator()): Remove.
	(find_uninit_use): Process all PHI uses, recording the guarded ones and marking the PHI result as uninitialized consistently.
	(warn_uninitialized_phi): Adjust.
	(execute_late_warn_uninitialized): Get rid of the PHI worklist and instead walk the function in RPO order.
	* spellcheck.h (best_match::m_best_candidate_len): Initialize.
	* g++.dg/uninit-pred-5.C: New testcase.
2022-08-30  middle-end: fix min/max phiopts reduction [PR106744]  (Tamar Christina; 15 files changed, -17/+189)

This corrects minmax_replacement to use the arguments in the order in which they occur in the comparisons in gimple.

gcc/ChangeLog:
	PR tree-optimization/106744
	* tree-ssa-phiopt.cc (minmax_replacement): Correct arguments.

gcc/testsuite/ChangeLog:
	PR tree-optimization/106744
	* gcc.dg/tree-ssa/minmax-10.c: Make runtime test.
	* gcc.dg/tree-ssa/minmax-11.c: Likewise.
	* gcc.dg/tree-ssa/minmax-12.c: Likewise.
	* gcc.dg/tree-ssa/minmax-13.c: Likewise.
	* gcc.dg/tree-ssa/minmax-14.c: Likewise.
	* gcc.dg/tree-ssa/minmax-15.c: Likewise.
	* gcc.dg/tree-ssa/minmax-16.c: Likewise.
	* gcc.dg/tree-ssa/minmax-3.c: Likewise.
	* gcc.dg/tree-ssa/minmax-4.c: Likewise.
	* gcc.dg/tree-ssa/minmax-5.c: Likewise.
	* gcc.dg/tree-ssa/minmax-6.c: Likewise.
	* gcc.dg/tree-ssa/minmax-7.c: Likewise.
	* gcc.dg/tree-ssa/minmax-8.c: Likewise.
	* gcc.dg/tree-ssa/minmax-9.c: Likewise.
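
For context, a hedged sketch of the kind of conditional that minmax_replacement rewrites into a MIN_EXPR/MAX_EXPR (illustrative, not one of the adjusted testcases):

int
min_of (int a, int b)
{
  int r;
  if (a < b)     /* the arguments must be used in the order they
                    appear in this comparison */
    r = a;
  else
    r = b;
  return r;      /* phiopt turns the PHI into r = MIN_EXPR <a, b> */
}
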
2022-08-30  middle-end: initialize regnum in store_bit_field_1  (Tamar Christina; 1 file changed, -1/+1)

This initializes regnum to 0 for the undefined_p case; 0 is the right default, as the code is supposed to get the lowpart when the value is undefined.

gcc/ChangeLog:
	* expmed.cc (store_bit_field_1): Initialize regnum to 0.