Age | Commit message | Author | Files, Lines
2019-11-18 | Refactor tree-loop-distribution.c for thread safety | Giuliano Belinassi | 4 files, -270/+441
This patch refactors tree-loop-distribution.c for thread safety without using the C11 __thread feature. All global variables were moved to `class loop_distribution`, which is initialized at ::execute time. From-SVN: r278421
2019-11-18 | re PR ipa/92508 (ICE in do_estimate_edge_time, at ipa-inline-analysis.c:223 since r278159) | Jan Hubicka | 3 files, -3/+9
PR ipa/92508 * ipa-inline.c (inline_small_functions): Add new edges after resetting caches. * ipa-inline-analysis.c (do_estimate_edge_time): Fix sanity check. From-SVN: r278419
2019-11-18 | Add more C2x attributes tests. | Joseph Myers | 7 files, -0/+50
This patch adds more tests of C2x attributes, where I found cases that were handled correctly by my patches but missing from the original tests. Tests are added for -std=c11 -pedantic handling of C2x attribute syntax and corresponding -Wc11-c2x-compat handling; for struct [[deprecated]]; and for the [[__fallthrough__]] spelling of [[fallthrough]] in the case of valid fallthrough attributes. Tested for x86_64-pc-linux-gnu. * gcc.dg/c11-attr-syntax-1.c, gcc.dg/c11-attr-syntax-2.c, gcc.dg/c11-attr-syntax-3.c, gcc.dg/c2x-attr-syntax-4.c: New tests. * gcc.dg/c2x-attr-deprecated-1.c: Also test struct [[deprecated]]. * gcc.dg/c2x-attr-fallthrough-1.c: Also test [[__fallthrough__]]. From-SVN: r278418
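For reference, a minimal sketch of the constructs the new tests exercise (illustrative only, compiled with -std=c2x; these are not the actual test files):

```
/* struct [[deprecated]] and the [[__fallthrough__]] spelling of
   [[fallthrough]], as covered by the new c2x-attr-* tests.  */
struct [[deprecated]] s { int x; };

int
f (int n)
{
  switch (n)
    {
    case 1:
      n++;
      [[__fallthrough__]];   /* alternate spelling of [[fallthrough]] */
    case 2:
      return n;
    default:
      return 0;
    }
}
```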
2019-11-18 | PR c++/91962 - ICE with reference binding and qualification conversion. | Marek Polacek | 4 files, -1/+25
When fixing c++/91889 (r276251) I was assuming that we couldn't have a ck_qual under a ck_ref_bind; I was introducing that case in the patch, and this hunk in convert_like_real was supposed to handle it:

+	  if (next_conversion (convs)->kind == ck_qual)
+	    {
+	      gcc_assert (same_type_p (TREE_TYPE (expr),
+				       next_conversion (convs)->type));
+	      /* Strip the cast created by the ck_qual; cp_build_addr_expr
+		 below expects an lvalue.  */
+	      STRIP_NOPS (expr);
+	    }

But that assumption was wrong, as this test shows; here we have "(int *)f" where f is of type long int, and we're converting it to "const int *const &", so we have both ck_ref_bind and ck_qual. That means that the new STRIP_NOPS strips an expression it shouldn't have, and that then breaks when creating a TARGET_EXPR. So we want to limit the stripping to the new case only. This I do by checking need_temporary_p, which will be 0 in the new case. Yes, we can set need_temporary_p when binding a reference directly, but then we won't have a qualification conversion. It is possible to have a bit-field, convert it to a pointer, and then convert that pointer to a more-qualified pointer, but in that case we're not dealing with an lvalue, so gl_kind is 0, so we won't enter this block in reference_binding:

1747   if ((related_p || compatible_p) && gl_kind)

* call.c (convert_like_real) <case ck_ref_bind>: Check need_temporary_p. * g++.dg/cpp0x/ref-bind7.C: New test. From-SVN: r278416
2019-11-18 | Add testcase for already fixed PR ipa/92528 | Martin Jambor | 2 files, -0/+69
2019-11-18 Martin Jambor <mjambor@suse.cz> PR ipa/92528 * g++.dg/ipa/pr92528.C: New test. From-SVN: r278415
2019-11-18 | Add optabs for accelerating RAW and WAR alias checks | Richard Sandiford | 20 files, -4/+372
This patch adds optabs that check whether a read followed by a write or a write followed by a read can be divided into interleaved byte accesses without changing the dependencies between the bytes. This is one of the uses of the SVE2 WHILERW and WHILEWR instructions. (The instructions can also be used to limit the VF at runtime, but that's future work.) 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * doc/sourcebuild.texi (vect_check_ptrs): Document. * optabs.def (check_raw_ptrs_optab, check_war_ptrs_optab): New optabs. * doc/md.texi: Document them. * internal-fn.def (IFN_CHECK_RAW_PTRS, IFN_CHECK_WAR_PTRS): New internal functions. * internal-fn.h (internal_check_ptrs_fn_supported_p): Declare. * internal-fn.c (check_ptrs_direct): New macro. (expand_check_ptrs_optab_fn): Likewise. (direct_check_ptrs_optab_supported_p): Likewise. (internal_check_ptrs_fn_supported_p): New function. * tree-data-ref.c: Include internal-fn.h. (create_ifn_alias_checks): New function. (create_intersect_range_checks): Use it. * config/aarch64/iterators.md (SVE2_WHILE_PTR): New int iterator. (optab, cmp_op): Handle it. (raw_war, unspec): New int attributes. * config/aarch64/aarch64.md (UNSPEC_WHILERW, UNSPEC_WHILE_WR): New constants. * config/aarch64/predicates.md (aarch64_bytes_per_sve_vector_operand): New predicate. * config/aarch64/aarch64-sve2.md (check_<raw_war>_ptrs<mode>): New expander. (@aarch64_sve2_while<cmp_op><GPI:mode><PRED_ALL:mode>_ptest): New pattern. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_check_ptrs): New procedure. * gcc.dg/vect/vect-alias-check-14.c: Expect IFN_CHECK_WAR to be used, if available. * gcc.dg/vect/vect-alias-check-15.c: Likewise. * gcc.dg/vect/vect-alias-check-16.c: Likewise IFN_CHECK_RAW. * gcc.target/aarch64/sve2/whilerw_1.c: New test. * gcc.target/aarch64/sve2/whilewr_1.c: Likewise. * gcc.target/aarch64/sve2/whilewr_2.c: Likewise. From-SVN: r278414
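As a sketch, this is the general shape of loop whose runtime alias check the new optabs can accelerate (the function name and the claim that this exact loop vectorizes are assumptions, not taken from the patch):

```
/* Vectorizing this needs a runtime check that x[] and y[] do not
   overlap in a way that changes the byte-level dependencies; on SVE2
   that check can become a single WHILERW/WHILEWR instruction.  */
void
f (int *x, int *y, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = y[i] + 1;
}
```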
2019-11-18 | Add an empty constructor shortcut to build_vector_from_ctor | Richard Sandiford | 2 files, -0/+8
Empty vector constructors are equivalent to zero vectors. If we handle that case directly, we can support it for variable-length vectors and can hopefully make things more efficient for fixed-length vectors. This is needed by a later C++ patch. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree.c (build_vector_from_ctor): Directly return a zero vector for empty constructors. From-SVN: r278413
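A hedged source-level illustration of the equivalence, using the GNU vector extension (empty initializer braces are a GNU extension / C2x):

```
typedef int v4si __attribute__ ((vector_size (16)));

v4si
zero (void)
{
  v4si x = {};   /* an empty constructor is equivalent to { 0, 0, 0, 0 } */
  return x;
}
```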
2019-11-18 | Two RTL CC tweaks for SVE pmore/plast conditions | Richard Sandiford | 5 files, -21/+140
SVE has two composite conditions:

  pmore == at least one bit set && last bit clear
  plast == no bits set || last bit set

So in general we generate them from:

  A: CC = test bits
  B: reg1 = first condition
  C: CC = test bits
  D: reg2 = second condition
  E: result = (reg1 op reg2)

where op is || or &&. To fold all this into a single test, we need to be able to remove the redundant C (the cse.c patch) and then fold B, D and E down to a single condition (the simplify-rtx.c patch). The underlying conditions are unsigned, so the simplify-rtx.c part needs to support both unsigned comparisons and AND. However, to avoid opening the can of worms that is ANDing FP comparisons for unordered inputs, I've restricted the new AND handling to cases in which NaNs can be ignored. I think this is still a strict extension of what we have now; it just doesn't go as far as it could. Going further would need an entirely different set of testcases, so I think it would make more sense as separate work. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * cse.c (cse_insn): Delete no-op register moves too. * simplify-rtx.c (comparison_to_mask): Handle unsigned comparisons. Take a second comparison to control the value for NE. (mask_to_comparison): Handle unsigned comparisons. (simplify_logical_relational_operation): Likewise. Update call to comparison_to_mask. Handle AND if !HONOR_NANS. (simplify_binary_operation_1): Call the above for AND too. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/ptest_pmore.c: New test. From-SVN: r278411
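In ACLE terms, a sketch of the kind of condition pair that can now fold to a single "pmore" test (assuming the svptest_* intrinsics; this is not the new testcase itself):

```
#include <arm_sve.h>

/* pmore: at least one element active && last element inactive.
   Each svptest_* call is a PTEST; the patch lets CSE remove the
   duplicate PTEST and simplify-rtx fold the && into one condition.  */
int
any_but_not_last (svbool_t pg, svbool_t p)
{
  return svptest_any (pg, p) && !svptest_last (pg, p);
}
```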
2019-11-18 | Handle VIEW_CONVERT_EXPR for variable-length vectors | Richard Sandiford | 4 files, -9/+214
This patch handles VIEW_CONVERT_EXPRs of variable-length VECTOR_CSTs by adding tree-level versions of native_decode_vector_rtx and simplify_const_vector_subreg. It uses the same code for fixed-length vectors, both to get more coverage and because operating directly on the compressed encoding should be more efficient for longer vectors with a regular pattern. The structure and comments are very similar between the tree and rtx routines. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * fold-const.c (native_encode_vector): Turn into a wrapper function, splitting the main code out into... (native_encode_vector_part): ...this new function. (native_decode_vector_tree): New function. (fold_view_convert_vector_encoding): Likewise. (fold_view_convert_expr): Use it for converting VECTOR_CSTs to VECTOR_TYPEs. gcc/testsuite/ * gcc.target/aarch64/sve/acle/general/temporaries_1.c: New test. From-SVN: r278410
2019-11-18 | Optimise WAR and WAW alias checks | Richard Sandiford | 12 files, -41/+260
For:

void f1 (int *x, int *y)
{
  for (int i = 0; i < 32; ++i)
    x[i] += y[i];
}

we checked at runtime whether one vector at x would overlap one vector at y. But in cases like this, the vector code would handle x <= y just fine, since any write to address A still happens after any read from address A. The only problem is if x is ahead of y by less than a vector. The same is true for two writes:

void f2 (int *x, int *y)
{
  for (int i = 0; i < 32; ++i)
    {
      x[i] = i;
      y[i] = 2;
    }
}

If y <= x then a vector write at y after a vector write at x would have the same net effect as the original scalar writes. This patch optimises the alias checks for these two cases. E.g., before the patch, f1 used:

        add     x2, x0, 15
        sub     x2, x2, x1
        cmp     x2, 30
        bls     .L2

whereas after the patch it uses:

        add     x2, x1, 4
        sub     x2, x0, x2
        cmp     x2, 8
        bls     .L2

Read-after-write cases like:

int f3 (int *x, int *y)
{
  int res = 0;
  for (int i = 0; i < 32; ++i)
    {
      x[i] = i;
      res += y[i];
    }
  return res;
}

can cope with x == y, but otherwise don't allow overlap in either direction. Since checking for x == y at runtime would require extra code, we're probably better off sticking with the current overlap test. An overlap test is also needed if the scalar or vector accesses covered by the alias check are mixed together, rather than all statements for the second access following all statements for the first access. The new code for gcc.target/aarch64/sve/var_stride_[135].c is slightly better than before. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.c (create_intersect_range_checks_index): If the alias pair describes simple WAW and WAR dependencies, just check whether the first B access overlaps later A accesses. (create_waw_or_war_checks): New function that performs the same optimization on addresses. (create_intersect_range_checks): Call it. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-8.c: Expect WAR/WAW checks to be used. * gcc.dg/vect/vect-alias-check-14.c: Likewise. * gcc.dg/vect/vect-alias-check-15.c: Likewise. * gcc.dg/vect/vect-alias-check-18.c: Likewise. * gcc.dg/vect/vect-alias-check-19.c: Likewise. * gcc.target/aarch64/sve/var_stride_1.c: Update expected sequence. * gcc.target/aarch64/sve/var_stride_2.c: Likewise. * gcc.target/aarch64/sve/var_stride_3.c: Likewise. * gcc.target/aarch64/sve/var_stride_5.c: Likewise. From-SVN: r278409
2019-11-18 | LRA: handle memory constraints that accept more than "m" | Richard Sandiford | 12 files, -15/+48
LRA allows address constraints that are more relaxed than "p":

  /* Target hooks sometimes don't treat extra-constraint addresses as
     legitimate address_operands, so handle them specially.  */
  if (insn_extra_address_constraint (cn)
      && satisfies_address_constraint_p (&ad, cn))
    return change_p;

For SVE it's useful to allow the same thing for memory constraints. The particular use case is LD1RQ, which is an SVE instruction that addresses Advanced SIMD vector modes and that accepts some addresses that normal Advanced SIMD moves don't. Normally we require every memory to satisfy at least "m", which is defined to be a memory "with any kind of address that the machine supports in general". However, LD1RQ is very much special-purpose: it doesn't really have any relation to normal operations on these modes. Adding its addressing modes to "m" would lead to bad Advanced SIMD optimisation decisions in passes like ivopts. LD1RQ therefore has a memory constraint that accepts things "m" doesn't. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * lra-constraints.c (valid_address_p): Take the operand and a constraint as argument. If the operand is a MEM and the constraint is a memory constraint, check whether the eliminated form of the MEM already satisfies the constraint. (process_address_1): Update calls accordingly. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/ld1rq_f16.c: Remove XFAIL. * gcc.target/aarch64/sve/acle/asm/ld1rq_f32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1rq_f64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1rq_s16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1rq_s32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1rq_s64.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1rq_u16.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1rq_u32.c: Likewise. * gcc.target/aarch64/sve/acle/asm/ld1rq_u64.c: Likewise. From-SVN: r278408
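For context, LD1RQ is what the ACLE svld1rq intrinsics generate; a minimal example (illustrative, not one of the testsuite files):

```
#include <arm_sve.h>

/* Load one 128-bit quadword and replicate it across the whole SVE
   vector.  Its addressing modes differ from ordinary loads, hence
   the special memory constraint.  */
svfloat32_t
dup_quad (svbool_t pg, const float *ptr)
{
  return svld1rq_f32 (pg, ptr);
}
```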
2019-11-18 | Remove vestiges of MODIFY_JNI_METHOD_CALL | Tom Tromey | 4 files, -40/+6
I happened to notice that MODIFY_JNI_METHOD_CALL was defined in cygming.h and documented in tm.texi. However, because it was only needed for gcj, it is obsolete. This patch removes the vestiges. Tested by grep, and rebuilding the documentation. gcc/ChangeLog 2019-11-18 Tom Tromey <tromey@adacore.com> * doc/tm.texi: Rebuild. * doc/tm.texi.in (Misc): Don't document MODIFY_JNI_METHOD_CALL. * config/i386/cygming.h (MODIFY_JNI_METHOD_CALL): Don't define. From-SVN: r278407
2019-11-18 | re PR tree-optimization/92516 (ICE in vect_schedule_slp_instance, at tree-vect-slp.c:4095 since r278246) | Richard Biener | 5 files, -75/+141
2019-11-18 Richard Biener <rguenther@suse.de> PR tree-optimization/92516 * tree-vect-slp.c (vect_analyze_slp_instance): Add bst_map argument, hoist bst_map creation/destruction to ... (vect_analyze_slp): ... here, forming a true graph with SLP instances being the entries. (vect_detect_hybrid_slp_stmts): Remove wrapper. (vect_detect_hybrid_slp): Use one visited set for all graph entries. (vect_slp_analyze_node_operations): Simplify visited/lvisited to hash-sets of slp_tree. (vect_slp_analyze_operations): Likewise. (vect_bb_slp_scalar_cost): Remove wrapper. (vect_bb_vectorization_profitable_p): Use one visited set for all graph entries. (vect_schedule_slp_instance): Elide bst_map use. (vect_schedule_slp): Likewise. * g++.dg/vect/slp-pr92516.cc: New testcase. 2019-11-18 Richard Biener <rguenther@suse.de> * tree-vect-slp.c (vect_analyze_slp_instance): When a CTOR was vectorized with just external refs fail. * gcc.dg/vect/vect-ctor-1.c: New testcase. From-SVN: r278406
2019-11-18 | Unset m_checker in sem_function::init. | Martin Liska | 2 files, -0/+8
2019-11-18 Martin Liska <mliska@suse.cz> PR ipa/92525 * ipa-icf.c (sem_function::init): Unset m_checker at the end of the function. From-SVN: r278405
2019-11-18 | Remove strange dump suboptions in testsuite. | Martin Liska | 3 files, -2/+11
2019-11-18 Martin Liska <mliska@suse.cz> * gcc.dg/ipa/ipa-icf-36.c: Remove 'all-all-all'. * gcc.dg/ipa/ipa-icf-37.c: Likewise. From-SVN: r278404
2019-11-18 | re PR tree-optimization/92558 (Miscompare of 554.roms_r with -Ofast -march=znver2 -flto since r278289) | Richard Biener | 4 files, -0/+35
2019-11-18 Richard Biener <rguenther@suse.de> PR tree-optimization/92558 * tree-vect-loop.c (vect_create_epilog_for_reduction): When reducing the width of a reduction vector def update new_phis. * gcc.dg/vect/pr92558.c: New testcase. From-SVN: r278400
2019-11-18 | musl: use correct long double abi by default | Szabolcs Nagy | 3 files, -2/+33
On powerpc and s390x the musl ABI requires 64 bit and 128 bit long double respectively, so adjust the default. gcc/ChangeLog: 2019-11-18 Szabolcs Nagy <szabolcs.nagy@arm.com> * configure.ac (gcc_cv_target_ldbl128): Set for powerpc*-*-linux-musl* and s390*-*-linux-musl* targets. * configure: Regenerate. From-SVN: r278398
2019-11-18 | s390: add musl support | Szabolcs Nagy | 2 files, -0/+8
Add the musl dynamic linker names. gcc/ChangeLog: 2019-11-18 Szabolcs Nagy <szabolcs.nagy@arm.com> * config/s390/linux.h (MUSL_DYNAMIC_LINKER32): Define. (MUSL_DYNAMIC_LINKER64): Define. From-SVN: r278397
2019-11-18 | Improve -dbg-cnt error message and support :0. | Martin Liska | 2 files, -11/+14
2019-11-18 Martin Liska <mliska@suse.cz> * dbgcnt.c (dbg_cnt_set_limit_by_name): Provide error message for an unknown counter. (dbg_cnt_process_single_pair): Support 0 as minimum value. (dbg_cnt_process_opt): Remove unreachable code. From-SVN: r278396
2019-11-18 | Verify NOP_EXPR LHS type in IPA ICF. | Martin Liska | 4 files, -0/+46
2019-11-18 Martin Liska <mliska@suse.cz> PR ipa/92529 * ipa-icf-gimple.c (func_checker::compare_gimple_assign): Compare LHS types of NOP_EXPR. 2019-11-18 Martin Liska <mliska@suse.cz> PR ipa/92529 * gcc.dg/ipa/pr92529.c: New test. From-SVN: r278395
2019-11-18 | [mid-end][__RTL] Clean state despite unspecified __RTL startwith passes | Matthew Malcomson | 6 files, -20/+70
Hi there,

When compiling an __RTL function that has an unspecified "startwith" pass, we currently don't run the cleanup pass, which means that we ICE on the next function (if it's a basic function). This change ensures that the clean_state pass is run even if the startwith pass is unspecified. We also ensure the name of the startwith pass is always freed correctly.

As an example, before this change the following code would ICE when compiling the function `foo_a`, when compiled with ./aarch64-none-linux-gnu-gcc -O0 -S unspecified-pass-error.c -o test.s

```
int __RTL () badfoo ()
{
  (function "badfoo"
    (insn-chain
      (block 2
        (edge-from entry (flags "FALLTHRU"))
        (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
        (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
        (cinsn 10 (use (reg/i:SI x19)))
        (edge-to exit (flags "FALLTHRU"))
      ) ;; block 2
    ) ;; insn-chain
  ) ;; function "foo2"
}

int foo_a ()
{
  return 200;
}
```

Now it silently ignores the __RTL function and successfully compiles foo_a.

regtest done on aarch64
regtest done on x86_64

OK for trunk?

gcc/ChangeLog: 2019-11-18 Matthew Malcomson <matthew.malcomson@arm.com> * run-rtl-passes.c (run_rtl_passes): Accept and handle empty "initial_pass_name" argument -- by running "*clean_state" pass. Also free the "initial_pass_name" when done. gcc/c/ChangeLog: 2019-11-18 Matthew Malcomson <matthew.malcomson@arm.com> * c-parser.c (c_parser_parse_rtl_body): Always call run_rtl_passes, even if startwith pass is not provided. gcc/testsuite/ChangeLog: 2019-11-18 Matthew Malcomson <matthew.malcomson@arm.com> * gcc.dg/rtl/aarch64/unspecified-pass-error.c: New test. From-SVN: r278393
2019-11-18 | re PR target/92462 ([arm32] -ftree-pre makes a variable to be wrongly hoisted out) | Richard Biener | 2 files, -4/+16
2019-11-18 Richard Biener <rguenther@suse.de> PR rtl-optimization/92462 * alias.c (find_base_term): Restrict the look through ANDs. (find_base_value): Likewise. From-SVN: r278391
2019-11-18 | [testsuite][ARM] check_effective_target_arm_vfp_ok_nocache: Fix typo in option name | Christophe Lyon | 2 files, -2/+8
2019-11-18 Christophe Lyon <christophe.lyon@linaro.org> * lib/target-supports.exp (check_effective_target_arm_vfp_ok_nocache): Fix typo in option name. From-SVN: r278390
2019-11-18 | re PR target/92545 (avr: support ATmega devices from the 0-series) | Georg-Johann Lay | 1 file, -1/+1
PR target/92545 * config/avr/gen-avr-mmcu-specs.c (print_mcu) [link_pm_base_address]: Symbol name is __RODATA_PM_OFFSET__. From-SVN: r278389
2019-11-18 | re PR target/92545 (avr: support ATmega devices from the 0-series) | Georg-Johann Lay | 1 file, -12/+12
PR target/92545 * doc/avr-mmcu.texi: Regenerate. From-SVN: r278388
2019-11-18 | Add support for AVR devices from the 0-series. | Georg-Johann Lay | 7 files, -333/+455
PR target/92545 * config/avr/avr-arch.h (avr_mcu_t) <flash_pm_offset>: New field. * config/avr/avr-devices.c (avr_mcu_types): Adjust initializers. * config/avr/avr-mcus.def (AVR_MCU): Add respective field. * config/avr/specs.h (LINK_SPEC) <%(link_pm_base_address)>: Add. * config/avr/gen-avr-mmcu-specs.c (print_mcu) <*cpp, *cpp_mcu, *cpp_avrlibc, *link_pm_base_address>: Emit code for spec definitions. * doc/avr-mmcu.texi: Regenerate. From-SVN: r278387
2019-11-18 | Split X86_TUNE_AVX128_OPTIMAL into X86_TUNE_AVX256_SPLIT_REGS and X86_TUNE_AVX128_OPTIMAL | Hongtao Liu | 6 files, -6/+25
ChangeLog gcc/ PR target/92448 * config/i386/i386-expand.c (ix86_expand_set_or_cpymem): Replace TARGET_AVX128_OPTIMAL with TARGET_AVX256_SPLIT_REGS. * config/i386/i386-options.c (ix86_vec_cost): Ditto. (ix86_reassociation_width): Ditto. * config/i386/i386-options.c (ix86_option_override_internal): Replace TARGET_AVX128_OPTIMAL with ix86_tune_features[X86_TUNE_AVX128_OPTIMAL]. * config/i386/i386.h (TARGET_AVX256_SPLIT_REGS): New macro. (TARGET_AVX128_OPTIMAL): Deleted. * config/i386/x86-tune.def (X86_TUNE_AVX256_SPLIT_REGS): New DEF_TUNE. From-SVN: r278385
2019-11-18 | Daily bump. | GCC Administrator | 1 file, -1/+1
From-SVN: r278382
2019-11-17 | * gcc.dg/complex-6.c: Do not run dump scan tests for rx target. | Jeff Law | 2 files, -2/+6
From-SVN: r278376
2019-11-17 | method.c (lookup_comparison_result): Use %qD instead of %<%T::%D%> to print the decl. | Jakub Jelinek | 4 files, -2/+27
* method.c (lookup_comparison_result): Use %qD instead of %<%T::%D%> to print the decl. (lookup_comparison_category): Use %qD instead of %<std::%D%> to print the decl. * g++.dg/cpp2a/spaceship-err3.C: New test. From-SVN: r278375
2019-11-17 | Daily bump. | GCC Administrator | 1 file, -1/+1
From-SVN: r278369
2019-11-17 | rs6000: Allow mode GPR in cceq_{ior,rev}_compare | Segher Boessenkool | 2 files, -7/+18
Also make it a parameterized name: @cceq_{ior,rev}_compare_<mode>. * config/rs6000/rs6000.md (cceq_ior_compare): Rename to... (@cceq_ior_compare_<mode> for GPR): ... this. Allow GPR instead of just SI. (cceq_rev_compare): Rename to... (@cceq_rev_compare_<mode> for GPR): ... this. Allow GPR instead of just SI. (define_split for <bd>tf_<mode>): Add SImode first argument to gen_cceq_ior_compare. From-SVN: r278366
2019-11-16 | Delete common/config/powerpcspe | Segher Boessenkool | 2 files, -321/+4
I missed this part in r266961. Various people have been editing it since; I finally noticed. * common/config/powerpcspe: Delete. From-SVN: r278361
2019-11-16 | [AArch64] Robustify aarch64_wrffr | Richard Sandiford | 2 files, -1/+6
This patch uses distinct values for the FFR and FFRT outputs of aarch64_wrffr, so that a following aarch64_copy_ffr_to_ffrt has an effect. This is needed to avoid regressions with later patches. The block comment at the head of the file already described the pattern this way, and there was already an unspec for it. Not sure what made me change it... 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (aarch64_wrffr): Wrap the FFRT output in UNSPEC_WRFFR. From-SVN: r278356
2019-11-16 | Use a single comparison for index-based alias checks | Richard Sandiford | 6 files, -51/+295
This patch rewrites the index-based alias checks to use conditions of the form:

  (unsigned T) (a - b + bias) <= limit

E.g. before the patch:

struct s { int x[100]; };

void
f1 (struct s *s1, int a, int b)
{
  for (int i = 0; i < 32; ++i)
    s1->x[i + a] += s1->x[i + b];
}

used:

        add     w3, w1, 3
        cmp     w3, w2
        add     w3, w2, 3
        ccmp    w1, w3, 0, ge
        ble     .L2

whereas after the patch it uses:

        sub     w3, w1, w2
        add     w3, w3, 3
        cmp     w3, 6
        bls     .L2

The patch also fixes the seg_len1 and seg_len2 negation for cases in which seg_len is a "negative unsigned" value narrower than 64 bits, like it is for 32-bit targets. Previously we'd end up with values like 0xffffffff00000001 instead of 1. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.c (create_intersect_range_checks_index): Rewrite the index tests to have the form (unsigned T) (B - A + bias) <= limit. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-18.c: New test. * gcc.dg/vect/vect-alias-check-19.c: Likewise. * gcc.dg/vect/vect-alias-check-20.c: Likewise. From-SVN: r278354
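A quick sanity check of the underlying identity, with the bias (3) and limit (6) taken from the example above; the test harness itself is illustrative:

```
#include <assert.h>

/* -bias <= (a - b) <= limit - bias  iff
   (unsigned) (a - b + bias) <= limit,
   provided a - b + bias does not overflow.  */
int
main (void)
{
  for (int a = -100; a <= 100; ++a)
    for (int b = -100; b <= 100; ++b)
      {
        int naive = (a - b >= -3 && a - b <= 3);
        int folded = (unsigned) (a - b + 3) <= 6;
        assert (naive == folded);
      }
  return 0;
}
```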
2019-11-16 | Print the type of alias check in a dump message | Richard Sandiford | 14 files, -0/+48
This patch prints a message to say how an alias check is being implemented. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.c (create_intersect_range_checks_index) (create_intersect_range_checks): Print dump messages. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-1.c: Test for the type of alias check. * gcc.dg/vect/vect-alias-check-8.c: Likewise. * gcc.dg/vect/vect-alias-check-9.c: Likewise. * gcc.dg/vect/vect-alias-check-10.c: Likewise. * gcc.dg/vect/vect-alias-check-11.c: Likewise. * gcc.dg/vect/vect-alias-check-12.c: Likewise. * gcc.dg/vect/vect-alias-check-13.c: Likewise. * gcc.dg/vect/vect-alias-check-14.c: Likewise. * gcc.dg/vect/vect-alias-check-15.c: Likewise. * gcc.dg/vect/vect-alias-check-16.c: Likewise. * gcc.dg/vect/vect-alias-check-17.c: Likewise. From-SVN: r278353
2019-11-16 | Dump the list of merged alias pairs | Richard Sandiford | 9 files, -1/+270
This patch dumps the final (merged) list of alias pairs. It also adds: - WAW and RAW versions of vect-alias-check-8.c - a "well-ordered" version of vect-alias-check-9.c (i.e. all reads before any writes) - a test with mixed steps in the same alias pair I also tweaked the test value in vect-alias-check-9.c so that the result was less likely to be accidentally correct if the alias isn't honoured. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.c (dump_alias_pair): New function. (prune_runtime_alias_test_list): Use it to dump each merged alias pair. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-8.c: Test for the RAW flag. * gcc.dg/vect/vect-alias-check-9.c: Test for the ARBITRARY flag. (TEST_VALUE): Use a higher value for early iterations. * gcc.dg/vect/vect-alias-check-14.c: New test. * gcc.dg/vect/vect-alias-check-15.c: Likewise. * gcc.dg/vect/vect-alias-check-16.c: Likewise. * gcc.dg/vect/vect-alias-check-17.c: Likewise. From-SVN: r278352
2019-11-16 | Record whether a dr_with_seg_len contains mixed steps | Richard Sandiford | 3 files, -22/+55
prune_runtime_alias_test_list can merge dr_with_seg_len_pair_ts that have different steps for the first reference or different steps for the second reference. This patch adds a flag to record that. I don't know whether the change to create_intersect_range_checks_index fixes anything in practice. It would have to be a corner case if so, since at present we only merge two alias pairs if either the first or the second references are identical and only the other references differ. And the vectoriser uses VF-based segment lengths only if both references in a pair have the same step. Either way, it still seems wrong to use DR_STEP when it doesn't represent all checks that have been merged into the pair. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.h (DR_ALIAS_MIXED_STEPS): New flag. * tree-data-ref.c (prune_runtime_alias_test_list): Set it when merging data references with different steps. (create_intersect_range_checks_index): Take a dr_with_seg_len_pair_t instead of two dr_with_seg_lens. Bail out if DR_ALIAS_MIXED_STEPS is set. (create_intersect_range_checks): Take a dr_with_seg_len_pair_t instead of two dr_with_seg_lens. Update call to create_intersect_range_checks_index. (create_runtime_alias_checks): Update call accordingly. From-SVN: r278351
2019-11-16 | Add flags to dr_with_seg_len_pair_t | Richard Sandiford | 5 files, -14/+161
This patch adds a bunch of flags to dr_with_seg_len_pair_t, for use by later patches. The update to tree-loop-distribution.c is conservatively correct, but might be tweakable later. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.h (DR_ALIAS_RAW, DR_ALIAS_WAR, DR_ALIAS_WAW) (DR_ALIAS_ARBITRARY, DR_ALIAS_SWAPPED, DR_ALIAS_UNSWAPPED): New flags. (dr_with_seg_len_pair_t::sequencing): New enum. (dr_with_seg_len_pair_t::flags): New member variable. (dr_with_seg_len_pair_t::dr_with_seg_len_pair_t): Take a sequencing parameter and initialize the flags member variable. * tree-loop-distribution.c (compute_alias_check_pairs): Update call accordingly. * tree-vect-data-refs.c (vect_prune_runtime_alias_test_list): Likewise. Ensure the two data references in an alias pair are in statement order, if there is a defined order. * tree-data-ref.c (prune_runtime_alias_test_list): Use DR_ALIAS_SWAPPED and DR_ALIAS_UNSWAPPED to record whether we've swapped the references in a dr_with_seg_len_pair_t. OR together the flags when merging two dr_with_seg_len_pair_ts. After merging, try to restore the original dr_with_seg_len order, updating the flags if that fails. From-SVN: r278350
2019-11-16 | Delay swapping data refs in prune_runtime_alias_test_list | Richard Sandiford | 2 files, -9/+21
prune_runtime_alias_test_list swapped dr_as between two dr_with_seg_len pairs before finally deciding whether to merge them. Bailing out later would therefore leave the pairs in an incorrect state. IMO a better fix would be to split this out into a subroutine that produces a temporary dr_with_seg_len on success, rather than changing an existing one in-place. It would then be easy to merge both the dr_as and dr_bs if we wanted to, rather than requiring one of them to be equal. But here I tried to do something that could be backported if necessary. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.c (prune_runtime_alias_test_list): Delay swapping the dr_as based on init values until we've decided whether to merge them. From-SVN: r278349
2019-11-16 | Move canonicalisation of dr_with_seg_len_pair_ts | Richard Sandiford | 4 files, -23/+33
The two users of tree-data-ref's runtime alias checks both canonicalise the order of the dr_with_seg_lens in a pair before passing them to prune_runtime_alias_test_list. It's more convenient for later patches if prune_runtime_alias_test_list does that itself. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.c (prune_runtime_alias_test_list): Sort the two accesses in each dr_with_seg_len_pair_t before trying to combine separate dr_with_seg_len_pair_ts. * tree-loop-distribution.c (compute_alias_check_pairs): Don't do that here. * tree-vect-data-refs.c (vect_prune_runtime_alias_test_list): Likewise. From-SVN: r278348
2019-11-16 | [AArch64] Add scatter stores for partial SVE modes | Richard Sandiford | 10 files, -40/+185
This patch adds support for scatter stores of partial vectors, where the vector base or offset elements can be wider than the elements being stored. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (scatter_store<SVE_FULL_SD:mode><v_int_equiv>): Extend to... (scatter_store<SVE_24:mode><v_int_container>): ...this. (mask_scatter_store<SVE_FULL_S:mode><v_int_equiv>): Extend to... (mask_scatter_store<SVE_4:mode><v_int_equiv>): ...this. (mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>): Extend to... (mask_scatter_store<SVE_2:mode><v_int_equiv>): ...this. (*mask_scatter_store<mode><v_int_container>_<su>xtw_unpacked): New pattern. (*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to... (*mask_scatter_store<SVE_2:mode><v_int_equiv>_sxtw): ...this. (*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to... (*mask_scatter_store<SVE_2:mode><v_int_equiv>_uxtw): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/scatter_store_1.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/scatter_store_2.c: Update accordingly. * gcc.target/aarch64/sve/scatter_store_3.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/scatter_store_4.c: Update accordingly. * gcc.target/aarch64/sve/scatter_store_5.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements. * gcc.target/aarch64/sve/scatter_store_8.c: New test. * gcc.target/aarch64/sve/scatter_store_9.c: Likewise. From-SVN: r278347
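A sketch of the kind of loop this enables (names are illustrative, and whether this exact loop vectorizes depends on flags and the cost model):

```
#include <stdint.h>

/* The stored elements (16 bits) are narrower than the 64-bit offsets,
   so the stored data lives in a partial vector; previously only
   full-width SVE scatter stores were supported.  */
void
scatter (uint16_t *restrict dst, const int64_t *restrict index,
         const uint16_t *restrict src, int n)
{
  for (int i = 0; i < n; ++i)
    dst[index[i]] = src[i];
}
```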
2019-11-16 | [AArch64] Pattern-match SVE extending gather loads | Richard Sandiford | 17 files, -128/+710
This patch pattern-matches a partial gather load followed by a sign or zero extension into an extending gather load. (The partial gather load is already an extending load; we just don't rely on the upper bits of the elements.) 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_2BHSI, SVE_2HSDI, SVE_4BHI) (SVE_4HSI): New mode iterators. (ANY_EXTEND2): New code iterator. * config/aarch64/aarch64-sve.md (@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>): Extend to... (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): ...this, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. (@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Likewise extend to... (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw): Likewise extend to... (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_sxtw) ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw): Likewise extend to... (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_uxtw) ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): New pattern. (*aarch64_ldff1_gather<mode>_sxtw): Canonicalize to a constant extension predicate. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw): Describe the extension as a predicated rather than unpredicated extension. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw): Likewise. Canonicalize to a constant extension predicate. * config/aarch64/aarch64-sve-builtins-base.cc (svld1_gather_extend_impl::expand): Add an extra predicate for the extension. (svldff1_gather_extend_impl::expand): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/gather_load_extend_1.c: New test. * gcc.target/aarch64/sve/gather_load_extend_2.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_3.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_4.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_5.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_6.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_7.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_8.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_9.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_10.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_11.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_12.c: Likewise. From-SVN: r278346
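A sketch of a loop that can now match the extending form (illustrative, not one of the new tests):

```
#include <stdint.h>

/* Gather 16-bit elements and zero-extend them to 64 bits: the partial
   gather load and the extension fuse into one extending gather.  */
void
gather_extend (uint64_t *restrict dst, const uint16_t *restrict src,
               const int64_t *restrict index, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = src[index[i]];
}
```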
2019-11-16 | [AArch64] Add gather loads for partial SVE modes | Richard Sandiford | 14 files, -50/+268
This patch adds support for gather loads of partial vectors, where the vector base or offset elements can be wider than the elements being loaded. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_24, SVE_2, SVE_4): New mode iterators. * config/aarch64/aarch64-sve.md (gather_load<SVE_FULL_SD:mode><v_int_equiv>): Extend to... (gather_load<SVE_24:mode><v_int_container>): ...this. (mask_gather_load<SVE_FULL_S:mode><v_int_equiv>): Extend to... (mask_gather_load<SVE_4:mode><v_int_container>): ...this. (mask_gather_load<SVE_FULL_D:mode><v_int_equiv>): Extend to... (mask_gather_load<SVE_2:mode><v_int_container>): ...this. (*mask_gather_load<SVE_2:mode><v_int_container>_<su>xtw_unpacked): New pattern. (*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to... (*mask_gather_load<SVE_2:mode><v_int_equiv>_sxtw): ...this. Allow the nominal extension predicate to be different from the load predicate. (*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to... (*mask_gather_load<SVE_2:mode><v_int_equiv>_uxtw): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/gather_load_1.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/gather_load_2.c: Update accordingly. * gcc.target/aarch64/sve/gather_load_3.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/gather_load_4.c: Update accordingly. * gcc.target/aarch64/sve/gather_load_5.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements. * gcc.target/aarch64/sve/gather_load_6.c: Add --param aarch64-sve-compare-costs=0. (TEST_LOOP): Start at 0. * gcc.target/aarch64/sve/gather_load_7.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/gather_load_8.c: New test. * gcc.target/aarch64/sve/gather_load_9.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_6.c: Add --param aarch64-sve-compare-costs=0. From-SVN: r278345
2019-11-16 | [AArch64] Add truncation for partial SVE modes | Richard Sandiford | 11 files, -6/+113
This patch adds support for "truncating" to a partial SVE vector from either a full SVE vector or a wider partial vector. This truncation is actually a no-op and so should have zero cost in the vector cost model. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (trunc<SVE_HSDI:mode><SVE_PARTIAL_I:mode>2): New pattern. * config/aarch64/aarch64.c (aarch64_integer_truncation_p): New function. (aarch64_sve_adjust_stmt_cost): Call it. gcc/testsuite/ * gcc.target/aarch64/sve/mask_struct_load_1.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise. * gcc.target/aarch64/sve/pack_1.c: Likewise. * gcc.target/aarch64/sve/truncate_1.c: New test. From-SVN: r278344
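An illustrative loop for the no-op truncation (not one of the new tests):

```
#include <stdint.h>

/* Narrowing 32-bit elements to 16 bits just ignores the high half of
   each container, so the "truncation" should be costed at zero.  */
void
narrow (int16_t *restrict dst, const int32_t *restrict src, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = (int16_t) src[i];
}
```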
2019-11-16 | [AArch64] Pattern-match SVE extending loads | Richard Sandiford | 16 files, -75/+358
This patch pattern-matches a partial SVE load followed by a sign or zero extension into an extending load. (The partial load is already an extending load; we just don't rely on the upper bits of the elements.) Nothing yet uses the extra LDFF1 and LDNF1 combinations, but it seemed more consistent to provide them, since I needed to update the pattern to use a predicated extension anyway. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>): (@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Combine into... (@aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>): ...this new pattern, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Combine into... (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>): ...this new pattern, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. * config/aarch64/aarch64-sve-builtins.cc (function_expander::use_contiguous_load_insn): Add an extra predicate for extending loads. * config/aarch64/aarch64.c (aarch64_extending_load_p): New function. (aarch64_sve_adjust_stmt_cost): Likewise. (aarch64_add_stmt_cost): Use aarch64_sve_adjust_stmt_cost to adjust the cost of SVE vector stmts. gcc/testsuite/ * gcc.target/aarch64/sve/load_extend_1.c: New test. * gcc.target/aarch64/sve/load_extend_2.c: Likewise. * gcc.target/aarch64/sve/load_extend_3.c: Likewise. * gcc.target/aarch64/sve/load_extend_4.c: Likewise. * gcc.target/aarch64/sve/load_extend_5.c: Likewise. * gcc.target/aarch64/sve/load_extend_6.c: Likewise. * gcc.target/aarch64/sve/load_extend_7.c: Likewise. * gcc.target/aarch64/sve/load_extend_8.c: Likewise. * gcc.target/aarch64/sve/load_extend_9.c: Likewise. * gcc.target/aarch64/sve/load_extend_10.c: Likewise. * gcc.target/aarch64/sve/reduc_4.c: Add --param aarch64-sve-compare-costs=0. From-SVN: r278343
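A sketch of the pattern being matched (illustrative names):

```
#include <stdint.h>

/* The 8-bit loads already happen in 32-bit containers; relying on the
   extension directly turns load + zero-extend into a single extending
   load (e.g. LD1B into .S elements).  */
void
widen (uint32_t *restrict dst, const uint8_t *restrict src, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = src[i];
}
```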
2019-11-16 | [AArch64] Add sign and zero extension for partial SVE modes | Richard Sandiford | 16 files, -31/+220
This patch adds support for extending from partial SVE modes to both full vector modes and wider partial modes. Some tests now need --param aarch64-sve-compare-costs=0 to force the original full-vector code. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_HSDI): New mode iterator. (narrower_mask): Handle VNx4HI, VNx2HI and VNx2SI. * config/aarch64/aarch64-sve.md (<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): New pattern. (*<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): Likewise. (@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Update comment. Avoid new narrower_mask ambiguity. (@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise. (*cond_uxt<mode>_2): Update comment. (*cond_uxt<mode>_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cost_model_1.c: Expect the loop to be vectorized with bytes stored in 32-bit containers. * gcc.target/aarch64/sve/extend_1.c: New test. * gcc.target/aarch64/sve/extend_2.c: New test. * gcc.target/aarch64/sve/extend_3.c: New test. * gcc.target/aarch64/sve/extend_4.c: New test. * gcc.target/aarch64/sve/load_const_offset_3.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise. * gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise. * gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise. From-SVN: r278342
2019-11-16 | [AArch64] Add autovec support for partial SVE vectors | Richard Sandiford | 12 files, -167/+674
This patch adds the bare minimum needed to support autovectorisation of partial SVE vectors, namely moves and integer addition. Later patches add more interesting cases. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-modes.def: Define partial SVE vector float modes. * config/aarch64/aarch64-protos.h (aarch64_sve_pred_mode): New function. * config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle the new vector float modes. (aarch64_sve_container_bits): New function. (aarch64_sve_pred_mode): Likewise. (aarch64_get_mask_mode): Use it. (aarch64_sve_element_int_mode): Handle structure modes and partial modes. (aarch64_sve_container_int_mode): New function. (aarch64_vectorize_related_mode): Return SVE modes when given SVE modes. Handle partial modes, taking the preferred number of units from the size of the given mode. (aarch64_hard_regno_mode_ok): Allow partial modes to be stored in registers. (aarch64_expand_sve_ld1rq): Use the mode form of aarch64_sve_pred_mode. (aarch64_expand_sve_const_vector): Handle partial SVE vectors. (aarch64_split_sve_subreg_move): Use the mode form of aarch64_sve_pred_mode. (aarch64_secondary_reload): Handle partial modes in the same way as full big-endian vectors. (aarch64_vector_mode_supported_p): Allow partial SVE vectors. (aarch64_autovectorize_vector_modes): Try unpacked SVE vectors, merging with the Advanced SIMD modes. If two modes have the same size, try the Advanced SIMD mode first. (aarch64_simd_valid_immediate): Use the container rather than the element mode for INDEX constants. (aarch64_simd_vector_alignment): Make the alignment of partial SVE vector modes the same as their minimum size. (aarch64_evpc_sel): Use the mode form of aarch64_sve_pred_mode. * config/aarch64/aarch64-sve.md (mov<SVE_FULL:mode>): Extend to... (mov<SVE_ALL:mode>): ...this. (movmisalign<SVE_FULL:mode>): Extend to... (movmisalign<SVE_ALL:mode>): ...this. (*aarch64_sve_mov<mode>_le): Rename to... (*aarch64_sve_mov<mode>_ldr_str): ...this. (*aarch64_sve_mov<SVE_FULL:mode>_be): Rename and extend to... (*aarch64_sve_mov<SVE_ALL:mode>_no_ldr_str): ...this. Handle partial modes regardless of endianness. (aarch64_sve_reload_be): Rename to... (aarch64_sve_reload_mem): ...this and enable for little-endian. Use aarch64_sve_pred_mode to get the appropriate predicate mode. (@aarch64_pred_mov<SVE_FULL:mode>): Extend to... (@aarch64_pred_mov<SVE_ALL:mode>): ...this. (*aarch64_sve_mov<SVE_FULL:mode>_subreg_be): Extend to... (*aarch64_sve_mov<SVE_ALL:mode>_subreg_be): ...this. (@aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to... (@aarch64_sve_reinterpret<SVE_ALL:mode>): ...this. (*aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to... (*aarch64_sve_reinterpret<SVE_ALL:mode>): ...this. (maskload<SVE_FULL:mode><vpred>): Extend to... (maskload<SVE_ALL:mode><vpred>): ...this. (maskstore<SVE_FULL:mode><vpred>): Extend to... (maskstore<SVE_ALL:mode><vpred>): ...this. (vec_duplicate<SVE_FULL:mode>): Extend to... (vec_duplicate<SVE_ALL:mode>): ...this. (*vec_duplicate<SVE_FULL:mode>_reg): Extend to... (*vec_duplicate<SVE_ALL:mode>_reg): ...this. (sve_ld1r<SVE_FULL:mode>): Extend to... (sve_ld1r<SVE_ALL:mode>): ...this. (vec_series<SVE_FULL_I:mode>): Extend to... (vec_series<SVE_I:mode>): ...this. (*vec_series<SVE_FULL_I:mode>_plus): Extend to... (*vec_series<SVE_I:mode>_plus): ...this. (@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Avoid new VPRED ambiguity. (@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise. 
(add<SVE_FULL_I:mode>3): Extend to... (add<SVE_I:mode>3): ...this. * config/aarch64/iterators.md (SVE_ALL, SVE_I): New mode iterators. (Vetype, Vesize, VEL, Vel, vwcore): Handle partial SVE vector modes. (VPRED, vpred): Likewise. (Vctype): New iterator. (vw): Remove SVE modes. gcc/testsuite/ * gcc.target/aarch64/sve/mixed_size_1.c: New test. * gcc.target/aarch64/sve/mixed_size_2.c: Likewise. * gcc.target/aarch64/sve/mixed_size_3.c: Likewise. * gcc.target/aarch64/sve/mixed_size_4.c: Likewise. * gcc.target/aarch64/sve/mixed_size_5.c: Likewise. From-SVN: r278341
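As a sketch, the kind of mixed-element-size loop that unpacked vectors help with (illustrative, not one of the new mixed_size tests):

```
#include <stdint.h>

/* With partial vectors the 16-bit elements can live in 32-bit
   containers, so both accesses use the same number of lanes and a
   single predicate can cover the whole loop.  */
void
mixed (int32_t *restrict a, const int16_t *restrict b, int n)
{
  for (int i = 0; i < n; ++i)
    a[i] += b[i];
}
```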
2019-11-16 | [AArch64] Tweak gcc.target/aarch64/sve/clastb_8.c | Richard Sandiford | 2 files, -2/+10
clastb_8.c was using scan-tree-dump-times to check for fully-masked loops, which made it sensitive to the number of times we try to vectorize. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/testsuite/ * gcc.target/aarch64/sve/clastb_8.c: Use assembly tests to check for fully-masked loops. From-SVN: r278340
2019-11-16 | [AArch64] Replace SVE_PARTIAL with SVE_PARTIAL_I | Richard Sandiford | 3 files, -12/+18
Another renaming, this time to make way for partial/unpacked float modes. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_PARTIAL): Rename to... (SVE_PARTIAL_I): ...this. * config/aarch64/aarch64-sve.md: Apply the above renaming throughout. From-SVN: r278339