2019-11-16  [AArch64] Add scatter stores for partial SVE modes  (Richard Sandiford; 10 files, -40/+185)
This patch adds support for scatter stores of partial vectors, where the vector base or offset elements can be wider than the elements being stored. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (scatter_store<SVE_FULL_SD:mode><v_int_equiv>): Extend to... (scatter_store<SVE_24:mode><v_int_container>): ...this. (mask_scatter_store<SVE_FULL_S:mode><v_int_equiv>): Extend to... (mask_scatter_store<SVE_4:mode><v_int_equiv>): ...this. (mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>): Extend to... (mask_scatter_store<SVE_2:mode><v_int_equiv>): ...this. (*mask_scatter_store<mode><v_int_container>_<su>xtw_unpacked): New pattern. (*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to... (*mask_scatter_store<SVE_2:mode><v_int_equiv>_sxtw): ...this. (*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to... (*mask_scatter_store<SVE_2:mode><v_int_equiv>_uxtw): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/scatter_store_1.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/scatter_store_2.c: Update accordingly. * gcc.target/aarch64/sve/scatter_store_3.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/scatter_store_4.c: Update accordingly. * gcc.target/aarch64/sve/scatter_store_5.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements. * gcc.target/aarch64/sve/scatter_store_8.c: New test. * gcc.target/aarch64/sve/scatter_store_9.c: Likewise. From-SVN: r278347
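For illustration, this is the kind of loop the patch lets the vectoriser handle (a sketch of my own, not taken from the patch or its testsuite): the stored elements are 8 bits wide while the offsets are 64-bit, so the stored data occupies a partial SVE vector.

```
// Sketch only: 8-bit scatter stores addressed by 64-bit offsets.
void scatter_bytes (char *base, const long *offsets, const char *src, int n)
{
  for (int i = 0; i < n; ++i)
    base[offsets[i]] = src[i];   // ST1B with a vector of 64-bit offsets
}
```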
2019-11-16  [AArch64] Pattern-match SVE extending gather loads  (Richard Sandiford; 17 files, -128/+710)
This patch pattern-matches a partial gather load followed by a sign or zero extension into an extending gather load. (The partial gather load is already an extending load; we just don't rely on the upper bits of the elements.) 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_2BHSI, SVE_2HSDI, SVE_4BHI) (SVE_4HSI): New mode iterators. (ANY_EXTEND2): New code iterator. * config/aarch64/aarch64-sve.md (@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>): Extend to... (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): ...this, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. (@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Likewise extend to... (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw): Likewise extend to... (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_sxtw) ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw): Likewise extend to... (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_uxtw) ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): New pattern. (*aarch64_ldff1_gather<mode>_sxtw): Canonicalize to a constant extension predicate. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw): Describe the extension as a predicated rather than unpredicated extension. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw): Likewise. Canonicalize to a constant extension predicate. * config/aarch64/aarch64-sve-builtins-base.cc (svld1_gather_extend_impl::expand): Add an extra predicate for the extension. (svldff1_gather_extend_impl::expand): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/gather_load_extend_1.c: New test. * gcc.target/aarch64/sve/gather_load_extend_2.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_3.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_4.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_5.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_6.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_7.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_8.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_9.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_10.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_11.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_12.c: Likewise. From-SVN: r278346
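A sketch (mine, not from the patch) of the idiom being pattern-matched: the gather loads 16-bit elements and the assignment widens them to 64 bits, which can now be folded into a single extending gather load.

```
// Sketch only: gather of 16-bit elements, zero-extended to 64 bits.
void gather_widen (unsigned long *dst, const unsigned short *src,
                   const unsigned long *idx, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = src[idx[i]];        // one extending gather, no separate extend
}
```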
2019-11-16  [AArch64] Add gather loads for partial SVE modes  (Richard Sandiford; 14 files, -50/+268)
This patch adds support for gather loads of partial vectors, where the vector base or offset elements can be wider than the elements being loaded. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_24, SVE_2, SVE_4): New mode iterators. * config/aarch64/aarch64-sve.md (gather_load<SVE_FULL_SD:mode><v_int_equiv>): Extend to... (gather_load<SVE_24:mode><v_int_container>): ...this. (mask_gather_load<SVE_FULL_S:mode><v_int_equiv>): Extend to... (mask_gather_load<SVE_4:mode><v_int_container>): ...this. (mask_gather_load<SVE_FULL_D:mode><v_int_equiv>): Extend to... (mask_gather_load<SVE_2:mode><v_int_container>): ...this. (*mask_gather_load<SVE_2:mode><v_int_container>_<su>xtw_unpacked): New pattern. (*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to... (*mask_gather_load<SVE_2:mode><v_int_equiv>_sxtw): ...this. Allow the nominal extension predicate to be different from the load predicate. (*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to... (*mask_gather_load<SVE_2:mode><v_int_equiv>_uxtw): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/gather_load_1.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/gather_load_2.c: Update accordingly. * gcc.target/aarch64/sve/gather_load_3.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/gather_load_4.c: Update accordingly. * gcc.target/aarch64/sve/gather_load_5.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements. * gcc.target/aarch64/sve/gather_load_6.c: Add --param aarch64-sve-compare-costs=0. (TEST_LOOP): Start at 0. * gcc.target/aarch64/sve/gather_load_7.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/gather_load_8.c: New test. * gcc.target/aarch64/sve/gather_load_9.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_6.c: Add --param aarch64-sve-compare-costs=0. From-SVN: r278345
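Again for illustration (a sketch, not from the testsuite): here the loaded elements are 8 bits wide but the offsets are 32-bit, so the loaded data forms a partial vector.

```
// Sketch only: 8-bit gather loads addressed by 32-bit offsets.
void gather_bytes (char *dst, const char *base, const int *offsets, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = base[offsets[i]];
}
```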
2019-11-16  [AArch64] Add truncation for partial SVE modes  (Richard Sandiford; 11 files, -6/+113)
This patch adds support for "truncating" to a partial SVE vector from either a full SVE vector or a wider partial vector. This truncation is actually a no-op and so should have zero cost in the vector cost model. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (trunc<SVE_HSDI:mode><SVE_PARTIAL_I:mode>2): New pattern. * config/aarch64/aarch64.c (aarch64_integer_truncation_p): New function. (aarch64_sve_adjust_stmt_cost): Call it. gcc/testsuite/ * gcc.target/aarch64/sve/mask_struct_load_1.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise. * gcc.target/aarch64/sve/pack_1.c: Likewise. * gcc.target/aarch64/sve/truncate_1.c: New test. From-SVN: r278344
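A sketch (not from the patch) of a loop containing such a truncation: when the 32-bit elements and the 16-bit results live in the same containers, the narrowing conversion only reinterprets the vector and generates no code.

```
// Sketch only: the int -> short narrowing can be a no-op on the vector side.
void narrow_store (short *dst, const int *src, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = (short) src[i];
}
```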
2019-11-16  [AArch64] Pattern-match SVE extending loads  (Richard Sandiford; 16 files, -75/+358)
This patch pattern-matches a partial SVE load followed by a sign or zero extension into an extending load. (The partial load is already an extending load; we just don't rely on the upper bits of the elements.) Nothing yet uses the extra LDFF1 and LDNF1 combinations, but it seemed more consistent to provide them, since I needed to update the pattern to use a predicated extension anyway. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>): (@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Combine into... (@aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>): ...this new pattern, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Combine into... (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>): ...this new pattern, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. * config/aarch64/aarch64-sve-builtins.cc (function_expander::use_contiguous_load_insn): Add an extra predicate for extending loads. * config/aarch64/aarch64.c (aarch64_extending_load_p): New function. (aarch64_sve_adjust_stmt_cost): Likewise. (aarch64_add_stmt_cost): Use aarch64_sve_adjust_stmt_cost to adjust the cost of SVE vector stmts. gcc/testsuite/ * gcc.target/aarch64/sve/load_extend_1.c: New test. * gcc.target/aarch64/sve/load_extend_2.c: Likewise. * gcc.target/aarch64/sve/load_extend_3.c: Likewise. * gcc.target/aarch64/sve/load_extend_4.c: Likewise. * gcc.target/aarch64/sve/load_extend_5.c: Likewise. * gcc.target/aarch64/sve/load_extend_6.c: Likewise. * gcc.target/aarch64/sve/load_extend_7.c: Likewise. * gcc.target/aarch64/sve/load_extend_8.c: Likewise. * gcc.target/aarch64/sve/load_extend_9.c: Likewise. * gcc.target/aarch64/sve/load_extend_10.c: Likewise. * gcc.target/aarch64/sve/reduc_4.c: Add --param aarch64-sve-compare-costs=0. From-SVN: r278343
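An illustrative loop (mine, not from the patch): the 8-bit load followed by sign extension to 32 bits can now be matched to a single extending load such as LD1SB.

```
// Sketch only: load-plus-sign-extend folded into one extending load.
void accumulate (int *dst, const signed char *src, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] += src[i];
}
```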
2019-11-16  [AArch64] Add sign and zero extension for partial SVE modes  (Richard Sandiford; 16 files, -31/+220)
This patch adds support for extending from partial SVE modes to both full vector modes and wider partial modes. Some tests now need --param aarch64-sve-compare-costs=0 to force the original full-vector code. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_HSDI): New mode iterator. (narrower_mask): Handle VNx4HI, VNx2HI and VNx2SI. * config/aarch64/aarch64-sve.md (<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): New pattern. (*<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): Likewise. (@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Update comment. Avoid new narrower_mask ambiguity. (@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise. (*cond_uxt<mode>_2): Update comment. (*cond_uxt<mode>_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cost_model_1.c: Expect the loop to be vectorized with bytes stored in 32-bit containers. * gcc.target/aarch64/sve/extend_1.c: New test. * gcc.target/aarch64/sve/extend_2.c: New test. * gcc.target/aarch64/sve/extend_3.c: New test. * gcc.target/aarch64/sve/extend_4.c: New test. * gcc.target/aarch64/sve/load_const_offset_3.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise. * gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise. * gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise. From-SVN: r278342
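A sketch (not from the patch's testsuite) of the shape of code this covers, in the spirit of the cost_model_1.c expectation that bytes end up in 32-bit containers:

```
// Sketch only: bytes loaded as a partial vector, zero-extended to 32 bits.
void widen_bytes (unsigned int *dst, const unsigned char *src, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = src[i];
}
```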
2019-11-16  [AArch64] Add autovec support for partial SVE vectors  (Richard Sandiford; 12 files, -167/+674)
This patch adds the bare minimum needed to support autovectorisation of partial SVE vectors, namely moves and integer addition. Later patches add more interesting cases. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-modes.def: Define partial SVE vector float modes. * config/aarch64/aarch64-protos.h (aarch64_sve_pred_mode): New function. * config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle the new vector float modes. (aarch64_sve_container_bits): New function. (aarch64_sve_pred_mode): Likewise. (aarch64_get_mask_mode): Use it. (aarch64_sve_element_int_mode): Handle structure modes and partial modes. (aarch64_sve_container_int_mode): New function. (aarch64_vectorize_related_mode): Return SVE modes when given SVE modes. Handle partial modes, taking the preferred number of units from the size of the given mode. (aarch64_hard_regno_mode_ok): Allow partial modes to be stored in registers. (aarch64_expand_sve_ld1rq): Use the mode form of aarch64_sve_pred_mode. (aarch64_expand_sve_const_vector): Handle partial SVE vectors. (aarch64_split_sve_subreg_move): Use the mode form of aarch64_sve_pred_mode. (aarch64_secondary_reload): Handle partial modes in the same way as full big-endian vectors. (aarch64_vector_mode_supported_p): Allow partial SVE vectors. (aarch64_autovectorize_vector_modes): Try unpacked SVE vectors, merging with the Advanced SIMD modes. If two modes have the same size, try the Advanced SIMD mode first. (aarch64_simd_valid_immediate): Use the container rather than the element mode for INDEX constants. (aarch64_simd_vector_alignment): Make the alignment of partial SVE vector modes the same as their minimum size. (aarch64_evpc_sel): Use the mode form of aarch64_sve_pred_mode. * config/aarch64/aarch64-sve.md (mov<SVE_FULL:mode>): Extend to... (mov<SVE_ALL:mode>): ...this. (movmisalign<SVE_FULL:mode>): Extend to... (movmisalign<SVE_ALL:mode>): ...this. (*aarch64_sve_mov<mode>_le): Rename to... (*aarch64_sve_mov<mode>_ldr_str): ...this. (*aarch64_sve_mov<SVE_FULL:mode>_be): Rename and extend to... (*aarch64_sve_mov<SVE_ALL:mode>_no_ldr_str): ...this. Handle partial modes regardless of endianness. (aarch64_sve_reload_be): Rename to... (aarch64_sve_reload_mem): ...this and enable for little-endian. Use aarch64_sve_pred_mode to get the appropriate predicate mode. (@aarch64_pred_mov<SVE_FULL:mode>): Extend to... (@aarch64_pred_mov<SVE_ALL:mode>): ...this. (*aarch64_sve_mov<SVE_FULL:mode>_subreg_be): Extend to... (*aarch64_sve_mov<SVE_ALL:mode>_subreg_be): ...this. (@aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to... (@aarch64_sve_reinterpret<SVE_ALL:mode>): ...this. (*aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to... (*aarch64_sve_reinterpret<SVE_ALL:mode>): ...this. (maskload<SVE_FULL:mode><vpred>): Extend to... (maskload<SVE_ALL:mode><vpred>): ...this. (maskstore<SVE_FULL:mode><vpred>): Extend to... (maskstore<SVE_ALL:mode><vpred>): ...this. (vec_duplicate<SVE_FULL:mode>): Extend to... (vec_duplicate<SVE_ALL:mode>): ...this. (*vec_duplicate<SVE_FULL:mode>_reg): Extend to... (*vec_duplicate<SVE_ALL:mode>_reg): ...this. (sve_ld1r<SVE_FULL:mode>): Extend to... (sve_ld1r<SVE_ALL:mode>): ...this. (vec_series<SVE_FULL_I:mode>): Extend to... (vec_series<SVE_I:mode>): ...this. (*vec_series<SVE_FULL_I:mode>_plus): Extend to... (*vec_series<SVE_I:mode>_plus): ...this. (@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Avoid new VPRED ambiguity. (@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise. 
(add<SVE_FULL_I:mode>3): Extend to... (add<SVE_I:mode>3): ...this. * config/aarch64/iterators.md (SVE_ALL, SVE_I): New mode iterators. (Vetype, Vesize, VEL, Vel, vwcore): Handle partial SVE vector modes. (VPRED, vpred): Likewise. (Vctype): New iterator. (vw): Remove SVE modes. gcc/testsuite/ * gcc.target/aarch64/sve/mixed_size_1.c: New test. * gcc.target/aarch64/sve/mixed_size_2.c: Likewise. * gcc.target/aarch64/sve/mixed_size_3.c: Likewise. * gcc.target/aarch64/sve/mixed_size_4.c: Likewise. * gcc.target/aarch64/sve/mixed_size_5.c: Likewise. From-SVN: r278341
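An illustrative mixed-size loop (a sketch, not one of the new mixed_size_*.c tests): with unpacked vectors, the 32-bit statements can use partial vectors so that both statements share one vectorisation factor.

```
// Sketch only: 64-bit and 32-bit elements in the same loop iteration.
void mixed (long *x, int *y, int n)
{
  for (int i = 0; i < n; ++i)
    {
      x[i] += 1;   // full vector of 64-bit elements
      y[i] += 1;   // partial (unpacked) vector of 32-bit elements
    }
}
```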
2019-11-16  [AArch64] Tweak gcc.target/aarch64/sve/clastb_8.c  (Richard Sandiford; 2 files, -2/+10)
clastb_8.c was using scan-tree-dump-times to check for fully-masked loops, which made it sensitive to the number of times we try to vectorize. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/testsuite/ * gcc.target/aarch64/sve/clastb_8.c: Use assembly tests to check for fully-masked loops. From-SVN: r278340
2019-11-16  [AArch64] Replace SVE_PARTIAL with SVE_PARTIAL_I  (Richard Sandiford; 3 files, -12/+18)
Another renaming, this time to make way for partial/unpacked float modes. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_PARTIAL): Rename to... (SVE_PARTIAL_I): ...this. * config/aarch64/aarch64-sve.md: Apply the above renaming throughout. From-SVN: r278339
2019-11-16[AArch64] Add "FULL" to SVE mode iterator namesRichard Sandiford4-1195/+1272
An upcoming patch will make more use of partial/unpacked SVE vectors. We then need a distinction between mode iterators that include partial modes and those that only include "full" modes. This patch prepares for that by adding "FULL" to the names of iterators that only select full modes. There should be no change in behaviour. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_ALL): Rename to... (SVE_FULL): ...this. (SVE_I): Rename to... (SVE_FULL_I): ...this. (SVE_F): Rename to... (SVE_FULL_F): ...this. (SVE_BHSI): Rename to... (SVE_FULL_BHSI): ...this. (SVE_HSD): Rename to... (SVE_FULL_HSD): ...this. (SVE_HSDI): Rename to... (SVE_FULL_HSDI): ...this. (SVE_HSF): Rename to... (SVE_FULL_HSF): ...this. (SVE_SD): Rename to... (SVE_FULL_SD): ...this. (SVE_SDI): Rename to... (SVE_FULL_SDI): ...this. (SVE_SDF): Rename to... (SVE_FULL_SDF): ...this. (SVE_S): Rename to... (SVE_FULL_S): ...this. (SVE_D): Rename to... (SVE_FULL_D): ...this. * config/aarch64/aarch64-sve.md: Apply the above renaming throughout. * config/aarch64/aarch64-sve2.md: Likewise. From-SVN: r278338
2019-11-16  [AArch64] Enable VECT_COMPARE_COSTS by default for SVE  (Richard Sandiford; 14 files, -36/+154)
This patch enables VECT_COMPARE_COSTS by default for SVE, both so that we can compare SVE against Advanced SIMD and so that (with future patches) we can compare multiple SVE vectorisation approaches against each other. It also adds a target-specific --param to control this. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64.opt (--param=aarch64-sve-compare-costs): New option. * doc/invoke.texi: Document it. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes): By default, return VECT_COMPARE_COSTS for SVE. gcc/testsuite/ * gcc.target/aarch64/sve/reduc_3.c: Split multi-vector cases out into... * gcc.target/aarch64/sve/reduc_3_costly.c: ...this new test, passing -fno-vect-cost-model for them. * gcc.target/aarch64/sve/slp_6.c: Add -fno-vect-cost-model. * gcc.target/aarch64/sve/slp_7.c, * gcc.target/aarch64/sve/slp_7_run.c: Split multi-vector cases out into... * gcc.target/aarch64/sve/slp_7_costly.c, * gcc.target/aarch64/sve/slp_7_costly_run.c: ...these new tests, passing -fno-vect-cost-model for them. * gcc.target/aarch64/sve/while_7.c: Add -fno-vect-cost-model. * gcc.target/aarch64/sve/while_9.c: Likewise. From-SVN: r278337
2019-11-16  Optionally pick the cheapest loop_vec_info  (Richard Sandiford; 13 files, -19/+219)
This patch adds a mode in which the vectoriser tries each available base vector mode and picks the one with the lowest cost. The new behaviour is selected by autovectorize_vector_modes. The patch keeps the current behaviour of preferring a VF of loop->simdlen over any larger or smaller VF, regardless of costs or target preferences. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (VECT_COMPARE_COSTS): New constant. * target.def (autovectorize_vector_modes): Return a bitmask of flags. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_modes): Update accordingly. * targhooks.c (default_autovectorize_vector_modes): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes): Likewise. * config/arc/arc.c (arc_autovectorize_vector_modes): Likewise. * config/arm/arm.c (arm_autovectorize_vector_modes): Likewise. * config/i386/i386.c (ix86_autovectorize_vector_modes): Likewise. * config/mips/mips.c (mips_autovectorize_vector_modes): Likewise. * tree-vectorizer.h (_loop_vec_info::vec_outside_cost) (_loop_vec_info::vec_inside_cost): New member variables. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize them. (vect_better_loop_vinfo_p, vect_joust_loop_vinfos): New functions. (vect_analyze_loop): When autovectorize_vector_modes returns VECT_COMPARE_COSTS, try vectorizing the loop with each available vector mode and picking the one with the lowest cost. (vect_estimate_min_profitable_iters): Record the computed costs in the loop_vec_info. From-SVN: r278336
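A greatly simplified, self-contained sketch of the selection strategy described above; all names below are stand-ins rather than GCC's, and the real vect_better_loop_vinfo_p comparison weighs inside and outside costs by estimated iteration counts rather than simply summing them.

```
#include <optional>
#include <vector>

struct loop_vinfo { double inside_cost, outside_cost; };

// Stand-in for analyzing the loop with one candidate vector mode.
std::optional<loop_vinfo> analyze_with_mode (int mode);

std::optional<loop_vinfo> pick_cheapest (const std::vector<int> &modes)
{
  std::optional<loop_vinfo> best;
  for (int mode : modes)                      // each available base mode
    if (auto cand = analyze_with_mode (mode))
      if (!best || cand->inside_cost + cand->outside_cost
                     < best->inside_cost + best->outside_cost)
        best = cand;                          // keep the cheapest candidate
  return best;
}
```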
2019-11-16  Extend can_duplicate_and_interleave_p to mixed-size vectors  (Richard Sandiford; 4 files, -15/+36)
This patch makes can_duplicate_and_interleave_p cope with mixtures of vector sizes, by using queries based on get_vectype_for_scalar_type instead of directly querying GET_MODE_SIZE (vinfo->vector_mode). int_mode_for_size is now the first check we do for a candidate mode, so it seemed better to restrict it to MAX_FIXED_MODE_SIZE. This avoids unnecessary work and avoids trying to create scalar types that the target might not support. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (can_duplicate_and_interleave_p): Take an element type rather than an element mode. * tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise. Use get_vectype_for_scalar_type to query the natural types for a given element type rather than basing everything on GET_MODE_SIZE (vinfo->vector_mode). Limit int_mode_for_size query to MAX_FIXED_MODE_SIZE. (duplicate_and_interleave): Update call accordingly. * tree-vect-loop.c (vectorizable_reduction): Likewise. From-SVN: r278335
2019-11-16  Apply maximum nunits for BB SLP  (Richard Sandiford; 11 files, -64/+359)
The BB vectoriser picked vector types in the same way as the loop vectoriser: it picked a vector mode/size for the region and then based all the vector types off that choice. This meant we could end up trying to use vector types that had too many elements for the group size. The main part of this patch is therefore about passing the SLP group size down to routines like get_vectype_for_scalar_type and ensuring that each vector type in the SLP tree is chosen wrt the group size. That part in itself is pretty easy and mechanical.

The main warts are:

(1) We normally pick a STMT_VINFO_VECTYPE for data references at an early stage (vect_analyze_data_refs). However, nothing in the BB vectoriser relied on this, or on the min_vf calculated from it. I couldn't see anything other than vect_recog_bool_pattern that tried to access the vector type before the SLP tree is built.

(2) It's possible for the same statement to be used in groups of different sizes. Taking the group size into account meant that we could try to pick different vector types for the same statement. This problem should go away with the move to doing everything on SLP trees, where presumably we would attach the vector type to the SLP node rather than the stmt_vec_info. Until then, the patch just uses a first-come, first-served approach.

(3) A similar problem exists for grouped data references, where different statements in the same dataref group could be used in SLP nodes that have different group sizes. The patch copes with that by making sure that all vector types in a dataref group remain consistent.

The patch means that:

```
void
f (int *x, short *y)
{
  x[0] += y[0];
  x[1] += y[1];
  x[2] += y[2];
  x[3] += y[3];
}
```

now produces:

```
        ldr     q0, [x0]
        ldr     d1, [x1]
        saddw   v0.4s, v0.4s, v1.4h
        str     q0, [x0]
        ret
```

instead of:

```
        ldrsh   w2, [x1]
        ldrsh   w3, [x1, 2]
        fmov    s0, w2
        ldrsh   w2, [x1, 4]
        ldrsh   w1, [x1, 6]
        ins     v0.s[1], w3
        ldr     q1, [x0]
        ins     v0.s[2], w2
        ins     v0.s[3], w1
        add     v0.4s, v0.4s, v1.4s
        str     q0, [x0]
        ret
```

Unfortunately it also means we start to vectorise gcc.target/i386/pr84101.c for -m32. That seems like a target cost issue though; see PR92265 for details.

2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_get_vector_types_for_stmt): Take an optional maximum nunits. (get_vectype_for_scalar_type): Likewise. Also declare a form that takes an slp_tree. (get_mask_type_for_scalar_type): Take an optional slp_tree. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-data-refs.c (vect_analyze_data_refs): Don't store the vector type in STMT_VINFO_VECTYPE for BB vectorization. * tree-vect-patterns.c (vect_recog_bool_pattern): Use vect_get_vector_types_for_stmt instead of STMT_VINFO_VECTYPE to get an assumed vector type for data references. * tree-vect-slp.c (vect_update_shared_vectype): New function. (vect_update_all_shared_vectypes): Likewise. (vect_build_slp_tree_1): Pass the group size to vect_get_vector_types_for_stmt. Use vect_update_shared_vectype for BB vectorization. (vect_build_slp_tree_2): Call vect_update_all_shared_vectypes before building the vector from scalars. (vect_analyze_slp_instance): Pass the group size to get_vectype_for_scalar_type. (vect_slp_analyze_node_operations_1): Don't recompute the vector types for BB vectorization here; just handle the case in which we deferred the choice for booleans. (vect_get_constant_vectors): Pass the slp_tree to get_vectype_for_scalar_type. * tree-vect-stmts.c (vect_prologue_cost_for_slp_op): Likewise. (vectorizable_call): Likewise. (vectorizable_simd_clone_call): Likewise. (vectorizable_conversion): Likewise. (vectorizable_shift): Likewise. (vectorizable_operation): Likewise. (vectorizable_comparison): Likewise. (vect_is_simple_cond): Take the slp_tree as argument and pass it to get_vectype_for_scalar_type. (vectorizable_condition): Update call accordingly. (get_vectype_for_scalar_type): Take a group_size argument. For BB vectorization, limit the vector to that number of elements. Also define an overload that takes an slp_tree. (get_mask_type_for_scalar_type): Add an slp_tree argument and pass it to get_vectype_for_scalar_type. (vect_get_vector_types_for_stmt): Add a group_size argument and pass it to get_vectype_for_scalar_type. Don't use the cached vector type for BB vectorization if a group size is given. Handle data references in that case. (vect_get_mask_type_for_stmt): Take an slp_tree argument and pass it to get_mask_type_for_scalar_type. gcc/testsuite/ * gcc.dg/vect/bb-slp-4.c: Expect the block to be vectorized with -fno-vect-cost-model. * gcc.dg/vect/bb-slp-bool-1.c: New test. * gcc.target/aarch64/vect_mixed_sizes_14.c: Likewise. * gcc.target/i386/pr84101.c: XFAIL for -m32. From-SVN: r278334
2019-11-16  Fix nonspec_time when there is no cached value.  (Jan Hubicka; 3 files, -3/+13)
* ipa-inline.h (do_estimate_edge_time): Add nonspec_time parameter. (estimate_edge_time): Use it. * ipa-inline-analysis.c (do_estimate_edge_time): Add ret_nonspec_time parameter. From-SVN: r278333
2019-11-16  Implement the <tuple> part of C++20 p1032 Misc constexpr bits.  (Edward Smith-Rowland; 6 files, -1/+164)
2019-11-15 Edward Smith-Rowland <3dw4rd@verizon.net> Implement the <tuple> part of C++20 p1032 Misc constexpr bits. * include/std/tuple (_Head_base, _Tuple_impl(allocator_arg_t,...) (_M_assign, tuple(allocator_arg_t,...), _Inherited, operator=, _M_swap) (swap, pair(piecewise_construct_t,): Constexpr. * (__uses_alloc0::_Sink::operator=, __uses_alloc_t): Constexpr. * testsuite/20_util/tuple/cons/constexpr_allocator_arg_t.cc: New test. * testsuite/20_util/tuple/constexpr_swap.cc: New test. * testsuite/20_util/uses_allocator/69293_neg.cc: Extra error for C++20. * testsuite/20_util/uses_allocator/cons_neg.cc: Extra error for C++20. From-SVN: r278331
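A usage sketch (mine, not one of the new tests) of what P1032 permits here: tuple construction, swap and element access inside a constant expression.

```
#include <tuple>

constexpr int swapped_first ()
{
  std::tuple<int, int> a{1, 2}, b{3, 4};
  a.swap (b);                     // constexpr in C++20 per P1032
  return std::get<0> (a);
}
static_assert (swapped_first () == 3);
```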
2019-11-16  Daily bump.  (GCC Administrator; 1 file, -1/+1)
From-SVN: r278328
2019-11-15  libstdc++: Fix <stop_token> and improve tests  (Jonathan Wakely; 6 files, -82/+353)
* include/std/stop_token: Reduce header dependencies by including internal headers. (stop_token::swap(stop_token&), swap(stop_token&, stop_token&)): Define. (operator!=(const stop_token&, const stop_token&)): Fix return value. (stop_token::_Stop_cb::_Stop_cb(Cb&&)): Use std::forward instead of std::move. (stop_token::_Stop_state_t) [_GLIBCXX_HAS_GTHREADS]: Use lock_guard instead of unique_lock. [!_GLIBCXX_HAS_GTHREADS]: Do not use mutex. (stop_token::stop_token(_Stop_state)): Change parameter to lvalue reference. (stop_source): Remove unnecessary using-declarations for names only used once. (swap(stop_source&, stop_source&)): Define. (stop_callback(const stop_token&, _Cb&&)) (stop_callback(stop_token&&, _Cb&&)): Replace lambdas with a named function. Use std::forward instead of std::move. Run callbacks if a stop request has already been made. (stop_source::_M_execute()): Remove. (stop_source::_S_execute(_Stop_cb*)): Define. * include/std/version (__cpp_lib_jthread): Define conditionally. * testsuite/30_threads/stop_token/stop_callback.cc: New test. * testsuite/30_threads/stop_token/stop_source.cc: New test. * testsuite/30_threads/stop_token/stop_token.cc: Enable test for immediate execution of callback. From-SVN: r278325
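A usage sketch (not from the new tests) touching two of the fixes above: swap support, and callbacks registered after a stop has already been requested, which must run immediately.

```
#include <stop_token>

void demo ()
{
  std::stop_source src;
  src.request_stop ();
  // Runs immediately because a stop was already requested.
  std::stop_callback cb (src.get_token (), [] { /* react to stop */ });

  std::stop_token a = src.get_token (), b;
  a.swap (b);                     // member swap defined above
}
```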
2019-11-15  Diagnose duplicate C2x standard attributes.  (Joseph Myers; 6 files, -2/+111)
For each of the attributes currently included in C2x, it has a constraint that the attribute shall appear at most once in each attribute list (attribute-list being what appears between a single [[ and ]]). This patch implements that check. As the corresponding check in the C++ front end (cp_parser_check_std_attribute) makes violations into errors, I made them into errors, with the same wording, for C as well. There is an existing check in the case of the fallthrough attribute, with a warning rather than an error, in attribute_fallthrough_p. That is more general, as it also covers __attribute__ ((fallthrough)) and the case of [[fallthrough]] [[fallthrough]] (multiple attribute-lists in a single attribute-specifier-sequence), which is not a constraint violation. To avoid some [[fallthrough, fallthrough]] being diagnosed twice, the check I added avoids adding duplicate attributes to the list. Bootstrapped with no regressions on x86_64-pc-linux-gnu. gcc/c: * c-parser.c (c_parser_std_attribute_specifier): Diagnose duplicate standard attributes. gcc/testsuite: * gcc.dg/c2x-attr-deprecated-4.c, gcc.dg/c2x-attr-fallthrough-4.c, gcc.dg/c2x-attr-maybe_unused-4.c: New tests. From-SVN: r278324
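An illustrative violation (not from the new tests): the duplicate appears within one attribute-list, so it is now rejected; repeating the attribute across two separate lists is not a constraint violation.

```
[[deprecated, deprecated]] void f (void);    // error: duplicate in one list
[[deprecated]] [[deprecated]] void g (void); // not a constraint violation
```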
2019-11-15  typeck.c (cp_truthvalue_conversion): Add tsubst_flags_t parameter and use it in calls...  (Paolo Carlini; 11 files, -38/+56)
/cp 2019-11-15 Paolo Carlini <paolo.carlini@oracle.com> * typeck.c (cp_truthvalue_conversion): Add tsubst_flags_t parameter and use it in calls; also pass the location_t of the expression to cp_build_binary_op and c_common_truthvalue_conversion. * rtti.c (build_dynamic_cast_1): Adjust call. * cvt.c (ocp_convert): Likewise. * cp-gimplify.c (cp_fold): Likewise. * cp-tree.h (cp_truthvalue_conversion): Update declaration. /testsuite 2019-11-15 Paolo Carlini <paolo.carlini@oracle.com> * g++.dg/warn/Walways-true-1.C: Check locations too. * g++.dg/warn/Walways-true-2.C: Likewise. * g++.dg/warn/Walways-true-3.C: Likewise. * g++.dg/warn/Waddress-1.C: Check additional location. From-SVN: r278320
2019-11-15  Forgot to change the date range.  (Edward Smith-Rowland; 1 file, -1/+1)
From-SVN: r278318
2019-11-15  Implement the default_searcher part of C++20 p1032 Misc constexpr bits.  (Edward Smith-Rowland; 3 files, -1/+62)
2019-11-15 Edward Smith-Rowland <3dw4rd@verizon.net> Implement the default_searcher part of C++20 p1032 Misc constexpr bits. * include/std/functional (default_searcher, default_searcher::operator()): Constexpr. * testsuite/20_util/function_objects/constexpr_searcher.cc: New. From-SVN: r278317
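A usage sketch (mine, not the new testsuite file): a default_searcher constructed and invoked through std::search in a constant expression (std::search itself is constexpr in C++20).

```
#include <algorithm>
#include <functional>
#include <string_view>

constexpr bool contains (std::string_view hay, std::string_view needle)
{
  std::default_searcher s (needle.begin (), needle.end ());
  return std::search (hay.begin (), hay.end (), s) != hay.end ();
}
static_assert (contains ("misc constexpr bits", "constexpr"));
```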
2019-11-15  testmain.exp: link against GOLIBS  (Ian Lance Taylor; 2 files, -2/+6)
Patch by Maciej W. Rozycki. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/207458 From-SVN: r278316
2019-11-15  libstdc++: Implement LWG 3149 for std::default_constructible  (Jonathan Wakely; 6 files, -25/+85)
The change approved in Belfast did not actually rename the concept from std::default_constructible to std::default_initializable, even though that was intended. That is expected to be done soon as a separate issue, so I'm implementing that now too. * include/bits/iterator_concepts.h (weakly_incrementable): Adjust. * include/std/concepts (default_constructible): Rename to default_initializable and require default-list-initialization and default-initialization to be valid (LWG 3149). (semiregular): Adjust to new name. * testsuite/std/concepts/concepts.lang/concept.defaultconstructible/ 1.cc: Rename directory to concept.defaultinitializable and adjust to new name. * testsuite/std/concepts/concepts.lang/concept.defaultinitializable/ lwg3149.cc: New test. * testsuite/util/testsuite_iterators.h (test_range): Adjust. From-SVN: r278314
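An illustration (not from the new tests) of what LWG 3149 changes: requiring `T t;` to be valid means const-qualified scalars no longer satisfy the renamed concept.

```
#include <concepts>

static_assert (std::default_initializable<int>);
static_assert (!std::default_initializable<const int>); // 'const int t;' is invalid
static_assert (!std::default_initializable<int&>);      // references never qualify
```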
2019-11-15  libstdc++: Implement LWG 3070 in path::lexically_relative  (Jonathan Wakely; 3 files, -1/+45)
* src/c++17/fs_path.cc [_GLIBCXX_FILESYSTEM_IS_WINDOWS] (is_disk_designator): New helper function. (path::_Parser::root_path()): Use is_disk_designator. (path::lexically_relative(const path&)): Implement resolution of LWG 3070. * testsuite/27_io/filesystem/path/generation/relative.cc: Check with path components that look like a root-name. From-SVN: r278313
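A sketch of the LWG 3070 behaviour (my reading, not the new test): on Windows, a component that parses as a root-name would otherwise leak into the relative result, so the resolution makes lexically_relative return an empty path instead.

```
#include <filesystem>

// Sketch only; the empty-path outcome applies to Windows-style paths,
// where "C:" is a disk designator.
void demo ()
{
  std::filesystem::path p = "/a/b";
  std::filesystem::path base = "/a/C:";
  std::filesystem::path rel = p.lexically_relative (base);
  // rel is empty on such implementations, not a bogus "../b".
}
```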
2019-11-15  m68k: add musl support  (Szabolcs Nagy; 4 files, -1/+13)
Add the dynamic linker name and fix a type name to use the public name instead of the glibc internal name. gcc/ChangeLog: 2019-11-15 Szabolcs Nagy <szabolcs.nagy@arm.com> * config/m68k/linux.h (MUSL_DYNAMIC_LINKER): Define. libgcc/ChangeLog: 2019-11-15 Szabolcs Nagy <szabolcs.nagy@arm.com> * config/m68k/linux-unwind.h (struct uw_ucontext): Use sigset_t instead of __sigset_t. From-SVN: r278312
2019-11-15  Support C2x [[maybe_unused]] attribute.  (Joseph Myers; 6 files, -0/+79)
This patch adds support for the C2x [[maybe_unused]] attribute, using the same handler as for GNU __attribute__ ((unused)). As with other such attribute support, I think turning certain warnings into pedwarns for usage in cases where that is a constraint violation can be addressed later as a bug fix, as can the C2x constraint for various standard attributes that they do not appear more than once inside a single [[]]. However, the warnings that appear in c2x-attr-maybe_unused-1.c (that the attribute is ignored on member declarations) need to remain as warnings not pedwarns, since C2x does permit the attribute there. (Or they could be silenced, on the basis that GCC doesn't have warnings for unused struct and union members so it's completely harmless that it's ignoring an attribute that might do something useful with another compiler that does have such warnings.) Bootstrapped with no regressions on x86_64-pc-linux-gnu. gcc/c: * c-decl.c (std_attribute_table): Add maybe_unused. gcc/testsuite: * gcc.dg/c2x-attr-maybe_unused-1.c, gcc.dg/c2x-attr-maybe_unused-2.c, gcc.dg/c2x-attr-maybe_unused-3.c: New tests. From-SVN: r278310
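An illustrative use (not from the new tests), accepted via the same handler as GNU __attribute__ ((unused)):

```
// Suppresses unused warnings on the declarations it annotates.
[[maybe_unused]] static int debug_level = 0;

int f ([[maybe_unused]] int reserved)
{
  return 42;
}
```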
2019-11-15  MAINTAINERS: Change my email address as maintainer.  (Kelvin Nilsen; 2 files, -1/+5)
ChangeLog: 2019-11-15 Kelvin Nilsen <kelvin@gcc.gnu.org> * MAINTAINERS: Change my email address as maintainer. From-SVN: r278309
2019-11-15  microblaze: fix PR65649  (Nick Clifton; 2 files, -2/+8)
microblaze-linux-musl build fails without this. (This is a rebase of an earlier patch posted on bugzilla.) gcc/ChangeLog: 2019-11-15 Nick Clifton <nickc@redhat.com> Szabolcs Nagy <szabolcs.nagy@arm.com> PR target/65649 * config/microblaze/microblaze.c (print_operand): Print value as long. Co-Authored-By: Szabolcs Nagy <szabolcs.nagy@arm.com> From-SVN: r278308
2019-11-15  ipa-inline.c (edge_badness, [...]): Revert accidental commit.  (Jan Hubicka; 2 files, -14/+6)
* ipa-inline.c (edge_badness, inline_small_functions): Revert accidental commit. From-SVN: r278307
2019-11-15  [amdgcn] Unfix registers for frame pointer  (Kwok Cheung Yeung; 2 files, -2/+7)
Allow the registers used for the frame pointer to be used for other purposes if the frame pointer is not being used. 2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn.h (FIXED_REGISTERS): Unfix frame pointer. (CALL_USED_REGISTERS): Make frame pointer callee-saved. From-SVN: r278306
2019-11-15  [amdgcn] Update lower bounds for the number of registers in non-leaf kernels  (Kwok Cheung Yeung; 2 files, -6/+23)
Reduce the lower limits on the number of registers requested by non-leaf kernels to help improve CU occupancy. 2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn.c (MAX_NORMAL_SGPR_COUNT, MAX_NORMAL_VGPR_COUNT): New. (gcn_conditional_register_usage): Use constants in place of hard-coded values. (gcn_hsa_declare_function_name): Set lower bound for number of SGPRs/VGPRs in non-leaf kernels to MAX_NORMAL_SGPR_COUNT and MAX_NORMAL_VGPR_COUNT. From-SVN: r278305
2019-11-15  ipa: Remove stray declaration  (Martin Jambor; 2 files, -3/+5)
2019-11-15 Martin Jambor <mjambor@suse.cz> * ipa-utils.h (ipa_remove_useless_jump_functions): Remove stray declaration. From-SVN: r278303
2019-11-15  [amdgcn] Restrict registers available to non-kernel functions  (Kwok Cheung Yeung; 3 files, -30/+49)
Restrict the number of SGPRs and VGPRs available to non-kernel functions to improve compute-unit occupancy with multiple threads. 2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn.c (default_requested_args): New. (gcn_parse_amdgpu_hsa_kernel_attribute): Initialize requested args set with default_requested_args. (gcn_conditional_register_usage): Limit register usage of non-kernel functions. Reassign fixed registers if a non-standard set of args is requested. * config/gcn/gcn.h (FIXED_REGISTERS): Fix registers according to ABI. From-SVN: r278301
2019-11-15  re PR ipa/92528 (ICE in ipa_get_parm_lattices since r278219)  (Feng Xue; 2 files, -6/+13)
2019-11-15 Feng Xue <fxue@os.amperecomputing.com> PR ipa/92528 * ipa-prop.c (update_jump_functions_after_inlining): Invalidate aggregate jump function when inlined-to caller has no edge summary. From-SVN: r278300
2019-11-15  [amdgcn] Reinitialize registers for every function  (Kwok Cheung Yeung; 2 files, -1/+6)
gcn_conditional_register_usage needs to be called for every function to set the fixed registers depending on the kernel args currently requested. 2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn.c (gcn_init_cumulative_args): Call reinit_regs. From-SVN: r278299
2019-11-15  Implement P1816R0, class template argument deduction for aggregates.  (Jason Merrill; 7 files, -35/+254)
Rather than reimplement brace elision here, we call reshape_init and then discard the result. We needed to set CLASSTYPE_NON_AGGREGATE a bit more in this patch, since outside a template it's set in check_bases_and_members. * pt.c (maybe_aggr_guide, collect_ctor_idx_types): New. (is_spec_or_derived): Split out from do_class_deduction. (build_deduction_guide): Handle aggregate guide. * class.c (finish_struct): Set CLASSTYPE_NON_AGGREGATE in a template. * cp-tree.h (CP_AGGREGATE_TYPE_P): An incomplete class is not an aggregate. From-SVN: r278298
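An illustrative use of the feature (not from the patch's testsuite), including the brace-elision case that motivates calling reshape_init:

```
template <typename T> struct point { T x, y; };
point p{1.0, 2.0};   // deduces point<double>; no user-written guide needed

template <typename T> struct wrapper { point<T> inner; };
wrapper w{1, 2};     // braces elided; still deduces wrapper<int>
```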
2019-11-15  [amdgcn] Use first lane of v1 for zero offset  (Kwok Cheung Yeung; 2 files, -14/+10)
Use v1 instead of v0 when a zero-valued VGPR is needed. This frees up v0 for other purposes. 2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn.c (gcn_expand_prologue): Remove initialization and prologue use of v0. (print_operand_address): Use v1 for zero vector offset. From-SVN: r278297
2019-11-15  libstdc++: Fix definition of std::nostopstate object  (Jonathan Wakely; 8 files, -7/+156)
Also add <stop_token> header to PCH and Doxygen config. * doc/doxygen/user.cfg.in: Add <stop_token>. * include/precompiled/stdc++.h: Likewise. * include/std/stop_token: Fix definition of std::nostopstate. * testsuite/30_threads/headers/stop_token/synopsis.cc: New test. * testsuite/30_threads/headers/thread/types_std_c++20.cc: New test. * testsuite/30_threads/stop_token/stop_source.cc: New test. * testsuite/30_threads/stop_token/stop_token.cc: Remove unnecessary dg-require directives. Remove I/O and inclusion of <iostream>. From-SVN: r278296
2019-11-15  Fix vector/scalar to vector/vector conversion (PR92515)  (Richard Sandiford; 2 files, -15/+19)
r278235 broke conversions of vector/scalar shifts into vector/vector shifts on targets that only provide the latter. We need to record whether a conversion is required in that case too. Also, the old useless_type_conversion_p condition seemed unnecessarily strong, since the shift amount can have a different signedness from the shifted value and its vector type is never assumed to be identical to vectype. The patch therefore uses tree_nop_conversion_p instead. 2019-11-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92515 * tree-vect-stmts.c (vectorizable_shift): Record incompatible op1 types when converting a vector/scalar shift into a vector/vector one, using tree_nop_conversion_p instead of useless_type_conversion_p. Move the conversion code to the transform block. From-SVN: r278295
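For reference, the kind of loop affected (a sketch, not from the testsuite): the shift amount is a loop-invariant scalar, which targets without a vector/scalar shift instruction must broadcast into a vector.

```
// Sketch only: vector/scalar shift candidate.
void shift_all (int *x, int n, int amount)
{
  for (int i = 0; i < n; ++i)
    x[i] <<= amount;
}
```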
2019-11-15  [mid-end][__RTL] Account for column numbers in __RTL functions  (Matthew Malcomson; 4 files, -3/+51)
The documentation for __RTL tests (see "(gccint) RTL Tests" info node) has the following snippet.

```
The parser expects the RTL body to be in the format emitted by this
dumping function:

DEBUG_FUNCTION void
print_rtx_function (FILE *outfile, function *fn, bool compact);

when "compact" is true. So you can capture RTL in the correct format
from the debugger using:

(gdb) print_rtx_function (stderr, cfun, true);

and copy and paste the output into the body of the C function.
```

Since r264944 print_rtx_function prints column number information, which the __RTL function parsing does not handle. This patch handles column number information optionally, so pre-existing __RTL functions still work, and the above documentation quote still holds.

Note: If people would prefer to require column information I could make the code slightly neater and update existing tests. I guess this would be OK since the intended use for __RTL functions is in these testcases, so there is no worry about other existing code.

Bootstrapped and regtested on aarch64. Bootstrapped and regtested on x86_64. Ok for trunk? Cheers, Matthew

gcc/ChangeLog: 2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com> * read-rtl-function.c (function_reader::add_fixup_source_location): Take additional parameter of a column. (function_reader::maybe_read_location): Optionally parse column information and pass to add_fixup_source_location. gcc/testsuite/ChangeLog: 2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com> * gcc.dg/rtl/aarch64/rtl-handle-column-numbers.c: New test. From-SVN: r278294
2019-11-15  re PR tree-optimization/92512 (ICE in gimple_op, at gimple.h:2436)  (Richard Biener; 4 files, -4/+45)
2019-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/92512 * tree-vect-loop.c (check_reduction_path): Fix operand index computability check. Add check for second use in COND_EXPRs. * gcc.dg/torture/pr92512.c: New testcase. From-SVN: r278293
2019-11-15  [rs6000] Use VIEW_CONVERT_EXPR to reinterpret vectors (PR 92515)  (Richard Sandiford; 2 files, -3/+12)
The new tree-cfg.c checking in r278245 tripped on folds of ALTIVEC_BUILTIN_VPERM_*, which were using gimple_convert rather than VIEW_CONVERT_EXPR to reinterpret the contents of a vector as a different type. 2019-11-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR target/92515 * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): Use VIEW_CONVERT_EXPR to reinterpret vectors as different types. From-SVN: r278292
2019-11-15  [amdgcn] Fix handling of VCC_CONDITIONAL_REG  (Kwok Cheung Yeung; 2 files, -1/+12)
Classify vcc_lo and vcc_hi into the VCC_CONDITIONAL_REG class, and spill them into SGPRs if necessary. 2019-11-15 Kwok Cheung Yeung <kcy@codesourcery.com> gcc/ * config/gcn/gcn.c (gcn_regno_reg_class): Return VCC_CONDITIONAL_REG register class for VCC_LO and VCC_HI. (gcn_spill_class): Use SGPR_REGS to spill registers in VCC_CONDITIONAL_REG. From-SVN: r278290
2019-11-15  re PR tree-optimization/92324 (ICE in expand_direct_optab_fn, at internal-fn.c:2890)  (Richard Biener; 4 files, -30/+81)
2019-11-15 Richard Biener <rguenther@suse.de> PR tree-optimization/92324 * tree-vect-loop.c (vect_create_epilog_for_reduction): Fix signedness of SLP reduction epilogue operations. Also reduce the vector width for SLP reductions before doing elementwise operations if possible. * gcc.dg/vect/pr92324-4.c: New testcase. From-SVN: r278289
2019-11-15  re PR fortran/69654 (ICE in gfc_trans_structure_assign)  (Paul Thomas; 4 files, -1/+85)
2019-11-15 Paul Thomas <pault@gcc.gnu.org> PR fortran/69654 * trans-expr.c (gfc_trans_structure_assign): Move assignment to 'cm' after treatment of C pointer types and test that the type has been completely built before it. Add an assert that the backend_decl for each component exists. 2019-11-15 Paul Thomas <pault@gcc.gnu.org> PR fortran/69654 * gfortran.dg/derived_init_6.f90: New test. From-SVN: r278287
2019-11-15  libstdc++: Fix changelog whitespace  (Jonathan Wakely; 1 file, -3/+4)
From-SVN: r278286
2019-11-15  [mid-end][__RTL] Set global epilogue_completed in skip_pass  (Matthew Malcomson; 4 files, -0/+39)
Set global epilogue_completed when skipping pro_and_epilogue pass When compiling RTL functions marked to start at a pass after the reload pass, `skip_pass` is used to mark the reload pass as having completed since many patterns use the `reload_completed` variable to determine whether to run or not. Here we do the same for the `epilogue_completed` variable and the pro_and_epilogue pass. Also include a testcase that relies on the availability of a define_split in the aarch64 backend that is conditioned on this `epilogue_completed` variable. regtest done on native aarch64 regtest done on native x64_86 gcc/ChangeLog: 2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com> * passes.c (skip_pass): Set epilogue_completed if skipping the pro_and_epilogue pass. gcc/testsuite/ChangeLog: 2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com> * gcc.dg/rtl/aarch64/test-epilogue-set.c: New test. From-SVN: r278285
2019-11-15  Add tests for print from offload target.  (Andrew Stubbs; 5 files, -0/+71)
2019-11-15 Andrew Stubbs <ams@codesourcery.com> libgomp/ * testsuite/libgomp.c/target-print-1.c: New file. * testsuite/libgomp.fortran/target-print-1.f90: New file. * testsuite/libgomp.oacc-c/print-1.c: New file. * testsuite/libgomp.oacc-fortran/print-1.f90: New file. From-SVN: r278284
2019-11-15  [mid-end][__RTL] Clean state despite invalid __RTL startwith passes  (Matthew Malcomson; 4 files, -1/+55)
Hi there,

When compiling an __RTL function that has an invalid "startwith" pass we currently don't run the dfinish cleanup pass, which means we ICE on the next function. This change ensures that all state is cleaned up for the next function to run correctly.

As an example, before this change the following code would ICE when compiling the function `foo2`, because the "peephole2" pass is not run at optimisation level -O0. When compiled with:

./aarch64-none-linux-gnu-gcc -O0 -S missed-pass-error.c -o test.s

```
int __RTL (startwith ("peephole2")) badfoo ()
{
  (function "badfoo"
    (insn-chain
      (block 2
        (edge-from entry (flags "FALLTHRU"))
        (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
        (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
        (cinsn 10 (use (reg/i:SI x19)))
        (edge-to exit (flags "FALLTHRU"))
      ) ;; block 2
    ) ;; insn-chain
  ) ;; function "foo2"
}

int __RTL (startwith ("final")) foo2 ()
{
  (function "foo2"
    (insn-chain
      (block 2
        (edge-from entry (flags "FALLTHRU"))
        (cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
        (cinsn 101 (set (reg:DI x19) (reg:DI x0)))
        (cinsn 10 (use (reg/i:SI x19)))
        (edge-to exit (flags "FALLTHRU"))
      ) ;; block 2
    ) ;; insn-chain
  ) ;; function "foo2"
}
```

Now it silently ignores the __RTL function and successfully compiles foo2.

regtest done on aarch64. regtest done on x86_64. OK for trunk?

gcc/ChangeLog: 2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com> * passes.c (should_skip_pass_p): Always run "dfinish". gcc/testsuite/ChangeLog: 2019-11-15 Matthew Malcomson <matthew.malcomson@arm.com> * gcc.dg/rtl/aarch64/missed-pass-error.c: New test. From-SVN: r278283