Age | Commit message | Author | Files | Lines |
|
cp/constexpr.c:2568)
PR c++/92015
* constexpr.c (cxx_eval_component_reference, cxx_eval_bit_field_ref):
Use STRIP_ANY_LOCATION_WRAPPER on CONSTRUCTOR elts.
* g++.dg/cpp0x/constexpr-92015.C: New test.
From-SVN: r277267
|
|
has_value_dependent_address wasn't stripping location wrappers so it
gave the wrong answer for "&x" in the static_assert. That led us to
think that the expression isn't instantiation-dependent, and we
skipped static initialization of A<0>::x.
This patch adds stripping so that has_value_dependent_address gives the
same answer as it used to before the location wrappers addition.
* pt.c (has_value_dependent_address): Strip location wrappers.
* g++.dg/cpp0x/constexpr-odr1.C: New test.
* g++.dg/cpp0x/constexpr-odr2.C: New test.
From-SVN: r277266
|
|
* typeck.c (maybe_warn_about_returning_address_of_local): Avoid
recursing on null initializer and return false instead.
* g++.dg/cpp1z/decomp50.C: New test.
From-SVN: r277264
|
|
My DImode arithmetic patches introduced a bug on thumb2 where we could
generate a register controlled shift into an ALU operation. In
fairness the bug was always present, but latent.
As part of cleaning this up (and auditing to ensure I've caught them
all this time) I've gone through all the shift generating patterns in
the MD files and cleaned them up, reducing some duplicate patterns
between the arm and thumb2 descriptions where we can now share the
same pattern. In some cases we were missing the shift attribute; in
most cases I've eliminated an ugly attribute setting using the fact
that we normally need separate alternatives for shift immediate and
shift reg to simplify the logic.
* config/arm/iterators.md (t2_binop0): Fix typo in comment.
* config/arm/arm.md (addsi3_carryin_shift): Simplify selection of the
type attribute.
(subsi3_carryin_shift): Separate into register and constant controlled
alternatives. Use shift_amount_operand for operand 4. Set shift
attribute and simplify type attribute.
(subsi3_carryin_shift_alt): Likewise.
(rsbsi3_carryin_shift): Likewise.
(rsbsi3_carryin_shift_alt): Likewise.
(andsi_not_shiftsi_si): Enable for TARGET_32BIT. Separate constant
and register controlled shifts into distinct alternatives.
(andsi_not_shiftsi_si_scc_no_reuse): Likewise.
(andsi_not_shiftsi_si_scc): Likewise.
(arm_cmpsi_negshiftsi_si): Likewise.
(not_shiftsi): Remove redundant M constraint from alternative 1.
(not_shiftsi_compare0): Likewise.
(arm_cmpsi_insn): Remove redundant alternative 2.
(cmpsi_shift_swp): Likewise.
(sub_shiftsi): Likewise.
(sub_shiftsi_compare0_scratch): Likewise.
* config/arm/thumb2.md (thumb_andsi_not_shiftsi_si): Delete pattern.
(thumb2_cmpsi_neg_shiftsi): Likewise.
From-SVN: r277262
|
|
tree-vect-loop.c:4252)
2019-10-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/92162
* tree-vect-loop.c (vect_create_epilog_for_reduction): Lookup
STMT_VINFO_REDUC_IDX in reduc_info.
* tree-vect-stmts.c (vectorizable_condition): Likewise.
* gcc.dg/pr92162.c: New testcase.
From-SVN: r277261
|
|
2019-10-21 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (_slp_tree::ops): New member.
(SLP_TREE_SCALAR_OPS): New.
(vect_get_slp_defs): Adjust prototype.
* tree-vect-slp.c (vect_free_slp_tree): Release
SLP_TREE_SCALAR_OPS.
(vect_create_new_slp_node): Initialize it. New overload for
initializing by an operands array.
(_slp_oprnd_info::ops): New member.
(vect_create_oprnd_info): Initialize it.
(vect_free_oprnd_info): Release it.
(vect_get_and_check_slp_defs): Populate the operands array.
Do not swap operands in the IL when not necessary.
(vect_build_slp_tree_2): Build SLP nodes for invariant operands.
Record SLP_TREE_SCALAR_OPS for all invariant nodes. Also
swap operands in the operands array. Do not swap operands in
the IL.
(vect_slp_rearrange_stmts): Re-arrange SLP_TREE_SCALAR_OPS as well.
(vect_gather_slp_loads): Fix.
(vect_detect_hybrid_slp_stmts): Likewise.
(vect_slp_analyze_node_operations_1): Search for an internal
def child for computing reduction SLP_TREE_NUMBER_OF_VEC_STMTS.
(vect_slp_analyze_node_operations): Skip ops-only stmts for
the def-type push/pop dance.
(vect_get_constant_vectors): Compute number_of_vectors here.
Use SLP_TREE_SCALAR_OPS and simplify greatly.
(vect_get_slp_vect_defs): Use gimple_get_lhs also for PHIs.
(vect_get_slp_defs): Simplify greatly.
* tree-vect-loop.c (vectorize_fold_left_reduction): Simplify.
(vect_transform_reduction): Likewise.
* tree-vect-stmts.c (vect_get_vec_defs): Simplify.
(vectorizable_call): Likewise.
(vectorizable_operation): Likewise.
(vectorizable_load): Likewise.
(vectorizable_condition): Likewise.
(vectorizable_comparison): Likewise.
From-SVN: r277241
|
|
tree-vect-stmts.c:1687)
2019-10-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/92161
* tree-vect-loop.c (vect_analyze_loop_2): Reset stmts def-type
for reductions.
* gfortran.dg/pr92161.f: New testcase.
From-SVN: r277240
|
|
This patch implements the recently published[1] __rndr and __rndrrs
intrinsics used to access the RNG in Armv8.5-A.
The __rndrrs intrinsics can be used to reseed the generator too.
They are guarded by the __ARM_FEATURE_RNG feature macro.
A quirk of these intrinsics is that they store the random number in
their pointer argument and return a status code indicating whether the
generation succeeded.
The instructions themselves write the CC flags to indicate the success of
the operation, which we can then read with a CSET.
Therefore this implementation makes use of the IGNORE indicator to the
builtin expand machinery to avoid generating the CSET if its result is
unused (the CC reg clobbering effect is still reflected in the pattern).
I've checked that using unspec_volatile prevents undesirable CSEing of
the instructions.
[1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics
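As a rough illustration of the calling convention described above, the sketch below wraps __rndr in a helper. The helper name get_random_u64 and its 1/0 return convention are hypothetical and not part of the patch; the fallback branch exists only so the sketch compiles on targets without __ARM_FEATURE_RNG.

```c
#include <stdint.h>

#ifdef __ARM_FEATURE_RNG
#include <arm_acle.h>
#endif

/* Hypothetical wrapper: stores a random value in *out and returns 1 on
   success, or 0 if no value could be produced.  */
int get_random_u64 (uint64_t *out)
{
#ifdef __ARM_FEATURE_RNG
  /* Per the ACLE, __rndr writes through its pointer argument and
     returns zero on success, non-zero on failure.  */
  return __rndr (out) == 0;
#else
  /* No RNG extension available: report failure and leave *out alone.  */
  (void) out;
  return 0;
#endif
}
```

A caller that ignores the return value is the case where the expander can skip the CSET.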
* config/aarch64/aarch64.md (UNSPEC_RNDR, UNSPEC_RNDRRS): Define.
(aarch64_rndr): New define_insn.
(aarch64_rndrrs): Likewise.
* config/aarch64/aarch64.h (AARCH64_ISA_RNG): Define.
(TARGET_RNG): Likewise.
* config/aarch64/aarch64.c (aarch64_expand_builtin): Use IGNORE
argument.
* config/aarch64/aarch64-protos.h (aarch64_general_expand_builtin):
Add fourth argument in prototype.
* config/aarch64/aarch64-builtins.c (enum aarch64_builtins):
Add AARCH64_BUILTIN_RNG_RNDR, AARCH64_BUILTIN_RNG_RNDRRS.
(aarch64_init_rng_builtins): Define.
(aarch64_general_init_builtins): Call aarch64_init_rng_builtins.
(aarch64_expand_rng_builtin): Define.
(aarch64_general_expand_builtin): Use IGNORE argument, handle
RNG builtins.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_RNG when TARGET_RNG.
* config/aarch64/arm_acle.h (__rndr, __rndrrs): Define.
* gcc.target/aarch64/acle/rng_1.c: New test.
From-SVN: r277239
|
|
This patch makes sure ensure_base_align only changes alignment if the new
alignment is more restrictive. It already did this if we were dealing with
symbols, but it now does it for all types of declarations.
gcc/ChangeLog:
2019-10-21 Andre Vieira <andre.simoesdiasvieira@arm.com>
* tree-vect-stmts.c (ensure_base_align): Only change alignment if new
alignment is more restrictive.
From-SVN: r277238
|
|
gcc.target/aarch64/fmla_2.c)
2019-10-21 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
PR tree-optimization/91532
* gcc.target/aarch64/sve/fmla_2.c: Add dg-scan check for two st1d
insns.
From-SVN: r277237
|
|
PR testsuite/52641
* gcc.dg/torture/pr86034.c: Use 32-bit base type for a bitfield of
width > 16 bits.
* gcc.dg/torture/pr90972.c [avr]: Add option "-w".
* gcc.dg/torture/pr87693.c: Same.
* gcc.dg/torture/pr91178.c: Add dg-require-effective-target size32plus.
* gcc.dg/torture/pr91178-2.c: Same.
* gcc.dg/torture/20181024-1.c
* gcc.dg/torture/pr86554-1.c: Use 32-bit integers.
* gcc.dg/tree-ssa/pr91091-1.c: Same.
From-SVN: r277236
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vec_info::vector_size): New member variable.
(vect_update_max_nunits): Update comment.
(current_vector_size): Delete.
* tree-vect-stmts.c (current_vector_size): Likewise.
(get_vectype_for_scalar_type): Use vec_info::vector_size instead
of current_vector_size.
(get_mask_type_for_scalar_type): Likewise.
* tree-vectorizer.c (try_vectorize_loop_1): Likewise.
* tree-vect-loop.c (vect_update_vf_for_slp): Likewise.
(vect_analyze_loop, vect_halve_mask_nunits): Likewise.
(vect_double_mask_nunits, vect_transform_loop): Likewise.
* tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise.
(vect_make_slp_decision, vect_slp_bb_region): Likewise.
From-SVN: r277235
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_double_mask_nunits): Take a vec_info.
* tree-vect-loop.c (vect_double_mask_nunits): Likewise.
* tree-vect-stmts.c (supportable_narrowing_operation): Update call
accordingly.
From-SVN: r277234
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_halve_mask_nunits): Take a vec_info.
* tree-vect-loop.c (vect_halve_mask_nunits): Likewise.
* tree-vect-loop-manip.c (vect_maybe_permute_loop_masks): Update
call accordingly.
* tree-vect-stmts.c (supportable_widening_operation): Likewise.
From-SVN: r277233
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-loop-manip.c (vect_maybe_permute_loop_masks): Take
a loop_vec_info.
(vect_set_loop_condition_masked): Update call accordingly.
From-SVN: r277232
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (supportable_narrowing_operation): Take a vec_info.
* tree-vect-stmts.c (supportable_narrowing_operation): Likewise.
(simple_integer_narrowing): Update call accordingly.
(vectorizable_conversion): Likewise.
From-SVN: r277231
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-stmts.c (simple_integer_narrowing): Take a vec_info.
(vectorizable_call): Update call accordingly.
From-SVN: r277230
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (can_duplicate_and_interleave_p): Take a vec_info.
* tree-vect-slp.c (can_duplicate_and_interleave_p): Likewise.
(duplicate_and_interleave): Update call accordingly.
* tree-vect-loop.c (vectorizable_reduction): Likewise.
From-SVN: r277229
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (duplicate_and_interleave): Take a vec_info.
* tree-vect-slp.c (duplicate_and_interleave): Likewise.
(vect_get_constant_vectors): Update call accordingly.
* tree-vect-loop.c (get_initial_defs_for_reduction): Likewise.
From-SVN: r277228
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (get_vectype_for_scalar_type): Take a vec_info.
* tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise.
(vect_prologue_cost_for_slp_op): Update call accordingly.
(vect_get_vec_def_for_operand, vect_get_gather_scatter_ops)
(vect_get_strided_load_store_ops, vectorizable_simd_clone_call)
(vect_supportable_shift, vect_is_simple_cond, vectorizable_comparison)
(get_mask_type_for_scalar_type): Likewise.
(vect_get_vector_types_for_stmt): Likewise.
* tree-vect-data-refs.c (vect_analyze_data_refs): Likewise.
* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
(get_initial_def_for_reduction, build_vect_cond_expr): Likewise.
* tree-vect-patterns.c (vect_supportable_direct_optab_p): Likewise.
(vect_split_statement, vect_convert_input): Likewise.
(vect_recog_widen_op_pattern, vect_recog_pow_pattern): Likewise.
(vect_recog_over_widening_pattern, vect_recog_mulhs_pattern): Likewise.
(vect_recog_average_pattern, vect_recog_cast_forwprop_pattern)
(vect_recog_rotate_pattern, vect_recog_vector_vector_shift_pattern)
(vect_synth_mult_by_constant, vect_recog_mult_pattern): Likewise.
(vect_recog_divmod_pattern, vect_recog_mixed_size_cond_pattern)
(check_bool_pattern, adjust_bool_pattern_cast, adjust_bool_pattern)
(search_type_for_mask_1, vect_recog_bool_pattern): Likewise.
(vect_recog_mask_conversion_pattern): Likewise.
(vect_add_conversion_to_pattern): Likewise.
(vect_recog_gather_scatter_pattern): Likewise.
* tree-vect-slp.c (vect_build_slp_tree_2): Likewise.
(vect_analyze_slp_instance, vect_get_constant_vectors): Likewise.
From-SVN: r277227
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (get_mask_type_for_scalar_type): Take a vec_info.
* tree-vect-stmts.c (get_mask_type_for_scalar_type): Likewise.
(vect_check_load_store_mask): Update call accordingly.
(vect_get_mask_type_for_stmt): Likewise.
* tree-vect-patterns.c (check_bool_pattern): Likewise.
(search_type_for_mask_1, vect_recog_mask_conversion_pattern): Likewise.
(vect_convert_mask_for_vectype): Likewise.
From-SVN: r277226
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
a vec_info.
(vect_recog_dot_prod_pattern): Update call accordingly.
(vect_recog_sad_pattern, vect_recog_pow_pattern): Likewise.
(vect_recog_widen_sum_pattern): Likewise.
From-SVN: r277225
|
|
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.h (vect_supportable_shift): Take a vec_info.
* tree-vect-stmts.c (vect_supportable_shift): Likewise.
* tree-vect-patterns.c (vect_synth_mult_by_constant): Update call
accordingly.
From-SVN: r277224
|
|
The increase_alignment pass was using get_vectype_for_scalar_type
to get the preferred vector type for each array element type.
This has the effect of carrying over the vector size chosen by
the first successful call to all subsequent calls, whereas it seems
more natural to treat each array type independently and pick the
"best" vector type for each element type.
2019-10-21 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vectorizer.c (get_vec_alignment_for_array_type): Use
get_vectype_for_scalar_type_and_size instead of
get_vectype_for_scalar_type.
From-SVN: r277223
|
|
From-SVN: r277221
|
|
2019-10-20 Bernd Edlinger <bernd.edlinger@hotmail.de>
* common.opt (-fcommon): Fix description.
From-SVN: r277217
|
|
* config/i386/i386-protos.h (ix86_pre_reload_split): Declare.
* config/i386/i386.c (ix86_pre_reload_split): New function.
* config/i386/i386.md (*fix_trunc<mode>_i387_1, *add<mode>3_eq,
*add<mode>3_ne, *add<mode>3_eq_0, *add<mode>3_ne_0, *add<mode>3_eq,
*add<mode>3_ne, *add<mode>3_eq_1, *add<mode>3_eq_0, *add<mode>3_ne_0,
*anddi3_doubleword, *andndi3_doubleword, *<code>di3_doubleword,
*one_cmpldi2_doubleword, *ashl<dwi>3_doubleword_mask,
*ashl<dwi>3_doubleword_mask_1, *ashl<mode>3_mask, *ashl<mode>3_mask_1,
*<shift_insn><mode>3_mask, *<shift_insn><mode>3_mask_1,
*<shift_insn><dwi>3_doubleword_mask,
*<shift_insn><dwi>3_doubleword_mask_1, *<rotate_insn><mode>3_mask,
*<rotate_insn><mode>3_mask_1, *<btsc><mode>_mask, *<btsc><mode>_mask_1,
*btr<mode>_mask, *btr<mode>_mask_1, *jcc_bt<mode>, *jcc_bt<mode>_1,
*jcc_bt<mode>_mask, *popcounthi2_1, frndintxf2_<rounding>,
*fist<mode>2_<rounding>_1, *<code><mode>3_1, *<code>di3_doubleword):
Use ix86_pre_reload_split instead of can_create_pseudo_p in condition.
* config/i386/sse.md (*sse4_1_<code>v8qiv8hi2<mask_name>_2,
*avx2_<code>v8qiv8si2<mask_name>_2,
*sse4_1_<code>v4qiv4si2<mask_name>_2,
*sse4_1_<code>v4hiv4si2<mask_name>_2,
*avx512f_<code>v8qiv8di2<mask_name>_2,
*avx2_<code>v4qiv4di2<mask_name>_2, *avx2_<code>v4hiv4di2<mask_name>_2,
*sse4_1_<code>v2hiv2di2<mask_name>_2,
*sse4_1_<code>v2siv2di2<mask_name>_2, sse4_2_pcmpestr,
sse4_2_pcmpistr): Likewise.
From-SVN: r277216
|
|
* doc/install.texi (Configuration, --enable-objc-gc): hboehm.info
now defaults to https.
From-SVN: r277215
|
|
array accesses.
* tree-ssa-alias.c (nonoverlapping_refs_since_match_p): Do not
skip non-zero array accesses.
* gcc.c-torture/execute/alias-access-path-2.c: New testcase.
* gcc.dg/tree-ssa/alias-access-path-11.c: xfail.
From-SVN: r277214
|
|
After the previous patch, it seems more natural to apply the
PARAM_SLP_MAX_INSNS_IN_BB threshold as soon as we know what
the region is, rather than delaying it to vect_slp_analyze_bb_1.
(But rather than carve out the biggest region possible and then
reject it, wouldn't it be better to stop when the region gets
too big, to at least give us a chance of vectorising something?)
It also seems more natural for vect_slp_bb_region to create the
bb_vec_info itself rather than (a) having to pass bits of data down
for the initialisation and (b) forcing vect_slp_analyze_bb_1 to free
on every failure return.
2019-10-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-slp.c (vect_slp_analyze_bb_1): Take a bb_vec_info
and return a boolean success value. Move the allocation and
initialization of the bb_vec_info to...
(vect_slp_bb_region): ...here. Update call accordingly.
(vect_slp_bb): Apply PARAM_SLP_MAX_INSNS_IN_BB here rather
than in vect_slp_analyze_bb_1.
From-SVN: r277211
|
|
If the first attempt at applying BB SLP to a region fails, the main loop
in vect_slp_bb recomputes the region's bounds and datarefs for the next
vector size. AFAICT this isn't needed any more; we should be able
to reuse the datarefs from the first attempt instead.
2019-10-20 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-vect-slp.c (vect_slp_analyze_bb_1): Call save_datarefs
when processing the given datarefs for the first time and
check_datarefs subsequently.
(vect_slp_bb_region): New function, split out of...
(vect_slp_bb): ...here. Don't recompute the region bounds and
dataref sets when retrying with a different vector size.
From-SVN: r277210
|
|
From-SVN: r277209
|
|
replace .* with \[^\n\r]*.
* g++.dg/cpp2a/nodiscard-reason-only-one.C: In dg-error or dg-warning
remove (?n) uses and replace .* with \[^\n\r]*.
* g++.dg/cpp2a/nodiscard-reason.C: Likewise.
* g++.dg/cpp2a/nodiscard-once.C: Likewise.
* g++.dg/cpp2a/nodiscard-reason-nonstring.C: Likewise.
From-SVN: r277205
|
|
PR target/92140
* config/i386/predicates.md (int_nonimmediate_operand): New special
predicate.
* config/i386/i386.md (*add<mode>3_eq, *add<mode>3_ne,
*add<mode>3_eq_0, *add<mode>3_ne_0, *sub<mode>3_eq, *sub<mode>3_ne,
*sub<mode>3_eq_1, *sub<mode>3_eq_0, *sub<mode>3_ne_0): New
define_insn_and_split patterns.
* gcc.target/i386/pr92140.c: New test.
* gcc.c-torture/execute/pr92140.c: New test.
Co-Authored-By: Uros Bizjak <ubizjak@gmail.com>
From-SVN: r277203
|
|
Darwin does not mark entries in string.h with nonnull attributes
so the test fails. Since the purpose of the test is to check that
the warnings are issued for an inlined function, not that the target
headers are marked up, we can provide marked up headers for Darwin.
gcc/testsuite/ChangeLog:
2019-10-19 Iain Sandoe <iain@sandoe.co.uk>
* gcc.dg/Wnonnull.c: Add attributed function declarations for
memcpy and strlen for Darwin.
From-SVN: r277202
|
|
Removes a comment that's no longer relevant.
gcc/ChangeLog:
2019-10-19 Iain Sandoe <iain@sandoe.co.uk>
* config/rs6000/rs6000.md: Delete out-of-date comment about
special-casing integer loads.
From-SVN: r277201
|
|
2019-10-17 JeanHeyd Meneide <phdofthehouse@gmail.com>
gcc/
* escaped_string.h (escaped_string): New header.
* tree.c (escaped_string): Remove escaped_string class.
gcc/c-family
* c-lex.c (c_common_has_attribute): Update nodiscard value.
gcc/cp/
* tree.c (handle_nodiscard_attribute): Added C++2a nodiscard
string message.
(std_attribute_table): Increase nodiscard argument handling
max_length from 0 to 1.
* parser.c (cp_parser_check_std_attribute): Add requirement
that nodiscard only be seen once in attribute-list.
(cp_parser_std_attribute): Check that empty parenthesis lists are
not specified for attributes that have max_length > 0 (e.g.
[[attr()]]).
* cvt.c (maybe_warn_nodiscard): Add nodiscard message to
output, if applicable.
(convert_to_void): Allow constructors to be nodiscard-able (P1771).
gcc/testsuite/g++.dg/cpp0x
* gen-attrs-67.C: Test new error message for empty-parenthesis-list.
gcc/testsuite/g++.dg/cpp2a
* nodiscard-construct.C: New test.
* nodiscard-once.C: New test.
* nodiscard-reason-nonstring.C: New test.
* nodiscard-reason-only-one.C: New test.
* nodiscard-reason.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
From-SVN: r277200
|
|
From-SVN: r277199
|
|
gcc/testsuite/ChangeLog:
PR tree-optimization/92157
* gcc.dg/strlenopt-69.c: Disable test failing due to PR 92155.
* gcc.dg/strlenopt-87.c: New test.
gcc/ChangeLog:
PR tree-optimization/92157
* tree-ssa-strlen.c (handle_builtin_string_cmp): Be prepared for
compute_string_length to return a negative result.
From-SVN: r277194
|
|
In thumb2 we now generate a NEGS instruction rather than RSBS, so this
test needs updating.
* gcc.target/arm/negdi-3.c: Update expected output to allow NEGS.
From-SVN: r277192
|
|
The generic expansion code for negv does not try the subv patterns,
but instead emits a sub and a compare separately. Fortunately, the
patterns can make use of the new subv operations, so just call those.
We can also rewrite this using an iterator to simplify things further.
Finally, we can now make negvdi4 work on Thumb2 as well as Arm.
* config/arm/arm.md (negv<SIDI:mode>3): New expansion rule.
(negvsi3, negvdi3): Delete.
(negdi2_compare): Delete.
From-SVN: r277191
|
|
This patch adds early expansion of subvdi4. The expansion sequence
is broadly based on the expansion of usubvdi4.
* config/arm/arm.md (subvdi4): Decompose calculation into 32-bit
operations.
(subdi3_compare1): Delete pattern.
(subvsi3_borrow): New insn pattern.
(subvsi3_borrow_imm): Likewise.
From-SVN: r277190
|
|
This patch addresses constant handling in subvsi4. Either operand may
be a constant. If the second input (operand[2]) is a constant, then
we can canonicalize this into an addition form, providing we take care
of the INT_MIN case. In that case the negation has to handle the fact
that -INT_MIN is still INT_MIN and we need to ensure that a subtract
operation is performed rather than an addition. The remaining cases
are largely duals of the usubvsi4 expansion.
This patch also fixes a technical correctness bug in the old
expansion, where we did not really describe the test for overflow in
the RTL. We seem to have got away with that, however...
* config/arm/arm.md (subv<mode>4): Delete.
(subvdi4): New expander pattern.
(subvsi4): Likewise. Handle some immediate values.
(subvsi3_intmin): New insn pattern.
(subvsi3): Likewise.
(subvsi3_imm1): Likewise.
* config/arm/arm.c (select_cc_mode): Also allow minus for CC_V
idioms.
From-SVN: r277189
|
|
This patch adds early expansion of usubvdi4, allowing us to handle some
constants in place, which previously we were unable to do.
* config/arm/arm.md (usubvdi4): Allow registers or integers for
incoming operands. Early split the calculation into SImode
operations.
(usubvsi3_borrow): New insn pattern.
(usubvsi3_borrow_imm): Likewise.
From-SVN: r277188
|
|
This patch improves the expansion of usubvsi4 by allowing suitable
constants to be passed directly. Unlike normal subtraction, either
operand may be a constant (and indeed I have seen cases where both can
be with LTO enabled). One interesting testcase that improves as a
result of this is:
    unsigned f6 (unsigned a)
    {
      unsigned x;
      return __builtin_sub_overflow (5U, a, &x) ? 0 : x;
    }
Which previously compiled to:
    rsbs  r3, r0, #5
    cmp   r0, #5
    movls r0, r3
    movhi r0, #0
but now generates the optimal sequence:
    rsbs  r0, r0, #5
    movcc r0, #0
* config/arm/arm.md (usubv<mode>4): Delete expansion.
(usubvsi4): New pattern. Allow some immediate values for inputs.
(usubvdi4): New pattern.
From-SVN: r277187
|
|
This patch adds early splitting for addvdi4; it's very similar to the
uaddvdi4 splitter, but the details are just different enough in
places, especially for the patterns that match the splitting, where we
have to compare against the non-widened version to detect if overflow
occurred.
I've also added a testcase to the testsuite for a couple of constants
that caught me out during the development of this patch. They're
probably arm-specific values, but the test is generic enough that I've
included it for all targets.
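One way to read "compare against the non-widened version" is sketched below at 32 bits, so that a 64-bit type can stand in for the widened mode. The function name is hypothetical, and the conversion of the wrapped sum back to int32_t relies on GCC's modulo behaviour; this illustrates the idea and is not the RTL the patch emits.

```c
#include <stdint.h>

/* Signed overflow check: compute the sum exactly in a wider type and
   compare it with the sign-extended narrow (wrapped) sum.  The two
   differ exactly when the narrow addition overflowed.  */
int add_overflows_widened (int32_t a, int32_t b)
{
  int64_t wide = (int64_t) a + (int64_t) b;
  int32_t narrow = (int32_t) ((uint32_t) a + (uint32_t) b);
  return wide != (int64_t) narrow;
}
```

The result should match __builtin_add_overflow on the same int32_t operands.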
[gcc]
* config/arm/arm.c (arm_select_cc_mode): Allow either the first
or second operand of the PLUS inside a DImode equality test to be
sign-extend when selecting CC_Vmode.
* config/arm/arm.md (addvdi4): Early-split the operation into SImode
instructions.
(addsi3_cin_vout_reg, addsi3_cin_vout_imm, addsi3_cin_vout_0): New
expand patterns.
(addsi3_cin_vout_reg_insn, addsi3_cin_vout_imm_insn): New patterns.
(addsi3_cin_vout_0): Likewise.
(adddi3_compareV): Delete.
[gcc/testsuite]
* gcc.dg/builtin-arith-overflow-3.c: New test.
From-SVN: r277186
|
|
This patch matches the signed add-with-overflow patterns when the
summation itself is dropped. In this case we can use CMN (or CMP with
some immediates). There are a small number of constants in thumb2
where this can result in less dense code (as we lack 16-bit CMN with
immediate patterns). To handle this we use peepholes to try these
alternatives when either a scratch is available (0 <= i <= 7) or the
original register is dead (0 <= i <= 255). We don't use a scratch in
the pattern because, if those conditions are not satisfied, the 32-bit
form is preferable to forcing a reload.
* config/arm/arm.md (addsi3_compareV_reg_nosum): New insn.
(addsi3_compareV_imm_nosum): New insn. Also add peephole2 patterns
to transform this back into the summation version when that leads
to smaller code.
From-SVN: r277185
|
|
Similar to the improvements for uaddvsi4, this patch improves the code
generation for addvsi4 to handle immediates and to add alternatives
that better target thumb2. To do this we separate out the expansion
of addvsi4 from that of addvdi4 and then add an additional pattern
to handle constants. Also, while doing this I've fixed the incorrect
usage of NE instead of COMPARE in the generated RTL.
* config/arm/arm.md (addv<mode>4): Delete.
(addvsi4): New pattern. Handle immediate values that the architecture
supports.
(addvdi4): New pattern.
(addsi3_compareV): Rename to ...
(addsi3_compareV_reg): ... this. Add constraints for thumb2 variants
and use COMPARE rather than NE.
(addsi3_compareV_imm): New pattern.
* config/arm/arm.c (arm_select_cc_mode): Return CC_Vmode for
a signed-overflow check.
From-SVN: r277184
|
|
This code borrows heavily from the uaddvti4 expansion for aarch64 since
the principles are similar. Firstly, if one of the low words of
the expansion is 0, we can simply copy the other low word to the
destination and use uaddvsi4 for the upper word. If that doesn't work
we have to handle three possible cases for the upper word (the lower
word is simply an add-with-carry operation as for adddi3): zero in the
upper word, some other constant and a register (each has a different
canonicalization). We use CC_ADCmode (a new CC mode variant) to
describe the cases as the introduction of the carry means we can
no longer use the normal overflow trick of comparing the sum against
one of the operands.
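The reason the usual comparison trick breaks once a carry-in is involved can be shown in plain C. This is a sketch of the arithmetic only, with hypothetical function names; it does not model the RTL patterns or CC_ADCmode itself.

```c
#include <stdint.h>

/* Add two 64-bit values using 32-bit halves.  Stores the sum in *sum
   and returns the carry out of the upper word.  */
int add64_by_halves (uint64_t a, uint64_t b, uint64_t *sum)
{
  uint32_t alo = (uint32_t) a, ahi = (uint32_t) (a >> 32);
  uint32_t blo = (uint32_t) b, bhi = (uint32_t) (b >> 32);

  /* Low word: no carry-in, so comparing the sum against an operand
     correctly recovers the carry.  */
  uint32_t lo = alo + blo;
  uint32_t carry = lo < alo;

  /* High word: with a carry-in, compute in a wider type instead.  */
  uint64_t hi_wide = (uint64_t) ahi + bhi + carry;

  *sum = ((uint64_t) (uint32_t) hi_wide << 32) | lo;
  return (int) (hi_wide >> 32);
}

/* The comparison trick applied naively to an add-with-carry.  It gives
   the wrong answer when b + carry_in wraps to zero.  */
int naive_carry_out (uint32_t a, uint32_t b, uint32_t carry_in)
{
  uint32_t s = a + b + carry_in;
  return s < a;
}
```

For example, naive_carry_out (5, 0xFFFFFFFF, 1) returns 0 even though the true 33-bit sum is 0x100000005 and a carry is produced.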
* config/arm/arm-modes.def (CC_ADC): New CC mode.
* config/arm/arm.c (arm_select_cc_mode): Detect selection of
CC_ADCmode.
(maybe_get_arm_condition_code): Handle CC_ADCmode.
* config/arm/arm.md (uaddvdi4): Early expansion of unsigned addition
with overflow.
(addsi3_cin_cout_reg, addsi3_cin_cout_imm, addsi3_cin_cout_0): New
expand patterns.
(addsi3_cin_cout_reg_insn, addsi3_cin_cout_0_insn): New insn patterns
(addsi3_cin_cout_imm_insn): Likewise.
(adddi3_compareC): Delete insn.
* config/arm/predicates.md (arm_carry_operation): Handle CC_ADCmode.
From-SVN: r277183
|
|
The uaddv patterns in the arm back-end do not currently handle immediates
during expansion. This patch adds this support for uaddvsi4. It's really
a stepping-stone towards early expansion of uaddvdi4, but it is complete and
a useful change in its own right.
Whilst making this change I also observed that we really had two patterns
that did exactly the same thing, but with slightly different properties;
consequently I've cleaned up all of the add-and-compare patterns to bring
some consistency.
* config/arm/arm.md (adddi3): Call gen_addsi3_compare_op1.
(uaddv<mode>4): Delete expansion pattern.
(uaddvsi4): New pattern.
(uaddvdi4): Likewise.
(addsi3_compareC): Delete pattern, change callers to use
addsi3_compare_op1.
(addsi3_compare_op1): No longer anonymous. Clean up constraints to
reduce the number of alternatives and re-work type attribute handling.
(addsi3_compare_op2): Clean up constraints to reduce the number of
alternatives and re-work type attribute handling.
(compare_addsi2_op0): Likewise.
(compare_addsi2_op1): Likewise.
From-SVN: r277182
|