path: root/gcc
Age | Commit message | Author | Files | Lines
2020-10-02 | c++: Hash table iteration for namespace-member spelling suggestions | Nathan Sidwell | 3 | -45/+87
For 'no such binding' errors, we iterate over binding levels to find a close match. At the namespace level we were using DECL_ANTICIPATED to skip undeclared builtins. But (a) there are other unnameable things there and (b) decl-anticipated is about to go away. This changes the namespace scanning to iterate over the hash table and look at non-hidden bindings. This does mean we look at fewer strings (hurrah), but the order we meet them in is somewhat 'random'. Our distance measure is not very fine grained, and a couple of testcases change their suggestion. I notice that for the c/c++ common one, we now match the output of the C compiler. For the other one we think 'int' and 'int64_t' have the same distance from 'int64', and now meet the former first. That's a little unfortunate. If it's too problematic, I suppose we could sort the strings via an intermediate array before measuring distance.

gcc/cp/
    * name-lookup.c (consider_decl): New, broken out of ...
    (consider_binding_level): ... here.  Iterate the hash table for
    namespace bindings.

gcc/testsuite/
    * c-c++-common/spellcheck-reserved.c: Adjust diagnostic.
    * g++.dg/spellcheck-typenames.C: Adjust diagnostic.
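The tie-breaking behaviour described above — 'int' and 'int64_t' being equally close to 'int64', with iteration order deciding the winner — can be illustrated with a minimal sketch. This is not GCC's actual suggestion machinery (which lives in spellcheck.c); it is a plain Levenshtein distance with first-wins tie-breaking, which is enough to show why hash-table order changes the suggestion:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Classic dynamic-programming Levenshtein edit distance.
static unsigned edit_distance (const std::string &a, const std::string &b)
{
  std::vector<unsigned> prev (b.size () + 1), cur (b.size () + 1);
  for (unsigned j = 0; j <= b.size (); ++j)
    prev[j] = j;
  for (unsigned i = 1; i <= a.size (); ++i)
    {
      cur[0] = i;
      for (unsigned j = 1; j <= b.size (); ++j)
	cur[j] = std::min ({ prev[j] + 1,		// delete from a
			     cur[j - 1] + 1,		// insert into a
			     prev[j - 1] + (a[i - 1] != b[j - 1]) });
      std::swap (prev, cur);
    }
  return prev[b.size ()];
}

// First candidate wins ties, so iteration order matters when two
// candidates are equally distant -- exactly the effect noted above.
std::string best_suggestion (const std::vector<std::string> &candidates,
			     const std::string &target)
{
  std::string best;
  unsigned best_d = ~0u;
  for (const std::string &c : candidates)
    {
      unsigned d = edit_distance (c, target);
      if (d < best_d)
	{
	  best_d = d;
	  best = c;
	}
    }
  return best;
}
```

With candidates met in the order {"int", "int64_t"}, both are at distance 2 from "int64" and "int" wins, matching the testcase change described above.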
2020-10-02 | c++: cleanup ctor_omit_inherited_parms [PR97268] | Nathan Sidwell | 4 | -134/+222
ctor_omit_inherited_parms was being somewhat abused. What I'd missed is that it checks for a base-dtor name before proceeding with the check. But we ended up passing it that during cloning, before we'd completed the cloning. It was also using DECL_ORIGIN to get to the in-charge ctor, but we sometimes zap DECL_ABSTRACT_ORIGIN, and it ends up processing the incoming function -- which happens to work. So, this breaks out a predicate that expects to get the in-charge ctor and will tell you whether its base ctor will need to omit the parms. We call that directly during cloning. Then the original fn is essentially just a wrapper, but uses DECL_CLONED_FUNCTION to get to the in-charge ctor. That uncovered abuse in add_method, which was happily passing TEMPLATE_DECLs to it. Let's not do that. add_method itself contained a loop mostly using an 'if (nomatch) continue' idiom, except for a final 'if (match) {...}' check, which itself contained instances of the former idiom. I refactored that to use the former idiom throughout. In doing that I found a place where we'd issue an error, but then not actually reject the new member.

gcc/cp/
    * cp-tree.h (base_ctor_omit_inherited_parms): Declare.
    * class.c (add_method): Refactor main loop, only pass fns to
    ctor_omit_inherited_parms.
    (build_cdtor_clones): Rename bool parms.
    (clone_cdtor): Call base_ctor_omit_inherited_parms.
    * method.c (base_ctor_omit_inherited_parms): New, broken out of ...
    (ctor_omit_inherited_parms): ... here, call it with
    DECL_CLONED_FUNCTION.

gcc/testsuite/
    * g++.dg/inherit/pr97268.C: New.
2020-10-02 | ipa-cp: Separate and increase the large-unit parameter | Martin Jambor | 2 | -1/+5
A previous patch in the series has taught IPA-CP to identify the important cloning opportunities in 548.exchange2_r as worthwhile on their own, but the optimization is still prevented from taking place because of the overall unit-growth limit. This patch raises that limit so that the cloning takes place and the benchmark runs 30% faster (on an AMD Zen2 CPU, at least).

Before this patch, IPA-CP uses the following formula to arrive at the overall_size limit:

  base = MAX(orig_size, param_large_unit_insns)
  unit_growth_limit = base + base * param_ipa_cp_unit_growth / 100

where param_ipa_cp_unit_growth has default 10 and param_large_unit_insns has default value 10000.

The problem with exchange2 (at least on Zen2, but I have had a quick look on aarch64 too) is that the original estimated unit size is 10513, so param_large_unit_insns does not apply, and the default limit is therefore 11564, which is good enough for only one of the ideal 8 clonings; we need the limit to be at least 16291.

I would like to raise param_ipa_cp_unit_growth a little bit more soon too, but most certainly not to 55. Therefore, the large-unit parameter must be increased. In this patch, I decided to decouple the inlining and IPA-CP large-unit parameters. It also makes sense because IPA-CP uses it only at -O3 while inlining uses it at -O2 as well (IIUC). But if we agree we can try raising param_large_unit_insns to 13-14 thousand "instructions," perhaps it is not necessary. But then again, it may make sense to actually increase the IPA-CP limit further. I plan to experiment with IPA-CP tuning on a larger set of programs. Meanwhile, mainly to address the 548.exchange2_r regression, I'm suggesting this simple change.

gcc/ChangeLog:

2020-09-07  Martin Jambor  <mjambor@suse.cz>

    * params.opt (ipa-cp-large-unit-insns): New parameter.
    * ipa-cp.c (get_max_overall_size): Use the new parameter.
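The limit arithmetic in the message can be checked with a small sketch. The function below mirrors the formula as stated in the commit message (parameter names copied from it; GCC's actual get_max_overall_size may differ in detail). Plugging in exchange2's estimated size of 10513 with the default parameters reproduces the quoted limit of 11564:

```c
/* Sketch of the overall_size limit computation described above.
   base = MAX (orig_size, param_large_unit_insns)
   limit = base + base * param_ipa_cp_unit_growth / 100  */
static long
max_overall_size (long orig_size, long large_unit_insns,
		  long cp_unit_growth_percent)
{
  long base = orig_size > large_unit_insns ? orig_size : large_unit_insns;
  return base + base * cp_unit_growth_percent / 100;
}
```

With orig_size = 10513, large_unit_insns = 10000 and growth = 10 this yields 11564, below the 16291 the benchmark needs — which is the motivation for raising the large-unit parameter.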
2020-10-02 | ipa-cp: Add dumping of overall_size after cloning | Martin Jambor | 1 | -1/+5
When experimenting with IPA-CP parameters, especially when looking into exchange2_r, it has been very useful to know what the value of overall_size is at different stages of the decision process. This patch therefore adds it to the generated dumps.

gcc/ChangeLog:

2020-09-07  Martin Jambor  <mjambor@suse.cz>

    * ipa-cp.c (estimate_local_effects): Add overall_size to dumped
    string.
    (decide_about_value): Dump new overall_size.
2020-10-02 | ipa: Multiple predicates for loop properties, with frequencies | Martin Jambor | 6 | -114/+288
This patch enhances the ability of IPA to reason about the conditions under which loops in a function have known iteration counts or strides. It replaces the single predicates, which currently hold a conjunction of predicates for all loops, with vectors capable of holding multiple predicates, each with a cumulative frequency of the loops having the property. This second property is then used by IPA-CP to much more aggressively boost its heuristic score for cloning opportunities which make iteration counts or strides of frequent loops compile-time constant.

gcc/ChangeLog:

2020-09-03  Martin Jambor  <mjambor@suse.cz>

    * ipa-fnsummary.h (ipa_freqcounting_predicate): New type.
    (ipa_fn_summary): Change the type of loop_iterations and loop_strides
    to vectors of ipa_freqcounting_predicate.
    (ipa_fn_summary::ipa_fn_summary): Construct the new vectors.
    (ipa_call_estimates): New fields loops_with_known_iterations and
    loops_with_known_strides.
    * ipa-cp.c (hint_time_bonus): Multiply param_ipa_cp_loop_hint_bonus
    with the expected frequencies of loops with known iteration count or
    stride.
    * ipa-fnsummary.c (add_freqcounting_predicate): New function.
    (ipa_fn_summary::~ipa_fn_summary): Release the new vectors instead of
    just two predicates.
    (remap_hint_predicate_after_duplication): Replace with function
    remap_freqcounting_preds_after_dup.
    (ipa_fn_summary_t::duplicate): Use it or duplicate new vectors.
    (ipa_dump_fn_summary): Dump the new vectors.
    (analyze_function_body): Compute the loop property vectors.
    (ipa_call_context::estimate_size_and_time): Calculate also
    loops_with_known_iterations and loops_with_known_strides.  Adjusted
    dumping accordingly.
    (remap_hint_predicate): Replace with function
    remap_freqcounting_predicate.
    (ipa_merge_fn_summary_after_inlining): Use it.
    (inline_read_section): Stream loopcounting vectors instead of two
    simple predicates.
    (ipa_fn_summary_write): Likewise.
    * params.opt (ipa-max-loop-predicates): New parameter.
    * doc/invoke.texi (ipa-max-loop-predicates): Document new param.

gcc/testsuite/ChangeLog:

2020-09-03  Martin Jambor  <mjambor@suse.cz>

    * gcc.dg/ipa/ipcp-loophint-1.c: New test.
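The frequency-weighted scoring idea above can be sketched as follows. This is a hypothetical illustration, not GCC's hint_time_bonus: each predicate records whether a loop's bound would become a compile-time constant for the candidate clone, together with the cumulative execution frequency of the loops it covers, and the loop-hint bonus is scaled by the total frequency of the predicates that evaluate to true:

```cpp
#include <vector>

// Illustrative stand-in for ipa_freqcounting_predicate: a predicate
// already evaluated for a particular cloning candidate, plus the
// cumulative frequency of the loops it describes.
struct freqcounting_predicate
{
  bool evaluates_true;	// does this clone make the loop bound constant?
  double freq;		// cumulative frequency of such loops
};

// Scale the flat per-function loop-hint bonus by how often the loops
// whose bounds become constant actually execute.
int
hint_bonus (int loop_hint_bonus,
	    const std::vector<freqcounting_predicate> &preds)
{
  double total = 0;
  for (const freqcounting_predicate &p : preds)
    if (p.evaluates_true)
      total += p.freq;
  return static_cast<int> (loop_hint_bonus * total);
}
```

A clone that makes only a cold loop's bound constant thus gets a much smaller boost than one helping a hot loop, which is the "much more aggressively" part of the commit message.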
2020-10-02 | ipa: Bundle estimates of ipa_call_context::estimate_size_and_time | Martin Jambor | 4 | -83/+105
A subsequent patch adds another two estimates to those that the code in ipa_call_context::estimate_size_and_time computes, and the fact that the function has a special output parameter for each thing it computes would make it have just too many. Therefore, this patch collapses all those output parameters into one output structure.

gcc/ChangeLog:

2020-09-02  Martin Jambor  <mjambor@suse.cz>

    * ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to use
    ipa_call_estimates.
    (do_estimate_edge_size): Likewise.
    (do_estimate_edge_hints): Likewise.
    * ipa-fnsummary.h (struct ipa_call_estimates): New type.
    (ipa_call_context::estimate_size_and_time): Adjusted declaration.
    (estimate_ipcp_clone_size_and_time): Likewise.
    * ipa-cp.c (hint_time_bonus): Changed the type of the second argument
    to ipa_call_estimates.
    (perform_estimation_of_a_value): Adjusted to use ipa_call_estimates.
    (estimate_local_effects): Likewise.
    * ipa-fnsummary.c (ipa_call_context::estimate_size_and_time):
    Adjusted to return estimates in a single ipa_call_estimates
    parameter.
    (estimate_ipcp_clone_size_and_time): Likewise.
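The refactoring pattern is simple but worth spelling out: one estimates structure replaces a growing list of output pointers, so later patches can add fields without touching every caller's signature. The fields and the toy computation below are illustrative, not GCC's actual ones:

```cpp
// Illustrative stand-in for ipa_call_estimates: one bundle for
// everything estimate_size_and_time computes.
struct ipa_call_estimates
{
  long size = 0;
  long time = 0;
  int hints = 0;
};

// Before the refactor this would have looked like
//   void estimate (int insns, long *size, long *time, int *hints);
// with another pointer parameter added for every new estimate.
void
estimate_size_and_time (int insns, ipa_call_estimates &out)
{
  out.size = insns;		// toy model: size tracks instruction count
  out.time = 2L * insns;	// time as a fixed multiple of it
  out.hints = insns > 100;	// large bodies get a hint
}
```

Callers that care about only one field simply ignore the rest, and adding the two new loop estimates mentioned above becomes a one-line struct change.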
2020-10-02 | ipa: Introduce ipa_cached_call_context | Martin Jambor | 3 | -17/+24
Hi, as we discussed with Honza on the mailing list last week, making the cached call context structure distinct from the normal one may make it clearer that the cached data need to be explicitly deallocated. This patch does that division. It is not mandatory for the overall main goals of the patch set and can be dropped if deemed superfluous.

gcc/ChangeLog:

2020-09-02  Martin Jambor  <mjambor@suse.cz>

    * ipa-fnsummary.h (ipa_cached_call_context): New forward declaration
    and class.
    (class ipa_call_context): Make friend ipa_cached_call_context.  Moved
    methods duplicate_from and release to it too.
    * ipa-fnsummary.c (ipa_call_context::duplicate_from): Moved to class
    ipa_cached_call_context.
    (ipa_call_context::release): Likewise, removed the parameter.
    * ipa-inline-analysis.c (node_context_cache_entry): Change the type
    of ctx to ipa_cached_call_context.
    (do_estimate_edge_time): Remove parameter from the call to
    ipa_cached_call_context::release.
2020-10-02 | ipa: Bundle vectors describing argument values | Martin Jambor | 6 | -426/+449
Hi, this large patch is a mostly mechanical change which aims to replace uses of separate vectors of known scalar values (usually called known_vals or known_csts), known aggregate values (known_aggs), known virtual call contexts (known_contexts) and known value ranges (known_value_ranges) with uses of either the new type ipa_call_arg_values or ipa_auto_call_arg_values, both of which simply contain these vectors inside them.

The need for two distinct types comes from the fact that when the vectors are constructed from jump functions or lattices, we really should use auto_vecs with embedded storage allocated on the stack. On the other hand, the bundle in ipa_call_context can be allocated on the heap when in the cache, once for each call-graph node.

ipa_call_context is constructible from ipa_auto_call_arg_values, but then its vectors must not be resized, otherwise the vectors will stop pointing to the stack ones. Unfortunately, I don't think the structure embedded in ipa_call_context can be made constant, because we need to manipulate and deallocate it when in the cache.

gcc/ChangeLog:

2020-09-01  Martin Jambor  <mjambor@suse.cz>

    * ipa-prop.h (ipa_auto_call_arg_values): New type.
    (class ipa_call_arg_values): Likewise.
    (ipa_get_indirect_edge_target): Replaced vector arguments with
    ipa_call_arg_values in declaration.  Added an overload for
    ipa_auto_call_arg_values.
    * ipa-fnsummary.h (ipa_call_context): Removed members m_known_vals,
    m_known_contexts, m_known_aggs, duplicate_from, release and equal_to,
    new members m_avals, store_to_cache and equivalent_to_p.  Adjusted
    constructor arguments.
    (estimate_ipcp_clone_size_and_time): Replaced vector arguments with
    ipa_auto_call_arg_values in declaration.
    (evaluate_properties_for_edge): Likewise.
    * ipa-cp.c (ipa_get_indirect_edge_target): Adjusted to work on
    ipa_call_arg_values rather than on separate vectors.  Added an
    overload for ipa_auto_call_arg_values.
    (devirtualization_time_bonus): Adjusted to work on
    ipa_auto_call_arg_values rather than on separate vectors.
    (gather_context_independent_values): Adjusted to work on
    ipa_auto_call_arg_values rather than on separate vectors.
    (perform_estimation_of_a_value): Likewise.
    (estimate_local_effects): Likewise.
    (modify_known_vectors_with_val): Adjusted both variants to work on
    ipa_auto_call_arg_values and rename them to
    copy_known_vectors_add_val.
    (decide_about_value): Adjusted to work on ipa_call_arg_values rather
    than on separate vectors.
    (decide_whether_version_node): Likewise.
    * ipa-fnsummary.c (evaluate_conditions_for_known_args): Likewise.
    (evaluate_properties_for_edge): Likewise.
    (ipa_fn_summary_t::duplicate): Likewise.
    (estimate_edge_devirt_benefit): Adjusted to work on
    ipa_call_arg_values rather than on separate vectors.
    (estimate_edge_size_and_time): Likewise.
    (estimate_calls_size_and_time_1): Likewise.
    (summarize_calls_size_and_time): Adjusted calls to
    estimate_edge_size_and_time.
    (estimate_calls_size_and_time): Adjusted to work on
    ipa_call_arg_values rather than on separate vectors.
    (ipa_call_context::ipa_call_context): Construct from a pointer to
    ipa_auto_call_arg_values instead of individual vectors.
    (ipa_call_context::duplicate_from): Adjusted to access vectors within
    m_avals.
    (ipa_call_context::release): Likewise.
    (ipa_call_context::equal_to): Likewise.
    (ipa_call_context::estimate_size_and_time): Adjusted to work on
    ipa_call_arg_values rather than on separate vectors.
    (estimate_ipcp_clone_size_and_time): Adjusted to work with
    ipa_auto_call_arg_values rather than on separate vectors.
    (ipa_merge_fn_summary_after_inlining): Likewise.  Adjusted call to
    estimate_edge_size_and_time.
    (ipa_update_overall_fn_summary): Adjusted call to
    estimate_edge_size_and_time.
    * ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to work
    with ipa_auto_call_arg_values rather than with separate vectors.
    (do_estimate_edge_size): Likewise.
    (do_estimate_edge_hints): Likewise.
    * ipa-prop.c (ipa_auto_call_arg_values::~ipa_auto_call_arg_values):
    New destructor.
2020-10-02 | arm: Remove coercion from scalar argument to vmin & vmax intrinsics | Joe Ramsay | 53 | -140/+500
This patch fixes an issue with the vmin* and vmax* intrinsics which accept a scalar argument. Previously, when the scalar was of a different width to the vector elements, this would generate __ARM_undef. This change allows the scalar argument to be implicitly converted to the correct width. Also tidied up the relevant unit tests, some of which would have passed even if only one of two or three intrinsic calls had compiled correctly.

Bootstrapped and tested on arm-none-eabi; gcc and CMSIS_DSP testsuites are clean. OK for trunk?

Thanks,
Joe

gcc/ChangeLog:

2020-08-10  Joe Ramsay  <joe.ramsay@arm.com>

    * config/arm/arm_mve.h (__arm_vmaxnmavq): Remove coercion of scalar
    argument.
    (__arm_vmaxnmvq): Likewise.
    (__arm_vminnmavq): Likewise.
    (__arm_vminnmvq): Likewise.
    (__arm_vmaxnmavq_p): Likewise.
    (__arm_vmaxnmvq_p): Likewise (and delete duplicate definition).
    (__arm_vminnmavq_p): Likewise.
    (__arm_vminnmvq_p): Likewise.
    (__arm_vmaxavq): Likewise.
    (__arm_vmaxavq_p): Likewise.
    (__arm_vmaxvq): Likewise.
    (__arm_vmaxvq_p): Likewise.
    (__arm_vminavq): Likewise.
    (__arm_vminavq_p): Likewise.
    (__arm_vminvq): Likewise.
    (__arm_vminvq_p): Likewise.

gcc/testsuite/ChangeLog:

    * gcc.target/arm/mve/intrinsics/vmaxavq_p_s16.c: Add test for
    mismatched width of scalar argument.
    * gcc.target/arm/mve/intrinsics/vmaxavq_p_s32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxavq_p_s8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxavq_s16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxavq_s32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxavq_s8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxnmavq_f16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxnmavq_f32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxnmvq_f16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxnmvq_f32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_p_s16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_p_s32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_p_s8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_p_u16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_p_u32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_p_u8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_s16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_s32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_s8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_u16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_u32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vmaxvq_u8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminavq_p_s16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminavq_p_s32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminavq_p_s8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminavq_s16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminavq_s32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminavq_s8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminnmavq_f16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminnmavq_f32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminnmavq_p_f16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminnmavq_p_f32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminnmvq_f16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminnmvq_f32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminnmvq_p_f16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminnmvq_p_f32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_p_s16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_p_s32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_p_s8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_p_u16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_p_u32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_p_u8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_s16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_s32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_s8.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_u16.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_u32.c: Likewise.
    * gcc.target/arm/mve/intrinsics/vminvq_u8.c: Likewise.
2020-10-02 | AArch64: Add neoversev1_tunings struct | Kyrylo Tkachov | 2 | -2/+28
This patch adds a Neoverse V1-specific tuning struct that currently is just a deduplication of the N1 struct it was using before, plus a specification of the SVE width. This will allow us to tweak Neoverse V1 things in the future as needed.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/
    * config/aarch64/aarch64.c (neoversev1_tunings): Define.
    * config/aarch64/aarch64-cores.def (zeus): Use it.
    (neoverse-v1): Likewise.
2020-10-02 | Perforate fnspec strings | Jan Hubicka | 7 | -142/+191
gcc/ChangeLog:

2020-10-02  Jan Hubicka  <hubicka@ucw.cz>

    * attr-fnspec.h: Update documentation.
    (attr_fnspec::return_desc_size): Set to 2.
    (attr_fnspec::arg_desc_size): Set to 2.
    * builtin-attrs.def (STR1): Update fnspec.
    * internal-fn.def (UBSAN_NULL): Update fnspec.
    (UBSAN_VPTR): Update fnspec.
    (UBSAN_PTR): Update fnspec.
    (ASAN_CHECK): Update fnspec.
    (GOACC_DIM_SIZE): Remove fnspec.
    (GOACC_DIM_POS): Remove fnspec.
    * tree-ssa-alias.c (attr_fnspec::verify): Update verification.

gcc/fortran/ChangeLog:

2020-10-02  Jan Hubicka  <hubicka@ucw.cz>

    * trans-decl.c (gfc_build_library_function_decl_with_spec): Verify
    fnspec.
    (gfc_build_intrinsic_function_decls): Update fnspecs.
    (gfc_build_builtin_function_decls): Update fnspecs.
    * trans-io.c (gfc_build_io_library_fndecls): Update fnspecs.
    * trans-types.c (create_fn_spec): Update fnspecs.
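The "perforated" layout the commit establishes — a fixed-size descriptor field for the return value and for each argument — makes indexing into a fnspec string trivial. The sketch below is a hedged illustration of that indexing scheme only; the descriptor sizes match the ChangeLog (both set to 2), but the character meanings are GCC-internal and not reproduced here:

```c
#include <stddef.h>
#include <string.h>

/* Field sizes as set by the commit above.  */
#define RETURN_DESC_SIZE 2
#define ARG_DESC_SIZE 2

/* First character of the descriptor for argument I, or '\0' if the
   fnspec string is too short to describe that argument.  With fixed
   2-character fields, argument I's descriptor starts at offset
   RETURN_DESC_SIZE + I * ARG_DESC_SIZE.  */
static char
arg_desc (const char *fnspec, size_t i)
{
  size_t off = RETURN_DESC_SIZE + i * ARG_DESC_SIZE;
  return off < strlen (fnspec) ? fnspec[off] : '\0';
}
```

For example, in a hypothetical fnspec ". R . " the first two characters describe the return value, characters 2-3 describe argument 0, and characters 4-5 describe argument 1; arguments past the end of the string simply have no description.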
2020-10-02 | c++: Simplify __FUNCTION__ creation | Nathan Sidwell | 2 | -54/+32
I had reason to wander into cp_make_fname, and noticed it's the only caller of cp_fname_init. Folding it in makes the code simpler.

gcc/cp/
    * cp-tree.h (cp_fname_init): Delete declaration.
    * decl.c (cp_fname_init): Merge into only caller ...
    (cp_make_fname): ... here & refactor.
2020-10-02 | Commonize handling of attr-fnspec | Jan Hubicka | 5 | -62/+229
    * attr-fnspec.h: New file.
    * calls.c (decl_return_flags): Use attr_fnspec.
    * gimple.c (gimple_call_arg_flags): Use attr_fnspec.
    (gimple_call_return_flags): Use attr_fnspec.
    * tree-into-ssa.c (pass_build_ssa::execute): Use attr_fnspec.
    * tree-ssa-alias.c (attr_fnspec::verify): New member function.
2020-10-02 | Break out ao_ref_init_from_ptr_and_range from ao_ref_init_from_ptr_and_size | Jan Hubicka | 1 | -12/+40
    * tree-ssa-alias.c (ao_ref_init_from_ptr_and_range): Break out
    from ...
    (ao_ref_init_from_ptr_and_size): ... here.
2020-10-02 | Add poly_int64 streaming support | Jan Hubicka | 3 | -0/+22
2020-10-02  Jan Hubicka  <hubicka@ucw.cz>

    * data-streamer-in.c (streamer_read_poly_int64): New function.
    * data-streamer-out.c (streamer_write_poly_int64): New function.
    * data-streamer.h (streamer_write_poly_int64): Declare.
    (streamer_read_poly_int64): Declare.
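A poly_int64 is a small fixed-size array of 64-bit coefficients, so streaming it amounts to writing and reading each coefficient in order. The sketch below shows that round-trip against a flat byte buffer; it is a stand-in for GCC's actual streamer routines (which go through the LTO bitpack machinery), and the coefficient count of 2 is illustrative:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative coefficient count; in GCC this is NUM_POLY_INT_COEFFS
   and is target-dependent.  */
#define NUM_COEFFS 2

typedef struct
{
  int64_t coeffs[NUM_COEFFS];
} poly_int64_sketch;

/* Write all coefficients, in order, to BUF.  */
static void
stream_write_poly_int64 (unsigned char *buf, poly_int64_sketch v)
{
  memcpy (buf, v.coeffs, sizeof v.coeffs);
}

/* Read them back in the same order.  */
static poly_int64_sketch
stream_read_poly_int64 (const unsigned char *buf)
{
  poly_int64_sketch v;
  memcpy (v.coeffs, buf, sizeof v.coeffs);
  return v;
}
```

The invariant being tested is simply that a write followed by a read reproduces every coefficient, which is what the paired streamer_write_poly_int64/streamer_read_poly_int64 functions must guarantee.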
2020-10-02 | aarch64: Remove aarch64_sve_pred_dominates_p | Richard Sandiford | 4 | -162/+853
In r11-2922, Przemek fixed a post-RA instruction match failure caused by the SVE FP subtraction patterns. This patch applies the same fix to the other patterns.

To recap, the issue is around the handling of predication. We want to do two things:

- Optimise cases in which a predicate is known to be all-true.
- Differentiate cases in which the predicate on an _x ACLE function has to be kept as-is from cases in which we can make more lanes active. The former is true by default; the latter is true for certain combinations of flags in the -ffast-math group.

This is handled by a boolean flag in the unspecs to say whether the predicate is “strict” or “relaxed”. When combining multiple strict operations, the predicates used in the operations generally need to match. When combining multiple relaxed operations, we can ignore the predicates on nested operations and just use the predicate on the “outermost” operation.

Originally I'd tried to reduce combinatorial explosion by using aarch64_sve_pred_dominates_p. This required matching predicates for strict operations but allowed more combinations for relaxed operations. The problem (as I should have remembered) is that C conditions on insn patterns can't reliably enforce matching operands. If the same register is used in two different input operands, the RA is allowed to use different hard registers for those input operands (and sometimes it has to). So operands that match before RA might not match afterwards. The only sure way to force a match is via match_dup.

This patch splits the cases into two. I cry bitter tears at having to do this, but I think it's the only backportable fix. There might be some way of using define_subst to generate the cond_* patterns from the pred_* patterns, with some alternatives strategically disabled in each case, but that's future work and might not be an improvement.
Since so many patterns now do this, I moved the comments from the subtraction pattern to a new banner comment at the head of the file.

gcc/
    * config/aarch64/aarch64-protos.h (aarch64_sve_pred_dominates_p):
    Delete.
    * config/aarch64/aarch64.c (aarch64_sve_pred_dominates_p): Likewise.
    * config/aarch64/aarch64-sve.md: Add banner comment describing how
    merging predicated FP operations are represented.
    (*cond_<SVE_COND_FP_UNARY:optab><mode>_2): Split into...
    (*cond_<SVE_COND_FP_UNARY:optab><mode>_2_relaxed): ...this and...
    (*cond_<SVE_COND_FP_UNARY:optab><mode>_2_strict): ...this.
    (*cond_<SVE_COND_FP_UNARY:optab><mode>_any): Split into...
    (*cond_<SVE_COND_FP_UNARY:optab><mode>_any_relaxed): ...this and...
    (*cond_<SVE_COND_FP_UNARY:optab><mode>_any_strict): ...this.
    (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): Split into...
    (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2_relaxed): ...this and...
    (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2_strict): ...this.
    (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Split into...
    (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any_relaxed): ...this and...
    (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any_strict): ...this.
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_2): Split into...
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_2_relaxed): ...this and...
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_2_strict): ...this.
    (*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const): Split into...
    (*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const_relaxed): ...this
    and...
    (*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const_strict): ...this.
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_3): Split into...
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_3_relaxed): ...this and...
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_3_strict): ...this.
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_any): Split into...
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_any_relaxed): ...this and...
    (*cond_<SVE_COND_FP_BINARY:optab><mode>_any_strict): ...this.
    (*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const): Split into...
    (*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const_relaxed): ...this
    and...
    (*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const_strict): ...this.
    (*cond_add<mode>_2_const): Split into...
    (*cond_add<mode>_2_const_relaxed): ...this and...
    (*cond_add<mode>_2_const_strict): ...this.
    (*cond_add<mode>_any_const): Split into...
    (*cond_add<mode>_any_const_relaxed): ...this and...
    (*cond_add<mode>_any_const_strict): ...this.
    (*cond_<SVE_COND_FCADD:optab><mode>_2): Split into...
    (*cond_<SVE_COND_FCADD:optab><mode>_2_relaxed): ...this and...
    (*cond_<SVE_COND_FCADD:optab><mode>_2_strict): ...this.
    (*cond_<SVE_COND_FCADD:optab><mode>_any): Split into...
    (*cond_<SVE_COND_FCADD:optab><mode>_any_relaxed): ...this and...
    (*cond_<SVE_COND_FCADD:optab><mode>_any_strict): ...this.
    (*cond_sub<mode>_3_const): Split into...
    (*cond_sub<mode>_3_const_relaxed): ...this and...
    (*cond_sub<mode>_3_const_strict): ...this.
    (*aarch64_pred_abd<mode>): Split into...
    (*aarch64_pred_abd<mode>_relaxed): ...this and...
    (*aarch64_pred_abd<mode>_strict): ...this.
    (*aarch64_cond_abd<mode>_2): Split into...
    (*aarch64_cond_abd<mode>_2_relaxed): ...this and...
    (*aarch64_cond_abd<mode>_2_strict): ...this.
    (*aarch64_cond_abd<mode>_3): Split into...
    (*aarch64_cond_abd<mode>_3_relaxed): ...this and...
    (*aarch64_cond_abd<mode>_3_strict): ...this.
    (*aarch64_cond_abd<mode>_any): Split into...
    (*aarch64_cond_abd<mode>_any_relaxed): ...this and...
    (*aarch64_cond_abd<mode>_any_strict): ...this.
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_2): Split into...
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_2_relaxed): ...this and...
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_2_strict): ...this.
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_4): Split into...
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_4_relaxed): ...this and...
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_4_strict): ...this.
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_any): Split into...
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_any_relaxed): ...this and...
    (*cond_<SVE_COND_FP_TERNARY:optab><mode>_any_strict): ...this.
    (*cond_<SVE_COND_FCMLA:optab><mode>_4): Split into...
    (*cond_<SVE_COND_FCMLA:optab><mode>_4_relaxed): ...this and...
    (*cond_<SVE_COND_FCMLA:optab><mode>_4_strict): ...this.
    (*cond_<SVE_COND_FCMLA:optab><mode>_any): Split into...
    (*cond_<SVE_COND_FCMLA:optab><mode>_any_relaxed): ...this and...
    (*cond_<SVE_COND_FCMLA:optab><mode>_any_strict): ...this.
    (*aarch64_pred_fac<cmp_op><mode>): Split into...
    (*aarch64_pred_fac<cmp_op><mode>_relaxed): ...this and...
    (*aarch64_pred_fac<cmp_op><mode>_strict): ...this.
    (*cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>): Split
    into...
    (*cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>_relaxed):
    ...this and...
    (*cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>_strict):
    ...this.
    (*cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>): Split
    into...
    (*cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>_relaxed):
    ...this and...
    (*cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>_strict):
    ...this.
    * config/aarch64/aarch64-sve2.md
    (*cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>): Split into...
    (*cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>_relaxed): ...this and...
    (*cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>_strict): ...this.
    (*cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any): Split into...
    (*cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any_relaxed): ...this
    and...
    (*cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any_strict): ...this.
    (*cond_<SVE2_COND_INT_UNARY_FP:optab><mode>): Split into...
    (*cond_<SVE2_COND_INT_UNARY_FP:optab><mode>_relaxed): ...this and...
    (*cond_<SVE2_COND_INT_UNARY_FP:optab><mode>_strict): ...this.
2020-10-02 | arm: Make more use of the new mode macros | Richard Sandiford | 2 | -43/+29
As Christophe pointed out, my r11-3522 patch didn't in fact fix all of the armv8_2-fp16-arith-2.c failures introduced by allowing FP16 vectorisation without -funsafe-math-optimizations. I must have only tested the final patch on my usual arm-linux-gnueabihf bootstrap, which it turns out treats the test as unsupported.

The focus of the original patch was to use mode macros for patterns that are shared between Advanced SIMD, iwMMXt and MVE. This patch uses the mode macros for general neon.md patterns too.

gcc/
    * config/arm/neon.md (*sub<VDQ:mode>3_neon): Use the new mode macros
    for the insn condition.
    (sub<VH:mode>3, *mul<VDQW:mode>3_neon): Likewise.
    (mul<VDQW:mode>3add<VDQW:mode>_neon): Likewise.
    (mul<VH:mode>3add<VH:mode>_neon): Likewise.
    (mul<VDQW:mode>3neg<VDQW:mode>add<VDQW:mode>_neon): Likewise.
    (fma<VCVTF:mode>4, fma<VH:mode>4, *fmsub<VCVTF:mode>4): Likewise.
    (quad_halves_<code>v4sf, reduc_plus_scal_<VD:mode>): Likewise.
    (reduc_plus_scal_<VQ:mode>, reduc_smin_scal_<VD:mode>): Likewise.
    (reduc_smin_scal_<VQ:mode>, reduc_smax_scal_<VD:mode>): Likewise.
    (reduc_smax_scal_<VQ:mode>, mul<VH:mode>3): Likewise.
    (neon_vabd<VF:mode>_2, neon_vabd<VF:mode>_3): Likewise.
    (fma<VH:mode>4_intrinsic): Delete.
    (neon_vadd<VCVTF:mode>): Use the new mode macros to decide which form
    of instruction to generate.
    (neon_vmla<VDQW:mode>, neon_vmls<VDQW:mode>): Likewise.
    (neon_vsub<VCVTF:mode>): Likewise.
    (neon_vfma<VH:mode>): Generate the main fma<mode>4 form instead of
    using fma<mode>4_intrinsic.

gcc/testsuite/
    * gcc.target/arm/armv8_2-fp16-arith-2.c (float16_t): Use _Float16_t
    rather than __fp16.
    (float16x4_t, float16x4_t): Likewise.
    (fp16_abs): Use __builtin_fabsf16.
2020-10-02 | aarch64: ilp32 testsuite fixes | Alex Coplan | 2 | -3/+12
This fixes test failures on ilp32 introduced in r11-3032-gd4febc75e8dfab23bd3132d5747eded918f85107. The assembler checks in extend-syntax.c simply needed adjusting for 32-bit pointers. It appears the subsp.c test has never passed on ILP32 due to a missed optimisation there. Since this isn't a code quality regression, disable that check on ILP32.

gcc/testsuite/ChangeLog:

    * gcc.target/aarch64/extend-syntax.c: Fix assembler checks for ilp32,
    disable check-function-bodies on ilp32.
    * gcc.target/aarch64/subsp.c: Only check second scan-assembler on
    lp64, since the code on ilp32 is missing the optimization needed for
    this test to pass.
2020-10-02 | GCOV: do not mangle .gcno files. | Martin Liska | 1 | -3/+5
gcc/ChangeLog:

PR gcov-profile/97193
    * coverage.c (coverage_init): GCDA note files should not be mangled
    and should end in output directory.
2020-10-02 | c++: Set CALL_FROM_NEW_OR_DELETE_P on more calls. | Jason Merrill | 8 | -31/+33
We were failing to set the flag on a delete call in a new expression, in a deleting destructor, and in a coroutine. Fixed by setting it in the function that builds the call.

2020-10-02  Jason Merrill  <jason@redhat.com>

gcc/cp/ChangeLog:

    * call.c (build_operator_new_call): Set CALL_FROM_NEW_OR_DELETE_P.
    (build_op_delete_call): Likewise.
    * init.c (build_new_1, build_vec_delete_1, build_delete): Not here.
    (build_delete):

gcc/ChangeLog:

    * gimple.h (gimple_call_operator_delete_p): Rename from
    gimple_call_replaceable_operator_delete_p.
    * gimple.c (gimple_call_operator_delete_p): Likewise.
    * tree.h (DECL_IS_REPLACEABLE_OPERATOR_DELETE_P): Remove.
    * tree-ssa-dce.c (mark_all_reaching_defs_necessary_1): Adjust.
    (propagate_necessity): Likewise.
    (eliminate_unnecessary_stmts): Likewise.
    * tree-ssa-structalias.c (find_func_aliases_for_call): Likewise.

gcc/testsuite/ChangeLog:

    * g++.dg/pr94314.C: new/delete no longer omitted.
2020-10-02make use of CALL_FROM_NEW_OR_DELETE_PRichard Biener6-14/+78
This fixes points-to analysis and DCE to only consider new/delete operator calls from new or delete expressions and not direct calls. 2020-10-01 Richard Biener <rguenther@suse.de> * gimple.h (GF_CALL_FROM_NEW_OR_DELETE): New call flag. (gimple_call_set_from_new_or_delete): New. (gimple_call_from_new_or_delete): Likewise. * gimple.c (gimple_build_call_from_tree): Set GF_CALL_FROM_NEW_OR_DELETE appropriately. * ipa-icf-gimple.c (func_checker::compare_gimple_call): Compare gimple_call_from_new_or_delete. * tree-ssa-dce.c (mark_all_reaching_defs_necessary_1): Make sure to only consider new/delete calls from new or delete expressions. (propagate_necessity): Likewise. (eliminate_unnecessary_stmts): Likewise. * tree-ssa-structalias.c (find_func_aliases_for_call): Likewise. * g++.dg/tree-ssa/pta-delete-1.C: New testcase.
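The distinction the flag draws can be sketched with a small editorial illustration (not one of the commit's testcases; the function names are made up for the example): a new/delete *expression* is marked CALL_FROM_NEW_OR_DELETE, so DCE may elide a dead allocation/deallocation pair, while *direct* calls to the allocation functions are now treated like ordinary calls.

```cpp
#include <new>

// A new *expression*: the underlying operator new/delete calls carry the
// GF_CALL_FROM_NEW_OR_DELETE flag, so a dead pair may be elided by DCE.
int via_new_expression() {
  int *p = new int(42);
  int v = *p;
  delete p;
  return v;
}

// Direct calls to the allocation functions: after this change, points-to
// analysis and DCE treat these as ordinary calls and must not elide them.
int via_direct_calls() {
  void *q = ::operator new(sizeof(int));
  int *p = ::new (q) int(7);  // placement new to construct in the storage
  int v = *p;
  ::operator delete(q);
  return v;
}
```

Either way the observable value is the same; only the optimizer's licence to remove the calls differs.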
2020-10-02c++: Move CALL_FROM_NEW_OR_DELETE_P to tree.hJason Merrill7-11/+21
As discussed with richi, we should be able to use TREE_PROTECTED for this flag, since CALL_FROM_THUNK_P will never be set on a call to an operator new or delete. 2020-10-01 Jason Merrill <jason@redhat.com> gcc/cp/ChangeLog: * lambda.c (call_from_lambda_thunk_p): New. * cp-gimplify.c (cp_genericize_r): Use it. * pt.c (tsubst_copy_and_build): Use it. * typeck.c (check_return_expr): Use it. * cp-tree.h: Declare it. (CALL_FROM_NEW_OR_DELETE_P): Move to gcc/tree.h. gcc/ChangeLog: * tree.h (CALL_FROM_NEW_OR_DELETE_P): Move from cp-tree.h. * tree-core.h: Document new usage of protected_flag.
2020-10-02Implement irange::fits_p.Aldy Hernandez1-0/+1
This should have been included in the irange_allocator patch, as a method to see if the current object can hold a passed range without truncation. gcc/ChangeLog: * value-range.h (irange::fits_p): New.
2020-10-02Daily bump.GCC Administrator4-1/+373
2020-10-01compiler: set varargs correctly for type of method expressionIan Lance Taylor2-3/+7
Fixes golang/go#41737 Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/258977
2020-10-02[RS6000] ICE in decompose, at rtl.h:2282Alan Modra1-1/+1
during RTL pass: fwprop1 gcc.dg/pr82596.c: In function 'test_cststring': gcc.dg/pr82596.c:27:1: internal compiler error: in decompose, at rtl.h:2282 -m32 gcc/testsuite/gcc.dg/pr82596.c fails along with other tests after applying rtx_cost patches, which exposed a backend bug. legitimize_address when presented with the following address (plus (reg) (const_int 0x7fffffff)) attempts to rewrite it as a high/low sum. The low part is 0xffff, or -1, making the high part 0x80000000. But this is no longer canonical for SImode. * config/rs6000/rs6000.c (rs6000_legitimize_address): Use gen_int_mode for high part of address constant.
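The arithmetic behind the high/low split can be shown with a small stand-alone sketch (an editorial illustration, not backend code; it relies on GCC's two's-complement narrowing casts):

```cpp
#include <cstdint>

// Split a 32-bit address constant the way the rs6000 backend does: the
// low part is the sign-extended bottom 16 bits, the high part is the
// remainder.  For 0x7fffffff the low part is -1, so the high part is
// 0x80000000 -- which must be produced via gen_int_mode, i.e. read as
// the SImode value INT32_MIN, not as the 64-bit constant 0x80000000.
struct hi_lo { int32_t hi, lo; };

hi_lo split_address(uint32_t c) {
  int32_t lo = (int16_t)(c & 0xffff);       // sign-extend low 16 bits
  int32_t hi = (int32_t)(c - (uint32_t)lo); // what the "high" insn loads
  return {hi, lo};
}
```

Adding the two parts back together modulo 2^32 recovers the original constant.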
2020-10-02[RS6000] rs6000_linux64_override_options fixAlan Modra1-13/+7
Commit c6be439b37 wrongly left a block of code inside an "else" block, which changed the default for power10 TARGET_NO_FP_IN_TOC accidentally. We don't want FP constants in the TOC when -mcmodel=medium can address them just as efficiently outside the TOC. * config/rs6000/rs6000.c (rs6000_linux64_override_options): Formatting. Correct setting of TARGET_NO_FP_IN_TOC and TARGET_NO_SUM_IN_TOC.
2020-10-02[RS6000] function for linux64 SUBSUBTARGET_OVERRIDE_OPTIONSAlan Modra3-152/+98
* config/rs6000/freebsd64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Use rs6000_linux64_override_options. * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Break out to.. * config/rs6000/rs6000.c (rs6000_linux64_override_options): ..this, new function. Tweak non-biarch test and clearing of profile_kernel to work with freebsd64.h.
2020-10-01c++: Kill DECL_HIDDEN_PNathan Sidwell2-22/+3
There are only a couple of asserts remaining using this macro, and nothing using TYPE_HIDDEN_P. Killed thusly. gcc/cp/ * cp-tree.h (DECL_ANTICIPATED): Adjust comment. (DECL_HIDDEN_P, TYPE_HIDDEN_P): Delete. * tree.c (ovl_insert): Delete DECL_HIDDEN_P assert. (ovl_skip_hidden): Likewise.
2020-10-01Fix build of ppc64 target.Martin Liska2-0/+2
Since a889e06ac68 the following fails. In file included from ../../gcc/tree-ssa-propagate.h:25:0, from ../../gcc/config/rs6000/rs6000.c:78: ../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared virtual bool range_of_expr (irange &r, tree name, gimple * = NULL) = 0; ^~~~~~ ../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared virtual bool range_on_edge (irange &r, edge, tree name); ^~~~~~ ../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared virtual bool range_of_stmt (irange &r, gimple *, tree name = NULL); ^~~~~~ In file included from ../../gcc/tree-ssa-propagate.h:25:0, from ../../gcc/config/rs6000/rs6000-call.c:67: ../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared virtual bool range_of_expr (irange &r, tree name, gimple * = NULL) = 0; ^~~~~~ ../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared virtual bool range_on_edge (irange &r, edge, tree name); ^~~~~~ ../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared virtual bool range_of_stmt (irange &r, gimple *, tree name = NULL); gcc/ChangeLog: * config/rs6000/rs6000-call.c: Include value-range.h. * config/rs6000/rs6000.c: Likewise.
2020-10-01[nvptx] Emit mov.u32 instead of cvt.u32.u32 for truncsiqi2Tom de Vries1-3/+7
When running: ... $ gcc.sh src/gcc/testsuite/gcc.target/nvptx/abi-complex-arg.c -S -dP ... we have in abi-complex-arg.s: ... //(insn 3 5 4 2 // (set // (reg:QI 23) // (truncate:QI (reg:SI 22))) "abi-complex-arg.c":38:1 29 {truncsiqi2} // (nil)) cvt.u32.u32 %r23, %r22; // 3 [c=4] truncsiqi2/0 ... The cvt.u32.u32 can be written shorter and clearer as mov.u32. Fix this in define_insn "truncsi<QHIM>2". Tested on nvptx. gcc/ChangeLog: 2020-10-01 Tom de Vries <tdevries@suse.de> PR target/80845 * config/nvptx/nvptx.md (define_insn "truncsi<QHIM>2"): Emit mov.u32 instead of cvt.u32.u32.
2020-10-01arm: Add missing vec_cmp and vcond patternsRichard Sandiford12-221/+442
This patch does several things at once: (1) Add vector compare patterns (vec_cmp and vec_cmpu). (2) Add vector selects between floating-point modes when the values being compared are integers (affects vcond and vcondu). (3) Add vector selects between integer modes when the values being compared are floating-point (affects vcond). (4) Add standalone vector select patterns (vcond_mask). (5) Tweak the handling of compound comparisons with zeros. Unfortunately it proved too difficult (for me) to separate this out into a series of smaller patches, since everything is so inter-related. Defining only some of the new patterns does not leave things in a happy state. The handling of comparisons is mostly taken from the vcond patterns. This means that it remains non-compliant with IEEE: “quiet” comparisons use signalling instructions. But that shouldn't matter for floats, since we require -funsafe-math-optimizations to vectorize for them anyway. It remains the case that comparisons and selects aren't implemented at all for HF vectors. Implementing those feels like separate work. gcc/ PR target/96528 PR target/97288 * config/arm/arm-protos.h (arm_expand_vector_compare): Declare. (arm_expand_vcond): Likewise. * config/arm/arm.c (arm_expand_vector_compare): New function. (arm_expand_vcond): Likewise. * config/arm/neon.md (vec_cmp<VDQW:mode><v_cmp_result>): New pattern. (vec_cmpu<VDQW:mode><VDQW:mode>): Likewise. (vcond<VDQW:mode><VDQW:mode>): Require operand 5 to be a register or zero. Use arm_expand_vcond. (vcond<V_cvtto><V32:mode>): New pattern. (vcondu<VDQIW:mode><VDQIW:mode>): Generalize to... (vcondu<VDQW:mode><v_cmp_result>): ...this. Require operand 5 to be a register or zero. Use arm_expand_vcond. (vcond_mask_<VDQW:mode><v_cmp_result>): New pattern. (neon_vc<cmp_op><mode>, neon_vc<cmp_op><mode>_insn): Add "@" marker. (neon_vbsl<mode>): Likewise. (neon_vc<cmp_op>u<mode>): Reexpress as... (@neon_vc<code><mode>): ...this.
gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_cond_mixed): Add arm neon targets. * gcc.target/arm/neon-compare-1.c: New test. * gcc.target/arm/neon-compare-2.c: Likewise. * gcc.target/arm/neon-compare-3.c: Likewise. * gcc.target/arm/neon-compare-4.c: Likewise. * gcc.target/arm/neon-compare-5.c: Likewise. * gcc.target/arm/neon-vcond-gt.c: Expect comparisons with zero. * gcc.target/arm/neon-vcond-ltgt.c: Likewise. * gcc.target/arm/neon-vcond-unordered.c: Likewise.
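What the vec_cmp and vcond_mask patterns compute can be mimicked with GCC's generic vector extension (an editorial illustration of the midend semantics, not the NEON code the patterns actually emit; requires a compiler supporting `vector_size`):

```cpp
typedef int v4si __attribute__((vector_size(16)));

// vec_cmp produces a per-lane all-ones/all-zeros mask; vcond_mask then
// selects between the two inputs lane by lane under that mask.
v4si select_max(v4si a, v4si b) {
  v4si mask = a > b;               // vec_cmp: -1 where a[i] > b[i], else 0
  return (a & mask) | (b & ~mask); // vcond_mask: mask ? a : b per lane
}

// Helper so the result lanes can be inspected from scalar code.
int demo_nth(int i) {
  v4si a = {1, 5, 3, 7}, b = {4, 2, 3, 8};
  return select_max(a, b)[i];
}
```

A vcond pattern is just this compare-plus-select fused into one expansion, which is why the standalone vcond_mask pattern falls out almost for free.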
2020-10-01aarch64: Restrict asm-matching tests to lp64Richard Sandiford2-2/+2
gcc/testsuite/ * gcc.target/aarch64/movtf_1.c: Restrict the asm matching to lp64. * gcc.target/aarch64/movti_1.c: Likewise.
2020-10-01arm: Fix testcase selection for Low Overhead Loop tests [PR96375]Andrea Corallo7-8/+8
gcc/testsuite/ PR target/96375 * gcc.target/arm/lob1.c: Fix missing flag. * gcc.target/arm/lob2.c: Likewise. * gcc.target/arm/lob3.c: Likewise. * gcc.target/arm/lob4.c: Likewise. * gcc.target/arm/lob5.c: Likewise. * gcc.target/arm/lob6.c: Likewise. * lib/target-supports.exp (check_effective_target_arm_v8_1_lob_ok): Return 1 only for cortex-m targets, add '-mthumb' flag.
2020-10-01config/i386/t-rtems: Change from mtune to march for multilibsMichael Davidsaver1-4/+4
* config/i386/t-rtems: Change from mtune to march when building multilibs. The mtune argument tunes or optimizes for a specific CPU model but does not ensure the generated code is appropriate for the CPU model. Prior to this patch, i386 compatible code was always generated but tuned for later models.
2020-10-01Convert sprintf/strlen passes to value query class.Aldy Hernandez5-178/+186
gcc/ChangeLog: * builtins.c (compute_objsize): Replace vr_values with range_query. (get_range): Same. (gimple_call_alloc_size): Same. * builtins.h (class vr_values): Remove. (gimple_call_alloc_size): Replace vr_values with range_query. * gimple-ssa-sprintf.c (get_int_range): Same. (struct directive): Pass gimple context to fmtfunc callback. (directive::set_width): Replace inline with out-of-line version. (directive::set_precision): Same. (format_none): New gimple argument. (format_percent): New gimple argument. (format_integer): New gimple argument. (format_floating): New gimple argument. (get_string_length): Use range_query API. (format_character): New gimple argument. (format_string): New gimple argument. (format_plain): New gimple argument. (format_directive): New gimple argument. (parse_directive): Replace vr_values with range_query. (compute_format_length): Same. (handle_printf_call): Same. Adjust for range_query API. * tree-ssa-strlen.c (get_range): Same. (compare_nonzero_chars): Same. (get_addr_stridx) Replace vr_values with range_query. (get_stridx): Same. (dump_strlen_info): Same. (get_range_strlen_dynamic): Adjust for range_query API. (set_strlen_range): Same (maybe_warn_overflow): Replace vr_values with range_query. (handle_builtin_strcpy): Same. (maybe_diag_stxncpy_trunc): Add FIXME comment. (handle_builtin_memcpy): Replace vr_values with range_query. (handle_builtin_memset): Same. (get_len_or_size): Same. (strxcmp_eqz_result): Same. (handle_builtin_string_cmp): Same. (count_nonzero_bytes_addr): Same, plus adjust for range_query API. (count_nonzero_bytes): Replace vr_values with range_query. (handle_store): Same. (strlen_check_and_optimize_call): Same. (handle_integral_assign): Same. (check_and_optimize_stmt): Same. * tree-ssa-strlen.h (class vr_values): Remove. (get_range): Replace vr_values with range_query. (get_range_strlen_dynamic): Same. (handle_printf_call): Same.
2020-10-01Convert vr-values to value query class.Aldy Hernandez14-146/+139
gcc/ChangeLog: * gimple-loop-versioning.cc (lv_dom_walker::before_dom_children): Pass m_range_analyzer instead of get_vr_values. (loop_versioning::name_prop::get_value): Rename to... (loop_versioning::name_prop::value_of_expr): ...this. * gimple-ssa-evrp-analyze.c (evrp_range_analyzer::evrp_range_analyzer): Adjust for evrp_range_analyzer inheriting from vr_values. (evrp_range_analyzer::try_find_new_range): Same. (evrp_range_analyzer::record_ranges_from_incoming_edge): Same. (evrp_range_analyzer::record_ranges_from_phis): Same. (evrp_range_analyzer::record_ranges_from_stmt): Same. (evrp_range_analyzer::push_value_range): Same. (evrp_range_analyzer::pop_value_range): Same. * gimple-ssa-evrp-analyze.h (class evrp_range_analyzer): Inherit from vr_values. Adjust accordingly. * gimple-ssa-evrp.c: Adjust for evrp_range_analyzer inheriting from vr_values. (evrp_folder::value_of_evrp): Rename from get_value. * tree-ssa-ccp.c (class ccp_folder): Rename get_value to value_of_expr. (ccp_folder::get_value): Rename to... (ccp_folder::value_of_expr): ...this. * tree-ssa-copy.c (class copy_folder): Rename get_value to value_of_expr. (copy_folder::get_value): Rename to... (copy_folder::value_of_expr): ...this. * tree-ssa-dom.c (dom_opt_dom_walker::after_dom_children): Adjust for evrp_range_analyzer inheriting from vr_values. (dom_opt_dom_walker::optimize_stmt): Same. * tree-ssa-propagate.c (substitute_and_fold_engine::replace_uses_in): Call value_of_* instead of get_value. (substitute_and_fold_engine::replace_phi_args_in): Same. (substitute_and_fold_engine::propagate_into_phi_args): Same. (substitute_and_fold_dom_walker::before_dom_children): Same. * tree-ssa-propagate.h: Include value-query.h. (class substitute_and_fold_engine): Inherit from value_query. * tree-ssa-strlen.c (strlen_dom_walker::before_dom_children): Adjust for evrp_range_analyzer inheriting from vr_values. * tree-ssa-threadedge.c (record_temporary_equivalences_from_phis): Same. 
* tree-vrp.c (class vrp_folder): Same. (vrp_folder::get_value): Rename to value_of_expr. * vr-values.c (vr_values::get_lattice_entry): Adjust for vr_values inheriting from range_query. (vr_values::range_of_expr): New. (vr_values::value_of_expr): New. (vr_values::value_on_edge): New. (vr_values::value_of_stmt): New. (simplify_using_ranges::op_with_boolean_value_range_p): Call get_value_range through query. (check_for_binary_op_overflow): Rename store to query. (vr_values::vr_values): Remove vrp_value_range_pool. (vr_values::~vr_values): Same. (simplify_using_ranges::get_vr_for_comparison): Call get_value_range through query. (simplify_using_ranges::compare_names): Same. (simplify_using_ranges::vrp_evaluate_conditional): Same. (simplify_using_ranges::vrp_visit_cond_stmt): Same. (simplify_using_ranges::simplify_abs_using_ranges): Same. (simplify_using_ranges::simplify_cond_using_ranges_1): Same. (simplify_cond_using_ranges_2): Same. (simplify_using_ranges::simplify_switch_using_ranges): Same. (simplify_using_ranges::two_valued_val_range_p): Same. (simplify_using_ranges::simplify_using_ranges): Rename store to query. (simplify_using_ranges::simplify): Assert that we have a query. * vr-values.h (class range_query): Remove. (class simplify_using_ranges): Remove inheritance of range_query. (class vr_values): Add virtuals for range_of_expr, value_of_expr, value_on_edge, value_of_stmt, and get_value_range. Call range_query allocator instead of using vrp_value_range_pool. Remove vrp_value_range_pool. (simplify_using_ranges::get_value_range): Remove.
2020-10-01tree-optimization/97236 - fix bad use of VMAT_CONTIGUOUSRichard Biener2-11/+52
This avoids using VMAT_CONTIGUOUS with single-element interleaving when using V1mode vectors. Instead keep VMAT_ELEMENTWISE but continue to avoid load-lanes and gathers. 2020-10-01 Richard Biener <rguenther@suse.de> PR tree-optimization/97236 * tree-vect-stmts.c (get_group_load_store_type): Keep VMAT_ELEMENTWISE for single-element vectors. * gcc.dg/vect/pr97236.c: New testcase.
2020-10-01c++: pushdecl_top_level must set contextNathan Sidwell3-1/+6
I discovered pushdecl_top_level was not setting the decl's context, and we ended up with namespace-scope decls with NULL context. That broke modules. Then I discovered a couple of places where we set the context to a FUNCTION_DECL, which is also wrong. AFAICT the literals in question belong in global scope, as they're comdatable entities. But create_temporary would use current_scope for the context before we pushed it into namespace scope. This patch asserts the context is NULL and then sets it to the frobbed global_namespace. gcc/cp/ * name-lookup.c (pushdecl_top_level): Assert incoming context is null, add global_namespace context. (pushdecl_top_level_and_finish): Likewise. * pt.c (get_template_parm_object): Clear decl context before pushing. * semantics.c (finish_compound_literal): Likewise.
2020-10-01Add gcc.c-torture/compile/pr97243.c testcase.Jan Hubicka1-0/+10
PR ipa/97243 * gcc.c-torture/compile/pr97243.c: New test.
2020-10-01Fix ICE in compute_parm_mapJan Hubicka1-1/+1
gcc/ChangeLog: * ipa-modref.c (compute_parm_map): Be ready for callee_pi to be NULL.
2020-10-01Add -fno-ipa-modref to gcc.dg/ipa/remref-2a.cJan Hubicka1-1/+1
PR ipa/97244 * gcc.dg/ipa/remref-2a.c: Add -fno-ipa-modref
2020-10-01Fix ICE in ipa_edge_args_sum_t::duplicateJan Hubicka3-4/+6
PR ipa/97244 * ipa-fnsummary.c (pass_free_fnsummary::execute): Free also indirect inlining datastructure. * ipa-modref.c (pass_ipa_modref::execute): Do not free them here. * ipa-prop.c (ipa_free_all_node_params): Do not crash when info does not exist. (ipa_unregister_cgraph_hooks): Likewise.
2020-10-01Fix handling of fnspec for internal functions.Jan Hubicka1-1/+1
* internal-fn.c (DEF_INTERNAL_FN): Fix handling of fnspec
2020-10-01Initial implementation of value query class.Aldy Hernandez3-0/+270
gcc/ChangeLog: * Makefile.in: Add value-query.o. * value-query.cc: New file. * value-query.h: New file.
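The shape of the new interface is visible in the build-fix errors quoted earlier (virtual range_of_expr/range_on_edge/range_of_stmt taking irange&, tree and gimple* arguments). A stand-in sketch with toy types, purely for illustration:

```cpp
// Toy model of the value-query idea: clients ask an abstract query object
// for the value/range of an expression instead of reaching into one
// particular propagation engine's lattice.  Types are stand-ins; the real
// class works on irange&, tree and gimple*.
struct irange_sketch { long lo, hi; };

struct value_query_sketch {
  virtual bool range_of_expr(irange_sketch &r, int name) = 0;
  virtual ~value_query_sketch() {}
};

// A trivial engine: every name is known to lie in [0, name].
struct toy_query : value_query_sketch {
  bool range_of_expr(irange_sketch &r, int name) override {
    r.lo = 0;
    r.hi = name;
    return true;
  }
};

long toy_upper_bound(int name) {
  toy_query q;
  irange_sketch r{};
  return q.range_of_expr(r, name) ? r.hi : -1;
}
```

Passes such as sprintf/strlen and vr-values (converted in the commits above) can then be handed any engine implementing this interface.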
2020-10-01c++: Refactor lookup_and_check_tagNathan Sidwell1-57/+59
It turns out I'd already found lookup_and_check_tag's control flow confusing, and had refactored it on the modules branch. For instance, it continually checks 'if (decl && condition)' before finally getting to 'else if (!decl)'. Why not just check !decl first and be done? Well, it is done thusly. gcc/cp/ * decl.c (lookup_and_check_tag): Refactor.
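The control-flow cleanup described above can be sketched in miniature (a made-up example, not the actual lookup_and_check_tag code):

```cpp
// Before: every branch re-tests `decl` before the !decl case is reached.
int before(int decl, bool cond) {
  if (decl && cond)
    return 1;
  if (decl && !cond)
    return 2;
  else if (!decl)
    return 0;
  return -1;  // unreachable
}

// After: check !decl first and be done; the remaining branches no longer
// need to guard on decl.
int after(int decl, bool cond) {
  if (!decl)
    return 0;
  return cond ? 1 : 2;
}
```

Both versions compute the same result; the second just makes the early-out explicit.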
2020-10-01arm: Fix ordering in arm-cpus.inAlex Coplan3-16/+17
This moves the recent entry for Neoverse N2 down and adds a comment in order to preserve the existing order/structure in arm-cpus.in. gcc/ChangeLog: * config/arm/arm-cpus.in: Fix ordering, move Neoverse N2 down. * config/arm/arm-tables.opt: Regenerate. * config/arm/arm-tune.md: Regenerate.
2020-10-01[testsuite] Enable pr94600-{1,3}.c tests for nvptxTom de Vries2-6/+16
When compiling test-case pr94600-1.c for nvptx, this gimple mem move: ... MEM[(volatile struct t0 *)655404B] ={v} a0[0]; ... is expanded into a memcpy, but when compiling pr94600-2.c instead, this similar gimple mem move: ... MEM[(volatile struct t0 *)655404B] ={v} a00; ... is expanded into a 32-bit load/store pair. In both cases, emit_block_move is called. In the latter case, can_move_by_pieces (4 /* byte-size */, 32 /* bit-align */) is called, which returns true (because by_pieces_ninsns returns 1, which is smaller than the MOVE_RATIO of 4). In the former case, can_move_by_pieces (4 /* byte-size */, 8 /* bit-align */) is called, which returns false (because by_pieces_ninsns returns 4, which is not smaller than the MOVE_RATIO of 4). So the difference in code generation is explained by the alignment. The difference in alignment comes from the move sources: a0[0] vs. a00. Both have the same type with 8-bit alignment, but a00 is on stack, which based on the base stack align and stack variable placement happens to result in a 32-bit alignment. Enable test-cases pr94600-{1,3}.c for nvptx by forcing the currently 8-byte aligned variables to have a 32-bit alignment for STRICT_ALIGNMENT targets. Tested on nvptx. gcc/testsuite/ChangeLog: 2020-10-01 Tom de Vries <tdevries@suse.de> * gcc.dg/pr94600-1.c: Force 32-bit alignment for a0 for !non_strict_align targets. Remove target clauses from scan tests. * gcc.dg/pr94600-3.c: Same.
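The alignment fix can be illustrated with a simplified stand-in for the test's struct (the field layout here is an assumption, not the exact pr94600 type): without the attribute the type is only byte-aligned, so by_pieces needs four byte moves; forcing 32-bit alignment lets the 4-byte move be a single 32-bit load/store.

```cpp
// Byte-aligned aggregate: a 4-byte copy must be done one byte at a time
// on strict-alignment targets such as nvptx.
struct t0_plain { unsigned char b[4]; };

// Forcing 32-bit alignment (what the test now does for strict-alignment
// targets) makes the same copy a single 32-bit access.
struct t0_forced { unsigned char b[4]; } __attribute__((aligned(4)));
```

With MOVE_RATIO of 4, by_pieces_ninsns returns 4 for the plain type (not smaller than the ratio, so can_move_by_pieces fails) but 1 for the aligned one.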
2020-10-01c++: Fix up default initialization with consteval default ctor [PR96994]Jakub Jelinek2-0/+28
> > The following testcase is miscompiled (in particular the a and i > > initialization). The problem is that build_special_member_call due to > > the immediate constructors (but not evaluated in constant expression mode) > > doesn't create a CALL_EXPR, but returns a TARGET_EXPR with CONSTRUCTOR > > as the initializer for it, > > That seems like the bug; at the end of build_over_call, after you > > > call = cxx_constant_value (call, obj_arg); > > You need to build an INIT_EXPR if obj_arg isn't a dummy. That works. obj_arg is NULL if it is a dummy from the earlier code. 2020-10-01 Jakub Jelinek <jakub@redhat.com> PR c++/96994 * call.c (build_over_call): If obj_arg is non-NULL, return INIT_EXPR setting obj_arg to call. * g++.dg/cpp2a/consteval18.C: New test.
2020-10-01c++: Handle std::construct_at on automatic vars during constant evaluation [PR97195]Jakub Jelinek2-1/+75
As mentioned in the PR, due to a bug we only support std::construct_at in constant expressions on non-automatic variables, because we VERIFY_CONSTANT the second argument of placement new, which fails verification if it is an address of an automatic variable. The following patch fixes it by not performing that verification; the placement new evaluation later on will verify it after it is dereferenced. 2020-10-01 Jakub Jelinek <jakub@redhat.com> PR c++/97195 * constexpr.c (cxx_eval_call_expression): Don't VERIFY_CONSTANT the second argument. * g++.dg/cpp2a/constexpr-new14.C: New test.