path: root/gcc
Age Commit message Author Files Lines
2021-07-27Let -Wuninitialized assume built-ins don't change const arguments [PR101584].Martin Sebor3-37/+190
PR tree-optimization/101584 - missing -Wuninitialized with an allocated object after a built-in call gcc/ChangeLog: PR tree-optimization/101584 * tree-ssa-uninit.c (builtin_call_nomodifying_p): New function. (check_defs): Call it. gcc/testsuite/ChangeLog: PR tree-optimization/101584 * gcc.dg/uninit-38.c: Remove assertions. * gcc.dg/uninit-41.c: New test.
2021-07-27testsuite: Add missing C++ includes to tests [PR101646]Jonathan Wakely2-0/+2
These tests stopped working after some libstdc++ refactoring, because they aren't including what they use. gcc/testsuite/ChangeLog: PR testsuite/101646 * g++.dg/coroutines/pr99047.C: * g++.dg/pr71655.C:
2021-07-27Use OEP_DECL_NAME when comparing VLA bounds [PR101585].Martin Sebor2-1/+20
Resolves: PR c/101585 - Bad interaction of -fsanitize=undefined and -Wvla-parameters gcc/c-family: PR c/101585 * c-warn.c (warn_parm_ptrarray_mismatch): Use OEP_DECL_NAME. gcc/testsuite: PR c/101585 * gcc.dg/Wvla-parameter-13.c: New test.
2021-07-27Fix argument to pthread_joinJeff Law1-1/+1
gcc/testsuite * g++.dg/gcov/gcov-threads-1.C: Fix argument to pthread_join.
2021-07-27Abstract out (forward) jump threader state handling.Aldy Hernandez4-123/+172
The *forward* jump threader has multiple places where it pushes and pops state, and where it sets context up for the jump threading simplifier callback. Not only are the idioms repetitive, but the only reason for passing const_and_copies, avail_exprs_stack, and the evrp engine around are so we can set up context. As part of my jump threading work, I will divorce the evrp engine from the DOM jump threader, replacing it with a subset of the path solver I have just contributed. Since this will entail passing even more context around, I've abstracted out the state handling so it can be passed around in one object. This cleans up the code, and also makes it trivial to set up context with another engine in the future. FWIW, I've used these cleanups and the path solver in a POC to improve DOM's threaded edges by an additional 5%, and the overall threading opportunities in the compiler by 1%. This is in addition to the gains I have documented in the backwards threader rewrite. There are no functional changes with this patch. Tested on x86-64 Linux. gcc/ChangeLog: * tree-ssa-dom.c (dom_jump_threader_simplifier): Put avail_exprs_stack in the class, instead of passing it to jump_threader_simplifier. (dom_jump_threader_simplifier::simplify): Add state argument. (dom_opt_dom_walker): Add state. (pass_dominator::execute): Pass state to threader. (dom_opt_dom_walker::before_dom_children): Use state. * tree-ssa-threadedge.c (jump_threader::jump_threader): Replace arguments by state. (jump_threader::record_temporary_equivalences_from_phis): Register equivalences through the state variable. (jump_threader::record_temporary_equivalences_from_stmts_at_dest): Record ranges in a statement through the state variable. (jump_threader::simplify_control_stmt_condition): Pass state to simplify. (jump_threader::simplify_control_stmt_condition_1): Same. (jump_threader::thread_around_empty_blocks): Remove obsolete comment. 
(jump_threader::thread_through_normal_block): Record equivalences on edge through the state variable. (jump_threader::thread_across_edge): Abstract state pushing. (jt_state::jt_state): New. (jt_state::push): New. (jt_state::pop): New. (jt_state::register_equiv): New. (jt_state::record_ranges_from_stmt): New. (jt_state::register_equivs_on_edge): New. (jump_threader_simplifier::jump_threader_simplifier): Move from header. (jump_threader_simplifier::simplify): Add state argument. * tree-ssa-threadedge.h (class jt_state): New. (class jump_threader): Add state to constructor. (class jump_threader_simplifier): Add state to simplify. Remove avail_exprs_stack from class. * tree-vrp.c (vrp_jump_threader_simplifier::simplify): Add state argument. (vrp_jump_threader::vrp_jump_threader): Add state. (vrp_jump_threader::~vrp_jump_threader): Cleanup state.
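The push/pop idea described above can be sketched in a few lines of C. This is only an illustration of bundling scoped state into one object: `state_sketch`, `state_push`, `state_pop`, `state_register_equiv` and `state_lookup` are hypothetical names, and the real `jt_state` class in tree-ssa-threadedge.h wraps GCC's own tables and range engine rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a state object with scoped
   equivalences.  It is not GCC's jt_state.  */
#define MAX_EQUIVS 64

struct equiv { int name; int value; };

struct state_sketch {
  struct equiv equivs[MAX_EQUIVS];
  size_t n_equivs;             /* number of live equivalences */
  size_t markers[MAX_EQUIVS];  /* n_equivs saved at each push */
  size_t depth;                /* number of outstanding pushes */
};

/* Open a new scope: remember how many equivalences exist right now.  */
static void state_push (struct state_sketch *s)
{
  assert (s->depth < MAX_EQUIVS);
  s->markers[s->depth++] = s->n_equivs;
}

/* Record NAME == VALUE in the current scope.  */
static void state_register_equiv (struct state_sketch *s, int name, int value)
{
  assert (s->n_equivs < MAX_EQUIVS);
  s->equivs[s->n_equivs].name = name;
  s->equivs[s->n_equivs].value = value;
  s->n_equivs++;
}

/* Close the innermost scope, discarding everything recorded since the
   matching push.  */
static void state_pop (struct state_sketch *s)
{
  assert (s->depth > 0);
  s->n_equivs = s->markers[--s->depth];
}

/* Find the most recent equivalence for NAME; return 1 if one exists.  */
static int state_lookup (const struct state_sketch *s, int name, int *value)
{
  for (size_t i = s->n_equivs; i-- > 0; )
    if (s->equivs[i].name == name)
      {
        *value = s->equivs[i].value;
        return 1;
      }
  return 0;
}
```

With everything held in one object, a caller passes a single pointer instead of several separate tables, and swapping in a different engine only touches the object's internals.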
2021-07-27c++: Reject ordered comparison of null pointers [PR99701]Marek Polacek6-33/+40
When implementing DR 1512 in r11-467 I neglected to reject ordered comparison of two null pointers, like nullptr < nullptr. This patch fixes that omission. DR 1512 PR c++/99701 gcc/cp/ChangeLog: * cp-gimplify.c (cp_fold): Remove {LE,LT,GE,GT_EXPR} from a switch. * typeck.c (cp_build_binary_op): Reject ordered comparison of two null pointers. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/nullptr11.C: Remove invalid tests. * g++.dg/cpp0x/nullptr46.C: Add dg-error. * g++.dg/cpp2a/spaceship-err7.C: New test. * g++.dg/expr/ptr-comp4.C: New test. libstdc++-v3/ChangeLog: * testsuite/20_util/tuple/comparison_operators/overloaded.cc: Move a line... * testsuite/20_util/tuple/comparison_operators/overloaded2.cc: ...here. New test.
2021-07-27Implement basic block path solver.Aldy Hernandez3-0/+415
This is the main basic block path solver for use in the ranger-based backwards threader. Given a path of BBs, the class can solve the final conditional or any SSA name used in calculating the final conditional. gcc/ChangeLog: * Makefile.in (OBJS): Add gimple-range-path.o. * gimple-range-path.cc: New file. * gimple-range-path.h: New file.
2021-07-27simplify-rtx: Push sign/zero-extension inside vec_duplicateJonathan Wright2-183/+211
As a general principle, vec_duplicate should be as close to the root of an expression as possible. Where unary operations have vec_duplicate as an argument, these operations should be pushed inside the vec_duplicate. This patch modifies unary operation simplification to push sign/zero-extension of a scalar inside vec_duplicate. This patch also updates all RTL patterns in aarch64-simd.md to use the new canonical form. gcc/ChangeLog: 2021-07-19 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-simd.md: Push sign/zero-extension inside vec_duplicate for all patterns. * simplify-rtx.c (simplify_context::simplify_unary_operation_1): Push sign/zero-extension inside vec_duplicate.
2021-07-27tree-optimization/101573 - improve uninit warning at -O0Richard Biener6-10/+95
We can improve uninit warnings from the early pass by looking at PHI arguments on fallthru edges that are uninitialized and have uses that are before a possible loop exit. This catches some cases earlier that we'd only warn in a more confusing way after early inlining as seen by testcase adjustments. It introduces FAIL: gcc.dg/uninit-23.c (test for excess errors) where we additionally warn gcc.dg/uninit-23.c:21:13: warning: 't4' is used uninitialized [-Wuninitialized] which I think is OK even if it's not obvious that the new warning is an improvement when you look at the obvious source. Somehow for all cases I never get the `'foo' was declared here` notes, I didn't dig why that happens but it's odd. 2021-07-22 Richard Biener <rguenther@suse.de> PR tree-optimization/101573 * tree-ssa-uninit.c (warn_uninit_phi_uses): New function looking at uninitialized PHI arg defs in some constrained cases. (warn_uninitialized_vars): Call it. (execute_early_warn_uninitialized): Calculate dominators. * gcc.dg/uninit-pr101573.c: New testcase. * gcc.dg/uninit-15-O0.c: Adjust. * gcc.dg/uninit-15.c: Likewise. * gcc.dg/uninit-23.c: Likewise. * c-c++-common/uninit-17.c: Likewise.
2021-07-27tree-optimization/39821 - fix cost classification for widening arithRichard Biener1-9/+16
This adjusts the vectorizer to cost vector_stmt for widening arithmetic instead of vec_promote_demote in the line of telling the target that stmt_info->stmt is the meaningful piece we cost. 2021-07-27 Richard Biener <rguenther@suse.de> PR tree-optimization/39821 * tree-vect-stmts.c (vect_model_promotion_demotion_cost): Use vector_stmt for widening arithmetic. (vectorizable_conversion): Adjust.
2021-07-27ipa: Adjust references to identify read-only globalsMartin Jambor9-50/+431
This patch was motivated by SPEC 2017's 544.nab_r, in which there is a static variable which is never written to and so is zero throughout the run-time of the benchmark. However, it is passed by reference to a function in which it is read and (after some multiplications) passed into __builtin_exp, which in turn unnecessarily consumes almost 10% of the total benchmark run-time. The situation is illustrated by the added testcase remref-3.c. The patch adds a flag to the ipa-prop descriptor of each parameter to mark such parameters. IPA-CP and inlining then take the effort to remove IPA_REF_ADDR references in the caller and only add an IPA_REF_LOAD reference to the clone/overall inlined function. This is sufficient for subsequent symbol table analysis code to identify the read-only variable as such and optimize the code. There are two changes from the RFC version posted to the list earlier. First, three missing calls to get_base_address were added (there was another one in an assert). Second, references are not stripped off the callers if the cloned function cannot change the signature. The second change reveals a real shortcoming stemming from the fact we cannot adjust function prototypes with fnspecs. But that is a more general problem. gcc/ChangeLog: 2021-07-20 Martin Jambor <mjambor@suse.cz> * cgraph.h (ipa_replace_map): New field force_load_ref. * ipa-prop.h (ipa_param_descriptor): Reduce precision of move_cost, added new flag load_dereferenced, adjusted comments. (ipa_get_param_dereferenced): New function. (ipa_set_param_dereferenced): Likewise. * cgraphclones.c (cgraph_node::create_virtual_clone): Follow it. * ipa-cp.c: Include gimple.h. (ipcp_discover_new_direct_edges): Take into account dereferenced flag. (get_replacement_map): New parameter force_load_ref, set the appropriate flag in ipa_replace_map if set. (struct symbol_and_index_together): New type. (adjust_refs_in_act_callers): New function. (adjust_references_in_caller): Likewise. 
(create_specialized_node): When appropriate, call adjust_references_in_caller and force only load references. * ipa-prop.c (load_from_dereferenced_name): New function. (ipa_analyze_controlled_uses): Also detect loads from a dereference, harden testing of call statements. (ipa_write_node_info): Stream the dereferenced flag. (ipa_read_node_info): Likewise. (ipa_set_jf_constant): Also create refdesc when jump function references a variable. (cgraph_node_for_jfunc): Rename to symtab_node_for_jfunc, work also on references of variables and return a symtab_node. Adjust all callers. (propagate_controlled_uses): Also remove references to VAR_DECLs. gcc/testsuite/ChangeLog: 2021-06-29 Martin Jambor <mjambor@suse.cz> * gcc.dg/ipa/remref-3.c: New test. * gcc.dg/ipa/remref-4.c: Likewise. * gcc.dg/ipa/remref-5.c: Likewise. * gcc.dg/ipa/remref-6.c: Likewise.
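The situation the patch targets can be sketched as follows. This is not the remref-3.c testcase; `never_written`, `scaled` and `compute` are hypothetical names chosen for illustration, and the multiplication stands in for the more expensive `__builtin_exp` call mentioned in the message.

```c
#include <assert.h>

/* A static variable that is never written, so it stays zero for the
   whole run of the program.  */
static double never_written;

/* The callee only loads through the pointer; it never stores to it
   and never lets the address escape.  */
static double scaled (const double *p, double factor)
{
  assert (p != 0);
  return *p * factor;
}

/* Taking the address here is what makes the variable look
   address-taken to the caller's reference list.  */
double compute (double factor)
{
  return scaled (&never_written, factor) + 1.0;
}
```

Once the address reference in `compute` is replaced by a plain load reference on the specialized or inlined callee, nothing in the unit takes the variable's address any more, which is what lets later analysis treat it as a read-only zero.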
2021-07-27gimple-fold: Fix up __builtin_clear_padding on classes with virtual inheritance [PR101586]Jakub Jelinek2-0/+48
For the following testcase, B is a 16-byte type, containing an 8-byte virtual pointer and a 1-byte A member, and C contains two FIELD_DECLs, one with B type and a size of just 8 bytes and then a field with type A and 1-byte size. The __builtin_clear_padding code was upset about the B typed FIELD_DECL containing FIELD_DECLs beyond the field size and triggered an assertion failure. This patch makes it ignore all FIELD_DECLs that are (fully) beyond the sz passed from the caller (except for the flexible array member diagnostics, which are kept). 2021-07-27 Jakub Jelinek <jakub@redhat.com> PR middle-end/101586 * gimple-fold.c (clear_padding_type): Ignore FIELD_DECLs with byte positions above or equal to sz except for diagnostics of flexible array members. * g++.dg/torture/builtin-clear-padding-4.C: New test.
2021-07-26PR 100170: Fix eq/ne tests on power10.Michael Meissner3-26/+33
This patch updates eq/ne tests in the testsuite to adjust the test if power10 code generation is used. 2021-07-26 Michael Meissner <meissner@linux.ibm.com> gcc/testsuite/ PR testsuite/100170 * gcc.target/powerpc/ppc-eq0-1.c: Adjust insn counts if power10 code is generated. * gcc.target/powerpc/ppc-ne0-1.c: (ne0): Adjust insn counts if power10 code is generated. (plus_ne0): Move to ppc-ne0-2.c. (cmp_plus_ne): Likewise. (plus_ne0_cmp): Likewise. * gcc.target/powerpc/ppc-ne0-2.c: New file.
2021-07-27Daily bump.GCC Administrator7-1/+157
2021-07-26Confirm and Handle only ASCII in toupper and tolower ranges.Andrew MacLeod1-10/+39
PR tree-optimization/78888 * gimple-range-fold.cc (get_letter_range): New. (fold_using_range::range_of_builtin_call): Call get_letter_range.
2021-07-26analyzer: fix uninit false +ve when returning structsDavid Malcolm3-8/+137
This patch fixes some false positives from -Wanalyzer-use-of-uninitialized-value when returning structs from functions (seen on the Linux kernel). gcc/analyzer/ChangeLog: * region-model.cc (region_model::on_call_pre): Always set conjured LHS, not just for SSA names. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/sock-1.c: New test. * gcc.dg/analyzer/sock-2.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-07-26Adjust ranges for to_upper and to_lower.Andrew MacLeod2-0/+61
Exclude lower case chars from to_upper and upper case chars from to_lower. gcc/ PR tree-optimization/78888 * gimple-range-fold.cc (fold_using_range::range_of_builtin_call): Add cases for CFN_BUILT_IN_TOUPPER and CFN_BUILT_IN_TOLOWER. gcc/testsuite/ * gcc.dg/pr78888.c: New.
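The property this range adjustment relies on can be checked directly for ASCII input. The sketch below assumes the default "C" locale and an ASCII execution character set; `ascii_lower_p`, `ascii_upper_p` and `count_violations` are hypothetical helper names, not part of GCC.

```c
#include <ctype.h>

/* Return 1 if C is an ASCII lower-case letter.  */
static int ascii_lower_p (int c) { return c >= 'a' && c <= 'z'; }

/* Return 1 if C is an ASCII upper-case letter.  */
static int ascii_upper_p (int c) { return c >= 'A' && c <= 'Z'; }

/* Count ASCII inputs for which toupper yields a lower-case ASCII
   letter or tolower yields an upper-case ASCII letter.  */
static int count_violations (void)
{
  int bad = 0;
  for (int c = 0; c < 128; c++)
    {
      if (ascii_lower_p (toupper (c)))
        bad++;
      if (ascii_upper_p (tolower (c)))
        bad++;
    }
  return bad;
}
```

If the count is zero, a comparison such as `toupper (c) == 'a'` can never be true for these inputs, which is the kind of conclusion a range excluding the lower-case letters allows.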
2021-07-26Fold bswap32(x) != 0 to x != 0 (and related transforms)Roger Sayle3-0/+183
This patch to match.pd implements several closely related folding simplifications at the tree-level, that make use of the property that bit permutation functions, rotate and bswap have inverses. [1] bswap(X) eq/ne C, for constant C, simplifies to X eq/ne C' where C'=bswap(C), generalizing the transform in the subject. [2] bswap(X) eq/ne bswap(Y) simplifies to X eq/ne Y. [3] lrotate(X,C1) eq/ne C2 simplifies to X eq/ne C3, where C3 = rrotate(C2,C1), i.e. apply the inverse rotation to C2. [4] Likewise, rrotate(X,C1) eq/ne C2 simplifies to X eq/ne C3, where C3 = lrotate(C2,C1). [5] rotate(X,Z) eq/ne rotate(Y,Z) simplifies to X eq/ne Y, when the bit-count Z (the same on both sides) has no side-effects. [6] rotate(X,Y) eq/ne 0 simplifies to X eq/ne 0 if Y has no side-effects. [7] Likewise, rotate(X,Y) eq/ne -1 simplifies to X eq/ne -1, if Y has no side-effects. 2021-07-26 Roger Sayle <roger@nextmovesoftware.com> Marc Glisse <marc.glisse@inria.fr> gcc/ChangeLog * match.pd (rotate): Simplify equality/inequality of rotations. (bswap): Simplify equality/inequality tests of byte swapping. gcc/testsuite/ChangeLog * gcc.dg/fold-eqrotate-1.c: New test case. * gcc.dg/fold-eqbswap-1.c: New test case.
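Transforms [1], [3], [4], [6] and [7] can be checked on concrete 32-bit values. The sketch below uses portable helper functions (`bswap32`, `rotl32`, `rotr32`, all hypothetical names written for this example) instead of compiler built-ins, and `folds_agree` simply compares each original test with its simplified form.

```c
#include <stdint.h>

/* Reverse the byte order of X.  */
static uint32_t bswap32 (uint32_t x)
{
  return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Rotate X left by N bits (N taken modulo 32).  */
static uint32_t rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Rotate X right by N bits (N taken modulo 32).  */
static uint32_t rotr32 (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Return 1 if each original comparison gives the same answer as its
   simplified form for operand X, constant C and rotate count N.  */
static int folds_agree (uint32_t x, uint32_t c, unsigned n)
{
  return ((bswap32 (x) == c) == (x == bswap32 (c)))        /* [1] */
         && ((rotl32 (x, n) == c) == (x == rotr32 (c, n))) /* [3] */
         && ((rotr32 (x, n) == c) == (x == rotl32 (c, n))) /* [4] */
         && ((rotl32 (x, n) == 0) == (x == 0))             /* [6] */
         && ((rotl32 (x, n) == UINT32_MAX)
             == (x == UINT32_MAX));                        /* [7] */
}
```

The simplified forms are cheaper because the inverse operation is applied to a constant at compile time, leaving a plain comparison at run time.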
2021-07-26Regenerate .pot files.Joseph Myers1-10205/+10865
gcc/po/ * gcc.pot: Regenerate. libcpp/po/ * cpplib.pot: Regenerate.
2021-07-26Implement operator_bitwise_xor::op1_op2_relation_effect.Aldy Hernandez1-0/+33
This patch adjusts XORing of ranges where the operands are known to be equal or not equal. We should probably do the same thing for the op[12]_range methods. gcc/ChangeLog: * range-op.cc (operator_bitwise_xor::op1_op2_relation_effect): New.
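The arithmetic fact behind such an adjustment is that `x ^ y` is zero exactly when `x == y`. The sketch below encodes that under the assumption that this is the fact the new method exploits; `rel_sketch`, `xor_fact`, `xor_relation_effect` and `fact_holds` are hypothetical names and do not mirror GCC's relation or range classes.

```c
#include <stdint.h>

/* What a caller might know about the two operands.  */
enum rel_sketch { REL_NONE, REL_EQ, REL_NE };

/* What that knowledge implies about x ^ y.  */
enum xor_fact { XOR_UNKNOWN, XOR_ZERO, XOR_NONZERO };

/* Map a known operand relation to a fact about the XOR result.  */
static enum xor_fact xor_relation_effect (enum rel_sketch rel)
{
  switch (rel)
    {
    case REL_EQ:
      return XOR_ZERO;     /* x == y implies x ^ y == 0 */
    case REL_NE:
      return XOR_NONZERO;  /* x != y implies x ^ y != 0 */
    default:
      return XOR_UNKNOWN;  /* nothing learned */
    }
}

/* Check a claimed fact against concrete operands X and Y.  */
static int fact_holds (enum xor_fact fact, uint32_t x, uint32_t y)
{
  uint32_t r = x ^ y;
  switch (fact)
    {
    case XOR_ZERO:
      return r == 0;
    case XOR_NONZERO:
      return r != 0;
    default:
      return 1;
    }
}
```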
2021-07-26Pass relationship to methods calling generic fold_range.Aldy Hernandez1-4/+4
Fix a small oversight in methods calling the base class fold_range. gcc/ChangeLog: * range-op.cc (operator_lshift::fold_range): Pass rel to base class fold_range. (operator_rshift::fold_range): Same.
2021-07-26Remove legacy external declarations in toplev.h [PR101447]Ashimida1-5/+0
gcc/ PR driver/101447 * toplev.h (min_align_loops_log): Remove declaration. (min_align_jumps_log, min_align_labels_log): Likewise. (min_align_functions_log): Likewise.
2021-07-26PR fortran/93308/93963/94327/94331/97046 problems raised by descriptor handlingTobias Burnus11-25/+885
Fortran: Fix attributes and bounds in ISO_Fortran_binding. 2021-07-26 José Rui Faustino de Sousa <jrfsousa@gmail.com> Tobias Burnus <tobias@codesourcery.com> PR fortran/93308 PR fortran/93963 PR fortran/94327 PR fortran/94331 PR fortran/97046 gcc/fortran/ChangeLog: * trans-decl.c (convert_CFI_desc): Only copy out the descriptor if necessary. * trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Updated attribute handling which reflect a previous intermediate version of the standard. Only copy out the descriptor if necessary. libgfortran/ChangeLog: * runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc): Add code to verify the descriptor. Correct bounds calculation. (gfc_desc_to_cfi_desc): Add code to verify the descriptor. gcc/testsuite/ChangeLog: * gfortran.dg/ISO_Fortran_binding_1.f90: Add pointer attribute, this test is still erroneous but now it compiles. * gfortran.dg/bind_c_array_params_2.f90: Update regex to match code changes. * gfortran.dg/PR93308.f90: New test. * gfortran.dg/PR93963.f90: New test. * gfortran.dg/PR94327.c: New test. * gfortran.dg/PR94327.f90: New test. * gfortran.dg/PR94331.c: New test. * gfortran.dg/PR94331.f90: New test. * gfortran.dg/PR97046.f90: New test.
2021-07-26Abstract out conditional simplification out of execute_vrp.Aldy Hernandez1-16/+23
VRP simplifies conditionals involving casted values outside of the main folding mechanism, because this optimization inhibits the VRP jump threader from threading through the comparison. As part of replacing VRP with an evrp instance, I am making sure we do everything VRP does. Hence, I am abstracting this functionality out so we can call it from elsewhere. ISTM that when the proposed ranger-based jump threader can handle everything the forward threader does, there will be no need for this optimization to be done outside of the evrp folder. Perhaps we can fold this into the substitute_using_ranges class. But that's further down the line. Also, there is no need to pass a vr_values around, when the base range_query class will do. I fixed this, as it makes it trivial to pass down a ranger or evrp instance. Tested on x86-64 Linux. gcc/ChangeLog: * tree-vrp.c (vrp_simplify_cond_using_ranges): Rename vr_values with range_query. (execute_vrp): Abstract out simplification of conditionals... (simplify_casted_conds): ...here.
2021-07-26Pass gimple context to array_bounds_checker.Aldy Hernandez2-11/+12
I have changed the use of the array_bounds_checker in VRP to use a ranger in my local tree to make sure there are no regressions when using either VRP or the ranger. In doing so I noticed that the checker does not pass context to get_value_range, which causes the ranger to miss a few cases. This patch fixes the oversight. Tested on x86-64 Linux using the array bounds checker both with VRP and the ranger. gcc/ChangeLog: * gimple-array-bounds.cc (array_bounds_checker::get_value_range): Add gimple argument. (array_bounds_checker::check_array_ref): Same. (array_bounds_checker::check_addr_expr): Same. (array_bounds_checker::check_array_bounds): Pass statement to check_array_bounds and check_addr_expr. * gimple-array-bounds.h (check_array_bounds): Add gimple argument. (check_addr_expr): Same. (get_value_range): Same.
2021-07-26AArch64: correct dot-product RTL patterns for aarch64.Tamar Christina3-44/+31
The previous fix for this problem was wrong due to a subtle difference between where NEON expects the RMW values and where the intrinsics expect them. The insn pattern is modeled after the intrinsics and so needs an expand for the vectorizer optab to switch the RTL. However, operand[3] is not expected to be written to, so the current pattern is bogus. Instead I rewrite the RTL to be in canonical ordering and merge them. gcc/ChangeLog: * config/aarch64/aarch64-simd-builtins.def (sdot, udot): Rename to... (sdot_prod, udot_prod): ... this. * config/aarch64/aarch64-simd.md (aarch64_<sur>dot<vsi2qi>): Merged into... (<sur>dot_prod<vsi2qi>): ... this. (aarch64_<sur>dot_lane<vsi2qi>, aarch64_<sur>dot_laneq<vsi2qi>): Change operands order. (<sur>sadv16qi): Use new operands order. * config/aarch64/arm_neon.h (vdot_u32, vdotq_u32, vdot_s32, vdotq_s32): Use new RTL ordering.
2021-07-26AArch64: correct usdot vectorizer and intrinsics optabsTamar Christina4-17/+21
There's a slight mismatch between the vectorizer optabs and the intrinsics patterns for NEON. The vectorizer expects operands[3] and operands[0] to be the same but the aarch64 intrinsics expanders expect operands[0] and operands[1] to be the same. This means we need different patterns here. This adds a separate usdot vectorizer pattern which just shuffles around the RTL params. There's also an inconsistency between the usdot and (u|s)dot intrinsics RTL patterns which is not corrected here. gcc/ChangeLog: * config/aarch64/aarch64-builtins.c (TYPES_TERNOP_SUSS, aarch64_types_ternop_suss_qualifiers): New. * config/aarch64/aarch64-simd-builtins.def (usdot_prod): Use it. * config/aarch64/aarch64-simd.md (usdot_prod<vsi2qi>): Re-organize RTL. * config/aarch64/arm_neon.h (vusdot_s32, vusdotq_s32): Use it.
2021-07-26openmp: Add support for omp attributes section and scan directivesJakub Jelinek7-17/+240
This patch adds support for expressing the section and scan directives using the attribute syntax and additionally fixes some bugs in the attribute syntax directive handling. For now it requires that the scan and section directives appear as the only attribute, not combined with other OpenMP or non-OpenMP attributes on the same statement. 2021-07-26 Jakub Jelinek <jakub@redhat.com> * parser.h (struct cp_lexer): Add orphan_p member. * parser.c (cp_parser_statement): Don't change in_omp_attribute_pragma upon restart from CPP_PRAGMA handling. Fix up condition when a lexer should be destroyed and adjust saved_tokens if it records tokens from the to be destroyed lexer. (cp_parser_omp_section_scan): New function. (cp_parser_omp_scan_loop_body): Use it. If parser->lexer->in_omp_attribute_pragma, allow optional comma after scan. (cp_parser_omp_sections_scope): Use cp_parser_omp_section_scan. * g++.dg/gomp/attrs-1.C: Use attribute syntax even for section and scan directives. * g++.dg/gomp/attrs-2.C: Likewise. * g++.dg/gomp/attrs-6.C: New test. * g++.dg/gomp/attrs-7.C: New test. * g++.dg/gomp/attrs-8.C: New test.
2021-07-26Daily bump.GCC Administrator2-1/+5
2021-07-25[Ada] Declare time_t uniformly based on a system parameter #2Arnaud Charlet1-0/+2
gcc/ada/ * libgnat/s-osprim__x32.adb: Add missing with clause.
2021-07-25Daily bump.GCC Administrator1-1/+1
2021-07-24Daily bump.GCC Administrator7-1/+420
2021-07-23Fortran: extend check for array arguments and reject CLASS array elements.Harald Anlauf2-2/+34
gcc/fortran/ChangeLog: PR fortran/101536 * check.c (array_check): Adjust check for the case of CLASS arrays. gcc/testsuite/ChangeLog: PR fortran/101536 * gfortran.dg/pr101536.f90: New test.
2021-07-23expmed: Fix store_integral_bit_field [PR101562]Jakub Jelinek2-1/+25
Our documentation says that paradoxical subregs shouldn't appear in strict_low_part: '(strict_low_part (subreg:M (reg:N R) 0))' This expression code is used in only one context: as the destination operand of a 'set' expression. In addition, the operand of this expression must be a non-paradoxical 'subreg' expression. but on the testcase below that triggers UB at runtime store_integral_bit_field emits exactly that. The following patch fixes it by ensuring the requirement is satisfied. 2021-07-23 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/101562 * expmed.c (store_integral_bit_field): Only use movstrict_optab if the operand isn't paradoxical. * gcc.c-torture/compile/pr101562.c: New test.
2021-07-23Use range_query object in array bounds class.Aldy Hernandez1-2/+2
Now that all dependencies of array_bounds_checker take a range_query, we can sever the relationship with vr_values. Changing this will allow us to use the array_bounds_checker with VRP, evrp, or the ranger. Tested on x86-64 Linux. gcc/ChangeLog: * gimple-array-bounds.h (class array_bounds_checker): Change ranges type to range_query.
2021-07-23aarch64: Use memcpy to copy vector tables in vst1[q]_x2 intrinsicsJonathan Wright2-61/+44
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vst1[q]_x2 Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. Add new code generation tests to verify that superfluous move instructions are not generated for the vst1q_x2 intrinsics. gcc/ChangeLog: 2021-07-23 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vst1_s64_x2): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_oi one vector at a time. (vst1_u64_x2): Likewise. (vst1_f64_x2): Likewise. (vst1_s8_x2): Likewise. (vst1_p8_x2): Likewise. (vst1_s16_x2): Likewise. (vst1_p16_x2): Likewise. (vst1_s32_x2): Likewise. (vst1_u8_x2): Likewise. (vst1_u16_x2): Likewise. (vst1_u32_x2): Likewise. (vst1_f16_x2): Likewise. (vst1_f32_x2): Likewise. (vst1_p64_x2): Likewise. (vst1q_s8_x2): Likewise. (vst1q_p8_x2): Likewise. (vst1q_s16_x2): Likewise. (vst1q_p16_x2): Likewise. (vst1q_s32_x2): Likewise. (vst1q_s64_x2): Likewise. (vst1q_u8_x2): Likewise. (vst1q_u16_x2): Likewise. (vst1q_u32_x2): Likewise. (vst1q_u64_x2): Likewise. (vst1q_f16_x2): Likewise. (vst1q_f32_x2): Likewise. (vst1q_f64_x2): Likewise. (vst1q_p64_x2): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: Add new tests.
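The difference between the two approaches can be sketched without any Neon types. `pair_sketch` and `opaque2_sketch` below are hypothetical stand-ins for a two-vector structure and an opaque container of the same size; they are not the arm_neon.h or `__builtin_aarch64_simd_oi` types, and the helper names are invented for this example.

```c
#include <assert.h>
#include <string.h>

/* User-visible pair of 16-byte vectors (hypothetical stand-in).  */
typedef struct { unsigned char val[2][16]; } pair_sketch;

/* Opaque two-vector container of identical size (hypothetical).  */
typedef struct { unsigned char bytes[32]; } opaque2_sketch;

/* Old style: fill the container one vector at a time.  */
static opaque2_sketch pack_one_at_a_time (const pair_sketch *p)
{
  opaque2_sketch o;
  for (int i = 0; i < 2; i++)
    for (int j = 0; j < 16; j++)
      o.bytes[16 * i + j] = p->val[i][j];
  return o;
}

/* New style: copy the whole structure with a single memcpy.  */
static opaque2_sketch pack_with_memcpy (const pair_sketch *p)
{
  opaque2_sketch o;
  assert (sizeof o == sizeof *p);
  memcpy (&o, p, sizeof o);
  return o;
}
```

Both functions produce the same bytes; the single copy just gives the compiler one whole-object operation to optimize instead of a sequence of per-vector insertions.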
2021-07-23aarch64: Use memcpy to copy vector tables in vst1[q]_x3 intrinsicsJonathan Wright2-91/+51
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vst1[q]_x3 Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. Add new code generation tests to verify that superfluous move instructions are not generated for the vst1q_x3 intrinsics. gcc/ChangeLog: 2021-07-23 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vst1_s64_x3): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_ci one vector at a time. (vst1_u64_x3): Likewise. (vst1_f64_x3): Likewise. (vst1_s8_x3): Likewise. (vst1_p8_x3): Likewise. (vst1_s16_x3): Likewise. (vst1_p16_x3): Likewise. (vst1_s32_x3): Likewise. (vst1_u8_x3): Likewise. (vst1_u16_x3): Likewise. (vst1_u32_x3): Likewise. (vst1_f16_x3): Likewise. (vst1_f32_x3): Likewise. (vst1_p64_x3): Likewise. (vst1q_s8_x3): Likewise. (vst1q_p8_x3): Likewise. (vst1q_s16_x3): Likewise. (vst1q_p16_x3): Likewise. (vst1q_s32_x3): Likewise. (vst1q_s64_x3): Likewise. (vst1q_u8_x3): Likewise. (vst1q_u16_x3): Likewise. (vst1q_u32_x3): Likewise. (vst1q_u64_x3): Likewise. (vst1q_f16_x3): Likewise. (vst1q_f32_x3): Likewise. (vst1q_f64_x3): Likewise. (vst1q_p64_x3): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: Add new tests.
2021-07-23x86: Don't return hard register when LRA is in progressH.J. Lu2-1/+24
Don't return hard register in ix86_gen_scratch_sse_rtx when LRA is in progress to avoid ICE when there are no available hard registers for LRA. gcc/ PR target/101504 * config/i386/i386.c (ix86_gen_scratch_sse_rtx): Don't return hard register when LRA is in progress. gcc/testsuite/ PR target/101504 * gcc.target/i386/pr101504.c: New test.
2021-07-23aarch64: Use memcpy to copy vector tables in vst1[q]_x4 intrinsicsJonathan Wright2-84/+204
Use __builtin_memcpy to copy vector structures instead of using a union in each of the vst1[q]_x4 Neon intrinsics in arm_neon.h. Add new code generation tests to verify that superfluous move instructions are not generated for the vst1q_x4 intrinsics. gcc/ChangeLog: 2021-07-21 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vst1_s8_x4): Use __builtin_memcpy instead of using a union. (vst1q_s8_x4): Likewise. (vst1_s16_x4): Likewise. (vst1q_s16_x4): Likewise. (vst1_s32_x4): Likewise. (vst1q_s32_x4): Likewise. (vst1_u8_x4): Likewise. (vst1q_u8_x4): Likewise. (vst1_u16_x4): Likewise. (vst1q_u16_x4): Likewise. (vst1_u32_x4): Likewise. (vst1q_u32_x4): Likewise. (vst1_f16_x4): Likewise. (vst1q_f16_x4): Likewise. (vst1_f32_x4): Likewise. (vst1q_f32_x4): Likewise. (vst1_p8_x4): Likewise. (vst1q_p8_x4): Likewise. (vst1_p16_x4): Likewise. (vst1q_p16_x4): Likewise. (vst1_s64_x4): Likewise. (vst1_u64_x4): Likewise. (vst1_p64_x4): Likewise. (vst1q_s64_x4): Likewise. (vst1q_u64_x4): Likewise. (vst1q_p64_x4): Likewise. (vst1_f64_x4): Likewise. (vst1q_f64_x4): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: Add new tests.
2021-07-23aarch64: Use memcpy to copy vector tables in vst2[q] intrinsicsJonathan Wright2-60/+44
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vst2[q] Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. Add new code generation tests to verify that superfluous move instructions are no longer generated for the vst2q intrinsics. gcc/ChangeLog: 2021-07-21 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vst2_s64): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_oi one vector at a time. (vst2_u64): Likewise. (vst2_f64): Likewise. (vst2_s8): Likewise. (vst2_p8): Likewise. (vst2_s16): Likewise. (vst2_p16): Likewise. (vst2_s32): Likewise. (vst2_u8): Likewise. (vst2_u16): Likewise. (vst2_u32): Likewise. (vst2_f16): Likewise. (vst2_f32): Likewise. (vst2_p64): Likewise. (vst2q_s8): Likewise. (vst2q_p8): Likewise. (vst2q_s16): Likewise. (vst2q_p16): Likewise. (vst2q_s32): Likewise. (vst2q_s64): Likewise. (vst2q_u8): Likewise. (vst2q_u16): Likewise. (vst2q_u32): Likewise. (vst2q_u64): Likewise. (vst2q_f16): Likewise. (vst2q_f32): Likewise. (vst2q_f64): Likewise. (vst2q_p64): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: Add new tests.
2021-07-23aarch64: Use memcpy to copy vector tables in vst3[q] intrinsicsJonathan Wright2-90/+50
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vst3[q] Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. Add new code generation tests to verify that superfluous move instructions are no longer generated for the vst3q intrinsics. gcc/ChangeLog: 2021-07-21 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vst3_s64): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_ci one vector at a time. (vst3_u64): Likewise. (vst3_f64): Likewise. (vst3_s8): Likewise. (vst3_p8): Likewise. (vst3_s16): Likewise. (vst3_p16): Likewise. (vst3_s32): Likewise. (vst3_u8): Likewise. (vst3_u16): Likewise. (vst3_u32): Likewise. (vst3_f16): Likewise. (vst3_f32): Likewise. (vst3_p64): Likewise. (vst3q_s8): Likewise. (vst3q_p8): Likewise. (vst3q_s16): Likewise. (vst3q_p16): Likewise. (vst3q_s32): Likewise. (vst3q_s64): Likewise. (vst3q_u8): Likewise. (vst3q_u16): Likewise. (vst3q_u32): Likewise. (vst3q_u64): Likewise. (vst3q_f16): Likewise. (vst3q_f32): Likewise. (vst3q_f64): Likewise. (vst3q_p64): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: Add new tests.
2021-07-23aarch64: Use memcpy to copy vector tables in vst4[q] intrinsicsJonathan Wright2-120/+50
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vst4[q] Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. Add new code generation tests to verify that superfluous move instructions are no longer generated for the vst4q intrinsics. gcc/ChangeLog: 2021-07-20 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vst4_s64): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_xi one vector at a time. (vst4_u64): Likewise. (vst4_f64): Likewise. (vst4_s8): Likewise. (vst4_p8): Likewise. (vst4_s16): Likewise. (vst4_p16): Likewise. (vst4_s32): Likewise. (vst4_u8): Likewise. (vst4_u16): Likewise. (vst4_u32): Likewise. (vst4_f16): Likewise. (vst4_f32): Likewise. (vst4_p64): Likewise. (vst4q_s8): Likewise. (vst4q_p8): Likewise. (vst4q_s16): Likewise. (vst4q_p16): Likewise. (vst4q_s32): Likewise. (vst4q_s64): Likewise. (vst4q_u8): Likewise. (vst4q_u16): Likewise. (vst4q_u32): Likewise. (vst4q_u64): Likewise. (vst4q_f16): Likewise. (vst4q_f32): Likewise. (vst4q_f64): Likewise. (vst4q_p64): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: Add new tests.
2021-07-23 aarch64: Use memcpy to copy vector tables in vtbx4 intrinsics Jonathan Wright 1 -12/+3
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vtbx4 Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. gcc/ChangeLog: 2021-07-19 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vtbx4_s8): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_oi one vector at a time. (vtbx4_u8): Likewise. (vtbx4_p8): Likewise.
2021-07-23 aarch64: Use memcpy to copy vector tables in vtbl[34] intrinsics Jonathan Wright 1 -27/+12
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vtbl[34] Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. gcc/ChangeLog: 2021-07-08 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vtbl3_s8): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_oi one vector at a time. (vtbl3_u8): Likewise. (vtbl3_p8): Likewise. (vtbl4_s8): Likewise. (vtbl4_u8): Likewise. (vtbl4_p8): Likewise.
2021-07-23 aarch64: Use memcpy to copy vector tables in vqtbx[234] intrinsics Jonathan Wright 2 -56/+65
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vqtbx[234] Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. Add new code generation tests to verify that superfluous move instructions are no longer generated for the vqtbx[234] intrinsics. gcc/ChangeLog: 2021-07-08 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vqtbx2_s8): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_oi one vector at a time. (vqtbx2_u8): Likewise. (vqtbx2_p8): Likewise. (vqtbx2q_s8): Likewise. (vqtbx2q_u8): Likewise. (vqtbx2q_p8): Likewise. (vqtbx3_s8): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_ci one vector at a time. (vqtbx3_u8): Likewise. (vqtbx3_p8): Likewise. (vqtbx3q_s8): Likewise. (vqtbx3q_u8): Likewise. (vqtbx3q_p8): Likewise. (vqtbx4_s8): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_xi one vector at a time. (vqtbx4_u8): Likewise. (vqtbx4_p8): Likewise. (vqtbx4q_s8): Likewise. (vqtbx4q_u8): Likewise. (vqtbx4q_p8): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: New tests.
2021-07-23 aarch64: Use memcpy to copy vector tables in vqtbl[234] intrinsics Jonathan Wright 2 -54/+62
Use __builtin_memcpy to copy vector structures instead of building a new opaque structure one vector at a time in each of the vqtbl[234] Neon intrinsics in arm_neon.h. This simplifies the header file and also improves code generation - superfluous move instructions were emitted for every register extraction/set in this additional structure. Add new code generation tests to verify that superfluous move instructions are no longer generated for the vqtbl[234] intrinsics. gcc/ChangeLog: 2021-07-08 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/arm_neon.h (vqtbl2_s8): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_oi one vector at a time. (vqtbl2_u8): Likewise. (vqtbl2_p8): Likewise. (vqtbl2q_s8): Likewise. (vqtbl2q_u8): Likewise. (vqtbl2q_p8): Likewise. (vqtbl3_s8): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_ci one vector at a time. (vqtbl3_u8): Likewise. (vqtbl3_p8): Likewise. (vqtbl3q_s8): Likewise. (vqtbl3q_u8): Likewise. (vqtbl3q_p8): Likewise. (vqtbl4_s8): Use __builtin_memcpy instead of constructing __builtin_aarch64_simd_xi one vector at a time. (vqtbl4_u8): Likewise. (vqtbl4_p8): Likewise. (vqtbl4q_s8): Likewise. (vqtbl4q_u8): Likewise. (vqtbl4q_p8): Likewise. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vector_structure_intrinsics.c: New test.
2021-07-23 openmp: Add support for __has_attribute(omp::directive) and __has_attribute(omp::sequence) Jakub Jelinek 4 -1/+380
The C++ FE now supports these attributes, but not by registering them in the attribute tables (they work quite differently from other attributes), so this patch teaches c_common_has_attribute about them. 2021-07-23 Jakub Jelinek <jakub@redhat.com> * c-lex.c (c_common_has_attribute): Call canonicalize_attr_name also on attr_id. Return 1 for omp::directive or omp::sequence in C++11 and later. * c-c++-common/gomp/attrs-1.c: New test. * c-c++-common/gomp/attrs-2.c: New test. * c-c++-common/gomp/attrs-3.c: New test.
2021-07-23 openmp: Diagnose invalid mixing of the attribute and pragma syntax directives Jakub Jelinek 5 -5/+149
The OpenMP 5.1 spec says that the attribute and pragma syntax directives should not be mixed on the same statement. This patch adds a diagnostic for that. The order [[omp::directive (...)]] #pragma omp ... is always an error. For the other order, #pragma omp ... [[omp::directive (...)]], it depends on whether the pragma directive is an OpenMP construct (then it is an error, because the construct needs a structured block, loop, or statement as its body) or e.g. a standalone directive (then it is fine). Only block scope is handled for now; namespace scope and class scope still lack even basic support. 2021-07-23 Jakub Jelinek <jakub@redhat.com> gcc/c-family/ * c-pragma.h (enum pragma_kind): Add PRAGMA_OMP__START_ and PRAGMA_OMP__LAST_ enumerators. gcc/cp/ * parser.h (struct cp_parser): Add omp_attrs_forbidden_p member. * parser.c (cp_parser_handle_statement_omp_attributes): Diagnose mixing of attribute and pragma syntax directives when seeing omp::directive if parser->omp_attrs_forbidden_p or if attribute syntax directives are followed by OpenMP pragma. (cp_parser_statement): Clear parser->omp_attrs_forbidden_p after the cp_parser_handle_statement_omp_attributes call. (cp_parser_omp_structured_block): Add disallow_omp_attrs argument, if true, set parser->omp_attrs_forbidden_p. (cp_parser_omp_scan_loop_body, cp_parser_omp_sections_scope): Pass false as disallow_omp_attrs to cp_parser_omp_structured_block. (cp_parser_omp_parallel, cp_parser_omp_task): Set parser->omp_attrs_forbidden_p. gcc/testsuite/ * g++.dg/gomp/attrs-4.C: New test. * g++.dg/gomp/attrs-5.C: New test.
2021-07-23 testsuite: mips: pass -finline/-fno-inline through Xi Ruoyao 1 -0/+1
gcc/testsuite/ * gcc.target/mips/mips.exp (mips_option_groups): Add -finline and -fno-inline.
2021-07-23 Revert "testsuite: mips: use noinline attribute instead of -fno-inline" Xi Ruoyao 2 -11/+6
This reverts commit 3b33b1136d5ba1903a56fa601a848accc3db46ef.