path: root/gcc
Age | Commit message | Author | Files | Lines
2021-08-28MIPS: use N64 ABI by default if the triple ends with -gnuabi64YunQiang Su1-0/+14
gcc/ChangeLog: PR target/102089 * config.gcc: MIPS: use N64 ABI by default if the triple ends with -gnuabi64, which Debian has used since 2013.
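The suffix test described above can be sketched in C. This is only an illustration: the real change lives in the config.gcc shell script, and both function names below are hypothetical, not GCC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: report whether TRIPLE ends
   with SUFFIX.  */
bool
triple_ends_with (const char *triple, const char *suffix)
{
  assert (triple != NULL && suffix != NULL);
  size_t tlen = strlen (triple);
  size_t slen = strlen (suffix);
  return tlen >= slen && strcmp (triple + tlen - slen, suffix) == 0;
}

/* Hypothetical helper: name the default ABI for a MIPS triple.  The
   "n64" answer follows the commit message; NULL stands for "keep
   whatever default was already chosen", which this sketch does not
   model.  */
const char *
default_mips_abi (const char *triple)
{
  return triple_ends_with (triple, "-gnuabi64") ? "n64" : NULL;
}
```

With these assumptions, a triple such as mips64el-linux-gnuabi64 selects "n64", while mips-linux-gnu is left alone.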
2021-08-28fix latent bootstrap-debug issueAlexandre Oliva2-3/+4
I've hit a bootstrap-debug error involving large subprograms in gcc/ada/sem_ch12.adb. I'm afraid I couldn't narrow it down to a reasonable testcase. thread1 made different decisions about a block containing a builtin_eh_filter call because in one compilation, estimate_num_insns found a cgraph_node for the builtin and could thus get to the is_simple_builtin test, but in the other it didn't. With different insn counts, one stage jump-threaded and the other didn't, and the resulting code diverged quite a bit. The reason the builtin had a cgraph_node in one case but not the other was that modref got a chance to analyze the builtin call when it was the first stmt in the block, and that created the cgraph_node. However, when it was preceded by debug stmts, the loop in analyze_function was cut short after the first debug stmt, because the summary so far was not useful. This patch fixes both issues: skip debug stmts in the analyze_function loop, so as to prevent them from affecting any decisions in the loop, and enable the insn count estimator to get to the is_simple_builtin test when a cgraph_node has not been created for the builtin. for gcc/ChangeLog * ipa-modref.c (analyze_function): Skip debug stmts. * tree-inline.c (estimate_num_insn): Consider builtins even without a cgraph_node.
2021-08-28Daily bump.GCC Administrator4-1/+137
2021-08-27c++: Set type on dependent ARROW_EXPRJason Merrill1-3/+9
Even if the operand of -> has dependent type, if it's a pointer we know that the result will be the target type of that pointer. This should avoid some unnecessary TYPEOF_EXPR when looking up a name after ->. gcc/cp/ChangeLog: * typeck2.c (build_x_arrow): Do set TREE_TYPE when operand is a dependent pointer.
2021-08-27Support limited setcc for H8Jeff Law5-35/+89
gcc/ * config/h8300/bitfield.md (cstore<mode>4): Remove expander. * config/h8300/h8300.c (h8300_expand_branch): Remove function. * config/h8300/h8300-protos.h (h8300_expand_branch): Remove prototype. * config/h8300/h8300.md (eqne): New code iterator. (geultu, geultu_to_c): Similarly. * config/h8300/testcompare.md (cstore<mode>4): Dummy expander. (store_c_<mode>, store_c_i_<mode>): New define_insn_and_splits. (cmp<mode>_c): New pattern.
2021-08-27Update comments in float128-call.c test.Michael Meissner1-5/+7
Segher asked that I update the comments to include the d-form vector stores (even though they wouldn't be generated by this test). 2021-08-25 Michael Meissner <meissner@linux.ibm.com> gcc/testsuite/ * gcc.target/powerpc/float128-call.c: Update comments.
2021-08-27Reduce vector comparison of uniform vectors to a scalar comparisonJeff Law1-0/+65
gcc/ * tree-ssa-dom.c (reduce_vector_comparison_to_scalar_comparison): New function. (dom_opt_dom_walker::optimize_stmt): Use it.
2021-08-27Fix float128-call.c test for power8 IEEE 128 and power10.Michael Meissner1-8/+19
I built a compiler on a little endian power8 system where the default long double was IEEE 128-bit instead of IBM 128-bit. I discovered that on power8, we would generate a lxvd2x and xxpermdi to deal with the endianness instead of the Altivec lxv. In addition, I noticed the constant that was being loaded (1.0q) could be loaded by the lxvkq instruction. I rewrote the test to handle all forms of vector load and store that can be generated. 2021-08-27 Michael Meissner <meissner@linux.ibm.com> gcc/testsuite/ * gcc.target/powerpc/float128-call.c: Fix test for IEEE 128-bit long double and power10.
2021-08-27Darwin : Mark the mod init/term section starts with a linker-visible sym.Iain Sandoe2-5/+35
Some newer assemblers emit section start temp symbols for mod init and term sections if there is no suitable symbol present already. The temp symbols are linker visible and therefore appear in the symbol tables. Since the temp symbol number can vary when debug is enabled, that causes compare-debug fails. The solution is to provide a stable linker-visible symbol. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/ChangeLog: * config/darwin.c (finalize_ctors): Add a section-start linker- visible symbol. (finalize_dtors): Likewise. * config/darwin.h (MIN_LD64_INIT_TERM_START_LABELS): New.
2021-08-27rs6000: Execute the automatic built-in initialization codeBill Schmidt2-2/+14
2021-08-27 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000-call.c (rs6000-builtins.h): New #include. (rs6000_init_builtins): Call rs6000_init_generated_builtins. Skip the old initialization logic when new builtins are enabled. * config/rs6000/rs6000-gen-builtins.c (write_decls): Rename rs6000_autoinit_builtins to rs6000_init_generated_builtins. (write_init_file): Likewise.
2021-08-27testsuite, Darwin : Do not claim 'GAS' for cctools assembler.Iain Sandoe1-1/+8
Although the cctools assembler is based on GNU GAS, it is from a very old version (1.38) which does not support many of the features that the target-supports tests expect. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Exclude cctools assembler based on GAS 1.38.
2021-08-27configure: Adjust several assembler checks to remove an unused parm.Iain Sandoe1-5/+5
In r12-3048-ge0b6d0b39c6, the GAS version parameter was removed from the gcc_GAS_CHECK_FEATURE macro. It seems that overlapping commit/test cycles resulted in several AMDGCN and one Darwin commit with the now extra parameter still present. This causes wrong configure code to be generated when autoreconf is used in the gcc directory. Fixed by removing the extraneous parm from the AMDGCN and Darwin cases. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/ChangeLog: * configure.ac (darwin2[[0-9]]* | darwin19*): Alter use of gcc_GAS_CHECK_FEATURE to remove an extraneous parameter. (amdgcn-* | gcn-*): Likewise.
2021-08-27call_summary: add missing template keywordAnthony Sharp1-2/+2
Without the 'template', this function template compares 'traverse' to 'f', and then compares the result to 'a'. Evidently it hasn't been instantiated yet. gcc/ChangeLog: * symbol-summary.h: Added missing template keyword.
2021-08-27tree-optimization/45178 - DCE of dead control flow in infinite loopRichard Biener2-9/+14
This fixes DCE to be able to elide dead control flow in an infinite loop without an exit edge. This special situation is handled well by the code finding an edge to preserve since there's no chance it will find the exit edge and make the loop finite. 2021-08-27 Richard Biener <rguenther@suse.de> PR tree-optimization/45178 * tree-ssa-dce.c (find_obviously_necessary_stmts): For infinite loops without exit do not mark control dependent edges of the latch necessary. * gcc.dg/tree-ssa/ssa-dce-3.c: Adjust testcase.
2021-08-27i386: Fix wrong optimization for consecutive masked scatters [PR 101472]konglin13-8/+140
gcc/ChangeLog: PR target/101472 * config/i386/sse.md: (<avx512>scattersi<mode>): Add mask operand to UNSPEC_VSIBADDR. (<avx512>scattersi<mode>): Likewise. (*avx512f_scattersi<VI48F:mode>): Merge mask operand to set_dest. (*avx512f_scatterdi<VI48F:mode>): Likewise gcc/testsuite/ChangeLog: PR target/101472 * gcc.target/i386/avx512f-pr101472.c: New test. * gcc.target/i386/avx512vl-pr101472.c: New test.
2021-08-26rs6000: Make some BIFs vectorized on P10Kewen Lin10-0/+335
This patch is to add the support to make vectorizer able to vectorize some built-in function scalar versions on Power10. gcc/ChangeLog: * config/rs6000/rs6000.c (rs6000_builtin_md_vectorized_function): Add support for built-in functions MISC_BUILTIN_DIVWE, MISC_BUILTIN_DIVWEU, MISC_BUILTIN_DIVDE, MISC_BUILTIN_DIVDEU, P10_BUILTIN_CFUGED, P10_BUILTIN_CNTLZDM, P10_BUILTIN_CNTTZDM, P10_BUILTIN_PDEPD and P10_BUILTIN_PEXTD on Power10. gcc/testsuite/ChangeLog: * gcc.target/powerpc/dive-vectorize-1.c: New test. * gcc.target/powerpc/dive-vectorize-1.h: New test. * gcc.target/powerpc/dive-vectorize-2.c: New test. * gcc.target/powerpc/dive-vectorize-2.h: New test. * gcc.target/powerpc/dive-vectorize-run-1.c: New test. * gcc.target/powerpc/dive-vectorize-run-2.c: New test. * gcc.target/powerpc/p10-bifs-vectorize-1.c: New test. * gcc.target/powerpc/p10-bifs-vectorize-1.h: New test. * gcc.target/powerpc/p10-bifs-vectorize-run-1.c: New test.
2021-08-26rs6000: Add missing unsigned info for some P10 bifsKewen Lin1-0/+5
This patch is to make prototypes of some Power10 built-in functions consistent with what's in the documentation, as well as the vector version. Otherwise, useless conversions can be generated in gimple IR, and the vectorized versions will have inconsistent types. gcc/ChangeLog: * config/rs6000/rs6000-call.c (builtin_function_type): Add unsigned signedness for some Power10 bifs.
2021-08-26aix: packed struct alignment [PR102068]David Edelsohn1-1/+1
Further fixes to structure alignment when the structure is packed and contains double. This patch checks for packed attribute at the top level. gcc/ChangeLog: PR target/102068 * config/rs6000/rs6000.c (rs6000_adjust_field_align): Use computed alignment if the entire struct has attribute packed.
2021-08-27Fold more shuffle builtins to VEC_PERM_EXPR.liuhongt4-24/+88
A follow-up to https://gcc.gnu.org/pipermail/gcc-patches/2019-May/521983.html gcc/ PR target/98167 PR target/43147 * config/i386/i386.c (ix86_gimple_fold_builtin): Fold IX86_BUILTIN_SHUFPD512, IX86_BUILTIN_SHUFPS512, IX86_BUILTIN_SHUFPD256, IX86_BUILTIN_SHUFPS, IX86_BUILTIN_SHUFPS256. (ix86_masked_all_ones): New function. gcc/testsuite/ * gcc.target/i386/avx512f-vshufpd-1.c: Adjust testcase. * gcc.target/i386/avx512f-vshufps-1.c: Adjust testcase. * gcc.target/i386/pr43147.c: New test.
2021-08-27Daily bump.GCC Administrator3-1/+109
2021-08-26[i386] Call force_reg unconditionally.Uros Bizjak2-13/+8
There is no point in checking RTXes before calling force_reg; force_reg checks for a REG RTX by itself. 2021-08-26 Uroš Bizjak <ubizjak@gmail.com> gcc/ * config/i386/i386.md (*btr<mode>_1): Call force_reg unconditionally. (conditional moves with memory inputs splitters): Ditto. * config/i386/sse.md (one_cmpl<mode>2): Simplify.
2021-08-26Fix ipa-modref verification icesJan Hubicka1-9/+24
* ipa-modref-tree.h (modref_access_node::try_merge_with): Restart search after merging.
2021-08-26rs6000: Add remaining overloadsBill Schmidt1-0/+6083
2021-08-26 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000-overload.def: Add remaining overloads.
2021-08-26rs6000: Add Cell builtinsBill Schmidt1-0/+27
2021-06-07 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000-builtin-new.def: Add cell stanza.
2021-08-26rs6000: Add miscellaneous builtinsBill Schmidt1-0/+215
2021-06-15 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000-builtin-new.def: Add ieee128-hw, dfp, crypto, and htm stanzas.
2021-08-26rs6000: Add MMA builtinsBill Schmidt1-0/+416
2021-06-16 Bill Schmidt <wschmidt@linux.ibm.com> gcc/ * config/rs6000/rs6000-builtin-new.def: Add mma stanza.
2021-08-26Refactor warn_uninit() code.Martin Sebor1-103/+83
gcc/ChangeLog: * tree-ssa-uninit.c (warn_uninit): Refactor and simplify. (warn_uninit_phi_uses): Remove argument from calls to warn_uninit. (warn_uninitialized_vars): Same. Reduce visibility of locals. (warn_uninitialized_phi): Same.
2021-08-26Improved handling of shifts/rotates in bit CCP.Roger Sayle2-0/+171
This patch is the next in the series to improve bit bounds in tree-ssa's bit CCP pass, this time: bounds for shifts and rotates by unknown amounts. This allows us to optimize expressions such as ((x&15)<<(y&24))&64. In this case, the expression (y&24) contains only two unknown bits, and can therefore have only four possible values: 0, 8, 16 and 24. From this (x&15)<<(y&24) has the nonzero bits 0x0f0f0f0f, and from that ((x&15)<<(y&24))&64 must always be zero. One clever use of computer science in this patch is the use of XOR to efficiently enumerate bit patterns in Gray code order. As the order in which we generate values is not significant, it's faster and more convenient to enumerate values by flipping one bit at a time, rather than in numerical order [which would require carry bits and additional logic]. There's a pre-existing ??? comment in tree-ssa-ccp.c that we should eventually be able to optimize (x<<(y|8))&255, but this patch takes the conservatively paranoid approach of only optimizing cases where the shift/rotate is guaranteed to be less than the target precision, and therefore avoids changing any cases that potentially might invoke undefined behavior. This patch does optimize (x<<((y&31)|8))&255. 2021-08-26 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * tree-ssa-ccp.c (get_individual_bits): Helper function to extract the individual bits from a widest_int constant (mask). (gray_code_bit_flips): New read-only table for efficiently enumerating permutations/combinations of bits. (bit_value_binop) [LROTATE_EXPR, RROTATE_EXPR]: Handle rotates by unknown counts that are guaranteed less than the target precision and four or fewer unknown bits by enumeration. [LSHIFT_EXPR, RSHIFT_EXPR]: Likewise, also handle shifts by enumeration under the same conditions. Handle remaining shifts as a mask based upon the minimum possible shift value. gcc/testsuite/ChangeLog * gcc.dg/tree-ssa/ssa-ccp-41.c: New test case.
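The Gray code enumeration described above can be illustrated with a small standalone C sketch. It is not the code in tree-ssa-ccp.c: the function name is hypothetical, it works on plain 32-bit masks rather than widest_int, and it finds the bit to flip by counting trailing zeros of the step number instead of using a lookup table like gray_code_bit_flips.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: given the possibly-nonzero bits of a value
   (VALUE_MASK) and the possibly-nonzero bits of a shift count
   (SHIFT_MASK, which must be below 32), return the bits that
   VALUE << COUNT may have set for any feasible COUNT.  */
uint32_t
possible_nonzero_bits (uint32_t value_mask, uint32_t shift_mask)
{
  assert (shift_mask < 32);

  /* Split SHIFT_MASK into its individual set bits (at most five).  */
  uint32_t bits[5];
  unsigned n = 0;
  for (uint32_t m = shift_mask; m != 0; m &= m - 1)
    bits[n++] = m & -m;

  /* Visit all 2^n shift counts in Gray code order: step K flips the
     unknown bit whose index is the number of trailing zeros of K, so
     each new count differs from the previous one by a single XOR.  */
  uint32_t shift = 0;
  uint32_t result = value_mask;   /* Contribution of shift count 0.  */
  for (uint32_t k = 1; k < (1u << n); k++)
    {
      unsigned idx = 0;
      while (((k >> idx) & 1) == 0)
        idx++;
      shift ^= bits[idx];
      result |= value_mask << shift;
    }
  return result;
}
```

For the example in the message, possible_nonzero_bits (0x0f, 24) visits the counts 0, 8, 24 and 16 and accumulates 0x0f0f0f0f, so masking with 64 leaves nothing.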
2021-08-26[Committed] Tidy up !POINTER_TYPE_P test in match.pd LSHIFT_EXPR foldingRoger Sayle1-1/+0
As suggested by Richard Biener in the comments of PR middle-end/102029, the new test "INTEGRAL_TYPE_P (type) && !POINTER_TYPE_P (type) ..." is redundant, and just "INTEGRAL_TYPE_P (type)" is the preferred form. 2021-08-26 Roger Sayle <roger@nextmovesoftware.com> Richard Biener <rguenther@suse.de> gcc/ChangeLog * match.pd (shift transformations): Remove a redundant !POINTER_TYPE_P check.
2021-08-26[i386] Set all_regs to true in the call to replace_rtx [PR102057]Uros Bizjak1-4/+4
We want to replace all REGs equal to FROM. 2021-08-26 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/102057 * config/i386/i386.md (cmove reg-reg move elimination peephole2s): Set all_regs to true in the call to replace_rtx.
2021-08-26Improve handling of modref params.Jan Hubicka3-51/+63
This patch makes insertion to the modref access tree smarter when the --param modref-max-bases and modref-max-refs limits are hit. Instead of giving up we either give up on base alias set (make it equal to ref) or turn the alias set to 0. This lets us track useful info on quite large functions, such as ggc_free. gcc/ChangeLog: * ipa-modref-tree.c (test_insert_search_collapse): Update test. * ipa-modref-tree.h (modref_base_node::insert): Be smarter when hitting --param modref-max-refs limit. (modref_tree::insert_base): Be smarter when hitting --param modref-max-bases limit. Add new parameter REF. (modref_tree::insert): Update. (modref_tree::merge): Update. * ipa-modref.c (read_modref_records): Update.
2021-08-26Add full stop to params.opt.Jan Hubicka1-1/+1
gcc/ChangeLog: * params.opt: (modref-max-adjustments): Add full stop.
2021-08-26Fix off-by-one error in try_merge_withJan Hubicka1-5/+24
gcc/ChangeLog: * ipa-modref-tree.h (modref_ref_node::verify): New member function. (modref_ref_node::insert): Use it. (modref_ref_node::try_merge_with): Fix off by one error.
2021-08-26Use non-numbered clones for target_clones.Martin Liska7-19/+33
gcc/ChangeLog: * cgraph.h (create_version_clone_with_body): Add new parameter. * cgraphclones.c: Likewise. * multiple_target.c (create_dispatcher_calls): Do not use numbered suffixes. (create_target_clone): Likewise here. gcc/testsuite/ChangeLog: * gcc.target/i386/mvc5.c: Scan assembly names. * gcc.target/i386/mvc7.c: Likewise. * gcc.target/i386/pr95778-1.c: Update scanned patterns. * gcc.target/i386/pr95778-2.c: Likewise. Co-Authored-By: Stefan Kneifel <stefan.kneifel@bluewin.ch>
2021-08-26extend.texi: add note about reserved ctor/dtor prioritiesJonathan Yong1-10/+10
gcc/ChangeLog: * doc/extend.texi: Add note about reserved priorities to the constructor attribute. Signed-off-by: Jonathan Yong <10walls@gmail.com>
2021-08-26Daily bump.GCC Administrator8-1/+247
2021-08-25Add -details to dump option needed after r12-3144.Martin Sebor6-8/+6
gcc/testsuite: * gcc.dg/tree-ssa/evrp1.c: Add -details to dump option. * gcc.dg/tree-ssa/evrp2.c: Same. * gcc.dg/tree-ssa/evrp3.c: Same. * gcc.dg/tree-ssa/evrp4.c: Same. * gcc.dg/tree-ssa/evrp6.c: Same. * gcc.dg/tree-ssa/pr64130.c: Same.
2021-08-25Fix tests that require IBM 128-bit long doubleMichael Meissner3-24/+148
This patch adds 3 more selections to target-supports.exp to see if we can specify to use a particular long double format (IEEE 128-bit, IBM extended double, 64-bit), and the library support will track the changes for the long double. This is needed because two of the tests in the test suite use long double, and they are actually testing IBM extended double. This patch also forces the two tests that explicitly require long double to use the IBM double-double encoding to explicitly run the test. This requires GLIBC 2.32 or greater in order to do the switch. I have run tests on a little endian power9 system with 3 compilers. There were no regressions with these patches, and the two tests in the following patches now work if the default long double is not IBM 128-bit: * One compiler used the default IBM 128-bit format; * One compiler used the IEEE 128-bit format; (and) * One compiler used 64-bit long doubles. I have also tested compilers on a big endian power8 system with a compiler defaulting to power8 code generation and another with the default cpu set. There were no regressions. 2021-08-25 Michael Meissner <meissner@linux.ibm.com> gcc/testsuite/ PR target/94630 * gcc.target/powerpc/pr70117.c: Specify that we need the long double type to be IBM 128-bit. Remove the code to use __ibm128. * c-c++-common/dfp/convert-bfp-11.c: Specify that we need the long double type to be IBM 128-bit. Run the test at -O2 optimization. * lib/target-supports.exp (add_options_for_long_double_ibm128): New function. (check_effective_target_long_double_ibm128): New function. (add_options_for_long_double_ieee128): New function. (check_effective_target_long_double_ieee128): New function. (add_options_for_long_double_64bit): New function. (check_effective_target_long_double_64bit): New function.
2021-08-25Fix PR c++/66590: incorrect warning "reaches end of non-void function" for switchAndrew Pinski2-0/+24
So the problem here is there is code in the C++ front-end not to add a break statement (to the IR) if the previous block does not fall through. The problem is the code which does the check to see if the block may fallthrough does not check a CLEANUP_STMT; it assumes it is always fall through. Anyways this adds the code for the case of a CLEANUP_STMT that is only for !CLEANUP_EH_ONLY (the try/finally case). OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/cp/ChangeLog: PR c++/66590 * cp-objcp-common.c (cxx_block_may_fallthru): Handle CLEANUP_STMT for the case which will be try/finally. gcc/testsuite/ChangeLog: PR c++/66590 * g++.dg/warn/Wreturn-5.C: New test.
2021-08-25Avoid printing range table header alone.Martin Sebor2-35/+50
gcc/ChangeLog: * gimple-range-cache.cc (ssa_global_cache::dump): Avoid printing range table header alone. * gimple-range.cc (gimple_ranger::export_global_ranges): Same.
2021-08-25c++: Fix up value initialization of structs with zero width bitfields [PR102019]Jakub Jelinek1-0/+5
The removal of remove_zero_width_bit_fields, in addition to triggering some ABI issues that need solving anyway (ABI incompatibility between C and C++) also resulted in UB inside of gcc, we now call build_zero_init which calls build_int_cst on an integral type with TYPE_PRECISION of 0. Fixed by ignoring the zero width bitfields. I understand build_value_init_noctor wants to initialize to 0 even unnamed bitfields (of non-zero width), at least until we have some CONSTRUCTOR flag that says that even all the padding bits should be cleared. 2021-08-25 Jakub Jelinek <jakub@redhat.com> PR c++/102019 * init.c (build_value_init_noctor): Ignore unnamed zero-width bitfields.
2021-08-25Merge load/stores in ipa-modref summariesJan Hubicka8-78/+342
This patch adds logic needed to merge neighbouring accesses in ipa-modref summaries. This helps analyzing array initializers and similar code. It is a bit of work, since it breaks the fact that modref tree makes a good lattice for dataflow: the access ranges can be extended indefinitely. For this reason I added counter tracking number of adjustments and a cap to limit them during the dataflow. gcc/ChangeLog: * doc/invoke.texi: Document --param modref-max-adjustments. * ipa-modref-tree.c (test_insert_search_collapse): Update. (test_merge): Update. * ipa-modref-tree.h (struct modref_access_node): Add adjustments. (modref_access_node::operator==): Fix handling of access ranges. (modref_access_node::contains): Constify parameter; handle also mismatched parm offsets. (modref_access_node::update): New function. (modref_access_node::merge): New function. (unspecified_modref_access_node): Update constructor. (modref_ref_node::insert_access): Add record_adjustments parameter; handle merging. (modref_ref_node::try_merge_with): New private function. (modref_tree::insert): New record_adjustments parameter. (modref_tree::merge): New record_adjustments parameter. (modref_tree::copy_from): Update. * ipa-modref.c (dump_access): Dump adjustments field. (get_access): Update constructor. (record_access): Update call of insert. (record_access_lto): Update call of insert. (merge_call_side_effects): Add record_adjustments parameter. (get_access_for_fnspec): Update. (process_fnspec): Update. (analyze_call): Update. (analyze_function): Update. (read_modref_records): Update. (ipa_merge_modref_summary_after_inlining): Update. (propagate_unknown_call): Update. (modref_propagate_in_scc): Update. * params.opt (param-max-modref-adjustments=): New. gcc/testsuite/ChangeLog: * gcc.dg/ipa/modref-1.c: Update testcase. * gcc.dg/tree-ssa/modref-4.c: Update testcase. * gcc.dg/tree-ssa/modref-8.c: New test.
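A rough C sketch of the idea of merging neighbouring accesses under an adjustment cap follows. Everything in it is an assumption made for illustration: the struct, the function, the cap value, and the choice to refuse the merge once the cap is reached are hypothetical and do not reproduce modref_access_node or the actual behaviour in ipa-modref-tree.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an access record: a byte range plus a
   counter of how many times the range has been widened.  */
struct access_range
{
  long offset;      /* First byte accessed.  */
  long size;        /* Number of bytes, always positive here.  */
  int adjustments;  /* How often this range was extended.  */
};

/* Assumed cap; the real limit is a --param and may differ.  */
enum { MAX_ADJUSTMENTS = 8 };

/* Try to fold B into A.  Return true if A now covers B.  Ranges that
   neither overlap nor touch are left alone.  Once A has been widened
   MAX_ADJUSTMENTS times the sketch refuses to widen it again, so a
   dataflow loop cannot keep extending it forever.  */
bool
try_merge (struct access_range *a, const struct access_range *b)
{
  assert (a->size > 0 && b->size > 0);
  long a_end = a->offset + a->size;
  long b_end = b->offset + b->size;

  /* A gap between the two ranges: nothing to merge.  */
  if (b->offset > a_end || a->offset > b_end)
    return false;

  long new_offset = a->offset < b->offset ? a->offset : b->offset;
  long new_end = a_end > b_end ? a_end : b_end;

  /* A already contains B, so no adjustment is needed.  */
  if (new_offset == a->offset && new_end == a_end)
    return true;

  if (a->adjustments >= MAX_ADJUSTMENTS)
    return false;

  a->offset = new_offset;
  a->size = new_end - new_offset;
  a->adjustments++;
  return true;
}
```

Under these assumptions, two 4-byte stores at offsets 0 and 4 collapse into one 8-byte range with one recorded adjustment, which is the kind of pattern an array initializer produces.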
2021-08-25Make xxsplti*, xpermx, xxeval be vecperm type.Michael Meissner1-13/+13
I noticed that the built-in functions for xxspltiw, xxspltidp, xxsplti32dx, xxpermx, and xxeval all used the 'vecsimple' type. These instructions are permute instructions (3 cycle latency) and should use 'vecperm' instead. While I was at it, I changed the UNSPEC name for xxspltidp to be UNSPEC_XXSPLTIDP instead of UNSPEC_XXSPLTID. 2021-08-25 Michael Meissner <meissner@linux.ibm.com> gcc/ * config/rs6000/vsx.md (UNSPEC_XXSPLTIDP): Rename from UNSPEC_XXSPLTID. (xxspltiw_v4si): Use vecperm type attribute. (xxspltiw_v4si_inst): Use vecperm type attribute. (xxspltiw_v4sf_inst): Likewise. (xxspltidp_v2df): Use vecperm type attribute. Use UNSPEC_XXSPLTIDP instead of UNSPEC_XXSPLTID. (xxspltidp_v2df_inst): Likewise. (xxsplti32dx_v4si): Use vecperm type attribute. (xxsplti32dx_v4si_inst): Likewise. (xxsplti32dx_v4sf_inst): Likewise. (xxblend_<mode>): Likewise. (xxpermx): Likewise. (xxpermx_inst): Likewise. (xxeval): Likewise.
2021-08-25diagnostics: Support for -finput-charset [PR93067]Lewis Hyatt10-12/+198
Adds the logic to handle -finput-charset in layout_get_source_line(), so that source lines are converted from their input encodings prior to being output by diagnostics machinery. Also adds the ability to strip a UTF-8 BOM similarly. gcc/c-family/ChangeLog: PR other/93067 * c-opts.c (c_common_input_charset_cb): New function. (c_common_post_options): Call new function diagnostic_initialize_input_context(). gcc/d/ChangeLog: PR other/93067 * d-lang.cc (d_input_charset_callback): New function. (d_init): Call new function diagnostic_initialize_input_context(). gcc/fortran/ChangeLog: PR other/93067 * cpp.c (gfc_cpp_post_options): Call new function diagnostic_initialize_input_context(). gcc/ChangeLog: PR other/93067 * coretypes.h (typedef diagnostic_input_charset_callback): Declare. * diagnostic.c (diagnostic_initialize_input_context): New function. * diagnostic.h (diagnostic_initialize_input_context): Declare. * input.c (default_charset_callback): New function. (file_cache::initialize_input_context): New function. (file_cache_slot::create): Added ability to convert the input according to the input context. (file_cache::file_cache): Initialize the new input context. (class file_cache_slot): Added new m_alloc_offset member. (file_cache_slot::file_cache_slot): Initialize the new member. (file_cache_slot::~file_cache_slot): Handle potentially offset buffer. (file_cache_slot::maybe_grow): Likewise. (file_cache_slot::needs_read_p): Handle NULL fp, which is now possible. (file_cache_slot::get_next_line): Likewise. * input.h (class file_cache): Added input context member. libcpp/ChangeLog: PR other/93067 * charset.c (init_iconv_desc): Adapt to permit PFILE argument to be NULL. (_cpp_convert_input): Likewise. Also move UTF-8 BOM logic to... (cpp_check_utf8_bom): ...here. New function. (cpp_input_conversion_is_trivial): New function. * files.c (read_file_guts): Allow PFILE argument to be NULL. Add INPUT_CHARSET argument as an alternate source of this information. 
(read_file): Pass the new argument to read_file_guts. (cpp_get_converted_source): New function. * include/cpplib.h (struct cpp_converted_source): Declare. (cpp_get_converted_source): Declare. (cpp_input_conversion_is_trivial): Declare. (cpp_check_utf8_bom): Declare. gcc/testsuite/ChangeLog: PR other/93067 * gcc.dg/diagnostic-input-charset-1.c: New test. * gcc.dg/diagnostic-input-utf8-bom.c: New test.
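The UTF-8 BOM stripping mentioned in this commit can be sketched as a standalone C helper. The function below is hypothetical and is not the signature of libcpp's cpp_check_utf8_bom; it only shows the byte test involved (the UTF-8 encoding of U+FEFF is the three bytes EF BB BF).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return how many leading bytes of BUF form a
   UTF-8 byte order mark, that is 3 if BUF starts with EF BB BF and 0
   otherwise.  A caller would skip that many bytes before showing the
   source line in a diagnostic.  */
size_t
utf8_bom_length (const unsigned char *buf, size_t len)
{
  assert (buf != NULL || len == 0);
  if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
    return 3;
  return 0;
}
```

Returning a length rather than modifying the buffer keeps the sketch free of ownership questions; how the real code adjusts its buffers is not modelled here.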
2021-08-25analyzer: Impose recursion limit on indirect calls.Ankur Saini1-0/+14
2021-08-25 Ankur Saini <arsenic@sourceware.org> gcc/analyzer/ChangeLog: PR analyzer/101980 * engine.cc (exploded_graph::maybe_create_dynamic_call): Don't create calls if max recursion limit is reached.
2021-08-25tree-optimization/102046 - fix SLP build from scalars with patternsRichard Biener2-0/+23
When we swap operands for SLP builds we lose track where exactly pattern defs are - but we fail to update the any_pattern member of the operands info. Do so conservatively. 2021-08-25 Richard Biener <rguenther@suse.de> PR tree-optimization/102046 * tree-vect-slp.c (vect_build_slp_tree_2): Conservatively update ->any_pattern when swapping operands. * gcc.dg/vect/pr102046.c: New testcase.
2021-08-25i386: Optimize lea with zero-extend. [PR 101716]Hongyu Wang2-6/+41
For ASHIFT + ZERO_EXTEND pattern, combine pass failed to match it to lea since it will generate non-canonical zero-extend. Adjust predicate and cost_model to allow combine for lea. gcc/ChangeLog: PR target/101716 * config/i386/i386.c (ix86_live_on_entry): Adjust comment. (ix86_decompose_address): Remove retval check for ASHIFT, allow non-canonical zero extend if AND mask covers ASHIFT count. (ix86_legitimate_address_p): Adjust condition for decompose. (ix86_rtx_costs): Adjust cost for lea with non-canonical zero-extend. Co-Authored by: Uros Bizjak <ubizjak@gmail.com> gcc/testsuite/ChangeLog: PR target/101716 * gcc.target/i386/pr101716.c: New test.
2021-08-25Analyze niter for until-wrap condition [PR101145]Jiufu Guo9-65/+459
For code like: unsigned foo(unsigned val, unsigned start) { unsigned cnt = 0; for (unsigned i = start; i > val; ++i) cnt++; return cnt; } The number of iterations should be about UINT_MAX - start. There is a function, adjust_cond_for_loop_until_wrap, which handles similar work for const bases. Like adjust_cond_for_loop_until_wrap, this patch enhances number_of_iterations_cond/number_of_iterations_lt to analyze the number of iterations for this kind of loop. gcc/ChangeLog: 2021-08-25 Jiufu Guo <guojiufu@linux.ibm.com> PR tree-optimization/101145 * tree-ssa-loop-niter.c (number_of_iterations_until_wrap): New function. (number_of_iterations_lt): Invoke above function. (adjust_cond_for_loop_until_wrap): Merge to number_of_iterations_until_wrap. (number_of_iterations_cond): Update invokes for adjust_cond_for_loop_until_wrap and number_of_iterations_lt. gcc/testsuite/ChangeLog: 2021-08-25 Jiufu Guo <guojiufu@linux.ibm.com> PR tree-optimization/101145 * gcc.dg/vect/pr101145.c: New test. * gcc.dg/vect/pr101145.inc: New test. * gcc.dg/vect/pr101145_1.c: New test. * gcc.dg/vect/pr101145_2.c: New test. * gcc.dg/vect/pr101145_3.c: New test. * gcc.dg/vect/pr101145inf.c: New test. * gcc.dg/vect/pr101145inf.inc: New test. * gcc.dg/vect/pr101145inf_1.c: New test.
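The iteration count of such an until-wrap loop can be checked by brute force on a narrower type. The sketch below is an illustration only, with hypothetical function names; it uses uint8_t so that every input can be tried, and its closed form is for this 8-bit model, not a copy of what number_of_iterations_until_wrap computes.

```c
#include <assert.h>
#include <stdint.h>

/* Run the loop from the commit message on 8-bit values.  The counter
   I wraps from 255 to 0, and 0 > VAL is false for every VAL, so the
   loop always terminates.  */
unsigned
count_by_running (uint8_t val, uint8_t start)
{
  unsigned cnt = 0;
  for (uint8_t i = start; i > val; ++i)
    cnt++;
  return cnt;
}

/* Closed form for the same loop: if START is not above VAL the body
   never runs; otherwise it runs for START, START + 1, ..., 255.  */
unsigned
count_closed_form (uint8_t val, uint8_t start)
{
  assert (UINT8_MAX == 255);
  return start > val ? 256u - start : 0u;
}
```

In this model the count is 256 - start whenever start > val, which corresponds to UINT_MAX - start + 1 for 32-bit unsigned and agrees with the "about UINT_MAX - start" estimate above.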
2021-08-25i386: Fix _mm512_fpclass_ps_mask in O0 [PR 101471]konglin12-2/+20
gcc/ChangeLog: PR target/101471 * config/i386/avx512dqintrin.h (_mm512_fpclass_ps_mask): Fix macro define in O0. (_mm512_mask_fpclass_ps_mask): Ditto. gcc/testsuite/ChangeLog: PR target/101471 * gcc.target/i386/avx512f-pr101471.c: New test.
2021-08-24rs6000: Add vec_unpacku_{hi,lo}_v4siKewen Lin11-129/+196
The existing vec_unpacku_{hi,lo} supports emulated unsigned unpacking for short and char but misses the support for int. This patch adds the support of vec_unpacku_{hi,lo}_v4si. Meanwhile, the current implementation uses vector permutation way, which requires one extra customized constant vector as the permutation control vector. It's better to use vector merge high/low with zero constant vector, to save the space in constant area as well as the cost to initialize pcv in prologue. This patch updates it with vector merging and simplify it with iterators. gcc/ChangeLog: * config/rs6000/altivec.md (vec_unpacku_hi_v16qi): Remove. (vec_unpacku_hi_v8hi): Likewise. (vec_unpacku_lo_v16qi): Likewise. (vec_unpacku_lo_v8hi): Likewise. (vec_unpacku_hi_<VP_small_lc>): New define_expand. (vec_unpacku_lo_<VP_small_lc>): Likewise. gcc/testsuite/ChangeLog: * gcc.target/powerpc/unpack-vectorize-1.c: New test. * gcc.target/powerpc/unpack-vectorize-1.h: New test. * gcc.target/powerpc/unpack-vectorize-2.c: New test. * gcc.target/powerpc/unpack-vectorize-2.h: New test. * gcc.target/powerpc/unpack-vectorize-3.c: New test. * gcc.target/powerpc/unpack-vectorize-3.h: New test. * gcc.target/powerpc/unpack-vectorize-run-1.c: New test. * gcc.target/powerpc/unpack-vectorize-run-2.c: New test. * gcc.target/powerpc/unpack-vectorize-run-3.c: New test. * gcc.target/powerpc/unpack-vectorize.h: New test.