AgeCommit message (Collapse)AuthorFilesLines
2018-12-17re PR rtl-optimization/88253 (Inlining of function incorrectly deletes ↵Senthil Kumar Selvaraj4-2/+30
volatile register access when using XOR in avr-gcc) Fix PR 88253 gcc/ChangeLog: PR rtl-optimization/88253 * combine.c (combine_simplify_rtx): Test for side-effects before substituting by zero. gcc/testsuite/ChangeLog: PR rtl-optimization/88253 * gcc.target/avr/pr88253.c: New test. From-SVN: r267198
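The hazard behind this fix can be illustrated in plain C. The following is a minimal sketch, not the avr-gcc test case from the PR: `fake_reg`, `read_reg` and `xor_twice` are hypothetical names, and the read counter stands in for the observable side effect of touching a hardware register. The value of `x ^ x` is always zero, but folding the whole expression to the constant 0 is only valid when evaluating `x` has no side effects.

```c
#include <assert.h>

/* Hypothetical stand-in for a memory-mapped register.  The volatile
   qualifier means every read in the source must actually be performed. */
static volatile unsigned char fake_reg = 0x5a;

/* Counts how many times the "register" was read (illustration only). */
static unsigned reads;

static unsigned char read_reg(void)
{
    ++reads;
    return fake_reg;
}

/* The result is always 0, yet both reads are observable and must stay. */
int xor_twice(void)
{
    return read_reg() ^ read_reg();
}

/* Returns 1 when the XOR is zero and both reads were performed. */
int xor_keeps_both_reads(void)
{
    reads = 0;
    int r = xor_twice();
    return r == 0 && reads == 2;
}
```

A simplifier that rewrote `xor_twice` to `return 0;` without first checking for side effects would leave the counter at zero, which mirrors the deleted volatile access described in the PR title.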
2018-12-17Add a loop versioning passRichard Sandiford39-12/+3204
This patch adds a pass that versions loops with variable index strides for the case in which the stride is 1. E.g.: for (int i = 0; i < n; ++i) x[i * stride] = ...; becomes: if (stride == 1) for (int i = 0; i < n; ++i) x[i] = ...; else for (int i = 0; i < n; ++i) x[i * stride] = ...; This is useful for both vector code and scalar code, and in some cases can enable further optimisations like loop interchange or pattern recognition. The pass gives a 7.6% improvement on Cortex-A72 for 554.roms_r at -O3 and a 2.4% improvement for 465.tonto. I haven't found any SPEC tests that regress. Sizewise, there's a 10% increase in .text for both 554.roms_r and 465.tonto. That's obviously a lot, but in tonto's case it's because the whole program is written using assumed-shape arrays and pointers, so a large number of functions really do benefit from versioning. roms likewise makes heavy use of assumed-shape arrays, and that improvement in performance IMO justifies the code growth. The next biggest .text increase is 4.5% for 548.exchange2_r. I did see a small (0.4%) speed improvement there, but although both 3-iteration runs produced stable results, that might still be noise. There was a slightly larger (non-noise) improvement for a 256-bit SVE model. 481.wrf and 521.wrf_r .text grew by 2.8% and 2.5% respectively, but without any noticeable improvement in performance. No other test grew by more than 2%. Although the main SPEC beneficiaries are all Fortran tests, the benchmarks we use for SVE also include some C and C++ tests that benefit. Using -frepack-arrays gives the same benefits in many Fortran cases. The problem is that using that option inappropriately can force a full array copy for arguments that the function only reads once, and so it isn't really something we can turn on by default. The new pass is supposed to give most of the benefits of -frepack-arrays without the risk of unnecessary repacking. The patch therefore enables the pass by default at -O3. 
2018-12-17 Richard Sandiford <richard.sandiford@arm.com> Ramana Radhakrishnan <ramana.radhakrishnan@arm.com> Kyrylo Tkachov <kyrylo.tkachov@arm.com> gcc/ * doc/invoke.texi (-fversion-loops-for-strides): Document (loop-versioning-group-size, loop-versioning-max-inner-insns) (loop-versioning-max-outer-insns): Document new --params. * Makefile.in (OBJS): Add gimple-loop-versioning.o. * common.opt (fversion-loops-for-strides): New option. * opts.c (default_options_table): Enable fversion-loops-for-strides at -O3. * params.def (PARAM_LOOP_VERSIONING_GROUP_SIZE) (PARAM_LOOP_VERSIONING_MAX_INNER_INSNS) (PARAM_LOOP_VERSIONING_MAX_OUTER_INSNS): New parameters. * passes.def: Add pass_loop_versioning. * timevar.def (TV_LOOP_VERSIONING): New time variable. * tree-ssa-propagate.h (substitute_and_fold_engine::substitute_and_fold): Add an optional block parameter. * tree-ssa-propagate.c (substitute_and_fold_engine::substitute_and_fold): Likewise. When passed, only walk blocks dominated by that block. * tree-vrp.h (range_includes_p): Declare. (range_includes_zero_p): Turn into an inline wrapper around range_includes_p. * tree-vrp.c (range_includes_p): New function, generalizing... (range_includes_zero_p): ...this. * tree-pass.h (make_pass_loop_versioning): Declare. * gimple-loop-versioning.cc: New file. gcc/testsuite/ * gcc.dg/loop-versioning-1.c: New test. * gcc.dg/loop-versioning-10.c: Likewise. * gcc.dg/loop-versioning-11.c: Likewise. * gcc.dg/loop-versioning-2.c: Likewise. * gcc.dg/loop-versioning-3.c: Likewise. * gcc.dg/loop-versioning-4.c: Likewise. * gcc.dg/loop-versioning-5.c: Likewise. * gcc.dg/loop-versioning-6.c: Likewise. * gcc.dg/loop-versioning-7.c: Likewise. * gcc.dg/loop-versioning-8.c: Likewise. * gcc.dg/loop-versioning-9.c: Likewise. * gfortran.dg/loop_versioning_1.f90: Likewise. * gfortran.dg/loop_versioning_2.f90: Likewise. * gfortran.dg/loop_versioning_3.f90: Likewise. * gfortran.dg/loop_versioning_4.f90: Likewise. 
* gfortran.dg/loop_versioning_5.f90: Likewise. * gfortran.dg/loop_versioning_6.f90: Likewise. * gfortran.dg/loop_versioning_7.f90: Likewise. * gfortran.dg/loop_versioning_8.f90: Likewise. From-SVN: r267197
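The transformation described in this commit message can be written out by hand. The sketch below is only an illustration of the source-level effect, assuming a simple store loop; `fill_strided` and `fill_versioned` are hypothetical names, and the real pass works on GIMPLE rather than on C source.

```c
#include <assert.h>
#include <stddef.h>

/* Original form: the index is scaled by a stride that is only known at
   run time, so the accesses cannot be assumed to be contiguous. */
void fill_strided(double *x, size_t n, size_t stride, double value)
{
    assert(x != NULL);
    for (size_t i = 0; i < n; ++i)
        x[i * stride] = value;
}

/* Hand-versioned form: a run-time check selects a copy of the loop that
   is specialised for stride == 1, where the accesses are contiguous. */
void fill_versioned(double *x, size_t n, size_t stride, double value)
{
    assert(x != NULL);
    if (stride == 1) {
        for (size_t i = 0; i < n; ++i)
            x[i] = value;
    } else {
        for (size_t i = 0; i < n; ++i)
            x[i * stride] = value;
    }
}
```

Both functions store to exactly the same elements for any stride; the versioned one simply gives the optimizer a loop in which the unit stride is known, at the cost of duplicating the loop body, which is the code growth discussed above.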
2018-12-17re PR fortran/85314 (gcc/fortran/resolve.c:9222: unreachable code ?)Steven G. Kargl2-4/+5
2018-12-16 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/85314 * resolve.c (resolve_transfer): Remove dead code. From-SVN: r267196
2018-12-17Daily bump.GCC Administrator1-1/+1
From-SVN: r267195
2018-12-16libphobos: Merge common version blocks for core.sys.posix.sys.msg.Iain Buclaw1-395/+174
This is a continuation of simplifying C bindings so there aren't dozens of duplicated code blocks for each architecture. For this particular module, it now more closely resembles how glibc arranges msq.h, fixing a couple of targets in the process, notably X32. Backport from upstream druntime 2.084. Reviewed-on: https://github.com/dlang/druntime/pull/2362 From-SVN: r267192
2018-12-16ipa-fnsummary.c (remap_edge_change_prob): Do not ICE when changes are not ↵Jan Hubicka2-0/+8
streamed in. * ipa-fnsummary.c (remap_edge_change_prob): Do not ICE when changes are not streamed in. From-SVN: r267191
2018-12-16re PR fortran/88116 (ICE in gfc_convert_constant(): Unexpected type)Steven G. Kargl8-2/+47
2018-12-16 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/88116 PR fortran/88467 * array.c (gfc_match_array_constructor): Check return value of gfc_convert_type(). Skip constructor elements with BT_UNKNOWN, which need to go through resolution. * intrinsic.c (gfc_convert_type_warn): Return early if the types match (i.e., no conversion is required). * simplify.c (gfc_convert_constant): Remove a gfc_internal_error, and return gfc_bad_expr. 2018-12-16 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/88116 * gfortran.dg/pr88116_1.f90: New test. * gfortran.dg/pr88116_2.f90: Ditto. PR fortran/88467 * gfortran.dg/pr88467.f90: New test. From-SVN: r267189
2018-12-16decl.c (variable_decl): Typo fixes.Steven G. Kargl4-3/+11
2018-12-16 Steven G. Kargl <kargl@gcc.gnu.org> * decl.c (variable_decl): Typo fixes. 2018-12-16 Steven G. Kargl <kargl@gcc.gnu.org> * gfortran.dg/pr88138.f90: Remove extraneous 's' in comment. From-SVN: r267188
2018-12-16PR fortran/88364Thomas Koenig4-2/+35
2018-12-16 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/88364 * trans-expr.c (gfc_conv_expr_reference): Do not add clobber if the expression contains a reference. 2018-12-16 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/88363 * intent_out_13.f90: New test. From-SVN: r267187
2018-12-16x86: Revert revision 267133H.J. Lu4-17/+13
Revert commit: commit 76c21b271247ccbd681bdb4530426d2fe35dbfa5 Author: hjl <hjl@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Fri Dec 14 12:38:04 2018 +0000 x86: Don't use get_frame_size when finalizing stack frame gcc/ PR target/88483 * config/i386/i386.c (ix86_finalize_stack_frame_flags): Revert revision 267133. gcc/testsuite/ PR target/88483 * gcc.target/i386/stackalign/pr88483.c: Removed. Revert revision 267133. From-SVN: r267186
2018-12-16ipa-fnsummary.c (analyze_function_body): Do not leak conds and size_time_table.Jan Hubicka2-24/+72
* ipa-fnsummary.c (analyze_function_body): Do not leak conds and size_time_table. (ipa_fn_summary_generate): Add prevails parameter; do not allocate data when symbol is not prevailing. (inline_read_section): Likewise. From-SVN: r267185
2018-12-16re PR fortran/87994 (ICE in match_data_constant, at fortran/decl.c:399)Steven G. Kargl6-2/+45
2018-12-15 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/87994 * decl.c (match_data_constant): Allow inquiry parameter as data constant in data statement. 2018-12-15 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/87994 * gfortran.dg/pr87994_1.f90: New test. * gfortran.dg/pr87994_2.f90: Ditto. * gfortran.dg/pr87994_3.f90: Ditto. From-SVN: r267184
2018-12-16Daily bump.GCC Administrator1-1/+1
From-SVN: r267183
2018-12-16re PR c++/88482 (ICE when wrongly declaring __cxa_allocate_exception)Jakub Jelinek15-37/+361
PR c++/88482 * except.c (verify_library_fn): New function. (declare_library_fn): Use it. Initialize TM even if the non-TM library function has been user declared. (do_end_catch): Don't set TREE_NOTHROW on error_mark_node. (expand_start_catch_block): Don't call initialize_handler_parm for error_mark_node. (build_throw): Use verify_library_fn. Initialize TM even if the non-TM library function has been user declared. Don't crash if any library fn is error_mark_node. * g++.dg/eh/builtin5.C: New test. * g++.dg/eh/builtin6.C: New test. * g++.dg/eh/builtin7.C: New test. * g++.dg/eh/builtin8.C: New test. * g++.dg/eh/builtin9.C: New test. * g++.dg/eh/builtin10.C: New test. * g++.dg/eh/builtin11.C: New test. * g++.dg/parse/crash55.C: Adjust expected diagnostics. * eh_cpp.cc (__cxa_throw): Change DEST argument type from void * to void (*) (void *). (_ITM_cxa_throw): Likewise. * libitm.h (_ITM_cxa_throw): Likewise. * libitm.texi (_ITM_cxa_throw): Likewise. From-SVN: r267179
2018-12-15re PR fortran/88138 (ICE in gfc_arith_concat, at fortran/arith.c:1007)Steven G. Kargl4-0/+36
2018-12-15 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/88138 * decl.c (variable_decl): Check that a derived isn't being assigned an incompatible entity in an initialization. 2018-12-15 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/88138 * gfortran.dg/pr88138.f90: New test. From-SVN: r267177
2018-12-15Small lambda instantiation tweak.Jason Merrill2-2/+8
While looking at something else I noticed that we were passing 0 to the "nonclass" parameter here; we might as well pass 1, since capture proxies are always at block scope. * pt.c (tsubst_expr) [DECL_EXPR]: Ignore class-scope bindings when looking up a capture proxy. From-SVN: r267176
2018-12-15cgraph.h (cgraph_node): Add predicate prevailing_p.Jan Hubicka4-87/+166
* cgraph.h (cgraph_node): Add predicate prevailing_p. (cgraph_edge): Add predicate possibly_call_in_translation_unit_p. * ipa-prop.c (ipa_write_jump_function): Optimize streaming of ADDR_EXPR. (ipa_read_jump_function): Add prevails parameter; optimize streaming. (ipa_read_edge_info): Break out from ... (ipa_read_node_info): ... here; optimize streaming. * cgraph.c (cgraph_edge::possibly_call_in_translation_unit_p): New predicate. From-SVN: r267175
2018-12-15ipa-utils.c (ipa_merge_profiles): Do no merging when source function has ↵Jan Hubicka2-0/+9
zero count. * ipa-utils.c (ipa_merge_profiles): Do no merging when source function has zero count. From-SVN: r267174
2018-12-15re PR tree-optimization/88464 (AVX-512 vectorization of masked scatter ↵Jakub Jelinek2-0/+106
failing with "not suitable for scatter store") PR tree-optimization/88464 PR target/88498 * tree-vect-stmts.c (vect_build_gather_load_calls): For NARROWING and mask with integral masktype, don't try to permute mask vectors, instead emit VEC_UNPACK_{LO,HI}_EXPR. Fix up NOP_EXPR operand. (vectorizable_store): Handle masked scatters with decl and integral mask type. (permute_vec_elements): Allow scalar_dest to be NULL. * config/i386/i386.c (ix86_get_builtin) <case IX86_BUILTIN_GATHER3ALTDIV16SF>: Use lowpart_subreg for masks. <case IX86_BUILTIN_GATHER3ALTDIV8SF>: Don't assume mask and src have to be the same. * gcc.target/i386/avx512f-pr88462-1.c: Rename to ... * gcc.target/i386/avx512f-pr88464-1.c: ... this. Fix up PR number. Expect 4 vectorized loops instead of 3. (f4): New function. * gcc.target/i386/avx512f-pr88462-2.c: Rename to ... * gcc.target/i386/avx512f-pr88464-2.c: ... this. Fix up PR number and #include. (avx512f_test): Prepare arguments for f4 and check the results. * gcc.target/i386/avx512f-pr88464-3.c: New test. * gcc.target/i386/avx512f-pr88464-4.c: New test. From-SVN: r267170
2018-12-15re PR tree-optimization/88464 (AVX-512 vectorization of masked scatter ↵Jakub Jelinek6-38/+152
failing with "not suitable for scatter store") PR tree-optimization/88464 PR target/88498 * tree-vect-stmts.c (vect_build_gather_load_calls): For NARROWING and mask with integral masktype, don't try to permute mask vectors, instead emit VEC_UNPACK_{LO,HI}_EXPR. Fix up NOP_EXPR operand. (vectorizable_store): Handle masked scatters with decl and integral mask type. (permute_vec_elements): Allow scalar_dest to be NULL. * config/i386/i386.c (ix86_get_builtin) <case IX86_BUILTIN_GATHER3ALTDIV16SF>: Use lowpart_subreg for masks. <case IX86_BUILTIN_GATHER3ALTDIV8SF>: Don't assume mask and src have to be the same. * gcc.target/i386/avx512f-pr88462-1.c: Rename to ... * gcc.target/i386/avx512f-pr88464-1.c: ... this. Fix up PR number. Expect 4 vectorized loops instead of 3. (f4): New function. * gcc.target/i386/avx512f-pr88462-2.c: Rename to ... * gcc.target/i386/avx512f-pr88464-2.c: ... this. Fix up PR number and #include. (avx512f_test): Prepare arguments for f4 and check the results. * gcc.target/i386/avx512f-pr88464-3.c: New test. * gcc.target/i386/avx512f-pr88464-4.c: New test. From-SVN: r267169
2018-12-15ipa.c (cgraph_build_static_cdtor_1): Add OPTIMIZATION and TARGET parameters.Jan Hubicka2-3/+17
* ipa.c (cgraph_build_static_cdtor_1): Add OPTIMIZATION and TARGET parameters. (cgraph_build_static_cdtor): Update. (build_cdtor): Use OPTIMIZATION and TARGET of the first real cdtor called. From-SVN: r267168
2018-12-15re PR c++/84644 (internal compiler error: in ↵Paolo Carlini7-6/+32
warn_misplaced_attr_for_class_type, at cp/decl.c:4718) /cp 2018-12-15 Paolo Carlini <paolo.carlini@oracle.com> PR c++/84644 * decl.c (check_tag_decl): A decltype with no declarator doesn't declare anything. /testsuite 2018-12-15 Paolo Carlini <paolo.carlini@oracle.com> PR c++/84644 * g++.dg/cpp0x/decltype68.C: New. * g++.dg/cpp0x/decltype-33838.C: Adjust. * g++.dg/template/spec32.C: Likewise. * g++.dg/template/ttp22.C: Likewise. From-SVN: r267165
2018-12-15[RS6000] Use gen_hard_reg_clobber in rs6000.cAlan Modra2-24/+19
I noticed when looking at PR88311 that rs6000_call_sysv should be using gen_hard_reg_clobber (as the sysv call insns did prior to introducing rs6000_call_sysv). This patch fixes that minor regression, and other like places in rs6000.c. * config/rs6000/rs6000.c (generate_set_vrsave, rs6000_emit_savres_rtx), (rs6000_emit_prologue, rs6000_call_aix, rs6000_call_sysv), (rs6000_call_darwin_1): Use gen_hard_reg_clobber. From-SVN: r267164
2018-12-15Daily bump.GCC Administrator1-1/+1
From-SVN: r267163
2018-12-15re PR target/88489 (FAIL: gcc.target/i386/avx512f-vfixupimmss-2.c execution ↵Jakub Jelinek5-1/+51
test) PR target/88489 * config/i386/sse.md (UNSPEC_SFIXUPIMM): New unspec enumerator. (avx512f_sfixupimm<mode><mask_name><round_saeonly_name>): Use it instead of UNSPEC_FIXUPIMM. * gcc.target/i386/avx512vl-vfixupimmsd-2.c: New test. * gcc.target/i386/avx512vl-vfixupimmss-2.c: New test. From-SVN: r267160
2018-12-15re PR rtl-optimization/88478 (valgrind error in cselib_record_sets)Jakub Jelinek4-1/+29
PR rtl-optimization/88478 * cselib.c (cselib_record_sets): Move sets[i].src_elt tests after REG_P (dest) test. * g++.dg/opt/pr88478.C: New test. From-SVN: r267159
2018-12-14PR tree-optimization/88372 - alloc_size attribute is ignored on function ↵Martin Sebor7-40/+308
pointers gcc/ChangeLog: PR tree-optimization/88372 * calls.c (maybe_warn_alloc_args_overflow): Handle function pointers. * tree-object-size.c (alloc_object_size): Same. Simplify. * doc/extend.texi (Object Size Checking): Update. (Other Builtins): Add __builtin_object_size. (Common Type Attributes): Add alloc_size. (Common Variable Attributes): Ditto. gcc/testsuite/ChangeLog: PR tree-optimization/88372 * gcc.dg/Walloc-size-larger-than-18.c: New test. * gcc.dg/builtin-object-size-19.c: Same. From-SVN: r267158
2018-12-14PR tree-optimization/87096 - Optimised snprintf is not POSIX conformantMartin Sebor4-7/+205
gcc/ChangeLog: PR tree-optimization/87096 * gimple-ssa-sprintf.c (sprintf_dom_walker::handle_gimple_call): Avoid folding calls whose bound may exceed INT_MAX. Diagnose bound ranges that exceed the limit. gcc/testsuite/ChangeLog: PR tree-optimization/87096 * gcc.dg/tree-ssa/builtin-snprintf-4.c: New test. From-SVN: r267157
2018-12-14PR 79738 - Documentation for __attribute__((const)) slightly misleadingMartin Sebor2-30/+73
gcc/ChangeLog: * doc/extend.texi (attribute const, pure): Clarify. From-SVN: r267156
2018-12-14[PR c++/87814] undefer deferred noexcept on tsubst if requestAlexandre Oliva4-3/+48
tsubst_expr and tsubst_copy_and_build are not expected to handle DEFERRED_NOEXCEPT exprs, but if tsubst_exception_specification takes a DEFERRED_NOEXCEPT expr with !defer_ok, it just passes the expr on for tsubst_copy_and_build to barf. This patch arranges for tsubst_exception_specification to combine the incoming args with those already stored in a DEFERRED_NOEXCEPT, and then substitute them into the pattern, when retaining a deferred noexcept is unacceptable. for gcc/cp/ChangeLog PR c++/87814 * pt.c (tsubst_exception_specification): Handle DEFERRED_NOEXCEPT with !defer_ok. for gcc/testsuite/ChangeLog PR c++/87814 * g++.dg/cpp1z/pr87814.C: New. From-SVN: r267155
2018-12-14x86: Add -mmanual-endbr and cf_check function attributeH.J. Lu11-1/+96
Currently GCC inserts an ENDBR instruction at the entry of every non-static function, unless LTO compilation is used. Marking every function that is not called indirectly with the nocf_check attribute is not ideal, since 99% of the functions in a program may be of this kind. This patch adds -mmanual-endbr and the cf_check function attribute. They can be used together with -fcf-protection so that an ENDBR instruction is inserted only at the entry of functions with the cf_check attribute. This limits the number of ENDBR instructions and reduces program size. gcc/ * config/i386/i386.c (rest_of_insert_endbranch): Insert ENDBR at the function entry only when -mmanual-endbr isn't used or there is cf_check function attribute. (ix86_attribute_table): Add cf_check. * config/i386/i386.opt: Add -mmanual-endbr. * doc/extend.texi: Document cf_check attribute. * doc/invoke.texi: Document -mmanual-endbr. gcc/testsuite/ * gcc.target/i386/cf_check-1.c: New test. * gcc.target/i386/cf_check-2.c: Likewise. * gcc.target/i386/cf_check-3.c: Likewise. * gcc.target/i386/cf_check-4.c: Likewise. * gcc.target/i386/cf_check-5.c: Likewise. From-SVN: r267154
2018-12-14Missing changes from "Adjust copy/copyin/copyout/create for OpenACC 2.5"Thomas Schwinge19-297/+51
Most of that patch's changes were already committed as part of r261813 "Update OpenACC data clause semantics to the 2.5 behavior", but not all of them. libgomp/ * oacc-mem.c (acc_present_or_create): Remove definition and change to alias of acc_create. (acc_present_or_copyin): Remove definition and change to alias of acc_copyin. * oacc-parallel.c (GOACC_enter_exit_data): Call acc_create instead of acc_present_or_create. * testsuite/libgomp.oacc-c-c++-common/data-already-1.c: Remove. * testsuite/libgomp.oacc-c-c++-common/data-already-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-already-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-already-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-already-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-already-6.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-already-7.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-already-8.c: Likewise. * testsuite/libgomp.oacc-fortran/data-already-1.f: Likewise. * testsuite/libgomp.oacc-fortran/data-already-2.f: Likewise. * testsuite/libgomp.oacc-fortran/data-already-3.f: Likewise. * testsuite/libgomp.oacc-fortran/data-already-4.f: Likewise. * testsuite/libgomp.oacc-fortran/data-already-5.f: Likewise. * testsuite/libgomp.oacc-fortran/data-already-6.f: Likewise. * testsuite/libgomp.oacc-fortran/data-already-7.f: Likewise. * testsuite/libgomp.oacc-fortran/data-already-8.f: Likewise. Co-Authored-By: Chung-Lin Tang <cltang@codesourcery.com> From-SVN: r267153
2018-12-14[PR88495] An OpenACC async queue is always synchronized with itselfThomas Schwinge4-139/+8
An OpenACC async queue is always synchronized with itself, so invocations like "#pragma acc wait(0) async(0)", or "acc_wait_async (0, 0)" don't make a lot of sense, but are still valid. libgomp/ PR libgomp/88495 * plugin/plugin-nvptx.c (nvptx_wait_async): Don't refuse "identical parameters". * testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c: Update. * testsuite/libgomp.oacc-c-c++-common/lib-80.c: Remove. From-SVN: r267152
2018-12-14[PR88484] OpenACC wait directive without wait argument but with async clauseThomas Schwinge3-2/+84
We don't correctly handle "#pragma acc wait async (a)" for "a >= 0", treating it as a no-op when it should enqueue the appropriate wait operations on "async (a)". libgomp/ PR libgomp/88484 * oacc-parallel.c (GOACC_wait): Correct handling for "async >= 0". * testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c: New file. From-SVN: r267151
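The directive form discussed in this commit can be sketched as follows. This is an illustrative example only, not the libgomp test case: `double_then_increment` is a hypothetical function, and the queue numbers 0 and 1 are arbitrary. It assumes an OpenACC-enabled build (for GCC, `-fopenacc`); when OpenACC is not enabled the pragmas are ignored and the loops simply run in order on the host, giving the same result.

```c
#include <assert.h>
#include <stddef.h>

/* Doubles every element on async queue 0, then adds 1 on async queue 1.
   The wait directive without a wait argument but with an async clause is
   what orders the second kernel after the first. */
void double_then_increment(int *a, int n)
{
    assert(a != NULL && n >= 0);

#pragma acc data copy(a[0:n])
    {
#pragma acc parallel loop present(a[0:n]) async(0)
        for (int i = 0; i < n; ++i)
            a[i] *= 2;

        /* No wait argument: work enqueued so far on the other queues must
           complete before later work on queue 1 starts.  The host thread
           itself does not block here. */
#pragma acc wait async(1)

#pragma acc parallel loop present(a[0:n]) async(1)
        for (int i = 0; i < n; ++i)
            a[i] += 1;

        /* Block the host until all queues have drained, before the data
           region copies the array back. */
#pragma acc wait
    }
}
```

If the `wait async(1)` directive were treated as a no-op, as described above, the two kernels could overlap on an offloading device and the increment might be applied before the doubling.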
2018-12-14[PR88407] [OpenACC] Correctly handle unseen async-argumentsThomas Schwinge11-267/+93
... which turn the operation into a no-op. libgomp/ PR libgomp/88407 * plugin/plugin-nvptx.c (nvptx_async_test, nvptx_wait) (nvptx_wait_async): Unseen async-argument is a no-op. * testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Update. * testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise. * testsuite/libgomp.oacc-fortran/lib-12.f90: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-71.c: Merge into... * testsuite/libgomp.oacc-c-c++-common/lib-69.c: ... this. Update. * testsuite/libgomp.oacc-c-c++-common/lib-77.c: Merge into... * testsuite/libgomp.oacc-c-c++-common/lib-74.c: ... this. Update From-SVN: r267150
2018-12-14Revise libgomp.oacc-c-c++-common/data-2-lib.c, ↵Thomas Schwinge3-157/+125
libgomp.oacc-c-c++-common/data-2.c These are meant to be functionally equivalent (but no longer are), just using different means. Also, use the OpenACC "*_async" functions recently added. libgomp/ * testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Revise. * testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise. From-SVN: r267149
2018-12-14Correctly describe OpenACC async/wait dependenciesChung-Lin Tang4-3/+9
libgomp/ * testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Adjust. * testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/data-3.c: Likewise. Reviewed-by: Thomas Schwinge <thomas@codesourcery.com> From-SVN: r267148
2018-12-14[PR88370] acc_get_cuda_stream/acc_set_cuda_stream: acc_async_sync, ↵Thomas Schwinge8-20/+222
acc_async_noval Per my reading of the OpenACC specification (and as supported by secondary documentation, such as code examples, or presentations), it's valid to call "acc_get_cuda_stream"/"acc_set_cuda_stream" also with "acc_async_sync", "acc_async_noval" arguments, not just with the nonnegative values as currently implemented. libgomp/ PR libgomp/88370 * libgomp.texi (acc_get_current_cuda_context, acc_get_cuda_stream) (acc_set_cuda_stream): Clarify. * oacc-cuda.c (acc_get_cuda_stream, acc_set_cuda_stream): Use "async_valid_p". * plugin/plugin-nvptx.c (nvptx_set_cuda_stream): Refuse "async == acc_async_sync". * testsuite/libgomp.oacc-c-c++-common/acc_set_cuda_stream-1.c: New file. * testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/lib-84.c: Update. * testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise. From-SVN: r267147
2018-12-14Add user-friendly diagnostics for OpenACC loop parallelism assignedThomas Schwinge17-16/+346
gcc/ * omp-offload.c (inform_oacc_loop): New function. (execute_oacc_device_lower): Use it to display loop parallelism. gcc/testsuite/ * c-c++-common/goacc/note-parallelism.c: New test. * gfortran.dg/goacc/note-parallelism.f90: New test. * c-c++-common/goacc/classify-kernels-unparallelized.c: Update. * c-c++-common/goacc/classify-kernels.c: Likewise. * c-c++-common/goacc/classify-parallel.c: Likewise. * c-c++-common/goacc/classify-routine.c: Likewise. * c-c++-common/goacc/kernels-1.c: Likewise. * c-c++-common/goacc/kernels-double-reduction-n.c: Likewise. * c-c++-common/goacc/kernels-double-reduction.c: Likewise. * gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise. * gfortran.dg/goacc/classify-kernels.f95: Likewise. * gfortran.dg/goacc/classify-parallel.f95: Likewise. * gfortran.dg/goacc/classify-routine.f95: Likewise. * gfortran.dg/goacc/kernels-loop-inner.f95: Likewise. Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com> From-SVN: r267146
2018-12-14Repair liboffloadmic after "(Partial) OpenMP 5.0 support for GCC 9"Thomas Schwinge2-6/+12
..., which now failed to build, as follows: In file included from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: [...]/source-gcc/liboffloadmic/runtime/offload.h:220:12: error: conflicting declaration of C function 'int omp_target_is_present(void*, int)' 220 | extern int omp_target_is_present( | ^~~~~~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload.h:45, from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: ./../libgomp/omp.h:166:12: note: previous declaration 'int omp_target_is_present(const void*, int)' 166 | extern int omp_target_is_present (const void *, int) __GOMP_NOTHROW; | ^~~~~~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: [...]/source-gcc/liboffloadmic/runtime/offload.h:236:12: error: conflicting declaration of C function 'int omp_target_memcpy(void*, void*, size_t, size_t, size_t, int, int)' 236 | extern int omp_target_memcpy( | ^~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload.h:45, from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: ./../libgomp/omp.h:167:12: note: previous declaration 'int omp_target_memcpy(void*, const void*, long unsigned int, long unsigned int, long unsigned int, int, int)' 167 | extern int omp_target_memcpy (void *, const void *, __SIZE_TYPE__, | ^~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: [...]/source-gcc/liboffloadmic/runtime/offload.h:262:12: error: conflicting declaration of C function 'int omp_target_memcpy_rect(void*, void*, size_t, int, const size_t*, const size_t*, const size_t*, const size_t*, const size_t*, int, 
int)' 262 | extern int omp_target_memcpy_rect( | ^~~~~~~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload.h:45, from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: ./../libgomp/omp.h:170:12: note: previous declaration 'int omp_target_memcpy_rect(void*, const void*, long unsigned int, int, const long unsigned int*, const long unsigned int*, const long unsigned int*, const long unsigned int*, const long unsigned int*, int, int)' 170 | extern int omp_target_memcpy_rect (void *, const void *, __SIZE_TYPE__, int, | ^~~~~~~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: [...]/source-gcc/liboffloadmic/runtime/offload.h:285:12: error: conflicting declaration of C function 'int omp_target_associate_ptr(void*, void*, size_t, size_t, int)' 285 | extern int omp_target_associate_ptr( | ^~~~~~~~~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload.h:45, from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: ./../libgomp/omp.h:177:12: note: previous declaration 'int omp_target_associate_ptr(const void*, const void*, long unsigned int, long unsigned int, int)' 177 | extern int omp_target_associate_ptr (const void *, const void *, __SIZE_TYPE__, | ^~~~~~~~~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: [...]/source-gcc/liboffloadmic/runtime/offload.h:299:12: error: conflicting declaration of C function 'int omp_target_disassociate_ptr(void*, int)' 299 | extern int omp_target_disassociate_ptr( | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from [...]/source-gcc/liboffloadmic/runtime/offload.h:45, from [...]/source-gcc/liboffloadmic/runtime/offload_common.h:43, 
from [...]/source-gcc/liboffloadmic/runtime/dv_util.cpp:31: ./../libgomp/omp.h:179:12: note: previous declaration 'int omp_target_disassociate_ptr(const void*, int)' 179 | extern int omp_target_disassociate_ptr (const void *, int) __GOMP_NOTHROW; | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ Makefile:904: recipe for target 'liboffloadmic_target_la-dv_util.lo' failed make[3]: *** [liboffloadmic_target_la-dv_util.lo] Error 1 make[3]: Leaving directory '[...]/build-gcc-offload-x86_64-intelmicemul-linux-gnu/x86_64-intelmicemul-linux-gnu/liboffloadmic' Makefile:1031: recipe for target 'all-recursive' failed make[2]: *** [all-recursive] Error 1 make[2]: Leaving directory '[...]/build-gcc-offload-x86_64-intelmicemul-linux-gnu/x86_64-intelmicemul-linux-gnu/liboffloadmic' Makefile:12707: recipe for target 'all-target-liboffloadmic' failed make[1]: *** [all-target-liboffloadmic] Error 2 make[1]: Leaving directory '[...]/build-gcc-offload-x86_64-intelmicemul-linux-gnu' Makefile:941: recipe for target 'all' failed make: *** [all] Error 2 liboffloadmic/ * runtime/offload.h (omp_target_is_present, omp_target_memcpy) (omp_target_memcpy_rect, omp_target_associate_ptr) (omp_target_disassociate_ptr): Adjust to libgomp changes. From-SVN: r267145
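The build failure above comes down to two headers declaring the same C function with different parameter types. A minimal sketch of the rule, using a hypothetical function `target_is_present` rather than the real libgomp or liboffloadmic prototypes:

```c
#include <assert.h>
#include <stddef.h>

/* First header: the pointer parameter is read-only. */
extern int target_is_present(const void *ptr, int device_num);

/* Second header: a redeclaration is fine as long as the parameter types
   match exactly.  Writing `void *ptr` here instead would make the two
   declarations incompatible, and the compiler would reject the second
   one as conflicting with the first. */
extern int target_is_present(const void *ptr, int device_num);

/* Toy definition for illustration: only "device" 0 exists, and only
   non-null pointers count as present. */
int target_is_present(const void *ptr, int device_num)
{
    assert(device_num >= 0);
    return ptr != NULL && device_num == 0;
}
```

Adding `const` to the redeclared prototypes so that they match is the kind of adjustment the ChangeLog entry describes for `runtime/offload.h`.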
2018-12-14[PR86823] retain deferred access checks from outside firewallAlexandre Oliva4-4/+32
We used to preserve deferred access checks along with resolved template ids, but a tentative parsing firewall introduced additional layers of deferred access checks, so that we don't preserve the checks we want to any more. This patch moves the deferred access checks from outside the firewall into it. From: Jason Merrill <jason@redhat.com> for gcc/cp/ChangeLog PR c++/86823 * parser.c (cp_parser_template_id): Rearrange deferred access checks into the firewall. From: Alexandre Oliva <aoliva@redhat.com> for gcc/testsuite/ChangeLog PR c++/86823 * g++.dg/pr86823.C: New. From-SVN: r267144
2018-12-14re PR c++/82294 (Array of objects with constexpr constructors initialized ↵Jakub Jelinek7-17/+131
from space-inefficient memory image) PR c++/82294 PR c++/87436 * expr.h (categorize_ctor_elements): Add p_unique_nz_elts argument. * expr.c (categorize_ctor_elements_1): Likewise. Compute it like p_nz_elts, except don't multiply it by mult. Adjust recursive call. Fix up COMPLEX_CST handling. (categorize_ctor_elements): Add p_unique_nz_elts argument, initialize it and pass it through to categorize_ctor_elements_1. (mostly_zeros_p, all_zeros_p): Adjust categorize_ctor_elements callers. * gimplify.c (gimplify_init_constructor): Likewise. Don't force ctor into readonly data section if num_unique_nonzero_elements is smaller or equal to 1/8 of num_nonzero_elements and size is >= 64 bytes. * g++.dg/tree-ssa/pr82294.C: New test. * g++.dg/tree-ssa/pr87436.C: New test. From-SVN: r267143
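The size heuristic stated in the gimplify.c entry can be written as a small predicate. This is a sketch of the condition as worded in the ChangeLog, not GCC's actual code: `prefer_runtime_init` and its parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when a constructor should NOT be forced into the read-only
   data section: the object is at least 64 bytes and the number of unique
   nonzero elements is at most 1/8 of the number of nonzero elements. */
bool prefer_runtime_init(long long unique_nonzero, long long nonzero,
                         long long size_bytes)
{
    assert(unique_nonzero >= 0 && nonzero >= 0 && size_bytes >= 0);
    /* For non-negative integers, unique <= nonzero / 8 is the same test as
       8 * unique <= nonzero, without the risk of overflow. */
    return size_bytes >= 64 && unique_nonzero <= nonzero / 8;
}
```

For example, an array of 1000 identical two-field objects has many nonzero elements but only a couple of unique ones, so a loop that repeats the initializer is far smaller than a full memory image.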
2018-12-14re PR c++/82294 (Array of objects with constexpr constructors initialized ↵Jakub Jelinek2-3/+14
from space-inefficient memory image) PR c++/82294 PR c++/87436 * init.c (build_vec_init): Change num_initialized_elts type from int to HOST_WIDE_INT. Build a RANGE_EXPR if e needs to be repeated more than once. From-SVN: r267142
2018-12-14[ARM] Improve robustness of -mslow-flash-dataThomas Preud'homme11-47/+237
Current code to handle -mslow-flash-data in machine description files suffers from a number of issues which this patch fixes: 1) The insn_and_split patterns in vfp.md that load a generic floating-point constant via a GPR first and then move it to a VFP register are guarded by !reload_completed, which is explicitly forbidden in the GCC internals documentation section 17.2 point 3; 2) A number of testcases in the testsuite ICE under -mslow-flash-data when targeting the hardfloat ABI [1]; 3) Instructions performing loads from the literal pool are not disabled. These problems are addressed by 2 separate actions: 1) Making the splitters take a clobber and changing the expanders accordingly to generate a mov with clobber in cases where a literal pool would be used. The splitter can thus be enabled after reload since it does not call gen_reg_rtx anymore; 2) Adding new predicates and constraints to disable literal pool loads in existing instructions when -mslow-flash-data is in effect. The patch also reworks the splitter for DFmode slightly to generate an intermediate DI load instead of 2 intermediate SI loads, thus relying on the existing DI splitters instead of redoing their job. Lastly, the patch adds the missing arm_fp_ok effective target to some of the slow-flash-data testcases. [1] c-c++-common/Wunused-var-3.c gcc.c-torture/compile/pr72771.c gcc.c-torture/compile/vector-5.c gcc.c-torture/compile/vector-6.c gcc.c-torture/execute/20030914-1.c gcc.c-torture/execute/20050316-1.c gcc.c-torture/execute/pr59643.c gcc.dg/builtin-tgmath-1.c gcc.dg/debug/pr55730.c gcc.dg/graphite/interchange-7.c gcc.dg/pr56890-2.c gcc.dg/pr68474.c gcc.dg/pr80286.c gcc.dg/torture/pr35227.c gcc.dg/torture/pr65077.c gcc.dg/torture/pr86363.c g++.dg/torture/pr81112.C g++.dg/torture/pr82985.C g++.dg/warn/Wunused-var-7.C and a lot more in libstdc++ in special_functions/*_comp_ellint_* and special_functions/*_ellint_* directories. 
2018-12-14 Thomas Preud'homme <thomas.preudhomme@arm.com> gcc/ * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and source is a constant that would be loaded by literal pool. (movsf expander): Generate a no_literal_pool_sf_immediate insn if -mslow-flash-data is present, targeting hardfloat ABI and source is a float constant that cannot be loaded via vmov. (movdf expander): Likewise but generate a no_literal_pool_df_immediate insn. (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a float constant that would be loaded by literal pool. (softfloat constant movsf splitter): Splitter for the above case. (movdf_soft_insn): Split if -mslow-flash-data and source is a float constant that would be loaded by literal pool. (softfloat constant movdf splitter): Splitter for the above case. * config/arm/constraints.md (Pz): Document existing constraint. (Ha): Define constraint. (Tu): Likewise. * config/arm/predicates.md (hard_sf_operand): New predicate. (hard_df_operand): Likewise. * config/arm/thumb2.md (thumb2_movsi_insn): Split if -mslow-flash-data and constant would be loaded by literal pool. * config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant load in VFP register. (movdi_vfp): Likewise. (thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to prevent match for a constant load if -mslow-flash-data and constant cannot be loaded via vmov. Adapt constraint accordingly by using Ha instead of E for generic floating-point constant load. (thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead. (no_literal_pool_df_immediate): Add a clobber to use as the intermediate general purpose register and also enable it after reload but disable it if the constant is a valid FP constant. Add constraints and generate a DI intermediate load rather than 2 SI loads. (no_literal_pool_sf_immediate): Add a clobber to use as the intermediate general purpose register and also enable it after reload. 
2018-11-14 Thomas Preud'homme <thomas.preudhomme@arm.com> gcc/testsuite/ * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok effective target. * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise. * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise. From-SVN: r267141
2018-12-14digest: Remove empty directory.Iain Buclaw1-0/+4
libphobos/ChangeLog: 2018-12-14 Iain Buclaw <ibuclaw@gdcproject.org> * src/std/internal/digest: Remove empty directory. From-SVN: r267138
2018-12-14re PR target/88474 (Inline built-in hypot for -ffast-math)Uros Bizjak4-0/+35
PR target/88474 * internal-fn.def (HYPOT): New. * optabs.def (hypot_optab): New. * config/i386/i386.md (hypot<mode>3): New expander. From-SVN: r267137
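As a rough illustration of what this expander is about, the sketch below contrasts a library call to `std::hypot` with the direct square-root form. That an inline expansion under -ffast-math resembles the direct form is an assumption made here for illustration; the commit message does not spell out the generated sequence. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Library form: std::hypot guards against intermediate overflow and
// underflow, which normally requires a call into libm.
double length_via_hypot(double x, double y)
{
    return std::hypot(x, y);
}

// Direct form: cheaper, but x * x + y * y can overflow or underflow
// for extreme inputs. Assumption: a fast-math inline expansion trades
// that safety for speed in a similar way.
double length_direct(double x, double y)
{
    return std::sqrt(x * x + y * y);
}

// Small usage check: both forms agree on a well-scaled input.
void demo_hypot()
{
    assert(std::fabs(length_via_hypot(3.0, 4.0) - length_direct(3.0, 4.0)) < 1e-12);
}
```

For well-scaled inputs the two forms agree closely; they diverge only when the squares leave the representable range.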
2018-12-14* target.def (post_cfi_startproc): Fix text.Jeff Law2-1/+6
From-SVN: r267136
2018-12-14[PATCH 1/3][GCC] Add new target hook asm_post_cfi_startprocSam Tebbs7-0/+39
2018-12-14 Sam Tebbs <sam.tebbs@arm.com> * doc/tm.texi (TARGET_ASM_POST_CFI_STARTPROC): Define. * doc/tm.texi.in (TARGET_ASM_POST_CFI_STARTPROC): Define. * dwarf2out.c (dwarf2out_do_cfi_startproc): Trigger the hook. * hooks.c (hook_void_FILEptr_tree): Define. * hooks.h (hook_void_FILEptr_tree): Define. * target.def (post_cfi_startproc): Define. From-SVN: r267135
2018-12-14[offloading] Error on missing symbolsTom de Vries8-7/+127
When compiling an OpenMP or OpenACC program containing a reference in the offloaded code to a symbol that has not been included in the offloaded code, the offloading compiler may ICE in lto1. Fix this by erroring out instead, mentioning the problematic symbol: ... error: variable 'var' has been referenced in offloaded code but hasn't been marked to be included in the offloaded code lto1: fatal error: errors during merging of translation units compilation terminated. ... Built x86_64 with nvptx accelerator and reg-tested libgomp. Built x86_64 and reg-tested libgomp. 2018-12-14 Tom de Vries <tdevries@suse.de> * lto-cgraph.c (verify_node_partition): New function. (input_overwrite_node, input_varpool_node): Use verify_node_partition. * testsuite/libgomp.c-c++-common/function-not-offloaded-aux.c: New test. * testsuite/libgomp.c-c++-common/function-not-offloaded.c: New test. * testsuite/libgomp.c-c++-common/variable-not-offloaded.c: New test. * testsuite/libgomp.oacc-c-c++-common/function-not-offloaded.c: New test. * testsuite/libgomp.oacc-c-c++-common/variable-not-offloaded.c: New test. From-SVN: r267134
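A minimal sketch of the well-formed case, assuming OpenMP `declare target` is the marking the error message refers to. The names `var`, `read_var`, and `read_var_in_target_region` are illustrative and are not taken from the new testcases. Presumably the diagnosed situation arises when `var` is left outside the `declare target` region while code compiled for the device still references it.

```cpp
#include <cassert>

#pragma omp declare target
int var = 5;                      // marked for inclusion in offloaded code

int read_var() { return var; }    // also compiled for the offload target
#pragma omp end declare target

// Runs read_var inside a target region and copies the result back.
// Without OpenMP enabled the pragmas are ignored and this runs on the host.
int read_var_in_target_region()
{
    int result = 0;
#pragma omp target map(from: result)
    result = read_var();
    return result;
}

// Small usage check.
void demo_offload()
{
    assert(read_var_in_target_region() == 5);
}
```

Built without OpenMP support, or with host fallback, the function simply returns the host value of `var`.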
2018-12-14x86: Don't use get_frame_size when finalizing stack frameH.J. Lu4-1/+28
get_frame_size () returns used stack slots during compilation, which may be optimized out later. Since ix86_find_max_used_stack_alignment is called by ix86_finalize_stack_frame_flags to check whether a stack frame is required, there is no need to call get_frame_size (), which may give an inaccurate final stack frame size. Tested on AVX512 machine configured with --with-arch=native --with-cpu=native. gcc/ PR target/88483 * config/i386/i386.c (ix86_finalize_stack_frame_flags): Don't use get_frame_size (). gcc/testsuite/ PR target/88483 * gcc.target/i386/stackalign/pr88483.c: New test. From-SVN: r267133