Age Commit message Author Files Lines
2021-02-17openacc: Fix lowering for derived-type mappings through array elementsJulian Brown11-91/+344
This patch fixes lowering of derived-type mappings which select elements of arrays of derived types, and similar. These would previously lead to ICEs. With this change, OpenACC directives can pass through constructs that are no longer recognized by the gimplifier, hence alterations are needed there also. gcc/fortran/ * trans-openmp.c (gfc_trans_omp_clauses): Handle element selection for arrays of derived types. gcc/ * gimplify.c (gimplify_scan_omp_clauses): Handle ATTACH_DETACH for non-decls. gcc/testsuite/ * gfortran.dg/goacc/array-with-dt-1.f90: New test. * gfortran.dg/goacc/array-with-dt-3.f90: Likewise. * gfortran.dg/goacc/array-with-dt-4.f90: Likewise. * gfortran.dg/goacc/array-with-dt-5.f90: Likewise. * gfortran.dg/goacc/derived-chartypes-1.f90: Re-enable test. * gfortran.dg/goacc/derived-chartypes-2.f90: Likewise. * gfortran.dg/goacc/derived-classtypes-1.f95: Uncomment previously-broken directives. libgomp/ * testsuite/libgomp.oacc-fortran/derivedtypes-arrays-1.f90: New test. * testsuite/libgomp.oacc-fortran/update-dt-array.f90: Likewise.
2021-02-17c++: Fix up build_zero_init_1 once more [PR99106]Jakub Jelinek2-1/+6
My earlier build_zero_init_1 patch for flexible array members created an empty CONSTRUCTOR. As the following testcase shows, that doesn't work very well because the middle-end doesn't expect CONSTRUCTOR elements with incomplete type (that the empty CONSTRUCTOR at the end of outer CONSTRUCTOR had). The following patch just doesn't add any CONSTRUCTOR for the flexible array members, it doesn't seem to be needed. 2021-02-17 Jakub Jelinek <jakub@redhat.com> PR sanitizer/99106 * init.c (build_zero_init_1): For flexible array members just return NULL_TREE instead of returning empty CONSTRUCTOR with non-complete ARRAY_TYPE. * g++.dg/ubsan/pr99106.C: New test.
2021-02-17c++: More set_identifier_type_value fixing [PR 99116]Nathan Sidwell3-18/+58
My recent change looked under template_parms in two places, but that was covering up a separate problem. We were attempting to set the identifier_type_value of a template_parm into the template_parm scope. The peeking stopped us doing that, but confused poplevel, leaving an identifier value lying around. This fixes the underlying problem in do_pushtag -- we only need to set the identifier_type_value directly when we're in a template_parm scope (a later pushdecl will push the actual template_decl). For non-class non-template-parm bindings do_pushdecl already ends up manipulating identifier_type_value correctly. PR c++/99116 gcc/cp/ * name-lookup.c (do_pushdecl): Don't peek under template_parm bindings here ... (set_identifier_type_value_with_scope): ... or here. (do_pushtag): Only set_identifier_type_value_with_scope at non-class template parm scope, and use parent scope. gcc/testsuite/ * g++.dg/lookup/pr99116-1.C: New. * g++.dg/lookup/pr99116-2.C: New.
2021-02-17c++: ICE with header-units [PR 99071]Nathan Sidwell3-1/+15
This ICE was caused by dereferencing the wrong pointer and not finding the expected thing there. Pointers are like that. PR c++/99071 gcc/cp/ * name-lookup.c (maybe_record_mergeable_decl): Deref the correct pointer. gcc/testsuite/ * g++.dg/modules/pr99071_a.H: New. * g++.dg/modules/pr99071_b.H: New.
2021-02-17mips: Avoid out-of-bounds access in mips_symbol_insns [PR98491]Xi Ruoyao1-1/+1
An invalid use of MSA_SUPPORTED_MODE_P was causing an ICE on mips64el with -mmsa. The detailed analysis is posted on bugzilla. gcc/ChangeLog: 2021-02-17 Xi Ruoyao <xry111@mengyan1223.wang> PR target/98491 * config/mips/mips.c (mips_symbol_insns): Do not use MSA_SUPPORTED_MODE_P if mode is MAX_MACHINE_MODE.
2021-02-16c++: Revert EXPR_LOCATION change to build_aggr_init_expr [PR96997]Patrick Palka2-5/+2
My change in r10-7718 to make build_aggr_init_expr set EXPR_LOCATION (mimicking build_target_expr) causes the debuginfo regression PR96997. Given that this change is mostly independent of the rest of the commit, and that the only fallout of reverting it is a less accurate error message location in a testcase introduced in the same commit, it seems the best way forward is to just revert this part of the commit. gcc/cp/ChangeLog: PR debug/96997 PR c++/94034 * tree.c (build_aggr_init_expr): Revert r10-7718 change. gcc/testsuite/ChangeLog: PR debug/96997 PR c++/94034 * g++.dg/cpp1y/constexpr-nsdmi7b.C: Adjust expected location of "call to non-'constexpr' function" error message.
2021-02-17Daily bump.GCC Administrator6-1/+90
2021-02-16compiler: unalias receiver type in export dataIan Lance Taylor2-2/+2
Test case is https://golang.org/cl/292009. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/291991
2021-02-16c++: directives-only preprocessing and include translation [PR 99050]Nathan Sidwell3-3/+18
We make sure files end in \n by placing one at the limit of the buffer (just past the end of what is read). We need to do the same for buffers generated via include-translation. Fortunately they have space. libcpp/ * files.c (_cpp_stack_file): Make buffers end in unread \n. gcc/testsuite/ * g++.dg/modules/pr99050_a.H: New. * g++.dg/modules/pr99050_b.C: New.
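The sentinel idea described in this message can be sketched in plain C. This is only an illustration under stated assumptions, not libcpp's code: `load_with_sentinel` and `line_length` are hypothetical names, and the sketch assumes the data is copied into a buffer that has one spare byte past the end.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libcpp function): copy LEN bytes of SRC into
   a freshly allocated buffer with one extra byte and store '\n' in that
   extra byte.  The newline sits just past the data, at index LEN, so it is
   not part of what was read.  Returns NULL on allocation failure.  */
char *
load_with_sentinel (const char *src, size_t len)
{
  assert (src != NULL);
  char *buf = malloc (len + 1);
  if (buf == NULL)
    return NULL;
  memcpy (buf, src, len);
  buf[len] = '\n';   /* sentinel just past the end of the data */
  return buf;
}

/* Length of the line starting at P.  The scan needs no explicit bounds
   check because the sentinel guarantees a '\n' is eventually found.  */
size_t
line_length (const char *p)
{
  size_t n = 0;
  while (p[n] != '\n')
    n++;
  return n;
}
```

With the sentinel in place, a scanner can treat an input that lacks a final newline exactly like one that has it.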
2021-02-16c-family: ICE with assume_aligned attribute [PR99062]Marek Polacek3-4/+16
We ICE in handle_assume_aligned_attribute since r271338 which added

    @@ -2935,8 +2936,8 @@ handle_assume_aligned_attribute (tree *node, tree name, tree args, int,
           /* The misalignment specified by the second argument
              must be non-negative and less than the alignment.  */
           warning (OPT_Wattributes,
    -              "%qE attribute argument %E is not in the range [0, %E)",
    -              name, val, align);
    +              "%qE attribute argument %E is not in the range [0, %wu]",
    +              name, val, tree_to_uhwi (align) - 1);
           *no_add_attrs = true;
           return NULL_TREE;
         }

because align is INT_MIN and tree_to_uhwi asserts tree_fits_uhwi_p -- which ALIGN does not satisfy, while the prior tree_fits_shwi_p check is fine with it, as is the integer_pow2p check. Since neither of the arguments to assume_aligned can be negative, I've hoisted the tree_int_cst_sgn check. This also adds the missing word "argument" to an existing warning. gcc/c-family/ChangeLog: PR c++/99062 * c-attribs.c (handle_assume_aligned_attribute): Check that the alignment argument is non-negative. Tweak a warning message. gcc/testsuite/ChangeLog: PR c++/99062 * gcc.dg/attr-assume_aligned-4.c: Adjust dg-warning. * g++.dg/ext/attr-assume-aligned.C: New test.
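The argument rules this message relies on (a non-negative, power-of-two alignment and a misalignment in the range [0, alignment)) can be sketched as a stand-alone check. This is a hedged illustration, not GCC's handle_assume_aligned_attribute: the function names are hypothetical and plain `long long` values stand in for GCC's tree constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-alone check.  ALIGN must be a positive power of two
   and MISALIGN must be non-negative and less than ALIGN.  Negativity is
   tested first, so a value such as INT_MIN is rejected before anything
   treats it as unsigned.  */
bool
assume_aligned_args_ok (long long align, long long misalign)
{
  if (align <= 0 || misalign < 0)
    return false;
  unsigned long long a = (unsigned long long) align;
  if ((a & (a - 1)) != 0)   /* more than one bit set: not a power of two */
    return false;
  return misalign < align;
}

/* Largest valid misalignment for ALIGN, i.e. the upper bound a range
   diagnostic would print.  Only meaningful once ALIGN is validated,
   which is what the assertion enforces.  */
unsigned long long
max_misalignment (long long align)
{
  assert (assume_aligned_args_ok (align, 0));
  return (unsigned long long) align - 1;
}
```

Ordering the sign check before the unsigned conversion mirrors the hoisting described above: the subtraction of one is only reached for values that are known to be valid.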
2021-02-16[PATCH 3/3] MIPS: fix compact-branches test FAIL for PIC default configurationYunQiang Su2-2/+2
gcc/testsuite * gcc.target/mips/compact-branches-5.c: Force -fno-PIC. * gcc.target/mips/compact-branches-6.c: Force -fno-PIC.
2021-02-16Fortran: %re/%im fixes for OpenMP/OpenACC + gfc_is_simplify_contiguousTobias Burnus4-0/+105
gcc/fortran/ChangeLog: * expr.c (gfc_is_simplify_contiguous): Handle REF_INQUIRY, i.e. %im and %re which are EXPR_VARIABLE. * openmp.c (resolve_omp_clauses): Diagnose %re/%im explicitly. gcc/testsuite/ChangeLog: * gfortran.dg/goacc/ref_inquiry.f90: New test. * gfortran.dg/gomp/ref_inquiry.f90: New test.
2021-02-16[PR98096] inline-asm: Take inout operands into account for access to labels by names.Vladimir N. Makarov3-12/+33
GCC splits inout operands into output and new matched input operands during gimplification. Addressing operands by name or number is not a problem, as the new input operands are added at the end of the existing input operands. However, it became a problem for labels in asm goto with output reloads. Addressing labels should take the new input operands into account. The patch solves the problem. gcc/ChangeLog: PR inline-asm/98096 * stmt.c (resolve_operand_name_1): Take inout operands into account for access to labels by names. * doc/extend.texi: Describe counting operands for accessing labels. gcc/testsuite/ChangeLog: PR inline-asm/98096 * gcc.c-torture/compile/pr98096.c: New.
2021-02-16Fortran: Reject DT as fmt in I/O statements [PR99111]Tobias Burnus3-0/+75
gcc/fortran/ChangeLog: PR fortran/99111 * io.c (resolve_tag_format): Reject BT_DERIVED/CLASS/VOID as (array-valued) FORMAT tag. gcc/testsuite/ChangeLog: PR fortran/99111 * gfortran.dg/fmt_nonchar_1.f90: New test. * gfortran.dg/fmt_nonchar_2.f90: New test.
2021-02-16tree-optimization/38474 - improve PTA varinfo sortingRichard Biener1-7/+18
This improves a previous heuristic to sort address-taken variables first (because those appear in points-to bitmaps) by tracking which variables appear in ADDRESSOF constraints (there's also graph->address_taken but that's computed only later). This shaves off 30s worth of compile-time for the full testcase in PR38474 (which then still takes 965s to compile at -O2). 2021-02-16 Richard Biener <rguenther@suse.de> PR tree-optimization/38474 * tree-ssa-structalias.c (variable_info::address_taken): New. (new_var_info): Initialize address_taken. (process_constraint): Set address_taken. (solve_constraints): Use the new address_taken flag rather than is_reg_var for sorting variables. (dump_constraint): Dump the variable number if the name is just NULL.
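The ordering heuristic can be illustrated with a small comparator. This is a simplified sketch, not the code in tree-ssa-structalias.c: `struct var`, its fields, and `sort_vars` are hypothetical stand-ins, and the tie-break on `id` is an assumption added to make the order deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, much simplified stand-in for a points-to variable.  */
struct var
{
  unsigned id;
  bool address_taken;   /* variable appears in an ADDRESSOF constraint */
};

/* qsort comparator: address-taken variables sort first; ties are broken
   by ascending id (an assumption of this sketch).  */
static int
cmp_address_taken_first (const void *pa, const void *pb)
{
  const struct var *a = pa;
  const struct var *b = pb;
  if (a->address_taken != b->address_taken)
    return a->address_taken ? -1 : 1;
  return (a->id > b->id) - (a->id < b->id);
}

/* Sort N variables in place so that address-taken ones come first.  */
void
sort_vars (struct var *vars, size_t n)
{
  assert (vars != NULL);
  qsort (vars, n, sizeof *vars, cmp_address_taken_first);
}
```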
2021-02-16openmp: Fix up vectorization simd call badness computation [PR99100]Jakub Jelinek3-5/+27
As mentioned in the PR, ix86_simd_clone_usable didn't make it more desirable to use 'e' mangled AVX512F entrypoints over 'd' mangled ones (AVX2) with the same simdlen. This patch fixes that. I have tweaked the generic code too to make more room for these target specific badness factors. 2021-02-16 Jakub Jelinek <jakub@redhat.com> PR target/99100 * tree-vect-stmts.c (vectorizable_simd_clone_call): For num_calls != 1 multiply by 4096 and for inbranch by 8192. * config/i386/i386.c (ix86_simd_clone_usable): For TARGET_AVX512F, return 3, 2 or 1 for mangle letters 'b', 'c' or 'd'. * gcc.target/i386/pr99100.c: New test.
2021-02-16gcc.misc-tests/outputs.exp (outest): Fix typo "is_target".Hans-Peter Nilsson1-1/+1
Fix typo for istarget in "is_target hppa*-*-hpux*", yielding an error running the test-suite for any target not matching powerpc*-*-aix* (presumably, by code inspection), aborting the check-gcc (check-gcc-c) regression test run some 3000 tests before the last one, missing e.g. all gcc.target tests like so:
-----
...
Running /x/gcc/gcc/testsuite/gcc.misc-tests/outputs.exp ...
ERROR: (DejaGnu) proc "is_target hppa*-*-hpux*" does not exist.
The error code is TCL LOOKUP COMMAND is_target
The info on the error is:
invalid command name "is_target"
    while executing
"::tcl_unknown is_target hppa*-*-hpux*"
    ("uplevel" body line 1)
    invoked from within
"uplevel 1 ::tcl_unknown $args"
=== gcc Summary ===
...
-----
gcc/testsuite: * gcc.misc-tests/outputs.exp (outest): Fix typo "is_target".
2021-02-16Daily bump.GCC Administrator4-1/+144
2021-02-15aarch64: Run SUBTARGET_INIT_BUILTINS if it existsMaya Rashish1-0/+3
Some subtargets don't provide the canonical function names as the symbol name in C libraries, and libcalls will only work if the builtins are patched to emit the correct library name. For example, on NetBSD, cabsl has the symbol name __c99_cabsl, and the patching is done via netbsd_patch_builtin. With this change, libgfortran.so is correctly built with a reference to __c99_cabsl, instead of "cabsl" which is not defined. gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_init_builtins): Call SUBTARGET_INIT_BUILTINS.
2021-02-15rtl-optimization: Fix uninitialized use of opaque mode variable ICE [PR98872]Peter Bergner2-1/+23
The initialize_uninitialized_regs function emits (set (reg:) (CONST0_RTX)) for all uninitialized pseudo uses. However, some modes (eg, opaque modes) may not have a CONST0_RTX defined, leading to an ICE when we try and create the initialization insn. The fix is to skip emitting the initialization if there is no CONST0_RTX defined for the mode. 2021-02-15 Peter Bergner <bergner@linux.ibm.com> gcc/ PR rtl-optimization/98872 * init-regs.c (initialize_uninitialized_regs): Skip initialization if CONST0_RTX is NULL. gcc/testsuite/ PR rtl-optimization/98872 * gcc.target/powerpc/pr98872.c: New test.
2021-02-15libstdc++: Fix __thread_yield for non-gthreads targetsJonathan Wakely1-9/+8
The __gthread_yield() function is only defined for gthreads targets, so check _GLIBCXX_HAS_GTHREADS before using it. Also reorder __thread_relax and __thread_yield so that the former can use the latter instead of repeating the same preprocessor checks. libstdc++-v3/ChangeLog: * include/bits/atomic_wait.h (__thread_yield()): Check _GLIBCXX_HAS_GTHREADS before using __gthread_yield. (__thread_relax()): Use __thread_yield() instead of repeating the preprocessor checks for __gthread_yield.
2021-02-15libstdc++: Add missing return and use reserved nameJonathan Wakely1-4/+8
The once_flag::_M_activate() function is only ever called immediately after a call to once_flag::_M_passive(), and so in the non-gthreads case it is impossible for _M_passive() to be true in the body of _M_activate(). Add a check for it anyway, to avoid warnings about missing return. Also replace a non-reserved name with a reserved one. libstdc++-v3/ChangeLog: * include/std/mutex (once_flag::_M_activate()): Add explicit return statement for passive case. (once_flag::_M_finish(bool)): Use reserved name for parameter.
2021-02-15rtl-ssa: Reduce the amount of temporary memory needed [PR98863]Richard Sandiford8-460/+721
The rtl-ssa code uses an on-the-side IL and needs to build that IL for each block and RTL insn. I'd originally not used the classical dominance frontier method for placing phis on the basis that it seemed like more work in this context: we're having to visit everything in an RPO walk anyway, so for non-backedge cases we can tell immediately whether a phi node is needed. We then speculatively created phis for registers that are live across backedges and simplified them later. This avoided having to walk most of the IL twice (once to build the initial IL, and once to link uses to phis). However, as shown in PR98863, this leads to excessive temporary memory in extreme cases, since we had to record the value of every live register on exit from every block. In that PR, there were many registers that were live (but unused) across a large region of code. This patch does use the classical approach to placing phis, but tries to use the existing DF defs information to avoid two walks of the IL. We still use the previous approach for memory, since there is no up-front information to indicate whether a block defines memory or not. However, since memory is just treated as a single unified thing (like for gimple vops), memory doesn't suffer from the same scalability problems as registers. With this change, fwprop no longer seems to be a memory-hog outlier in the PR: the maximum RSS is similar with and without fwprop. The PR also shows the problems inherent in using bitmap operations involving the live-in and live-out sets, which in the testcase are very large. I've therefore tried to reduce those operations to the bare minimum. The patch also includes other compile-time optimisations motivated by the PR; see the changelog for details. I tried adding: for (int i = 0; i < 200; ++i) { crtl->ssa = new rtl_ssa::function_info (cfun); delete crtl->ssa; } to fwprop.c to stress the code. 
fwprop then took 35% of the compile time for the problematic partition in the PR (measured on a release build). fwprop takes less than .5% of the compile time when running normally. The command: git diff 0b76990a9d75d97b84014e37519086b81824c307~ gcc/fwprop.c | \ patch -p1 -R still gives a working compiler that uses the old fwprop.c. The compile time with that version is very similar. For a more reasonable testcase like optabs.ii at -O, I saw a 6.7% compile time regression with the loop above added (i.e. creating the info 201 times per pass instead of once per pass). That goes down to 4.8% with -O -g. I can't measure a significant difference with a normal compiler (no 200-iteration loop). So I think that (as expected) the patch does make things a bit slower in the normal case. But like Richi says, peak memory usage is harder for users to work around than slightly slower compile times. gcc/ PR rtl-optimization/98863 * rtl-ssa/functions.h (function_info::bb_live_out_info): Delete. (function_info::build_info): Turn into a declaration, moving the definition to internals.h. (function_info::bb_walker): Declare. (function_info::create_reg_use): Likewise. (function_info::calculate_potential_phi_regs): Take a build_info parameter. (function_info::place_phis, function_info::create_ebbs): Declare. (function_info::calculate_ebb_live_in_for_debug): Likewise. (function_info::populate_backedge_phis): Delete. (function_info::start_block, function_info::end_block): Declare. (function_info::populate_phi_inputs): Delete. (function_info::m_potential_phi_regs): Move information to build_info. * rtl-ssa/internals.h: New file. (function_info::bb_phi_info): New class. (function_info::build_info): Moved from functions.h. Add a constructor and destructor. (function_info::build_info::ebb_use): Delete. (function_info::build_info::ebb_def): Likewise. (function_info::build_info::bb_live_out): Likewise. (function_info::build_info::tmp_ebb_live_in_for_debug): New variable. 
(function_info::build_info::potential_phi_regs): Likewise. (function_info::build_info::potential_phi_regs_for_debug): Likewise. (function_info::build_info::ebb_def_regs): Likewise. (function_info::build_info::bb_phis): Likewise. (function_info::build_info::bb_mem_live_out): Likewise. (function_info::build_info::bb_to_rpo): Likewise. (function_info::build_info::def_stack): Likewise. (function_info::build_info::old_def_stack_limit): Likewise. * rtl-ssa/internals.inl (function_info::build_info::record_reg_def): Remove the regno argument. Push the previous definition onto the definition stack where necessary. * rtl-ssa/accesses.cc: Include internals.h. * rtl-ssa/changes.cc: Likewise. * rtl-ssa/blocks.cc: Likewise. (function_info::build_info::build_info): Define. (function_info::build_info::~build_info): Likewise. (function_info::bb_walker): New class. (function_info::bb_walker::bb_walker): Define. (function_info::add_live_out_use): Convert a logarithmic-complexity test into a linear one. Allow the same definition to be passed multiple times. (function_info::calculate_potential_phi_regs): Moved from functions.cc. Take a build_info parameter and store the information there instead. (function_info::place_phis): New function. (function_info::add_entry_block_defs): Update call to record_reg_def. (function_info::calculate_ebb_live_in_for_debug): New function. (function_info::add_phi_nodes): Use bb_phis to decide which registers need phi nodes and initialize ebb_def_regs accordingly. Do not add degenerate phis here. (function_info::add_artificial_accesses): Use create_reg_use. Assert that all definitions are listed in the DF LR sets. Update call to record_reg_def. (function_info::record_block_live_out): Record live-out register values in the phis of successor blocks. Use the live-out set when processing the last block in an EBB, instead of always using the live-in sets of successor blocks. 
AND the live sets with the set of registers that have been defined in the EBB, rather than with all potential phi registers. Cope correctly with branches back to the start of the current EBB. (function_info::start_block): New function. (function_info::end_block): Likewise. (function_info::populate_phi_inputs): Likewise. (function_info::create_ebbs): Likewise. (function_info::process_all_blocks): Rewrite into a multi-phase process. * rtl-ssa/functions.cc: Include internals.h. (function_info::calculate_potential_phi_regs): Move to blocks.cc. (function_info::init_function_data): Remove caller. * rtl-ssa/insns.cc: Include internals.h. (function_info::create_reg_use): New function. Lazily create any degenerate phis needed by the linear RPO view. (function_info::record_use): Use create_reg_use. When processing debug uses, use potential_phi_regs and test it before checking whether the register is live on entry to the current EBB. Lazily calculate ebb_live_in_for_debug. (function_info::record_call_clobbers): Update call to record_reg_def. (function_info::record_def): Likewise.
2021-02-15Fix 2 more leaks related to gen_command_line_string.Martin Liska1-5/+9
gcc/ChangeLog: * toplev.c (init_asm_output): Free output of gen_command_line_string function. (process_options): Likewise.
2021-02-15Add 2 missing Param keywords.Martin Liska1-2/+2
gcc/ChangeLog: * params.opt: Add 2 missing Param keywords.
2021-02-15Fix cast in df_worklist_dataflow_doublequeueEric Botcazou1-1/+1
The existing cast to float gives weird results in the RTL dump files on x86 when the compiler is configured --with-fpmath=sse. gcc/ * df-core.c (df_worklist_dataflow_doublequeue): Use proper cast.
2021-02-15match.pd: Fix up A % (cast) (pow2cst << B) simplification [PR99079]Jakub Jelinek3-6/+82
The (mod @0 (convert?@3 (power_of_two_cand@1 @2))) simplification uses tree_nop_conversion_p (type, TREE_TYPE (@3)) condition, but I believe it doesn't check what it was meant to check. On convert?@3 TREE_TYPE (@3) is not the type of what it has been converted from, but what it has been converted to, which needs to be (because it is operand of normal binary operation) equal or compatible to type of the modulo result and first operand - type. I could fix that by using && tree_nop_conversion_p (type, TREE_TYPE (@1)) and be done with it, but actually most of the non-nop conversions are IMHO ok and so we would regress those optimizations. In particular, if we have say narrowing conversions (foo5 and foo6 in the new testcase), I think we are fine, either the shift of the power of two constant after narrowing conversion is still that power of two (or negation of that) and then it will still work, or the result of narrowing conversion is 0 and then we would have UB which we can ignore. Similarly, widening conversions where the shift result is unsigned are fine, or even widening conversions where the shift result is signed, but we sign extend to a signed wider divisor, the problematic case of INT_MIN will become x % (long long) INT_MIN and we can still optimize that to x & (long long) INT_MAX. What doesn't work is the case in the pr99079.c testcase, widening conversion of a signed shift result to wider unsigned divisor, where if the shift is negative, we end up with x % (unsigned long long) INT_MIN which is x % 0xffffffff80000000ULL where the divisor is not a power of two and we can't optimize that to x & 0x7fffffffULL. So, the patch rejects only the single problematic case. 
Furthermore, when the shift result is signed, we were introducing UB into a program which previously didn't have one (well, left shift into the sign bit is UB in some language/version pairs, but it is definitely valid in C++20 - wonder if I shouldn't move the gcc.c-torture/execute/pr99079.c testcase to g++.dg/torture/pr99079.C and use -std=c++20), by adding that subtraction of 1, x % (1 << 31) in C++20 is well defined, but x & ((1 << 31) - 1) triggers UB on the subtraction. So, the patch performs the subtraction in the unsigned type if it isn't wrapping. 2021-02-15 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/99079 * match.pd (A % (pow2pcst << N) -> A & ((pow2pcst << N) - 1)): Remove useless tree_nop_conversion_p (type, TREE_TYPE (@3)) check. Instead require both type and TREE_TYPE (@1) to be integral types and either type having smaller or equal precision, or TREE_TYPE (@1) being unsigned type, or type being signed type. If TREE_TYPE (@1) doesn't have wrapping overflow, perform the subtraction of one in unsigned type. * gcc.dg/fold-modpow2-2.c: New test. * gcc.c-torture/execute/pr99079.c: New test.
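The arithmetic behind this message can be checked with a few lines of C. This is a sketch for illustration only, not the match.pd pattern: the function names are hypothetical, and the constants assume a 32-bit `int` and a 64-bit `unsigned long long`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True if D is a (non-zero) power of two.  */
bool
is_pow2 (unsigned long long d)
{
  return d != 0 && (d & (d - 1)) == 0;
}

/* For a power-of-two divisor D, X % D equals X & (D - 1) on unsigned
   values.  The subtraction of one is done in an unsigned type, so it
   cannot trigger signed-overflow UB.  */
unsigned long long
mod_pow2 (unsigned long long x, unsigned long long d)
{
  assert (is_pow2 (d));
  return x & (d - 1);
}

/* The problematic conversion: a signed shift result widened to a wider
   unsigned type.  Sign extension fills the upper bits with ones.  */
unsigned long long
widen_signed_to_unsigned (int v)
{
  return (unsigned long long) v;
}
```

Under the stated width assumptions, `widen_signed_to_unsigned (INT_MIN)` is 0xffffffff80000000, which `is_pow2` rejects, so the mask rewrite is not valid for that divisor. This is the single case the message says the patch rejects.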
2021-02-15Daily bump.GCC Administrator3-1/+16
2021-02-14Fix memory leak in ipa-referenceJan Hubicka1-5/+11
2021-02-14 Jan Hubicka <hubicka@ucw.cz> Richard Biener <rguenther@suse.de> PR ipa/97346 * ipa-reference.c (ipa_init): Only conditionally initialize reference_vars_to_consider. (propagate): Conditionally deinitialize reference_vars_to_consider. (ipa_reference_write_optimization_summary): Sanity check that reference_vars_to_consider is not allocated.
2021-02-14libstdc++: Restore <unistd.h> in testsuite_fs.h header [PR 99096]Jonathan Wakely1-1/+1
libstdc++-v3/ChangeLog: PR libstdc++/99096 * testsuite/util/testsuite_fs.h: Always include <unistd.h>.
2021-02-14Daily bump.GCC Administrator4-1/+54
2021-02-13RISC-V: Avoid zero/sign extend for volatile loads. Fix for 97417.Levy Hsu2-7/+49
This expands sub-word loads as a zero/sign extended load, followed by a subreg. This helps eliminate unnecessary zero/sign extend insns after the load, particularly for volatiles, but also in some other cases. Testing shows that it gives consistent code size decreases. Tested with riscv32-elf rv32imac/ilp32 and riscv64-linux rv64gc/lp64d builds and checks. Some -gsplit-stack tests fail with the patch, but this turns out to be an existing bug with the split-stack support that I hadn't noticed before. It isn't a bug in this patch. Ignoring that, there are no regressions. Committed. gcc/ PR target/97417 * config/riscv/riscv-shorten-memrefs.c (pass_shorten_memrefs): Add extend parameter to get_si_mem_base_reg declaration. (get_si_mem_base_reg): Add extend parameter. Set it. (analyze): Pass extend arg to get_si_mem_base_reg. (transform): Likewise. Use it when rewriting mems. * config/riscv/riscv.c (riscv_legitimize_move): Check for subword loads and emit sign/zero extending load followed by subreg move.
2021-02-13RISC-V: Shorten memrefs improvement, partial fix 97417.Jim Wilson1-8/+11
We already have a check for riscv_shorten_memrefs in riscv_address_cost. This adds the same check to riscv_rtx_costs. Making this work also requires a change to riscv_compressed_lw_address_p to work before reload by checking the offset and assuming any pseudo reg is OK. Testing shows that this consistently gives small code size reductions. gcc/ PR target/97417 * config/riscv/riscv.c (riscv_compressed_lw_address_p): Drop early exit when !reload_completed. Only perform check for compressed reg if reload_completed. (riscv_rtx_costs): In MEM case, when optimizing for size and shorten memrefs, if not compressible, then increase cost.
2021-02-13passes: Enable split4 with selective scheduling 2 [PR98439]Jakub Jelinek2-1/+18
As mentioned in the PR, we have 5 split passes (+ splitting during final).
split1 is before RA and is unconditional,
split2 is after RA and is gated on optimize > 0,
split3 is before sched2 and is gated on defined(INSN_SCHEDULING) && optimize > 0 && flag_schedule_insns_after_reload,
split4 is before regstack and is gated on HAVE_ATTR_length && defined (STACK_REGS) && !gate (split3),
split5 is before shorten_branches and is gated on HAVE_ATTR_length && !defined (STACK_REGS),
and the splitting during final works only when !HAVE_ATTR_length.
STACK_REGS is a macro enabled only on i386/x86_64. The problem with the following testcase is that split3 before sched2 is the last splitting pass for the target/command line options set, but selective scheduling unlike normal scheduling can create new instructions that need to be split, which means we ICE during final as there are insns that require splitting but nothing split them. This patch fixes it by doing split4 also when -fselective-scheduling2 is enabled on x86 and split3 has been run. As that option isn't on by default, it should slow down compilation only for those that enable that option. 2021-02-13 Jakub Jelinek <jakub@redhat.com> PR rtl-optimization/98439 * recog.c (pass_split_before_regstack::gate): Enable even when pass_split_before_sched2 is enabled if -fselective-scheduling2 is on. * gcc.target/i386/pr98439.c: New test.
2021-02-13d: Merge upstream dmd 7132b3537Iain Buclaw139-9782/+9937
Splits out all semantic passes for Dsymbol, Type, and TemplateParameter nodes into Visitors in separate files, and the copyright years of all sources have been updated. Reviewed-on: https://github.com/dlang/dmd/pull/12190 gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd 7132b3537. * Make-lang.in (D_FRONTEND_OBJS): Add d/dsymbolsem.o, d/semantic2.o, d/semantic3.o, and d/templateparamsem.o. * d-compiler.cc (Compiler::genCmain): Update calls to semantic entrypoint functions. * d-lang.cc (d_parse_file): Likewise. * typeinfo.cc (make_frontend_typeinfo): Likewise.
2021-02-13i386: Add combiner splitter to optimize V2SImode memory rotation [PR96166]Jakub Jelinek2-0/+32
Since the x86 backend enabled V2SImode vectorization (with TARGET_MMX_WITH_SSE), slp vectorization can kick in and emit movq (%rdi), %xmm1 pshufd $225, %xmm1, %xmm0 movq %xmm0, (%rdi) instead of rolq $32, (%rdi) we used to emit (or emit when slp vectorization is disabled). I think the rotate is both smaller and faster, so this patch adds a combiner splitter to optimize that back. 2021-02-13 Jakub Jelinek <jakub@redhat.com> PR target/96166 * config/i386/mmx.md (*mmx_pshufd_1): Add a combine splitter for swap of V2SImode elements in memory into DImode memory rotate by 32. * gcc.target/i386/pr96166.c: New test.
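The equivalence the splitter relies on, that swapping the two 32-bit elements of a 64-bit value is the same as rotating it by 32 bits, can be checked in portable C. The function names below are hypothetical and the sketch says nothing about the instructions a compiler would choose.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by N bits, with 0 <= N < 64.  */
uint64_t
rotl64 (uint64_t x, unsigned n)
{
  assert (n < 64);
  return n == 0 ? x : (x << n) | (x >> (64 - n));
}

/* Swap the low and high 32-bit halves of X, the scalar analogue of
   swapping the two elements of a V2SImode vector.  */
uint64_t
swap_halves (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);
  return ((uint64_t) lo << 32) | hi;
}
```

For any 64-bit value, `swap_halves (x)` and `rotl64 (x, 32)` produce the same result, which is why a single memory rotate can replace the load, shuffle, and store sequence.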
2021-02-13Daily bump.GCC Administrator11-1/+320
2021-02-13testsuite: Restrict gcc.dg/rtl/aarch64/multi-subreg-1.c test to aarch64 onlyJakub Jelinek1-0/+1
2021-02-13 Jakub Jelinek <jakub@redhat.com> * gcc.dg/rtl/aarch64/multi-subreg-1.c: Add dg-do compile directive and restrict the test to aarch64-*-* target only.
2021-02-12c++: Seed imported bindings [PR 99039]Nathan Sidwell3-6/+48
As mentioned in 99040's fix, we can get inter-module using decls. If the using decl is the only reference to an import, we'll have failed to seed our imports leading to an assertion failure. The fix is straightforward: check binding contents when seeding imports. gcc/cp/ * module.cc (module_state::write_cluster): Check bindings for imported using-decls. gcc/testsuite/ * g++.dg/modules/pr99039_a.C: New. * g++.dg/modules/pr99039_b.C: New.
2021-02-12c++: Register streamed-in decls when new [PR 99040]Nathan Sidwell7-58/+82
With modules one can have using-decls referring to their own scope. This is the way to export things from the GMF or from an import. The problem was I was using current_ns == CP_DECL_CONTEXT (decl) to determine whether a decl should be registered in a namespace level or not. But that's an inadequate check and we ended up reregistering decls and creating a circular list. We should be registering the decl when first encountered -- whether we bind it is orthogonal to that. PR c++/99040 gcc/cp/ * module.cc (trees_in::decl_value): Call add_module_namespace_decl for new namespace-scope entities. (module_state::read_cluster): Don't call add_module_decl here. * name-lookup.h (add_module_decl): Rename to ... (add_module_namespace_decl): ... this. * name-lookup.c (newbinding_bookkeeping): Move into ... (do_pushdecl): ... here. Its only remaining caller. (add_module_decl): Rename to ... (add_module_namespace_decl): ... here. Add checking-assert for circularity. Don't call newbinding_bookkeeping, just extern_c checking and incomplete var checking. gcc/testsuite/ * g++.dg/modules/pr99040_a.C: New. * g++.dg/modules/pr99040_b.C: New. * g++.dg/modules/pr99040_c.C: New. * g++.dg/modules/pr99040_d.C: New.
2021-02-12Expunge namespace-scope IDENTIFIER_TYPE_VALUE & global_type_name [PR 99039]Nathan Sidwell10-162/+106
IDENTIFIER_TYPE_VALUE and friends are a remnant of G++'s C origins. It holds elaborated types on identifier-nodes. While this is fine for C and for local and class-scopes in C++, it fails badly for namespaces. In that case a marker 'global_type_node' was used, which essentially signified 'this is a namespace-scope type *somewhere*', and you'd have to do a regular name_lookup to find it. As the parser and substitution machinery have advanced over the last 25 years or so, there's not much outside of actual name-lookup that uses that. Amusingly the IDENTIFIER_HAS_TYPE_VALUE predicate will do an actual name-lookup and then users would repeat that lookup to find the now-known-to-be-there type. Rather late I realized that this interferes with the lazy loading of module entities, because we were setting IDENTIFIER_TYPE_VALUE to global_type_node. But we could be inside some local scope where that identifier is bound to some local type. Not good! Rather than add more cruft to look at an identifier's shadow stack and alter that as necessary, this takes the approach of removing the existing cruft. We nuke the few places outside of name lookup that use IDENTIFIER_TYPE_VALUE, replacing them with either proper name lookups, alternative sequences, or in some cases asserting that they (no longer) happen. Class template instantiation was calling pushtag after setting IDENTIFIER_TYPE_VALUE in order to stop pushtag creating an implicit typedef and pushing it, but to get the bookkeeping it needed. Let's just do the bookkeeping directly. Then we can stop having a 'bound at namespace-scope' marker at all, which means lazy loading won't screw up local shadow stacks. Also, it simplifies set_identifier_type_value_with_scope, as it never needs to inspect the scope stack. When developing this patch, I discovered a number of places we'd put an actual namespace-scope type on the type_value slot, rather than global_type_node. You might notice this is killing at least two 'why are we doing this?' 
comments. While this doesn't fix the two PRs mentioned, it is a necessary step. PR c++/99039 PR c++/99040 gcc/cp/ * cp-tree.h (CPTI_GLOBAL_TYPE): Delete. (global_type_node): Delete. (IDENTIFIER_TYPE_VALUE): Delete. (IDENTIFIER_HAS_TYPE_VALUE): Delete. (get_type_value): Delete. * name-lookup.h (identifier_type_value): Delete. * name-lookup.c (check_module_override): Don't SET_IDENTIFIER_TYPE_VALUE here. (do_pushdecl): Nor here. (identifier_type_value_1, identifier_type_value): Delete. (set_identifier_type_value_with_scope): Only SET_IDENTIFIER_TYPE_VALUE for local and class scopes. (pushdecl_nanmespace_level): Remove shadow stack nadgering. (do_pushtag): Use REAL_IDENTIFIER_TYPE_VALUE. * call.c (check_dtor_name): Use lookup_name. * decl.c (cxx_init_decl_processing): Drop global_type_node. * decl2.c (cplus_decl_attributes): Don't SET_IDENTIFIER_TYPE_VALUE here. * init.c (get_type_value): Delete. * pt.c (instantiate_class_template_1): Don't call pushtag or SET_IDENTIFIER_TYPE_VALUE here. (tsubst): Assert never an identifier. (dependent_type_p): Drop global_type_node assert. * typeck.c (error_args_num): Don't use IDENTIFIER_HAS_TYPE_VALUE to determine ctorness. gcc/testsuite/ * g++.dg/lookup/pr99039.C: New.
2021-02-12compiler: open byte slice and string embeds using the absolute pathMichael Matloob2-4/+3
The paths vector contains the names of the files that the embed_files_ map is keyed by. While the code processing embed.FS values looks up the paths in the embed_files_ map, the code processing string and byte slice embeds tries opening the files using their names directly. Look up the full paths in the embed_files_ map when opening them. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/291429
2021-02-12PR c/99055 - memory leak in warn_parm_array_mismatchMartin Sebor2-5/+18
gcc/c-family/ChangeLog: PR c/99055 * c-warn.c (warn_parm_array_mismatch): Free strings returned from print_generic_expr_to_str. gcc/ChangeLog: * tree-pretty-print.c (print_generic_expr_to_str): Update comment.
2021-02-12libgfortran: Fix PR95647 by changing the interfaces of operators .eq. and .ne.Steve Kargl2-4/+29
The FE converts the old school .eq. to ==, and then tracks the ==. The module starts with == and so it does not properly overload the .eq. Reversing the interfaces fixes this. 2021-02-12 Steve Kargl <sgk@troutmask.apl.washington.edu> libgfortran/ChangeLog: PR libfortran/95647 * ieee/ieee_arithmetic.F90: Flip interfaces of operators .eq. to == and .ne. to /= . gcc/testsuite/ChangeLog: PR libfortran/95647 * gfortran.dg/ieee/ieee_12.f90: New test.
2021-02-12rtl-ssa: Use right obstack for temporary allocationRichard Sandiford1-1/+1
I noticed while working on PR98863 that we were using the main obstack to allocate temporary uses. That was safe, but represents a kind of local memory leak. gcc/ * rtl-ssa/accesses.cc (function_info::make_use_available): Use m_temp_obstack rather than m_obstack to allocate the temporary use.
2021-02-12df: Record all definitions in DF_LR_BB_INFO->def [PR98863]Richard Sandiford2-7/+66
df_lr_bb_local_compute has:

    FOR_EACH_INSN_INFO_DEF (def, insn_info)
      /* If the def is to only part of the reg, it does
         not kill the other defs that reach here.  */
      if (!(DF_REF_FLAGS (def) & (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))

However, as noted in the comment in the patch and below, almost all partial definitions have an associated use. This means that the confluence function:

    IN = (OUT & ~DEF) | USE

is unaffected by whether partial definitions are in DEF or not.

Even though the choice doesn't matter for the LR problem itself, it's IMO much more convenient for consumers if DEF contains all the definitions in the block. The only pre-RTL-SSA code that tries to consume DEF directly is shrink-wrap.c, which already has to work around the incompleteness of the information:

    /* DF_LR_BB_INFO (bb)->def does not comprise the DF_REF_PARTIAL and
       DF_REF_CONDITIONAL defs.  So if DF_LIVE doesn't exist, i.e.
       at -O1, just give up searching NEXT_BLOCK.  */

I hit the same problem when trying to fix the RTL-SSA part of PR98863.

This patch treats partial definitions as both a def and a use, just like the df_ref records almost always do.

To show that partial definitions almost always have uses:

DF_REF_CONDITIONAL:

Added by:

    case COND_EXEC:
      df_defs_record (collection_rec, COND_EXEC_CODE (x),
                      bb, insn_info, DF_REF_CONDITIONAL);
      break;

Later, df_get_conditional_uses creates uses for all DF_REF_CONDITIONAL definitions.

DF_REF_PARTIAL:

In total, there are 4 locations at which we add partial definitions.

Case 1:

    if (GET_CODE (dst) == STRICT_LOW_PART)
      {
        flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | DF_REF_STRICT_LOW_PART;

        loc = &XEXP (dst, 0);
        dst = *loc;
      }

Corresponding use:

    case STRICT_LOW_PART:
      {
        rtx *temp = &XEXP (dst, 0);
        /* A strict_low_part uses the whole REG and not just the
           SUBREG.  */
        dst = XEXP (dst, 0);
        df_uses_record (collection_rec,
                        (GET_CODE (dst) == SUBREG) ? &SUBREG_REG (dst) : temp,
                        DF_REF_REG_USE, bb, insn_info,
                        DF_REF_READ_WRITE | DF_REF_STRICT_LOW_PART);
      }
      break;

Case 2:

    if (GET_CODE (dst) == ZERO_EXTRACT)
      {
        flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL | DF_REF_ZERO_EXTRACT;

        loc = &XEXP (dst, 0);
        dst = *loc;
      }

Corresponding use:

    case ZERO_EXTRACT:
      {
        df_uses_record (collection_rec, &XEXP (dst, 1),
                        DF_REF_REG_USE, bb, insn_info, flags);
        df_uses_record (collection_rec, &XEXP (dst, 2),
                        DF_REF_REG_USE, bb, insn_info, flags);
        if (GET_CODE (XEXP (dst,0)) == MEM)
          df_uses_record (collection_rec, &XEXP (dst, 0),
                          DF_REF_REG_USE, bb, insn_info,
                          flags);
        else
          df_uses_record (collection_rec, &XEXP (dst, 0),
                          DF_REF_REG_USE, bb, insn_info,
                          DF_REF_READ_WRITE | DF_REF_ZERO_EXTRACT);
    ----------------------^^^^^^^^^^^^^^^^^
      }
      break;

Case 3:

    else if (GET_CODE (dst) == SUBREG && REG_P (SUBREG_REG (dst)))
      {
        if (read_modify_subreg_p (dst))
          flags |= DF_REF_READ_WRITE | DF_REF_PARTIAL;

        flags |= DF_REF_SUBREG;

        df_ref_record (DF_REF_REGULAR, collection_rec,
                       dst, loc, bb, insn_info, DF_REF_REG_DEF, flags);
      }

Corresponding use:

    case SUBREG:
      if (read_modify_subreg_p (dst))
        {
          df_uses_record (collection_rec, &SUBREG_REG (dst),
                          DF_REF_REG_USE, bb, insn_info,
                          flags | DF_REF_READ_WRITE | DF_REF_SUBREG);
          break;
        }

Case 4:

    /* If this is a multiword hardreg, we create some extra
       datastructures that will enable us to easily build REG_DEAD
       and REG_UNUSED notes.  */
    if (collection_rec
        && (endregno != regno + 1) && insn_info)
      {
        /* Sets to a subreg of a multiword register are partial.
           Sets to a non-subreg of a multiword register are not.  */
        if (GET_CODE (reg) == SUBREG)
          ref_flags |= DF_REF_PARTIAL;
        ref_flags |= DF_REF_MW_HARDREG;

Corresponding use:

None. However, this case should be rare to non-existent on most targets, and the current handling seems suspect. See the comment in the patch for more details.

gcc/
* df-problems.c (df_lr_bb_local_compute): Treat partial definitions as read-modify operations.
gcc/testsuite/
* gcc.dg/rtl/aarch64/multi-subreg-1.c: New test.
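The claim that IN = (OUT & ~DEF) | USE does not depend on whether a partial definition is in DEF, provided the same register is also in USE, can be checked with a small standalone sketch. This is not GCC code: RegSet, live_in and partial_def_is_neutral are hypothetical names chosen for illustration, and an 8-bit std::bitset stands in for GCC's register bitmaps.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical register-set type for illustration; GCC itself uses bitmaps.
using RegSet = std::bitset<8>;

// The equation quoted in the commit message above:
//   IN = (OUT & ~DEF) | USE
RegSet live_in(RegSet out, RegSet def, RegSet use)
{
  return (out & ~def) | use;
}

// Returns true if, for every possible OUT, IN is the same whether or not
// register R is recorded in DEF, given that R is also recorded in USE
// (which is what a read-modify-write partial definition implies).
bool partial_def_is_neutral(unsigned r, RegSet def, RegSet use)
{
  use.set(r);                       // the partial definition reads R

  RegSet def_with = def;
  def_with.set(r);                  // DEF includes the partial definition
  RegSet def_without = def;
  def_without.reset(r);             // DEF omits the partial definition

  for (unsigned long bits = 0; bits < 256; ++bits)
    {
      RegSet out(bits);
      if (live_in(out, def_with, use) != live_in(out, def_without, use))
        return false;
    }
  return true;
}
```

Because R is in USE, bit R of IN is always set regardless of DEF, so the two variants agree for every OUT. For example, partial_def_is_neutral(3, RegSet(0x05), RegSet(0x10)) is expected to hold.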
2021-02-12libstdc++: Re-enable workaround for _wstat64 bug, again [PR 88881]Jonathan Wakely1-2/+0
I forgot that the workaround is present in both filesystem::status and filesystem::symlink_status. This restores it in the latter. libstdc++-v3/ChangeLog: PR libstdc++/88881 * src/c++17/fs_ops.cc (fs::symlink_status): Re-enable workaround.
2021-02-12libstdc++: Fix filesystem::rename on Windows [PR 98985]Jonathan Wakely10-10/+474
The _wrename function won't overwrite an existing file, so use MoveFileEx instead. That allows renaming directories over files, which POSIX doesn't allow, so check for that case explicitly and report an error. Also document the deviation from the expected behaviour, and add a test for filesystem::rename which was previously missing. The Filesystem TS experimental::filesystem::rename doesn't have that extra code to handle directories correctly, so the relevant parts of the new test are not run on Windows. libstdc++-v3/ChangeLog: * doc/xml/manual/status_cxx2014.xml: Document implementation specific properties of std::experimental::filesystem::rename. * doc/xml/manual/status_cxx2017.xml: Document implementation specific properties of std::filesystem::rename. * doc/html/*: Regenerate. * src/c++17/fs_ops.cc (fs::rename): Implement correct behaviour for directories on Windows. * src/filesystem/ops-common.h (__gnu_posix::rename): Use MoveFileExW on Windows. * testsuite/27_io/filesystem/operations/rename.cc: New test. * testsuite/experimental/filesystem/operations/rename.cc: New test.
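A minimal sketch of the two behaviours described above, using only the standard std::filesystem API: renaming over an existing file should replace it, and renaming a directory over an existing file should report an error. The helper names (write_file, read_file, rename_replaces_existing_file, rename_dir_over_file_fails) are hypothetical and this is not the new testsuite file; it assumes a POSIX-like system or a libstdc++ build that includes this fix.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Illustration helper: create (or truncate) the file at P and write TEXT.
void write_file(const fs::path& p, const std::string& text)
{
  std::ofstream out(p);
  out << text;
}

// Illustration helper: return the first line of the file at P.
std::string read_file(const fs::path& p)
{
  std::ifstream in(p);
  std::string text;
  std::getline(in, text);
  return text;
}

// Renames one file over another existing file inside DIR and reports
// whether the destination now holds the source's contents.
bool rename_replaces_existing_file(const fs::path& dir)
{
  fs::create_directories(dir);
  const fs::path from = dir / "from.txt";
  const fs::path to = dir / "to.txt";
  write_file(from, "new");
  write_file(to, "old");

  std::error_code ec;
  fs::rename(from, to, ec);         // expected to overwrite TO
  const bool ok = !ec && !fs::exists(from) && read_file(to) == "new";
  fs::remove_all(dir);
  return ok;
}

// Tries to rename a directory over an existing regular file inside DIR and
// reports whether that was rejected and the file left in place.
bool rename_dir_over_file_fails(const fs::path& dir)
{
  fs::create_directories(dir / "subdir");
  const fs::path file = dir / "file.txt";
  write_file(file, "data");

  std::error_code ec;
  fs::rename(dir / "subdir", file, ec);
  const bool failed = static_cast<bool>(ec) && fs::is_regular_file(file);
  fs::remove_all(dir);
  return failed;
}
```

Both helpers use the std::error_code overload of rename so that a failure can be inspected without exceptions; each cleans up the scratch directory it was given.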
2021-02-12libstdc++: Make "nonexistent" paths less predictable in filesystem testsJonathan Wakely1-4/+9
The helper function for creating new paths doesn't work well on Windows, because the PID of a process started by Wine is very consistent and so the same path gets created each time. libstdc++-v3/ChangeLog: * testsuite/util/testsuite_fs.h (nonexistent_path): Add random number to the path.
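The idea of mixing a random number into the generated name can be sketched as follows. This is a hypothetical stand-in, not the code in testsuite_fs.h: make_nonexistent_path and its prefix parameter are assumptions for illustration, and the actual helper may build the name differently.

```cpp
#include <cassert>
#include <filesystem>
#include <random>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for the testsuite helper: returns a path under the
// temporary directory that does not exist at the time of the call.
fs::path make_nonexistent_path(const std::string& prefix)
{
  std::random_device rd;            // source of the random suffix
  fs::path p;
  do
    {
      // A random suffix keeps the name from repeating across runs, even
      // when other inputs such as the PID are always the same.
      p = fs::temp_directory_path() / (prefix + "-" + std::to_string(rd()));
    }
  while (fs::exists(p));            // retry on the unlikely collision
  return p;
}
```

The retry loop only guarantees the path did not exist when checked; another process could still create it afterwards, which is acceptable for a test helper.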
2021-02-12libstdc++: Include scope ID in net::internet::address_v6::to_string()Jonathan Wakely2-8/+34
libstdc++-v3/ChangeLog: * include/experimental/internet (address_v6::to_string): Include scope ID in string. * testsuite/experimental/net/internet/address/v6/members.cc: Test to_string() results.