Age  Commit message  Author  Files  Lines
2021-10-01  [Ada] Document that gnatmem requires fixed-position executables  Eric Botcazou  1  -6/+8
gcc/ada/ * doc/gnat_ugn/gnat_and_program_execution.rst (gnatmem): Document that it works only with fixed-position executables.
2021-10-01  [Ada] Switch to SR0660  Doug Rupp  1  -2/+2
gcc/ada/ * libgnat/s-parame__vxworks.ads (time_t_bits): Change to Long_Long_Integer'Size.
2021-10-01  Daily bump.  GCC Administrator  8  -1/+216
2021-09-30  testsuite: Fix cf-descriptor-5.f90  David Edelsohn  1  -0/+1
gcc/testsuite/ChangeLog * gfortran.dg/c-interop/cf-descriptor-5-c.c: Include alloca.h.
2021-09-30  arm: Enable Cortex-R52+ CPU  Przemyslaw Wirkus  4  -5/+18
This patch adds Cortex-R52+ as the 'cortex-r52plus' value for the -mcpu command-line option. gcc/ChangeLog: * config/arm/arm-cpus.in: Add Cortex-R52+ CPU. * config/arm/arm-tables.opt: Regenerate. * config/arm/arm-tune.md: Regenerate. * doc/invoke.texi: Update docs.
2021-09-30  c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]  Patrick Palka  2  -1/+20
is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but C++20 aggregate paren init means multi-arg ctors can now be trivial too. This patch relaxes the relevant early exit check accordingly. PR c++/102535 gcc/cp/ChangeLog: * method.c (is_xible_helper): Don't exit early for multi-arg ctors in C++20. gcc/testsuite/ChangeLog: * g++.dg/ext/is_trivially_constructible7.C: New test.
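To make the C++20 rule concrete, here is a minimal sketch of aggregate parenthesized initialization. The `Point` type, `make_point`, and the `k...` constants are hypothetical names for illustration and are not taken from the patch or its test. The block falls back to brace initialization when the compiler does not advertise the feature.

```cpp
#include <type_traits>

// Hypothetical aggregate, used only for illustration.
struct Point {
  int x;
  int y;
};

#if defined(__cpp_aggregate_paren_init)
// C++20: an aggregate may be initialized with parentheses, so
// Point(x, y) behaves like a two-argument "constructor call".
inline Point make_point(int x, int y) { return Point(x, y); }
#else
// Older dialects: only brace initialization is available.
inline Point make_point(int x, int y) { return Point{x, y}; }
#endif

// What the library traits report for a two-argument initialization.
// The values depend on the dialect and on the compiler version, so
// they are recorded here rather than asserted.
constexpr bool kTwoArgConstructible =
    std::is_constructible<Point, int, int>::value;
constexpr bool kTwoArgTriviallyConstructible =
    std::is_trivially_constructible<Point, int, int>::value;
```

On a compiler that includes this change and runs in C++20 mode, the trivial-constructibility constant would be expected to be true, since initializing two ints involves no non-trivial operation.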
2021-09-30  c++: argument order in a variadic type trait intrinsic  Patrick Palka  2  -0/+11
When parsing a variadic type trait intrinsic, we build up the list of trailing arguments in reverse, but we neglect to reverse the list to the true order afterwards. This causes us to confuse the meaning of e.g. __is_xible(x, y, z) vs __is_xible(x, z, y). Note that this bug doesn't affect the library traits because they pass a pack expansion as the single trailing argument to __is_xible, which gets expanded in the correct order by tsubst_tree_list. gcc/cp/ChangeLog: * parser.c (cp_parser_trait_expr): Call nreverse on the reversed list of trailing arguments. gcc/testsuite/ChangeLog: * g++.dg/ext/is_constructible6.C: New test.
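A small sketch of why the order of trailing arguments matters. `Widget` is a hypothetical type, and the library trait is used here because, as noted above, it was not affected; the bug concerned the intrinsic spelled directly with several trailing arguments.

```cpp
#include <type_traits>

// Hypothetical type: constructible from (int*, double) only.
struct Widget {
  Widget(int*, double) {}
};

// Trailing arguments in the declared order.
constexpr bool kDeclaredOrder =
    std::is_constructible<Widget, int*, double>::value;

// The same arguments swapped: no constructor accepts (double, int*).
constexpr bool kSwappedOrder =
    std::is_constructible<Widget, double, int*>::value;

static_assert(kDeclaredOrder, "declared order must be accepted");
static_assert(!kSwappedOrder, "swapped order must be rejected");

inline bool argument_order_matters() {
  return kDeclaredOrder && !kSwappedOrder;
}
```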
2021-09-30  c++: defaulted comparisons and vptr fields [PR95567]  Patrick Palka  2  -0/+24
We need to explicitly skip over vptr fields when synthesizing a defaulted comparison operator, because next_initializable_field doesn't do so for us. PR c++/95567 gcc/cp/ChangeLog: * method.c (build_comparison_op): Skip DECL_VIRTUAL_P fields. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/spaceship-virtual1.C: New test.
2021-09-30  compiler: avoid calling Expression::type before lowering  Ian Lance Taylor  4  -23/+69
This is a minor cleanup to ensure that the various Expression::do_type methods don't have to worry about the possibility that the Expression has not been lowered. Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/353140
2021-09-30  Fortran: resolve expressions during SIZE simplification  Harald Anlauf  2  -0/+26
gcc/fortran/ChangeLog: PR fortran/102458 * simplify.c (simplify_size): Resolve expressions used in array specifications so that SIZE can be simplified. gcc/testsuite/ChangeLog: PR fortran/102458 * gfortran.dg/pr102458b.f90: New test.
2021-09-30  Fortran: fix reference to Fortran standard in comment  Harald Anlauf  1  -1/+1
gcc/fortran/ * expr.c: The correct reference to Fortran standard is: F2018:10.1.12.
2021-09-30  i386: Eliminate sign extension after logic operation [PR89954]  Uros Bizjak  2  -0/+79
Convert (sign_extend:WIDE (any_logic:NARROW (memory, immediate))) to (any_logic:WIDE (sign_extend (memory)), (sign_extend (immediate))). This eliminates sign extension after logic operation. 2021-09-30 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/89954 * config/i386/i386.md (sign_extend:WIDE (any_logic:NARROW (memory, immediate)) splitters): New splitters. gcc/testsuite/ PR target/89954 * gcc.target/i386/pr89954.c: New test.
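The transformation relies on sign extension commuting with bitwise logic. The sketch below checks that identity exhaustively, assuming 8-bit values stand in for NARROW and 64-bit values for WIDE; the function name is hypothetical and this is not the splitter itself.

```cpp
#include <cstdint>

// Compare "logic op in the narrow type, then sign-extend" with
// "sign-extend both operands, then logic op in the wide type"
// for AND, OR and XOR over every pair of 8-bit values.
inline bool sign_extend_commutes_with_logic() {
  for (int a = -128; a <= 127; ++a) {
    for (int b = -128; b <= 127; ++b) {
      const std::int8_t mem = static_cast<std::int8_t>(a);
      const std::int8_t imm = static_cast<std::int8_t>(b);
      const std::int64_t wide_mem = static_cast<std::int64_t>(mem);
      const std::int64_t wide_imm = static_cast<std::int64_t>(imm);

      const std::int64_t narrow_and =
          static_cast<std::int64_t>(static_cast<std::int8_t>(mem & imm));
      const std::int64_t narrow_or =
          static_cast<std::int64_t>(static_cast<std::int8_t>(mem | imm));
      const std::int64_t narrow_xor =
          static_cast<std::int64_t>(static_cast<std::int8_t>(mem ^ imm));

      if (narrow_and != (wide_mem & wide_imm)) return false;
      if (narrow_or != (wide_mem | wide_imm)) return false;
      if (narrow_xor != (wide_mem ^ wide_imm)) return false;
    }
  }
  return true;
}
```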
2021-09-30  Fortran: Fix same_type_as  Tobias Burnus  5  -21/+296
A test for CLASS(*) + assumed rank was missing; adding a test to unlimited_polymorphic_1.f03 showed an ICE as backend_decl wasn't set. While gfc_get_symbol_decl would fix it, the code also assumed that the class(*) was a variable and could not be a subobject of a derived type. PR fortran/71703 PR fortran/84007 gcc/fortran/ChangeLog: * trans-intrinsic.c (gfc_conv_same_type_as): Fix handling of UNLIMITED_POLY. * trans.h (gfc_vtpr_hash_get): Renamed prototype to ... (gfc_vptr_hash_get): ... this to match function name. gcc/testsuite/ChangeLog: * gfortran.dg/c-interop/c535b-1.f90: Remove wrong comment. * gfortran.dg/unlimited_polymorphic_1.f03: Extend. * gfortran.dg/unlimited_polymorphic_32.f90: New test.
2021-09-30  libphobos: Select the appropriate exception handler in getClassInfo  Iain Buclaw  1  -30/+44
This is analogous to __gdc_personality, which ignores in-flight exceptions that we haven't collided with yet. libphobos/ChangeLog: * libdruntime/gcc/deh.d (ExceptionHeader.getClassInfo): Move to... (getClassInfo): ...here as free function. Add lsda parameter. (scanLSDA): Pass lsda to actionTableLookup. (actionTableLookup): Add lsda parameter, pass to getClassInfo. (__gdc_personality): Remove currentCfa variable.
2021-09-30  libphobos: Print stacktrace before terminating program due to uncaught exception.  Iain Buclaw  1  -0/+5
By default, D run-time has a top level exception handler to catch anything that was uncaught by user code. However, when the `rt_trapExceptions' flag is cleared, this handler is not enabled, so an uncaught exception aborts the program without printing any information about the exception. libphobos/ChangeLog: * libdruntime/gcc/deh.d (_d_print_throwable): Declare. (_d_throw): Print stacktrace before terminating program due to uncaught exception.
2021-09-30  libphobos: Remove unused variables in gcc.backtrace.  Iain Buclaw  2  -33/+5
The core.runtime module always overrides the default parameter value for constructor calls. MaxAlignment is not required because a class can be created on the stack with the `scope' keyword. libphobos/ChangeLog: * libdruntime/core/runtime.d (runModuleUnitTests): Use scope to new LibBacktrace on the stack. * libdruntime/gcc/backtrace.d (FIRSTFRAME): Remove. (LibBacktrace.MaxAlignment): Remove. (LibBacktrace.this): Remove default initialization of firstFrame. (UnwindBacktrace.this): Likewise.
2021-09-30  libphobos: Give _Unwind_Exception an alignment that best resembles __attribute__((aligned))  Iain Buclaw  1  -1/+21
For interoperability with C++ EH, the alignment should match, otherwise D may not be able to intercept exceptions thrown from C++. libphobos/ChangeLog: * libdruntime/gcc/unwind/generic.d (__aligned__): Define. (_Unwind_Exception): Align struct to __aligned__.
2021-09-30  libphobos: Define main function as extern(C) when compiling without D runtime (PR102476)  Iain Buclaw  2  -2/+15
The default supplied main function as read when compiling with `-fmain' has extern(D) linkage. However, this does not work when mixing this option together with `-fno-druntime'. PR d/102476 gcc/testsuite/ChangeLog: * gdc.dg/pr102476.d: New test. libphobos/ChangeLog: * libdruntime/__main.di: Define main function as extern(C) when compiling without D runtime.
2021-09-30  libgomp.fortran/alloc-*.f90: Add missing dg-prune-output  Tobias Burnus  3  -0/+3
libgomp/ * testsuite/libgomp.fortran/alloc-7.f90: Add dg-prune-output for -fintrinsic-modules-path= warning of the C compiler. * testsuite/libgomp.fortran/alloc-9.f90: Likewise. * testsuite/libgomp.fortran/alloc-10.f90: Likewise.
2021-09-30  openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc for Fortran  Tobias Burnus  10  -5/+770
gcc/ChangeLog: * omp-low.c (omp_runtime_api_call): Add omp_aligned_{,c}alloc and omp_{c,re}alloc, fix omp_alloc/omp_free. libgomp/ChangeLog: * libgomp.texi (OpenMP 5.1): Set implementation status to Y for omp_aligned_{,c}alloc and omp_{c,re}alloc routines. * omp_lib.f90.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc, omp_realloc): Add. * omp_lib.h.in (omp_aligned_alloc, omp_aligned_calloc, omp_calloc, omp_realloc): Add. * testsuite/libgomp.fortran/alloc-10.f90: New test. * testsuite/libgomp.fortran/alloc-6.f90: New test. * testsuite/libgomp.fortran/alloc-7.c: New test. * testsuite/libgomp.fortran/alloc-7.f90: New test. * testsuite/libgomp.fortran/alloc-8.f90: New test. * testsuite/libgomp.fortran/alloc-9.f90: New test.
2021-09-30  testsuite: Skip a test-case when LTO is used [PR102509]  Martin Liska  2  -0/+2
PR testsuite/102509 gcc/testsuite/ChangeLog: * gcc.c-torture/compile/attr-complex-method.c: Skip if LTO is used. * gcc.c-torture/compile/attr-complex-method-2.c: Likewise.
2021-09-30  Do not hide asm_out_file in ASM_OUTPUT_ASCII.  Martin Liska  1  -8/+7
gcc/ChangeLog: * defaults.h (ASM_OUTPUT_ASCII): Do not hide global variable asm_out_file and stream directly to MYFILE.
2021-09-30  Refine alignment peeling fix  Richard Biener  1  -4/+6
This refines the previous fix further by reverting to the original code since the API is a bit of a mess. It also fixes the vector type used to query the misalignment - that was what triggered the original bogus change. 2021-09-30 Richard Biener <rguenther@suse.de> * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Restore and fix condition under which we apply npeel to the DRs misalignment value.
2021-09-30  Fix thinko in previous alignment peeling change  Richard Biener  1  -1/+1
I was mistaken in that npeel is -1 for variable peeling - it is 0. 2021-09-30 Richard Biener <rguenther@suse.de> * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Fix npeel check for variable amount of peeling.
2021-09-30  libstdc++: Fix preprocessor check for C++17  Jonathan Wakely  1  -1/+1
libstdc++-v3/ChangeLog: * include/bits/regex.h (basic_regex::multiline): Fix #if condition.
2021-09-30  Plug possible snprintf overflow in lto-wrapper.  Aldy Hernandez  1  -3/+7
My upcoming improvements to the DOM threader triggered a warning in this code. It looks like the format string is ".ltrans%u.ltrans", but we're only writing a max of ".ltrans" + whatever the MAX_INT is here. Tested on x86-64 Linux. gcc/ChangeLog: * lto-wrapper.c (run_gcc): Plug snprintf overflow.
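One generic way to avoid this class of problem is to let snprintf report the length it needs before writing. The sketch below uses the format string quoted above; the helper name `ltrans_suffix` is hypothetical, and this is not necessarily how run_gcc was changed.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Format ".ltrans%u.ltrans" into a buffer sized from snprintf's own
// length report, so both suffixes and the digits always fit.
inline std::string ltrans_suffix(unsigned index) {
  // With a null buffer and size 0, snprintf only computes the length
  // (excluding the terminating null character).
  const int needed = std::snprintf(nullptr, 0, ".ltrans%u.ltrans", index);
  if (needed < 0) return std::string();

  std::string buffer(static_cast<std::size_t>(needed) + 1, '\0');
  std::snprintf(&buffer[0], buffer.size(), ".ltrans%u.ltrans", index);
  buffer.resize(static_cast<std::size_t>(needed));  // drop the terminator
  return buffer;
}
```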
2021-09-30  openmp: Add omp_aligned_{,c}alloc and omp_{c,re}alloc  Jakub Jelinek  8  -20/+1010
This patch adds new OpenMP 5.1 allocator entrypoints and in addition to that fixes an omp_alloc bug which is hard to test for - if the first allocator fails but has a larger alignment trait and has a fallback allocator, either the default behavior or a user fallback, then the extra alignment will be used even in the fallback allocation, rather than just starting with whatever alignment has been requested (in GOMP_alloc or the minimum one in omp_alloc). Jonathan's comment on IRC this morning made me realize that I should add alloc_align attributes to 2 of the prototypes and I still need to add testsuite coverage for omp_realloc, will do that in a follow-up. 2021-09-30 Jakub Jelinek <jakub@redhat.com> * omp.h.in (omp_aligned_alloc, omp_calloc, omp_aligned_calloc, omp_realloc): New prototypes. (omp_alloc): Move after omp_free prototype, add __malloc__ (omp_free) attribute. * allocator.c: Include string.h. (omp_aligned_alloc): No longer static, add ialias. Add new_alignment variable and use it instead of alignment so that when retrying the old alignment is used again. Don't retry if new alignment is the same as old alignment, unless allocator had pool size. (omp_alloc, GOMP_alloc, GOMP_free): Use ialias_call. (omp_aligned_calloc, omp_calloc, omp_realloc): New functions. * libgomp.map (OMP_5.0.2): Export omp_aligned_alloc, omp_calloc, omp_aligned_calloc and omp_realloc. * testsuite/libgomp.c-c++-common/alloc-4.c (main): Add omp_aligned_alloc, omp_calloc and omp_aligned_calloc tests. * testsuite/libgomp.c-c++-common/alloc-5.c: New test. * testsuite/libgomp.c-c++-common/alloc-6.c: New test. * testsuite/libgomp.c-c++-common/alloc-7.c: New test. * testsuite/libgomp.c-c++-common/alloc-8.c: New test.
2021-09-30  Add gimple_ranger::debug.  Aldy Hernandez  2  -0/+7
I'm trying to add one debug() for each dump() to the dumping aids. Tested on x86-64 Linux. gcc/ChangeLog: * gimple-range.cc (gimple_ranger::debug): New. * gimple-range.h (class gimple_ranger): Add debug.
2021-09-30  Plug memory leak in hybrid_threader.  Aldy Hernandez  1  -0/+1
Tested on x86-64 Linux. gcc/ChangeLog: PR middle-end/102519 * tree-vrp.c (hybrid_threader::~hybrid_threader): Free m_query.
2021-09-30  Daily bump.  GCC Administrator  6  -1/+196
2021-09-29  debug/102507: ICE in btf_finalize when compiling with -gbtf  Indu Bhagat  1  -4/+4
Fix the free up of btf_var_ids hash_map in btf_finalize (). gcc/ChangeLog: PR debug/102507 * btfout.c (GTY): Add GTY (()) albeit for cosmetic only purpose. (btf_finalize): Empty the hash_map btf_var_ids.
2021-09-29  MAINTAINERS: Add myself to DCO section  Jonathan Wakely  1  -0/+1
ChangeLog: * MAINTAINERS: Add myself to DCO section.
2021-09-29  [PR102501] Adjust jump threading testcases for ppc64* and others.  Aldy Hernandez  2  -3/+3
I really don't know what to do here. This is a bit of whack-a-mole. The IL is sufficiently different for various architectures that any tweak can cause the number of jump threads to vary. For the pr77445-2.c testcase, we have fewer threading candidates because 2 of them now cross loop boundaries. Interestingly, this test matches "Jumps threaded", not threads registered, so the block copier can drop threads at copying time, adding further confusion. For example, we can register N threads, but the old copier can cancel N-M threads while updating the CFG for a variety of different reasons (removed edges, threading through loop exits, etc). This means the "Registering jump threads" count does not match the total number of threads this test checks for with "Jumps threaded". The pr66752-3.c test OTOH, is just a matter of thread4 eliminating the "if". I had erroneously thought it would always be eliminated by thread3, but we really don't care where it gets cleaned up. All we know is that DCE can't depend on the early threaders doing this work, because it may cross loop boundaries. I've chosen thread4 arbitrarily, but we could just as easily pick the ".optimized" dump. Sorry, I'm really at my wits' end here. I don't see any clean path forward, except to rewrite these tests as gimple IL. They're close to useless as they sit. gcc/testsuite/ChangeLog: PR testsuite/102501 * gcc.dg/tree-ssa/pr66752-3.c: Adjust. * gcc.dg/tree-ssa/pr77445-2.c: Adjust.
2021-09-29  Avoid CFG updates in VRP threader if nothing changed.  Aldy Hernandez  1  -4/+5
There is no need to update the CFG or SSAs if nothing has changed in VRP threading. gcc/ChangeLog: * tree-vrp.c (thread_through_all_blocks): Return bool. (execute_vrp_threader): Return TODO_* flags. (pass_data_vrp_threader): Set todo_flags_finish to 0.
2021-09-29  Use a separate TV_* timer for the VRP threader.  Aldy Hernandez  2  -1/+2
There seems to be a memory consumption issue on 32 bit hosts after the hybrid threader patchset. I'm having a hard time reproducing, and in the process I've noticed that the threader is using the TV_TREE_VRP timer. Having a distinct one could help diagnose this and other issues going forward. gcc/ChangeLog: * timevar.def (TV_TREE_VRP_THREADER): New. * tree-vrp.c: Use TV_TREE_VRP_THREADER for VRP threader pass.
2021-09-29  Fortran: fix error recovery for invalid constructor  Harald Anlauf  2  -0/+15
gcc/fortran/ChangeLog: PR fortran/102520 * array.c (expand_constructor): Do not dereference NULL pointer. gcc/testsuite/ChangeLog: PR fortran/102520 * gfortran.dg/pr102520.f90: New test.
2021-09-29  bpf: correct extra_headers  David Faust  1  -1/+0
The BPF CO-RE support (commit 8bdabb37549f12ce727800a1c8aa182c0b1dd42a) mistakenly overwrote bpf-*-* extra_headers in config.gcc, causing bpf-helpers.h to not be installed. The redefinition with coreout.h is unneeded, so delete it. gcc/ChangeLog: * config.gcc (bpf-*-*): Do not overwrite extra_headers.
2021-09-29  Fix more testsuite fallout from computed goto changes  Jeff Law  2  -2/+2
gcc/testsuite * gcc.c-torture/compile/920831-1.c: Fix computed goto types. * gcc.c-torture/compile/pr27863.c: Likewise.
2021-09-29  aarch64: Fix type qualifiers for qtbl1 and qtbx1 Neon builtins  Jonathan Wright  3  -21/+27
Fix type qualifiers for qtbl1 and qtbx1 Neon builtins and remove casts from the Neon intrinsic function bodies that use these builtins. gcc/ChangeLog: 2021-09-23 Jonathan Wright <jonathan.wright@arm.com> * config/aarch64/aarch64-builtins.c (TYPES_BINOP_PPU): Define new type qualifier enum. (TYPES_TERNOP_SSSU): Likewise. (TYPES_TERNOP_PPPU): Likewise. * config/aarch64/aarch64-simd-builtins.def: Define PPU, SSU, PPPU and SSSU builtin generator macros for qtbl1 and qtbx1 Neon builtins. * config/aarch64/arm_neon.h (vqtbl1_p8): Use type-qualified builtin and remove casts. (vqtbl1_s8): Likewise. (vqtbl1q_p8): Likewise. (vqtbl1q_s8): Likewise. (vqtbx1_s8): Likewise. (vqtbx1_p8): Likewise. (vqtbx1q_s8): Likewise. (vqtbx1q_p8): Likewise. (vtbl1_p8): Likewise. (vtbl2_p8): Likewise. (vtbx2_p8): Likewise.
2021-09-29  libstdc++: Implement std::regex_constants::multiline (LWG 2503)  Jonathan Wakely  4  -15/+157
This implements LWG 2503, which allows ^ and $ to match at line terminator characters, rather than only matching the beginning and end of the entire input. The multiline option is only valid for ECMAScript, but for other grammars we ignore it rather than throwing an exception. This is related to PR libstdc++/102480, which incorrectly said that ECMAScript should match the beginning of a line when match_prev_avail is used. I think that's only supposed to happen when multiline is used. The new regex_constants::multiline and basic_regex::multiline constants are not defined for strict -std=c++11 and -std=c++14 modes, but regex_constants::__multiline is always defined, so that the implementation can use it internally. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/regex.h (basic_regex::multiline): Define constant for C++17. * include/bits/regex_constants.h (regex_constants::multiline): Define constant for C++17. (regex_constants::__multiline): Define duplicate constant for internal use in C++11 and C++14. * include/bits/regex_executor.h (_Executor::_M_match_multiline()): New member function. (_Executor::_M_is_line_terminator(_CharT)): New member function. (_Executor::_M_at_begin(), _Executor::_M_at_end()): Use new member functions to support multiline matches. * testsuite/28_regex/algorithms/regex_match/multiline.cc: New test.
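The multiline rule can be sketched independently of the library. The helpers below are hypothetical and simplified (narrow characters only, and no handling of match flags such as match_not_bol); they are not the _Executor member functions named in the ChangeLog.

```cpp
#include <cstddef>
#include <string>

// For narrow characters, treat '\n' and '\r' as line terminators.
inline bool is_line_terminator(char c) { return c == '\n' || c == '\r'; }

// Would '^' match at position pos (0 <= pos <= s.size())?
inline bool at_begin(const std::string& s, std::size_t pos, bool multiline) {
  if (pos == 0) return true;  // start of the whole input
  return multiline && is_line_terminator(s[pos - 1]);
}

// Would '$' match at position pos (0 <= pos <= s.size())?
inline bool at_end(const std::string& s, std::size_t pos, bool multiline) {
  if (pos == s.size()) return true;  // end of the whole input
  return multiline && is_line_terminator(s[pos]);
}
```

With the input "ab\ncd", position 3 (just after the newline) counts as a line start only when multiline is set, and position 2 (at the newline) counts as a line end only when multiline is set.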
2021-09-29  libstdc++: Check for invalid syntax_option_type values in <regex>  Jonathan Wakely  4  -10/+76
The standard says that it is invalid for more than one grammar element to be set in a value of type regex_constants::syntax_option_type. This adds a check in the regex compiler and throws an exception if an invalid value is used. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/regex_compiler.h (_Compiler::_S_validate): New function. * include/bits/regex_compiler.tcc (_Compiler::_Compiler): Use _S_validate to check flags. * include/bits/regex_error.h (_S_grammar): New error code for internal use. * testsuite/28_regex/basic_regex/ctors/grammar.cc: New test.
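A check of this kind can be sketched with the standard grammar constants. The function `has_at_most_one_grammar` is a hypothetical stand-in, not _S_validate, and it assumes each grammar constant it tests occupies its own distinct bit.

```cpp
#include <regex>

// True if zero or one grammar bit is set in flags. A value with no
// grammar bit is accepted here, on the assumption that it is treated
// as the default grammar.
inline bool has_at_most_one_grammar(
    std::regex_constants::syntax_option_type flags) {
  using namespace std::regex_constants;
  const unsigned grammar_mask =
      static_cast<unsigned>(ECMAScript | basic | extended | awk | grep | egrep);
  const unsigned grammar = static_cast<unsigned>(flags) & grammar_mask;
  // A value with at most one bit set has no bits in common with
  // itself minus one.
  return (grammar & (grammar - 1u)) == 0u;
}
```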
2021-09-29  libstdc++: std::basic_regex should treat '\0' as an ordinary char [PR84110]  Jonathan Wakely  3  -0/+50
When the input sequence contains a _CharT(0) character, the strchr call in _Scanner<_CharT>::_M_scan_normal() will search for '\0' and so return a pointer to the terminating null at the end of the string. This makes the scanner think it's found a special character. Because it doesn't match any of the actual special characters, we fall off the end of the function (or assert in debug mode). We should check for a null character explicitly and either treat it as an ordinary character (for the ECMAScript grammar) or an error (for all others). I'm not 100% sure that's right, but it seems consistent with the POSIX RE rules where a '\0' means the end of the regex pattern or the end of the sequence being matched. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: PR libstdc++/84110 * include/bits/regex_error.h (regex_constants::_S_null): New error code for internal use. * include/bits/regex_scanner.tcc (_Scanner::_M_scan_normal()): Check for null character. * testsuite/28_regex/basic_regex/84110.cc: New test.
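The strchr behaviour described above is easy to reproduce in isolation. In this sketch the list of special characters and both helper names are hypothetical; only the strchr semantics come from the C library.

```cpp
#include <cstring>

// Hypothetical list of "special" pattern characters.
static const char kSpecials[] = "^$.*+?()[]{}|\\";

// Naive lookup: strchr treats the terminating null as part of the
// string, so searching for '\0' returns a non-null pointer.
inline bool naive_is_special(char c) {
  return std::strchr(kSpecials, c) != nullptr;
}

// Checked lookup: handle the null character explicitly first.
inline bool checked_is_special(char c) {
  return c != '\0' && std::strchr(kSpecials, c) != nullptr;
}
```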
2021-09-29  libstdc++: Simplify std::basic_regex construction and assignment  Jonathan Wakely  4  -82/+117
Introduce a new _M_compile function which does the common work needed by all constructors and assignment. Call that directly to avoid multiple levels of constructor delegation or calls to basic_regex::assign overloads. For assignment, there is no need to construct a std::basic_string if we already have a contiguous sequence of the correct character type, and no need to construct a temporary basic_regex when assigning from an existing basic_regex. Also define the copy and move assignment operators as defaulted, which does the right thing without constructing a temporary and swapping it. Copying or moving the shared_ptr member cannot fail, so they can be noexcept. The assign(const basic_regex&) and assign(basic_regex&&) member can then be defined in terms of copy or move assignment. The new _M_compile function takes pointer arguments, so the caller has to convert arbitrary iterator ranges into a contiguous sequence of characters. With that simplification, the __compile_nfa helpers are not needed and can be removed. This also fixes a bug where construction from a contiguous sequence with the wrong character type would fail to compile, rather than converting the elements to the regex character type. Signed-off-by: Jonathan Wakely <jwakely@redhat.com> libstdc++-v3/ChangeLog: * include/bits/regex.h (__detail::__is_contiguous_iter): Move here from <bits/regex_compiler.h>. (basic_regex::_M_compile): New function to compile an NFA from a regular expression string. (basic_regex::basic_regex): Use _M_compile instead of delegating to other constructors. (basic_regex::operator=(const basic_regex&)): Define as defaulted. (basic_regex::operator=(initializer_list<C>)): Use _M_compile. (basic_regex::assign(const basic_regex&)): Use copy assignment. (basic_regex::assign(basic_regex&&)): Use move assignment. (basic_regex::assign(const C*, flag_type)): Use _M_compile instead of constructing a temporary string. (basic_regex::assign(const C*, size_t, flag_type)): Likewise. 
(basic_regex::assign(const basic_string<C,T,A>&, flag_type)): Use _M_compile instead of constructing a temporary basic_regex. (basic_regex::assign(InputIter, InputIter, flag_type)): Avoid constructing a temporary string for contiguous iterators of the right value type. * include/bits/regex_compiler.h (__is_contiguous_iter): Move to <bits/regex.h>. (__enable_if_contiguous_iter, __disable_if_contiguous_iter) (__compile_nfa): Remove. * testsuite/28_regex/basic_regex/assign/exception_safety.cc: New test. * testsuite/28_regex/basic_regex/ctors/char/other.cc: New test.
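From the caller's side, the affected entry points look like this. The sketch uses only standard std::regex members; the function name and the pattern are illustrative, and the flags argument is passed explicitly because older library versions declare no default for the pointer-plus-length overload.

```cpp
#include <cstddef>
#include <regex>

// Exercise assignment from a contiguous character sequence, copy
// assignment, and assign(const basic_regex&).
inline bool regex_assign_demo() {
  const char pattern[] = "a+b";
  const std::size_t length = sizeof(pattern) - 1;  // exclude the '\0'

  std::regex from_pointer;
  from_pointer.assign(pattern, length, std::regex::ECMAScript);

  std::regex copied;
  copied = from_pointer;  // copy assignment

  std::regex assigned;
  assigned.assign(from_pointer);  // assign(const basic_regex&)

  return std::regex_match("aab", from_pointer) &&
         std::regex_match("aab", copied) &&
         std::regex_match("aab", assigned) &&
         !std::regex_match("ba", assigned);
}
```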
2021-09-29  testsuite/102517 - fix FAIL of gcc.dg/pr78408-1.c with OImode availability  Richard Biener  1  -1/+1
This fixes the testcase, which looks for variants of memcpy after memset folding. The match is disturbed when we expand the memcpy inline earlier; that expansion in fact performs the desired optimization but makes the dump file not match. For ease of testing, the following adjusts the smaller structure size to no longer be a power of two, which avoids the inline expansion. 2021-09-29 Richard Biener <rguenther@suse.de> PR testsuite/102517 * gcc.dg/pr78408-1.c: Make S not power-of-two size.
2021-09-29  Fix peeling for alignment with negative step  Richard Biener  3  -6/+213
The following fixes a regression causing us to no longer peel negative step loops for alignment. With dr_misalignment now applying the bias for negative step we have to do the reverse when adjusting the misalignment for peeled DRs. 2021-09-29 Richard Biener <rguenther@suse.de> * tree-vect-data-refs.c (vect_dr_misalign_for_aligned_access): New helper. (vect_update_misalignment_for_peel): Use it to update misaligned to the value necessary for an aligned access. (vect_get_peeling_costs_all_drs): Likewise. (vect_enhance_data_refs_alignment): Likewise. * gcc.target/i386/vect-alignment-peeling-1.c: New testcase. * gcc.target/i386/vect-alignment-peeling-2.c: Likewise.
2021-09-29  aarch64: Improve size heuristic for cpymem expansion  Kyrylo Tkachov  2  -11/+54
Similar to my previous patch for setmem this one does the same for the cpymem expansion. We count the number of ops emitted and compare it against the alternative of just calling the library function when optimising for size. For the code:

    void
    cpy_127 (char *out, char *in)
    {
      __builtin_memcpy (out, in, 127);
    }

    void
    cpy_128 (char *out, char *in)
    {
      __builtin_memcpy (out, in, 128);
    }

we now emit a call to memcpy (with an extra MOV-immediate instruction for the size) instead of:

    cpy_127(char*, char*):
            ldp     q0, q1, [x1]
            stp     q0, q1, [x0]
            ldp     q0, q1, [x1, 32]
            stp     q0, q1, [x0, 32]
            ldp     q0, q1, [x1, 64]
            stp     q0, q1, [x0, 64]
            ldr     q0, [x1, 96]
            str     q0, [x0, 96]
            ldr     q0, [x1, 111]
            str     q0, [x0, 111]
            ret
    cpy_128(char*, char*):
            ldp     q0, q1, [x1]
            stp     q0, q1, [x0]
            ldp     q0, q1, [x1, 32]
            stp     q0, q1, [x0, 32]
            ldp     q0, q1, [x1, 64]
            stp     q0, q1, [x0, 64]
            ldp     q0, q1, [x1, 96]
            stp     q0, q1, [x0, 96]
            ret

which is a clear code size win. Speed optimisation heuristics remain unchanged.

2021-09-29 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* config/aarch64/aarch64.c (aarch64_expand_cpymem): Count number of emitted operations and adjust heuristic for code size.
2021-09-29 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* gcc.target/aarch64/cpymem-size.c: New test.
2021-09-29  aarch64: Improve size optimisation heuristic for setmem expansion  Kyrylo Tkachov  3  -18/+53
This patch adjusts the setmem expansion in the backend to track the number of ops it generates for the DUP + STR/STP inline sequences. This way we can compare the size/complexity of the sequence against alternatives, notably just returning "false" and thus just emitting a call to memset. The simple heuristic change here is that if we were going to emit more than 4 operations then we shouldn't bother and just call memset. The number 4 is chosen because in the worst case for memset we need to emit 4 instructions: 3 to move the arguments into the right registers and 1 for the call. The speed optimisation decisions are not affected, though I do want to extend these expansions in a later patch and I'd like to reuse this ops counting logic there. In any case this patch should make sense on its own. For the code:

    void __attribute__((__noinline__))
    set127byte (int64_t *src, int c)
    {
      __builtin_memset (src, c, 127);
    }

    void __attribute__((__noinline__))
    set128byte (int64_t *src, int c)
    {
      __builtin_memset (src, c, 128);
    }

when optimising for size we now get just an immediate move + a call to memset (2 instructions) where before we'd have generated:

    set127byte(long*, int):
            dup     v0.16b, w1
            str     q0, [x0, 96]
            stp     q0, q0, [x0]
            stp     q0, q0, [x0, 32]
            stp     q0, q0, [x0, 64]
            str     q0, [x0, 111]
            ret
    set128byte(long*, int):
            dup     v0.16b, w1
            stp     q0, q0, [x0]
            stp     q0, q0, [x0, 32]
            stp     q0, q0, [x0, 64]
            stp     q0, q0, [x0, 96]
            ret

which is clearly undesirable for -Os. I've adjusted the recently-added gcc.target/aarch64/memset-strict-align-1.c testcase to use a bigger struct and switch to speed optimisation as with this patch we'll just call memset rather than expanding inline. That is the right decision for size optimisation (the resulting code is indeed shorter). With -O2 and the new struct size we still try the SIMD expansion and still trigger the path that the testcase is supposed to exercise.

2021-09-27 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* config/aarch64/aarch64.c (aarch64_expand_setmem): Count number of emitted operations and adjust heuristic for code size.
2021-09-27 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
* gcc.target/aarch64/memset-corner-cases-2.c: New test.
* gcc.target/aarch64/memset-strict-align-1.c: Adjust.
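The ops-counting idea can be modelled roughly as follows. This is a simplified sketch and not aarch64_expand_setmem: it assumes one DUP, one STP per full 32 bytes, and 16-byte STRs (possibly overlapping) for the remainder, and all names are hypothetical. Only the threshold of 4 comes from the description above.

```cpp
// Rough count of instructions for an inline memset of `bytes` bytes.
inline unsigned inline_setmem_ops(unsigned bytes) {
  unsigned ops = 1;                  // DUP to broadcast the fill byte
  ops += bytes / 32;                 // one STP of two q registers per 32 bytes
  const unsigned remainder = bytes % 32;
  if (remainder > 16) {
    ops += 2;                        // two 16-byte STRs, the last overlapping
  } else if (remainder > 0) {
    ops += 1;                        // a single 16-byte STR
  }
  return ops;
}

// When optimising for size, prefer the library call once the inline
// sequence would need more than 4 operations.
inline bool prefer_memset_call_for_size(unsigned bytes) {
  return inline_setmem_ops(bytes) > 4;
}
```

Under this model the 127-byte case costs 6 operations and the 128-byte case costs 5, which matches the instruction counts in the listings above, so both fall back to the memset call at -Os.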
2021-09-29  openmp: Disallow reduction with var private in containing parallel even on scope [PR102504]  Jakub Jelinek  2  -1/+12
The standard has a restriction: "A list item that appears in a reduction clause of a scope construct must be shared in the parallel region to which a corresponding scope region binds." similar to the restriction for worksharing constructs, but we were checking it only on worksharing constructs and not for scope and ICEd later on during omp expansion. 2021-09-29 Jakub Jelinek <jakub@redhat.com> PR middle-end/102504 * gimplify.c (gimplify_scan_omp_clauses): Use omp_check_private even in OMP_SCOPE clauses, not just on worksharing construct clauses. * c-c++-common/gomp/scope-4.c: New test.
2021-09-29  Fix some testcases after my computed goto patch  Andrew Pinski  6  -6/+6
For some reason I did not see these failures in my testing. Sorry about that. Anyways this fixes the testcases by adding a cast to __INTPTR_TYPE__ and then a cast to void*. Committed after testing them on x86_64-linux-gnu. gcc/testsuite/ChangeLog: * gcc.c-torture/compile/920826-1.c: Fix computed goto. * gcc.c-torture/compile/pr27863.c: Likewise. * gcc.c-torture/compile/pr70190.c: Likewise. * gcc.dg/torture/pr89135.c: Likewise. * gcc.dg/torture/pr90071.c: Likewise. * gcc.dg/vect/bb-slp-pr97709.c: Likewise.
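The cast chain can be illustrated with the GNU "labels as values" extension. This sketch is not taken from any of the listed testcases; `dispatch` and its labels are hypothetical, and it only builds with compilers that support the extension and define __INTPTR_TYPE__ (GCC and Clang do).

```cpp
// Jump through a label address that has been round-tripped through an
// integer of pointer width, then cast back to void* for the goto.
int dispatch(int which) {
  void* table[] = { &&first, &&second };

  // Cast to __INTPTR_TYPE__ first, then to void*, mirroring the
  // two-step cast described above.
  __INTPTR_TYPE__ raw =
      reinterpret_cast<__INTPTR_TYPE__>(table[which != 0 ? 1 : 0]);
  goto *reinterpret_cast<void*>(raw);

first:
  return 10;
second:
  return 20;
}
```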
2021-09-29  Avoid memcpy inline expansion in gcc.dg/out-of-bounds-1.c  Richard Biener  1  -1/+1
This avoids inline expansion to preserve the warning by making the memcpy size a non-power-of-two as suggested by Martin Sebor. 2021-09-29 Richard Biener <rguenther@suse.de> * gcc.dg/out-of-bounds-1.c: Make memcpied size not power-of-two.