path: root/gcc
Age  Commit message  Author  Files  Lines
2023-01-16  Update copyright years.  Jakub Jelinek  3217  -3339/+3315
2023-01-16  Allow build_popcount_expr to use an IFN  Andrew Carlotti  8  -28/+41
gcc/ChangeLog: * tree-ssa-loop-niter.cc (build_popcount_expr): Add IFN support. gcc/testsuite/ChangeLog: * g++.dg/tree-ssa/pr86544.C: Add .POPCOUNT to tree scan regex. * gcc.dg/tree-ssa/popcount.c: Likewise. * gcc.dg/tree-ssa/popcount2.c: Likewise. * gcc.dg/tree-ssa/popcount3.c: Likewise. * gcc.target/aarch64/popcount4.c: Likewise. * gcc.target/i386/pr95771.c: Likewise, and... * gcc.target/i386/pr95771-2.c: ...split int128 test from above, since this would emit just a single IFN if a TI optab is added.
2023-01-16  Add c[lt]z idiom recognition  Andrew Carlotti  10  -0/+517
This recognises patterns of the form:

    while (n & 1) { n >>= 1 }

Unfortunately there are currently two issues relating to this patch.

Firstly, simplify_using_initial_conditions does not recognise that (n != 0) and ((n & 1) == 0) implies that ((n >> 1) != 0). These preconditions arise following the loop copy-header pass, and the assumptions returned by number_of_iterations_exit_assumptions then prevent final value replacement from using the niter result. I'm not sure what is the best way to fix this - one approach could be to modify simplify_using_initial_conditions to handle this sort of case, but it seems that it basically wants the information that ranger could give anyway, so would something like that be a better option?

The second issue arises in the vectoriser, which is able to determine that the niter->assumptions are always true. When building with -march=armv8.4-a+sve -S -O3, we get this codegen:

    foo (unsigned int b) {
        int c = 0;
        if (b == 0)
          return PREC;
        while (!(b & (1 << (PREC - 1)))) {
            b <<= 1;
            c++;
        }
        return c;
    }

    foo:
    .LFB0:
            .cfi_startproc
            cmp     w0, 0
            cbz     w0, .L6
            blt     .L7
            lsl     w1, w0, 1
            clz     w2, w1
            cmp     w2, 14
            bls     .L8
            mov     x0, 0
            cntw    x3
            add     w1, w2, 1
            index   z1.s, #0, #1
            whilelo p0.s, wzr, w1
    .L4:
            add     x0, x0, x3
            mov     p1.b, p0.b
            mov     z0.d, z1.d
            whilelo p0.s, w0, w1
            incw    z1.s
            b.any   .L4
            add     z0.s, z0.s, #1
            lastb   w0, p1, z0.s
            ret
            .p2align 2,,3
    .L8:
            mov     w0, 0
            b       .L3
            .p2align 2,,3
    .L13:
            lsl     w1, w1, 1
    .L3:
            add     w0, w0, 1
            tbz     w1, #31, .L13
            ret
            .p2align 2,,3
    .L6:
            mov     w0, 32
            ret
            .p2align 2,,3
    .L7:
            mov     w0, 0
            ret
            .cfi_endproc

In essence, the vectoriser uses the niter information to determine exactly how many iterations of the loop it needs to run. It then uses SVE whilelo instructions to run this number of iterations. The original loop counter is also vectorised, despite only being used in the final iteration, and then the final value of this counter is used as the return value (which is the same as the number of iterations it computed in the first place).
This vectorisation is obviously bad, and I think it exposes a latent bug in the vectoriser, rather than being an issue caused by this specific patch. gcc/ChangeLog: * tree-ssa-loop-niter.cc (number_of_iterations_cltz): New. (number_of_iterations_bitcount): Add call to the above. (number_of_iterations_exit_assumptions): Add EQ_EXPR case for c[lt]z idiom recognition. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/cltz-max.c: New test. * gcc.dg/tree-ssa/clz-char.c: New test. * gcc.dg/tree-ssa/clz-int.c: New test. * gcc.dg/tree-ssa/clz-long-long.c: New test. * gcc.dg/tree-ssa/clz-long.c: New test. * gcc.dg/tree-ssa/ctz-char.c: New test. * gcc.dg/tree-ssa/ctz-int.c: New test. * gcc.dg/tree-ssa/ctz-long-long.c: New test. * gcc.dg/tree-ssa/ctz-long.c: New test.
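As a rough illustration of the idiom discussed in this commit, the sketch below pairs the counting loop from the message with the closed form it is equivalent to. It is not taken from the patch or its testsuite: the names clz_loop and clz_closed_form are hypothetical, PREC is assumed to be the bit width of unsigned int, and __builtin_clz is the GCC/Clang builtin, which is undefined for a zero argument and is therefore guarded here.

```c
#include <limits.h>

/* Bit width of unsigned int; stands in for PREC from the commit message
   (an assumption made for this sketch).  */
#define PREC ((int) (sizeof (unsigned int) * CHAR_BIT))

/* The loop idiom: shift left until the top bit is set, counting shifts.
   1u is used instead of 1 so that the mask shift is well defined.  */
int
clz_loop (unsigned int b)
{
  int c = 0;
  if (b == 0)
    return PREC;
  while (!(b & (1u << (PREC - 1))))
    {
      b <<= 1;
      c++;
    }
  return c;
}

/* Closed form computing the same value without a loop.
   __builtin_clz is undefined for 0, hence the explicit guard.  */
int
clz_closed_form (unsigned int b)
{
  return b == 0 ? PREC : __builtin_clz (b);
}
```

Both functions should agree for every input, which is the property the idiom recognition relies on when it replaces the iteration count with a count-leading-zeros expression.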
2023-01-16  docs: Add popcount, clz and ctz target attributes  Andrew Carlotti  1  -0/+27
gcc/ChangeLog: * doc/sourcebuild.texi: Add missing target attributes.
2023-01-16  Add cltz_complement idiom recognition  Andrew Carlotti  12  -3/+606
This recognises patterns of the form:

    while (n) { n >>= 1 }

This patch results in improved (but still suboptimal) codegen:

    foo (unsigned int b) {
        int c = 0;
        while (b) {
            b >>= 1;
            c++;
        }
        return c;
    }

    foo:
    .LFB11:
            .cfi_startproc
            cbz     w0, .L3
            clz     w1, w0
            tst     x0, 1
            mov     w0, 32
            sub     w0, w0, w1
            csel    w0, w0, wzr, ne
            ret

The conditional is unnecessary. phiopt could recognise a redundant csel (using cond_removal_in_builtin_zero_pattern) when one of the inputs is a clz call, but it cannot recognise the redundancy when the input is (e.g.) (32 - clz). I could perhaps extend this function to recognise this pattern in a later patch, if this is a good place to recognise more patterns.

gcc/ChangeLog: PR tree-optimization/94793 * tree-scalar-evolution.cc (expression_expensive_p): Add checks for c[lt]z optabs. * tree-ssa-loop-niter.cc (build_cltz_expr): New. (number_of_iterations_cltz_complement): New. (number_of_iterations_bitcount): Add call to the above. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_clz) (check_effective_target_clzl, check_effective_target_clzll) (check_effective_target_ctz, check_effective_target_clzl) (check_effective_target_ctzll): New. * gcc.dg/tree-ssa/cltz-complement-max.c: New test. * gcc.dg/tree-ssa/clz-complement-char.c: New test. * gcc.dg/tree-ssa/clz-complement-int.c: New test. * gcc.dg/tree-ssa/clz-complement-long-long.c: New test. * gcc.dg/tree-ssa/clz-complement-long.c: New test. * gcc.dg/tree-ssa/ctz-complement-char.c: New test. * gcc.dg/tree-ssa/ctz-complement-int.c: New test. * gcc.dg/tree-ssa/ctz-complement-long-long.c: New test. * gcc.dg/tree-ssa/ctz-complement-long.c: New test.
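The "complement" form can be sketched in the same way. The code below is only an illustration under stated assumptions, not the patch's implementation: bit_length_loop and bit_length_closed_form are hypothetical names, PREC is assumed to be the bit width of unsigned int, and __builtin_clz is the GCC/Clang builtin (undefined for zero, so zero is handled separately).

```c
#include <limits.h>

/* Bit width of unsigned int (an assumption made for this sketch).  */
#define PREC ((int) (sizeof (unsigned int) * CHAR_BIT))

/* The loop idiom: shift right until the value is zero, counting shifts.  */
int
bit_length_loop (unsigned int b)
{
  int c = 0;
  while (b)
    {
      b >>= 1;
      c++;
    }
  return c;
}

/* Closed form: PREC minus the number of leading zeros, with zero
   special-cased because __builtin_clz (0) is undefined.  */
int
bit_length_closed_form (unsigned int b)
{
  return b == 0 ? 0 : PREC - __builtin_clz (b);
}
```

The zero special case is what shows up as the extra csel in the assembly quoted in the commit message.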
2023-01-16  doc: Fix grammar typo in description of malloc attribute  Jonathan Wakely  1  -1/+1
gcc/ChangeLog: * doc/extend.texi (Common Function Attributes): Fix grammar.
2023-01-16  riscv: Fix up Copyright lines [PR108413]  Jakub Jelinek  2  -2/+2
These 2 files had incorrectly formatted Copyright lines (no space between Copyright and (C)) which makes update-copyright.py upset. 2023-01-16 Jakub Jelinek <jakub@redhat.com> PR other/108413 * config/riscv/riscv-vsetvl.h: Add space in between Copyright and (C). * config/riscv/riscv-vsetvl.cc: Likewise.
2023-01-16  x86: Avoid -Wuninitialized warnings on _mm*_undefined_* in C++ [PR105593]  Jakub Jelinek  6  -0/+56
In https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609844.html I've posted a patch to allow ignoring -Winit-self using GCC diagnostic pragmas, such that one can mark self-initialization as intentional disabling of -Wuninitialized warnings. The following incremental patch uses that in the x86 intrinsic headers. 2023-01-16 Jakub Jelinek <jakub@redhat.com> PR c++/105593 gcc/ * config/i386/xmmintrin.h (_mm_undefined_ps): Temporarily disable -Winit-self using pragma GCC diagnostic ignored. * config/i386/emmintrin.h (_mm_undefined_pd, _mm_undefined_si128): Likewise. * config/i386/avxintrin.h (_mm256_undefined_pd, _mm256_undefined_ps, _mm256_undefined_si256): Likewise. * config/i386/avx512fintrin.h (_mm512_undefined_pd, _mm512_undefined_ps, _mm512_undefined_epi32): Likewise. * config/i386/avx512fp16intrin.h (_mm_undefined_ph, _mm256_undefined_ph, _mm512_undefined_ph): Likewise. gcc/testsuite/ * g++.target/i386/pr105593.C: New test.
2023-01-16  c, c++: Allow ignoring -Winit-self through pragmas [PR105593]  Jakub Jelinek  5  -2/+110
As mentioned in the PR, various x86 intrinsics need to return an uninitialized vector. Currently they use self initialization to avoid -Wuninitialized warnings, which works fine in C, but doesn't work in C++ where -Winit-self is enabled in -Wall. We don't have an attribute to mark a variable as knowingly uninitialized (the uninitialized attribute exists but means something else, only in the -ftrivial-auto-var-init context), and trying to suppress either -Wuninitialized or -Winit-self inside of the _mm_undefined_ps etc. intrinsic definitions doesn't work, one needs to currently disable through pragmas -Wuninitialized warning at the point where _mm_undefined_ps etc. result is actually used, but that goes against the intent of those intrinsics. The -Winit-self warning option actually doesn't do any warning, all we do is record a suppression for -Winit-self if !warn_init_self on the decl definition and later look that up in uninit pass. The following patch changes those !warn_init_self tests which are true only based on the command line option setting, not based on GCC diagnostic pragma overrides to !warning_enabled_at (DECL_SOURCE_LOCATION (decl), OPT_Winit_self) such that it takes them into account. 2023-01-16 Jakub Jelinek <jakub@redhat.com> PR c++/105593 gcc/c/ * c-parser.cc (c_parser_initializer): Check warning_enabled_at at the DECL_SOURCE_LOCATION (decl) for OPT_Winit_self instead of warn_init_self. gcc/cp/ * decl.cc (cp_finish_decl): Check warning_enabled_at at the DECL_SOURCE_LOCATION (decl) for OPT_Winit_self instead of warn_init_self. gcc/testsuite/ * c-c++-common/Winit-self3.c: New test. * c-c++-common/Winit-self4.c: New test. * c-c++-common/Winit-self5.c: New test.
2023-01-16  rs6000: Teach rs6000_opaque_type_invalid_use_p about inline asm [PR108272]  Kewen Lin  5  -10/+110
As PR108272 shows, there are some invalid uses of MMA opaque types in inline asm statements. This patch is to teach the function rs6000_opaque_type_invalid_use_p for inline asm, check and error any invalid use of MMA opaque types in input and output operands. PR target/108272 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_opaque_type_invalid_use_p): Add the support for invalid uses in inline asm, factor out the checking and erroring to lambda function check_and_error_invalid_use. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr108272-1.c: New test. * gcc.target/powerpc/pr108272-2.c: New test. * gcc.target/powerpc/pr108272-3.c: New test. * gcc.target/powerpc/pr108272-4.c: New test.
2023-01-16  Daily bump.  GCC Administrator  3  -1/+33
2023-01-15  [PR107608] [range-ops] Avoid folding into INF when flag_trapping_math.  Aldy Hernandez  2  -0/+22
As discussed in the PR, for trapping math, do not fold overflowing operations into +-INF as doing so could elide a trap. There is a minor adjustment to known_isinf() where it was mistakenly returning true for an [infinity U NAN], whereas it should only return true when the range is exclusively +INF or -INF. This is benign, as there were no users of known_isinf up to now. Tested on x86-64 Linux. I also ran the glibc testsuite (git sources) on x86-64 and this patch fixes: -FAIL: math/test-double-lgamma -FAIL: math/test-double-log1p -FAIL: math/test-float-lgamma -FAIL: math/test-float-log1p -FAIL: math/test-float128-catan -FAIL: math/test-float128-catanh -FAIL: math/test-float128-lgamma -FAIL: math/test-float128-log -FAIL: math/test-float128-log1p -FAIL: math/test-float128-y0 -FAIL: math/test-float128-y1 -FAIL: math/test-float32-lgamma -FAIL: math/test-float32-log1p -FAIL: math/test-float32x-lgamma -FAIL: math/test-float32x-log1p -FAIL: math/test-float64-lgamma -FAIL: math/test-float64-log1p -FAIL: math/test-float64x-lgamma -FAIL: math/test-ldouble-lgamma PR tree-optimization/107608 gcc/ChangeLog: * range-op-float.cc (range_operator_float::fold_range): Avoid folding into INF when flag_trapping_math. * value-range.h (frange::known_isinf): Return false for possible NANs.
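The known_isinf adjustment described above can be pictured with a small model. This is a sketch only: struct float_range and known_isinf_sketch are hypothetical stand-ins invented for illustration and do not reflect the actual frange class in value-range.h. The point is simply that a range which may also contain NaN must not be reported as a known infinity.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for a floating-point value range: the closed
   interval [lo, hi], plus a flag saying that NaN is also possible.  */
struct float_range
{
  double lo;
  double hi;
  bool maybe_nan;
};

/* True only when the range is exclusively +INF or exclusively -INF.
   A range such as [+INF, +INF] U NaN deliberately yields false.  */
bool
known_isinf_sketch (const struct float_range *r)
{
  return !r->maybe_nan && r->lo == r->hi && isinf (r->lo);
}
```

With this definition, a singleton infinite range is accepted, while the same range with the NaN flag set, or a range spanning both infinities, is rejected.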
2023-01-15  Bugfix to allow testsuite/gm2/pim/pass/arraybool.mod to compile on ppc64le  Gaius Mulley  5  -51/+85
This bug is exposed on the ppc64le platform. The expression parser P3Build.bnf (and PHBuild.bnf) BuildNot omitted to record the current token position on the quad stack. The patch changes all occurrences of NEW to newBoolFrame to ensure that the tokenno recorded in the bool frame is set to a sensible value. BuildNot is fixed and improved to generate a virtual token recording the position of the subexpression. gcc/m2/ChangeLog: * gm2-compiler/M2LexBuf.mod (isSrcToken): Add block comment. Remove dead code. * gm2-compiler/M2Quads.def (BuildNot): Add notTokPos parameter. * gm2-compiler/M2Quads.mod (BuildNot): Add notTokPos parameter. Create and push virtual token. (PopBooltok): New procedure. (PushBooltok): New procedure. (PushBool): Re-implement using PushBooltok. (PopBool): Re-implement using PopBooltok. * gm2-compiler/P3Build.bnf (ConstFactor): Record token position of NOT. (Factor): Record token position of NOT. * gm2-compiler/PHBuild.bnf (ConstFactor): Record token position of NOT. (Relation): Push token position. (UnaryOrConstTerm): Push token position. (AddOperator): Push token position. (MulOperator): Push token position. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-01-15  C-SKY: Support --with-float=softfp in configuration.  Xianmiao Qu  1  -1/+1
Missed it before, it needs to be used when compiling non-multilib. gcc/ * config.gcc (csky-*-*): Support --with-float=softfp.
2023-01-15  Daily bump.  GCC Administrator  9  -1/+301
2023-01-14  xtensa: Remove old broken tweak for leaf function  Takayuki 'January June' Suwa  3  -98/+30
In the before-IRA era, ORDER_REGS_FOR_LOCAL_ALLOC was called for each function in Xtensa, and there was register allocation table reordering for leaf functions to compensate for the poor performance of local-alloc. Today the adjustment hook is still called via its alternative ADJUST_REG_ALLOC_ORDER, but it is only called once at the start of the IRA, and leaf_function_p() erroneously returns true and also gives no argument count. That misleads register allocation into treating all functions as leaves with no arguments, which leads to inefficiencies in allocation results. Fortunately, IRA is smarter than local-alloc and does not need such assistance. This patch does away with the antiquated tweak by removing the wreckage that no longer works. gcc/ChangeLog: * config/xtensa/xtensa-protos.h (order_regs_for_local_alloc): Rename to xtensa_adjust_reg_alloc_order. * config/xtensa/xtensa.cc (xtensa_adjust_reg_alloc_order): Ditto. And also remove code to reorder register numbers for leaf functions, rename the tables, and adjust the allocation order for the call0 ABI to use register A0 more. (xtensa_leaf_regs): Remove. * config/xtensa/xtensa.h (REG_ALLOC_ORDER): Cosmetics. (order_regs_for_local_alloc): Rename as the above. (LEAF_REGISTERS, LEAF_REG_REMAP, leaf_function): Remove.
2023-01-14  [aarch64] Fold ldr+dup to ld1rq for little endian targets.  Prathamesh Kulkarni  3  -6/+31
gcc/ChangeLog: * config/aarch64/aarch64-sve.md (aarch64_vec_duplicate_vq<mode>_le): Change to define_insn_and_split to fold ldr+dup to ld1rq. * config/aarch64/predicates.md (aarch64_sve_dup_ld1rq_operand): New. gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/acle/general/pr96463-2.c: Adjust.
2023-01-14  c++: Avoid incorrect shortening of divisions [PR108365]  Jakub Jelinek  6  -14/+62
The following testcase is miscompiled, because we shorten the division in a case where it should not be shortened. Divisions (and modulos) can be shortened if it is unsigned division/modulo, or if it is signed division/modulo where we can prove the dividend will not be the minimum signed value or divisor will not be -1, because e.g. on sizeof(long long)==sizeof(int)*2 && __INT_MAX__ == 0x7fffffff targets (-2147483647 - 1) / -1 is UB but (int) (-2147483648LL / -1LL) is not, it is -2147483648. The primary aim of both the C and C++ FE division/modulo shortening I assume was for the implicit integral promotions of {,signed,unsigned} {char,short} and because at this point we have no VRP information etc., the shortening is done if the integral promotion is from unsigned type for the divisor or if the dividend is an integer constant other than -1. This works fine for char/short -> int promotions when char/short have smaller precision than int - unsigned char -> int or unsigned short -> int will always be a positive int, so never the most negative. Now, the C FE checks whether orig_op0 is TYPE_UNSIGNED where op0 is either the same as orig_op0 or that promoted to int, I think that works fine, if it isn't promoted, either the division/modulo common type will have the same precision as op0 but then the division/modulo is unsigned and so without UB, or it will be done in wider precision (e.g. because op1 has wider precision), but then op0 can't be minimum signed value. Or it has been promoted to int, but in that case it was again from narrower type and so never minimum signed int. But the C++ FE was checking if op0 is a NOP_EXPR from TYPE_UNSIGNED. First of all, not sure if the operand of NOP_EXPR couldn't be non-integral type where TYPE_UNSIGNED wouldn't be meaningful, but more importantly, even if it is a cast from unsigned integral type, we only know it can't be minimum signed value if it is a widening cast, if it is same precision or narrowing cast, we know nothing. 
So, the following patch for the NOP_EXPR cases checks just in case that it is from integral type and more importantly checks it is a widening conversion, and then next to it also allows op0 to be just unsigned, promoted or not, as that is what the C FE will do for those cases too and I believe it must work - either the division/modulo common type will be that unsigned type, then we can shorten and don't need to worry about UB, or it will be some wider signed type but then it can't be most negative value of the wider type. And changes both the C and C++ FEs to do the same thing, using a helper function in c-family. 2023-01-14 Jakub Jelinek <jakub@redhat.com> PR c++/108365 * c-common.h (may_shorten_divmod): New static inline function. * c-typeck.cc (build_binary_op): Use may_shorten_divmod for integral division or modulo. * typeck.cc (cp_build_binary_op): Use may_shorten_divmod for integral division or modulo. * c-c++-common/pr108365.c: New test. * g++.dg/opt/pr108365.C: New test. * g++.dg/warn/pr108365.C: New test.
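The overflow case that makes shortening unsafe can be shown with a few lines of C. This is a sketch for illustration and is not the may_shorten_divmod helper from the patch; wide_div, int_div_is_safe and div_uchar are hypothetical names, and the example assumes long long is wider than int, as on the targets the commit message mentions.

```c
#include <limits.h>
#include <stdbool.h>

/* Division carried out in the wider type.  For int-range operands and a
   nonzero divisor this never overflows, even for INT_MIN / -1.  */
long long
wide_div (int dividend, int divisor)
{
  return (long long) dividend / (long long) divisor;
}

/* True when performing the same division directly in int is well
   defined: the divisor is nonzero and the one overflowing combination
   (most negative dividend, divisor -1) is excluded.  */
bool
int_div_is_safe (int dividend, int divisor)
{
  return divisor != 0 && !(dividend == INT_MIN && divisor == -1);
}

/* A case where shortening is always fine: the dividend was promoted
   from a narrower unsigned type, so it is a non-negative int and can
   never be INT_MIN.  The divisor must still be nonzero.  */
int
div_uchar (unsigned char dividend, int divisor)
{
  return dividend / divisor;
}
```

The widening-conversion check in the commit corresponds to the div_uchar situation: only when the unsigned source type is strictly narrower than the result type is the minimum signed value ruled out.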
2023-01-13  hash table: enforce testing is_empty before is_deleted  Alexandre Oliva  1  -2/+14
Existing hash_table traits that use the same representation for empty and deleted slots reject marking slots as deleted, and do not pass is_deleted for slots that pass is_empty. Nevertheless, nearly everywhere, we only test for is_deleted after checking that !is_empty first. The one exception was the copy constructor, that would fail if traits recognized is_empty slots as is_deleted, but then refused to mark_deleted. This asymmetry is neither necessary nor desirable, and there is a theoretical risk that traits might not only fail to refuse to mark_deleted, but also return is_deleted for is_empty slots. This patch introduces checks that detect these potentially problematic situations, and reorders the tests in the copy constructor so as to use the conventional testing order and thus avoid them. for gcc/ChangeLog * hash-table.h (is_deleted): Precheck !is_empty. (mark_deleted): Postcheck !is_empty. (copy constructor): Test is_empty before is_deleted.
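The testing order described above can be sketched with a toy open-addressing slot array. Everything here is hypothetical and invented for illustration (the slot encoding, slot_is_empty, slot_is_deleted, count_live); it is not the hash-table.h code, only a model of "check is_empty first, and have is_deleted assert that precondition".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slot encoding: NULL marks an empty slot and the address
   of a private sentinel object marks a deleted one.  */
static char deleted_sentinel;
#define SLOT_EMPTY   ((const char *) NULL)
#define SLOT_DELETED ((const char *) &deleted_sentinel)

static bool
slot_is_empty (const char *slot)
{
  return slot == SLOT_EMPTY;
}

/* Only meaningful for non-empty slots; the assertion plays the role of
   the "precheck !is_empty" mentioned in the ChangeLog entry.  */
static bool
slot_is_deleted (const char *slot)
{
  assert (!slot_is_empty (slot));
  return slot == SLOT_DELETED;
}

/* Count live entries using the conventional order: empty first, then
   deleted, so slot_is_deleted never sees an empty slot.  */
size_t
count_live (const char *const *slots, size_t n)
{
  size_t live = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (slot_is_empty (slots[i]))
        continue;
      if (slot_is_deleted (slots[i]))
        continue;
      live++;
    }
  return live;
}
```

Swapping the two tests in count_live would trip the assertion on the first empty slot, which is the kind of misuse the added checks are meant to expose.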
2023-01-13  [PR42093] [arm] [thumb2] disable tree-dce for test  Alexandre Oliva  1  -1/+1
CD-DCE introduces blocks to share common PHI nodes, which replaces a backwards branch that used to prevent the thumb2 jump table shortening that PR42093 tested for. In order to keep on testing that the backward branch prevents the jumptable shortening, disable tree-dce. for gcc/testsuite/ChangeLog PR target/42093 * gcc.target/arm/pr42093.c: Disable tree-dce.
2023-01-13  [PR40457] [arm] expand SI-aligned movdi into pair of movsi  Alexandre Oliva  1  -2/+10
When expanding a misaligned DImode move, emit aligned SImode moves if the parts are sufficiently aligned. This enables neighboring stores to be peephole-combined into stm, as expected by the PR40457 testcase, even after SLP vectorizes the originally aligned SImode stores into a misaligned DImode store. for gcc/ChangeLog PR target/40457 * config/arm/arm.md (movmisaligndi): Prefer aligned SImode moves.
2023-01-13  analyzer: add heuristics for switch on enum type [PR105273]  David Malcolm  13  -2/+810
Assume that switch on an enum doesn't follow an implicit default skipping all cases when all enum values are covered by cases. Fixes various false positives from -Wanalyzer-use-of-uninitialized-value such as this one seen in Doom:

    p_maputl.c: In function 'P_BoxOnLineSide':
    p_maputl.c:151:8: warning: use of uninitialized value 'p1' [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
      151 |     if (p1 == p2)
          |        ^
      'P_BoxOnLineSide': events 1-5
        |
        |  115 |     int         p1;
        |      |                 ^~
        |      |                 |
        |      |                 (1) region created on stack here
        |      |                 (2) capacity: 4 bytes
        |......
        |  118 |     switch (ld->slopetype)
        |      |     ~~~~~~
        |      |     |
        |      |     (3) following 'default:' branch...
        |......
        |  151 |     if (p1 == p2)
        |      |        ~
        |      |        |
        |      |        (4) ...to here
        |      |        (5) use of uninitialized value 'p1' here
        |

where "ld->slopetype" is a "slopetype_t" enum, and for every value of that enum the switch has a case that initializes "p1".

gcc/analyzer/ChangeLog: PR analyzer/105273 * region-model.cc (has_nondefault_case_for_value_p): New. (has_nondefault_cases_for_all_enum_values_p): New. (region_model::apply_constraints_for_gswitch): Skip implicitly-created "default" when switching on an enum and all enum values have non-default cases. (rejected_default_case::dump_to_pp): New. * region-model.h (region_model_context::possibly_tainted_p): New decl. (class rejected_default_case): New. * sm-taint.cc (region_model_context::possibly_tainted_p): New. * supergraph.cc (switch_cfg_superedge::dump_label_to_pp): Dump when implicitly_created_default_p. (switch_cfg_superedge::implicitly_created_default_p): New. * supergraph.h (switch_cfg_superedge::implicitly_created_default_p): New decl. gcc/testsuite/ChangeLog: PR analyzer/105273 * gcc.dg/analyzer/switch-enum-1.c: New test. * gcc.dg/analyzer/switch-enum-2.c: New test. * gcc.dg/analyzer/switch-enum-pr105273-git-vreportf-2.c: New test. * gcc.dg/analyzer/switch-enum-taint-1.c: New test. * gcc.dg/analyzer/switch-wrong-enum.c: New test. * gcc.dg/analyzer/torture/switch-enum-pr105273-doom-p_floor.c: New test.
* gcc.dg/analyzer/torture/switch-enum-pr105273-doom-p_maputl.c: New test. * gcc.dg/analyzer/torture/switch-enum-pr105273-git-vreportf-1.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
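The code shape that triggered the false positive can be sketched as follows. This is not the Doom source or one of the new tests: enum slope_kind and side_for_slope are hypothetical names modelled on the "slopetype_t" description in the commit message. Every enumerator has its own case that assigns p1, and there is no explicit default.

```c
/* Hypothetical enum modelled on the slopetype_t mentioned above.  */
enum slope_kind
{
  SLOPE_HORIZONTAL,
  SLOPE_VERTICAL,
  SLOPE_POSITIVE,
  SLOPE_NEGATIVE
};

/* p1 is deliberately left uninitialized at its declaration; each case
   below assigns it, so for any valid enumerator the final read is of an
   initialized value.  */
int
side_for_slope (enum slope_kind kind)
{
  int p1;
  switch (kind)
    {
    case SLOPE_HORIZONTAL:
      p1 = 0;
      break;
    case SLOPE_VERTICAL:
      p1 = 1;
      break;
    case SLOPE_POSITIVE:
      p1 = 2;
      break;
    case SLOPE_NEGATIVE:
      p1 = 3;
      break;
    }
  return p1;
}
```

Before the heuristic, the analyzer would also explore the implicitly created default edge, on which p1 is never written; with all enumerators covered, that edge is now skipped.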
2023-01-13  Small fix for -fdump-ada-spec  Eric Botcazou  1  -3/+47
This is needed to support the _Float32 and _Float64 types. gcc/c-family/ * c-ada-spec.cc (is_float32): New function. (is_float64): Likewise. (is_float128): Tweak. (dump_ada_node) <REAL_TYPE>: Call them to recognize more types.
2023-01-13  Fix PR rtl-optimization/108274  Eric Botcazou  1  -1/+4
Unlike other IPA passes, the ICF pass can be run at -O0 and some testcases rely on this in the testsuite. Now it effectively creates a tail call so the DF information needs to be updated in this case after epilogue creation. gcc/ PR rtl-optimization/108274 * function.cc (thread_prologue_and_epilogue_insns): Also update the DF information for calls in a few more cases.
2023-01-13  modula-2: Handle pass '-v' option to the compiler.  Iain Sandoe  3  -1/+5
Somehow this setting had been missed, and we really need the verbose flag to enable useful debug output. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/m2/ChangeLog: * gm2-gcc/m2options.h (M2Options_SetVerbose): Export the function. * gm2-lang.cc: Handle OPT_v, passing it to the compiler. * lang-specs.h: Pass -v to cc1gm2.
2023-01-13  Fix support for atomic loads and stores on hppa.  John David Anglin  6  -92/+226
This change updates the atomic libcall support to fix the following issues: 1) An internal compiler error with -fno-sync-libcalls. 2) When sync libcalls are disabled, we don't generate libcalls for libatomic. 3) There is no sync libcall support for targets other than linux. As a result, non-atomic stores are silently emitted for types smaller than or equal to the word size. There are now a few atomic libcalls in the libgcc code, so we need sync support on all targets. 2023-01-13 John David Anglin <danglin@gcc.gnu.org> gcc/ChangeLog: * config/pa/pa-linux.h (TARGET_SYNC_LIBCALL): Delete define. * config/pa/pa.cc (pa_init_libfuncs): Use MAX_SYNC_LIBFUNC_SIZE define. * config/pa/pa.h (TARGET_SYNC_LIBCALLS): Use flag_sync_libcalls. (MAX_SYNC_LIBFUNC_SIZE): Define. (TARGET_CPU_CPP_BUILTINS): Define __SOFTFP__ when soft float is enabled. * config/pa/pa.md (atomic_storeqi): Emit __atomic_exchange_1 libcall when sync libcalls are disabled. (atomic_storehi, atomic_storesi, atomic_storedi): Likewise. (atomic_loaddi): Emit __atomic_load_8 libcall when sync libcalls are disabled on 32-bit target. * config/pa/pa.opt (matomic-libcalls): New option. * doc/invoke.texi (HPPA Options): Update. libgcc/ChangeLog: * config.host (hppa*64*-*-linux*): Adjust tmake_file to use pa/t-pa64-linux. (hppa*64*-*-hpux11*): Adjust tmake_file to use pa/t-pa64-hpux instead of pa/t-hpux and pa/t-pa64. * config/pa/linux-atomic.c: Define u32 type. (ATOMIC_LOAD): Define new macro to implement atomic_load_1, atomic_load_2, atomic_load_4 and atomic_load_8. Update sync defines to use atomic_load calls for type. (SYNC_LOCK_LOAD_2): New macro to implement __sync_lock_load_8. * config/pa/sync-libfuncs.c: New file. * config/pa/t-netbsd (LIB2ADD_ST): Define. * config/pa/t-openbsd (LIB2ADD_ST): Define. * config/pa/t-pa64-hpux: New file. * config/pa/t-pa64-linux: New file.
2023-01-13  sched-deps: do not schedule pseudos across calls [PR108117]  Alexander Monakov  2  -1/+38
Scheduling across calls in the pre-RA scheduler is problematic: we do not take liveness info into account, and are thus prone to extending lifetime of a pseudo over the loop, requiring a callee-saved hardreg or causing a spill. If current function called a setjmp, lifting an assignment over a call may be incorrect if a longjmp would happen before the assignment. Thanks to Jose Marchesi for testing on AArch64. gcc/ChangeLog: PR rtl-optimization/108117 PR rtl-optimization/108132 * sched-deps.cc (deps_analyze_insn): Do not schedule across calls before reload. gcc/testsuite/ChangeLog: PR rtl-optimization/108117 PR rtl-optimization/108132 * gcc.dg/pr108117.c: New test.
2023-01-13  c++: Avoid some false positive -Wfloat-conversion warnings with extended precision [PR108285]  Jakub Jelinek  3  -5/+20
On the following testcase trunk emits a false positive warning on ia32. convert_like_internal is there called with type of double and expr EXCESS_PRECISION_EXPR with float type with long double operand 2.L * (long double) x. Now, for the code generation we do the right thing, cp_convert to double from that 2.L * (long double) x, but we call even cp_convert_and_check with that and that emits the -Wfloat-conversion warning. Looking at what the C FE does in this case, it calls convert_and_check with the EXCESS_PRECISION_EXPR expression rather than its operand, and essentially uses the operand for code generation and EXCESS_PRECISION_EXPR itself for warnings. The following patch does that too for the C++ FE. 2023-01-13 Jakub Jelinek <jakub@redhat.com> PR c++/108285 * cvt.cc (cp_convert_and_check): For EXCESS_PRECISION_EXPR use its operand except that for warning purposes use the original EXCESS_PRECISION_EXPR. * call.cc (convert_like_internal): Only look through EXCESS_PRECISION_EXPR when calling cp_convert, not when calling cp_convert_and_check. * g++.dg/warn/pr108285.C: New test.
2023-01-13  Recalibrate the timeouts for the larger code tests  Gaius Mulley  2  -0/+10
Some of the larger code tests time out when -O3 is given. This patch increases the timeouts for the map and pimlib-base-run-pass tests. gcc/testsuite/ChangeLog: * gm2/examples/map/pass/examples-map-pass.exp: Call gm2_push_timeout 30 before foreach testcase. Call gm2_pop_timeout after the foreach statement. * gm2/pimlib/base/run/pass/pimlib-base-run-pass.exp: Call gm2_push_timeout 20 before foreach testcase. Call gm2_pop_timeout after the foreach statement. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-01-13  testsuite: Add another testcase from PR107131  Jakub Jelinek  1  -0/+18
This one is hand reduced to problematic code from optimized dump that used to be miscompiled during combine starting with r12-303 and fixed with r13-3530 aka PR107172 fix. 2023-01-13 Jakub Jelinek <jakub@redhat.com> PR target/107131 * gcc.c-torture/execute/pr107131.c: New test.
2023-01-13  PR-108136 Add return statement to mc-boot-ch/RTco.cc pge-boot/GRTco.cc  Gaius Mulley  2  -0/+2
Clang found an exit path from function with non-void return type that has missing return statement [missingReturn]. gcc/m2/ChangeLog: * mc-boot-ch/GRTco.c (RTco_select): Add return 0. * pge-boot/GRTco.c (RTco_select): Add return 0. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-01-13  arm: Add cde feature support for Cortex-M55 CPU.  Srinath Parvathaneni  4  -9/+36
This patch adds cde feature (optional) support for Cortex-M55 CPU, please refer [1] for more details. To use this feature we need to specify +cdecpN (e.g. -mcpu=cortex-m55+cdecp<N>), where N is the coprocessor number 0 to 7. gcc/ChangeLog: 2023-01-13 Srinath Parvathaneni <srinath.parvathaneni@arm.com> * common/config/arm/arm-common.cc (arm_canon_arch_option_1): Ignore cde options for -mlibarch. * config/arm/arm-cpus.in (begin cpu cortex-m55): Add cde options. * doc/invoke.texi (CDE): Document options for Cortex-M55 CPU. gcc/testsuite/ChangeLog: 2023-01-13 Srinath Parvathaneni <srinath.parvathaneni@arm.com> * gcc.target/arm/multilib.exp: Add multilib tests for Cortex-M55 CPU.
2023-01-13  Replace flag_strict_flex_arrays with DECL_NOT_FLEXARRAY in middle-end.  Qing Zhao  14  -123/+68
We should not directly check flag_strict_flex_arrays in the middle end. Instead, check DECL_NOT_FLEXARRAY(array_field_decl) which is set by C/C++ FEs according to -fstrict-flex-arrays and the corresponding attribute attached to the array_field. As a result, we lose the LEVEL information of -fstrict-flex-arrays in the middle end, and -Wstrict-flex-arrays is no longer able to report it. Update the test cases accordingly. gcc/ChangeLog: * attribs.cc (strict_flex_array_level_of): Move this function to ... * attribs.h (strict_flex_array_level_of): Remove the declaration. * gimple-array-bounds.cc (array_bounds_checker::check_array_ref): Replace the reference to strict_flex_array_level_of with DECL_NOT_FLEXARRAY. * tree.cc (component_ref_size): Likewise. gcc/c/ChangeLog: * c-decl.cc (strict_flex_array_level_of): ... here. gcc/testsuite/ChangeLog: * gcc.dg/Warray-bounds-flex-arrays-1.c: Delete the level information from the message issued by -Wstrict-flex-arrays. * gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise. * gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise. * gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise. * gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise. * gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise. * gcc.dg/Wstrict-flex-arrays-2.c: Likewise. * gcc.dg/Wstrict-flex-arrays-3.c: Likewise. * gcc.dg/Wstrict-flex-arrays.c: Likewise.
2023-01-13  arm: Don't add crtfastmath.o for -shared  Richard Biener  2  -2/+2
Don't add crtfastmath.o for -shared to avoid altering the FP environment when loading a shared library. PR target/55522 * config/arm/linux-eabi.h (ENDFILE_SPEC): Don't add crtfastmath.o for -shared. * config/arm/unknown-elf.h (STARTFILE_SPEC): Likewise.
2023-01-13  aarch64: Don't add crtfastmath.o for -shared  Richard Biener  3  -3/+3
Don't add crtfastmath.o for -shared to avoid altering the FP environment when loading a shared library. PR target/55522 * config/aarch64/aarch64-elf-raw.h (ENDFILE_SPEC): Don't add crtfastmath.o for -shared. * config/aarch64/aarch64-freebsd.h (GNU_USER_TARGET_MATHFILE_SPEC): Likewise. * config/aarch64/aarch64-linux.h (GNU_USER_TARGET_MATHFILE_SPEC): Likewise.
2023-01-13  testsuite: Add testcase for PR that went latent in GCC 13 [PR107131]  Jakub Jelinek  1  -0/+30
The following testcase is probably latent since r13-3217-gc4d15dddf6b9e. Adding testcase so that it doesn't silently reappear. 2023-01-13 Jakub Jelinek <jakub@redhat.com> PR target/107131 * gcc.dg/pr107131.c: New test.
2023-01-13  aarch64: Fix DWARF frame register sizes for predicates  Richard Sandiford  3  -0/+50
Jakub pointed out that __builtin_init_dwarf_reg_size_table set the size of predicate registers to their current runtime size when compiled with +sve, but to 8 bytes otherwise. As explained in the comment, both behaviours are wrong. Predicates change size with VL and should never need to be restored during unwinding. In contrast, the call-saved FP&SIMD frame registers are 8 bytes (even though the hardware registers are at least 16 bytes) and the call-clobbered registers have zero size. A zero size seems correct for predicates too. gcc/ * config/aarch64/aarch64.cc (aarch64_dwarf_frame_reg_mode): New function. (TARGET_DWARF_FRAME_REG_MODE): Define. gcc/testsuite/ * gcc.target/aarch64/dwarf_reg_size_1.c: New test. * gcc.target/aarch64/dwarf_reg_size_2.c: Likewise.
2023-01-13aarch64: Don't update EH info when folding [PR107209]Richard Biener2-1/+17
The AArch64 folders tried to update EH info on the fly, bypassing the folder's attempts to remove dead EH edges later. This triggered an ICE when folding a potentially-trapping call to a constant. gcc/ PR target/107209 * config/aarch64/aarch64.cc (aarch64_gimple_fold_builtin): Don't update EH info on the fly. gcc/testsuite/ * gcc.target/aarch64/pr107209.c: New test. Co-Authored-By: Richard Biener <rguenther@suse.de>
2023-01-13tree-optimization/108387 - ICE with VN handling of x << C as x * (1<<C)Richard Biener2-1/+15
The following fixes unexpected simplification of x << C as x * (1<<C) to a constant. PR tree-optimization/108387 * tree-ssa-sccvn.cc (visit_nary_op): Check for SSA_NAME value before inserting expression into the tables. * gcc.dg/pr108387.c: New testcase.
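The identity mentioned in the message above, x << C treated as x * (1<<C), can be illustrated in plain C. This is only a sketch of the arithmetic equivalence for unsigned operands, assuming `unsigned int` is at least 32 bits wide; the function names are hypothetical and none of this is the value-numbering code in tree-ssa-sccvn.cc.

```c
#include <assert.h>

/* Left shift by a count c, with c smaller than the width of unsigned int. */
unsigned int shift_left(unsigned int x, unsigned int c)
{
    return x << c;
}

/* The same value written as a multiplication by a power of two.
   Unsigned arithmetic wraps, so both forms agree modulo 2^width. */
unsigned int mul_pow2(unsigned int x, unsigned int c)
{
    return x * (1u << c);
}

/* Compare the two forms for every shift count from 0 to 31. */
void check_shift_identity(void)
{
    for (unsigned int c = 0; c < 32; c++)
        assert(shift_left(0x12345u, c) == mul_pow2(0x12345u, c));
}
```

Calling `check_shift_identity()` exercises both forms, including counts where the high bits are shifted out.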
2023-01-13Sync LTO type_for_mode with c-family/Richard Biener1-3/+17
The following adds _FloatN mode support to the LTO copy of c_common_type_for_mode and also implements the fix for PR94072. gcc/lto/ * lto-lang.cc (lto_type_for_mode): Sync with c_common_type_for_mode.
2023-01-13testsuite: extend timeout into all gm2 testsGaius Mulley5-1/+77
Add timeout capability to gm2-torture.exp. Also add a simple gm2_push_timeout/gm2_pop_timeout facility and calibrate all tests to use the default of 10 seconds, 15 seconds (for the coroutine tests), or 60 seconds (for whole program optimization). gcc/testsuite/ChangeLog: * gm2/coroutines/pim/run/pass/coroutines-pim-run-pass.exp (timeout-dg.exp): Load. Call gm2_push_timeout 15. Call gm2_pop_timeout at the end. * gm2/link/min/pass/link-min-pass.exp: Set path argument to "". * gm2/switches/whole-program/pass/run/switches-whole-program-pass-run.exp: Call gm2_push_timeout 60. Call gm2_pop_timeout at the end. * lib/gm2-torture.exp (gm2_previous_timeout): Set to 10 or individual_timeout. Configure dejagnu to timeout for 10 seconds. (gm2_push_timeout): New proc. (gm2_pop_timeout): New proc. * lib/gm2.exp (gm2_previous_timeout): Set to 10 or individual_timeout. Configure dejagnu to timeout for 10 seconds. (gm2_push_timeout): New proc. (gm2_pop_timeout): New proc. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-01-13Daily bump.GCC Administrator4-1/+130
2023-01-12Testsuite: use same timeout for gm2 as other front-endsGaius Mulley1-4/+2
Committing a patch authored by: Jason Merrill <jason@redhat.com> which enables timeouts in the gm2 regression script library gm2.exp. gcc/testsuite/ChangeLog: * lib/gm2.exp: Use timeout.exp. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-01-12Add -fno-exceptions to gcc/testsuite/lib/gm2.expGaius Mulley1-0/+1
The gm2 minimal libraries do not have exception handler capability. Therefore we want the front end to suppress generation of runtime exception code. gcc/testsuite/ChangeLog: * lib/gm2.exp (gm2_init_min): Append -fno-exceptions to args. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2023-01-12PR tree-optimization/92342: Optimize b & -(a==c) in match.pdRoger Sayle5-3/+65
This patch is an update/tweak of Andrew Pinski's two patches for PR tree-optimization/92342, which were originally posted in November 2021: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585111.html https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585112.html Technically, the first of those was approved by Richard Biener, though never committed, and my first thought was to simply push it for Andrew, but the review of the second piece expressed concerns over comparisons in non-integral modes, where the result may not be zero-one valued. Indeed both transformations misbehave in the presence of vector mode comparisons (these transformations are already implemented for vec_cond elsewhere in match.pd), so my minor contribution is to limit these new transformations to scalars, by testing that both the operands and results are INTEGRAL_TYPE_P. 2023-01-12 Andrew Pinski <apinski@marvell.com> Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog: PR tree-optimization/92342 * match.pd ((m1 CMP m2) * d -> (m1 CMP m2) ? d : 0): Use tcc_comparison and :c for the multiply. (b & -(a CMP c) -> (a CMP c)?b:0): New pattern. gcc/testsuite/ChangeLog: PR tree-optimization/92342 * gcc.dg/tree-ssa/andnegcmp-1.c: New test. * gcc.dg/tree-ssa/andnegcmp-2.c: New test. * gcc.dg/tree-ssa/multcmp-1.c: New test. * gcc.dg/tree-ssa/multcmp-2.c: New test.
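The two scalar rewrites named in the ChangeLog above can be illustrated at the source level. This is a sketch only: the function names are hypothetical, the comparison is fixed to ==, and the code is not taken from the new tests or from match.pd. It relies on a scalar comparison yielding exactly 0 or 1, which is the property the message says does not hold for vector comparisons.

```c
#include <assert.h>
#include <stdint.h>

/* b & -(a == c): the comparison is 0 or 1, so its negation is either
   all-zero bits or all-one bits, which masks b. */
uint32_t and_neg_cmp(uint32_t a, uint32_t b, uint32_t c)
{
    return b & -(uint32_t)(a == c);
}

/* (a == c) * d: multiplying by a 0/1 value selects d or 0. */
uint32_t mul_cmp(uint32_t a, uint32_t c, uint32_t d)
{
    return (uint32_t)(a == c) * d;
}

/* The conditional form that both expressions are equivalent to. */
uint32_t cond_select(uint32_t a, uint32_t c, uint32_t v)
{
    return (a == c) ? v : 0;
}

/* Check both equivalences for an equal and an unequal pair. */
void check_cmp_rewrites(void)
{
    assert(and_neg_cmp(3, 0xABCDu, 3) == cond_select(3, 3, 0xABCDu));
    assert(and_neg_cmp(3, 0xABCDu, 4) == cond_select(3, 4, 0xABCDu));
    assert(mul_cmp(7, 7, 99u) == cond_select(7, 7, 99u));
    assert(mul_cmp(7, 8, 99u) == cond_select(7, 8, 99u));
}
```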
2023-01-12aarch64: Fix bit-field alignment in param passing [PR105549]Christophe Lyon11-426/+585
While working on enabling DFP for AArch64, I noticed new failures in gcc.dg/compat/struct-layout-1.exp (t028) which were not actually caused by DFP types handling. These tests are generated during 'make check' and enabling DFP made generation different (not sure if new non-DFP tests are generated, or if existing ones are generated differently, the tests in question are huge and difficult to compare). Anyway, I reduced the problem to what I attach at the end of the new gcc.target/aarch64/aapcs64/va_arg-17.c test and rewrote it in the same scheme as other va_arg* AArch64 tests. Richard Sandiford further reduced this to a non-vararg function, added as a second testcase. This is a tough case mixing bit-fields and alignment, where aarch64_function_arg_alignment did not follow what its descriptive comment says: we want to use the natural alignment of the bit-field type only if the user didn't reduce the alignment for the bit-field itself. The patch also adds a comment and assert that would help someone who has to look at this area again. The fix would be very small, except that this introduces a new ABI break, and we have to warn about that. Since this actually fixes a problem introduced in GCC 9.1, we keep the old computation to detect when we now behave differently. This patch adds two new tests (va_arg-17.c and pr105549.c). va_arg-17.c contains the reduced offending testcase from struct-layout-1.exp for reference. We update some tests introduced by the previous patch, where parameters with bit-fields and packed attribute now emit a different warning. 2022-11-28 Christophe Lyon <christophe.lyon@arm.com> Richard Sandiford <richard.sandiford@arm.com> gcc/ PR target/105549 * config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Check DECL_PACKED for bitfield. (aarch64_layout_arg): Warn when parameter passing ABI changes. (aarch64_function_arg_boundary): Do not warn here. (aarch64_gimplify_va_arg_expr): Warn when parameter passing ABI changes. 
gcc/testsuite/ PR target/105549 * gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: Update. * gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: Update. * gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: Update. * gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: Update. * gcc.target/aarch64/aapcs64/va_arg-17.c: New test. * gcc.target/aarch64/pr105549.c: New test. * g++.target/aarch64/bitfield-abi-warning-align16-O2.C: Update. * g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: Update. * g++.target/aarch64/bitfield-abi-warning-align32-O2.C: Update. * g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: Update.
2023-01-12aarch64: fix warning emission for ABI break since GCC 9.1Christophe Lyon15-7/+1132
While looking at PR 105549, which is about fixing the ABI break introduced in GCC 9.1 in parameter alignment with bit-fields, we noticed that the GCC 9.1 warning is not emitted in all the cases where it should be. This patch fixes that and the next patch in the series fixes the GCC 9.1 break. We split this into two patches since patch #2 introduces a new ABI break starting with GCC 13.1. This way, patch #1 can be back-ported to release branches if needed to fix the GCC 9.1 warning issue. The main idea is to add a new global boolean that indicates whether we're expanding the start of a function, so that aarch64_layout_arg can emit warnings for callees as well as callers. This removes the need for aarch64_function_arg_boundary to warn (with its incomplete information). However, in the first patch there are still cases where we emit warnings where we should not; this is fixed in patch #2 where we can distinguish between GCC 9.1 and GCC 13.1 ABI breaks properly. The fix in aarch64_function_arg_boundary (replacing & with &&) addresses what looks like an oversight in a previous commit in this area, which changed 'abi_break' from a boolean to an integer. We also take the opportunity to fix the comment above aarch64_function_arg_alignment since the value of the abi_break parameter was changed in a previous commit, no longer matching the description. 2022-11-28 Christophe Lyon <christophe.lyon@arm.com> Richard Sandiford <richard.sandiford@arm.com> gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix comment. (aarch64_layout_arg): Factorize warning conditions. (aarch64_function_arg_boundary): Fix typo. * function.cc (currently_expanding_function_start): New variable. (expand_function_start): Handle currently_expanding_function_start. * function.h (currently_expanding_function_start): Declare. gcc/testsuite/ChangeLog: * gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New test. 
* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New test. * gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test. * gcc.target/aarch64/bitfield-abi-warning.h: New test. * g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New test. * g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New test. * g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test. * g++.target/aarch64/bitfield-abi-warning.h: New test.
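The & versus && fix described in the message above is easy to get wrong once a flag stops being a boolean. The sketch below shows the general pitfall with hypothetical names and values; it is not the code from aarch64_function_arg_boundary, and the specific integer stored in abi_break there is not assumed here.

```c
#include <assert.h>
#include <stdbool.h>

/* Bitwise test: only correct while both operands are 0 or 1.
   With an integer such as 16, (16 & 1) is 0 and the warning is lost. */
bool should_warn_bitwise(unsigned int abi_break, int warn_enabled)
{
    return (abi_break & (unsigned int)warn_enabled) != 0;
}

/* Logical test: any non-zero abi_break counts as "set". */
bool should_warn_logical(unsigned int abi_break, int warn_enabled)
{
    return abi_break && warn_enabled;
}

/* The two forms agree for 0/1 inputs but diverge for other integers. */
void check_warn_conditions(void)
{
    assert(should_warn_bitwise(1, 1) == should_warn_logical(1, 1));
    assert(should_warn_bitwise(0, 1) == should_warn_logical(0, 1));
    assert(!should_warn_bitwise(16, 1));
    assert(should_warn_logical(16, 1));
}
```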
2023-01-12tree-optimization/99412 - reassoc and reduction chainsRichard Biener5-67/+35
With -ffast-math we end up associating reduction chains and break them - this is because of old code that tries to rectify reductions into a shape liked by the vectorizer. Nowadays the rank compute produces correct association for reduction chains and the vectorizer has robust support to fall back to a regular reduction (via reduction path) when it turns out to be not a proper reduction chain. So this patch removes the special code in reassoc, which lets TSVC s352 be vectorized with -Ofast (it already is without -ffast-math). PR tree-optimization/99412 * tree-ssa-reassoc.cc (is_phi_for_stmt): Remove. (swap_ops_for_binary_stmt): Remove reduction handling. (rewrite_expr_tree_parallel): Adjust. (reassociate_bb): Likewise. * tree-parloops.cc (build_new_reduction): Handle MINUS_EXPR. * gcc.dg/vect/pr99412.c: New testcase. * gcc.dg/tree-ssa/reassoc-47.c: Adjust comment. * gcc.dg/tree-ssa/reassoc-48.c: Remove.
2023-01-12xtensa: Optimize ctzsi2 and ffssi2 a bitTakayuki 'January June' Suwa1-4/+4
This patch saves one byte when the Code Density Option is enabled. gcc/ChangeLog: * config/xtensa/xtensa.md (ctzsi2, ffssi2): Rearrange the emitting codes.
2023-01-12xtensa: Tune "*btrue" insn patternTakayuki 'January June' Suwa1-2/+9
This branch instruction has a short encoding for an EQ/NE comparison against immediate zero when the Code Density Option is enabled, but its "length" attribute only covered the normal encoding. This patch fixes it. This patch also prevents the postreload pass from undesirably replacing the immediate zero in the comparison (short encoding, as mentioned above) with a register that holds zero (normal encoding). gcc/ChangeLog: * config/xtensa/xtensa.md (*btrue): Correct value of the attribute "length" that depends on TARGET_DENSITY and operands, and add '?' character to the register constraint of the compared operand.