path: root/gcc
Age | Commit message | Author | Files | Lines
2024-10-28 | [target/117316] Fix initializer for riscv code alignment handling | Jeff Law | 1 | -3/+27
The construct used for initializing the code alignments in a recent change is causing bootstrap problems on riscv64 as seen in the referenced bugzilla. This patch adjusts the initializer by pushing the NULL down into each uarch clause. Bootstrapped on riscv64, regression test in flight, but given bootstrap is broken it seemed advisable to move this forward now. I'm so much looking forward to the day when we have performant hardware for bootstrap testing... Sigh. Anyway, bootstrapped and installing on the trunk. PR target/117316 gcc/ * config/riscv/riscv.cc (riscv_tune_param): Drop initializer. (*_tune_info): Add initializers for code alignments.
2024-10-28 | tree-optimization/117307 - STMT_VINFO_SLP_VECT_ONLY mis-computation | Richard Biener | 2 | -6/+30
STMT_VINFO_SLP_VECT_ONLY isn't properly computed as union of all group members and when the group is later split due to duplicates not all sub-groups inherit the flag. PR tree-optimization/117307 * tree-vect-data-refs.cc (vect_analyze_data_ref_accesses): Properly compute STMT_VINFO_SLP_VECT_ONLY. Set it on all parts of a split group. * gcc.dg/vect/pr117307.c: New testcase.
2024-10-28 | tree-core.h (omp_clause_code): Comments regarding range checks for OMP_CLAUSE_... | Tobias Burnus | 1 | -1/+23
gcc/ChangeLog: * tree-core.h (enum omp_clause_code): Add comments to cross ref to OMP_CLAUSE_DECL etc. and mark the ranges used in the range checks.
2024-10-28 | vec-lowering: Fix ABSU lowering [PR111285] | Andrew Pinski | 2 | -1/+38
ABSU_EXPR lowering incorrectly used the resulting type for the new expression, but in the case of ABSU the resulting type is an unsigned type, and with an unsigned operand the ABSU is folded away. The fix is to use a signed type for the expression instead. Bootstrapped and tested on x86_64-linux-gnu. PR middle-end/111285 gcc/ChangeLog: * tree-vect-generic.cc (do_unop): Use a signed type for the operand if the operation was ABSU_EXPR. gcc/testsuite/ChangeLog: * g++.dg/torture/vect-absu-1.C: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
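The ABSU semantics in this message can be illustrated with a scalar sketch in C. This is not the vector lowering code from tree-vect-generic.cc: the function names are hypothetical and a 32-bit `int` is assumed. The first function reads the operand as signed and returns an unsigned result; the second models the faulty lowering, where the operand is already treated as unsigned and the absolute value degenerates to the identity.

```c
#include <assert.h>
#include <limits.h>

/* Absolute value with an unsigned result.  The operand is interpreted
   as signed; the negation is done in unsigned arithmetic, so even
   INT_MIN has a well-defined result.  */
unsigned absu_signed_operand(int x)
{
    unsigned ux = (unsigned)x;
    return x < 0 ? 0u - ux : ux;
}

/* Model of the bug: once the operand is viewed as unsigned it is never
   "negative", so taking its absolute value is a no-op.  */
unsigned absu_unsigned_operand(int x)
{
    return (unsigned)x;
}

/* Hypothetical self-check contrasting the two behaviours.  */
void absu_self_check(void)
{
    assert(absu_signed_operand(-5) == 5u);
    assert(absu_unsigned_operand(-5) != 5u);
}
```

For non-negative inputs the two functions agree, which is one reason a bug of this kind can go unnoticed without a test that feeds in negative values.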
2024-10-28 | phiopt: Move check for maybe_undef_p slightly earlier | Andrew Pinski | 1 | -7/+7
This moves the check for maybe_undef_p in match_simplify_replacement slightly earlier before figuring out the true/false arg using arg0/arg1 instead. In most cases this is no difference in compile time; just in the case there is an undef in the args there would be a slight compile time improvement as there is no reason to figure out which arg corresponds to the true/false side of the conditional. Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * tree-ssa-phiopt.cc (match_simplify_replacement): Move check for maybe_undef_p earlier. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-10-28 | Remove code in vectorizer pattern recog relying on vec_cond{u,eq,} | Richard Biener | 1 | -36/+1
With the intent to rely on vec_cond_mask and vec_cmp patterns comparisons do not need rewriting into COND_EXPRs that eventually combine to vec_cond{u,eq,}. * tree-vect-patterns.cc (check_bool_pattern): For comparisons we do nothing if we can expand them or we can't replace them with a ? -1 : 0 condition - but the latter would require expanding the comparison which we proved we can't. So do nothing, aka not think vec_cond{u,eq,} will save us.
2024-10-28 | RISC-V: Bugfix for vlmul_ext and vlmul_trunc with NULL return value [pr117286] | xuli | 2 | -0/+20
This patch fixes the following ICE:

    test.c: In function 'func':
    test.c:37:24: internal compiler error: Segmentation fault
       37 |   vfloat16mf2_t vc = __riscv_vlmul_trunc_v_f16m1_f16mf2(vb);
          |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The root cause is that vlmul_trunc has a null return value:

    gimple_call <__riscv_vlmul_trunc_v_f16m1_f16mf2, NULL, vb_13>
                                                     ^^^

Passed the rv64gcv_zvfh regression test.

Signed-off-by: Li Xu <xuli1@eswincomputing.com>

PR target/117286

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc: Do not expand NULL return.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr117286.c: New test.
2024-10-28 | gcc.target/i386/pr53533-[13].c: Adjust assembly scan | H.J. Lu | 2 | -2/+6
Before

    1089d083117 Simplify (B * v + C) * D -> BD* v + CD when B,C,D are all INTEGER_CST.

the loop was

    .L2:
        movl    (%rdi,%rdx), %eax
        addl    $12345, %eax
        imull   $-1564285888, %eax, %eax
        leal    -333519936(%rax), %eax
        movl    %eax, (%rsi,%rdx)
        addq    $4, %rdx
        cmpq    $1024, %rdx
        jne     .L2

There were 1 addl and 1 leal. 1 addq was to update the loop counter. The optimized loop is

    .L2:
        imull   $-1564285888, (%rdi,%rax), %edx
        subl    $1269844480, %edx
        movl    %edx, (%rsi,%rax)
        addq    $4, %rax
        cmpq    $1024, %rax
        jne     .L2

1 addl is changed to subl and leal is removed. Adjust assembly scan to check for 1 subl and 1 addl/addq as well as lea removal.

* gcc.target/i386/pr53533-1.c: Adjust assembly scan.
* gcc.target/i386/pr53533-3.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
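As a cross-check of the two loop bodies quoted in this message, the per-element arithmetic can be modeled in C with 32-bit wrapping (unsigned) arithmetic. This is a sketch reconstructed from the assembly immediates only; the function names are hypothetical and the actual pr53533 test source is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplier immediate used by imull in both loops.  */
#define MUL_D ((uint32_t)-1564285888)

/* Old loop body: addl $12345; imull $-1564285888; leal -333519936(%rax).  */
uint32_t before_fold(uint32_t v)
{
    uint32_t t = v + 12345u;
    t *= MUL_D;
    return t - 333519936u;
}

/* New loop body: imull $-1564285888; subl $1269844480.  */
uint32_t after_fold(uint32_t v)
{
    return v * MUL_D - 1269844480u;
}

/* Counts inputs in [0, n) where the two forms disagree.  */
uint32_t count_fold_mismatches(uint32_t n)
{
    uint32_t bad = 0;
    for (uint32_t v = 0; v < n; v++)
        bad += before_fold(v) != after_fold(v);
    return bad;
}
```

The two forms agree because 12345 * -1564285888 - 333519936 is congruent to -1269844480 modulo 2^32, so the addition and the lea fold into the single subl immediate.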
2024-10-28 | Daily bump. | GCC Administrator | 5 | -1/+88
2024-10-27 | arm: Support -mfdpic for more targets | Fangrui Song | 2 | -2/+5
Targets that are not arm*-*-uclinuxfdpiceabi can use -S -mfdpic, but -c -mfdpic does not pass --fdpic to gas. This is an unnecessary restriction. Just define the ASM_SPEC in bpabi.h. Additionally, use armelf[b]_linux_fdpiceabi emulations for -mfdpic in linux-eabi.h. This will allow a future musl fdpic port to use the desired BFD emulation. gcc/ChangeLog: * config/arm/bpabi.h (TARGET_FDPIC_ASM_SPEC): Transform -mfdpic. * config/arm/linux-eabi.h (TARGET_FDPIC_LINKER_EMULATION): Define. (SUBTARGET_EXTRA_LINK_SPEC): Use TARGET_FDPIC_LINKER_EMULATION if -mfdpic.
2024-10-27 | xtensa: Define TARGET_DIFFERENT_ADDR_DISPLACEMENT_P target hook | Takayuki 'January June' Suwa | 2 | -6/+9
In commit bc5a9dab55d13f888a3cdd150c8cf5c2244f35e0 ("gcc: xtensa: reorder movsi_internal patterns for better code generation during LRA"), the instruction order in "movsi_internal" MD definition was changed to make LRA use load/store instructions with larger memory address displacements, but as a side effect, it now uses the larger displacements (ie., the larger instructions) even outside of reload operations. The underlying problem is that LRA assumes by default that there is only one maximal legitimate displacement for the same address structure, meaning that it has no choice but to use the first load/store instruction it finds. To fix this, define TARGET_DIFFERENT_ADDR_DISPLACEMENT_P hook to always return true. gcc/ChangeLog: * config/xtensa/xtensa.cc (TARGET_DIFFERENT_ADDR_DISPLACEMENT_P): Add new target hook to always return true. * config/xtensa/xtensa.md (movsi_internal): Revert the previous changes.
2024-10-27 | genmatch: Add selftests to genmatch for diag_vfprintf | Jakub Jelinek | 6 | -2/+152
The following patch adds selftests to genmatch to verify the new printing routine there. So that I can rely on HAVE_DECL_FMEMOPEN (host test), the tests are done solely in stage2+ where we link the host libcpp etc. to genmatch. The tests have been adjusted from pretty-print.cc (test_pp_format), and I've added to that function two new tests because I've noticed nothing was testing the %M$.*N$s etc. format specifiers. 2024-10-27 Jakub Jelinek <jakub@redhat.com> * configure.ac (gcc_AC_CHECK_DECLS): Add fmemopen. * configure: Regenerate. * config.in: Regenerate. * Makefile.in (build/genmatch.o): Add -DGENMATCH_SELFTESTS to BUILD_CPPFLAGS for stage2+ genmatch. * genmatch.cc (test_diag_vfprintf, genmatch_diag_selftests): New functions. (main): Call genmatch_diag_selftests. * pretty-print.cc (test_pp_format): Add two tests, one for %M$.*N$s and one for %M$.Ns.
2024-10-27 | c-family: -Wleading-whitespace= argument spelling | Jakub Jelinek | 6 | -10/+10
On Thu, Oct 24, 2024 at 03:33:25PM -0400, Eric Gallager wrote: > On Thu, Oct 24, 2024 at 4:17 AM Jakub Jelinek <jakub@redhat.com> wrote: > > I've tried to build stage3 with > > -Wleading-whitespace=blanks -Wtrailing-whitespace=blank -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank > > So wait, it's "blanks" (plural) when it's leading, but "blank" > (singular) when it's trailing? That inconsistency bothers me... I've mentioned it already in https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664664.html Citing that here: Not sure about the kinds for the option, given -Wleading-whitespace= uses plural and this option singular and -Wleading-whitespace= spaces means literally just ' ' characters, while space in -Wtrailing-whitespace= was ' ', '\t', '\v' and '\f'; so category; perhaps just use any and blanks? Other preferences? Here is a patch to do the blank->blanks and space->any changes. 2024-10-27 Jakub Jelinek <jakub@redhat.com> gcc/ * doc/invoke.texi (Wtrailing-whitespace=): Change blank argument to blanks and space argument to any. gcc/c-family/ * c.opt (warn_trailing_whitespace_kind): Change blank to blanks and space to any. gcc/testsuite/ * c-c++-common/cpp/Wtrailing-whitespace-2.c: Use -Wtrailing-whitespace=blanks rather than -Wtrailing-whitespace=blank. * c-c++-common/cpp/Wtrailing-whitespace-3.c: Use -Wtrailing-whitespace=any rather than -Wtrailing-whitespace=space. * c-c++-common/cpp/Wtrailing-whitespace-7.c: Use -Wtrailing-whitespace=blanks rather than -Wtrailing-whitespace=blank. * c-c++-common/cpp/Wtrailing-whitespace-8.c: Use -Wtrailing-whitespace=any rather than -Wtrailing-whitespace=space.
2024-10-27 | testsuite: Fix up gcc.dg/vec-perm-lower.c test | Jakub Jelinek | 1 | -1/+1
On Tue, Oct 15, 2024 at 12:45:35PM +0000, Tamar Christina wrote: > I'll write a gimple one and commit with this then. The new test FAILs on i686-linux, with the usual FAIL: gcc.dg/vec-perm-lower.c (test for excess errors) Excess errors: .../gcc/testsuite/gcc.dg/vec-perm-lower.c:9:1: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi] .../gcc/testsuite/gcc.dg/vec-perm-lower.c:8:1: warning: MMX vector argument without MMX enabled changes the ABI [-Wpsabi] The following patch fixes that. Tested on x86_64-linux with make check-gcc RUNTESTFLAGS='--target_board=unix/\{-m32,-m32/-mno-sse/-mno-mmx,-m64\} dg.exp=vec-perm-lower.c' which previously FAILed, now PASSes, ok for trunk? 2024-10-27 Jakub Jelinek <jakub@redhat.com> * gcc.dg/vec-perm-lower.c: Add -Wno-psabi to dg-options.
2024-10-27 | Fortran: Fix regressions with intent(out) class [PR115070, PR115348]. | Paul Thomas | 3 | -12/+80
2024-10-27 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/115070 PR fortran/115348 * trans-expr.cc (gfc_trans_class_init_assign): If all the components of the default initializer are null for a scalar, build an empty statement to prevent prior declarations from disappearing. gcc/testsuite/ PR fortran/115070 * gfortran.dg/pr115070.f90: New test. PR fortran/115348 * gfortran.dg/pr115348.f90: New test.
2024-10-27 | testsuite: Sanitize pacbti test cases for Cortex-M | Torbjörn SVENSSON | 14 | -24/+24
Some of the test cases were scanning for "bti", but it would, incorrectly, match the ".arch_extension pacbti". gcc/testsuite/ChangeLog: * gcc.target/arm/bti-1.c: Check for asm instructions starting with a tab. * gcc.target/arm/bti-2.c: Likewise. * gcc.target/arm/pac-1.c: Likewise. * gcc.target/arm/pac-2.c: Likewise. * gcc.target/arm/pac-3.c: Likewise. * gcc.target/arm/pac-4.c: Likewise. * gcc.target/arm/pac-6.c: Likewise. * gcc.target/arm/pac-7.c: Likewise. * gcc.target/arm/pac-8.c: Likewise. * gcc.target/arm/pac-9.c: Likewise. * gcc.target/arm/pac-10.c: Likewise. * gcc.target/arm/pac-11.c: Likewise. * gcc.target/arm/pac-15.c: Likewise. * gcc.target/arm/pac-sibcall.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com> Co-authored-by: Yvan ROUX <yvan.roux@foss.st.com>
2024-10-27 | Daily bump. | GCC Administrator | 5 | -1/+62
2024-10-26 | doc, fortran: Add a missing menu item. | Iain Sandoe | 1 | -0/+1
The changes in r15-4697-g4727bfb37701 omit a menu entry, which causes a bootstrap failure when Fortran is included, at least for makeinfo 6.7. Fixed thus. gcc/fortran/ChangeLog: * intrinsic.texi: Add menu item for UINT. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
2024-10-26 | tree: Mark PAREN_EXPR and VEC_DUPLICATE_EXPR as non-trapping [PR117234] | Andrew Pinski | 5 | -0/+75
While looking to fix a possible trapping issue in PHI-OPT's factor, I noticed that some tree codes could be marked as trapping even though they don't have a possibility to trap. In the case of PAREN_EXPR, it is basically a nop except when it comes to association across it so it can't trap. In the case of VEC_DUPLICATE_EXPR, it is similar to a CONSTRUCTOR, so it can't trap. This fixes those 2 issues and adds 4 testcases, 2 which are specific to aarch64 since the only way to get a VEC_DUPLICATE_EXPR is to use intrinsics currently. Build and tested for aarch64-linux-gnu. PR tree-optimization/117234 gcc/ChangeLog: * tree-eh.cc (operation_could_trap_helper_p): Treat PAREN_EXPR and VEC_DUPLICATE_EXPR like constructing expressions. gcc/testsuite/ChangeLog: * g++.dg/eh/noncall-fp-1.C: New test. * g++.target/aarch64/sve/noncall-eh-fp-1.C: New test. * gcc.dg/tree-ssa/trapping-1.c: New test. * gcc.target/aarch64/sve/trapping-1.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-10-26 | Add UNSIGNED for intrinsics. | Thomas Koenig | 3 | -193/+409
gcc/fortran/ChangeLog: * gfortran.texi: Correct reference to make clear that UNSIGNED will not be part of F202Y. Other clarifications. Extend table of intrinsics, add links. * intrinsic.texi: Add descriptions for UNSIGNED arguments. * invoke.texi: Add anchor for -funsigned.
2024-10-26 | Fix old glitch in the GNAT Reference Manual | Eric Botcazou | 2 | -2/+2
gcc/ada PR ada/62122 * doc/gnat_rm/implementation_defined_attributes.rst (Unrestricted_Access): Remove null exclusion. * gnat_rm.texi: Regenerate.
2024-10-26 | Assert finished vectorizer pattern COND_EXPR transition | Richard Biener | 2 | -7/+3
The following places a few strategic asserts so we do not end up with COND_EXPRs with a comparison as the first operand during vectorization. * tree-vect-slp.cc (vect_get_operand_map): Mark COMPARISON_CLASS_P COND_EXPR condition path unreachable. * tree-vect-stmts.cc (vect_is_simple_use): Likewise. (vectorizable_condition): Assert the COND_EXPR condition isn't COMPARISON_CLASS_P.
2024-10-26 | Finish vectorizer pattern proper COND_EXPR transition | Richard Biener | 1 | -2/+5
This fixes up vect_recog_ctz_ffs_pattern. * tree-vect-patterns.cc (vect_recog_ctz_ffs_pattern): Create a separate pattern stmt for the comparison in the generated COND_EXPR.
2024-10-26 | Finish vectorizer pattern proper COND_EXPR transition | Richard Biener | 1 | -2/+5
The following tries to finish building proper GIMPLE COND_EXPRs in vectorizer pattern recognition. * tree-vect-patterns.cc (vect_recog_divmod_pattern): Build separate comparison pattern for the condition of a COND_EXPR pattern.
2024-10-26 | testsuite: fixup tbaa test again | Sam James | 1 | -4/+4
Test was broken until r15-4684-g2d1d6be00257c5 which made it actually run and r15-4685-g091e45b4e97d1e which applied fixes other than the trivial rename. But more is needed: this gets the test working properly in terms of scanning the dump and handling the interaction w/ LTO with not producing an executable (did try ltrans scan but that didn't work either). Unfortunately, the test seems to fail for me on godbolt even going back to GCC 7.1 or thereabouts, hence XFAIL. However, if I revert r9-3870-g2a98b4bfc3d952, I do get an ICE in fld_incomplete_type_of -- because we do far more checking with LTO now on (in)complete types. And reverting it on releases/gcc-9 actually makes it give 0. In summary: fix the test fully so it really does run and we get a check for ICEing at least, and mark the dg-final scan as XFAIL so Honza can comment on that. gcc/testsuite/ChangeLog: PR testsuite/117299 * gcc.dg/lto/tbaa_0.c: Move to... * gcc.dg/tbaa.c: ...here.
2024-10-26 | Daily bump. | GCC Administrator | 17 | -1/+2406
2024-10-25 | simplify-rtx: Handle `a != 0 ? -a : 0` [PR58195] | Andrew Pinski | 3 | -0/+72
The gimple (and generic) levels have had this optimization since r12-2041-g7d6979197274a662da7bdc5. It seems like a good idea to add a similar one to rtl just in case it is not caught at the gimple level. Note the loop case in csel-neg-1.c is not handled at the gimple level (even with phiopt turned back on); this is because of casts to avoid signed integer overflow. A patch to fix this at the gimple level will be submitted separately. Changes since v1: * v2: Use `CONST0_RTX (mode)` instead of const0_rtx. Add csel-neg-2.c for float testcase which now passes. Build and tested for aarch64-linux-gnu. PR rtl-optimization/58195 gcc/ChangeLog: * simplify-rtx.cc (simplify_context::simplify_ternary_operation): Handle `a != 0 ? -a : 0` and `a == 0 ? 0 : -a`. gcc/testsuite/ChangeLog: * gcc.target/aarch64/csel-neg-1.c: New test. * gcc.target/aarch64/csel-neg-2.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
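The identity behind this simplification can be sketched in C for the integer case: when `a` is zero, `-a` is also zero, so the conditional adds nothing. The function names below are hypothetical, the contents of csel-neg-1.c are not reproduced, and the sketch assumes callers never pass INT32_MIN (negating it overflows). The float case depends on signed-zero handling and is not modeled here.

```c
#include <assert.h>
#include <stdint.h>

/* The two conditional shapes named in the ChangeLog entry.  */
int32_t cond_neg_ne(int32_t a) { return a != 0 ? -a : 0; }
int32_t cond_neg_eq(int32_t a) { return a == 0 ? 0 : -a; }

/* What both shapes reduce to for integers: a plain negation.  */
int32_t plain_neg(int32_t a) { return -a; }

/* Counts inputs in [-limit, limit] where any form disagrees.  */
int count_neg_mismatches(int32_t limit)
{
    int bad = 0;
    for (int32_t a = -limit; a <= limit; a++)
        bad += cond_neg_ne(a) != plain_neg(a)
               || cond_neg_eq(a) != plain_neg(a);
    return bad;
}
```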
2024-10-25 | testsuite: lto: fix pr47333 test | Sam James | 1 | -0/+3
This failure was hidden until we started to run the test by fixing the filename earlier: ignore -Wtemplate-body using a pragma like e.g. g++.dg/lto/20101010-1_0.C does because lto.exp doesn't support dg-additional-options. gcc/testsuite/ChangeLog: PR lto/47333 * g++.dg/lto/pr47333_0.C: Ignore -Wtemplate-body.
2024-10-25 | testsuite: lto: fix pr62026 test | Sam James | 1 | -1/+1
This failure was hidden until we started to run the test by fixing the filename earlier: pass -Wno-return-type. gcc/testsuite/ChangeLog: PR lto/62026 * g++.dg/lto/pr62026_0.C: Pass -Wno-return-type.
2024-10-25 | testsuite: lto: fix pr95677 test | Sam James | 1 | -3/+2
These failures were hidden until we started to run the test by fixing the filename earlier: use dg-lto directives. gcc/testsuite/ChangeLog: PR c++/95677 * g++.dg/lto/pr95677_0.C: Use dg-lto-*.
2024-10-25 | testsuite: lto: fix tbaa_0 test | Sam James | 1 | -3/+5
These failures were hidden until we started to run the test by fixing the filename earlier: use dg-lto directives, pass -std=gnu89 for implicit-int, and use -flto-partition=none like c-c++-common/hwasan/builtin-special-handling.c. gcc/testsuite/ChangeLog: * gcc.dg/lto/tbaa_0.c: Use dg-lto directives, pass -std=gnu89, and use -flto-partition=none.
2024-10-25 | testsuite: lto: rename tbaa-1 test | Sam James | 1 | -0/+0
This was being ignored previously. Rename it per README. gcc/testsuite/ChangeLog: * gcc.dg/lto/tbaa-1.c: Move to... * gcc.dg/lto/tbaa_0.c: ...here.
2024-10-25 | testsuite: lto: rename pr47333 test | Sam James | 1 | -0/+0
This was being ignored previously. Rename it per README. gcc/testsuite/ChangeLog: PR target/47333 * g++.dg/lto/pr47333.C: Move to... * g++.dg/lto/pr47333_0.C: ...here.
2024-10-25 | testsuite: lto: rename pr62026 test | Sam James | 1 | -0/+0
This was being ignored previously. Rename it per README. gcc/testsuite/ChangeLog: PR lto/62026 * g++.dg/lto/pr62026.C: Move to... * g++.dg/lto/pr62026_0.C: ...here.
2024-10-25 | testsuite: lto: rename pr95677 test | Sam James | 1 | -0/+0
This was being ignored previously. Rename it per README. gcc/testsuite/ChangeLog: PR c++/95677 * g++.dg/lto/pr95677.C: Move to... * g++.dg/lto/pr95677_0.C: ...here.
2024-10-25 | aarch64: Support multiple variants including up to 3 | Andrew Pinski | 5 | -5/+74
On some of Qualcomm's SoCs that include the oryon-1 core, the variant will be different across the cores due to the big.little config. However, the difference between big and little is not significant enough to have separate cost/scheduling models for them, and the feature set is the same across all variants. Also, on some SoCs there are 3 variants of the core (big.middle.little), so this increases the support there to up to 3 cores and 3 variants in the original parsing loop, but it does not change the support for a max of 2 different cores. After this patch and the patch that adds oryon-1, -mcpu=native works on the SoCs I am working with. Bootstrapped and tested on aarch64-linux-gnu with no regressions. gcc/ChangeLog: * config/aarch64/driver-aarch64.cc (host_detect_local_cpu): Support 3 cores and 3 variants. If there is one core but multiple variants, then treat the variant as being all. gcc/testsuite/ChangeLog: * gcc.target/aarch64/cpunative/info_25: New file. * gcc.target/aarch64/cpunative/info_26: New file. * gcc.target/aarch64/cpunative/native_cpu_25.c: New test. * gcc.target/aarch64/cpunative/native_cpu_26.c: New test. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2024-10-25 | AArch64: Add more accurate constraint [PR117292] | Wilco Dijkstra | 4 | -3/+50
As shown in the PR, reload may only check the constraint in some cases and not check that the predicate is still valid for the resulting instruction. To fix the issue, add a new constraint which matches the predicate exactly. gcc/ChangeLog: PR target/117292 * config/aarch64/aarch64-simd.md (xor<mode>3<vczle><vczbe>): Use 'De' constraint. * config/aarch64/constraints.md (De): Add new constraint. gcc/testsuite/ChangeLog: PR target/117292 * gcc.target/aarch64/sve/single_5.c: Remove xfails. * gcc.target/aarch64/pr117292.c: New test.
2024-10-25 | Fortran: Fix ICE with structure constructor in data statement [PR79685] | Paul Thomas | 4 | -8/+46
2024-10-25 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/79685 * decl.cc (match_data_constant): Find the symtree instead of the symbol so that use-renamed symbols are found. Pass this and the derived type to gfc_match_structure_constructor. * match.h: Update prototype of gfc_match_structure_constructor. * primary.cc (gfc_match_structure_constructor): Remove call to gfc_get_ha_sym_tree and use caller supplied symtree instead. gcc/testsuite/ PR fortran/79685 * gfortran.dg/use_rename_13.f90: New test.
2024-10-25 | testsuite: add testcase for fixed PR115933 | Sam James | 1 | -0/+19
gcc/testsuite/ChangeLog: PR rtl-optimization/115933 * gcc.dg/pr115933.c: New test.
2024-10-25 | aarch64: Add mfloat vreinterpret intrinsics | Andrew Carlotti | 2 | -1/+57
gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (MODE_d_mf8): New. (MODE_q_mf8): New. (QUAL_mf8): New. (VREINTERPRET_BUILTINS1): Add mf8 entry. (VREINTERPRET_BUILTINS): Ditto. (VREINTERPRETQ_BUILTINS1): Ditto. (VREINTERPRETQ_BUILTINS): Ditto. (aarch64_lookup_simd_type_in_table): Match modal_float bit gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/mf8-reinterpret.c: New test.
2024-10-25 | aarch64: Add support for mfloat8x{8|16}_t types | Andrew Carlotti | 12 | -0/+23
gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (aarch64_init_simd_builtin_types): Initialise FP8 simd types. * config/aarch64/aarch64-builtins.h (enum aarch64_type_qualifiers): Add qualifier_modal_float bit. * config/aarch64/aarch64-simd-builtin-types.def: Add Mfloat8x{8|16}_t types. * config/aarch64/arm_neon.h: Add mfloat8x{8|16}_t typedefs. gcc/testsuite/ChangeLog: * gcc.target/aarch64/movv16qi_2.c: Test mfloat as well. * gcc.target/aarch64/movv16qi_3.c: Ditto. * gcc.target/aarch64/movv2x16qi_1.c: Ditto. * gcc.target/aarch64/movv3x16qi_1.c: Ditto. * gcc.target/aarch64/movv4x16qi_1.c: Ditto. * gcc.target/aarch64/movv8qi_2.c: Ditto. * gcc.target/aarch64/movv8qi_3.c: Ditto. * gcc.target/aarch64/mfloat-init-1.c: New test.
2024-10-25 | match.pd: Add std::pow folding optimizations. | Jennifer Schmitz | 2 | -0/+70
This patch adds the following two simplifications in match.pd for POW_ALL and POWI: - pow (1.0/x, y) to pow (x, -y), avoiding the division - pow (0.0, x) to 0.0, avoiding the call to pow. The patterns are guarded by flag_unsafe_math_optimizations, !flag_trapping_math, and !HONOR_INFINITIES. The POW_ALL patterns are also gated under !flag_errno_math. The second pattern is also guarded by !HONOR_NANS and !HONOR_SIGNED_ZEROS. Tests were added to confirm the application of the transform for builtins pow, powf, powl, powi, powif, powil, and powf16. The patch was bootstrapped and regtested on aarch64-linux-gnu and x86_64-linux-gnu, no regression. OK for mainline? Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com> gcc/ * match.pd: Fold pow (1.0/x, y) -> pow (x, -y) and pow (0.0, x) -> 0.0. gcc/testsuite/ * gcc.dg/tree-ssa/pow_fold_1.c: New test.
2024-10-25 | Match: Simplify branch form 3 of unsigned SAT_ADD into branchless | Pan Li | 5 | -4/+67
There are several forms of the unsigned SAT_ADD. Some of them are complicated while others are cheap. This patch simplifies the complicated form into the cheap one. For example:

From form 3 (branch):

    SAT_U_ADD = (X + Y) >= x ? (X + Y) : -1.

To (branchless):

    SAT_U_ADD = (X + Y) | - ((X + Y) < X).

    #define T uint8_t

    T sat_add_u_1 (T x, T y)
    {
      return (T)(x + y) >= x ? (x + y) : -1;
    }

Before this patch:

    uint8_t sat_add_u_1 (uint8_t x, uint8_t y)
    {
      uint8_t D.2809;

      _1 = x + y;
      if (x <= _1) goto <D.2810>; else goto <D.2811>;
      <D.2810>:
      D.2809 = x + y;
      goto <D.2812>;
      <D.2811>:
      D.2809 = 255;
      <D.2812>:
      return D.2809;
    }

After this patch:

    uint8_t sat_add_u_1 (uint8_t x, uint8_t y)
    {
      uint8_t D.2809;

      _1 = x + y;
      _2 = x + y;
      _3 = x > _2;
      _4 = (unsigned char) _3;
      _5 = -_4;
      D.2809 = _1 | _5;
      return D.2809;
    }

The simplification does not need to check whether the target supports SAT_ADD, since it is an optimization at the gimple level.

The following test suites pass with this patch:

* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Remove unsigned branch form 3 for SAT_ADD, and add simplify to branchless instead.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/sat_u_add-simplify-1-u16.c: New test.
* gcc.dg/tree-ssa/sat_u_add-simplify-1-u32.c: New test.
* gcc.dg/tree-ssa/sat_u_add-simplify-1-u64.c: New test.
* gcc.dg/tree-ssa/sat_u_add-simplify-1-u8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
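The equivalence of the branch and branchless forms described in this message can be checked exhaustively for `uint8_t`. The following C sketch is only an illustration: the function names are hypothetical and it is not the match.pd pattern itself.

```c
#include <assert.h>
#include <stdint.h>

/* Form 3 (branch): keep the sum unless it wrapped around.  */
uint8_t sat_add_u_branch(uint8_t x, uint8_t y)
{
    return (uint8_t)(x + y) >= x ? (uint8_t)(x + y) : (uint8_t)-1;
}

/* Branchless form: OR the wrapped sum with an all-ones mask that is
   produced only when the sum wrapped (sum < x).  */
uint8_t sat_add_u_branchless(uint8_t x, uint8_t y)
{
    uint8_t sum = (uint8_t)(x + y);
    uint8_t mask = (uint8_t)-(sum < x);   /* 0x00 or 0xFF */
    return (uint8_t)(sum | mask);
}

/* Compares both forms over all 65536 input pairs.  */
unsigned count_sat_add_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++)
            bad += sat_add_u_branch((uint8_t)x, (uint8_t)y)
                   != sat_add_u_branchless((uint8_t)x, (uint8_t)y);
    return bad;
}
```

The mask trick works because an unsigned sum is smaller than either operand exactly when it overflowed.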
2024-10-25 | Assorted --disable-checking fixes [PR117249] | Jakub Jelinek | 11 | -14/+26
We have currently 3 different definitions of gcc_assert macro, one used most of the time (unless --disable-checking) which evaluates the condition at runtime and also checks it at runtime, then one for --disable-checking GCC 4.5+ which looks like ((void)(UNLIKELY (!(EXPR)) ? __builtin_unreachable (), 0 : 0)) and a fallback one ((void)(0 && (EXPR))) Now, the last one actually doesn't evaluate any of the side-effects in the argument, just quiets up unused var/parameter warnings. I've tried to replace the middle definition with ({ [[assume (EXPR)]]; (void) 0; }) for compilers which support assume attribute and statement expressions (surprisingly quite a few spots use gcc_assert inside of comma expressions), but ran into PR117287, so for now such a change isn't being proposed. The following patch attempts to move important side-effects from gcc_assert arguments. Bootstrapped/regtested on x86_64-linux and i686-linux with normal --enable-checking=yes,rtl,extra, plus additionally I've attempted to do x86_64-linux bootstrap with --disable-checking and gcc_assert changed to the ((void)(0 && (EXPR))) version when --disable-checking. That version ran into spurious middle-end warnings ../../gcc/../include/libiberty.h:733:36: error: argument to ‘alloca’ is too large [-Werror=alloca-larger-than=] ../../gcc/tree-ssa-reassoc.cc:5659:20: note: in expansion of macro ‘XALLOCAVEC’ int op_num = ops.length (); int op_normal_num = op_num; gcc_assert (op_num > 0); int stmt_num = op_num - 1; gimple **stmts = XALLOCAVEC (gimple *, stmt_num); where we have gcc_assert exactly to work-around middle-end warnings. 
Guess I'd need to also disable -Werror for this experiment, which actually isn't a problem with unmodified system.h, because even for --disable-checking we use the __builtin_unreachable at least in stage2/stage3 and so the warnings aren't emitted, and even if it used [[assume ()]]; it would work too because in stage2/stage3 we could again rely on assume and statement expression support. 2024-10-25 Jakub Jelinek <jakub@redhat.com> PR middle-end/117249 * tree-ssa-structalias.cc (insert_vi_for_tree): Move put calls out of gcc_assert. * lto-cgraph.cc (lto_symtab_encoder_delete_node): Likewise. * gimple-ssa-strength-reduction.cc (get_alternative_base, add_cand_for_stmt): Likewise. * tree-eh.cc (add_stmt_to_eh_lp_fn): Likewise. * except.cc (duplicate_eh_regions_1): Likewise. * tree-ssa-reassoc.cc (insert_operand_rank): Likewise. * config/nvptx/nvptx.cc (nvptx_expand_call): Use == rather than = in gcc_assert. * opts-common.cc (jobserver_info::disconnect): Call close outside of gcc_assert and only check result in it. (jobserver_info::return_token): Call write outside of gcc_assert and only check result in it. * genautomata.cc (output_default_latencies): Move j++ side-effect outside of gcc_assert. * tree-ssa-loop-ivopts.cc (get_alias_ptr_type_for_ptr_address): Use == rather than = in gcc_assert. * cgraph.cc (symbol_table::create_edge): Move ++edges_max_uid side-effect outside of gcc_assert.
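The hazard this patch addresses can be reproduced with a small C sketch. The macros below are hypothetical stand-ins modeled on the definitions quoted in the message, not the real gcc_assert from system.h, and the helper functions are invented for illustration. With the fallback form the asserted expression is never evaluated, so any side effect written inside it silently disappears.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for a checking build: evaluates EXPR and aborts on failure.  */
#define CHECKED_ASSERT(EXPR)  ((void)(!(EXPR) ? abort(), 0 : 0))

/* Stand-in for the fallback form: EXPR is parsed but never evaluated.  */
#define FALLBACK_ASSERT(EXPR) ((void)(0 && (EXPR)))

/* Side effect inside the assertion: works only in checking builds.  */
int next_id_checked(int *counter)
{
    CHECKED_ASSERT(++*counter > 0);
    return *counter;
}

/* Same code with the fallback macro: the increment is lost.  */
int next_id_buggy(int *counter)
{
    FALLBACK_ASSERT(++*counter > 0);
    return *counter;
}

/* Shape of the fix: perform the side effect first, assert on the result.  */
int next_id_fixed(int *counter)
{
    int id = ++*counter;
    FALLBACK_ASSERT(id > 0);
    return id;
}
```

The fixed variant behaves the same under either macro, which is the property the ChangeLog entries below aim for.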
2024-10-25 | lto: Handle RAW_DATA_CST in compare_tree_sccs_1 [PR117201] | Jakub Jelinek | 3 | -0/+108
I've missed I need to add RAW_DATA_CST support in compare_tree_sccs_1, because without that it considers all RAW_DATA_CSTs to be equivalent, regardless of their length or content. 2024-10-24 Jakub Jelinek <jakub@redhat.com> PR lto/117201 PR lto/117288 * lto-common.cc (compare_tree_sccs_1): Handle RAW_DATA_CST. * gcc.dg/lto/pr117201_0.c: New test. * gcc.dg/lto/pr117288_0.c: New test.
2024-10-25 | Default expand_vec_cond_expr_p code to ERROR_MARK | Richard Biener | 5 | -18/+13
As we want to transition to only vcond_mask expanders, the following makes it easier to distinguish queries that rely on vcond queries for expand_vec_cond_expr_p from those of vcond_mask, by having the comparison code default to ERROR_MARK for the latter. * optabs-tree.h (expand_vec_cond_expr_p): Default the comparison code to ERROR_MARK. * match.pd: Remove unneeded expand_vec_cond_expr_p args. * tree-vect-generic.cc (expand_vector_condition): Likewise. * tree-vect-loop.cc (vect_reduction_update_partial_vector_usage): Likewise. * tree-vect-stmts.cc (vectorizable_simd_clone_call): Likewise. (scan_store_can_perm_p): Likewise. (vectorizable_condition): Likewise.
2024-10-25 | testsuite: Generalise tree-ssa/shifts-3.c regexp | Richard Sandiford | 1 | -1/+1
My recent gcc.dg/tree-ssa/shifts-3.c test failed on arm-linux-gnu because it used widen_mult_expr to do a multiplication on chars. This patch generalises the regexp in the same way as for f3. gcc/testsuite/ * gcc.dg/tree-ssa/shifts-3.c: Accept widen_mult for f2 too.
2024-10-25 | Add regression test | Eric Botcazou | 1 | -0/+25
gcc/testsuite PR ada/116551 * gnat.dg/specs/vfa3.ads: New test.
2024-10-25 | Restrict :c to commutative ops as intended | Richard Biener | 2 | -26/+16
genmatch was supposed to restrict :c to verifiable commutative operations while leaving :C to the "I know what I'm doing" case. The following enforces this, cleaning up parsing and amending the commutative_op helper. There's one pattern that needs adjustment, the pattern optimizing fmax (x, NaN) or fmax (NaN, x) to x since fmax isn't commutative. * genmatch.cc (commutative_op): Add parameter to indicate whether all compares should be considered commutative. Handle hypot, add_overflow and mul_overflow. (parser::parse_expr): Simplify 'c' handling by using commutative_op and error out when the operation is not. * match.pd ((minmax:c @0 NaN@1) -> @0): Use :C, we know what we are doing.
2024-10-25 | tree-optimization/117277 - remove CLOBBERs before SLP code generation | Richard Biener | 1 | -51/+59
We have to remove CLOBBERs before SLP is code generated since for store-lanes we are inserting our own CLOBBERs that we want to survive. So the following refactors vect_transform_loop to remove unwanted stmts first. This resolves the gcc.target/aarch64/sve/store_lane_spill_1.c FAIL. PR tree-optimization/117277 * tree-vect-loop.cc (vect_transform_loop): Remove CLOBBERs and prefetches before doing any code generation.