path: root/gcc
Age  Commit message  Author  Files  Lines
7 hoursRemove SPR/GNR/DMR from avx512_{move,store}_by pieces tune.releases/gcc-14hongtao.liu5-8/+8
Align move_max with prefer_vector_width for SPR/GNR/DMR, similar to the commit below.

commit 6ea25c041964bf63014fcf7bb68fb1f5a0a4e123
Author: liuhongt <hongtao.liu@intel.com>
Date: Thu Aug 15 12:54:07 2024 +0800

    Align ix86_{move_max,store_max} with vectorizer.

    When none of mprefer-vector-width, avx256_optimal/avx128_optimal, avx256_store_by_pieces/avx512_store_by_pieces is specified, GCC will set ix86_{move_max,store_max} as max available vector length except for AVX part.

        if (TARGET_AVX512F_P (opts->x_ix86_isa_flags)
            && TARGET_EVEX512_P (opts->x_ix86_isa_flags2))
          opts->x_ix86_move_max = PVW_AVX512;
        else
          opts->x_ix86_move_max = PVW_AVX128;

    So for -mavx2, vectorizer will choose 256-bit for vectorization, but 128-bit is used for struct copy, there could be a potential STLF issue due to this "misalign".

gcc/ChangeLog:
    * config/i386/x86-tune.def (X86_TUNE_AVX512_MOVE_BY_PIECES): Remove SPR/GNR/DMR.
    (X86_TUNE_AVX512_STORE_BY_PIECES): Ditto.

gcc/testsuite/ChangeLog:
    * gcc.target/i386/pieces-memcpy-18.c: Use -mtune=znver5 instead of -mtune=sapphirerapids.
    * gcc.target/i386/pieces-memcpy-21.c: Ditto.
    * gcc.target/i386/pieces-memset-46.c: Ditto.
    * gcc.target/i386/pieces-memset-49.c: Ditto.

(cherry picked from commit dd713d0f3fc88778a9b3d4f8f1895a3cd6c145ca)
9 hoursDaily bump.GCC Administrator2-1/+12
25 hourstestsuite: arm: Simplify fp16-aapcs testsTorbjörn SVENSSON5-218/+24
Reduce fp16-aapcs testcases to return value testing since parameter passing is already tested in aapcs/vfp*.c gcc/testsuite/ChangeLog: * gcc.target/arm/fp16-aapcs.c: New test. * gcc.target/arm/fp16-aapcs-1.c: Removed. * gcc.target/arm/fp16-aapcs-2.c: Likewise. * gcc.target/arm/fp16-aapcs-3.c: Likewise. * gcc.target/arm/fp16-aapcs-4.c: Likewise. Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com> (cherry picked from commit 1cf8cb45d872a5f09d65c63c891c091710c37432)
33 hoursDaily bump.GCC Administrator2-1/+9
39 hoursFix latent LRA bugJeff Law1-0/+1
Shreya's work to add the addptr pattern on the RISC-V port exposed a latent bug in LRA. We lazily allocate/reallocate the ira_reg_equiv structure and when we do (re)allocation we'll over-allocate and zero-fill so that we don't have to actually allocate and relocate the data so often. In the case exposed by Shreya's work we had N requested entries at the last reallocation step. We actually allocate N+M entries. During LRA we allocate enough new pseudos and thus have N+M+1 pseudos. In get_equiv we read ira_reg_equiv[regno] without bounds checking so we read past the allocated part of the array and get back junk which we use and depending on the precise contents we fault in various fun and interesting ways. We could either arrange to re-allocate ira_reg_equiv again on some path through LRA (possibly in get_equiv itself). We could also just insert the bounds check in get_equiv like is done elsewhere in LRA. Vlad indicated no strong preference in an email last week. So this just adds the bounds check in a manner similar to what's done elsewhere in LRA. Bootstrapped and regression tested on x86_64 as well as RISC-V with Shreya's work enabled and regtested across the various embedded targets. gcc/ * lra-constraints.cc (get_equiv): Bounds check before accessing data in ira_reg_equiv. (cherry picked from commit 0c6ad3f5dfbd45150eeef2474899ba7ef0d8e592)
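The failure mode described above (a lazily grown, over-allocated, zero-filled array that is later indexed past its allocated length) can be illustrated with a minimal sketch. This is not the code in lra-constraints.cc: the names equiv_entry, equiv_table, equiv_table_grow and equiv_lookup, and the 50% over-allocation factor, are hypothetical and chosen only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one element of an equivalence table;
   this is not GCC's real ira_reg_equiv element type.  */
struct equiv_entry {
  int defined_p;
  long constant;
};

struct equiv_table {
  struct equiv_entry *entries;
  size_t len;                   /* number of allocated, zero-filled entries */
};

/* Grow the table so that index NEEDED is valid.  Over-allocate by 50%
   (an arbitrary illustrative factor) and zero-fill the new tail so that
   growth is needed less often.  Returns 0 on success, -1 on failure.  */
static int equiv_table_grow(struct equiv_table *t, size_t needed)
{
  if (needed < t->len)
    return 0;
  size_t new_len = needed + 1 + (needed + 1) / 2;
  struct equiv_entry *p = realloc(t->entries, new_len * sizeof *p);
  if (p == NULL)
    return -1;
  memset(p + t->len, 0, (new_len - t->len) * sizeof *p);
  t->entries = p;
  t->len = new_len;
  return 0;
}

/* Bounds-checked lookup: indices created after the last growth step
   yield NULL instead of reading past the allocation.  */
static const struct equiv_entry *
equiv_lookup(const struct equiv_table *t, size_t regno)
{
  if (regno >= t->len)
    return NULL;
  return &t->entries[regno];
}
```

With this shape, a caller that sees NULL treats the register as having no recorded equivalence, which is the same information a zero-filled entry would carry.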
2 daysDaily bump.GCC Administrator4-1/+86
3 daysaarch64: PR target/121749: Use dg-assemble in testcaseKyrylo Tkachov1-1/+1
Committing as obvious. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/testsuite/ PR target/121749 * gcc.target/aarch64/simd/pr121749.c: Use dg-assemble directive. (cherry picked from commit 2b8256d0ce18ed4d00868c78f5128d32884ccfa1)
3 daysaarch64: PR target/121749: Use correct predicate for narrowing shift amountsKyrylo Tkachov3-12/+24
With g:d20b2ad845876eec0ee80a3933ad49f9f6c4ee30 the narrowing shift instructions are now represented with standard RTL and more merging optimisations occur. This exposed a wrong predicate for the shift amount operand. The shift amount is the number of bits of the narrow destination, not the input sources. Correct this by using the vn_mode attribute when specifying the predicate, which exists for this purpose. I've spotted a few more narrowing shift patterns that need the restriction, so they are updated as well. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ PR target/121749 * config/aarch64/aarch64-simd.md (aarch64_<shrn_op>shrn_n<mode>): Use aarch64_simd_shift_imm_offset_<vn_mode> instead of aarch64_simd_shift_imm_offset_<ve_mode> predicate. (aarch64_<shrn_op>shrn_n<mode> VQN define_expand): Likewise. (*aarch64_<shrn_op>rshrn_n<mode>_insn): Likewise. (aarch64_<shrn_op>rshrn_n<mode>): Likewise. (aarch64_<shrn_op>rshrn_n<mode> VQN define_expand): Likewise. (aarch64_sqshrun_n<mode>_insn): Likewise. (aarch64_sqshrun_n<mode>): Likewise. (aarch64_sqshrun_n<mode> VQN define_expand): Likewise. (aarch64_sqrshrun_n<mode>_insn): Likewise. (aarch64_sqrshrun_n<mode>): Likewise. (aarch64_sqrshrun_n<mode>): Likewise. * config/aarch64/iterators.md (vn_mode): Handle DI, SI, HI modes. gcc/testsuite/ PR target/121749 * gcc.target/aarch64/simd/pr121749.c: New test. (cherry picked from commit cb508e54140687a50790059fac548d87515df6be)
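As a rough scalar model of the constraint described above, the sketch below checks a narrowing shift amount against the width of the narrow destination element rather than the wide source element. The helper names narrowing_shift_ok and shrn_u32 are hypothetical and only model the range check; they are not the predicates or patterns from aarch64-simd.md.

```c
#include <stdbool.h>
#include <stdint.h>

/* A narrowing right shift from WIDE_BITS to WIDE_BITS / 2 accepts shift
   amounts in the range 1 .. WIDE_BITS / 2, that is, up to the number of
   bits in the narrow destination element.  */
static bool narrowing_shift_ok(unsigned wide_bits, unsigned shift)
{
  unsigned narrow_bits = wide_bits / 2;
  return shift >= 1 && shift <= narrow_bits;
}

/* Scalar model of a 32-bit to 16-bit narrowing right shift: shift the
   wide value, then keep the low 16 bits.  The caller is expected to have
   validated SHIFT with narrowing_shift_ok (32, shift).  */
static uint16_t shrn_u32(uint32_t x, unsigned shift)
{
  return (uint16_t) (x >> shift);
}
```

Under this model a shift of 17 on a 32-bit source is rejected even though it would be a valid shift of the source value itself, which is the distinction the predicate change enforces.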
3 daysAVR: Support AVR32EB14/20/28/32.Georg-Johann Lay2-1/+5
Add support for some recent AVR devices. gcc/ * config/avr/avr-mcus.def: Add avr32eb14, avr32eb20, avr32eb28, avr32eb32. * doc/avr-mmcu.texi: Rebuild. (cherry picked from commit 45f605a74fd7e96294477db064cc58033c3fba49)
3 daysc++: Fix mangling of _Float16 template args [PR121801]Matthias Kretz2-1/+27
Signed-off-by: Matthias Kretz <m.kretz@gsi.de> gcc/testsuite/ChangeLog: PR c++/121801 * g++.dg/abi/pr121801.C: New test. gcc/cp/ChangeLog: PR c++/121801 * mangle.cc (write_real_cst): Handle 16-bit real and assert that reals have 16 bits or a multiple of 32 bits. (cherry picked from commit 19d1c7c28f4fd0557dd868a7a4041b00ceada890)
3 dayss390: Emulate vec_cmp{eq,gt,gtu} for 128-bit integersStefan Schulze Frielinghaus4-12/+171
Mode iterator V_HW enables V1TI for target VXE which means vec_cmpv1tiv1ti becomes available which leads to an ICE since there is no corresponding insn. Fixed by emulating comparisons and enabling mode V1TI unconditionally for V_HW. For the sake of symmetry, I also added TI mode to V_HW since TF mode is already included. As a consequence the consumers of V_HW vec_{splat,slb,sld,sldw,sldb,srdb,srab,srb,test_mask_int,test_mask} also become available for 128-bit integers. This fixes gcc.c-torture/execute/pr105613.c and gcc.dg/pr106063.c. gcc/ChangeLog: * config/s390/vector.md (V_HW): Enable V1TI unconditionally and add TI. (vec_cmpu<VIT_HW:mode><VIT_HW:mode>): Add 128-bit integer variants. (*vec_cmpeq<mode><mode>_nocc_emu): Emulate operation. (*vec_cmpgt<mode><mode>_nocc_emu): Emulate operation. (*vec_cmpgtu<mode><mode>_nocc_emu): Emulate operation. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-cmp-emu-1.c: New test. * gcc.target/s390/vector/vec-cmp-emu-2.c: New test. * gcc.target/s390/vector/vec-cmp-emu-3.c: New test. (cherry picked from commit 1b575bb24a7a3d2b00197dd5deb4c26b313f442b)
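The general idea of emulating a 128-bit integer comparison when only 64-bit comparisons are available can be sketched in scalar C as follows. This is only an illustration of the technique; the struct u128 type and the cmpeq128/cmpgt128/cmpgtu128 helpers are hypothetical and do not reflect the instruction sequences emitted by the s390 patterns.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 128-bit integer represented as two 64-bit halves.  */
struct u128 {
  uint64_t hi;
  uint64_t lo;
};

/* Equal when both halves are equal.  */
static bool cmpeq128(struct u128 a, struct u128 b)
{
  return a.hi == b.hi && a.lo == b.lo;
}

/* Unsigned greater-than: the high halves decide; on a tie the low
   halves decide.  Both comparisons are unsigned.  */
static bool cmpgtu128(struct u128 a, struct u128 b)
{
  return a.hi > b.hi || (a.hi == b.hi && a.lo > b.lo);
}

/* Signed greater-than: the high halves are compared as signed values,
   the low halves still as unsigned.  The casts assume the usual
   two's-complement conversion.  */
static bool cmpgt128(struct u128 a, struct u128 b)
{
  int64_t ah = (int64_t) a.hi;
  int64_t bh = (int64_t) b.hi;
  return ah > bh || (ah == bh && a.lo > b.lo);
}
```

The only difference between the signed and unsigned variants is how the high halves are compared, which is why the three operations can share most of an emulation sequence.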
3 daysDaily bump.GCC Administrator1-1/+1
4 daysDaily bump.GCC Administrator1-1/+1
5 daysDaily bump.GCC Administrator1-1/+1
6 daysDaily bump.GCC Administrator1-1/+1
7 daysDaily bump.GCC Administrator2-1/+10
8 daysAVR: Disable tree-switch-conversion per default.Georg-Johann Lay1-0/+7
There are at least two cases where tree-switch-conversion leads to unpleasant resource allocation: PR49857 The lookup table lives in RAM. This is the case for all devices that locate .rodata in RAM, which is for almost all AVR devices. PR81540 Code is bloated for 64-bit inputs. As far as PR49857 is concerned, a target hook that may add an address-space qualifier to the lookup table is the obvious solution, though a respective patch has always been rejected by global maintainers for non-technical reasons. gcc/ PR target/81540 PR target/49857 * common/config/avr/avr-common.cc: Disable -ftree-switch-conversion. (cherry picked from commit 912159d2b5429c3126756b56723dd4f32dd56bdd)
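For readers unfamiliar with the optimisation, the sketch below shows conceptually what switch conversion does: a switch that only selects constants is replaced by a bounds check plus a load from a read-only lookup table. The functions days_switch and days_lookup and the table days_table are hypothetical examples written by hand, not compiler output.

```c
/* A switch that only selects constants: a typical candidate for
   switch conversion.  */
static int days_switch(int month)
{
  switch (month) {
    case 0: return 31;
    case 1: return 28;
    case 2: return 31;
    case 3: return 30;
    default: return 0;
  }
}

/* Roughly the shape after conversion: the constants move into a
   read-only table indexed by the switch value.  On a target that
   places .rodata in RAM, this table occupies RAM.  */
static const unsigned char days_table[4] = { 31, 28, 31, 30 };

static int days_lookup(int month)
{
  /* The unsigned comparison rejects negative and too-large values.  */
  return (unsigned) month < 4u ? days_table[(unsigned) month] : 0;
}
```

Both functions return the same values; the difference is where the constants live, which is the resource-allocation concern behind PR49857.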
8 daysDaily bump.GCC Administrator1-1/+1
9 daysDaily bump.GCC Administrator1-1/+1
10 daysDaily bump.GCC Administrator1-1/+1
11 daysDaily bump.GCC Administrator1-1/+1
12 daysDaily bump.GCC Administrator1-1/+1
13 daysDaily bump.GCC Administrator1-1/+1
2025-09-04Daily bump.GCC Administrator2-1/+8
2025-09-03middle-end: Fix typo in gimple.hBenjamin Wu1-1/+1
gcc/ChangeLog: * gimple.h (GTMA_DOES_GO_IRREVOCABLE): Fix typo. (cherry picked from commit 356250630abd876ae592bc3d2b4cc171bc834b79)
2025-09-03Daily bump.GCC Administrator1-1/+1
2025-09-02Daily bump.GCC Administrator1-1/+1
2025-09-01Daily bump.GCC Administrator1-1/+1
2025-08-31Daily bump.GCC Administrator2-1/+8
2025-08-29Revert "Fix _Decimal128 arithmetic error under FE_UPWARD."liuhongt1-18/+0
This reverts commit e645728e9de64d019661c8f92bb487e06d95644a.
2025-08-30Daily bump.GCC Administrator2-1/+8
2025-08-28Fix _Decimal128 arithmetic error under FE_UPWARD.liuhongt1-0/+18
libgcc/config/libbid/ChangeLog: PR target/120691 * bid128_div.c: Fix _Decimal128 arithmetic error under FE_UPWARD. * bid128_rem.c: Ditto. * bid128_sqrt.c: Ditto. * bid64_div.c (bid64_div): Ditto. * bid64_sqrt.c (bid64_sqrt): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr120691.c: New test. (cherry picked from commit 50064b2898edfb83bc37f2597a35cbd3c1c853e3)
2025-08-29Daily bump.GCC Administrator1-1/+1
2025-08-28Daily bump.GCC Administrator1-1/+1
2025-08-27Daily bump.GCC Administrator1-1/+1
2025-08-26Daily bump.GCC Administrator1-1/+1
2025-08-25Daily bump.GCC Administrator1-1/+1
2025-08-24Daily bump.GCC Administrator1-1/+1
2025-08-23Daily bump.GCC Administrator3-1/+51
2025-08-22Fortran: fix bogus runtime error with optional procedure argument [PR121145]Harald Anlauf2-1/+48
PR fortran/121145 gcc/fortran/ChangeLog: * trans-expr.cc (gfc_conv_procedure_call): Do not create pointer check for proc-pointer actual passed to optional dummy. gcc/testsuite/ChangeLog: * gfortran.dg/pointer_check_15.f90: New test. (cherry picked from commit 8f9450505f8244d262f8b4ff274f113f99cdc7e2)
2025-08-22Fortran: follow-up fix to checking of renamed-on-use interface name [PR120784]Harald Anlauf2-1/+38
Commit r16-1633 introduced a regression for imported interfaces that were not renamed-on-use, since the related logic did not take into account that the absence of renaming could be represented by an empty string. PR fortran/120784 gcc/fortran/ChangeLog: * interface.cc (gfc_match_end_interface): Detect empty local_name. gcc/testsuite/ChangeLog: * gfortran.dg/interface_63.f90: Extend testcase. (cherry picked from commit ddff83b3dde4a8308d0e156f85693e7176b85524)
2025-08-22Fortran: fix checking of renamed-on-use interface name [PR120784]Harald Anlauf2-3/+72
PR fortran/120784 gcc/fortran/ChangeLog: * interface.cc (gfc_match_end_interface): If a use-associated symbol is renamed, use the local_name for checking. gcc/testsuite/ChangeLog: * gfortran.dg/interface_63.f90: New test. (cherry picked from commit 6dd1659cf10a7ad51576f902ef3bc007db30c990)
2025-08-22Daily bump.GCC Administrator1-1/+1
2025-08-21Daily bump.GCC Administrator3-1/+43
2025-08-20tree-sra: Avoid total SRA if there are incompat. aggregate accesses (PR119085)Martin Jambor2-0/+43
We currently use the types encountered in the function body and not in type declaration to perform total scalarization. Bug PR 119085 uncovered that we lack a check that, when the same data is accessed with several aggregate types, those types are actually compatible. Without it, we can base total scalarization on a type that does not "cover" all live data in a different part of the function. This patch adds the check. gcc/ChangeLog: 2025-07-21 Martin Jambor <mjambor@suse.cz> PR tree-optimization/119085 * tree-sra.cc (sort_and_splice_var_accesses): Prevent total scalarization if two incompatible aggregates access the same place. gcc/testsuite/ChangeLog: 2025-07-21 Martin Jambor <mjambor@suse.cz> PR tree-optimization/119085 * gcc.dg/tree-ssa/pr119085.c: New test. (cherry picked from commit 171fcc80ede596442712e559c4fc787aa4636694)
2025-08-20tree-sra: Fix grp_covered flag computation when totally scalarizing (PR117423)Martin Jambor2-2/+56
Testcase of PR 117423 shows a flaw in the fancy way we do "total scalarization" in SRA now. We use the types encountered in the function body and not in type declaration (allowing us to totally scalarize when only one union field is ever used, since we effectively "skip" the union then) and can accommodate pre-existing accesses that happen to fall into padding. In this case, we skipped the union (bypassing the totally_scalarizable_type_p check) and the access falling into the "padding" is an aggregate and so not a candidate for SRA but actually containing data. Arguably total scalarization should just bail out when it encounters this situation (but I decided not to depend on this mainly because we'd need to detect all cases when we eventually cannot scalarize, such as when a scalar access has children accesses) but the actual bug is that the detection if all data in an aggregate is indeed covered by replacements just assumes that is always the case if total scalarization triggers which however may not be the case in cases like this - and perhaps more. This patch fixes the bug by just assuming that all padding is taken care of when total scalarization triggered, not that every access was actually scalarized. gcc/ChangeLog: 2025-07-17 Martin Jambor <mjambor@suse.cz> PR tree-optimization/117423 * tree-sra.cc (analyze_access_subtree): Fix computation of grp_covered flag. gcc/testsuite/ChangeLog: 2025-07-17 Martin Jambor <mjambor@suse.cz> PR tree-optimization/117423 * gcc.dg/tree-ssa/pr117423.c: New test. (cherry picked from commit 7375909e9d9e7de23acb4b1e0a965d8faf1943c4)
2025-08-20AVR: target/121608 - Don't add --relax when linking with -r.Georg-Johann Lay1-1/+1
The linker rejects --relax in relocatable links (-r), hence only add --relax when -r is not specified. gcc/ PR target/121608 * config/avr/specs.h (LINK_RELAX_SPEC): Wrap in %{!r...}. (cherry picked from commit 0f15ff7b511493e9197e6153b794081c1557ba02)
2025-08-20Daily bump.GCC Administrator1-1/+1
2025-08-19Daily bump.GCC Administrator3-1/+34
2025-08-18aarch64: Fix mode mismatch when building a predicate [PR121118]Richard Sandiford2-1/+17
This PR is about a case where we used aarch64_expand_sve_const_pred_trn to combine two predicates, one of which was constructed using aarch64_sve_move_pred_via_while. The former requires the inputs to have mode VNx16BI, but the latter returned VNx8BI for a .H WHILELO. The proper fix, used on trunk, is to make the pattern emitted by aarch64_sve_move_pred_via_while produce a VNx16BI for all element sizes, since every bit of the result is significant. However, that required some target-independent changes that are too invasive to backport. This patch goes for the simpler (but less robust) approach of using the original pattern and casting it to VNx16BI after the fact. Since the WHILELO pattern is an unspec, the chances of something optimising it in a way that changes the undefined bits of the output should be very low, especially on a release branch. It is still a less satisfactory fix though. gcc/ PR target/121118 * config/aarch64/aarch64.cc (aarch64_sve_move_pred_via_while): Return a VNx16BI predicate. gcc/testsuite/ PR target/121118 * gcc.target/aarch64/sve/acle/general/pr121118_1.c: New test. (cherry picked from commit 58a9717df098defb7f595fbc56122107e952a46b)