aboutsummaryrefslogtreecommitdiff
path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2023-07-22Daily bump.GCC Administrator1-1/+1
2023-07-21Daily bump.GCC Administrator3-1/+21
2023-07-20Fortran: intrinsics and deferred-length character arguments [PR95947,PR110658]Harald Anlauf2-1/+94
gcc/fortran/ChangeLog: PR fortran/95947 PR fortran/110658 * trans-expr.c (gfc_conv_procedure_call): For intrinsic procedures whose result characteristics depends on the first argument and which can be of type character, the character length will not be deferred. gcc/testsuite/ChangeLog: PR fortran/95947 PR fortran/110658 * gfortran.dg/deferred_character_37.f90: New test. (cherry picked from commit 95ddd2659849a904509067ec3a2770135149a722)
2023-07-20Daily bump.GCC Administrator2-1/+8
2023-07-19testsuite: Require vectors of doubles for pr97428.cMaciej W. Rozycki1-0/+1
The pr97428.c test assumes support for vectors of doubles, but some targets only support vectors of floats, causing this test to fail with such targets. Limit this test to targets that support vectors of doubles then. gcc/testsuite/ * gcc.dg/vect/pr97428.c: Limit to `vect_double' targets. (cherry picked from commit 5d9fc2aced3a2128527afd4a627424542f238471)
2023-07-19Daily bump.GCC Administrator1-1/+1
2023-07-18Daily bump.GCC Administrator1-1/+1
2023-07-17Daily bump.GCC Administrator1-1/+1
2023-07-16Daily bump.GCC Administrator1-1/+1
2023-07-15Daily bump.GCC Administrator4-1/+26
2023-07-14Fortran: formal symbol attributes for intrinsic procedures [PR110288]Harald Anlauf2-0/+20
gcc/fortran/ChangeLog: PR fortran/110288 * symbol.c (gfc_copy_formal_args_intr): When deriving the formal argument attributes from the actual ones for intrinsic procedure calls, take special care of CHARACTER arguments that we do not wrongly treat them formally as deferred-length. gcc/testsuite/ChangeLog: PR fortran/110288 * gfortran.dg/findloc_10.f90: New test. (cherry picked from commit 3b2c523ae31b68fc3b8363b458a55eec53a44365)
2023-07-14SH: Fix PR101469 peephole bugOleg Endo1-0/+39
gcc/ChangeLog: PR target/101469 * config/sh/sh.md (peephole2): Handle case where eliminated reg is also used by the address of the following memory operand.
2023-07-14Daily bump.GCC Administrator1-1/+1
2023-07-13Daily bump.GCC Administrator1-1/+1
2023-07-12Daily bump.GCC Administrator1-1/+1
2023-07-11Daily bump.GCC Administrator1-1/+1
2023-07-10Daily bump.GCC Administrator1-1/+1
2023-07-09Daily bump.GCC Administrator3-1/+18
2023-07-08Fortran: simplification of FINDLOC for constant complex arguments [PR110585]Harald Anlauf2-0/+24
gcc/fortran/ChangeLog: PR fortran/110585 * arith.c (gfc_compare_expr): Handle equality comparison of constant complex gfc_expr arguments. gcc/testsuite/ChangeLog: PR fortran/110585 * gfortran.dg/findloc_9.f90: New test. (cherry picked from commit 7ac1581d066a6f3a0d4acf1042a74634258b4966)
2023-07-08Daily bump.GCC Administrator3-1/+22
2023-07-07d: Fix PR 108842: Cannot use enum array with -fno-druntimeIain Buclaw4-15/+45
Restrict the generating of CONST_DECLs for D manifest constants to just scalars without pointers. It shouldn't happen that a reference to a manifest constant has not been expanded within a function body during codegen, but it has been found to occur in older versions of the D front-end (PR98277), so if the decl of a non-scalar constant is requested, just return its initializer as an expression. PR d/108842 gcc/d/ChangeLog: * decl.cc (DeclVisitor::visit (VarDeclaration *)): Only emit scalar manifest constants. (get_symbol_decl): Don't generate CONST_DECL for non-scalar manifest constants. * imports.cc (ImportVisitor::visit (VarDeclaration *)): New method. gcc/testsuite/ChangeLog: * gdc.dg/pr98277.d: Add more tests. * gdc.dg/pr108842.d: New test. (cherry picked from commit f934c5753849f7c48c6a3abfcd73b8f6008e8371)
2023-07-07Daily bump.GCC Administrator1-1/+1
2023-07-06Daily bump.GCC Administrator3-1/+35
2023-07-05Fix power10 fusion bug with prefixed loads, PR target/105325Michael Meissner6-42/+84
This changes fixes PR target/105325. PR target/105325 is a bug where an invalid lwa instruction is generated due to power10 fusion of a load instruction to a GPR and an compare immediate instruction with the immediate being -1, 0, or 1. In some cases, when the load instruction is done, the GCC compiler would generate a load instruction with an offset that was too large to fit into the normal load instruction. In particular, loads from the stack might originally have a small offset, so that the load is not a prefixed load. However, after the stack is set up, and register allocation has been done, the offset now is large enough that we would have to use a prefixed load instruction. The support for prefixed loads did not consider that patterns with a fused load and compare might have a prefixed address. Without this support, the proper prefixed load won't be generated. In the original code, when the split2 pass is run after reload has finished the ds_form_mem_operand predicate that was used for lwa and ld no longer returns true. When the pattern was created, ds_form_mem_operand recognized the insn as being valid since the offset was small. But after register allocation, ds_form_mem_operand did not return true. Because it didn't return true, the insn could not be split. Since the insn was not split and the prefix support did not indicate a prefixed instruction was used, the wrong load is generated. The solution involves: 1) Don't use ds_form_mem_operand for ld and lwa, always use non_update_memory_operand. 2) Delete ds_form_mem_operand since it is no longer used. 3) Use the "YZ" constraints for ld/lwa instead of "m". 4) If we don't need to sign extend the lwa, convert it to lwz, and use cmpwi instead of cmpdi. Adjust the insn name to reflect the code generate. 5) Insure that the insn using lwa will be recognized as having a prefixed operand (and hence the insn length will be 16 bytes instead of 8 bytes). 5a) Set the prefixed and maybe_prefix attributes to know that fused_load_cmpi are also load insns; 5b) In the case where we are just setting CC and not using the memory afterward, set the clobber to use a DI register, and put an explicit sign_extend operation in the split; 5c) Set the sign_extend attribute to "yes" for lwa. 5d) 5a-5c are the things that prefixed_load_p in rs6000.cc checks to ensure that lwa is treated as a ds-form instruction and not as a d-form instruction (i.e. lwz). 6) Add a new test case for this case. 7) Adjust the insn counts in fusion-p10-ldcmpi.c. Because we are no longer using ds_form_mem_operand, the ld and lwa instructions will fuse x-form (reg+reg) addresses in addition ds-form (reg+offset or reg). 2023-06-23 Michael Meissner <meissner@linux.ibm.com> gcc/ PR target/105325 * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix problems that allowed prefixed lwa to be generated. * config/rs6000/fusion.md: Regenerate. * config/rs6000/predicates.md (ds_form_mem_operand): Delete. * config/rs6000/rs6000.md (prefixed attribute): Add support for load plus compare immediate fused insns. (maybe_prefixed): Likewise. gcc/testsuite/ PR target/105325 * g++.target/powerpc/pr105325.C: New test. * gcc.target/powerpc/fusion-p10-ldcmpi.c: Update insn counts. (cherry picked from commit 370de1488a9a49956c47e5ec8c8f1489b4314a34) Co-Authored-By: Aaron Sawdey <acsawdey@linux.ibm.com>
2023-07-05rs6000: genfusion: Rewrite load/compare codeSegher Boessenkool1-82/+103
This makes the code more readable, more digestible, more maintainable, more extensible. That kind of thing. It does that by pulling things apart a bit, but also making what stays together more cohesive lumps. The original function was a bunch of loops and early-outs, and then quite a bit of stuff done per iteration, with the iterations essentially independent of each other. This patch moves the stuff done for one iteration to a new _one function. The second big thing is the stuff printed to the .md file is done in "here documents" now, which is a lot more readable than having to quote and escape and double-escape pieces of text. Whitespace inside the here-document is significant (will be printed as-is), which is a bit awkward sometimes, or might take some getting used to, but it is also one of the benefits of using them. Local variables are declared at first use (or close to first use). There also shouldn't be many at all, often you can write easier to read and manage code by omitting to name something that is hard to name in the first place. Finally some things are done in more typical, more modern, and tighter Perl style, for example REs in "if"s or "qw" for lists of constants. 2023-06-06 Segher Boessenkool <segher@kernel.crashing.org> * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): New, rewritten and split out from... (gen_ld_cmpi_p10): ... this. (cherry picked from commit 19e5bf1d5fac00da0b8cd4144d5651b2979d8308)
2023-07-05Daily bump.GCC Administrator1-1/+1
2023-07-04Daily bump.GCC Administrator1-1/+1
2023-07-03Daily bump.GCC Administrator3-1/+20
2023-07-02d: Fix core.volatile.volatileLoad discarded if result is unusedIain Buclaw3-0/+26
The first pass of code generation in the D front-end splits up all compound expressions and discards expressions that have no side effects. This included calls to the `volatileLoad' intrinsic if its result was not used, causing such calls to be eliminated from the program. We already set TREE_THIS_VOLATILE on the expression, however the tree documentation says if this bit is set in an expression, so is TREE_SIDE_EFFECTS. So set TREE_SIDE_EFFECTS on the expression too. This prevents any early discarding from occuring. PR d/110516 gcc/d/ChangeLog: * intrinsics.cc (expand_volatile_load): Set TREE_SIDE_EFFECTS on the expanded expression. (expand_volatile_store): Likewise. gcc/testsuite/ChangeLog: * gdc.dg/torture/pr110516a.d: New test. * gdc.dg/torture/pr110516b.d: New test. (cherry picked from commit 80ae426a195a0d035640a6301da833564deade52)
2023-07-02Daily bump.GCC Administrator3-1/+12
2023-07-01d: Fix ICE in setValue, at d/dmd/dinterpret.c:7013Iain Buclaw2-1/+63
Backports ICE fix from upstream. When casting null to integer or real, instead of painting the type on the NullExp, we emplace an IntegerExp/RealExp with the value zero. Same as when casting from NullExp to bool. Reviewed-on: https://github.com/dlang/dmd/pull/13172 PR d/110511 gcc/d/ChangeLog: * dmd/dinterpret.c (Interpreter::visit (CastExp *)): Handle casting null to int or float. gcc/testsuite/ChangeLog: * gdc.test/compilable/test21794.d: New test.
2023-07-01Daily bump.GCC Administrator1-1/+1
2023-06-30Daily bump.GCC Administrator3-1/+25
2023-06-28go: Update usage of TARGET_AIX to TARGET_AIX_OSPaul E. Murphy2-7/+7
TARGET_AIX is defined to a non-zero value on linux and maybe other powerpc64le targets. This leads to unexpected behavior such as dropping the .go_export section when linking a shared library on linux/powerpc64le. Instead, use TARGET_AIX_OS to toggle AIX specific behavior. Fixes golang/go#60798. 2023-06-22 Paul E. Murphy <murphyp@linux.ibm.com> gcc/go/ * go-backend.c [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS. * go-lang.c: Likewise. (cherry picked from commit b76cd1ec361712e1ac9ca5e0246da24ea2b78916)
2023-06-29Refine maskstore patterns with UNSPEC_MASKMOV.liuhongt1-10/+55
Similar like r14-2070-gc79476da46728e If mem_addr points to a memory region with less than whole vector size bytes of accessible memory and k is a mask that would prevent reading the inaccessible bytes from mem_addr, add UNSPEC_MASKMOV to prevent it to be transformed to any other whole memory access instructions. gcc/ChangeLog: PR rtl-optimization/110237 * config/i386/sse.md (<avx512>_store<mode>_mask): Refine with UNSPEC_MASKMOV. (maskstore<mode><avx512fmaskmodelower): Ditto. (*<avx512>_store<mode>_mask): New define_insn, it's renamed from original <avx512>_store<mode>_mask.
2023-06-29Refine maskloadmn pattern with UNSPEC_MASKLOAD.liuhongt1-2/+6
If mem_addr points to a memory region with less than whole vector size bytes of accessible memory and k is a mask that would prevent reading the inaccessible bytes from mem_addr, add UNSPEC_MASKLOAD to prevent it to be transformed to vpblendd. gcc/ChangeLog: PR target/110309 * config/i386/sse.md (maskload<mode><avx512fmaskmodelower>): Refine pattern with UNSPEC_MASKLOAD. (maskload<mode><avx512fmaskmodelower>): Ditto.
2023-06-29Daily bump.GCC Administrator3-1/+25
2023-06-28Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]Thomas Schwinge1-0/+3
Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba "Support parallel testing in libgomp, part II [PR66005]" ("..., and enable if 'flock' is available for serializing execution testing"), where we saw: > On my Dell Precision 7530 laptop: > > $ uname -srvi > Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64 > $ grep '^model name' < /proc/cpuinfo | uniq -c > 12 model name : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz > $ nvidia-smi -L > GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea) > > ... [...]: case (c) standard configuration, no offloading > configured, [...] > $ \time make check-target-libgomp > > Case (c), baseline; [...]: > > 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k > 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k > > Case (c), parallelized [using 'flock']: > > [...] > -j12 GCC_TEST_PARALLEL_SLOTS=12 > 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k > 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k Quite the same when instead of 'flock' using this fallback Perl 'flock': 2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k 2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k PR testsuite/66005 gcc/ * doc/install.texi: Document (optional) Perl usage for parallel testing of libgomp. libgomp/ * testsuite/lib/libgomp.exp: 'flock' through stdout. * testsuite/flock: New. * configure.ac (FLOCK): Point to that if no 'flock' available, but 'perl' is. * configure: Regenerate. (cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)
2023-06-28Make option mvzeroupper independent of optimization level.liuhongt3-7/+19
pass_insert_vzeroupper is under condition TARGET_AVX && TARGET_VZEROUPPER && flag_expensive_optimizations && !optimize_size But the document of mvzeroupper doesn't mention the insertion required -O2 and above, it may confuse users when they explicitly use -Os -mvzeroupper. ------------ mvzeroupper Target Mask(VZEROUPPER) Save Generate vzeroupper instruction before a transfer of control flow out of the function. ------------ The patch moves flag_expensive_optimizations && !optimize_size to ix86_option_override_internal. It makes -mvzeroupper independent of optimization level, but still keeps the behavior of architecture tuning(emit_vzeroupper) unchanged. gcc/ChangeLog: * config/i386/i386-features.c (pass_insert_vzeroupper:gate): Move flag_expensive_optimizations && !optimize_size to .. * config/i386/i386-options.c (ix86_option_override_internal): .. this, it makes -mvzeroupper independent of optimization level, but still keeps the behavior of architecture tuning(emit_vzeroupper) unchanged. (rest_of_handle_insert_vzeroupper): Remove flag_expensive_optimizations && !optimize_size. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-vzeroupper-29.c: New testcase.
2023-06-28Daily bump.GCC Administrator1-1/+1
2023-06-27Daily bump.GCC Administrator3-1/+19
2023-06-26d: Suboptimal codegen for __builtin_expect(cond, false)Iain Buclaw2-12/+41
Since PR96435, both boolean objects and expressions have been evaluated in the following way. (*(ubyte*)&obj_or_expr) & 1 It has been noted that sometimes this can cause the back-end to optimize in non-obvious ways - in particular with __builtin_expect. This @safe feature is now restricted to just when reading the value of a bool field that comes from a union. PR d/110359 gcc/d/ChangeLog: * d-convert.cc (convert_for_rvalue): Only apply the @safe boolean conversion to boolean fields of a union. (convert_for_condition): Call convert_for_rvalue in the default case. gcc/testsuite/ChangeLog: * gdc.dg/pr110359.d: New test. (cherry picked from commit ab98db1e8c1b997414539f41b7fb814019497d8d)
2023-06-26Daily bump.GCC Administrator1-1/+1
2023-06-25Daily bump.GCC Administrator1-1/+1
2023-06-24Daily bump.GCC Administrator1-1/+1
2023-06-23Daily bump.GCC Administrator1-1/+1
2023-06-22Daily bump.GCC Administrator1-1/+1
2023-06-21Daily bump.GCC Administrator3-1/+31
2023-06-20rs6000: Guard __builtin_{un,}pack_vector_int128 with vsx [PR109932]Kewen Lin3-2/+44
As PR109932 shows, builtins __builtin_{un,}pack_vector_int128 should be guarded under vsx rather than power7, as their corresponding bif patterns have the conditions TARGET_VSX and VECTOR_MEM_ALTIVEC_OR_VSX_P (V1TImode). This patch is to ensure __builtin_{un,}pack_vector_int128 only available under vsx. PR target/109932 gcc/ChangeLog: * config/rs6000/rs6000-builtin.def (BU_VSX_MISC_2): New macro. ({un,}pack_vector_int128): Use BU_VSX_MISC_2 instead of BU_P7_MISC_2. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr109932-1.c: New test. * gcc.target/powerpc/pr109932-2.c: New test.
2023-06-20rs6000: Don't use TFmode for 128 bits fp constant in toc [PR110011]Kewen Lin2-1/+43
As PR110011 shows, when encoding 128 bits fp constant into toc, we adopts REAL_VALUE_TO_TARGET_LONG_DOUBLE which is to find the first float mode with LONG_DOUBLE_TYPE_SIZE bits of precision, it would be TFmode here. But the 128 bits fp constant can be with mode IFmode or KFmode, which doesn't necessarily have the same underlying float format as the one of TFmode, like this PR exposes, with option -mabi=ibmlongdouble TFmode has ibm_extended_format while KFmode has ieee_quad_format, mixing up the formats (the encoding/decoding ways) would cause unexpected results. This patch is to make it use constant's own mode instead of TFmode for real_to_target call. PR target/110011 gcc/ChangeLog: * config/rs6000/rs6000.c (output_toc): Use the mode of the 128-bit floating constant itself for real_to_target call. gcc/testsuite/ChangeLog: * gcc.target/powerpc/pr110011.c: New test. (cherry picked from commit 388809f2afde874180da0669c669e241037eeba0)