path: root/gcc
Age  Commit message  Author  Files  Lines
2025-04-24cobol: Repair some exception processing logic.Robert Dubner3-589/+295
This patch changes the exception processing logic for the calculation of reference modifications and table subscripts to be more in accordance with ISO specifications. It also adjusts the processing of RETURN-CODE when calling routines that have no CALL ... RETURNING phrase. gcc/cobol * genapi.cc: (initialize_variable_internal): Change TRACE1 formatting. (create_and_call): Repair RETURN-CODE processing. (mh_source_is_group): Repair run-time IF type comparison. (psa_FldLiteralA): Change TRACE1 formatting. (parser_symbol_add): Eliminate unnecessary code. * genutil.cc: Eliminate SET_EXCEPTION_CODE macro. (get_data_offset_dest): Repair set_exception_code logic. (get_data_offset_source): Likewise. (get_binary_value): Likewise. (refer_refmod_length): Likewise. (refer_fill_depends): Likewise. (refer_offset_dest): Likewise. (refer_size_dest): Likewise. (refer_offset_source): Likewise. gcc/testsuite * cobol.dg/group1/declarative_1.cob: Adjust for repaired exception logic.
2025-04-24Fix i386 vectorizer cost of COND_EXPR and MIN_MAX with one of parameters 0 or -1Jan Hubicka2-8/+48
gcc/ChangeLog: PR target/119919 * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Account correctly cond_expr and min/max when one of operands is 0 or -1. gcc/testsuite/ChangeLog: * gcc.target/i386/pr119919.c: New test.
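A scalar sketch of why a COND_EXPR with a 0 or -1 arm can be costed lower than a general select, assuming the condition is available as an all-ones/all-zeros mask (as SSE compares produce per lane). The helper names are hypothetical and this is not the i386 backend's code:

```cpp
#include <cstdint>

// All-ones when the condition holds, all-zeros otherwise, mimicking the
// per-lane result of an SSE integer compare.
static inline std::int32_t mask_of (bool cond)
{
  return -static_cast<std::int32_t> (cond);
}

// General select "m ? x : y": needs AND, ANDNOT and OR (or a blend).
static inline std::int32_t blend (std::int32_t m, std::int32_t x, std::int32_t y)
{
  return (m & x) | (~m & y);
}

// "m ? x : 0" collapses to a single AND with the mask.
static inline std::int32_t select_or_zero (std::int32_t m, std::int32_t x)
{
  return m & x;
}

// "m ? 0 : y" collapses to a single ANDNOT with the mask.
static inline std::int32_t zero_or_select (std::int32_t m, std::int32_t y)
{
  return ~m & y;
}

// "m ? -1 : y" collapses to a single OR with the mask.
static inline std::int32_t ones_or_select (std::int32_t m, std::int32_t y)
{
  return m | y;
}
```

Each special case needs one logical operation instead of three, which is the kind of difference a per-statement cost would want to reflect.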
2025-04-24Fix ICE building deepsjeng with -fprofile-useJan Hubicka1-4/+4
The problem here is division by zero, since adjusted 0 > precise 0. Fixed by using right test. gcc/ChangeLog: PR ipa/119924 * ipa-cp.cc (update_counts_for_self_gen_clones): Use nonzero_p. (update_profiling_info): Likewise. (update_specialized_profile): Likewise.
2025-04-24c++: attribute duplication [PR116954]Jason Merrill1-0/+3
As a followup to the previous patch for 116954, there's no reason to do anything in remove_contract_attributes if contracts aren't enabled. PR c++/116954 gcc/cp/ChangeLog: * contracts.cc (remove_contract_attributes): Return early if not enabled.
2025-04-24aarch64: Fix CFA offsets in non-initial stack probes [PR119610]Richard Sandiford3-26/+78
PR119610 is about incorrect CFI output for a stack probe when that probe is not the initial allocation. The main aarch64 stack probe function, aarch64_allocate_and_probe_stack_space, implicitly assumed that the incoming stack pointer pointed to the top of the frame, and thus held the CFA. aarch64_save_callee_saves and aarch64_restore_callee_saves use a parameter called bytes_below_sp to track how far the stack pointer is above the base of the static frame. This patch does the same thing for aarch64_allocate_and_probe_stack_space. Also, I noticed that the SVE path was attaching the first CFA note to the wrong instruction: it was attaching the note to the calculation of the stack size, rather than to the r11<-sp copy. gcc/ PR target/119610 * config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space): Add a bytes_below_sp parameter and use it to calculate the CFA offsets. Attach the first SVE CFA note to the move into the associated temporary register. (aarch64_allocate_and_probe_stack_space): Update calls accordingly. Start out with bytes_per_sp set to the frame size and decrement it after each allocation. gcc/testsuite/ PR target/119610 * g++.dg/torture/pr119610.C: New test. * g++.target/aarch64/sve/pr119610-sve.C: Likewise.
2025-04-24c: Allow $@` in GNU23/GNU2Y raw string delimiters [PR110343]Jakub Jelinek1-0/+25
Aaron mentioned in the PR that late in C23 N3124 was adopted and $@` are now part of basic character set. The paper has been implemented in GCC from what I can see, but we should allow for GNU23/2Y $@` in raw string delimiters as well, like they are allowed for C++26, because the delimiters can contain anything from basic character set but space, ()\, tab, form-feed, newline and backspace. 2025-04-24 Jakub Jelinek <jakub@redhat.com> PR c++/110343 * lex.cc (lex_raw_string): For C allow $@` in raw string delimiters if CPP_OPTION (pfile, low_ucns) i.e. for C23 and later. * gcc.dg/raw-string-1.c: New test.
2025-04-24opts.cc: Use opts rather than opts_set for validating -fipa-reorder-for-localityKyrylo Tkachov1-4/+5
This ensures -fno-ipa-reorder-for-locality doesn't complain with an explicit -flto-partition=. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> * opts.cc (validate_ipa_reorder_locality_lto_partition): Check opts instead of opts_set for x_flag_ipa_reorder_for_locality. (finish_options): Update call site.
2025-04-24opts.cc Simplify handling of explicit -flto-partition= and -fipa-reorder-for-localityKyrylo Tkachov4-14/+6
The handling of an explicit -flto-partition= and -fipa-reorder-for-locality should be simpler. No need to have a new default option. We can use opts_set to check if -flto-partition is explicitly set and use that information in the error handling. Remove -flto-partition=default and update accordingly. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ * common.opt (LTO_PARTITION_DEFAULT): Delete. (flto-partition=): Change default back to balanced. * flag-types.h (lto_partition_model): Remove LTO_PARTITION_DEFAULT. * opts.cc (validate_ipa_reorder_locality_lto_partition): Check opts_set->x_flag_lto_partition instead of LTO_PARTITION_DEFAULT. (finish_options): Remove handling of LTO_PARTITION_DEFAULT. gcc/testsuite/ * gcc.dg/completion-2.c: Remove check for default.
2025-04-24PR modula2/119915: Sprintf1 repeats the entire format string if it starts with a directiveGaius Mulley2-2/+65
This bugfix is for FormatStrings to ensure that in the case of %x and %u the procedure function PerformFormatString uses Copy rather than Slice, to avoid the case of an upper bound of zero in Slice. Oddly the %d case had the correct code. gcc/m2/ChangeLog: PR modula2/119915 * gm2-libs/FormatStrings.mod (PerformFormatString): Handle the %u and %x format specifiers in a similar way to the %d specifier. Avoid using Slice and use Copy instead. gcc/testsuite/ChangeLog: PR modula2/119915 * gm2/pimlib/run/pass/format2.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-04-24dwarf2out: Decrease dw_loc_descr_node and dw_attr_struct struct sizes [PR119711]Jakub Jelinek2-25/+50
As noted by Richi on a large testcase, there are unnecessary paddings in some heavily used dwarf2out.{h,cc} structures on 64-bit hosts.

struct dw_val_node {
        enum dw_val_class          val_class;            /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        struct addr_table_entry *  val_entry;            /*     8     8 */
        union dw_val_struct_union  v;                    /*    16    16 */

        /* size: 32, cachelines: 1, members: 3 */
        /* sum members: 28, holes: 1, sum holes: 4 */
        /* last cacheline: 32 bytes */
};
struct dw_loc_descr_node {
        dw_loc_descr_ref           dw_loc_next;          /*     0     8 */
        enum dwarf_location_atom   dw_loc_opc:8;         /*     8: 0  4 */
        unsigned int               dtprel:1;             /*     8: 8  4 */
        unsigned int               frame_offset_rel:1;   /*     8: 9  4 */

        /* XXX 22 bits hole, try to pack */

        int                        dw_loc_addr;          /*    12     4 */
        struct dw_val_node         dw_loc_oprnd1;        /*    16    32 */
        struct dw_val_node         dw_loc_oprnd2;        /*    48    32 */

        /* size: 80, cachelines: 2, members: 7 */
        /* sum members: 76 */
        /* sum bitfield members: 10 bits, bit holes: 1, sum bit holes: 22 bits */
        /* last cacheline: 16 bytes */
};
struct dw_attr_struct {
        enum dwarf_attribute       dw_attr;              /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        struct dw_val_node         dw_attr_val;          /*     8    32 */

        /* size: 40, cachelines: 1, members: 2 */
        /* sum members: 36, holes: 1, sum holes: 4 */
        /* last cacheline: 40 bytes */
};

The following patch is an (admittedly not very clean) attempt to decrease the size of dw_loc_descr_node from 80 bytes to 72 and (more importantly) of dw_attr_struct from 40 bytes to 32, by moving the dw_attr member from dw_attr_struct into dw_attr_val's padding and similarly moving the dw_loc_opc/dtprel/frame_offset_rel members into dw_loc_oprnd1's padding and dw_loc_addr into dw_loc_oprnd2's padding. All we need to ensure is that nothing tries to copy whole dw_val_node structs unless it is copied as part of a whole dw_loc_descr_node or dw_attr_struct copy.
To verify that wasn't the case, I've temporarily added a deleted copy ctor to dw_val_node and then looked at all the errors/warnings caused by that, and those were just from memcpy/memmove or structure assignments of whole dw_loc_descr_node/dw_attr_struct. 2025-04-24 Jakub Jelinek <jakub@redhat.com> PR debug/119711 * dwarf2out.h (struct dw_val_node): Add u member. (struct dw_loc_descr_node): Remove dw_loc_opc, dtprel, frame_offset_rel and dw_loc_addr members. (dw_loc_opc, dw_loc_dtprel, dw_loc_frame_offset_rel, dw_loc_addr): Define. (struct dw_attr_struct): Remove dw_attr member. (dw_attr): Define. * dwarf2out.cc (loc_descr_equal_p_1): Use dw_loc_dtprel instead of dtprel. (output_loc_operands, new_addr_loc_descr, loc_checksum, loc_checksum_ordered): Likewise. (resolve_args_picking_1): Use dw_loc_frame_offset_rel instead of frame_offset_rel. (loc_list_from_tree_1): Likewise. (resolve_addr_in_expr): Use dw_loc_dtprel instead of dtprel. (copy_deref_exprloc): Copy val_class, val_entry and v members instead of whole dw_loc_oprnd1 and dw_loc_oprnd2. (optimize_string_length): Copy val_class, val_entry and v members instead of whole dw_attr_val. (hash_loc_operands): Use dw_loc_dtprel instead of dtprel. (compare_loc_operands, compare_locs): Likewise.
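The padding argument can be reproduced with stand-in types. This is a minimal sketch, assuming a typical LP64 host where pointers are 8 bytes and 8-byte aligned; the type and field names are hypothetical and much simpler than the real dwarf2out structures (the actual patch adds a `u` member to dw_val_node rather than flattening the struct as done here):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical payload, 16 bytes like dw_val_struct_union in the dump above.
union Payload
{
  std::uint64_t words[2];
  void *ptr;
};

// Shaped like dw_val_node before the patch: a 4-byte tag followed by a
// pointer leaves a 4-byte hole on LP64 hosts.
struct ValNode
{
  int cls;
  void *entry;
  Payload v;
};

// Shaped like dw_attr_struct before the patch: a second 4-byte member in
// front of the node creates a second 4-byte hole.
struct PaddedAttr
{
  int attr;
  ValNode val;
};

// Same members, with attr placed in the hole that followed cls.
struct PackedAttr
{
  int cls;
  int attr;
  void *entry;
  Payload v;
};
```

On such a host sizeof (PaddedAttr) is 40 and sizeof (PackedAttr) is 32, matching the 40 -> 32 reduction reported for dw_attr_struct. The price, as the message notes, is that the inner node can no longer be copied as a unit without also copying the members that now live in its padding.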
2025-04-23target: [PR103750] Also handle avx512 kmask & immediate 15 or 3 when VF is 4/2.liuhongt16-39/+187
Since the upper bits are already cleared by the comparison instructions. gcc/ChangeLog: PR target/103750 * config/i386/sse.md (*<avx512>_cmp<mode>3_and15): New define_insn. (*<avx512>_ucmp<mode>3_and15): Ditto. (*<avx512>_cmp<mode>3_and3): Ditto. (*avx512vl_ucmpv2di3_and3): Ditto. (*<avx512>_cmp<V48H_AVX512VL:mode>3_zero_extend<SWI248x:mode>): Change operands[3] predicate to <cmp_imm_predicate>. (*<avx512>_cmp<V48H_AVX512VL:mode>3_zero_extend<SWI248x:mode>_2): Ditto. (*<avx512>_cmp<mode>3): Add GET_MODE_NUNITS (<MODE>mode) >= 8 to the condition. (*<avx512>_ucmp<mode>3): Ditto. (V48_AVX512VL_4): New mode iterator. (VI48_AVX512VL_4): Ditto. (V8_AVX512VL_2): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512vl-pr103750-1.c: New test. * gcc.target/i386/avx512f-pr96891-3.c: Adjust testcase. * gcc.target/i386/avx512f-vpcmpgtuq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpeqq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpequq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgeq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgeuq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgtq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgtuq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpleq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpleuq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpltq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpltuq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpneqq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpnequq-1.c: Ditto.
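A scalar model of the observation above, assuming a compare over four lanes that writes one mask bit per lane and leaves the remaining mask bits clear. The function name is hypothetical; it only mimics what the message says the hardware compare does:

```cpp
#include <cstdint>

// Model of a 4-lane signed greater-than compare into a mask register:
// lane i controls bit i, bits 4..7 are never set.
static inline std::uint8_t cmp_gt_4lanes (const std::int32_t a[4],
                                          const std::int32_t b[4])
{
  std::uint8_t k = 0;
  for (int i = 0; i < 4; i++)
    if (a[i] > b[i])
      k |= static_cast<std::uint8_t> (1u << i);
  return k;
}
```

Because the result never has bits above bit 3 set, `cmp_gt_4lanes (a, b) & 15` always equals `cmp_gt_4lanes (a, b)`, so the AND can be folded into the compare; the same reasoning applies to `& 3` with two lanes.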
2025-04-24PR modula2/119914 No error message generated when passing a Ztype to an unbounded arrayGaius Mulley6-20/+217
This patch detects constants ZType, RType, CType being passed to unbounded arrays and generates an error message highlighting the formal and actual parameters in error. gcc/m2/ChangeLog: PR modula2/119914 * gm2-compiler/M2Check.mod (checkConstMeta): Add check for Ztype, Rtype and Ctype and unbounded arrays. (IsZRCType): New procedure function. (isZRC): Add comment. * gm2-compiler/M2Quads.mod: * gm2-compiler/M2Range.mod (gdbinit): New procedure. (BreakWhenRangeCreated): Ditto. (CheckBreak): Ditto. (InitRange): Call CheckBreak. (Init): Add gdbhook and initialize interactive watch point. * gm2-compiler/SymbolTable.def (GetNthParamAnyClosest): New procedure function. * gm2-compiler/SymbolTable.mod (BreakSym): Remove constant. (BreakSym): Add Variable. (stop): Remove. (gdbhook): New procedure. (BreakWhenSymCreated): Ditto. (CheckBreak): Ditto. (NewSym): Call CheckBreak. (Init): Add gdbhook and initialize interactive watch point. (MakeProcedure): Replace guarded call to stop with CheckBreak. (GetNthParamChoice): New procedure function. (GetNthParamOrdered): Ditto. (GetNthParamAnyClosest): Ditto. (GetOuterModuleScope): Ditto. gcc/testsuite/ChangeLog: PR modula2/119914 * gm2/pim/fail/constintarraybyte.mod: New test. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2025-04-23Regenerate gcc.potJoseph Myers1-6357/+6609
* gcc.pot: Regenerate.
2025-04-23testsuite: Require fstack_protector for no-stack-protector-attr-3.CDimitar Dimitrov1-0/+1
The test fails on pru-unknown-elf with: cc1plus: warning: '-fstack-protector' not supported for this target Even though the compiled functions have the feature disabled using an attribute, the command line option is still not supported by some targets. Tested x86_64-pc-linux-gnu and ensured that g++.sum is the same with and without this patch. gcc/testsuite/ChangeLog: * g++.dg/no-stack-protector-attr-3.C: Require effective target fstack_protector. Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
2025-04-23Enable ipa-cp cloning over non-hot edgesJan Hubicka7-150/+192
Currently enabling profile feedback regresses x264 and exchange. In both cases the root of the issue is that the ipa-cp cost model thinks cloning is not relevant when feedback is available, while it clones without feedback. Consider:

  __attribute__ ((used))
  int a[1000];

  __attribute__ ((noinline))
  void
  test2 (int sz)
  {
    for (int i = 0; i < sz; i++)
      a[i]++;
    asm volatile (""::"m"(a));
  }

  __attribute__ ((noinline))
  void
  test1 (int sz)
  {
    for (int i = 0; i < 1000; i++)
      test2 (sz);
  }

  int
  main ()
  {
    test1 (1000);
    return 0;
  }

Here we want to clone both test1 and test2 and specialize them for 1000, but ipa-cp will not do that, since it will skip the call main->test1 as not hot, because it is called just once both with and without profile feedback. In this simple testcase even without profile feedback we will track that main is called once. I think the testcase shows that hotness of a call is not that relevant when deciding whether we want to propagate constants across it. ipa-cp with IPA profile can compute an overall estimate of time saved (which is the existing time benefit, computing time saved per invocation of the function, multiplied by the number of executions) and see if the result is big enough. An easy check is to simply call maybe_hot_p on the resulting count. So this patch makes ipa-cp consider all call sites except those known to be unlikely executed (i.e. run 0 times in the train run or known to lead to something bad) as interesting, which makes ipa-cp propagate across them, find cloning candidates and feed them into good_cloning_opportunity_p. For this I added cs_interesting_for_ipcp_p, which also attempts to do the right thing with partial training. Now good_cloning_opportunity_p will currently return false, since it will figure out that the call edge is not very frequent. It already kind of knows that the frequency of the call instruction itself is not too important, but instead of computing overall time saved, it tries to compare it with the param_ipa_cp_profile_count_base percentage of counts of call edges.
I think this is not very relevant since estimated time saved per call can be large. So I dropped this logic and replaced it with simple use of overall saved time. Since ipa-cp is not dealing well with the cases where it hits the allowed unit growth limit, we probably want to be more careful, so I keep the existing metric with this change. So now we get:

  Evaluating opportunities for test1/3.
   - considering value 1000 for param #0 sz (caller_count: 1)
       good_cloning_opportunity_p (time: 1, size: 8, count_sum: 1 (precise), overall time saved: 1 (adjusted)) -> evaluation: 0.12, threshold: 500
       not cloning: time saved is not hot
       good_cloning_opportunity_p (time: 129001, size: 20, count_sum: 1 (precise), overall time saved: 129001 (adjusted)) -> evaluation: 6450.05, threshold: 500

The first call to good_cloning_opportunity_p considers the case where only test1 is cloned. In this case time saved is 1 (for passing the value around) and since it is called just once (count_sum) overall time saved is 1, which is not considered hot, and we also get a very low evaluation score. In the second call we consider the cloning chain test1->test2. In this case time saved is large (129001) since test2 is invoked many times and the value is used to control the loop. We still know that the count is 1 but overall time is 129001, which is already considered relevant, and we clone. I also try to do something sensible in case we have calls both with and without IPA profile (which can happen for comdats where the profile went missing or with LTO if some units were not trained). Instead of checking whether the sum of calls with known profile is nonzero, I keep track of whether there are other calls and if so, also try the local heuristics that are used without profile feedback. The patch improves SPECint with -Ofast -fprofile-use by approx 1% by speeding up x264 from 99.3s to 91.3s (9%) and exchange from 99.7s to 95.5s (3.3%). We still get better x264 runtime without profile (86.4s for x264 and 93.8 for exchange).
The main problem I see is that ipa-cp has the global limit for growth of 10% but does not consider the opportunities in priority order. Consequently if the limit is hit, randomly some clone opportunities are dropped in favour of others. I dumped unit size changes with a -flto -Ofast build of SPEC2017. Without patch I get:

  orig     new      growth
  588677   605385   102.838229
  4378     6037     137.894016
  484650   494851   102.104818
  4111     4111     100.000000
  99953    103519   103.567677
  106181   114889   108.201091
  21389    21597    100.972462
  24925    26746    107.305918
  15308    23974    156.610922
  27354    27906    102.017986
  494      494      100.000000
  4631     4631     100.000000
  863216   872729   101.102042
  126604   126604   100.000000
  605138   627156   103.638509
  4112     4112     100.000000
  222006   231293   104.183220
  2952     3384     114.634146
  37584    39807    105.914751
  4111     4111     100.000000
  13226    13226    100.000000
  4111     4111     100.000000
  326215   337396   103.427494
  25240    25433    100.764659
  64644    65972    102.054328
  127223   132300   103.990631
  494      494      100.000000

Small units can grow up to 16000 instructions and other units are large. So there is only one 156% growth hitting limits, which is exchange, which has recursive cloning that is handled specially.
With profile feedback ipa-cp basically shuts itself off:

  333815   333891   100.022767
  2559     2974     116.217272
  217576   217581   100.002298
  2749     2749     100.000000
  64652    64716    100.098992
  68416    69707    101.886986
  13171    13171    100.000000
  11849    11849    100.000000
  10519    16180    153.816903
  15843    15843    100.000000
  231      231      100.000000
  3624     3624     100.000000
  573385   573386   100.000174
  97623    97623    100.000000
  295673   295676   100.001015
  2750     2750     100.000000
  130723   130726   100.002295
  2334     2334     100.000000
  19313    19313    100.000000
  2749     2749     100.000000
  517331   517331   100.000000
  6707     6707     100.000000
  2749     2749     100.000000
  193638   193638   100.000000
  16425    16425    100.000000
  47154    47154    100.000000
  96422    96422    100.000000
  231      231      100.000000

So we essentially clone only exchange and mcf (116%). With patch and no FDO I get:

  588677   605385   102.838229
  4378     6037     137.894016
  484519   494698   102.100846
  4111     4111     100.000000
  99953    103519   103.567677
  106181   114889   108.201091
  21389    22632    105.811398
  24854    26620    107.105496
  15308    23974    156.610922
  27354    28039    102.504204
  494      494      100.000000
  4631     4631     100.000000
  4631     4631     100.000000
  126604   126630   100.020536
  4112     4112     100.000000
  222006   231293   104.183220
  2952     3384     114.634146
  37584    39807    105.914751
  2760715  2835539  102.710312
  4111     4111     100.000000
  13226    13226    100.000000
  4111     4111     100.000000
  326215   337396   103.427494
  25240    25433    100.764659
  64644    65972    102.054328
  127223   132300   103.990631
  494      494      100.000000

which seems essentially the same as without the patch.
However with FDO I get:

  333815   350363   104.957237
  2559     3345     130.715123
  217469   220765   101.515618
  485599   488772   100.653420
  2749     2749     100.000000
  64652    74265    114.868836
  68416    87484    127.870674
  13171    20656    156.829398
  11792    11990    101.679104
  10519    17028    161.878506
  15843    16119    101.742094
  231      231      100.000000
  573336   573336   100.000000
  97623    97623    100.000000
  295497   296208   100.240612
  2750     2750     100.000000
  130723   133341   102.002708
  2334     2334     100.000000
  19313    19368    100.284782
  2749     2749     100.000000
  6707     6755     100.715670
  2749     2749     100.000000
  193638   194712   100.554643
  16425    17377    105.796043
  47154    47154    100.000000
  96422    96422    100.000000
  231      231      100.000000

So here we get growth to 114% and 127% in x264 (two different binaries), 56% growth in deepsjeng and 61% growth in exchange, all of which are above the 10% cutoff. Bootstrapped/regtested x86_64-linux. gcc/ChangeLog: * ipa-cp.cc (base_count): Remove. (struct caller_statistics): Rename n_hot_calls to n_interesting_calls; add called_without_ipa_profile. (init_caller_stats): Update. (cs_interesting_for_ipcp_p): New function. (gather_caller_stats): Collect n_interesting_calls and called_without_profile. (ipcp_cloning_candidate_p): Use n_interesting_calls rather than hot. (good_cloning_opportunity_p): Rewrite heuristics when IPA profile is present. (estimate_local_effects): Update. (value_topo_info::propagate_effects): Update. (compare_edge_profile_counts): Remove. (ipcp_propagate_stage): Do not collect base_count. (get_info_about_necessary_edges): Record whether function is called without profile. (decide_about_value): Update. (ipa_cp_cc_finalize): Do not initialize base_count. * profile-count.cc (profile_count::operator*): New. (profile_count::operator*=): New. * profile-count.h (profile_count::operator*): Declare. (profile_count::operator*=): Declare. * params.opt: Remove ipa-cp-profile-count-base. * doc/invoke.texi: Likewise.
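The two evaluation scores in the dump quoted in this message are consistent with dividing the overall time saved by the size cost (1 / 8 is about 0.12 and 129001 / 20 = 6450.05). The sketch below models only that arithmetic. The struct and function names are hypothetical, and the real good_cloning_opportunity_p works on GCC's own profile and fixed-point types and applies further checks (such as the "time saved is not hot" test shown in the dump):

```cpp
// Hypothetical, simplified model of one cloning candidate.
struct Candidate
{
  double time_saved_per_call;  // "time" in the dump
  double executions;           // "count_sum" in the dump
  double size;                 // "size" in the dump
};

// Overall benefit: per-invocation saving scaled by how often it runs.
static inline double overall_time_saved (const Candidate &c)
{
  return c.time_saved_per_call * c.executions;
}

// Score compared against the threshold: benefit per unit of code growth.
static inline double evaluation (const Candidate &c)
{
  return overall_time_saved (c) / c.size;
}

static inline bool passes_threshold (const Candidate &c, double threshold)
{
  return evaluation (c) >= threshold;
}
```

With the dump's numbers, `Candidate{1, 1, 8}` (cloning test1 alone) scores 0.125 and fails a threshold of 500, while `Candidate{129001, 1, 20}` (the test1->test2 chain) scores 6450.05 and passes, even though the call count is 1 in both cases.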
2025-04-23Cost truth_value exprs in i386 vectorizer costs.Jan Hubicka1-0/+18
This patch implements costing of truth_value exprs, i.e. a = b < c; Those seem to be now the most common operations that go to the addss path except for int->fp and fp->int conversions. For integer we use setcc; for FP there is CMPccSS and variants, which set the destination register as a mask (i.e. -1 on true and 0 on false). Technically these need res&1 to get 1 on true and 0 on false, but looking at examples where this is used, it is common that the resulting code is optimized avoiding the need for this (except for cases where the result is directly saved to memory). For this reason I am accounting only for one sse_op (CMPccSS) itself. gcc/ChangeLog: * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Cost truth_value exprs.
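A scalar sketch of the mask-versus-boolean point made above, assuming a compare that yields -1 for true and 0 for false as the SSE scalar compares do. All function names are hypothetical and the code only models the arithmetic, not the instructions GCC emits:

```cpp
#include <cstddef>
#include <cstdint>

// Model of the compare result: all bits set when b < c, zero otherwise.
static inline std::int32_t cmp_lt_mask (float b, float c)
{
  return b < c ? -1 : 0;
}

// Turning the mask into a C truth value needs one extra AND.
static inline std::int32_t mask_to_bool (std::int32_t m)
{
  return m & 1;
}

// Counts elements below limit by first converting each mask to 0/1.
static inline int count_less_bool (const float *v, std::size_t n, float limit)
{
  int count = 0;
  for (std::size_t i = 0; i < n; i++)
    count += mask_to_bool (cmp_lt_mask (v[i], limit));
  return count;
}

// Same result using the mask directly: subtracting -1 adds one, so the
// extra AND disappears.
static inline int count_less_mask (const float *v, std::size_t n, float limit)
{
  int count = 0;
  for (std::size_t i = 0; i < n; i++)
    count -= cmp_lt_mask (v[i], limit);
  return count;
}
```

The second loop is one example of a consumer that absorbs the conversion, which is the situation the message has in mind when it charges only for the compare.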
2025-04-23Update gcc sv.poJoseph Myers1-230/+162
* sv.po: Update.
2025-04-23testsuite: aarch64: arm: Enable vld1x?.c and vst1x?.c on arm [PR71233]Christophe Lyon6-46/+90
r14-7202-gc8ec3e1327cb1e added vld1xN and vst1xN intrinsics and some tests on arm, but didn't enable some existing tests. Since these tests are shared with aarch64, this patch removes the 'dg-skip-if "unimplemented" { arm*-*-* }' directives and relies on the advsimd-intrinsics.exp driver to define the appropriate flags and dg-do-what action. (A previous patch removed 'dg-do run', and this patch removes 'dg-options "-O3"' which would override the options computed by the test driver.) float16 intrinsics require the neon-fp16 FPU, which is possibly enabled by advsimd-intrinsics.exp, so we include them unconditionally on aarch64 or if fp16 is enabled on arm. poly64 intrinsics would require crypto-neon-fp-armv8: the patch enables the corresponding tests on aarch64 only, since for arm they are already covered by other tests in gcc.target/arm/simd/. For some reason, poly64 tests were missing from the x2 and x3 tests, so the patch adds them as needed. Tested on aarch64-linux-gnu (no change), arm-linux-gnueabihf (the additional tests are executed) and various flavors of arm-none-eabi (the additional tests are compiled-only on M-profile, executed on A-profile). gcc/testsuite/ PR target/71233 * gcc.target/aarch64/advsimd-intrinsics/vld1x2.c: Enable on arm. * gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst1x2.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst1x3.c: Likewise. * gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: Likewise.
2025-04-23testsuite: Skip g++.dg/eh/pr119507.C on Solaris/SPARC with asRainer Orth1-0/+2
The new g++.dg/eh/pr119507.C test FAILs on Solaris/SPARC with the native as:

FAIL: g++.dg/eh/pr119507.C -std=gnu++17 scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z6comdatv 1
FAIL: g++.dg/eh/pr119507.C -std=gnu++17 scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z7comdat1v 1
FAIL: g++.dg/eh/pr119507.C -std=gnu++26 scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z6comdatv 1
FAIL: g++.dg/eh/pr119507.C -std=gnu++26 scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z7comdat1v 1
FAIL: g++.dg/eh/pr119507.C -std=gnu++98 scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z6comdatv 1
FAIL: g++.dg/eh/pr119507.C -std=gnu++98 scan-assembler-times .section[\\t ][^\\n]*.gcc_except_table._Z7comdat1v 1

This happens because the syntax for COMDAT sections is vastly different from the one used by gas. Rather than trying to handle this, this patch just skips the test. Tested on sparc-sun-solaris2.11 with both as and gas, i386-pc-solaris2.11, and x86_64-pc-linux-gnu. 2025-04-23 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * g++.dg/eh/pr119507.C: Skip on sparc*-*-solaris2* && !gas.
2025-04-23Fortran: Use correct location in check of coarray functions [PR119200]Andre Vehreschild1-6/+12
Use gfc_current_intrinsic_where during check(), because gfc_current_locus is not set to correct location or at all. PR fortran/119200 gcc/fortran/ChangeLog: * check.cc (gfc_check_lcobound): Use locus from intrinsic_where. (gfc_check_image_index): Same. (gfc_check_num_images): Same. (gfc_check_team_number): Same. (gfc_check_this_image): Same. (gfc_check_ucobound): Same.
2025-04-23testsuite: AMDGCN test for vect-early-break_38.c as well to consistent architecture [PR119286]Tamar Christina1-0/+1
I had missed this one during the AMDGCN test failures. Like vect-early-break_18.c this test is also scalarizing the loads and thus leading to unexpected vectorization for this testcase. gcc/testsuite/ChangeLog: PR target/119286 * gcc.dg/vect/vect-early-break_38.c: Force -march=gfx908 for amdgcn.
2025-04-22Accept allones or 0 operand for vcond_mask op1.liuhongt6-7/+55
Since ix86_expand_sse_movcc will simplify them into a simple vmov, vpand or vpandn. gcc/ChangeLog: * config/i386/predicates.md (vector_or_0_or_1s_operand): New predicate. (nonimm_or_0_or_1s_operand): Ditto. * config/i386/sse.md (vcond_mask_<mode><sseintvecmodelower>): Extend the predicate of operands1 to accept 0 or allones operands. (vcond_mask_<mode><sseintvecmodelower>): Ditto. (vcond_mask_v1tiv1ti): Ditto. (vcond_mask_<mode><sseintvecmodelower>): Ditto. * config/i386/i386.md (mov<mode>cc): Ditto for operands[2] and operands[3]. * config/i386/i386-expand.cc (ix86_expand_sse_fp_minmax): Force immediate_operand to register. gcc/testsuite/ChangeLog: * gcc.target/i386/blendv-to-maxmin.c: New test. * gcc.target/i386/blendv-to-pand.c: New test.
2025-04-23Daily bump.GCC Administrator5-1/+425
2025-04-22Fix vectorizer costs of COND_EXPR, MIN_EXPR, MAX_EXPR, ABS_EXPR, ABSU_EXPRJan Hubicka2-11/+92
This patch adds special cases for vectorizer costs in COND_EXPR, MIN_EXPR, MAX_EXPR, ABS_EXPR and ABSU_EXPR. We previously costed ABS_EXPR and ABSU_EXPR but it was only correct for the FP variant (where it corresponds to andss clearing the sign bit). Integer abs/absu is open coded as a conditional move for SSE2, and SSE3 introduced an instruction. MIN_EXPR/MAX_EXPR compile to minss/maxss for FP and according to Agner Fog's tables they cost the same as sse_op on all targets. Integer is translated to a single instruction since SSE3. COND_EXPR is translated to an open-coded conditional move for SSE2; SSE4.1 simplified the sequence and AVX512 introduced masked registers. gcc/ChangeLog: * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Add special cases for COND_EXPR; make MIN_EXPR, MAX_EXPR, ABS_EXPR and ABSU_EXPR more realistic. gcc/testsuite/ChangeLog: * gcc.target/i386/pr89618-2.c: XFAIL.
2025-04-22rs6000: Ignore OPTION_MASK_SAVE_TOC_INDIRECT differences in inlining decisions [PR119327]Jakub Jelinek2-4/+23
The following testcase FAILs because the always_inline function can't be inlined. The rs6000 backend has similarly to other targets a hook which rejects inlining which would bring in new ISAs which aren't there in the caller. And this hook rejects this because of OPTION_MASK_SAVE_TOC_INDIRECT differences. This flag is set if explicitly requested or by default depending on whether the current function looks hot (or at least not cold):

  if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
      && flag_shrink_wrap_separate
      && optimize_function_for_speed_p (cfun))
    rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;

The target nodes that are being compared here are actually the default target node (which was created when cfun was NULL) vs. one that was created for the always_inline function when it wasn't NULL, so one doesn't have it, the other does. In any case, this flag feels like a tuning decision rather than hard ISA requirement and I see no problems why we couldn't inline even explicit -msave-toc-indirect function into -mno-save-toc-indirect or vice versa. We already ignore OPTION_MASK_P{8,10}_FUSION which are also more like tuning flags. 2025-04-22 Jakub Jelinek <jakub@redhat.com> PR target/119327 * config/rs6000/rs6000.cc (rs6000_can_inline_p): Ignore also OPTION_MASK_SAVE_TOC_INDIRECT differences. * g++.dg/opt/pr119327.C: New test.
2025-04-22aarch64: Define __ARM_FEATURE_FAMINMAXRichard Sandiford2-0/+16
We implemented FAMINMAX ACLE support but failed to define the associated feature macro. gcc/ * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define __ARM_FEATURE_FAMINMAX. gcc/testsuite/ * gcc.target/aarch64/pragma_cpp_predefs_4.c: Test __ARM_FEATURE_FAMINMAX.
2025-04-22Induction vectorizer: prevent ICE for scalable typesSpencer Abson1-9/+30
We currently check that the target supports PLUS_EXPR and MINUS_EXPR with step_vectype (a fix for pr103523). However, vectorizable_induction can emit a vectorized MULT_EXPR when calculating the step of each IV for SLP, and both MULT_EXPR/FLOAT_EXPR when calculating VEC_INIT for float inductions. gcc/ChangeLog: * tree-vect-loop.cc (vectorizable_induction): Add target support checks for vectorized MULT_EXPR and FLOAT_EXPR where necessary for scalable types. Prefer target_supports_op_p over directly_supports_p for these tree codes. (vectorizable_nonlinear_induction): Fix a doc comment while I'm here.
2025-04-22AArch64: Emit half-precision FCMP/FCMPESpencer Abson3-13/+77
Enable a target with FEAT_FP16 to emit the half-precision variants of FCMP/FCMPE. gcc/ChangeLog: * config/aarch64/aarch64.md: Update cbranch, cstore, fcmp and fcmpe to use the GPF_F16 iterator for floating-point modes. gcc/testsuite/ChangeLog: * gcc.target/aarch64/_Float16_cmp_1.c: New test. * gcc.target/aarch64/_Float16_cmp_2.c: New (negative) test.
2025-04-22AArch64: Define the spaceship optab [PR117013]Spencer Abson6-0/+390
This expansion ensures that exactly one comparison is emitted for spaceship-like sequences on floating-point operands, including when the result of such sequences is compared against members of std::<some_ordering>::<some_value>. For both integer and floating-point types, we optimize for the case in which the result of a spaceship-like operation is written to a GPR. The PR highlights this issue for floating-point operands, but we also make an improvement for integers, preferring:

  cmp w0, w1
  cset w1, gt
  csinv w0, w1, wzr, ge

over:

  cmp w0, w1
  mov w0, 1
  csinv w0, w0, wzr, ge
  csel w0, w0, wzr, ne

to compute:

  auto test(int a, int b) { return a <=> b;}

gcc/ChangeLog: PR target/117013 * config/aarch64/aarch64-protos.h (aarch64_expand_fp_spaceship): Declare optab expander function for floating-point types. * config/aarch64/aarch64.cc (aarch64_expand_fp_spaceship): Define optab expansion for floating-point types (new function). * config/aarch64/aarch64.md (spaceship<mode>4): Add define_expands for spaceship<mode>4 on integer and floating-point types. gcc/testsuite/ChangeLog: PR target/117013 * g++.target/aarch64/spaceship_1.C: New test. * g++.target/aarch64/spaceship_2.C: New test. * g++.target/aarch64/spaceship_3.C: New test.
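For readers without C++20 at hand, the shape of a spaceship-like sequence can be written out by hand. This is a minimal sketch with hypothetical function names; whether a given GCC version recognizes exactly these source forms as the spaceship pattern is not claimed here:

```cpp
#include <cmath>

// Three-way comparison of doubles written as chained conditionals:
// -1 for less, 0 for equal, 1 for greater, 2 when either operand is NaN.
static inline int cmp3_fp (double a, double b)
{
  return a == b ? 0 : a < b ? -1 : a > b ? 1 : 2;
}

// Integer counterpart: there is no unordered case, so the result is
// always -1, 0 or 1.
static inline int cmp3_int (int a, int b)
{
  return (a > b) - (a < b);
}
```

Written naively, cmp3_fp performs up to three separate floating-point comparisons; the point of the expansion described above is that one compare instruction already provides all the flags needed to pick among the four outcomes.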
2025-04-22aarch64: Update FP8 dependencies for -mcpu=olympusKyrylo Tkachov1-1/+1
We had not noticed that after g:299a8e2dc667e795991bc439d2cad5ea5bd379e2 the FP8FMA and FP8DOT4 features aren't implied by FP8FMA. The intent is for -mcpu=olympus to support all of them. Fix the definition to include the relevant sub-features explicitly. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ * config/aarch64/aarch64-cores.def (olympus): Add fp8fma, fp8dot4 explicitly.
2025-04-22Fortran: Various fixes on F2018 teams.Andre Vehreschild9-10/+137
gcc/fortran/ChangeLog: * match.cc (match_exit_cycle): Allow to exit team block. (gfc_match_end_team): Create end_team node also without parameter list. * trans-intrinsic.cc (conv_stat_and_team): Team and team_number only need to be a single pointer. * trans-stmt.cc (trans_associate_var): Create a mapping coarray token for coarray associations or it is not addressed correctly. * trans.h (enum gfc_coarray_regtype): Add mapping mode to coarray register. libgfortran/ChangeLog: * caf/libcaf.h: Add mapping mode to coarray's register. * caf/single.c (_gfortran_caf_register): Create a token sharing another token's memory. (check_team): Check team parameters to coindexed expressions are valid. gcc/testsuite/ChangeLog: * gfortran.dg/coarray/coindexed_3.f08: Add minimal test for get_team(). * gfortran.dg/team_change_2.f90: Add test for change team with label and exiting out of it. * gfortran.dg/team_end_2.f90: Check parsing to labeled team blocks is correct now. * gfortran.dg/team_end_3.f90: Check that end_team call is generated for labeled end_teams, too. * gfortran.dg/coarray/coindexed_5.f90: New test.
2025-04-22Fortran: Add teams support in image_index and num_images for F2018Andre Vehreschild18-172/+168
This more or less completes the set of functions that are affected by teams. gcc/fortran/ChangeLog: * check.cc (gfc_check_image_index): Check for team or team_number correctness. (gfc_check_num_images): Same. * gfortran.texi: Update documentation on num_images' API function. * intrinsic.cc (add_functions): Update signature of image_index and num_images. Both can take either a team handle or number. * intrinsic.h (gfc_check_num_images): Update signature to take either team or team_number. (gfc_check_image_index): Can take coarray, subscripts and team or team number now. (gfc_simplify_image_index): Same. (gfc_simplify_num_images): Same. (gfc_resolve_image_index): Same. * intrinsic.texi: Update documentation of num_images() Fortran function. * iresolve.cc (gfc_resolve_image_index): Update signature. * simplify.cc (gfc_simplify_num_images): Update signature and remove undocumented failed argument. (gfc_simplify_image_index): Add team or team number argument. * trans-intrinsic.cc (conv_stat_and_team): Because they are optional, teams need to be a pointer to the opaque pointer. (conv_caf_sendget): Correct call; was two arguments short. (trans_image_index): Support team or team_number. (trans_num_images): Same. (conv_intrinsic_cobound): Adapt to changed signature of num_images in call. * trans-stmt.cc (gfc_trans_sync): Same. libgfortran/ChangeLog: * caf/libcaf.h (_gfortran_caf_num_images): Correct prototype. * caf/single.c (_gfortran_caf_num_images): Default implementation. gcc/testsuite/ChangeLog: * gfortran.dg/coarray_49.f90: Adapt to changed error message. * gfortran.dg/coarray_collectives_12.f90: Adapt to changed function signature of num_images. * gfortran.dg/coarray_collectives_16.f90: Same. * gfortran.dg/coarray_lib_this_image_1.f90: Same. * gfortran.dg/coarray_lib_this_image_2.f90: Same. * gfortran.dg/coarray_this_image_1.f90: Adapt tests for num_images. * gfortran.dg/coarray_this_image_2.f90: Same. * gfortran.dg/coarray_this_image_3.f90: Same. 
* gfortran.dg/num_images_1.f90: Check that deprecated syntax is no longer supported.
2025-04-22Fortran: Add team-support to this_image [PR87326]Andre Vehreschild16-121/+283
This_image() no longer has a distance formal argument, but a team one. The source of the distance argument could not be identified, i.e. whether it came from a TS or standard draft. To implement only the standard it is removed. Besides being defined, it was not used anyway. PR fortran/87326 gcc/fortran/ChangeLog: * check.cc (gfc_check_this_image): Check the three different parameter lists possible for this_image and sort them correctly. * gfortran.texi: Update documentation on this_image's API. * intrinsic.cc (add_functions): Update this_image's signature. (check_specific): Add specific check for this_image. * intrinsic.h (gfc_check_this_image): Change to flexible argument list. * intrinsic.texi: Update documentation on this_image(). * iresolve.cc (gfc_resolve_this_image): Resolve the different arguments. * simplify.cc (gfc_simplify_this_image): Simplify the simplify routine. * trans-decl.cc (gfc_build_builtin_function_decls): Update signature of this_image. * trans-expr.cc (gfc_caf_get_image_index): Use correct signature of this_image. * trans-intrinsic.cc (trans_this_image): Adapt to correct signature. libgfortran/ChangeLog: * caf/libcaf.h (_gfortran_caf_this_image): Correct prototype. * caf/single.c (struct caf_single_team): Add new_index of image. (_gfortran_caf_this_image): Return the image index in the given team. (_gfortran_caf_form_team): Set new_index in team structure. gcc/testsuite/ChangeLog: * gfortran.dg/coarray_10.f90: Update error messages. * gfortran.dg/coarray_lib_this_image_1.f90: Same. * gfortran.dg/coarray_lib_this_image_2.f90: Same. * gfortran.dg/coarray_this_image_1.f90: Add more tests and remove incorrect ones. * gfortran.dg/coarray_this_image_2.f90: Test more features. * gfortran.dg/coarray_this_image_3.f90: New test.
2025-04-22Fortran: Update get_team, team_number and image_status to F2018 [PR88154, ↵Andre Vehreschild17-77/+329
PR88960, PR97210, PR103001] Add functions get_team() and team_number() to comply with F2018 standard. Update image_status() to comply with F2018 standard. PR fortran/88154 PR fortran/88960 PR fortran/97210 PR fortran/103001 gcc/fortran/ChangeLog: * check.cc (team_type_check): Check a type for being team_type from the iso_fortran_env module. (gfc_check_image_status): Use team_type check. (gfc_check_get_team): Check for level argument. (gfc_check_team_number): Use team_type check. * expr.cc (gfc_check_assign): Add treatment for returning team_type in caf-single mode. * gfortran.texi: Add/Update documentation for get_team and team_number API functions. * intrinsic.cc (add_functions): Update get_team signature. * intrinsic.h (gfc_resolve_get_team): Add prototype. * intrinsic.texi: Add/Update documentation for get_team and team_number Fortran functions. * iresolve.cc (gfc_resolve_get_team): Resolve return type to be of type team_type. * iso-fortran-env.def: Update STAT_LOCK constants. They have nothing to do with files. Add level constants for get_team. * libgfortran.h: Add level and unlock_stat constants. * simplify.cc (gfc_simplify_get_team): Simplify to correct return type team_type. * trans-decl.cc (gfc_build_builtin_function_decls): Update get_team and image_status API prototypes to correct signatures. * trans-intrinsic.cc (conv_intrinsic_image_status): Translate second parameter correctly. (conv_intrinsic_team_number): Translate optional single team argument correctly. (gfc_conv_intrinsic_function): Add translation of get_team. libgfortran/ChangeLog: * caf/libcaf.h: Add constants for get_team's level argument and update stat values for failed images. (_gfortran_caf_team_number): Add prototype. (_gfortran_caf_get_team): Same. * caf/single.c (_gfortran_caf_team_number): Get the given team's team number. (_gfortran_caf_get_team): Get the current team or the team given by level when the argument is present. 
gcc/testsuite/ChangeLog: * gfortran.dg/coarray/image_status_1.f08: Correct check for team_type. * gfortran.dg/pr102458.f90: Adapt to multiple errors. * gfortran.dg/coarray/get_team_1.f90: New test. * gfortran.dg/team_get_1.f90: New test. * gfortran.dg/team_number_1.f90: Correct Fortran syntax.
2025-04-22Fortran: Improve F2018 TEAM handling [PR87326, PR87556, PR88254, PR103896]Andre Vehreschild22-190/+942
Improve the implementation of F2018 TEAM handling routines. Add runtime-functions to caf_single to allow testing. PR fortran/87326 PR fortran/87556 PR fortran/88254 PR fortran/103796 gcc/fortran/ChangeLog: * coarray.cc (split_expr_at_caf_ref): Treat polymorphic types correctly. Ensure resolve of expression after coindex. (create_allocated_callback): Fix parameter of allocated function for coarrays. (coindexed_expr_callback): Improve detection of coarrays in allocated function. * decl.cc (gfc_match_end): Add team block matching. * dump-parse-tree.cc (show_code_node): Dump change team block as such. * frontend-passes.cc (gfc_code_walker): Recognize team block. * gfortran.texi: Add documentation for team api functions. * intrinsic.texi: Add documentation about team_type in iso_fortran_env module. * iso-fortran-env.def (team_type): Use helper to get pointer kind. * match.cc (gfc_match_associate): Factor out matching of association list, because it is used in change team as well. (check_coarray_assoc): Ensure that the association is to a coarray. (match_association_list): Match a list of associations either in associate or in change team. (gfc_match_form_team): Match form team correctly, including new_index. (gfc_match_change_team): Match change team with association list. (gfc_match_end_team): Match end team including stat and errmsg. (gfc_match_return): Prevent return from team block. * parse.cc (decode_statement): Sort team block. (next_statement): Same. (check_statement_label): Same. (accept_statement): Same. (verify_st_order): Same. (parse_associate): Renamed to move_associates_to_block... (move_associates_to_block): ... to enable reuse for change team. (parse_change_team): Parse it as block. (parse_executable): Same. * parse.h (enum gfc_compile_state): Add team block as compiler state. * resolve.cc (resolve_scalar_argument): New function to resolve an argument to a statement as a scalar. (resolve_form_team): Resolve its members. (resolve_change_team): Same. 
(resolve_branch): Prevent branch from jumping out of team block. (check_team): Removed. * trans-decl.cc (gfc_build_builtin_function_decls): Add stat and errmsg to team API functions and update their arguments. * trans-expr.cc (gfc_trans_subcomponent_assign): Also null the token when moving memory or an allocated() will not detect a free. * trans-intrinsic.cc (gfc_conv_intrinsic_caf_is_present_remote): Adapt to signature change no longer a pointer-pointer. * trans-stmt.cc (gfc_trans_form_team): Translate a form team including new_index. (gfc_trans_change_team): Translate a change team as a block. libgfortran/ChangeLog: * caf/libcaf.h: Remove commented block. (_gfortran_caf_form_team): Allow for all relevant arguments. (_gfortran_caf_change_team): Same. (_gfortran_caf_end_team): Same. (_gfortran_caf_sync_team): Same. * caf/single.c (struct caf_single_team): Team handling structures. (_gfortran_caf_init): Initialize initial team. (free_team_list): Free all teams and the memory they hold. (_gfortran_caf_finalize): Free initial and sibling teams. (_gfortran_caf_register): Add memory registered to current team. (_gfortran_caf_deregister): Unregister memory from current team. (_gfortran_caf_is_present_on_remote): Check token's memptr for allocation. May have been deallocated by an end team. (_gfortran_caf_form_team): Push a new team stub to the list. (_gfortran_caf_change_team): Push a formed team on top of the active teams stack. (_gfortran_caf_end_team): End the active team, free all memory allocated during its lifespan. (_gfortran_caf_sync_team): Take stat and errmsg into account. gcc/testsuite/ChangeLog: * gfortran.dg/team_change_2.f90: New test. * gfortran.dg/team_change_3.f90: New test. * gfortran.dg/team_end_2.f90: New test. * gfortran.dg/team_end_3.f90: New test. * gfortran.dg/team_form_2.f90: New test. * gfortran.dg/team_form_3.f90: New test. * gfortran.dg/team_sync_2.f90: New test.
2025-04-22Fortran: Unify handling of STAT= and ERRMSG= optional arguments [PR87939]Andre Vehreschild17-124/+528
In preparing F2018 Teams handling improvements, unify handling of STAT= and ERRMSG= optional arguments. Handling of stat and errmsg in most teams statements is corrected in the next patch. Implement stat and errmsg for move_alloc () to comply with F2018. PR fortran/87939 gcc/fortran/ChangeLog: * check.cc (gfc_check_move_alloc): Add stat and errmsg to move_alloc. * dump-parse-tree.cc (show_sync_stat): New helper function. (show_code_node): Use show_sync_stat to print stat and errmsg. * gfortran.h (struct sync_stat): New struct to unify stat and errmsg handling. * intrinsic.cc (add_subroutines): Correct signature of move_alloc. * intrinsic.h (gfc_check_move_alloc): Correct signature of check_move_alloc. * match.cc (match_named_arg): Match an optional argument to a statement. (match_stat_errmsg): Match a stat= or errmsg= named argument. (gfc_match_critical): Use match_stat_errmsg to match the named arguments. (gfc_match_sync_team): Same. * resolve.cc (resolve_team_argument): Resolve an expr to have type TEAM_TYPE from iso_fortran_env. (resolve_scalar_variable_as_arg): Resolve an argument as a scalar type. (resolve_sync_stat): Resolve stat and errmsg expressions. (resolve_sync_team): Resolve a sync team statement using sync_stat helper. (resolve_end_team): Same. (resolve_critical): Same. * trans-decl.cc (gfc_build_builtin_function_decls): Correct sync_team signature. * trans-intrinsic.cc (conv_intrinsic_move_alloc): Store stat and errmsg optional arguments in helper struct and use helper to translate. * trans-stmt.cc (trans_exit): Implement DRY pattern for generating an _exit(). (gfc_trans_sync_stat): Translate stat and errmsg contents. (gfc_trans_end_team): Use helper to translate stat and errmsg. (gfc_trans_sync_team): Same. (gfc_trans_critical): Same. * trans-stmt.h (gfc_trans_sync_stat): New function. * trans.cc (gfc_deallocate_with_status): Parameterize check at runtime to allow unallocated (co-)array when freeing a structure. 
(gfc_deallocate_scalar_with_status): Same and also add errmsg. * trans.h (gfc_deallocate_with_status): Signature changes. (gfc_deallocate_scalar_with_status): Same. libgfortran/ChangeLog: * caf/single.c (_gfortran_caf_lock): Correct stat value, if lock is already locked by current image. (_gfortran_caf_unlock): Correct stat value, if lock is not locked. gcc/testsuite/ChangeLog: * gfortran.dg/coarray_critical_2.f90: New test. * gfortran.dg/coarray_critical_3.f90: New test. * gfortran.dg/team_sync_1.f90: New test. * gfortran.dg/move_alloc_11.f90: New test.
2025-04-22[PATCH] [RISC-V]Support -mcpu for Xuantie cpuYixuan Chen8-2/+325
Support -mcpu=xt-c908, xt-c908v, xt-c910, xt-c910v2, xt-c920, xt-c920v2 for Xuantie series cpu. ref:https://www.xrvm.cn/community/download?id=4224248662731067392 without fmv_cost, vector_unaligned_access, use_divmod_expansion, overlap_op_by_pieces, fill the tune info with generic ooo for further modification. gcc/ChangeLog: * config/riscv/riscv-cores.def (RISCV_TUNE): Add xt-c908, xt-c908v, xt-c910, xt-c910v2, xt-c920, xt-c920v2. (RISCV_CORE): Add xt-c908, xt-c908v, xt-c910, xt-c910v2, xt-c920, xt-c920v2. * doc/invoke.texi: Add xt-c908, xt-c908v, xt-c910, xt-c910v2, xt-c920, xt-c920v2. gcc/testsuite/ChangeLog: * gcc.target/riscv/mcpu-xt-c908.c: test -mcpu=xt-c908. * gcc.target/riscv/mcpu-xt-c910.c: test -mcpu=xt-c910. * gcc.target/riscv/mcpu-xt-c920v2.c: test -mcpu=xt-c920v2. * gcc.target/riscv/mcpu-xt-c908v.c: test -mcpu=xt-c908v. * gcc.target/riscv/mcpu-xt-c910v2.c: test -mcpu=xt-c910v2. * gcc.target/riscv/mcpu-xt-c920.c: test -mcpu=xt-c920.
2025-04-22testsuite: Add support for GCOV_UNDER_TESTChristophe Lyon4-16/+44
After commit r15-8947-g8ed2d5d219e999, which added new tests using gcov, the CI noticed failures because it was calling 'gcov' instead of $target-gcov. This is because the CI scripts override GXX_UNDER_TEST, but still run the testsuite in-tree, and gcc-transform-out-of-tree only depends on TESTING_IN_BUILD_TREE but the definition of GCOV uses GCC_UNDER_TEST, GXX_UNDER_TEST or GDC_UNDER_TEST (assuming their default values when TESTING_IN_BUILD_TREE). To handle such a case, this patch adds support for a new variable, GCOV_UNDER_TEST, which overrides the current behavior if defined, and has an effect similar to overriding GCC_UNDER_TEST etc... Unfortunately, the change needs to be duplicated in several places, which use either GCC_UNDER_TEST, GXX_UNDER_TEST or GDC_UNDER_TEST. Tested g++.dg/gcov/gcov.exp and now g++.dg/gcov/gcov-22.C passes (calling <installdir>/bin/$target-gcov as instructed by the CI scripts). No change observed on gcc.misc-tests/gcov.exp, and I could not test gdc.dg/gcov.exp and gnat.dg/gcov/gcov.exp. gcc/testsuite/ChangeLog: * g++.dg/gcov/gcov.exp: Handle GCOV_UNDER_TEST. * gcc.misc-tests/gcov.exp: Likewise. * gdc.dg/gcov.exp: Likewise. * gnat.dg/gcov/gcov.exp: Likewise.
2025-04-22Document locality partitioning params in invoke.texiKyrylo Tkachov1-0/+13
Filip Kastl pointed out that contrib/check-params-in-docs.py complains about params not documented in invoke.texi, so this patch adds the short explanation from params.opt for these to the invoke.texi section. Thanks for the reminder. Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com> gcc/ * doc/invoke.texi (lto-partition-locality-frequency-cutoff, lto-partition-locality-size-cutoff, lto-max-locality-partition): Document.
2025-04-22testsuite: Use sigsetjmp in gcc.misc-tests/gcov-31.cRainer Orth1-1/+1
The gcc.misc-tests/gcov-31.c test FAILs on Solaris and Darwin: FAIL: gcc.misc-tests/gcov-31.c (test for excess errors) Excess errors: /vol/gcc/src/hg/master/local/gcc/testsuite/gcc.misc-tests/gcov-31.c:23:5: error: implicit declaration of function '__sigsetjmp'; did you mean 'sigsetjmp'? [-Wimplicit-function-declaration] __sigsetjmp is a Linux/glibc implementation detail. Other tests just use sigsetjmp directly, so this patch follows suit. Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, x86_64-pc-linux-gnu, and x86_64-apple-darwin24.4.0. 2025-04-22 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> gcc/testsuite: * gcc.misc-tests/gcov-31.c (run_pending_traps): Use sigsetjmp instead of __sigsetjmp.
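A minimal sketch of the portable call, assuming a POSIX system: the names `run_guarded` and `recovery_point` are hypothetical and are not taken from gcov-31.c.

```cpp
#include <cassert>
#include <setjmp.h>

// Hypothetical jump buffer for this sketch.
static sigjmp_buf recovery_point;

// Returns 0 when the guarded code completes normally and 1 when control
// comes back through siglongjmp.
int run_guarded(bool fail) {
    // Portable POSIX spelling. A nonzero second argument asks for the
    // signal mask to be saved and restored as well.
    if (sigsetjmp(recovery_point, 1) != 0)
        return 1;  // reached via siglongjmp below
    if (fail)
        siglongjmp(recovery_point, 1);
    return 0;
}

// Small self-check covering both paths.
void run_guarded_demo() {
    assert(run_guarded(false) == 0);
    assert(run_guarded(true) == 1);
}
```

Because the code calls sigsetjmp as declared by <setjmp.h>, it does not rely on how a particular C library implements it internally.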
2025-04-22c++/modules: Remove unnecessary lazy_load_pendingsNathaniel Shead1-2/+0
This call is not necessary, as we don't access the bodies of any classes that we instantiate here. gcc/cp/ChangeLog: * name-lookup.cc (lookup_imported_hidden_friend): Remove unnecessary lazy_load_pendings. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-04-22c++/modules: Find non-exported reachable decls when instantiating friend ↵Nathaniel Shead4-20/+67
classes [PR119863] In r15-9029-geb26b667518c95, we started checking for conflicting declarations with any reachable decl attached to the same originating module. This exposed the issue in the PR, where we would always create a new type even if a matching type existed in the original module. This patch reworks lookup_imported_hidden_friend to handle this case better, by first checking for any reachable decl in the attached module before looking in the mergeable decl slots. PR c++/119863 gcc/cp/ChangeLog: * name-lookup.cc (get_mergeable_namespace_binding): Remove no-longer-used function. (lookup_imported_hidden_friend): Also look for hidden imported decls in an attached decl's module. gcc/testsuite/ChangeLog: * g++.dg/modules/tpl-friend-18_a.C: New test. * g++.dg/modules/tpl-friend-18_b.C: New test. * g++.dg/modules/tpl-friend-18_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2025-04-21Skip g++.dg/eh/pr119507.C on arm eabiAndrew Pinski1-0/+2
arm eabi emits the exception table using the handlerdata directive and does not use a comdat section for comdat functions. So this testcase should be skipped for arm eabi. Pushed as obvious after a quick test. gcc/testsuite/ChangeLog: * g++.dg/eh/pr119507.C: Skip for arm eabi. Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
2025-04-21[testsuite] [ppc] require ifunc for target_clones testAlexandre Oliva1-0/+1
gcc.target/powerpc/power11-3.c uses target_clones, which depends on ifunc. Require ifunc support. for gcc/testsuite/ChangeLog * gcc.target/powerpc/power11-3.c: Require ifunc support.
2025-04-21[riscv] vec_dup immediate constants in pred_broadcast expand [PR118182]Alexandre Oliva1-2/+20
pr118182-2.c fails on gcc-14 because it lacks the late_combine passes, particularly the one that runs after register allocation. Even in the trunk, the predicate broadcast for the add reduction is expanded and register-allocated as _zvfh, taking up an unneeded scalar register to hold the constant to be vec_duplicated. It is the late combine pass after register allocation that substitutes this unneeded scalar register into the vec_duplicate, resolving to the _zero or _imm insns. It's easy enough and more efficient to expand pred_broadcast to the insns that take the already-duplicated vector constant, when the operands satisfy the predicates of the _zero or _imm insns. for gcc/ChangeLog PR target/118182 * config/riscv/vector.md (@pred_broadcast<mode>): Expand to _zero and _imm variants without vec_duplicate.
2025-04-22Daily bump.GCC Administrator4-1/+76
2025-04-21c++: reorder constexpr checksJason Merrill1-16/+21
My proposed change to stop setting TREE_STATIC on constexpr heap pseudo-variables led to a diagnostic regression because we would get the generic "not constant" diagnostic before the "allocated storage" diagnostic. So let's move the generic verify_constant down a bit. gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_outermost_constant_expr): Move verify_constant later.
2025-04-21c++: new size folding [PR118775]Jason Merrill3-8/+10
r15-7893 added a workaround for a case where we weren't registering (long)&a as invalid in a constant-expression, because build_new_1 had folded away the CONVERT_EXPR that we rely on to diagnose that problem. In general we want to defer most folding until cp_fold_function, so let's fold less here. We mainly want to expose constant size so we can treat it differently, and we already did any constexpr evaluation when initializing cst_outer_nelts, so fold_to_constant seems like the right choice. PR c++/118775 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_call_expression): Add assert. (fold_to_constant): Handle processing_template_decl. * init.cc (build_new_1): Use fold_to_constant. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/constexpr-new24.C: Adjust diagnostic.
2025-04-21c++: static constexpr strictness [PR99456]Jason Merrill1-3/+13
r11-7740 limited constexpr rejection of conversion from pointer to integer to manifestly constant-evaluated contexts; it should instead check whether we're in strict mode. The comment for that commit noted that making this change regressed other tests, which turned out to be because maybe_constant_init_1 was not being properly strict for variables declared constexpr/constinit. PR c++/99456 gcc/cp/ChangeLog: * constexpr.cc (cxx_eval_constant_expression): Check strict instead of manifestly_const_eval. (maybe_constant_init_1): Be strict for static constexpr vars.
2025-04-21Fix cost of vectorized double->float conversionJan Hubicka1-29/+22
In the previous patch I miscomputed the cost of the cvtpd2ps instruction, which mistakenly gets accounted as 2 (VEC_PACK_TRUNC_EXPR). The vectorizer can produce both, but when producing VEC_PACK_TRUNC_EXPR it uses the promote_demote path. This patch thus simplifies handling of NOP_EXPR since in that case we should always be producing only one instruction. PR target/119879 * config/i386/i386.cc (fp_conversion_stmt_cost): Inline to ... (ix86_vector_costs::add_stmt_cost): ... here; fix handling of NOP_EXPR.
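For context, the kind of source loop whose vectorized form involves a double->float conversion can be sketched as follows. The function name `narrow_to_float` is hypothetical, and which instruction sequence the vectorizer picks for it depends on the target and options.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: every element is narrowed from double to float.
void narrow_to_float(const double* in, float* out, std::size_t n) {
    assert((in != nullptr && out != nullptr) || n == 0);  // sketch precondition
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<float>(in[i]);  // the double->float conversion
}
```

The cost model discussed above decides how expensive the vectorized conversion is assumed to be, which in turn influences whether vectorizing such a loop is judged profitable.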