|
Removes `fold_convert (aggr_ptr_type, iv_step)` when using create_iv.
This was previously constructing statements like:
```
unsigned int _49;
vector([4,4]) int * _50;
sizetype _51;
vector([4,4]) int * vectp_x.6;
...
_50 = (vector([4,4]) int *) _49;
_51 = (sizetype) _50;
...
vectp_x.6_48 = vectp_x.6_47 + _51;
```
And instead creates:
```
unsigned int _49;
sizetype _50;
vector([4,4]) int * vectp_x.6;
...
_50 = (sizetype) _49;
...
vectp_x.6_48 = vectp_x.6_47 + _50;
```
As create_iv already has the logic to handle a pointer mode base and an integer
mode var this seems a more natural expression of this.
gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_create_data_ref_ptr): Remove unnecessary
casts to aggr_ptr_type.
|
|
This removes the non-SLP path from vectorizable_comparison.
* tree-vect-stmts.cc (vectorizable_comparison_1): Remove
non-SLP path.
(vectorizable_comparison): Likewise.
|
|
The following removes the non-SLP paths from vectorizable_condition.
* tree-vect-stmts.cc (vectorizable_condition): Remove
non-SLP paths.
|
|
This removes the non-SLP paths from vectorizable_scan_store.
* tree-vect-stmts.cc (vectorizable_scan_store): Remove
non-SLP path and unused parameters.
(vectorizable_store): Adjust.
|
|
This removes the non-SLP paths from vectorizable_shift.
* tree-vect-stmts.cc (vectorizable_shift): Remove non-SLP paths.
|
|
This removes the non-SLP paths from vectorizable_assignment.
* tree-vect-stmts.cc (vectorizable_assignment): Remove
non-SLP paths.
|
|
This removes the non-SLP paths from vectorizable_call, propagates
out ncopies == 1 and removes empty loops resulting from that.
* tree-vect-stmts.cc (vectorizable_call): Remove non-SLP path.
|
|
* tree-vect-stmts.cc (vectorizable_bswap): Remove non-SLP path.
|
|
This removes the non-SLP paths from vectorizable_recurr.
* tree-vect-loop.cc (vectorizable_recurr): Remove non-SLP path.
|
|
Extend the binary op/UNSPEC_SEL combiner patterns from SVE_FULL_F/
SVE_FULL_F_B16B16 to SVE_F/SVE_F_B16B16, where the strictness value
is SVE_RELAXED_GP.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_relaxed):
Extend from SVE_FULL_F_B16B16 to SVE_F_B16B16.
(*cond_<optab><mode>_3_relaxed): Likewise.
(*cond_<optab><mode>_any_relaxed): Likewise.
(*cond_<optab><mode>_any_const_relaxed): Extend from SVE_FULL_F
to SVE_F.
(*cond_add<mode>_2_const_relaxed): Likewise.
(*cond_add<mode>_any_const_relaxed): Likewise.
(*cond_sub<mode>_3_const_relaxed): Likewise.
(*cond_sub<mode>_const_relaxed): Likewise.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/sve/unpacked_cond_binary_bf16_1.C: New test.
* gcc.target/aarch64/sve/unpacked_cond_builtin_fmax_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_builtin_fmin_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fadd_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fdiv_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmaxnm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fminnm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fmul_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fsubr_1.c: Likewise.
|
|
This patch extends the unpredicated FP division expander to support
partial FP modes. It extends the existing patterns used to implement
UNSPEC_COND_FDIV and its approximation as needed.
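As background for the "approximation" mentioned above, here is a minimal scalar sketch of reciprocal refinement, assuming the usual Newton-Raphson scheme in which a reciprocal estimate (FRECPE-style) is improved by a correction factor of the form 2 - d * x (FRECPS-style). The function names and the starting estimate are hypothetical; the exact sequence GCC emits is not shown in this commit message.

```cpp
// One Newton-Raphson step for 1/d: multiply the current estimate by the
// FRECPS-style correction factor (2 - d * estimate).
double refine_reciprocal(double d, double estimate) {
  return estimate * (2.0 - d * estimate);
}

// Approximate n / d as n * (1/d), refining a caller-supplied initial
// estimate of 1/d a fixed number of times (hypothetical helper).
double approx_divide(double n, double d, double initial_estimate, int steps) {
  double x = initial_estimate;
  for (int i = 0; i < steps; ++i)
    x = refine_reciprocal(d, x);
  return n * x;
}
```

Each step roughly squares the relative error of the estimate, so a coarse starting value needs only a few steps.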
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (@aarch64_sve_<optab><mode>):
Extend from SVE_FULL_F to SVE_F, use aarch64_predicate_operand.
(@aarch64_frecpe<mode>): Extend from SVE_FULL_F to SVE_F.
(@aarch64_frecps<mode>): Likewise.
(div<mode>3): Likewise, use aarch64_sve_fp_pred.
* config/aarch64/iterators.md: Add warnings above SVE_FP_UNARY
and SVE_FP_BINARY.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/unpacked_fdiv_1.c: New test.
* gcc.target/aarch64/sve/unpacked_fdiv_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fdiv_3.c: Likewise.
|
|
This patch extends the expanders for unpredicated smax, smin, add, sub,
mul, min, and max, so that they support partial SVE FP modes.
The relevant insn and splitting patterns are also updated.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (<optab><mode>3): Extend from
SVE_FULL_F to SVE_F, use aarch64_sve_fp_pred.
(*post_ra_<sve_fp_op><mode>3): Extend from SVE_FULL_F to SVE_F.
(@aarch64_pred_<optab><mode>): Extend from SVE_FULL_F to SVE_F,
use aarch64_predicate_operand (ADD/SUB/MUL/MAX/MIN).
(split for using unpredicated insns): Move SVE_RELAXED_GP into
the pattern, rather than testing for it in the condition.
* config/aarch64/aarch64-sve2.md (@aarch64_pred_<optab><mode>):
Extend from VNx8BF_ONLY to SVE_BF.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/sve/unpacked_binary_bf16_1.C: New test.
* g++.target/aarch64/sve/unpacked_binary_bf16_2.C: Likewise.
* gcc.target/aarch64/sve/unpacked_builtin_fmax_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_builtin_fmax_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_builtin_fmin_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_builtin_fmin_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fadd_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fadd_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmaxnm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fminnm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmul_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fmul_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fsubr_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_fsubr_2.c: Likewise.
|
|
In some cases involving assigning an aggregate to a formal parameter of
an unconstrained discriminated subtype that has a Dynamic_Predicate, and where
the discriminated type also has a component of an unconstrained discriminated
subtype, the front end generates a malformed tree which causes a compilation
failure when the backend fails a consistency check.
gcc/ada/ChangeLog:
* exp_aggr.adb (Convert_To_Assignments): Add calls to Ensure_Defined
before generating assignments to components that could be
associated with a not-yet-defined itype.
|
|
RM 6.5 defines static and dynamic checks to ensure that a function result
with one or more access discriminants will not outlive the entity
designated by a non-null access discriminant value (see paragraphs
5.9 and 21). Implement these checks. Also fix a bug in passing along
an implicit parameter needed to perform the dynamic checks when a function
that takes such a parameter returns a call to another such function.
gcc/ada/ChangeLog:
* accessibility.adb (Function_Call_Or_Allocator_Level): Handle the
case where a function that has an Extra_Accessibility_Of_Result
parameter returns as its result a call to another such function.
In that case, the extra parameter should be passed along.
(Check_Return_Construct_Accessibility): Replace a warning about an
inevitable failure of a dynamic check with a legality-rule-violation
error message; adjust the text of the message accordingly.
* exp_ch6.ads (Apply_Access_Discrims_Accessibility_Check): New
procedure, following example of the existing
Apply_CW_Accessibility procedure.
* exp_ch6.adb (Apply_Access_Discrims_Accessibility_Check): body
for new procedure.
(Expand_Simple_Function_Return): Add call to new
Apply_Access_Discrims_Accessibility_Check procedure.
* exp_ch3.adb (Make_Allocator_For_Return): Add call to new
Apply_Access_Discrims_Accessibility_Check procedure.
|
|
gcc/ada/ChangeLog:
* doc/gnat_rm/standard_and_implementation_defined_restrictions.rst:
clarify parameter description.
* gnat_rm.texi: Regenerate.
|
|
The test vsx-builtin-7.c failed on powerpc64le-linux due to Identical
Code Folding (ICF) merging the functions insert_di_0_v2 and insert_di_0.
This behavior was introduced by commit r15-7961-gdc47161c1f32c3, which
enhanced alias analysis in ao_compare::compare_ao_refs, enabling the
compiler to identify and optimize structurally identical functions. As a
result, the compiler replaced insert_di_0_v2 with a tail call to
insert_di_0, altering the expected test behavior.
This patch adds -fno-ipa-icf to the test's dg-options to disable ICF,
avoiding function merging and ensuring the test executes correctly.
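For illustration only, a sketch of the kind of code ICF can merge: two functions with structurally identical bodies. The names and bodies here are hypothetical and are not the functions from vsx-builtin-7.c. With -fipa-icf (enabled by default at -O2) the compiler may turn one into an alias of, or a tail call to, the other; -fno-ipa-icf keeps them separate.

```cpp
// Two functions whose bodies are identical apart from their names.
// ICF may treat them as one function.
long insert_first(long value, long lanes[2]) {
  lanes[0] = value;
  return lanes[0] + lanes[1];
}

long insert_first_v2(long value, long lanes[2]) {
  lanes[0] = value;
  return lanes[0] + lanes[1];
}
```

Merging does not change the results, only the generated code, which is why it matters for a test that inspects the code emitted per function.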
2025-06-24 Jeevitha Palanisamy <jeevitha@linux.ibm.com>
gcc/testsuite/
PR testsuite/119382
* gcc.target/powerpc/vsx-builtin-7.c: Add '-fno-ipa-icf' to dg-options.
|
|
Add asm check to make sure vx combine of vaaddu.vx will not pollute
the vxrm.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm-1-u8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-fixed-vxrm.h: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
The vaaddu.vx combine mostly comes from avg_floor, which requires
the vxrm to be RDN. But not every vaaddu.vx should depend on RDN.
The vaaddu.vx combine should use the VXRM value as is instead of
forcing them all to RDN. This patch fixes this and keeps the value
as is.
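To see why the rounding mode must be preserved, here is a scalar sketch of one 8-bit lane of an averaging add under the four vxrm modes, written from the rounding rules in the RVV specification. The function and type names are hypothetical, and this is a model for illustration, not GCC code.

```cpp
#include <cstdint>

// Fixed-point rounding modes, using the vxrm encodings 0..3.
enum class Vxrm : std::uint8_t { RNU = 0, RNE = 1, RDN = 2, ROD = 3 };

// Model of one lane of vaaddu: (a + b) >> 1, rounded according to mode.
std::uint8_t averaging_add_u8(std::uint8_t a, std::uint8_t b, Vxrm mode) {
  unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
  unsigned dropped = sum & 1u;          // the bit shifted out
  unsigned kept_lsb = (sum >> 1) & 1u;  // lowest bit that survives the shift
  unsigned increment = 0;
  switch (mode) {
    case Vxrm::RNU: increment = dropped; break;                     // nearest, ties up
    case Vxrm::RNE: increment = dropped & kept_lsb; break;          // nearest, ties to even
    case Vxrm::RDN: increment = 0; break;                           // truncate (avg_floor)
    case Vxrm::ROD: increment = dropped & (kept_lsb ^ 1u); break;   // round to odd
  }
  return static_cast<std::uint8_t>((sum >> 1) + increment);
}
```

Only RDN matches avg_floor; for inputs 1 and 2 the other modes can give 2 instead of 1, so rewriting every vaaddu.vx to RDN would change results.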
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*uavg_floor_vx_<mode>): Rename
from...
(*<sat_op_v_vdup>_vx_<mode>): Rename to...
(*<sat_op_vdup_v>_vx_<mode>): Rename to...
* config/riscv/riscv-protos.h (enum insn_flags): Add vxrm
RNE, ROD type.
(enum insn_type): Add RNE_P, ROD_P type.
(expand_vx_binary_vxrm_vec_vec_dup): Add new func decl.
(expand_vx_binary_vxrm_vec_dup_vec): Ditto.
* config/riscv/riscv-v.cc (get_insn_type_by_vxrm_val): Add
helper to get insn type by vxrm value.
(expand_vx_binary_vxrm_vec_vec_dup): Add new func impl
to expand vec + vec_dup pattern.
(expand_vx_binary_vxrm_vec_dup_vec): Ditto but for
vec_dup + vec pattern.
* config/riscv/vector-iterators.md: Add helper iterator
for sat vx combine.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
When emitting a primary module interface, we must re-stream any TU-local
entities that we saw in a partition. This patch adds the missing
members from core_vals.
As a drive-by fix, in some cases we might have a typedef referring to a
TU-local entity; we need to handle that case as well.
PR c++/120412
gcc/cp/ChangeLog:
* module.cc (trees_out::core_vals): Write TU_LOCAL_ENTITY bits.
(trees_in::core_vals): Read it.
(trees_in::tree_node): Handle TU_LOCAL_ENTITY typedefs.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-14_a.C: New test.
* g++.dg/modules/internal-14_b.C: New test.
* g++.dg/modules/internal-14_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
|
|
Extend the unary op/UNSPEC_SEL combiner patterns from SVE_FULL_F to SVE_F,
where the strictness value is SVE_RELAXED_GP.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_relaxed):
Extend from SVE_FULL_F to SVE_F.
(*cond_<optab><mode>_any_relaxed): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/unpacked_cond_fabs_1.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fneg_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frinta_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frinta_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frinti_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frintm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frintp_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frintx_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_frintz_1.c: Likewise.
|
|
This patch extends the expander for unpredicated round, nearbyint, floor,
ceil, rint, and trunc, so that it can handle partial SVE FP modes.
We move fabs and fneg to a separate expander, since they are not trapping
instructions.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md (<optab><mode>2): Replace use of
aarch64_ptrue_reg with aarch64_sve_fp_pred.
(@aarch64_pred_<optab><mode>): Extend from SVE_FULL_F to SVE_F,
and use aarch64_predicate_operand.
* config/aarch64/iterators.md: Split FABS/FNEG out of
SVE_COND_FP_UNARY (into new SVE_COND_FP_UNARY_BITWISE).
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/unpacked_fabs_1.c: New test.
* gcc.target/aarch64/sve/unpacked_fneg_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frinta_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frinta_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frinti_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frinti_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintm_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintm_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintp_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintp_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintx_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintx_2.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintz_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_frintz_2.c: Likewise.
|
|
Add UNSPEC_SEL combiner patterns for unpacked FP conversions, where the
strictness value is SVE_RELAXED_GP.
gcc/ChangeLog:
* config/aarch64/aarch64-sve.md
(*cond_<optab>_nontrunc<SVE_PARTIAL_F:mode><SVE_HSDI:mode>_relaxed):
New FCVT/SEL combiner pattern.
(*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx2SI_ONLY:mode>_relaxed):
New FCVTZ{S,U}/SEL combiner pattern.
(*cond_<optab>_nonextend<SVE_HSDI:mode><SVE_PARTIAL_F:mode>_relaxed):
New {S,U}CVTF/SEL combiner pattern.
(*cond_<optab>_trunc<SVE_SDF:mode><SVE_PARTIAL_HSF:mode>):
New FCVT/SEL combiner pattern.
(*cond_<optab>_nontrunc<SVE_PARTIAL_HSF:mode><SVE_SDF:mode>_relaxed):
New FCVTZ{S,U}/SEL combiner pattern.
* config/aarch64/iterators.md: New mode iterator for VNx2SI.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/unpacked_cond_cvtf_1.c: New test.
* gcc.target/aarch64/sve/unpacked_cond_fcvt_1.c: Likewise.
* gcc.target/aarch64/sve/unpacked_cond_fcvtz_1.c: Likewise.
|
|
PR fortran/121203
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_conv_procedure_call): Obtain the character
length of an assumed character length procedure from the typespec
of the actual argument even if there is no explicit interface.
gcc/testsuite/ChangeLog:
* gfortran.dg/function_charlen_4.f90: New test.
|
|
During the last weeks it became clear that our current broadcast
handling needs an overhaul in order to improve maintainability.
PR121073 showed that my intermediate fix wasn't enough and caused
regressions.
This patch now takes a first step towards untangling broadcast
(vmv.v.x), "set first" (vmv.s.x), and zero-strided load (vlse).
Also can_be_broadcast_p is rewritten and strided_broadcast_p is
introduced to make the distinction clear directly in the predicates.
Due to the pervasiveness of the patterns I needed to touch a lot
of places and tried to clear up some things while at it. The patch
therefore also introduces new helpers expand_broadcast for vmv.v.x
that dispatches to regular as well as strided broadcast and
expand_set_first that does the same thing for vmv.s.x.
The non-strided fallbacks are now implemented as splitters of the
strided variants. This makes it easier to see where and when things
happen.
The test cases I touched appeared wrong to me so this patch sets a new
baseline for some of the scalar_move tests.
There is still work to be done but IMHO that can be deferred: It would
be clearer if the three broadcast-like variants differed not just in
name but also in RTL pattern so matching is not as confusing. Right now
vmv.v.x and vmv.s.x only differ in the mask and are interchangeable by
just changing it from "all ones" to a "single one".
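The "differ only in the mask" observation can be pictured with a small scalar model, assuming a four-lane vector and treating both operations as a masked broadcast. All names here are hypothetical, and the model ignores vector length and tail policy; it mirrors the RTL shape described above, not the full instruction semantics.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 4;
using Vec = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

// Lanes whose mask bit is set take the scalar; the others keep the old value.
Vec masked_broadcast(const Vec &old_value, int scalar, const Mask &mask) {
  Vec result = old_value;
  for (std::size_t i = 0; i < kLanes; ++i)
    if (mask[i])
      result[i] = scalar;
  return result;
}

// vmv.v.x-like: the mask is all ones.
Vec broadcast_all(const Vec &old_value, int scalar) {
  return masked_broadcast(old_value, scalar, Mask{{true, true, true, true}});
}

// vmv.s.x-like: the mask has a single one in lane 0.
Vec set_first(const Vec &old_value, int scalar) {
  return masked_broadcast(old_value, scalar, Mask{{true, false, false, false}});
}
```

Because the two wrappers share one underlying operation, a pattern matcher that looks only at the operation and not at the mask cannot tell them apart, which is the confusion the message refers to.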
As last time, I regtested on rv64 and rv32 with strided_broadcast turned
on and off. Note there are regressions in cond_fma_fnma-[78].c. Those are
due to the patch exposing more fwprop/late-combine opportunities. For
fma/fnma we don't yet have proper costing for vv/vx in place but I
expect that to be addressed soon and figured we can live with those for
the time being.
PR target/121073
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Use new helpers.
* config/riscv/autovec.md: Ditto.
* config/riscv/predicates.md (strided_broadcast_mask_operand):
New predicate.
(strided_broadcast_operand): Ditto.
(any_broadcast_operand): Ditto.
* config/riscv/riscv-protos.h (expand_broadcast): Declare.
(expand_set_first): Ditto.
(expand_set_first_tu): Ditto.
(strided_broadcast_p): Ditto.
* config/riscv/riscv-string.cc (expand_vec_setmem): Use new
helpers.
* config/riscv/riscv-v.cc (expand_broadcast): New function.
(expand_set_first): Ditto.
(expand_set_first_tu): Ditto.
(expand_const_vec_duplicate): Use new helpers.
(expand_const_vector_duplicate_repeating): Ditto.
(expand_const_vector_duplicate_default): Ditto.
(sew64_scalar_helper): Ditto.
(expand_vector_init_merge_repeating_sequence): Ditto.
(expand_reduction): Ditto.
(strided_broadcast_p): New function.
(whole_reg_to_reg_move_p): Use new helpers.
* config/riscv/riscv-vector-builtins-bases.cc: Use either
broadcast or strided broadcast.
* config/riscv/riscv-vector-builtins.cc (function_expander::use_ternop_insn):
Ditto.
(function_expander::use_widen_ternop_insn): Ditto.
(function_expander::use_scalar_broadcast_insn): Ditto.
* config/riscv/riscv-vector-builtins.h: Declare scalar
broadcast.
* config/riscv/vector.md (*pred_broadcast<mode>): Split into
regular and strided broadcast.
(*pred_broadcast<mode>_zvfh): Split.
(pred_broadcast<mode>_zvfh): Ditto.
(*pred_broadcast<mode>_zvfhmin): Ditto.
(@pred_strided_broadcast<mode>): Ditto.
(*pred_strided_broadcast<mode>): Ditto.
(*pred_strided_broadcast<mode>_zvfhmin): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/repeat-6.c: Adjust test
expectation.
* gcc.target/riscv/rvv/base/scalar_move-5.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-6.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-7.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-8.c: Ditto.
* gcc.target/riscv/rvv/base/scalar_move-9.c: Ditto.
* gcc.target/riscv/rvv/pr121073.c: New test.
|
|
This patch fixes the vf_vfmacc-run-1-f16.c test failures on rv32
by adding zvfh requirements as well as options to the test and
the target harness.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmacc-run-1-f16.c:
Add zvfh requirements and options.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmadd-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsac-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfmsub-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmacc-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmadd-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsac-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfnmsub-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmacc-run-1-f16.c:
Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vf_vfwmsac-run-1-f16.c:
Ditto.
* lib/target-supports.exp: Add zvfh options.
|
|
Regrename can fail in some cases and `insn_rr[INSN_UID (insn)].op_info`
will be null. The FMA steering code was not expecting the failure to happen.
This started to happen after early RA was added but it has been a latent bug
before that.
Built and tested for aarch64-linux-gnu.
PR target/120119
gcc/ChangeLog:
* config/aarch64/cortex-a57-fma-steering.cc (func_fma_steering::analyze):
Skip if renaming fails.
gcc/testsuite/ChangeLog:
* g++.dg/torture/pr120119-1.C: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
COBOL has a group of PERFORM statements that require careful adjustments to
the location_t elements of the GENERIC nodes so that the COBOL-aware version
of GDB behaves properly. These changes are in service of that goal.
gcc/cobol/ChangeLog:
* genapi.cc (leave_procedure): Adjust location_t for PERFORM.
(parser_perform_times): Likewise.
(internal_perform_through_times): Likewise.
(perform_outofline_before_until): Likewise.
(perform_outofline_after_until): Likewise.
(perform_outofline_testafter_varying): Likewise.
(perform_outofline_before_varying): Likewise.
|
|
Here we're incorrectly rejecting the modules testcase (reduced from a
std module example):
$ cat 121179_a.C
export module foo;

enum class E { x };
bool operator==(E, int);

export
template<class T>
void f() {
  E::x != 0;
}

$ cat 121179_b.C
import foo;

template void f<int>();

$ g++ -fmodules 121179_*.C
In module foo, imported at 121179_b.C:1:
121179_a.C: In instantiation of ‘void f@foo() [with T = int]’:
121179_b.C:3:9: required from here
121179_a.C:9:8: error: no match for ‘operator!=’ (operand types are ‘E@foo’ and ‘int’)
This is ultimately because our non-dependent rewritten operator expression
handling throws away the result of unqualified lookup at template parse time,
and so we have to repeat the lookup at instantiation time which fails because
the operator== isn't exported.
This is a known deficiency, but it's easy enough to narrowly fix this
for simple != to == rewrites by making build_min_non_dep_op_overload
look through logical negation.
PR c++/121179
gcc/cp/ChangeLog:
* call.cc (build_new_op): Don't clear *overload for a simple
!= to == rewrite.
* tree.cc (build_min_non_dep_op_overload): Handle TRUTH_NOT_EXPR
appearing in a rewritten operator expression.
gcc/testsuite/ChangeLog:
* g++.dg/lookup/operator-8.C: Strengthen test and remove one
XFAIL.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
Our implementation of the INVOKE spec ([func.require]) was incorrectly
treating reference_wrapper<T>::get() as returning T instead of T&, which
notably makes a difference when invoking a ref-qualified memfn pointer.
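A library-level illustration of the rule, using std::invoke rather than the compiler's internal build_invoke (the type and function names are hypothetical): INVOKE on a member function pointer and a reference_wrapper goes through get(), which yields an lvalue reference, so an lvalue-ref-qualified member function is callable this way.

```cpp
#include <functional>

struct Counter {
  int value = 42;
  // Callable only on lvalues because of the & ref-qualifier.
  int read() & { return value; }
};

int call_through_wrapper(Counter &c) {
  // std::ref(c).get() has type Counter&, so the call is
  // effectively (c.*&Counter::read)() on an lvalue.
  return std::invoke(&Counter::read, std::ref(c));
}
```

Had get() been modelled as returning Counter by value, the object expression would be an rvalue and the & qualified member would not be invocable.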
PR c++/121055
gcc/cp/ChangeLog:
* method.cc (build_invoke): Correct reference_wrapper handling.
gcc/testsuite/ChangeLog:
* g++.dg/ext/is_invocable5.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
gcc/testsuite/ChangeLog:
* lib/gcc-defs.exp (aarch64-arg-dg-options): Split add_tune into
add_tune and add_override, so that specifying -moverride does not
change the baseline tuning from the testsuite's default (generic).
|
|
We currently do only very restricted store sinking into paths
that have no loads or stores and end in a virtual PHI. The
following extends this to sink towards a single virtual
definition in addition to the case of a PHI, handling skipping
of unrelated virtual uses. We later have to prune cases
that would require virtual PHI insertion and the patch below
basically restricts this to sinking to noreturn paths for now.
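A hypothetical source shape where a store is only needed on a noreturn path; this is a sketch of the idea, not the new testcase, and whether the store is actually sunk for this exact code is an assumption. The names are made up for the example.

```cpp
#include <stdexcept>

int last_request = 0;

// A path that never returns normally and may observe last_request.
[[noreturn]] void report_failure() {
  throw std::runtime_error("bad request");
}

void handle(int request) {
  last_request = request;  // only observable on the failure path
  if (request < 0)
    report_failure();      // noreturn path
  last_request = 0;        // overwrites the first store on the normal path
}
```

On the normal path the first store is dead, so moving it next to the noreturn call would leave the fall-through path with a single store.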
PR tree-optimization/121220
* tree-ssa-sink.cc (statement_sink_location): For stores
handle sinking to paths ending in a store. Skip loads
that do not use the store.
* gcc.dg/tree-ssa/ssa-sink-23.c: New testcase.
|
|
This patch adds a return statement to M2Exception which removes a
build warning.
gcc/m2/ChangeLog:
* gm2-libs/M2EXCEPTION.mod (M2Exception): Add return
exException in case Raise completes.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
We currently use the types encountered in the function body, not those in
type declarations, to perform total scalarization. PR 119085 uncovered
a missing check: when the same data is accessed through several
aggregate types, those types must actually be compatible. Without it,
we can base total scalarization on a type that does not "cover" all
live data in a different part of the function. This patch adds the
check.
gcc/ChangeLog:
2025-07-21 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/119085
* tree-sra.cc (sort_and_splice_var_accesses): Prevent total
scalarization if two incompatible aggregates access the same place.
gcc/testsuite/ChangeLog:
2025-07-21 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/119085
* gcc.dg/tree-ssa/pr119085.c: New test.
|
|
This is a followup patch for PR modula2/121164 to
fix the location for the error message attributed to cc1gm2.
gcc/m2/ChangeLog:
PR modula2/121164
* gm2-compiler/P1SymBuild.mod: Remove PutProcTypeParam.
Remove PutProcTypeParam.
(CheckFileName): Remove.
(P1EndBuildDefinitionModule): Correct spelling.
(P1EndBuildImplementationModule): Ditto.
(P1EndBuildProgramModule): Ditto.
(EndBuildInnerModule): Ditto.
* gm2-compiler/P2SymBuild.mod (P2EndBuildDefModule): Correct
spelling.
(P2EndBuildImplementationModule): Ditto.
(P2EndBuildProgramModule): Ditto.
(EndBuildInnerModule): Ditto.
(CheckFormalParameterSection): Ditto.
* gm2-compiler/P3SymBuild.mod (P3EndBuildDefModule): Ditto.
* gm2-compiler/PCSymBuild.mod (PCEndBuildDefModule): Ditto.
(fixupProcedureType): Pass tok to PutProcTypeVarParam.
Pass tok to PutProcTypeParam.
* gm2-compiler/SymbolTable.def (PutProcTypeParam): Add tok
parameter.
(PutProcTypeVarParam): Ditto.
* gm2-compiler/SymbolTable.mod (SymParam): At change type to
CARDINAL.
New field FullTok.
New field Scope.
(SymVarParam): At change type to CARDINAL.
New field FullTok.
New field Scope.
(GetVarDeclTok): Check ShadowVar for NulSym and return At.
(PutParam): Initialize FullTok.
Initialize At.
Initialize Scope.
(PutVarParam): Initialize FullTok.
Assign At.
Initialize Scope.
(AddProcedureProcTypeParam): Add tok parameter.
(GetScope): Add ParamSym and VarParamSym clause.
(PutProcTypeVarParam): Add tok parameter.
Initialize At.
Initialize FullTok.
(GetDeclaredDefinition): Clause ParamSym return At.
Clause VarParamSym return At.
(GetDeclaredModule): Ditto.
(PutDeclaredDefinition): Remove clause ParamSym.
Remove clause VarParamSym.
(PutDeclaredModule): Remove clause ParamSym.
Remove clause VarParamSym.
libgm2/ChangeLog:
PR modula2/121164
* libm2iso/Makefile.am (libm2iso_la_M2FLAGS): Add -Wall.
* libm2iso/Makefile.in: Regenerate.
* libm2log/Makefile.am (libm2log_la_M2FLAGS): Add -Wall.
* libm2log/Makefile.in: Regenerate.
* libm2min/Makefile.am (libm2min_la_M2FLAGS): Add -Wall.
* libm2min/Makefile.in: Regenerate.
* libm2pim/Makefile.am (libm2pim_la_M2FLAGS): Add -Wall.
* libm2pim/Makefile.in: Regenerate.
gcc/testsuite/ChangeLog:
PR modula2/121164
* gm2/switches/pedantic-params/fail/arrayofchar.def: New test.
* gm2/switches/pedantic-params/fail/arrayofchar.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
Prompted by the discussions around a recent clang bug, I realized that
gcc still defaults to -mcpu=v9 on Solaris/SPARC.
This is an oversight since the Oracle Studio 12.6 cc, released in 2017,
already defaults to -xarch=sparcvis2, the equivalent of
-mcpu=ultrasparc3. Besides, both the 32 and 64-bit libc.so.1 require
UltraSPARC III extensions anyway:
SPARC32PLUS Version 1, V8+ Required, UltraSPARC3 Extensions Required [VIS]
SPARCV9 Version 1, UltraSPARC3 Extensions Required [VIS]
So this patch follows suit.
Bootstrapped on sparc-sun-solaris2.11 and sparcv9-sun-solaris2.11 with
as/ld, gas/ld, and gas/gld configurations.
There are currently two regressions exposed by this patch (PRs 121191
and 121192), which are present only in GCC 16 and in GCC 15/16, respectively.
There's one small caveat: while Solaris now marks all objects with
EF_SPARC_32PLUS EF_SPARC_SUN_US1 EF_SPARC_SUN_US3, gas only sets the
EF_SPARC_SUN_US[13] flags in the ELF header if UltraSPARC I/III insns
are actually used. This is in accordance with the SPARC Compliance
Definition 2.4.1, 4P-1. In the end it doesn't matter, since
libc.so.1 already has both flags, so the resulting executables and
shared objects will have them too.
2025-07-20 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>
gcc:
* config.gcc <sparc*-*-solaris2*> (with_cpu): Default to ultrasparc3.
|
|
With a patch still in development we get NULL STMT_VINFO_VECTYPE.
One side-effect is that during scalar stmt testing we no longer
pass a vectype. The following adjusts aarch64_vector_costs::add_stmt_cost
to check for a non-NULL vectype before accessing it, like all the
code surrounding it. The other fix possibility would have been
to reorder the check with the vect_mem_access_type one, but that
one is not going to exist during scalar code costing either in the
future.
* config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost):
Check vectype is non-NULL before accessing it.
|
|
constant_byte_string fails to consider the string type might be VLA
when initialized by an empty string CTOR.
PR middle-end/121216
* expr.cc (constant_byte_string): Check the string type
size fits an uhwi before converting to uhwi.
* gcc.dg/pr121216.c: New testcase.
|
|
Since r16-372-g064cac730f88dc fn1 is now inlined into main
which meant the scan dump was failing since it was looking
for it only once. Marking fn1 as noinline gets us back to
the old behavior and makes the test no longer depend on the inliner.
Pushed as obvious after a quick test.
PR testsuite/120101
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr81627.c (fn1): Mark as noinline.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
The test ends up writing a byte beyond bounds of the buffer, which gets
trapped on some targets when the test is run with
-fstack-protector-strong.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/pr116125.c (mem_overlap): Expand A to 10 members.
Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
|
|
|
|
The note and example in [class.union] p6 think that placement new can be
used to change the active member of a union, but we didn't support that for
array members in constant-evaluation even after implementing P1330 and
P2747.
First I tried to address this by introducing a CLOBBER_BEGIN_OBJECT for the
entire array, but that broke the resolution of LWG3436, which invokes 'new
T[1]' for an array T, and trying to clobber a multidimensional array when
the actual object is single-dimensional breaks. So I've raised that issue
with the committee. Until that is resolved, this patch takes a simpler
approach: allow initialization of an element of an array to make the array
the active member of a union.
PR c++/121068
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_store_expression): Allow ARRAY_REFs
when activating an array member of a union.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/constexpr-union6.C: Expect x5 to work.
* g++.dg/cpp26/constexpr-new4.C: New test.
|
|
A patch I was testing noticed that the allocation is too small for the
placement new here, but that isn't the point of the testcase.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wmismatched-new-delete-5.C: Fix allocation.
|
|
thing in function [PR109267]
When we have an empty function, things can go wrong with
cfi_startproc/cfi_endproc and a few other things like exceptions. So if
the only thing the function does is a call to __builtin_unreachable,
let's replace that with a __builtin_trap instead if the target has a trap
instruction. For targets without a trap instruction defined, replace it
with an infinite loop; this avoids the need for a call to abort
while still giving the correct behavior of not having two functions at the same
location.
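A minimal sketch of the situation described above, using hypothetical function names: two functions whose whole body is a call to __builtin_unreachable. If such functions were emitted with no instructions at all, adjacent ones could end up at the same address; a trap or an infinite loop gives each at least one instruction.

```cpp
// Functions whose only statement is __builtin_unreachable (a GCC/Clang
// builtin). They must never be called; the example only takes their
// addresses.
void never_called_a() { __builtin_unreachable(); }
void never_called_b() { __builtin_unreachable(); }

// Distinct functions are required to compare unequal.
bool have_distinct_addresses() {
  return &never_called_a != &never_called_b;
}
```

Note that the compiler may fold this comparison at compile time, so it documents the requirement rather than probing the actual object layout.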
The QOI idea for basic block reorder is recorded as PR 120004.
Changes since v1:
* v2: Move to final gimple cfg cleanup instead of expand and use
BUILT_IN_UNREACHABLE_TRAP.
* v3: For targets without a trap defined, create an infinite loop.
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end/109267
gcc/ChangeLog:
* tree-cfgcleanup.cc (execute_cleanup_cfg_post_optimizing): If the first
non debug statement in the first (and only) basic block is a call
to __builtin_unreachable change it to a call to __builtin_trap or an
infinite loop.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_trap): New proc.
* g++.dg/missing-return.C: Update testcase for the !trap case.
* gcc.dg/pr109267-1.c: New test.
* gcc.dg/pr109267-2.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
This patch fixes the following defects in the function:
- The cost of move instructions larger than the natural word width,
specifically "movd[if]_internal", cannot be estimated correctly
- Floating-point or symbolic constant assignment insns cannot be
identified as L32R instructions
gcc/ChangeLog:
* config/xtensa/xtensa.cc (xtensa_is_insn_L32R_p):
Rewrite to capture insns that could be L32R machine instructions
wherever possible.
(xtensa_rtx_costs): Fix to consider that moves larger than a
natural word can take multiple L32R machine instructions.
(constantpool_address_p): Cosmetics.
* config/xtensa/xtensa.md (movdi_internal, movdf_internal):
Add missing insn attributes.
|
|
The relaxed MOVI instructions in the Xtensa ISA are assignments that
contain large integer, floating-point or symbolic constants that would not
normally be allowed as immediate values by instructions in assembly code.
They are instead translated later by the assembler, rather than the compiler,
into L32R instructions referencing literal pool entries containing
those values (see the '-mauto-litpools' Xtensa-specific option).
This means that even though such instructions look like nothing more than
constant value assignments in their RTL representation, these may perform
better by treating them as loads from memory (i.e. the actual behavior)
and also trying to avoid using the value immediately after the load,
especially from an instruction scheduling perspective.
gcc/ChangeLog:
* config/xtensa/xtensa.md
(movsi_internal, movhi_internal, movsf_internal):
Change the value of the "type" attribute from "move" to "load"
when the source operand constraint is "Y".
|
|
The function `vect_check_gather_scatter` requires the `base` of the load
to be loop-invariant and the `off`set to be not loop-invariant. When faced
with a scenario where `base` is not loop-invariant, instead of giving up
immediately we can try swapping the `base` and `off`, if `off` is
actually loop-invariant.
Previously, it would only swap if `off` was the constant zero (and so
trivially loop-invariant). This is too conservative: we can still
perform the swap if `off` is a more complex but still loop-invariant
expression, such as a variable defined outside of the loop.
This allows loops like the function below to be vectorised, if the
target has masked loads and sufficiently large vector registers (eg
`-march=armv8-a+sve -msve-vector-bits=128`):
```c
typedef struct Array {
    int elems[3];
} Array;

int loop(Array **pp, int len, int idx) {
    int nRet = 0;
    for (int i = 0; i < len; i++) {
        Array *p = pp[i];
        if (p) {
            nRet += p->elems[idx];
        }
    }
    return nRet;
}
```
gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_check_gather_scatter): Swap
`base` and `off` in more scenarios. Also assert at the end of
the function that `base` and `off` are loop-invariant and not
loop-invariant respectively.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/mask_load_2.c: Update tests.
|
|
Commit the test file `mask_load_2.c` before the vectorisation analysis
is changed, so that the changes in codegen are more obvious in the next
commit.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/sve/mask_load_2.c: New test.
|
|
So while debugging Austin's work to support the spacemit x60 in the BPI we
found that even though his pipeline description had mappings for all the vector
instructions, they were still getting matched by the generic-vector-ooo DFA.
The core problem is that the DFA never restricted itself to a tune option (oops).
That's easily fixed, at which time everything using generic blows up because we
don't have a generic in-order vector DFA. Everything using generic was
indirectly also using generic-vector-ooo for the vector instructions.
It may be better long term to define a generic-vector DFA, but to preserve
behavior, I'm letting generic-vector-ooo match when the generic DFA is active.
Tested in my tester, waiting on pre-commit CI before moving forward.
gcc/
* config/riscv/generic-vector-ooo.md: Restrict insn reservations to
generic_ooo and generic tuning models.
|
|
When we have a vector shift with a scalar the shift operand can be
external - in that case we should not use the shift operand def
as hint where to place the vector shift instruction. The ICE
in the PR is because stmt dominance queries only work inside of
the vector region. But we should also never place stmts outside
of it.
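For orientation, a hypothetical source shape with a vector shift by an external scalar; this is not the testcase from the PR, and the names are made up. The four shifts can form one SLP group, while the shift amount is a function parameter defined outside any vectorized region.

```cpp
// Four adjacent element shifts by the same scalar amount. The amount is
// a parameter, so its definition lies outside the statements being
// vectorized.
void shift_four(unsigned *values, unsigned amount) {
  values[0] <<= amount;
  values[1] <<= amount;
  values[2] <<= amount;
  values[3] <<= amount;
}
```

In such a case the definition of the shift amount says nothing useful about where the vector shift should be placed, which is the hint the patch stops using.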
PR tree-optimization/121202
* tree-vect-slp.cc (vect_schedule_slp_node): Do not take
an out-of-region stmt as "last".
* gcc.dg/pr121202.c: New testcase.
|