|
live in ext-dce
Another patch to refine liveness computations. This should be NFC and is
designed to help debugging.
In simplest terms the patch avoids setting bit groups outside the size of a
pseudo as live. Consider a HImode pseudo: bits 16..63 of such a pseudo don't
really have meaning, yet we often set the bit groups covering bits 16..63 in
the liveness bitmaps.
This makes debugging harder than it needs to be by simply having larger bitmaps
to verify when walking through the code in a debugger.
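The idea of limiting liveness to the groups a mode actually covers can be sketched as follows. This is only an illustration, not the code in ext-dce.cc: it assumes four bit groups per pseudo (bits 0..7, 8..15, 16..31 and 32..63), and the names group_limit_sketch and live_groups_sketch are hypothetical.

```cpp
#include <cstdint>

// Assumed layout: four bit groups per pseudo, covering bits
// 0..7, 8..15, 16..31 and 32..63.
constexpr int kNumGroups = 4;

// Hypothetical helper: number of bit groups spanned by a mode that is
// MODE_BITS wide.  A 16-bit (HImode-like) value spans two groups.
int group_limit_sketch(unsigned mode_bits) {
  if (mode_bits <= 8) return 1;
  if (mode_bits <= 16) return 2;
  if (mode_bits <= 32) return 3;
  return kNumGroups;
}

// Hypothetical helper: mask of groups to mark live, with bit I standing
// for group I.  Groups beyond the mode's size are never set.
std::uint8_t live_groups_sketch(unsigned mode_bits) {
  int limit = group_limit_sketch(mode_bits);
  return static_cast<std::uint8_t>((1u << limit) - 1u);
}
```

With this shape, a 16-bit pseudo only ever gets its two low groups marked, so there are fewer bits to check by hand in a debugger.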
This has been bootstrapped and regression tested on x86_64. It's also been
tested on the crosses in my tester without regressions.
Pushing to the trunk.
PR rtl-optimization/115877
gcc/
* ext-dce.cc (group_limit): New function.
(mark_reg_live): Likewise.
(ext_dce_process_sets): Use new functions.
(ext_dce_process_uses): Likewise.
(ext_dce_init): Likewise.
|
|
We're hashing operand 2 to the temporary hash.
* fold-const.cc (operand_compare::hash_operand): Fix hash
of WIDEN_*_EXPR.
|
|
The following constifies parts of inchash.
* inchash.h (inchash::end): Make const.
(inchash::merge): Take const reference hash argument.
(inchash::add_commutative): Likewise.
|
|
Coarray parameters of procedures/functions need to be dereffed, because
they are references to the descriptor but the routine expected the
descriptor directly.
PR fortran/88624
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_conv_procedure_call): Treat
pointers/references (e.g. from parameters) correctly by derefing
them.
gcc/testsuite/ChangeLog:
* gfortran.dg/coarray/dummy_1.f90: Add calling function through
function.
* gfortran.dg/pr88624.f90: New test.
|
|
[PR115531].
This implements the new target hook indicating that for AArch64 when possible
we prefer masked operations for any type vs doing LOAD + SELECT or
SELECT + STORE.
Thanks,
Tamar
gcc/ChangeLog:
PR tree-optimization/115531
* config/aarch64/aarch64.cc
(aarch64_conditional_operation_is_expensive): New.
(TARGET_VECTORIZE_CONDITIONAL_OPERATION_IS_EXPENSIVE): New.
gcc/testsuite/ChangeLog:
PR tree-optimization/115531
* gcc.dg/vect/vect-conditional_store_1.c: New test.
* gcc.dg/vect/vect-conditional_store_2.c: New test.
* gcc.dg/vect/vect-conditional_store_3.c: New test.
* gcc.dg/vect/vect-conditional_store_4.c: New test.
|
|
This adds a conditional store optimization for the vectorizer as a pattern.
The vectorizer already supports modifying memory accesses because of the pattern
based gather/scatter recognition.
Doing it in the vectorizer allows us to still keep the ability to vectorize such
loops for architectures that don't have MASK_STORE support, whereas doing this
in ifcvt makes us commit to MASK_STORE.
Concretely for this loop:
void foo1 (char *restrict a, int *restrict b, int *restrict c, int n, int stride)
{
  if (stride <= 1)
    return;
  for (int i = 0; i < n; i++)
    {
      int res = c[i];
      int t = b[i+stride];
      if (a[i] != 0)
        res = t;
      c[i] = res;
    }
}
today we generate:
.L3:
ld1b z29.s, p7/z, [x0, x5]
ld1w z31.s, p7/z, [x2, x5, lsl 2]
ld1w z30.s, p7/z, [x1, x5, lsl 2]
cmpne p15.b, p6/z, z29.b, #0
sel z30.s, p15, z30.s, z31.s
st1w z30.s, p7, [x2, x5, lsl 2]
add x5, x5, x4
whilelo p7.s, w5, w3
b.any .L3
which in gimple is:
vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
vect_t_20.12_74 = .MASK_LOAD (vectp.10_72, 32B, loop_mask_67);
vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
mask__34.16_79 = vect__9.15_77 != { 0, ... };
vect_res_11.17_80 = VEC_COND_EXPR <mask__34.16_79, vect_t_20.12_74, vect_res_18.9_68>;
.MASK_STORE (vectp_c.18_81, 32B, loop_mask_67, vect_res_11.17_80);
A MASK_STORE is already conditional, so there's no need to perform the load of
the old values and the VEC_COND_EXPR. This patch makes it so we generate:
vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
mask__34.16_79 = vect__9.15_77 != { 0, ... };
.MASK_STORE (vectp_c.18_81, 32B, mask__34.16_79, vect_res_18.9_68);
which generates:
.L3:
ld1b z30.s, p7/z, [x0, x5]
ld1w z31.s, p7/z, [x1, x5, lsl 2]
cmpne p7.b, p7/z, z30.b, #0
st1w z31.s, p7, [x2, x5, lsl 2]
add x5, x5, x4
whilelo p7.s, w5, w3
b.any .L3
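As a scalar analogue of the transformation above, the two functions below compute the same result in single-threaded code. This is a sketch only: the function names are made up for illustration, the stride check from foo1 is omitted, and the second form merely mirrors what a store masked by the condition does per element.

```cpp
// Load + select + unconditional store: the shape of the loop in foo1.
void store_via_select_sketch(const char *a, const int *b, int *c,
                             int n, int stride) {
  for (int i = 0; i < n; i++) {
    int res = c[i];          // load of the old value
    int t = b[i + stride];
    if (a[i] != 0)
      res = t;               // select between old and new value
    c[i] = res;              // store happens on every iteration
  }
}

// Conditional store: only lanes whose condition holds are written,
// so the old value of c[i] never needs to be loaded.
void store_via_mask_sketch(const char *a, const int *b, int *c,
                           int n, int stride) {
  for (int i = 0; i < n; i++)
    if (a[i] != 0)
      c[i] = b[i + stride];
}
```

Both forms leave c[i] untouched where a[i] is zero; the second simply never reads or rewrites those elements.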
gcc/ChangeLog:
PR tree-optimization/115531
* tree-vect-patterns.cc (vect_cond_store_pattern_same_ref): New.
(vect_recog_cond_store_pattern): New.
(vect_vect_recog_func_ptrs): Use it.
* target.def (conditional_operation_is_expensive): New.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Document it.
* targhooks.cc (default_conditional_operation_is_expensive): New.
* targhooks.h (default_conditional_operation_is_expensive): New.
|
|
'dg-run' is not a valid dejagnu directive, 'dg-do run' is needed here
for the test to be executed.
PR target/108699
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr108699.c: Fix 'dg-run' typo.
Signed-off-by: Sam James <sam@gentoo.org>
|
|
Rearrange the test help header files, as well as align the name
conventions.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary.h: Move to...
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vvv_run.h: ...here.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h: Move to...
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vvx_run.h: ...here.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: Move to...
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx_run.h: ...here.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c: Adjust
the include file names.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-13.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-14.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-17.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-21.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-22.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-23.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-24.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-25.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-26.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-27.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-28.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-29.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-30.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-31.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-32.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-13.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-14.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-17.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-18.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-19.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-20.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-21.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-22.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-23.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-24.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-25.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-26.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-27.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-28.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-29.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-30.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-31.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-32.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-5.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-6.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-7.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-8.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-run-9.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-1.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-2.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-3.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-4.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-5.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-6.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-7.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-8.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-1.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-2.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-3.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-4.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-5.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-6.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-7.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-run-8.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-1.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-10.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-11.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-12.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-13.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-14.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-15.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-16.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-17.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-18.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-19.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-2.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-20.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-21.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-22.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-23.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-24.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-25.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-26.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-27.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-28.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-29.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-3.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-30.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-31.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-32.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-33.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-34.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-35.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-36.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-37.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-38.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-39.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-4.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-40.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-5.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-6.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-7.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-8.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-9.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-1.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-10.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-11.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-12.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-13.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-14.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-15.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-16.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-17.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-18.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-19.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-2.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-20.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-21.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-22.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-23.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-24.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-25.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-26.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-27.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-28.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-29.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-3.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-30.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-31.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-32.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-33.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-34.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-35.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-36.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-37.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-38.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-39.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-4.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-40.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-5.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-6.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-7.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-8.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub-run-9.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-1.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-2.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-3.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip-run.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-2.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-3.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-4.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-5.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-6.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-1.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-2.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-3.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-4.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-5.c: Ditto
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-run-6.c: Ditto
* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Move to...
* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: ...here.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
|
|
2024-07-21 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/59104
* gfortran.h : Add decl_order to gfc_symbol.
* symbol.cc : Add static next_decl_order.
(gfc_set_sym_referenced): Set symbol decl_order.
* trans-decl.cc : Include dependency.h.
(decl_order): Replace symbol declared_at.lb->location with
decl_order.
gcc/testsuite/
PR fortran/59104
* gfortran.dg/dependent_decls_3.f90: New test.
|
|
initialization
While debugging pr115877, I noticed we were failing to remove the destination
register from LIVENOW bitmap when it was set to a constant value. ie (set
(dest) (const_int)). This was a trivial oversight in
safe_for_live_propagation.
I don't have an example of this affecting code generation, but it certainly
could. More importantly, by making LIVENOW more accurate it's easier to debug
when LIVENOW differs from expectations.
As with the prior patch this has been tested as part of a larger patchset with
the crosses as well as individually on x86_64.
Pushing to the trunk.
PR rtl-optimization/115877
gcc/
* ext-dce.cc (safe_for_live_propagation): Handle RTX_CONST_OBJ.
|
|
So I'm not yet sure how I'm going to break everything down, but this is easy
enough to break out as 1/N of ext-dce fixes/improvements.
When handling uses in an insn, we first determine what bits are set in the
destination which is represented in DST_MASK. Then we use that to refine what
bits are live in the source operands.
In the source operand handling section we *modify* DST_MASK if the source
operand is a SUBREG (ugh!). So if the first operand is a SUBREG, then we can
incorrectly compute which bit groups are live in the second operand, especially
if it is a SUBREG as well.
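The shape of the problem and of the fix can be sketched in isolation. Everything here is hypothetical (the OperandSketch type, the shift by subreg_lsb as the SUBREG adjustment, and the function name); the point is only that the mask is reset to its saved value before each operand is examined.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a source operand.
struct OperandSketch {
  bool is_subreg;       // operand is wrapped in a SUBREG
  unsigned subreg_lsb;  // assumed adjustment: bit offset of the SUBREG
};

// Returns the mask used to refine liveness for each operand in turn.
std::vector<std::uint64_t>
live_masks_sketch(std::uint64_t dst_mask,
                  const std::vector<OperandSketch> &ops) {
  std::vector<std::uint64_t> out;
  const std::uint64_t save_mask = dst_mask;
  for (const OperandSketch &op : ops) {
    // Restore the original mask for each operand.  Without this, an
    // adjustment made for an earlier SUBREG leaks into later operands.
    dst_mask = save_mask;
    if (op.is_subreg)
      dst_mask <<= op.subreg_lsb;
    out.push_back(dst_mask);
  }
  return out;
}
```

For a destination mask of 0xff and a first operand that is a SUBREG at bit 8, the second (plain) operand sees 0xff again rather than the shifted 0xff00.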
This was seen when testing a larger set of patches on the rl78 port
(builtin-arith-overflow-p-7 & pr71631 execution failures), so no new test for
this bugfix.
Run through my tester (in conjunction with other ext-dce changes) on the
various cross targets. Run individually through a bootstrap and regression
test cycle on x86_64 as well.
Pushing to the trunk.
PR rtl-optimization/115877
gcc/
* ext-dce.cc (ext_dce_process_uses): Restore the value of DST_MASK
for each operand.
|
|
Originally added in r0-44646-g204250d2fcd084 and r0-44627-gfd350d241fecf6 which
moved -fno-common from all builds to just checking builds.
Since r10-4867-g6271dd984d7f92, GCC defaults to -fno-common. There's no need
to pass it specially for checking builds.
We could keep it for older bootstrap compilers with checking but I don't see
much value in that, it was already just a bonus before.
gcc/ChangeLog:
* Makefile.in (NOCOMMON_FLAG): Delete.
(GCC_WARN_CFLAGS): Drop NOCOMMON_FLAG.
(GCC_WARN_CXXFLAGS): Drop NOCOMMON_FLAG.
* configure.ac: Ditto.
* configure: Regenerate.
gcc/d/ChangeLog:
* Make-lang.in (WARN_DFLAGS): Drop NOCOMMON_FLAG.
|
|
|
|
|
|
|
|
I've also confirmed on the CSiBE set that the secondary combine pass is
actually beneficial on SH. It does result in some code size reductions.
gcc/ChangeLog:
* config/sh/sh.md (mov_neg_si_t): Allow insn and split after
register allocation.
(*treg_noop_move): New insn.
|
|
|
|
Require a bitint target large enough.
gcc/testsuite/
* gcc.dg/pr116003.c: Require bitint575 target.
|
|
This reverts commit 56f824cc206ff00d466aaeb11211d8005c4668bc.
|
|
This reverts commit 37c4703ce84722b9c24db3e8e6d57ab6d3a7b5eb.
|
|
This reverts commit 7db47f7b915c5f5d645fa536547e26b92290afe3.
|
|
This reverts commit 59dd1d7ab21ad9a7ebf641ec9aeea609c003ad2f.
|
|
Translate DW_TAG_subprogram DIEs into CodeView LF_FUNC_ID types and
S_GPROC32_ID / S_LPROC32_ID symbols. ld will then transform these into
S_GPROC32 / S_LPROC32 symbols, which map addresses to unmangled function
names.
gcc/
* dwarf2codeview.cc (enum cv_sym_type): Add new values.
(struct codeview_symbol): Add function to union.
(struct codeview_custom_type): Add lf_func_id to union.
(write_function): New function.
(write_codeview_symbols): Call write_function.
(write_lf_func_id): New function.
(write_custom_types): Call write_lf_func_id.
(add_function): New function.
(codeview_debug_early_finish): Call add_function.
|
|
Testcase should only be for bitint targets
gcc/testsuite/
* gcc.dg/pr116003.c : Add target bitint.
|
|
gcc:
* doc/invoke.texi (Spec Files): Remove documentation of obsolete
spec strings "predefines" and "signed_char".
|
|
The inner loop in build_option_suggestions uses OPTION to take the
address of OPTB and use it across iterations, which is undefined
behaviour since OPTB is defined within the loop. Pull it outside the
loop to make this defined.
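The lifetime issue can be illustrated with a reduced example. This is not the code from opt-suggestions.cc; OptionSketch and expand_sketch are hypothetical names, and the loop body is invented to show only why the object that a pointer refers to across iterations must be declared outside the loop.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an option record.
struct OptionSketch {
  std::string text;
};

// For each name, emit the name and then a derived "no-" variant.
std::vector<std::string>
expand_sketch(const std::vector<std::string> &names) {
  std::vector<std::string> out;
  for (const std::string &name : names) {
    OptionSketch base{name};
    const OptionSketch *option = &base;
    // Declared outside the inner loop: `option` may point at it on a
    // later iteration.  Declaring it inside the inner loop would leave
    // `option` dangling once that iteration ends.
    OptionSketch optb;
    for (int pass = 0; pass < 2; ++pass) {
      out.push_back(option->text);
      if (pass == 0) {
        optb.text = "no-" + base.text;
        option = &optb;  // still valid on the next iteration
      }
    }
  }
  return out;
}
```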
gcc/ChangeLog:
* opt-suggestions.cc
(option_proposer::build_option_suggestions): Pull OPTB
definition out of the innermost loop.
|
|
gcc/ChangeLog:
PR c/83324
* doc/extend.texi: Document [[musttail]]
|
|
Some adopted from the existing C musttail plugin tests.
Also extends the ability to query the sibcall capabilities of the
target.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp:
(check_effective_target_struct_tail_call): New function.
* c-c++-common/musttail1.c: New test.
* c-c++-common/musttail12.c: New test.
* c-c++-common/musttail13.c: New test.
* c-c++-common/musttail2.c: New test.
* c-c++-common/musttail3.c: New test.
* c-c++-common/musttail4.c: New test.
* c-c++-common/musttail5.c: New test.
* c-c++-common/musttail7.c: New test.
* c-c++-common/musttail8.c: New test.
* g++.dg/musttail10.C: New test.
* g++.dg/musttail11.C: New test.
* g++.dg/musttail6.C: New test.
* g++.dg/musttail9.C: New test.
|
|
Implement a C23 clang compatible musttail attribute similar to the earlier
C++ implementation in the C parser.
gcc/c/ChangeLog:
PR c/83324
* c-parser.cc (struct attr_state): Define with musttail_p.
(c_parser_statement_after_labels): Handle [[musttail]].
(c_parser_std_attribute): Ditto.
(c_parser_handle_musttail): Ditto.
(c_parser_compound_statement_nostart): Ditto.
(c_parser_all_labels): Ditto.
(c_parser_statement): Ditto.
* c-tree.h (c_finish_return): Add musttail_p flag.
* c-typeck.cc (c_finish_return): Handle musttail_p flag.
|
|
This patch implements a clang compatible [[musttail]] attribute for
returns.
musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.
It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.
This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.
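A small usage sketch of the attribute on return statements follows. The function names and the MUSTTAIL_SKETCH macro are made up for illustration, and the guard assumes the compiler reports support through __has_cpp_attribute; where it does not, the macro expands to nothing and the calls are ordinary calls.

```cpp
// Use the attribute only where the compiler reports support for it.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(gnu::musttail)
#    define MUSTTAIL_SKETCH [[gnu::musttail]]
#  endif
#endif
#ifndef MUSTTAIL_SKETCH
#  define MUSTTAIL_SKETCH
#endif

bool is_odd_sketch(unsigned n);

// Two small functions that call each other, in the style of an
// interpreter split into per-operation handlers.  Each return is a
// direct call with a matching signature.
bool is_even_sketch(unsigned n) {
  if (n == 0)
    return true;
  MUSTTAIL_SKETCH return is_odd_sketch(n - 1);
}

bool is_odd_sketch(unsigned n) {
  if (n == 0)
    return false;
  MUSTTAIL_SKETCH return is_even_sketch(n - 1);
}
```

When the attribute is in effect, the mutual recursion is meant to run in constant stack space, and the compiler diagnoses the return if it cannot be made a tail call.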
For compatibility it also detects clang::musttail.
Passes bootstrap and full test.
gcc/c-family/ChangeLog:
* c-attribs.cc (set_musttail_on_return): New function.
* c-common.h (set_musttail_on_return): Declare new function.
gcc/cp/ChangeLog:
PR c/83324
* cp-tree.h (AGGR_INIT_EXPR_MUST_TAIL): Add.
* parser.cc (cp_parser_statement): Handle musttail.
(cp_parser_jump_statement): Ditto.
* pt.cc (tsubst_expr): Copy CALL_EXPR_MUST_TAIL_CALL.
* semantics.cc (simplify_aggr_init_expr): Handle musttail.
|
|
The actual handling is directly in the parser since the
generic mechanism doesn't support statement attributes,
but this gives basic error checking/detection on the attribute.
gcc/c-family/ChangeLog:
PR c/83324
* c-attribs.cc (handle_musttail_attribute): Add.
* c-common.h (handle_musttail_attribute): Add.
|
|
gcc/ChangeLog:
* config/loongarch/loongarch-protos.h
(loongarch_split_128bit_move): Delete.
(loongarch_split_128bit_move_p): Delete.
(loongarch_split_256bit_move): Delete.
(loongarch_split_256bit_move_p): Delete.
(loongarch_split_vector_move): Add a function declaration.
* config/loongarch/loongarch.cc
(loongarch_vector_costs::finish_cost): Adjust the code
formatting.
(loongarch_split_vector_move_p): Merge
loongarch_split_128bit_move_p and loongarch_split_256bit_move_p.
(loongarch_split_move_p): Merge code.
(loongarch_split_move): Likewise.
(loongarch_split_128bit_move_p): Delete.
(loongarch_split_256bit_move_p): Delete.
(loongarch_split_128bit_move): Delete.
(loongarch_split_vector_move): Merge loongarch_split_128bit_move
and loongarch_split_256bit_move.
(loongarch_split_256bit_move): Delete.
(loongarch_global_init): Remove the extra semicolon at the
end of the function.
* config/loongarch/loongarch.md (*movdf_softfloat): Added a new
condition TARGET_64BIT.
|
|
|
|
Check for an SSA_NAME not in the CFG before trying to create an
equivalence record in the definition block.
PR tree-optimization/116003
gcc/
* value-relation.cc (equiv_oracle::register_initial_def): Check
if SSA_NAME is in the IL before registering.
gcc/testsuite/
* gcc.dg/pr116003.c: New.
|
|
Since Subversion r201359 (Git commit a167b052dfe9a8509bb23c374ffaeee953df0917)
"Introduce gen-pass-instances.awk and pass-instances.def", the usage comment at
the top of 'gcc/passes.def' no longer is accurate (even if that latter file
does continue to use the 'NEXT_PASS' form without 'NUM') -- and, worse, the
'NEXT_PASS' etc. in that usage comment are processed by the
'gcc/gen-pass-instances.awk' script:
--- source-gcc/gcc/passes.def 2024-06-24 18:55:15.132561641 +0200
+++ build-gcc/gcc/pass-instances.def 2024-06-24 18:55:27.768562714 +0200
[...]
@@ -20,546 +22,578 @@
/*
Macros that should be defined when using this file:
INSERT_PASSES_AFTER (PASS)
PUSH_INSERT_PASSES_WITHIN (PASS)
POP_INSERT_PASSES ()
- NEXT_PASS (PASS)
+ NEXT_PASS (PASS, 1)
TERMINATE_PASS_LIST (PASS)
*/
[...]
(That is, this is 'NEXT_PASS' for the first instance of pass 'PASS'.)
That's benign so far, but with another thing that I'll be extending, I'd
then run into an error while the script handles this comment block. ;-\
gcc/
* passes.def: Rewrite usage comment at the top.
|
|
Previously we built vector boolean constants using 1 for true
elements and 0 for false elements. This matches the predicates
produced by SVE's PTRUE instruction, but leads to a miscompilation
on AVX, where all bits of a boolean element should be set.
One option for RTL would be to make this target-configurable.
But that isn't really possible at the tree level, where vectors
should work in a more target-independent way. (There is currently
no way to create a "generic" packed boolean vector, but never say
never :)) And, if we were going to pick a generic behaviour,
it would make sense to use 0/-1 rather than 0/1, for consistency
with integer vectors.
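The 0/-1 convention can be sketched as a stand-alone encoder. This is not the native_encode_vector_part code; the function name and the packing (element I occupying bits I*bits_per_elt upwards, least significant bit first within each byte) are assumptions made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode a vector of booleans into a byte image.  A true element sets
// all BITS_PER_ELT bits of its slot (the 0/-1 convention) rather than
// only the lowest bit (the 0/1 convention).
std::vector<std::uint8_t>
encode_bool_vector_sketch(const std::vector<bool> &elts,
                          unsigned bits_per_elt) {
  std::vector<std::uint8_t> image((elts.size() * bits_per_elt + 7) / 8, 0);
  for (std::size_t i = 0; i < elts.size(); ++i) {
    if (!elts[i])
      continue;
    for (unsigned b = 0; b < bits_per_elt; ++b) {
      std::size_t bit = i * bits_per_elt + b;
      image[bit / 8] |= static_cast<std::uint8_t>(1u << (bit % 8));
    }
  }
  return image;
}
```

With one bit per element the two conventions coincide, which matches the observation that predicates expressed as vectors of single bits are unaffected by the choice.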
Both behaviours should work with SVE on read, since SVE ignores
the upper bits in each predicate element. And the choice shouldn't
make much difference for RTL, since all SVE predicate modes are
expressed as vectors of BI, rather than of multi-bit booleans.
I suspect there might be some fallout from this change on SVE.
But I think we should at least give it a go, and see whether any
fallout provides a strong counterargument against the approach.
gcc/
PR middle-end/115406
* fold-const.cc (native_encode_vector_part): For vector booleans,
check whether an element is nonzero and, if so, set all of the
corresponding bits in the target image.
* simplify-rtx.cc (native_encode_rtx): Likewise.
gcc/testsuite/
PR middle-end/115406
* gcc.dg/torture/pr115406.c: New test.
|
|
These tests used to generate:
bl swap
ldr r2, [sp, #4]
mov r0, r2 @ __fp16
but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
load directly into r0:
bl swap
ldrh r0, [sp, #4] @ __fp16
This patch updates the tests to "defend" this change.
While there, the scans include:
mov\tr1, r[03]}
But if the spill of r2 occurs first, there's no real reason why
r2 couldn't be used as the temporary, instead of r3.
The patch tries to update the scans while preserving the spirit
of the originals.
gcc/testsuite/
* gcc.target/arm/fp16-aapcs-2.c: Expect the return value to be
loaded directly from the stack. Test that the swap generates
two moves out of r0/r1 and two moves in.
* gcc.target/arm/fp16-aapcs-4.c: Likewise.
|
|
The code path for rejecting an object-less call to a non-static member
function should also consider xobj member functions (so that we correctly
reject the below calls with a "cannot call member function without object"
diagnostic).
PR c++/115783
gcc/cp/ChangeLog:
* call.cc (build_new_method_call): Generalize METHOD_TYPE
check to DECL_OBJECT_MEMBER_FUNCTION_P.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/explicit-obj-diagnostics11.C: New test.
Reviewed-by: Jason Merrill <jason@redhat.com>
|
|
gcc/
* config/avr/builtins.def (MASK1): New DEF_BUILTIN.
* config/avr/avr.cc (avr_rtx_costs_1): Handle rtx costs for
expressions like __builtin_avr_mask1.
(avr_init_builtins) <uintQI_ftype_uintQI_uintQI>: New tree type.
(avr_expand_builtin) [AVR_BUILTIN_MASK1]: Diagnose unexpected forms.
(avr_fold_builtin) [AVR_BUILTIN_MASK1]: Handle case.
* config/avr/avr.md (gen_mask1): New expand helper.
(mask1_0x01_split, mask1_0x80_split, mask1_0xfe_split): New
insn-and-split.
(*mask1_0x01, *mask1_0x80, *mask1_0xfe): New insns.
* doc/extend.texi (AVR Built-in Functions) <__builtin_avr_mask1>:
Document new built-in function.
gcc/testsuite/
* gcc.target/avr/torture/builtin-mask1.c: New test.
|
|
gcc/fortran/ChangeLog:
PR fortran/103115
* trans-array.cc (gfc_trans_array_constructor_value): If the first
element of an array constructor is deferred-length character and
therefore does not have an element size known at compile time, do
not try to collect subsequent constant elements into a constructor
for optimization.
gcc/testsuite/ChangeLog:
PR fortran/103115
* gfortran.dg/string_array_constructor_4.f90: New test.
|
|
[PR114759,PR115988]
2024-07-18 Peter Bergner <bergner@linux.ibm.com>
gcc/testsuite/
PR target/114759
PR target/115988
* gcc.target/powerpc/pr114759-3.c: Catch unsupported ABI errors.
|
|
Seems to be fixed by r15-521-g6ad7ca1bb90573.
PR c++/109464
gcc/testsuite/ChangeLog:
* g++.dg/template/explicit-instantiation8.C: New test.
|
|
Both xchg and cmpxchg instructions, in the pseudo-C dialect, do not
expect their memory address operand to be surrounded by parentheses.
For example, it should be output as "w0 =cmpxchg32_32(r8+8,w0,w2)"
instead of "w0 =cmpxchg32_32((r8+8),w0,w2)".
This patch implements an operand modifier 'M' which marks the
instruction templates that do not expect the parentheses, and adds it to
the xchg and cmpxchg templates.
gcc/ChangeLog:
* config/bpf/atomic.md (atomic_compare_and_swap,
atomic_exchange): Add operand modifier %M to the first
operand.
* config/bpf/bpf.cc (no_parentheses_mem_operand): Create
variable.
(bpf_print_operand): Set no_parentheses_mem_operand variable if
%M operand is used.
(bpf_print_operand_address): Conditionally output parentheses.
gcc/testsuite/ChangeLog:
* gcc.target/bpf/pseudoc-atomic-memaddr-op.c: Add test.
|
|
When working on the #embed optimization support, I recently went through
all of reshape_init_r* and today I read in detail all the P3106R1 changes
and I believe we have implemented it that way for years.
To double check that, I've added tests with the current [dcl.init.aggr]
examples but tested in all the languages from C++98 to C++26, of course
guarded as needed for constructs which require newer versions of C++.
The examples come in two tests, one is a runtime test for the non-erroneous
examples, the other is a compile time test for the diagnostics.
The former one includes mostly intact examples with runtime checking (both
to test what is written in the section exactly and to test at least
something with C++98) and then when useful also adds constexpr tests with
static_asserts for C++11 and later.
Tested on x86_64-linux and i686-linux with
GXX_TESTSUITE_STDS=98,11,14,17,20,23,26 make check-g++ RUNTESTFLAGS='dg.exp=aggr-init*.C'
Also tested on GCC 11 branch with
GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ RUNTESTFLAGS='dg.exp=aggr-init*.C'
where just the " is a GCC extension" part of one error is left out,
otherwise it passes the same, ditto with clang 14 (of course with different
diagnostics, but verified it emits diagnostics on the right lines), so I
believe we can claim implementation of this DR paper, either in all versions
or at least in GCC 11+.
2024-07-19 Jakub Jelinek <jakub@redhat.com>
PR c++/114460
* g++.dg/cpp26/aggr-init1.C: New test.
* g++.dg/cpp26/aggr-init2.C: New test.
|
|
This patch addresses a difference between the hash function and the equality
function for canonical types of template parameters (ctp_hasher). The equality
function uses comptypes (typeck.cc) (with COMPARE_STRUCTURAL) and checks
constraint equality for two auto nodes (typeck.cc:1586), while the hash
function ignores it (pt.cc:4528). This leads to hash collisions that can be
avoided by using `hash_placeholder_constraint` (constraint.cc:1150).
Note that due to the proper handling of hash collisions (hash-table.h:1059),
there is no test case that can distinguish the current implementation from the
proposed one.
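As a rough illustration of the point above, the sketch below uses a hypothetical `Param` type and two hypothetical hashers (`CoarseHash`, `FineHash`); none of these names come from GCC. It shows that a hash which ignores a field that equality inspects is still correct, because the container falls back to equality on collision, but it forces avoidable collisions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for the canonical type of a template parameter.
struct Param {
  int level;
  int index;
  std::string constraint;  // empty means unconstrained

  // Equality looks at the constraint as well.
  bool operator==(const Param &o) const {
    return level == o.level && index == o.index && constraint == o.constraint;
  }
};

// Hash that ignores the constraint: valid, but entries that differ only
// in their constraint always land in the same bucket.
struct CoarseHash {
  std::size_t operator()(const Param &p) const {
    return std::hash<int>{}(p.level) * 31u + std::hash<int>{}(p.index);
  }
};

// Hash that also mixes in the constraint, so it covers everything that
// operator== compares.
struct FineHash {
  std::size_t operator()(const Param &p) const {
    std::size_t h = CoarseHash{}(p);
    return h * 31u + std::hash<std::string>{}(p.constraint);
  }
};

// Insert both values and report how many distinct entries the set holds.
template <typename Hash>
std::size_t count_distinct(const Param &a, const Param &b) {
  std::unordered_set<Param, Hash> s;
  s.insert(a);
  s.insert(b);
  return s.size();
}
```

Both hashers give the same set contents for any pair of inputs; only the number of equality comparisons needed to get there differs, which is why no test case can tell the two implementations apart.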
* constraint.cc (hash_placeholder_constraint): Rename to
iterative_hash_placeholder_constraint.
(iterative_hash_placeholder_constraint): Rename from
hash_placeholder_constraint and add the initial val argument.
* cp-tree.h (hash_placeholder_constraint): Rename to
iterative_hash_placeholder_constraint.
(iterative_hash_placeholder_constraint): Renamed from
hash_placeholder_constraint and add the initial val argument.
* pt.cc (struct ctp_hasher): Update to use
iterative_hash_placeholder_constraint in the case of a valid placeholder
constraint.
(auto_hash::hash): Reflect the renaming of hash_placeholder_constraint to
iterative_hash_placeholder_constraint.
|
|
SAT_TRUNC form 2 matches the following pattern.
From:
_18 = MIN_EXPR <left_8, 4294967295>;
iftmp.0_11 = (unsigned int) _18;
To:
_18 = MIN_EXPR <left_8, 4294967295>;
iftmp.0_11 = .SAT_TRUNC (left_8);
But if there is another use of _18 like below, the transform to
.SAT_TRUNC may bring no benefit. For example:
From:
_18 = MIN_EXPR <left_8, 4294967295>; // op_0 def
iftmp.0_11 = (unsigned int) _18; // op_0
stream.avail_out = iftmp.0_11;
left_37 = left_8 - _18; // op_0 use
To:
_18 = MIN_EXPR <left_8, 4294967295>; // op_0 def
iftmp.0_11 = .SAT_TRUNC (left_8);
stream.avail_out = iftmp.0_11;
left_37 = left_8 - _18; // op_0 use
Pattern recognition to .SAT_TRUNC cannot eliminate the MIN_EXPR in the case
above. The backend (for example x86/riscv) will then emit 2-3 additional
insns after pattern recognition, on top of the MIN_EXPR. Thus, keeping the
normal truncation as is should be the better choice.
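For illustration, source code of roughly the following shape could give rise to the two GIMPLE sequences above. This is a sketch only: the function names are hypothetical, and whether a given compiler version actually forms .SAT_TRUNC for either function is an assumption, not something verified here.

```cpp
#include <cassert>
#include <cstdint>

// The clamped value has a single use (the truncating conversion), so
// replacing the conversion with a saturating truncation lets the
// minimum computation go away entirely.
std::uint32_t clamp_only(std::uint64_t left) {
  std::uint64_t m = left < 0xffffffffu ? left : 0xffffffffu;
  return static_cast<std::uint32_t>(m);
}

// The clamped value is used a second time in the subtraction, so the
// minimum has to be computed regardless. A saturating truncation would
// be extra work here rather than a replacement.
std::uint64_t clamp_and_subtract(std::uint64_t left,
                                 std::uint32_t *avail_out) {
  std::uint64_t m = left < 0xffffffffu ? left : 0xffffffffu;
  *avail_out = static_cast<std::uint32_t>(m);
  return left - m;
}
```

The second function mirrors the `stream.avail_out` / `left_37` example: the result is the same either way, only the instruction count differs.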
The following test suites pass with this patch:
1. The rv64gcv full regression tests.
2. The x86 bootstrap tests.
3. The x86 full regression tests.
PR target/115863
gcc/ChangeLog:
* match.pd: Add single_use check for .SAT_TRUNC form 2.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr115863-1.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
|
|
is_trivial was introduced in
<https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2230.html>
which split POD into is_trivial and is_standard_layout.
Later came CWG 1363. Since
struct A {
A() = default;
A(int = 42) {}
};
cannot be default-initialized, it should not be trivial, so the definition
of what is a trivial class changed.
Similarly, CWG 1496 concluded that
struct B {
B() = delete;
};
should not be trivial either.
P0848 adjusted the definition further to say "eligible". That means
that
template<typename T>
struct C {
C() requires false = default;
};
should not be trivial, either, since C::C() is not eligible.
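The first two cases can be checked with standard type traits. The sketch below is illustrative only: it reuses `A` and `B` from above and adds a hypothetical `P` for contrast. It asserts only default-constructibility, which does not depend on this change; what `__is_trivial` reports for `A` and `B` depends on whether the compiler includes this fix, so that is deliberately not asserted.

```cpp
#include <cassert>
#include <type_traits>

// CWG 1363 example: two constructors are viable for a call with no
// arguments, so default-initialization is ambiguous.
struct A {
  A() = default;
  A(int = 42) {}
};

// CWG 1496 example: the only default constructor is deleted.
struct B {
  B() = delete;
};

// Hypothetical plain type for contrast: default-initialization works
// and the default constructor is trivial.
struct P {
  int x;
};

static_assert(!std::is_default_constructible<A>::value,
              "A() is ambiguous");
static_assert(!std::is_default_constructible<B>::value,
              "B() is deleted");
static_assert(std::is_trivially_default_constructible<P>::value,
              "P has a trivial default constructor");
```

Neither `A` nor `B` can be default-initialized, which is the property the CWG resolutions tie to triviality.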
Bug 85723 reports that we implement none of the CWGs.
I chose to fix this by using type_has_non_deleted_trivial_default_ctor
which uses locate_ctor which uses build_new_method_call, which would
be used by default-initialization as well. With that, all __is_trivial
problems I could find in the Bugzilla are fixed, except for PR96288,
which may need changes to trivially-copyable, so I'm not messing with
that now.
I hope this has no ABI implications. There's an effort underway to
remove "trivial class" from the core language as it's not really
meaningful. So the impact of this change should be pretty low except
to fix a few libstdc++ problems.
PR c++/108769
PR c++/58074
PR c++/115522
PR c++/85723
gcc/cp/ChangeLog:
* class.cc (type_has_non_deleted_trivial_default_ctor): Fix formatting.
* tree.cc (trivial_type_p): Instead of TYPE_HAS_TRIVIAL_DFLT, use
type_has_non_deleted_trivial_default_ctor.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wclass-memaccess.C: Add dg-warning.
* g++.dg/ext/is_trivial1.C: New test.
* g++.dg/ext/is_trivial2.C: New test.
* g++.dg/ext/is_trivial3.C: New test.
* g++.dg/ext/is_trivial4.C: New test.
* g++.dg/ext/is_trivial5.C: New test.
* g++.dg/ext/is_trivial6.C: New test.
|
|
There are various non-IBM CPUs with altivec, so we cannot use that
flag to determine which .machine cpu to use, so ignore it.
Emit an additional ".machine altivec" if Altivec is enabled so
that the assembler doesn't require an explicit -maltivec option
to assemble any Altivec instructions for those targets where
the ".machine cpu" is insufficient to enable Altivec. For example,
-mcpu=G5 emits a ".machine power4".
2024-07-18 René Rebe <rene@exactcode.de>
Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/97367
* config/rs6000/rs6000.cc (rs6000_machine_from_flags): Do not consider
OPTION_MASK_ALTIVEC.
(emit_asm_machine): For Altivec compiles, emit a ".machine altivec".
gcc/testsuite/
PR target/97367
* gcc.target/powerpc/pr97367.c: New test.
Signed-off-by: René Rebe <rene@exactcode.de>
|