|
Fix an issue in which "vectors" of duplicate entries placed in scalar
registers caused the following 63 registers to be marked live, for the
purpose of prologue generation, which resulted in stack corruption.
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_class_max_nregs): Handle vectors in SGPRs.
(move_callee_saved_registers): Detect the bug condition early.
|
|
Using just a move insn for no-op conversions triggers special move handling in
IRA, which declares that subregs of vectors aren't valid and routes everything
through memory. These patterns make the vec_select explicit and all is well.
gcc/ChangeLog:
* config/gcn/gcn-protos.h (gcn_stepped_zero_int_parallel_p): New.
* config/gcn/gcn-valu.md (V_1REG_ALT): New.
(V_2REG_ALT): New.
(vec_extract<V_1REG:mode><V_1REG_ALT:mode>_nop): New.
(vec_extract<V_2REG:mode><V_2REG_ALT:mode>_nop): New.
(vec_extract<V_ALL:mode><V_ALL_ALT:mode>): Use new patterns.
* config/gcn/gcn.cc (gcn_stepped_zero_int_parallel_p): New.
* config/gcn/predicates.md (ascending_zero_int_parallel): New.
|
|
The following testcase ICEs on aarch64-linux, because
expand_vector_condition attempts to piecewise lower SVE
d_3 = a_1(D) < b_2(D);
_5 = VEC_COND_EXPR <d_3, c_4(D), d_3>;
which isn't possible - nunits_for_known_piecewise_op ICEs but
the rest of the code assumes constant number of elements too.
expand_vector_condition attempts to find if a (rhs1) is a SSA_NAME
for comparison and calls expand_vec_cond_expr_p (type, TREE_TYPE (a1), code)
where a1 is one of the operands of the comparison and code is the comparison
code. That one indeed isn't supported here, but what aarch64 SVE supports
are the individual statements, comparison (expand_vec_cmp_expr_p) and
expand_vec_cond_expr_p (type, TREE_TYPE (a), SSA_NAME), the latter because
that function starts with
if (VECTOR_BOOLEAN_TYPE_P (cmp_op_type)
&& get_vcond_mask_icode (TYPE_MODE (value_type),
TYPE_MODE (cmp_op_type)) != CODE_FOR_nothing)
return true;
In an earlier version of the patch (in the PR), we did this
if (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (a))
&& expand_vec_cond_expr_p (type, TREE_TYPE (a), ERROR_MARK))
return true;
before the code == SSA_NAME handling plus some further tweaks later.
While that fixed the ICE, it broke quite a few tests on x86 and some on
aarch64 too. The problem is that expand_vector_comparison doesn't lower
comparisons which aren't supported and only feed VEC_COND_EXPR first operand
and expand_vector_condition succeeds for those, so with the above mentioned
change we'd verify the VEC_COND_EXPR is implementable using optab alone,
but nothing would verify the tcc_comparison which relied on
expand_vector_condition to verify.
So, the following patch instead queries whether optabs can handle the
comparison and VEC_COND_EXPR together (if a (rhs1) is a comparison;
otherwise as before it checks only the VEC_COND_EXPR) and if that fails,
also checks whether the two operations could be supported individually
and only if even that fails does the piecewise lowering.
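The order of checks described above can be sketched as a small standalone
model. This is only an illustration, not the code in tree-vect-generic.cc:
the TargetSupport struct, the Lowering enum and choose_lowering are
hypothetical names, and the model covers only the case where a (rhs1) is a
comparison.

```cpp
#include <cassert>

// Hypothetical model of the three-stage decision; not GCC code.
enum class Lowering { Combined, Individual, Piecewise };

// What the target is assumed to support for a given vector type.
struct TargetSupport {
  bool combined_cmp_and_cond;  // comparison + VEC_COND_EXPR handled together
  bool vec_cmp;                // the comparison on its own
  bool vcond_mask;             // VEC_COND_EXPR selecting on a boolean vector
};

// Try the combined form first, then the two statements individually,
// and fall back to piecewise lowering only if both fail.
Lowering choose_lowering(const TargetSupport &t) {
  if (t.combined_cmp_and_cond)
    return Lowering::Combined;
  if (t.vec_cmp && t.vcond_mask)
    return Lowering::Individual;
  return Lowering::Piecewise;
}
```

In this model the SVE situation from the PR corresponds to the second case:
no combined support, but both statements supported individually, so no
piecewise lowering is attempted.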
2023-03-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109176
* tree-vect-generic.cc (expand_vector_condition): If a has
vector boolean type and is a comparison, also check if both
the comparison and VEC_COND_EXPR could be successfully expanded
individually.
* gcc.target/aarch64/sve/pr109176.c: New test.
|
|
Fix the RVV bool mode size bug by adjusting the mode size.
Besides the mode precision (aka bit size [1, 2, 4, 8, 16, 32, 64])
of the vbool*_t, the mode size (aka byte size) will be adjusted to
[1, 1, 1, 1, 2, 4, 8] according to the RVV spec 1.0 ISA. The
adjustment will provide correct information for the underlying
redundant instruction elimination.
Given the below sample code:
{
vbool1_t v1 = *(vbool1_t*)in;
vbool64_t v2 = *(vbool64_t*)in;
*(vbool1_t*)(out + 100) = v1;
*(vbool64_t*)(out + 200) = v2;
}
Before the size adjustment:
csrr t0,vlenb
slli t1,t0,1
csrr a3,vlenb
sub sp,sp,t1
slli a4,a3,1
add a4,a4,sp
addi a2,a1,100
vsetvli a5,zero,e8,m8,ta,ma
sub a3,a4,a3
vlm.v v24,0(a0)
vsm.v v24,0(a2)
vsm.v v24,0(a3)
addi a1,a1,200
csrr t0,vlenb
vsetvli a4,zero,e8,mf8,ta,ma
slli t1,t0,1
vlm.v v24,0(a3)
vsm.v v24,0(a1)
add sp,sp,t1
jr ra
After the size adjustment:
addi a3,a1,100
vsetvli a4,zero,e8,m8,ta,ma
addi a1,a1,200
vlm.v v24,0(a0)
vsm.v v24,0(a3)
vsetvli a5,zero,e8,mf8,ta,ma
vlm.v v24,0(a0)
vsm.v v24,0(a1)
ret
Note that the size adjustment does not cover all possible combinations
of vbool*_t code patterns like the one above. We will look into the
remaining cases in follow-up patches.
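As a rough illustration of the precision-to-size mapping quoted above, the
listed byte sizes follow from rounding the bit precision up to whole bytes.
The helper name mask_bytesize is made up for this sketch; it is not the body
of riscv_v_adjust_bytesize, which this message does not show.

```cpp
#include <cassert>

// Illustrative only: round a mask-mode precision in bits up to whole bytes.
// Reproduces [1, 2, 4, 8, 16, 32, 64] -> [1, 1, 1, 1, 2, 4, 8].
unsigned mask_bytesize(unsigned precision_bits) {
  assert(precision_bits > 0);      // a mode has at least one bit
  return (precision_bits + 7) / 8; // ceiling division by 8
}
```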
PR 108185
PR 108654
gcc/ChangeLog:
PR target/108654
PR target/108185
* config/riscv/riscv-modes.def (ADJUST_BYTESIZE): Adjust size
for vector mask modes.
* config/riscv/riscv.cc (riscv_v_adjust_bytesize): New.
* config/riscv/riscv.h (riscv_v_adjust_bytesize): New.
gcc/testsuite/ChangeLog:
PR target/108654
PR target/108185
* gcc.target/riscv/rvv/base/pr108185-1.c: Update.
* gcc.target/riscv/rvv/base/pr108185-2.c: Ditto.
* gcc.target/riscv/rvv/base/pr108185-3.c: Ditto.
Signed-off-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
|
|
The arch 'rv32imac' will not be created when executing
'./multilib-generator rv32imc-ilp32--a'
The output is:
MULTILIB_OPTIONS = march=rv32imc mabi=ilp32
MULTILIB_DIRNAMES = rv32imc ilp32
MULTILIB_REQUIRED = march=rv32imc/mabi=ilp32
MULTILIB_REUSE =
Analysis: the alts ['rv32imc', 'rv32imac'] are changed to
['rv32imac', 'rv32imc'] by the unique(alts) call.
This is wrong; the order of alts should not be changed.
This patch fixes it.
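The reordering can be reproduced with a small standalone sketch. It is
written in C++ rather than the Python of multilib-generator, and it assumes
that the deduplication went through a container that does not keep insertion
order; unique_sorted and unique_stable are hypothetical names, and the actual
change in the script may look different.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Deduplicate through an ordered set: duplicates go away, but so does
// the original order of the entries.
std::vector<std::string> unique_sorted(const std::vector<std::string> &alts) {
  std::set<std::string> s(alts.begin(), alts.end());
  return std::vector<std::string>(s.begin(), s.end());
}

// Deduplicate while keeping the first occurrence of each entry in place.
std::vector<std::string> unique_stable(const std::vector<std::string> &alts) {
  std::set<std::string> seen;
  std::vector<std::string> out;
  for (const auto &a : alts)
    if (seen.insert(a).second)  // true only the first time a is seen
      out.push_back(a);
  return out;
}
```

With the input from the analysis, unique_sorted puts rv32imac first, while
unique_stable leaves rv32imc as the first (primary) arch.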
gcc/ChangeLog:
* config/riscv/multilib-generator: Adjusting the loop of 'alt' in 'alts'.
Signed-off-by: Songhe Zhu <zhusonghe@eswincomputing.com>
|
|
In this testcase, the tree walk to look for bare parameter packs was
confused by finding a type with no TREE_BINFO. But it should be fine that
it's unset; we already checked for unexpanded packs at parse time.
I also tried doing the partial instantiation of the local class, which is
probably the long-term direction we want to go, but for stage 4 let's go
with this safer change.
PR c++/109241
gcc/cp/ChangeLog:
* pt.cc (find_parameter_packs_r): Handle null TREE_BINFO.
gcc/testsuite/ChangeLog:
* g++.dg/cpp1y/lambda-generic-local-class2.C: New test.
|
|
To decrease memory traffic, we don't use whole-register load/store for
modes with LMUL less than 1 or for mask modes, so those cases require one
extra general-purpose register to set up the VL register. That is not
allowed during LRA, so we define a few special move patterns for use by
LRA, which defer the expansion until after LRA.
gcc/ChangeLog:
PR target/109244
* config/riscv/riscv-protos.h (emit_vlmax_vsetvl): Define as global.
(emit_vlmax_op): Ditto.
* config/riscv/riscv-v.cc (get_sew): New function.
(emit_vlmax_vsetvl): Adapt function.
(emit_pred_op): Ditto.
(emit_vlmax_op): Ditto.
(emit_nonvlmax_op): Ditto.
(legitimize_move): Fix LRA ICE.
(gen_no_side_effects_vsetvl_rtx): Adapt function.
* config/riscv/vector.md (@mov<V_FRACT:mode><P:mode>_lra): New pattern.
(@mov<VB:mode><P:mode>_lra): Ditto.
(*mov<V_FRACT:mode><P:mode>_lra): Ditto.
(*mov<VB:mode><P:mode>_lra): Ditto.
gcc/testsuite/ChangeLog:
PR target/109244
* g++.target/riscv/rvv/base/pr109244.C: New test.
* gcc.target/riscv/rvv/base/binop_vv_constraint-4.c: Adapt testcase.
* gcc.target/riscv/rvv/base/binop_vv_constraint-6.c: Ditto.
* gcc.target/riscv/rvv/base/binop_vx_constraint-127.c: Ditto.
* gcc.target/riscv/rvv/base/spill-1.c: Ditto.
* gcc.target/riscv/rvv/base/spill-2.c: Ditto.
* gcc.target/riscv/rvv/base/spill-3.c: Ditto.
* gcc.target/riscv/rvv/base/spill-5.c: Ditto.
* gcc.target/riscv/rvv/base/spill-7.c: Ditto.
* g++.target/riscv/rvv/base/bug-18.C: New test.
* gcc.target/riscv/rvv/base/merge_constraint-3.c: New test.
* gcc.target/riscv/rvv/base/merge_constraint-4.c: New test.
|
|
__riscv_vlenb is defined in RVV intrinsic spec 0.11 and used in some projects
such as google/highway.
gcc/ChangeLog:
PR target/109228
* config/riscv/riscv-vector-builtins-bases.cc (class vlenb): Add
__riscv_vlenb support.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vlenb): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct vlenb_def): Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins.cc: Ditto.
gcc/testsuite/ChangeLog:
PR target/109228
* gcc.target/riscv/rvv/base/vlenb-1.c: New test.
|
|
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (reg_available_p): Fix bugs.
(pass_vsetvl::compute_local_backward_infos): Fix bugs.
(pass_vsetvl::need_vsetvl): Fix bugs.
(pass_vsetvl::backward_demand_fusion): Fix bugs.
(pass_vsetvl::demand_fusion): Fix bugs.
(eliminate_insn): Fix bugs.
(insert_vsetvl): Ditto.
(pass_vsetvl::emit_local_forward_vsetvls): Ditto.
* config/riscv/riscv-vsetvl.h (enum vsetvl_type): Ditto.
* config/riscv/vector.md: Ditto.
gcc/testsuite/ChangeLog:
* g++.target/riscv/rvv/base/bug-10.C: New test.
* g++.target/riscv/rvv/base/bug-11.C: New test.
* g++.target/riscv/rvv/base/bug-12.C: New test.
* g++.target/riscv/rvv/base/bug-13.C: New test.
* g++.target/riscv/rvv/base/bug-14.C: New test.
* g++.target/riscv/rvv/base/bug-15.C: New test.
* g++.target/riscv/rvv/base/bug-16.C: New test.
* g++.target/riscv/rvv/base/bug-17.C: New test.
* g++.target/riscv/rvv/base/bug-2.C: New test.
* g++.target/riscv/rvv/base/bug-3.C: New test.
* g++.target/riscv/rvv/base/bug-4.C: New test.
* g++.target/riscv/rvv/base/bug-5.C: New test.
* g++.target/riscv/rvv/base/bug-6.C: New test.
* g++.target/riscv/rvv/base/bug-7.C: New test.
* g++.target/riscv/rvv/base/bug-8.C: New test.
* g++.target/riscv/rvv/base/bug-9.C: New test.
Signed-off-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Co-authored-by: kito-cheng <kito.cheng@sifive.com>
|
|
We had wrong RTL patterns, which caused unexpected optimization results.
Take the vnmsub.vx pattern as an example; the operation of vnmsub.vx is
listed below:
vnmsub.vx vd, rs1, vs2, vm # vd[i] = -(x[rs1] * vd[i]) + vs2[i]
But our RTL pattern was written as (x[rs1] * vd[i]) - vs2[i], so when
x[rs1] is the constant 1, GCC simplifies it into a vd[i] - vs2[i]
instruction.
We also revise all ternary instructions to make sure the RTL has the right
semantics.
Here is the mapping between instructions and RTL patterns:
integer:
vnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] (minus op3 (mult op1 op2))
vnmsac.vx vd, rs1, vs2, vm # vd[i] = -(x[rs1] * vs2[i]) + vd[i] (minus op3 (mult op1 op2))
floating-point:
vfmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i] (plus (mult (op1 op2)) op3)
vfmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i] (plus (mult (op1 op2)) op3)
vfnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i] (minus (neg (mult (op1 op2))) op3)
vfnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i] (minus (neg (mult (op1 op2))) op3)
vfmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i] (minus (mult (op1 op2)) op3)
vfmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i] (minus (mult (op1 op2)) op3)
vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] (plus (neg (mult (op1 op2))) op3)
vfnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i] (plus (neg (mult (op1 op2))) op3)
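To see why the sign matters, here is a scalar model of a single element of
vnmsub.vx. The function names are hypothetical; the two formulas are the
ones quoted above for the instruction and for the old RTL pattern.

```cpp
#include <cassert>

// vnmsub.vx as specified: vd[i] = -(x[rs1] * vd[i]) + vs2[i]
constexpr int vnmsub_spec(int x, int vd, int vs2) {
  return -(x * vd) + vs2;
}

// The old RTL shape: (x[rs1] * vd[i]) - vs2[i]
constexpr int vnmsub_old_rtl(int x, int vd, int vs2) {
  return (x * vd) - vs2;
}

// With x == 1 the old shape folds to vd - vs2, the negation of the
// intended vs2 - vd.
static_assert(vnmsub_spec(1, 5, 3) == 3 - 5, "spec gives vs2 - vd");
static_assert(vnmsub_old_rtl(1, 5, 3) == 5 - 3, "old RTL gives vd - vs2");
```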
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc: Fix ternary bug.
* config/riscv/vector-iterators.md (nmsac): Ditto.
(nmsub): Ditto.
(msac): Ditto.
(msub): Ditto.
(nmadd): Ditto.
(nmacc): Ditto.
* config/riscv/vector.md (@pred_mul_<optab><mode>): Ditto.
(@pred_mul_plus<mode>): Ditto.
(*pred_madd<mode>): Ditto.
(*pred_macc<mode>): Ditto.
(*pred_mul_plus<mode>): Ditto.
(@pred_mul_plus<mode>_scalar): Ditto.
(*pred_madd<mode>_scalar): Ditto.
(*pred_macc<mode>_scalar): Ditto.
(*pred_mul_plus<mode>_scalar): Ditto.
(*pred_madd<mode>_extended_scalar): Ditto.
(*pred_macc<mode>_extended_scalar): Ditto.
(*pred_mul_plus<mode>_extended_scalar): Ditto.
(@pred_minus_mul<mode>): Ditto.
(*pred_<madd_nmsub><mode>): Ditto.
(*pred_nmsub<mode>): Ditto.
(*pred_<macc_nmsac><mode>): Ditto.
(*pred_nmsac<mode>): Ditto.
(*pred_mul_<optab><mode>): Ditto.
(*pred_minus_mul<mode>): Ditto.
(@pred_mul_<optab><mode>_scalar): Ditto.
(@pred_minus_mul<mode>_scalar): Ditto.
(*pred_<madd_nmsub><mode>_scalar): Ditto.
(*pred_nmsub<mode>_scalar): Ditto.
(*pred_<macc_nmsac><mode>_scalar): Ditto.
(*pred_nmsac<mode>_scalar): Ditto.
(*pred_mul_<optab><mode>_scalar): Ditto.
(*pred_minus_mul<mode>_scalar): Ditto.
(*pred_<madd_nmsub><mode>_extended_scalar): Ditto.
(*pred_nmsub<mode>_extended_scalar): Ditto.
(*pred_<macc_nmsac><mode>_extended_scalar): Ditto.
(*pred_nmsac<mode>_extended_scalar): Ditto.
(*pred_mul_<optab><mode>_extended_scalar): Ditto.
(*pred_minus_mul<mode>_extended_scalar): Ditto.
(*pred_<madd_msub><mode>): Ditto.
(*pred_<macc_msac><mode>): Ditto.
(*pred_<madd_msub><mode>_scalar): Ditto.
(*pred_<macc_msac><mode>_scalar): Ditto.
(@pred_neg_mul_<optab><mode>): Ditto.
(@pred_mul_neg_<optab><mode>): Ditto.
(*pred_<nmadd_msub><mode>): Ditto.
(*pred_<nmsub_nmadd><mode>): Ditto.
(*pred_<nmacc_msac><mode>): Ditto.
(*pred_<nmsac_nmacc><mode>): Ditto.
(*pred_neg_mul_<optab><mode>): Ditto.
(*pred_mul_neg_<optab><mode>): Ditto.
(@pred_neg_mul_<optab><mode>_scalar): Ditto.
(@pred_mul_neg_<optab><mode>_scalar): Ditto.
(*pred_<nmadd_msub><mode>_scalar): Ditto.
(*pred_<nmsub_nmadd><mode>_scalar): Ditto.
(*pred_<nmacc_msac><mode>_scalar): Ditto.
(*pred_<nmsac_nmacc><mode>_scalar): Ditto.
(*pred_neg_mul_<optab><mode>_scalar): Ditto.
(*pred_mul_neg_<optab><mode>_scalar): Ditto.
(@pred_widen_neg_mul_<optab><mode>): Ditto.
(@pred_widen_mul_neg_<optab><mode>): Ditto.
(@pred_widen_neg_mul_<optab><mode>_scalar): Ditto.
(@pred_widen_mul_neg_<optab><mode>_scalar): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/bug-3.c: New test.
* gcc.target/riscv/rvv/base/bug-4.c: New test.
* gcc.target/riscv/rvv/base/bug-5.c: New test.
Signed-off-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Co-authored-by: kito-cheng <kito.cheng@sifive.com>
|
|
Add target check function to ensure vector extension can be used.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp (check_effective_target_riscv_vector):
New.
|
|
The target hook is only used by i386, and the current definition is
the same as the default gen_reg_rtx.
gcc/ChangeLog:
* builtins.cc (builtin_memset_read_str): Replace
targetm.gen_memset_scratch_rtx with gen_reg_rtx.
(builtin_memset_gen_str): Ditto.
* config/i386/i386-expand.cc
(ix86_convert_const_wide_int_to_broadcast): Replace
ix86_gen_scratch_sse_rtx with gen_reg_rtx.
(ix86_expand_vector_move): Ditto.
* config/i386/i386-protos.h (ix86_gen_scratch_sse_rtx):
Removed.
* config/i386/i386.cc (ix86_gen_scratch_sse_rtx): Removed.
(TARGET_GEN_MEMSET_SCRATCH_RTX): Removed.
* doc/tm.texi: Remove TARGET_GEN_MEMSET_SCRATCH_RTX.
* doc/tm.texi.in: Ditto.
* target.def: Ditto.
|
|
|
|
c-c++-common/diagnostic-format-sarif-file-4.c is a test case for
quoting non-ASCII source code in a SARIF diagnostic log.
The SARIF standard mandates that .sarif files are UTF-8 encoded.
PR testsuite/105959 notes that the test case fails when the system
encoding is not UTF-8, such as when the "make" invocation is prefixed
with LC_ALL=C, whereas it works in a UTF-8 locale.
The root cause is that dg-scan opens the file for reading using the
"system" encoding; I believe it is falling back to treating all files as
effectively ISO 8859-1 in a non-UTF-8 locale.
This patch fixes things by adding a mechanism to dg-scan to allow
callers to (optionally) specify an encoding to use when reading the
file, and updating scan-sarif-file (and the -not variant) to always
use UTF-8 when calling dg-scan, fixing the test case with LC_ALL=C.
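A small sketch of why the encoding matters: the same bytes yield a different
number of characters depending on whether they are read as UTF-8 or as one
byte per character, as in ISO 8859-1. The helper names are made up for
illustration, and this is C++ rather than the Tcl used by dg-scan.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ISO 8859-1 view: every byte is exactly one character.
std::size_t count_as_latin1(const std::string &bytes) {
  return bytes.size();
}

// UTF-8 view, assuming well-formed input: count every byte that is not
// a continuation byte (continuation bytes have the form 10xxxxxx).
std::size_t count_as_utf8(const std::string &bytes) {
  std::size_t n = 0;
  for (unsigned char c : bytes)
    if ((c & 0xC0) != 0x80)
      ++n;
  return n;
}
```

For the two bytes 0xC3 0xA9 (U+00E9 in UTF-8) the first helper sees two
characters and the second sees one, so a pattern written against the
non-ASCII text cannot match when the file is decoded the wrong way.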
gcc/testsuite/ChangeLog:
PR testsuite/105959
* gcc.dg-selftests/dg-final.exp
(dg_final_directive_check_num_args): Update expected maximum
number of args for the various directives using dg-scan.
* lib/scanasm.exp (append_encoding_arg): New procedure.
(dg-scan): Add optional 3rd argument: the encoding to use when
reading from the file.
* lib/scansarif.exp (scan-sarif-file): Treat the file as UTF-8
encoded when reading it.
(scan-sarif-file-not): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Fixes golang/go#59169
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/478176
|
|
fold_convert doesn't work with a dependent argument, and problematically
differed from the corresponding fold+build_nop further down in the
function. So change it to match.
PR c++/108390
gcc/cp/ChangeLog:
* pt.cc (unify): Use fold of build_nop instead of fold_convert.
gcc/testsuite/ChangeLog:
* g++.dg/template/partial-order3.C: New test.
|
|
gcc/fortran/ChangeLog:
PR fortran/104572
* resolve.cc (gfc_resolve_finalizers): Argument of a FINAL subroutine
cannot be an alternate return.
gcc/testsuite/ChangeLog:
PR fortran/104572
* gfortran.dg/pr104572.f90: New test.
Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
|
|
LRA was trying to do live range splitting again and again as there were
not enough regs for asm. This patch solves the problem.
PR target/109137
gcc/ChangeLog:
* lra.cc (lra): Do not repeat inheritance and live range splitting
when asm error is found.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr109137.c: New.
|
|
The driver should default to include the current working directory in the
module search path. This patch adds . to the search path provided
-fm2-pathname has not been specified. The patch also reorders the pim
libraries so that the m2cor directory is searched before m2pim.
Coroutine support is visible by default for both -fpim and -fiso
(from their respective SYSTEM modules).
gcc/m2/ChangeLog:
PR modula2/109248
* Make-lang.in (m2/pge-boot/%.o): Add CFLAGS and CXXFLAGS for C
and C++ compiles.
* gm2spec.cc (add_m2_I_path): Indentation.
(lang_specific_driver): New variable seen_pathname.
Detect -fm2-pathname. If not seen then push_back_Ipath (".").
Change non iso library path to "m2cor,m2log,m2pim,m2iso".
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
Since r7-2549 we were throwing away the explicit C:: when we found that ~C
has an attribute that we treat as making its type dependent.
PR c++/108795
gcc/cp/ChangeLog:
* semantics.cc (finish_id_expression_1): Check scope before
returning id_expression.
gcc/testsuite/ChangeLog:
* g++.dg/ext/attr-tsafe1.C: New test.
|
|
As the PR shows, we currently emit duplicate diagnostics for calls to
functions marked with __attribute__((unavailable)). This patch fixes
that.
gcc/cp/ChangeLog:
PR c++/109177
* call.cc (build_over_call): Use make_temp_override to suppress
both unavailable and deprecated warnings when calling
build_addr_func.
gcc/testsuite/ChangeLog:
PR c++/109177
* g++.dg/ext/pr109177.C: New test.
|
|
[PR109239]
The patch has this effect on my integration tests of -fanalyzer:
Comparison:
GOOD: 129 (17.70% -> 17.92%)
BAD: 600 -> 591 (-9)
which is purely due to improvements to -Wanalyzer-deref-before-check
on the Linux kernel:
-Wanalyzer-deref-before-check:
GOOD: 1 (4.55% -> 7.69%)
BAD: 21 -> 12 (-9)
Known false positives: 16 -> 10 (-6)
linux-5.10.162: 7 -> 1 (-6)
Suspected false positives: 3 -> 0 (-3)
linux-5.10.162: 3 -> 0 (-3)
gcc/analyzer/ChangeLog:
PR analyzer/109239
* program-point.cc: Include "analyzer/inlining-iterator.h".
(program_point::effectively_intraprocedural_p): New function.
* program-point.h (program_point::effectively_intraprocedural_p):
New decl.
* sm-malloc.cc (deref_before_check::emit): Use it when rejecting
interprocedural cases, so that we reject interprocedural cases
that have become intraprocedural due to inlining.
gcc/testsuite/ChangeLog:
PR analyzer/109239
* gcc.dg/analyzer/deref-before-check-pr109239-linux-bus.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/ChangeLog:
* config/gcn/gcn-protos.h (gcn_expand_dpp_swap_pairs_insn)
(gcn_expand_dpp_distribute_even_insn)
(gcn_expand_dpp_distribute_odd_insn): Declare.
* config/gcn/gcn-valu.md (@dpp_swap_pairs<mode>)
(@dpp_distribute_even<mode>, @dpp_distribute_odd<mode>)
(cmul<conj_op><mode>3, cml<addsub_as><mode>4, vec_addsub<mode>3)
(cadd<rot><mode>3, vec_fmaddsub<mode>4, vec_fmsubadd<mode>4)
(fms<mode>4<exec>, fms<mode>4_negop2<exec>, fms<mode>4)
(fms<mode>4_negop2): New patterns.
* config/gcn/gcn.cc (gcn_expand_dpp_swap_pairs_insn)
(gcn_expand_dpp_distribute_even_insn)
(gcn_expand_dpp_distribute_odd_insn): New functions.
* config/gcn/gcn.md: Add entries to unspec enum.
gcc/testsuite/ChangeLog:
* gcc.target/gcn/complex.c: New test.
|
|
This patch implements a nan_state class, that allows us to query or
pass around the NANness of an frange. We can store +NAN, -NAN, +-NAN,
or not-a-NAN with it.
I tried to touch as little as possible, leaving other cleanups to the
next release. For example, we should replace the m_*_nan fields in
frange with nan_state, and provide relevant accessors to nan_state
(isnan, etc).
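A minimal sketch of the idea, assuming the state is just two flags for the
possible NAN signs. NanStateSketch and its members are hypothetical names
and do not reproduce the class added to value-range.h.

```cpp
#include <cassert>

// Hypothetical stand-in: one flag per NAN sign that may be present.
struct NanStateSketch {
  bool pos_nan;  // +NAN is possible
  bool neg_nan;  // -NAN is possible

  // True if any NAN, of either sign, is possible.
  constexpr bool maybe_nan() const { return pos_nan || neg_nan; }
};

// The four states named in the message.
constexpr NanStateSketch kNotNan{false, false};
constexpr NanStateSketch kPosNan{true, false};
constexpr NanStateSketch kNegNan{false, true};
constexpr NanStateSketch kEitherNan{true, true};
```

Keeping the two signs as separate flags lets a caller pass the NANness of a
range around as one value instead of two loose booleans.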
PR tree-optimization/109008
gcc/ChangeLog:
* value-range.cc (frange::set): Add nan_state argument.
* value-range.h (class nan_state): New.
(frange::get_nan_state): New.
|
|
gcc/ChangeLog:
* configure: Regenerate.
|
|
Remove M2LINK.def. Pass the user forced module initialization string as
a parameter to M2RTS.ConstructModules. This patch allows
-fm2-whole-program to link successfully using dynamic libraries.
gcc/m2/ChangeLog:
PR modula2/107630
* Make-lang.in (m2/stage2/cc1gm2$(exeext)): Remove
m2/gm2-libs-boot/M2LINK.o.
(m2/stage1/cc1gm2$(exeext)): Ditto.
(GM2-LIBS-BOOT-DEFS): Remove M2LINK.def.
(GM2-LIBS-DEFS): Ditto.
(m2/mc-boot/$(SRC_PREFIX)%.o): Replace CXX_FLAGS with CXXFLAGS.
(m2/mc-boot-ch/$(SRC_PREFIX)%.o): Ditto.
(m2/mc-boot/main.o): Ditto.
(mcflex.o): Add $(CFLAGS).
(m2/gm2-libs-boot/M2LINK.o): Remove rule.
* gm2-compiler/M2GCCDeclare.def (DeclareM2linkGlobals): Remove.
* gm2-compiler/M2GCCDeclare.mod: (M2LinkEntry): Remove.
(M2LinkIndex): Remove.
(DoVariableDeclaration): Remove initial and call to
AddEntryM2Link.
(AddEntryM2Link): Remove.
(GetEntryM2Link): Remove.
(DeclareM2linkGlobals): Remove.
(DetectM2LinkInitial): Remove.
(InitM2LinkModule): Remove.
* gm2-compiler/M2GenGCC.mod (CodeFinallyEnd): Remove call to
DeclareM2linkGlobals.
* gm2-compiler/M2Quads.mod (BuildM2InitFunction): Add extra
parameter containing runtime module override to ConstructModules.
* gm2-compiler/M2Scaffold.mod: Update comment describing
ConstructModules.
* gm2-gcc/m2decl.cc (m2decl_DeclareM2linkForcedModuleInitOrder):
Remove.
* gm2-libs-iso/M2RTS.def (ConstructModules): Add overrideliborder
parameter.
* gm2-libs-iso/M2RTS.mod: Add overrideliborder parameter.
* gm2-libs/M2Dependent.def (ConstructModules): Add overrideliborder
parameter.
* gm2-libs/M2Dependent.mod (ConstructModules): Add overrideliborder
parameter.
* gm2-libs/M2RTS.def (ConstructModules): Add overrideliborder parameter.
* gm2-libs/M2RTS.mod (ConstructModules): Add overrideliborder
parameter.
* gm2-libs/M2LINK.def: Removed.
libgm2/ChangeLog:
* libm2pim/Makefile.am (M2DEFS): Remove M2LINK.def.
* libm2pim/Makefile.in: Rebuild.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
|
|
I've observed an LTO wrong-code bug with a large testcase in GCC 12,
that results from TYPE_TYPELESS_STORAGE not being set consistently on
type variants.
Specifically, in the LTO stage of compilation, there is an aggregate
type passed to get_alias_set, whose TYPE_MAIN_VARIANT does not have
TYPE_TYPELESS_STORAGE set. However, the TYPE_CANONICAL of that main
variant *does* have TYPE_TYPELESS_STORAGE set; note that the use
of TYPE_CANONICAL in get_alias_set comes after the check of
TYPE_TYPELESS_STORAGE. The effect is that when (one-argument)
record_component_aliases is called, the recursive call to
get_alias_set gives alias set 0, and the aggregate type ends up not
being considered to alias its members, with wrong-code consequences.
I haven't managed to produce a self-contained executable testcase to
demonstrate this, but it clearly seems appropriate for
TYPE_TYPELESS_STORAGE to be consistent on type variants, so this patch
makes it so, which appears to be sufficient to resolve the bug. I've
attached a reduced test to
<https://gcc.gnu.org/pipermail/gcc-patches/2023-March/614278.html>
that does at least demonstrate main-variant versions of a type (SB in
this test) being written out to LTO IR both with and without
TYPE_TYPELESS_STORAGE, although not the subsequent consequences of a
type without TYPE_TYPELESS_STORAGE with a TYPE_CANONICAL (as
constructed after LTO type merging) with TYPE_TYPELESS_STORAGE and
following wrong-code.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
* stor-layout.cc (finalize_type_size): Copy TYPE_TYPELESS_STORAGE
to variants.
|
|
gcc/fortran/ChangeLog:
PR fortran/99036
* decl.cc (gfc_match_modproc): Reject MODULE PROCEDURE if not in a
generic module interface.
gcc/testsuite/ChangeLog:
PR fortran/99036
* gfortran.dg/pr99036.f90: New test.
|
|
When parsing a default member init we just build a CONVERT_EXPR for
converting to a virtual base, and then expand that into the more complex
form when we actually use the DMI in a constructor. But that wasn't working
for the template case where we are considering the conversion at the point
that the constructor needs the DMI instantiation, so it seemed like we were
in a constructor already. And then when the other constructor tries to
reuse the instantiation, it sees uses of the first constructor's parameters,
and dies. So ensure that we get the CONVERT_EXPR in this case, too.
PR c++/106890
gcc/cp/ChangeLog:
* init.cc (maybe_instantiate_nsdmi_init): Don't leave
current_function_decl set to a constructor.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/nsdmi-template25.C: New test.
|
|
We currently allow VARYING lhs GORI calculations to continue if there is
a relation present in the hope it will eventually better refine a result.
This adds a check that the relation is relevant to the outgoing range
calculation first. If it is not relevant, stop calculating.
PR tree-optimization/109192
* gimple-range-gori.cc (gori_compute::compute_operand_range):
Terminate gori calculations if a relation is not relevant.
* value-relation.h (value_relation::set_relation): Allow
equality between op1 and op2 if they are the same.
|
|
The following avoids looking at STMT_SLP_TYPE apart from the only
place needing it - transform and analysis of non-SLP loop stmts.
In particular it doesn't have a reliable meaning on SLP representatives
which are also passed as stmt_vinfo to vectorizable_* routines. The
proper way to check in those is to look for the slp_node argument
instead.
PR tree-optimization/109219
* tree-vect-loop.cc (vectorizable_reduction): Check
slp_node, not STMT_SLP_TYPE.
* tree-vect-stmts.cc (vectorizable_condition): Likewise.
* tree-vect-slp.cc (vect_slp_analyze_node_operations_1):
Remove assertion on STMT_SLP_TYPE.
* gcc.dg/torture/pr109219.c: New testcase.
|
|
On Tue, Mar 21, 2023 at 12:35:19PM +0000, Andrew Stubbs wrote:
> > /* Ensure the the in-branch simd clones are used on targets that support them.
> > Some targets use another call for the epilogue loops. */
> > -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" { target { ! aarch64*-*-* } } } } */
> > -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 3 "vect" { target aarch64*-*-* } } } */
> > +/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" } } */
>
> I suppose those comments are now obsolete.
Oops, fixed thusly.
2023-03-21 Jakub Jelinek <jakub@redhat.com>
PR testsuite/108898
* gcc.dg/vect/vect-simd-clone-16.c: Remove parts of comment mentioning
epilogue loops.
* gcc.dg/vect/vect-simd-clone-17.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18.c: Likewise.
|
|
As mentioned in the PR, vect-simd-clone-1[678]{,f}.c tests FAIL on
x86_64-linux with -m64/-march=cascadelake or -m32/-march=cascadelake:
there are 3 matches for the calls rather than the expected two.
As suggested by Richi, this patch changes those tests to use
--param vect-epilogues-nomask=0 such that it is more predictable on how
many calls will show up. In the non-[a-f] suffixed tests, the
scan-tree-dump-times patterns were expecting 2 for non-aarch64 and 3 for
aarch64, which is a puzzle for me, because vect_simd_clones effective
target is apparently never true on aarch64 (just on x86 in some cases and
on amdgcn; perhaps something to change for GCC14, but I guess too late
for stage4). That said, I have looked at aarch64 dumps and see only 2
calls with --param vect-epilogues-nomask=0 and 3 with --param
vect-epilogues-nomask=1 or without it, so I have tweaked those to always
expect the same thing. Another thing is some tests uselessly had
-fdump-tree-optimized in dg-options even when they don't scan anything
there.
Tested on x86_64-linux with
make -j32 -k check-gcc RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-simd-clone-*.c \
--target_board='unix{-m64/-march=x86-64,-m64/-march=cascadelake,-m32/-march=i686,-m32/-march=cascadelake}'"
and aarch64-linux (where all tests are UNSUPPORTED before/after).
2023-03-21 Jakub Jelinek <jakub@redhat.com>
PR testsuite/108898
* gcc.dg/vect/vect-simd-clone-16.c: Add --param vect-epilogues-nomask=0
to dg-additional-options. Always expect just 2 foo.simdclone calls.
* gcc.dg/vect/vect-simd-clone-16f.c: Add
--param vect-epilogues-nomask=0 to dg-additional-options.
* gcc.dg/vect/vect-simd-clone-17.c: Likewise. Always expect just 2
foo.simdclone calls.
* gcc.dg/vect/vect-simd-clone-17d.c: Remove -fdump-tree-optimized from
dg-additional-options.
* gcc.dg/vect/vect-simd-clone-17e.c: Likewise.
* gcc.dg/vect/vect-simd-clone-17f.c: Likewise. Add
--param vect-epilogues-nomask=0 to dg-additional-options.
* gcc.dg/vect/vect-simd-clone-18.c: Add --param vect-epilogues-nomask=0
to dg-additional-options. Always expect just 2 foo.simdclone calls.
* gcc.dg/vect/vect-simd-clone-18f.c: Add
--param vect-epilogues-nomask=0 to dg-additional-options.
|
|
[PR109215]
Our documentation sadly talks about elt_type arr[0]; as zero-length arrays,
not arrays with zero elements. Unfortunately, those aren't the only arrays
which can have zero size, the same size can also be the result of a zero-size
element, like in GNU C struct whatever {} or in GNU C/C++ if the element
type is [0] array or combination thereof (dunno if Ada doesn't allow
something similar too). One can't do much with them beyond taking the address
of their elements and (no-op) copying of the elements in and out. But they
behave differently from arr[0] arrays e.g. in that using non-zero indexes
in them (as long as they are within bounds as for normal arrays) is valid.
I think this naming inaccuracy resulted in Martin designing
special_array_member in an inconsistent way, mixing size zero array members
with array members of one or two or more elements and then using the
size zero interchangeably with zero elements.
The following patch changes that (but doesn't do any
documentation/diagnostics renaming, as this is really a corner case),
such that int_0/trail_0 for consistency is just about [0] arrays
plus [] for the latter, not one or more zero sized elements case.
The testcase has one xfailed case for where perhaps in later GCC versions
we could add extra code to handle it, for some reason we don't diagnose
out of bounds accesses for the zero sized elements cases. It will be
harder because e.g. FRE will canonicalize &var.fld[0] and &var.fld[10]
to just one of them because they are provably the same address.
But the important thing is to fix this regression (where we warn on
completely valid code in the Linux kernel). Anyway, for further work
on this we don't really need any extra help from special_array_member,
all code can just check integer_zerop (TYPE_SIZE_UNIT (TREE_TYPE (type))),
it doesn't depend on the position of the members etc.
2023-03-21 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109215
* tree.h (enum special_array_member): Adjust comments for int_0
and trail_0.
* tree.cc (component_ref_sam_type): Clear zero_elts if memtype
has zero sized element type and the array has variable number of
elements or constant one or more elements.
(component_ref_size): Adjust comments, formatting fix.
* gcc.dg/Wzero-length-array-bounds-3.c: New test.
|
|
This flag allows us to restore the old (pre-6.8) behavior of the
@{summary,}contents commands, so that texi2any continues to emit
summarycontents first.
maintainer-scripts/ChangeLog:
* update_web_docs_git: Set CONTENTS_OUTPUT_LOCATION=inline in
order to put @shortcontents above contents.
gcc/ChangeLog:
* configure.ac: Add check for the Texinfo 6.8
CONTENTS_OUTPUT_LOCATION customization variable and set it if
supported.
* configure: Regenerate.
* Makefile.in (MAKEINFO_TOC_INLINE_FLAG): New variable. Set by
configure.ac to -c CONTENTS_OUTPUT_LOCATION=inline if
CONTENTS_OUTPUT_LOCATION support is detected, empty otherwise.
($(build_htmldir)/%/index.html): Pass MAKEINFO_TOC_INLINE_FLAG.
|
|
This commit fixes up an instance of the index entry mis-ordering that
occurred between the formulation and application of commit
r13-6310-gf33d7a88d069d1.
gcc/ChangeLog:
* doc/extend.texi: Associate use_hazard_barrier_return index
entry with its attribute.
* doc/invoke.texi: Associate -fcanon-prefix-map index entry with
its attribute.
|
|
The @gol macro appears to have existed as a workaround for a bug in old
versions of makeinfo and/or texinfo.tex, where they would, in some types
of output, fail to emit line breaks in @gccoptlists. After updating
texinfo.tex, I noticed that this behavior no longer appears to be
exhibited; instead, both acted correctly and inserted newlines. The
(groff) manual output also appears unaffected.
gcc/ChangeLog:
* doc/implement-c.texi: Remove usage of @gol.
* doc/invoke.texi: Ditto.
* doc/sourcebuild.texi: Ditto.
* doc/include/gcc-common.texi: Remove @gol. In new Makeinfo and
texinfo.tex versions, the bug it was working around appears to
be gone.
gcc/fortran/ChangeLog:
* invoke.texi: Remove usages of @gol.
* intrinsic.texi: Ditto.
|
|
gcc/ChangeLog:
* doc/include/texinfo.tex: Update to 2023-01-17.19.
|
|
The @defbuiltin{,x} macros are convenience macros for the often-repeated
task of defining a built-in function in extend.texi. Usage of these
macros should lead to a higher degree of consistency across pieces of
text written by different people, and provide a better reading
experience, as they prevent easy-to-make errors, like forgetting index
entries for these functions.
gcc/ChangeLog:
* doc/include/gcc-common.texi: Add @defbuiltin{,x} and
@enddefbuiltin for defining built-in functions.
* doc/extend.texi: Apply @defbuiltin{,x} to many, but not all,
places where it should be used.
|
|
This commit addresses a few minor errors that were spotted while testing
the GCC manual with a few people, and while working on wider changes.
gcc/ChangeLog:
* doc/extend.texi (Formatted Output Function Checking): New
subsection for grouping together printf et al.
(Exception handling): Fix missing @ sign before copyright
header, which led to the copyright line leaking into
'(gcc)Exception handling'.
* doc/gcc.texi: Set document language to en_US.
(@copying): Wrap front cover texts in quotations, move in manual
description text.
|
|
The GCC manual has multiple indices. By creating an appendix which
lists them, we help makeinfo present a more accessible way for the
reader to see all the indices.
gcc/ChangeLog:
* doc/gcc.texi: Add the Indices appendix, to make texinfo
generate a nice indices overview page.
|
|
The following adds a missing range-op for __builtin_expect which
helps -Wuse-after-free to detect the case where the original pointer
passed to realloc is used when the result was NULL. The implementation
should handle all argument one pass-through builtins we handle
in the fnspec machinery, but that's deferred to GCC 14.
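As a rough illustration of the pattern involved (not the PR testcase),
the sketch below shows that __builtin_expect evaluates to its first
argument, and a realloc caller that keeps relying on the original
pointer when the result is NULL. The helper names pass_through and
try_grow are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// __builtin_expect only carries a branch-probability hint; its value is
// its first argument, so ranges known for v also hold for the result.
long pass_through(long v) {
  return __builtin_expect(v, 0);
}

// Hypothetical helper: tries to grow buf to new_size bytes.  On failure
// realloc has not freed the original block, so buf stays valid and the
// caller may keep using it.
bool try_grow(char *&buf, std::size_t new_size) {
  void *p = std::realloc(buf, new_size);
  if (__builtin_expect(p == nullptr, 0))
    return false;  // buf is untouched and still owned by the caller
  buf = static_cast<char *>(p);
  return true;
}
```

Without a range-op for the builtin, the condition seen by the branch is
an opaque call result rather than something tied back to p.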
The gcc.dg/tree-ssa/ssa-lim-21.c testcase needs adjustment because
for (int j = 0; j < m; j++)
if (__builtin_expect (m, 0))
for (int i = 0; i < m; i++)
is now correctly optimized to an unconditional jump by EVRP - m
cannot be zero when the outer loop is entered. I've adjusted
the outer loop to iterate 'n' times which makes us apply store-motion
to 'count' and 'q->data1' but only out of the inner loop and
as expected not apply store motion to 'q->data' at all.
The gcc.dg/predict-20.c testcase relies on broken behavior of
profile estimation when trying to handle __builtin_expect values
flowing into PHI nodes. I have opened PR109210 and removed
the expected matching from the testcase.
PR tree-optimization/109170
* gimple-range-op.cc (cfn_pass_through_arg1): New.
(gimple_range_op_handler::maybe_builtin_call): Handle
__builtin_expect via cfn_pass_through_arg1.
* gcc.dg/Wuse-after-free-pr109170.c: New testcase.
* gcc.dg/tree-ssa/ssa-lim-21.c: Adjust.
* gcc.dg/predict-20.c: Likewise.
|
|
2023-03-21 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/109206
* trans-array.cc (gfc_trans_array_constructor_value): Correct
incorrect setting of typespec.
|
|
2023-03-21 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/109209
* resolve.cc (generate_component_assignments): Restore the
exclusion of allocatable components from the loop.
gcc/testsuite/
PR fortran/109209
* gfortran.dg/pr109209.f90: New test.
|
|
The bootstrap tool mc is built using $(CXX) and it is missing
$(CXXFLAGS).
gcc/m2/ChangeLog:
* Make-lang.in (m2/mc-boot/$(SRC_PREFIX)%.o): Add $(CXXFLAGS).
(m2/mc-boot-ch/$(SRC_PREFIX)%.o): Add $(CXXFLAGS).
(m2/mc-boot-ch/$(SRC_PREFIX)%.o): Add $(CXXFLAGS).
(m2/mc-boot/main.o): Add $(CXXFLAGS).
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
* sv.po: Update.
|
|
gcc/fortran/ChangeLog:
PR fortran/109216
* invoke.texi: Correct documentation of how underscores are appended
to external names.
|
|
When I implemented explicit(bool) in r9-3735, I added this code to
add_template_candidate_real:
+ /* Now the explicit specifier might have been deduced; check if this
+ declaration is explicit. If it is and we're ignoring non-converting
+ constructors, don't add this function to the set of candidates. */
+ if ((flags & LOOKUP_ONLYCONVERTING) && DECL_NONCONVERTING_P (fn))
+ return NULL;
but as this test demonstrates, that's incorrect when we're initializing
from a {}: for list-initialization we consider explicit constructors and
complain if one is chosen.
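The language rule referred to above can be sketched as follows. This is
an illustration under assumptions, not the new g++.dg testcase: it uses
a plain explicit constructor instead of explicit(bool) so that it does
not require C++20, and widget, take and copy_list_initializable are
hypothetical names.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical class: the explicit constructor is the best match for an
// int argument, the non-explicit one is reachable via int -> double.
struct widget {
  explicit widget(int);
  widget(double);
};

// Declared only; calling it in an unevaluated context copy-initializes
// a parameter of type T from the argument.
template <typename T> void take(T);

// True when "T t = {arg};" (copy-list-initialization) is well-formed.
template <typename T, typename Arg, typename = void>
struct copy_list_initializable : std::false_type {};

template <typename T, typename Arg>
struct copy_list_initializable<T, Arg,
                               decltype(take<T>({std::declval<Arg>()}))>
    : std::true_type {};

// "widget w = 1;" ignores the explicit constructor and uses widget(double).
static_assert(std::is_convertible<int, widget>::value,
              "copy-initialization skips explicit constructors");

// "widget w = {1};" considers both constructors, selects the explicit
// widget(int) and is therefore ill-formed.
static_assert(!copy_list_initializable<widget, int>::value,
              "list-initialization must not silently skip explicit ctors");

// "widget w = {1.0};" selects the non-explicit widget(double).
static_assert(copy_list_initializable<widget, double>::value,
              "non-explicit constructor chosen");
```

Dropping the explicit candidate early, as the quoted code did, would
make the second case fall back to widget(double) instead of being
rejected.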
PR c++/109159
gcc/cp/ChangeLog:
* call.cc (add_template_candidate_real): Add explicit decls to the
set of candidates when the initializer is a braced-init-list.
libstdc++-v3/ChangeLog:
* testsuite/20_util/pair/cons/explicit_construct.cc: Adjust dg-error.
* testsuite/20_util/tuple/cons/explicit_construct.cc: Likewise.
* testsuite/23_containers/span/explicit.cc: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/explicit16.C: New test.
|