Age | Commit message (Collapse) | Author | Files | Lines |
|
Check to see if a comparison to a constant can be determined to always
be not-equal based on the bitmask.
PR tree-optimization/111766
gcc/
* range-op.cc (operator_equal::fold_range): Check constants
against the bitmask.
(operator_not_equal::fold_range): Ditto.
* value-range.h (irange_bitmask::member_p): New.
gcc/testsuite/
* gcc.dg/pr111766.c: New.
|
|
During the intersection operation, it can be helpful to remove any
low-end ranges when the bitmask has trailing zeros. This prevents
obviously incorrect ranges from appearing without requiring a bitmask
check.
* value-range.cc (irange_bitmask::adjust_range): New.
(irange::intersect_bitmask): Call adjust_range.
* value-range.h (irange_bitmask::adjust_range): New prototype.
|
|
The patch generalizes address register class handling to allow multiple
register classes. For APX EGPR targets, some instructions do not support
GPR32 registers, so it is necessary to limit address register set to
avoid them. The same situation happens for instructions with high registers,
where REX registers can not be used in the address, so the existing
infrastructure can be adapted to also handle this case.
The patch is mostly a mechanical rename of "gpr32" attribute to "addr" and
introduces no functional changes, although it fixes a couple of inconsistent
attribute values in passing.
A follow-up patch will use the above infrastructure to limit address register
class to legacy registers for instructions with high registers.
gcc/ChangeLog:
* config/i386/i386.cc (ix86_memory_address_use_extended_reg_class_p):
Rename to ...
(ix86_memory_address_reg_class): ... this. Generalize address
register class handling to allow multiple address register classes.
Return maximal class for unrecognized instructions. Improve comments.
(ix86_insn_base_reg_class): Rewrite to handle
multiple address register classes.
(ix86_regno_ok_for_insn_base_p): Ditto.
(ix86_insn_index_reg_class): Ditto.
* config/i386/i386.md: Rename "gpr32" attribute to "addr"
and substitute its values with "0" -> "gpr16", "1" -> "*".
(addr): New attribute to limit allowed address register set.
(gpr32): Remove.
* config/i386/mmx.md: Rename "gpr32" attribute to "addr"
and substitute its values with "0" -> "gpr16", "1" -> "*".
* config/i386/sse.md: Ditto.
|
|
The following exercise otherwise not exercised paths in the
vectorizer peeling code, resulting in CPU 2017 build ICEs
when patching but no fallout in the testsuite.
* gfortran.dg/20231103-1.f90: New testcase.
* gfortran.dg/20231103-2.f90: Likewise.
|
|
During analyzing PR111950 I found the loop live operation code-gen
odd, in particular only replacing a single PHI but then adjusting
possibly remaining PHIs afterwards where there shouldn't really
be any out-of-loop uses of the scalar in-loop def left.
* tree-vect-loop.cc (vectorizable_live_operation): Simplify
LC PHI replacement.
|
|
'libgcc/config/gcn/gthr-gcn.h:__gthread_getspecific'
For 'libgcc/config/gcn/gthr-gcn.h' used in libstdc++ context (WIP), we have:
[...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libstdc++-v3/include/amdgcn-amdhsa/bits/gthr-default.h: In function ‘void* __gthread_getspecific(__gthread_key_t)’:
[...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libstdc++-v3/include/amdgcn-amdhsa/bits/gthr-default.h:90:10: error: ‘NULL’ was not declared in this scope
90 | return NULL;
| ^~~~
Resolve this with 's%NULL%0', as is used in
'libgcc/gthr-single.h:__gthread_getspecific', for example.
Follow-up to commit 76d463310787c8c7fd0c55cf88031b240311ab68
"Create GCN-specific gthreads".
libgcc/
* config/gcn/gthr-gcn.h (__gthread_getspecific): 's%NULL%0'.
|
|
... to restore compatability with validate_failures.py .
The testsuite script validate_failures.py expects
"Running <sub-testsuite> ..." to extract <sub-testsuite> values,
and gotools.sum provided "Running <sub-testsuite>".
Note that libgo.sum, which also uses Makefile logic to generate
DejaGnu-like output, already has "..." suffix.
gotools/ChangeLog:
* Makefile.am: Update "Running <sub-testsuite> ..." output
* Makefile.in: Regenerate.
Signed-off-by: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
|
|
This patch improves the code generated for x << 1 (and for x + x) when
X is 64-bit DImode, using the same two instruction code sequence used
for DImode addition.
For the test case:
long long foo(long long x) { return x << 1; }
GCC -O2 currently generates the following code:
foo: lsr r2,r0,31
asl_s r1,r1,1
asl_s r0,r0,1
j_s.d [blink]
or_s r1,r1,r2
and on CPU without a barrel shifter, i.e. -mcpu=em
foo: add.f 0,r0,r0
asl_s r1,r1
rlc r2,0
asl_s r0,r0
j_s.d [blink]
or_s r1,r1,r2
with this patch (both with and without a barrel shifter):
foo: add.f r0,r0,r0
j_s.d [blink]
adc r1,r1,r1
A similar optimization is also applicable to H8300H, that could also use
a two instruction sequence (plus rts) but currently GCC generates 16
instructions (plus an rts) for foo above.
2023-11-03 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/arc/arc.md (addsi3): Fix GNU-style code formatting.
(adddi3): Change define_expand to generate a *adddi3.
(*adddi3): New define_insn_and_split to lower DImode additions
during the split1 pass (after combine and before reload).
(ashldi3): New define_expand to (only) generate *ashldi3_cnt1
for DImode left shifts by a single bit.
(*ashldi3_cnt1): New define_insn_and_split to lower DImode
left shifts by one bit to an *adddi3.
gcc/testsuite/ChangeLog
* gcc.target/arc/adddi3-1.c: New test case.
* gcc.target/arc/ashldi3-1.c: Likewise.
|
|
This patch removes a can_create_pseudo_p condition from
*cmov_uxtw_insn_insv, bringing it in line with *cmov<mode>_insn_insv.
The constraints correctly describe the requirements.
gcc/
* config/aarch64/aarch64.md (*cmov_uxtw_insn_insv): Remove
can_create_pseudo_p condition.
|
|
This patch fixes following FAILs for RVV:
FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects scan-tree-dump vect "Loop contains only SLP stmts"
FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts"
Bootstrap on X86 and regtest passed.
Ok for trunk ?
PR tree-optimization/111721
gcc/ChangeLog:
* tree-vect-slp.cc (vect_get_and_check_slp_defs): Support SLP for dummy mask -1.
* tree-vect-stmts.cc (vectorizable_load): Ditto.
|
|
The following removes a bogus assert constraining the uses that
could appear when a built from scalar defs SLP node constrains
code generation in a way so earlier uses of the vector CTOR
components fail to get vectorized. We can't really constrain the
operation such use appears in.
PR tree-optimization/112366
* tree-vect-loop.cc (vectorizable_live_operation): Remove
assert.
|
|
Running 'make check' with: 'RUNTESTFLAGS=--target_board=unix/-fno-exceptions',
'error: exception handling disabled' is triggered for C++ 'throw' etc. usage,
and per 'gcc/testsuite/lib/gcc-dg.exp:gcc-dg-prune':
# If exceptions are disabled, mark tests expecting exceptions to be enabled
# as unsupported.
if { ![check_effective_target_exceptions_enabled] } {
if [regexp "(^|\n)\[^\n\]*: error: exception handling disabled" $text] {
return "::unsupported::exception handling disabled"
}
..., which generally means:
-PASS: [...] (test for excess errors)
+UNSUPPORTED: [...]: exception handling disabled
However, this doesn't work for 'g++.dg/tree-prof/' test cases. For example:
[-PASS:-]{+UNSUPPORTED:+} g++.dg/tree-prof/indir-call-prof-2.C [-compilation, -fprofile-generate -D_PROFILE_GENERATE-]{+compilation: exception handling disabled+}
[-PASS:-]{+UNRESOLVED:+} g++.dg/tree-prof/indir-call-prof-2.C execution, -fprofile-generate -D_PROFILE_GENERATE
[-PASS:-]{+UNRESOLVED:+} g++.dg/tree-prof/indir-call-prof-2.C compilation, -fprofile-use -D_PROFILE_USE
[-PASS:-]{+UNRESOLVED:+} g++.dg/tree-prof/indir-call-prof-2.C execution, -fprofile-use -D_PROFILE_USE
Dependent tests turn UNRESOLVED if the first "compilation" runs into the
expected 'UNSUPPORTED: [...] compile: exception handling disabled'.
Specify 'dg-require-effective-target exceptions_enabled' for those test cases.
gcc/testsuite/
* g++.dg/tree-prof/indir-call-prof-2.C: Specify
'dg-require-effective-target exceptions_enabled'.
* g++.dg/tree-prof/partition1.C: Likewise.
* g++.dg/tree-prof/partition2.C: Likewise.
* g++.dg/tree-prof/partition3.C: Likewise.
* g++.dg/tree-prof/pr51719.C: Likewise.
* g++.dg/tree-prof/pr57451.C: Likewise.
* g++.dg/tree-prof/pr59255.C: Likewise.
|
|
Running 'make check' with: 'RUNTESTFLAGS=--target_board=unix/-fno-exceptions',
'error: exception handling disabled' is triggered for C++ 'throw' etc. usage,
and per 'gcc/testsuite/lib/gcc-dg.exp:gcc-dg-prune':
# If exceptions are disabled, mark tests expecting exceptions to be enabled
# as unsupported.
if { ![check_effective_target_exceptions_enabled] } {
if [regexp "(^|\n)\[^\n\]*: error: exception handling disabled" $text] {
return "::unsupported::exception handling disabled"
}
..., which generally means:
-PASS: [...] (test for excess errors)
+UNSUPPORTED: [...]: exception handling disabled
However, this doesn't work for "split files" test cases. For example:
[-PASS:-]{+UNSUPPORTED:+} g++.dg/lto/20081109-1 cp_lto_20081109-1_0.o [-assemble, -fPIC -flto -flto-partition=1to1-]{+assemble: exception handling disabled+}
[-PASS:-]{+UNRESOLVED:+} g++.dg/lto/20081109-1 cp_lto_20081109-1_0.o-cp_lto_20081109-1_0.o [-link,-]{+link+} -fPIC -flto -flto-partition=1to1
{+UNRESOLVED: g++.dg/lto/20081109-1 cp_lto_20081109-1_0.o-cp_lto_20081109-1_0.o execute -fPIC -flto -flto-partition=1to1+}
The "compile"/"assemble" tests (either continue to work, or) result in the
expected 'UNSUPPORTED: [...] compile: exception handling disabled', but
dependent "link" and "execute" tests then turn UNRESOLVED.
Specify 'dg-require-effective-target exceptions_enabled' for those test cases.
gcc/testsuite/
* g++.dg/lto/20081109-1_0.C: Specify
'dg-require-effective-target exceptions_enabled'.
* g++.dg/lto/20081109_0.C: Likewise.
* g++.dg/lto/20091026-1_0.C: Likewise.
* g++.dg/lto/pr87906_0.C: Likewise.
* g++.dg/lto/pr88046_0.C: Likewise.
|
|
Running 'make check' with: 'RUNTESTFLAGS=--target_board=unix/-fno-exceptions',
'error: exception handling disabled' is triggered for C++ 'throw' etc. usage,
and per 'gcc/testsuite/lib/gcc-dg.exp:gcc-dg-prune':
# If exceptions are disabled, mark tests expecting exceptions to be enabled
# as unsupported.
if { ![check_effective_target_exceptions_enabled] } {
if [regexp "(^|\n)\[^\n\]*: error: exception handling disabled" $text] {
return "::unsupported::exception handling disabled"
}
..., which generally means:
-PASS: [...] (test for excess errors)
+UNSUPPORTED: [...]: exception handling disabled
However, this doesn't work for "split files" test cases. For example:
PASS: g++.dg/compat/eh/ctor1 cp_compat_main_tst.o compile
[-PASS:-]{+UNSUPPORTED:+} g++.dg/compat/eh/ctor1 cp_compat_x_tst.o [-compile-]{+compile: exception handling disabled+}
[-PASS:-]{+UNSUPPORTED:+} g++.dg/compat/eh/ctor1 cp_compat_y_tst.o [-compile-]{+compile: exception handling disabled+}
[-PASS:-]{+UNRESOLVED:+} g++.dg/compat/eh/ctor1 cp_compat_x_tst.o-cp_compat_y_tst.o link
[-PASS:-]{+UNRESOLVED:+} g++.dg/compat/eh/ctor1 cp_compat_x_tst.o-cp_compat_y_tst.o execute
The "compile"/"assemble" tests (either continue to work, or) result in the
expected 'UNSUPPORTED: [...] compile: exception handling disabled', but
dependent "link" and "execute" tests then turn UNRESOLVED.
Specify 'dg-require-effective-target exceptions_enabled' for those test cases.
gcc/testsuite/
* g++.dg/compat/eh/ctor1_main.C: Specify
'dg-require-effective-target exceptions_enabled'.
* g++.dg/compat/eh/ctor2_main.C: Likewise.
* g++.dg/compat/eh/dtor1_main.C: Likewise.
* g++.dg/compat/eh/filter1_main.C: Likewise.
* g++.dg/compat/eh/filter2_main.C: Likewise.
* g++.dg/compat/eh/new1_main.C: Likewise.
* g++.dg/compat/eh/nrv1_main.C: Likewise.
* g++.dg/compat/eh/spec3_main.C: Likewise.
* g++.dg/compat/eh/template1_main.C: Likewise.
* g++.dg/compat/eh/unexpected1_main.C: Likewise.
* g++.dg/compat/init/array5_main.C: Likewise.
|
|
Running 'make check' with: 'RUNTESTFLAGS=--target_board=unix/-fno-exceptions',
'error: exception handling disabled' is triggered for C++ 'throw' etc. usage,
and per 'gcc/testsuite/lib/gcc-dg.exp:gcc-dg-prune':
# If exceptions are disabled, mark tests expecting exceptions to be enabled
# as unsupported.
if { ![check_effective_target_exceptions_enabled] } {
if [regexp "(^|\n)\[^\n\]*: error: exception handling disabled" $text] {
return "::unsupported::exception handling disabled"
}
..., which generally means:
-PASS: [...] (test for excess errors)
+UNSUPPORTED: [...]: exception handling disabled
However, if there are additional 'dg-error' etc. directives, these may regress
PASS -> FAIL (or similar) -- if their associated diagnostics are precluded by
'error: exception handling disabled'. For example:
PASS: g++.dg/cpp2a/explicit1.C (test for errors, line 43)
PASS: g++.dg/cpp2a/explicit1.C (test for errors, line 47)
[-PASS:-]{+FAIL:+} g++.dg/cpp2a/explicit1.C (test for errors, line 50)
[-PASS:-]{+FAIL:+} g++.dg/cpp2a/explicit1.C (test for errors, line 51)
PASS: g++.dg/cpp2a/explicit1.C (test for errors, line 52)
PASS: g++.dg/cpp2a/explicit1.C (test for errors, line 53)
PASS: g++.dg/cpp2a/explicit1.C (test for errors, line 59)
[-PASS:-]{+UNSUPPORTED:+} g++.dg/cpp2a/explicit1.C [-(test for excess errors)-]{+: exception handling disabled+}
Specify 'dg-require-effective-target exceptions_enabled' for those test cases.
gcc/testsuite/
* g++.dg/cpp0x/catch1.C: Specify
'dg-require-effective-target exceptions_enabled'.
* g++.dg/cpp0x/constexpr-throw.C: Likewise.
* g++.dg/cpp1y/constexpr-89785-2.C: Likewise.
* g++.dg/cpp1y/constexpr-throw.C: Likewise.
* g++.dg/cpp1y/pr79393-3.C: Likewise.
* g++.dg/cpp2a/consteval-memfn1.C: Likewise.
* g++.dg/cpp2a/consteval11.C: Likewise.
* g++.dg/cpp2a/consteval34.C: Likewise.
* g++.dg/cpp2a/consteval9.C: Likewise.
* g++.dg/cpp2a/explicit1.C: Likewise.
* g++.dg/cpp2a/explicit2.C: Likewise.
* g++.dg/cpp2a/explicit5.C: Likewise.
* g++.dg/eh/builtin10.C: Likewise.
* g++.dg/eh/builtin11.C: Likewise.
* g++.dg/eh/builtin6.C: Likewise.
* g++.dg/eh/builtin7.C: Likewise.
* g++.dg/eh/builtin9.C: Likewise.
* g++.dg/eh/dtor4.C: Likewise.
* g++.dg/eh/pr42859.C: Likewise.
* g++.dg/ext/stmtexpr25.C: Likewise.
* g++.dg/ext/vla4.C: Likewise.
* g++.dg/init/placement4.C: Likewise.
* g++.dg/other/error32.C: Likewise.
* g++.dg/parse/crash55.C: Likewise.
* g++.dg/parse/pr31952-2.C: Likewise.
* g++.dg/parse/pr31952-3.C: Likewise.
* g++.dg/tm/noexcept-7.C: Likewise.
* g++.dg/torture/pr43257.C: Likewise.
* g++.dg/torture/pr56694.C: Likewise.
* g++.dg/torture/pr81659.C: Likewise.
* g++.dg/warn/Wcatch-value-1.C: Likewise.
* g++.dg/warn/Wcatch-value-2.C: Likewise.
* g++.dg/warn/Wcatch-value-3.C: Likewise.
* g++.dg/warn/Wcatch-value-3b.C: Likewise.
* g++.dg/warn/Wexceptions1.C: Likewise.
* g++.dg/warn/Wexceptions3.C: Likewise.
* g++.dg/warn/Winfinite-recursion-3.C: Likewise.
* g++.dg/warn/Wreturn-6.C: Likewise.
* g++.dg/warn/Wstringop-truncation-2.C: Likewise.
* g++.dg/warn/Wterminate1.C: Likewise.
* g++.old-deja/g++.eh/catch1.C: Likewise.
* g++.old-deja/g++.eh/catch10.C: Likewise.
* g++.old-deja/g++.eh/cond1.C: Likewise.
* g++.old-deja/g++.eh/ctor1.C: Likewise.
* g++.old-deja/g++.eh/throw2.C: Likewise.
* g++.old-deja/g++.other/cond5.C: Likewise.
|
|
The following avoids hoisting expressions that may invoke undefined
behavior and are not computed on all paths. This is realized by
noting that we have to avoid materializing expressions as part
of hoisting that are not part of the set of expressions we have
found eligible for hoisting. Instead of picking the expression
corresponding to the hoistable values from the first successor
we now keep a union of the expressions so that hoisting can pick
the expression that has its dependences fully hoistable.
PR tree-optimization/112310
* tree-ssa-pre.cc (do_hoist_insertion): Keep the union
of expressions, validate dependences are contained within
the hoistable set before hoisting.
* gcc.dg/torture/pr112310.c: New testcase.
|
|
2023-11-03 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/98498
* interface.cc (upoly_ok): Defined operators using unlimited
polymorphic formal arguments must not override the intrinsic
operator use.
gcc/testsuite/
PR fortran/98498
* gfortran.dg/interface_50.f90: New test.
|
|
Update in v2:
* Add mode size equal check to disable different mode size when expand,
because the underlying codegen is not implemented yet.
Original log:
The previous rounding API start with i/l/ll only works on the same
mode types. For example as below, and we arrange the iterator similar
to fcvt.
* SF => SI
* DF => DI
After we refined this limination from middle-end, these API can also
vectorized with different type sizes, aka:
* HF => SI, HF => DI
* SF => DI, SF => SI
* DF => SI, DF => DI
Then the iterator cannot take care of this simply and this patch
would like to re-arrange the iterator in two items.
* V_VLS_F_CONVERT_SI: handle (HF, SF, DF) => SI
* V_VLS_F_CONVERT_DI: handle (HF, SF, DF) => DI
As well as related mode_attr to reconcile the new iterator.
gcc/ChangeLog:
* config/riscv/autovec.md (lrint<mode><v_i_l_ll_convert>2): Remove.
(lround<mode><v_i_l_ll_convert>2): Ditto.
(lceil<mode><v_i_l_ll_convert>2): Ditto.
(lfloor<mode><v_i_l_ll_convert>2): Ditto.
(lrint<mode><v_f2si_convert>2): New pattern for cvt from
FP to SI.
(lround<mode><v_f2si_convert>2): Ditto.
(lceil<mode><v_f2si_convert>2): Ditto.
(lfloor<mode><v_f2si_convert>2): Ditto.
(lrint<mode><v_f2di_convert>2): New pattern for cvt from
FP to DI.
(lround<mode><v_f2di_convert>2): Ditto.
(lceil<mode><v_f2di_convert>2): Ditto.
(lfloor<mode><v_f2di_convert>2): Ditto.
* config/riscv/vector-iterators.md: Renew iterators for both
the SI and DI.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
With compile option --param=riscv-autovec-preference=fixed-vlmax, we have
redundant AVL/VL toggling:
vsetvli a5,a3,e8,mf4,ta,ma -> should be changed into e32m1
vle32.v v1,0(a1)
vle32.v v2,0(a0)
vsetivli zero,4,e32,m1,ta,ma -> redundant
slli a2,a5,2
vadd.vv v1,v1,v2
sub a3,a3,a5
vsetvli zero,a5,e32,m1,ta,ma -> redundant
vse32.v v1,0(a4)
add a0,a0,a2
add a1,a1,a2
add a4,a4,a2
bne a3,zero,.L3
The root cause is because we simplify AVL into immediate AVL too early
in FIXED-VLMAX situation. The later avlprop PASS failed to propagate AVL
generated by (SELECT_VL/vsetvl VL, AVL) into the normal RVV instruction.
So we need to remove immedate AVL simplification in 'expand' stage.
After this patch:
vsetvli a5,a3,e32,m1,ta,ma
slli a2,a5,2
vle32.v v1,0(a1)
vle32.v v2,0(a0)
sub a3,a3,a5
vadd.vv v1,v1,v2
vse32.v v1,0(a4)
add a0,a0,a2
add a1,a1,a2
add a4,a4,a2
bne a3,zero,.L3
After the removed simplification, the following situation should be fixed:
typedef int8_t vnx2qi __attribute__ ((vector_size (2)));
__attribute__ ((noipa)) void
f_vnx2qi (int8_t a, int8_t b, int8_t *out)
{
vnx2qi v = {a, b};
*(vnx2qi *) out = v;
}
We should use vsetvili zero, 2 instead of vsetvl a5,zero.
Such simplification is done in avlprop PASS which is also included in this patch
to fix regression of these situation.
PR target/112326
gcc/ChangeLog:
* config/riscv/riscv-avlprop.cc (get_insn_vtype_mode): New function.
(simplify_replace_vlmax_avl): Ditto.
(pass_avlprop::execute): Add immediate AVL simplification.
* config/riscv/riscv-protos.h (imm_avl_p): Rename.
* config/riscv/riscv-v.cc (const_vlmax_p): Ditto.
(imm_avl_p): Ditto.
(emit_vlmax_insn): Adapt for new interface name.
* config/riscv/vector.md (mode_idx): New attribute.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr112326.c: New test.
|
|
This reverts commit 81a81abec5c28f2c08f986f1f17ac66e555cbd4b.
|
|
|
|
Now that all insns are guaranteed to have a type, ensure every insn
is associated with a cpu unit/insn reservation.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_sched_variable_issue): add disabled assert
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
2023-11-02 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/112316
* parse.cc (parse_associate): Remove condition that caused this
regression.
gcc/testsuite/
PR fortran/112316
* gfortran.dg/pr112316.f90: New test.
|
|
I noticed we were using a hash_table directly here instead of the simpler
hash_set interface. Also, let's check for the variable itself and repeats
earlier, since they should happen more often than any of the other cases.
gcc/cp/ChangeLog:
* semantics.cc (nrv_data): Change visited to hash_set.
(finalize_nrv_r): Reorganize.
|
|
In r12-6333 for PR33799, I fixed the example in [except.ctor]/2. In that
testcase, the exception is caught and the function returns again,
successfully.
In this testcase, however, the exception is rethrown, and hits two separate
cleanups: one in the try block and the other in the function body. So we
destroy twice an object that was only constructed once.
Fortunately, the fix for the normal case is easy: we just need to clear the
"return value constructed by return" flag when we do it the first time.
This gets more complicated with the named return value optimization, since
we don't want to destroy the return value while the NRV variable is still in
scope.
PR c++/112301
PR c++/102191
PR c++/33799
gcc/cp/ChangeLog:
* except.cc (maybe_splice_retval_cleanup): Clear
current_retval_sentinel when destroying retval.
* semantics.cc (nrv_data): Add in_nrv_cleanup.
(finalize_nrv): Set it.
(finalize_nrv_r): Fix handling of throwing cleanups.
gcc/testsuite/ChangeLog:
* g++.dg/eh/return1.C: Add more cases.
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/112314
* include/std/string_view (string_view::remove_suffix): Add
debug assertion.
* testsuite/21_strings/basic_string_view/modifiers/remove_prefix/debug.cc:
New test.
* testsuite/21_strings/basic_string_view/modifiers/remove_suffix/debug.cc:
New test.
|
|
Fix ICE because of forgotten checks for pointers to void
and incomplete arrays.
Committed as obvious.
PR c/112347
gcc/c:
* c-typeck.cc (convert_for_assignment): Add missing check.
gcc/testsuite:
* gcc.dg/Walloc-size-3.c: New test.
|
|
D front-end changes:
- Suggested preview switches now give gdc flags (PR109681).
- `new S[10]' is now lowered to `_d_newarrayT!S(10)'.
D runtime changes:
- Runtime compiler library functions `_d_newarrayU', `_d_newarrayT',
`_d_newarrayiT' have been converted to templates.
Phobos changes:
- Add new `std.traits.Unshared' template.
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 643b1261bb.
* d-attribs.cc (build_attributes): Update for new front-end interface.
* d-lang.cc (d_post_options): Likewise.
* decl.cc (layout_class_initializer): Likewise.
libphobos/ChangeLog:
* libdruntime/MERGE: Merge upstream druntime 643b1261bb.
* libdruntime/Makefile.am (DRUNTIME_DSOURCES_FREEBSD): Add
core/sys/freebsd/ifaddrs.d, core/sys/freebsd/net/if_dl.d,
core/sys/freebsd/sys/socket.d, core/sys/freebsd/sys/types.d.
(DRUNTIME_DSOURCES_LINUX): Add core/sys/linux/linux/if_arp.d,
core/sys/linux/linux/if_packet.d.
* libdruntime/Makefile.in: Regenerate.
* src/MERGE: Merge upstream phobos 1c98326e7.
|
|
The checks for snprintf give a -Wformat warning due to a missing
argument.
libstdc++-v3/ChangeLog:
* acinclude.m4 (GLIBCXX_ENABLE_C99): Fix snprintf checks.
* configure: Regenerate.
|
|
Spurred by Roger's recent work on ARC, this patch improves the code we
generation for single bit sign extractions.
The basic idea is to get the bit we want into C, the use a subx;ext.w;ext.l
sequence to sign extend it in a GPR.
For bits 0..15 we can use a bld instruction to get the bit we want into C. For
bits 16..31, we can move the high word into the low word, then use bld.
There's a couple special cases where we can shift the bit we want from the high
word into C which is an instruction smaller.
Not surprisingly most cases seen in newlib and the test suite are extractions
from the low byte, HImode sign bit and top two bits of SImode.
Regression tested on the H8 with no regressions. Installing on the trunk.
gcc/
* config/h8300/combiner.md: Add new patterns for single bit
sign extractions.
|
|
No functional change intended.
gcc/analyzer/ChangeLog:
PR analyzer/112317
* access-diagram.cc (class x_aligned_x_ruler_widget): Eliminate
unused field "m_col_widths".
(access_diagram_impl::add_valid_vs_invalid_ruler): Update for
above change.
* region-model.cc
(check_one_function_attr_null_terminated_string_arg): Remove
unused variables "cd_unchecked", "strlen_sval", and
"limited_sval".
* region-model.h (region_model_context_decorator::warn): Add
missing "override".
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
The previous rounding API start with i/l/ll only works on the same
mode types. For example as below, and we arrange the iterator similar
to fcvt.
* SF => SI
* DF => DI
After we refined this limination from middle-end, these API can also
vectorized with different type sizes, aka:
* HF => SI, HF => DI
* SF => DI, SF => SI
* DF => SI, DF => DI
Then the iterator cannot take care of this simply and this patch
would like to re-arrange the iterator in two items.
* V_VLS_F_CONVERT_SI: handle (HF, SF, DF) => SI
* V_VLS_F_CONVERT_DI: handle (HF, SF, DF) => DI
As well as related mode_attr to reconcile the new iterator.
gcc/ChangeLog:
* config/riscv/autovec.md (lrint<mode><v_i_l_ll_convert>2): Remove.
(lround<mode><v_i_l_ll_convert>2): Ditto.
(lceil<mode><v_i_l_ll_convert>2): Ditto.
(lfloor<mode><v_i_l_ll_convert>2): Ditto.
(lrint<mode><v_f2si_convert>2): New pattern for cvt from
FP to SI.
(lround<mode><v_f2si_convert>2): Ditto.
(lceil<mode><v_f2si_convert>2): Ditto.
(lfloor<mode><v_f2si_convert>2): Ditto.
(lrint<mode><v_f2di_convert>2): New pattern for cvt from
FP to DI.
(lround<mode><v_f2di_convert>2): Ditto.
(lceil<mode><v_f2di_convert>2): Ditto.
(lfloor<mode><v_f2di_convert>2): Ditto.
* config/riscv/vector-iterators.md: Renew iterators for both
the SI and DI.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
Say 'memory lifetime' rather than 'memory life' as lifetime is the more
standard term nowadays (indeed we have e.g. -fno-lifetime-dse).
It's also easier to grep for if someone is looking for the documentation on
where we do that.
gcc/ChangeLog:
* doc/passes.texi (Dead code elimination): Explicitly say 'lifetime'
as this has become the standard term for what we're doing here.
Signed-off-by: Sam James <sam@gentoo.org>
|
|
A run FAIL suddenly shows up today to me:
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c execution test
that I didn't have before.
After investigation, I realize that there is a bug in AVL propagtion PASS.
gcc/ChangeLog:
* config/riscv/riscv-avlprop.cc
(pass_avlprop::get_vlmax_ta_preferred_avl): Don't allow
non-real insn AVL propation.
|
|
As described in PR111401 we currently emit a COND and a PLUS expression
for conditional reductions. This makes it difficult to combine both
into a masked reduction statement later.
This patch improves that by directly emitting a COND_ADD/COND_OP during
ifcvt and adjusting some vectorizer code to handle it.
It also makes neutral_op_for_reduction return -0 if HONOR_SIGNED_ZEROS
is true.
gcc/ChangeLog:
PR middle-end/111401
* internal-fn.cc (internal_fn_else_index): New function.
* internal-fn.h (internal_fn_else_index): Define.
* tree-if-conv.cc (convert_scalar_cond_reduction): Emit COND_OP
if supported.
(predicate_scalar_phi): Add whitespace.
* tree-vect-loop.cc (fold_left_reduction_fn): Add IFN_COND_OP.
(neutral_op_for_reduction): Return -0 for PLUS.
(check_reduction_path): Don't count else operand in COND_OP.
(vect_is_simple_reduction): Ditto.
(vect_create_epilog_for_reduction): Fix whitespace.
(vectorize_fold_left_reduction): Add COND_OP handling.
(vectorizable_reduction): Don't count else operand in COND_OP.
(vect_transform_reduction): Add COND_OP handling.
* tree-vectorizer.h (neutral_op_for_reduction): Add default
parameter.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c: New test.
* gcc.target/riscv/rvv/autovec/cond/pr111401.c: New test.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-2.c: Adjust.
* gcc.target/riscv/rvv/autovec/reduc/reduc_call-4.c: Ditto.
|
|
The following addresses wrong debug IL created by SCCP rewriting stmts
to defined overflow. I addressed another inefficiency there but
needed to adjust the API of rewrite_to_defined_overflow for this
which is now taking a stmt iterator for in-place operation and a
stmt for sequence producing because gsi_for_stmt doesn't work for
stmts not in the IL.
PR tree-optimization/112320
* gimple-fold.h (rewrite_to_defined_overflow): New overload
for in-place operation.
* gimple-fold.cc (rewrite_to_defined_overflow): Add stmt
iterator argument to worker, define separate API for
in-place and not in-place operation.
* tree-if-conv.cc (predicate_statements): Simplify.
* tree-scalar-evolution.cc (final_value_replacement_loop):
Likewise.
* tree-ssa-ifcombine.cc (pass_tree_ifcombine::execute): Adjust.
* tree-ssa-reassoc.cc (update_range_test): Likewise.
* gcc.dg/pr112320.c: New testcase.
|
|
Move stack protector patterns above mov $0,%reg -> xor %reg,%reg
so the later won't interfere with stack protector peephole2s.
gcc/ChangeLog:
* config/i386/i386.md: Move stack protector patterns
above mov $0,%reg -> xor %reg,%reg peephole2 pattern.
|
|
This fixes:
PASS: gcc.dg/vect/vect-gather-2.c (test for excess errors)
-FAIL: gcc.dg/vect/vect-gather-2.c scan-tree-dump vect "different gather base"
-FAIL: gcc.dg/vect/vect-gather-2.c scan-tree-dump vect "different gather scale"
PASS: gcc.dg/vect/vect-gather-2.c scan-tree-dump-not vect "Loop contains only SLP stmts"
..., and enables other test cases.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_gather_load_ifn): True for GCN
target.
|
|
gcc/ChangeLog:
* config/i386/mmx.md (cmlav4hf4): New expander.
(cmla_conjv4hf4): Ditto.
(cmulv4hf3): Ditto.
(cmul_conjv4hf3): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/part-vect-complexhf.c: New test.
|
|
The following patch implements C++26 unevaluated-string.
As it seems to me just extra pedanticity, it is implemented only for
-std=c++26 or -std=gnu++26 and later and only if -pedantic/-pedantic-errors.
Nothing is done for inline asm, while the spec changes those, it changes it
to a balanced token sequence with implementation defined rules on what is
and isn't allowed (so pedantically accepting asm ("" : "+m" (x));
was accepts-invalid before C++26, but we didn't diagnose anything).
For the other spots mentioned in the paper, static_assert message,
linkage specification, deprecated/nodiscard attributes it enforces the
requirements (no prefixes, udlit suffixes, no octal/hexadecimal escapes
(conditional escape sequences were rejected with pedantic already before).
For the deprecated operator "" identifier case I've kept things as is,
because everything seems to have been diagnosed already (a lot being implied
from the string having to be empty).
2023-11-02 Jakub Jelinek <jakub@redhat.com>
PR c++/110342
gcc/cp/
* parser.cc: Implement C++26 P2361R6 - Unevaluated strings.
(uneval_string_attr): New enumerator.
(cp_parser_string_literal_common): Add UNEVAL argument. If true,
pass CPP_UNEVAL_STRING rather than CPP_STRING to
cpp_interpret_string_notranslate.
(cp_parser_string_literal, cp_parser_userdef_string_literal): Adjust
callers of cp_parser_string_literal_common.
(cp_parser_unevaluated_string_literal): New function.
(cp_parser_parenthesized_expression_list): Handle uneval_string_attr.
(cp_parser_linkage_specification): Use
cp_parser_unevaluated_string_literal for C++26.
(cp_parser_static_assert): Likewise.
(cp_parser_std_attribute): Use uneval_string_attr for standard
deprecated and nodiscard attributes.
gcc/testsuite/
* g++.dg/cpp26/unevalstr1.C: New test.
* g++.dg/cpp26/unevalstr2.C: New test.
* g++.dg/cpp0x/udlit-error1.C (lol): Expect an error for C++26
about user-defined literal in deprecated attribute.
libcpp/
* include/cpplib.h (TTYPE_TABLE): Add CPP_UNEVAL_STRING literal
entry. Use C++11 instead of C++-0x in comments.
* charset.cc (convert_escape): Add UNEVAL argument, if true,
pedantically diagnose numeric escape sequences.
(cpp_interpret_string_1): Formatting fix. Adjust convert_escape
caller.
(cpp_interpret_string): Formatting string.
(cpp_interpret_string_notranslate): Pass type through to
cpp_interpret_string if it is CPP_UNEVAL_STRING.
|
|
Notice that there are some reundant 'vimov' codes in attribute.
Committed as it is obvious.
gcc/ChangeLog:
* config/riscv/vector.md: Fix redundant codes in attributes.
|
|
https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/288
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc: Expand non-tuple intrinsics.
* config/riscv/riscv-vector-builtins-functions.def (vcreate): Define non-tuple intrinsics.
* config/riscv/riscv-vector-builtins-shapes.cc (struct vcreate_def): Ditto.
* config/riscv/riscv-vector-builtins.cc: Add arg types.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/tuple_create.c: Rename to vcreate.c.
* gcc.target/riscv/rvv/base/vcreate.c: New test.
|
|
Update in v4:
* Append the check to vectorizable_internal_function.
Update in v3:
* Add func to predicate type size is legal or not for vectorizer call.
Update in v2:
* Fix one ICE of type assertion.
* Adjust some test cases for aarch64 sve and riscv vector.
Original log:
The vectoriable_call has one restriction of the size of data type.
Aka DF to DI is allowed but SF to DI isn't. You may see below message
when try to vectorize function call like lrintf.
void
test_lrintf (long *out, float *in, unsigned count)
{
for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lrintf (in[i]);
}
lrintf.c:5:26: missed: couldn't vectorize loop
lrintf.c:5:26: missed: not vectorized: unsupported data-type
Then the standard name pattern like lrintmn2 cannot work for different
data type size like SF => DI. This patch would like to refine this data
type size check and unblock the standard name like lrintmn2 on conditions.
The type size of vectype_out need to be exactly the same as the type
size of vectype_in when the vectype_out size isn't participating in
the optab selection. While there is no such restriction when the
vectype_out is somehow a part of the optab query.
The below test are passed for this patch.
* The risc-v regression tests.
* Ensure the lrintf standard name in risc-v.
The below test are ongoing.
* The x86 bootstrap and regression test.
* The aarch64 regression test.
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_internal_function): Add type
size check for vectype_out doesn't participating for optab query.
(vectorizable_call): Remove the type size check.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
reduction instruction[PR112327]
Consider this following intrinsic code:
void rvv_dot_prod(int16_t *pSrcA, int16_t *pSrcB, uint32_t n, int64_t *result)
{
size_t vl;
vint16m4_t vSrcA, vSrcB;
vint64m1_t vSum = __riscv_vmv_s_x_i64m1(0, 1);
while (n > 0) {
vl = __riscv_vsetvl_e16m4(n);
vSrcA = __riscv_vle16_v_i16m4(pSrcA, vl);
vSrcB = __riscv_vle16_v_i16m4(pSrcB, vl);
vSum = __riscv_vwredsum_vs_i32m8_i64m1(__riscv_vwmul_vv_i32m8(vSrcA, vSrcB, vl), vSum, vl);
pSrcA += vl;
pSrcB += vl;
n -= vl;
}
*result = __riscv_vmv_x_s_i64m1_i64(vSum);
}
https://godbolt.org/z/vWd35W7G6
Before this patch:
...
Loop:
...
vmv1r.v v2,v1
...
vwredsum.vs v1,v8,v2
...
After this patch:
...
Loop:
...
vwredsum.vs v1,v8,v1
...
PR target/112327
gcc/ChangeLog:
* config/riscv/vector.md: Add '0'.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr112327-1.c: New test.
* gcc.target/riscv/rvv/base/pr112327-2.c: New test.
|
|
|
|
Committing under the "obvious" rule.
ChangeLog:
* .github/CONTRIBUTING.md: Wrap lines more tightly.
|
|
Currently there is an unofficial mirror of GCC on GitHub that people
sometimes submit pull requests to:
https://github.com/gcc-mirror/gcc
However, this is not the proper way to contribute to GCC, so that means
that someone (usually Jonathan Wakely) has to go through the PRs and
manually tell people that they're sending their PRs to the wrong place.
One thing that would help mitigate this problem would be files in a
special .github directory that GitHub would automatically open when
contributors attempt to open a PR, that would then tell them the proper
way to contribute instead. This patch attempts to add two such files.
They are written in Markdown, which I'm realizing might require some
special handling in this repository, since the ".md" extension is also
used for GCC's "Machine Description" files here, but I'm not quite sure
how to go about handling that. Also note that I adapted these files from
equivalent files in the git repository for Git itself:
https://github.com/git/git/blob/master/.github/CONTRIBUTING.md
https://github.com/git/git/blob/master/.github/PULL_REQUEST_TEMPLATE.md
What do people think?
ChangeLog:
* .github/CONTRIBUTING.md: New file.
* .github/PULL_REQUEST_TEMPLATE.md: New file.
|
|
This patch is a follow-up to my previous PR target/110551 patch, this
time to address the additional move after mulx, seen on TARGET_BMI2
architectures (such as -march=haswell). The complication here is
that the flexible multiple-set mulx instruction is introduced into
RTL after reload, by split2, and therefore can't benefit from register
preferencing. This results in RTL like the following:
(insn 32 31 17 2 (parallel [
(set (reg:DI 4 si [orig:101 r ] [101])
(mult:DI (reg:DI 1 dx [109])
(reg:DI 5 di [109])))
(set (reg:DI 5 di [ r+8 ])
(umul_highpart:DI (reg:DI 1 dx [109])
(reg:DI 5 di [109])))
]) "pr110551-2.c":8:17 -1
(nil))
(insn 17 32 9 2 (set (reg:DI 0 ax [107])
(reg:DI 5 di [ r+8 ])) "pr110551-2.c":9:40 90 {*movdi_internal}
(expr_list:REG_DEAD (reg:DI 5 di [ r+8 ])
(nil)))
Here insn 32, the mulx instruction, places its results in si and di,
and then immediately after decides to move di to ax, with di now dead.
This can be trivially cleaned up by a peephole2. I've added an
additional constraint that the two SET_DESTs can't be the same
register to avoid confusing the middle-end, but this has well-defined
behaviour on x86_64/BMI2, encoding a umul_highpart.
For the new test case, compiled on x86_64 with -O2 -march=haswell:
Before:
mulx64: movabsq $-7046029254386353131, %rdx
mulx %rdi, %rsi, %rdi
movq %rdi, %rax
xorq %rsi, %rax
ret
After:
mulx64: movabsq $-7046029254386353131, %rdx
mulx %rdi, %rsi, %rax
xorq %rsi, %rax
ret
2023-11-01 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/110551
* config/i386/i386.md (*bmi2_umul<mode><dwi>3_1): Tidy condition
as operands[2] with predicate register_operand must be !MEM_P.
(peephole2): Optimize a mulx followed by a register-to-register
move, to place result in the correct destination if possible.
gcc/testsuite/ChangeLog
PR target/110551
* gcc.target/i386/pr110551-2.c: New test case.
|
|
Other subword atomic patterns use riscv_subword_address to calculate
the aligned address, shift amount, mask and !mask. atomic_test_and_set
was implemented before the common function was added. After this patch
all subword atomic patterns use riscv_subword_address.
gcc/ChangeLog:
* config/riscv/sync.md: Use riscv_subword_address function to
calculate the address and shift in atomic_test_and_set.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
This patch transitions the ztso testcases to use the testsuite infrastructure,
enabling the tests on both rv64 and rv32 targets.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/amo-table-ztso-amo-add-1.c: Add Ztso extension to
dg-options for dg-do compile.
* gcc.target/riscv/amo-table-ztso-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-5.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-fence-5.c: Ditto.
* gcc.target/riscv/amo-table-ztso-load-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-load-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-load-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-store-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-store-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-store-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: Ditto.
* lib/target-supports.exp: Add testing infrastructure to require the
Ztso extension or add it to an existing -march.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|