Age | Commit message (Collapse) | Author | Files | Lines |
|
There is a new buildbot check that all autotool files are generated
with the correct versions (automake 1.15.1 and autoconf 2.69).
https://builder.sourceware.org/buildbot/#/builders/gcc-autoregen
Correct one file that was generated with the wrong version.
libiberty/
* aclocal.m4: Rebuild.
|
|
Passing in a base extension in non-canonical order (i, e, g) causes GCC
to ICE:
xgcc: error: '-march=rv64ge': ISA string is not in canonical order. 'e'
xgcc: internal compiler error: in add, at common/config/riscv/riscv-common.cc:671
...
This is fixed by skipping to the next extension when a non-canonical
order is detected.
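The recovery strategy can be sketched as follows. This is only an illustration, not the code in riscv-common.cc: the canonical-order string, the ParseResult type and the parse_std_ext helper shown here are all hypothetical, and the sketch handles only single-letter extensions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical canonical order, for illustration only; the table used by
// riscv-common.cc may differ.
static const std::string kCanonicalOrder = "mafdqlcbkjtpvn";

struct ParseResult {
  std::string accepted;             // letters taken in canonical order
  std::vector<std::string> errors;  // one diagnostic per rejected letter
};

// Walk the extension letters once.  A letter that sorts before one that was
// already accepted gets a diagnostic and is skipped; parsing then continues
// with the next letter instead of aborting.
ParseResult parse_std_ext(const std::string& letters) {
  ParseResult result;
  std::size_t next_allowed = 0;  // lowest canonical position still acceptable
  for (char c : letters) {
    assert(next_allowed <= kCanonicalOrder.size());
    const std::size_t pos = kCanonicalOrder.find(c);
    if (pos == std::string::npos) {
      result.errors.push_back(std::string("unknown extension: '") + c + "'");
      continue;
    }
    if (pos < next_allowed) {
      result.errors.push_back(std::string("not in canonical order: '") + c + "'");
      continue;  // skip to the next extension
    }
    result.accepted.push_back(c);
    next_allowed = pos + 1;
  }
  return result;
}
```

With this sketch, parse_std_ext("afm") accepts "af" and records one diagnostic for 'm', while the remaining input is still processed.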
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse_std_ext): Emit an error and skip to
the next extension when a non-canonical ordering is detected.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-27.c: New test.
* gcc.target/riscv/arch-28.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
r14-985-gca2007a9bb3074 used the collapsed macro definition
CAN_HAVE_LOCATION_P in gcc-rich-location.cc and r14-977-g8861c80733da5c
in c++'s build_cplus_array_type ().
However, although otherwise correct, the usage of CAN_HAVE_LOCATION_P
in these two spots is misleading, so this patch reverts the aforementioned
two hunks.
gcc/cp/ChangeLog:
* tree.cc (build_cplus_array_type): Revert using the macro
CAN_HAVE_LOCATION_P.
gcc/ChangeLog:
* gcc-rich-location.cc (maybe_range_label_for_tree_type_mismatch::get_text):
Revert using the macro CAN_HAVE_LOCATION_P.
|
|
Fixes: f0e28d8c1371 ("RISC-V: Fix failed hoist in LICM of vmv.v.x instruction")
Since the above commit, we have the following failures:
FAIL: gcc.c-torture/execute/memset-3.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gcc.c-torture/execute/memset-3.c -O3 -g execution test
The commit itself was not at fault; rather, it exposed a latent issue in the
vsetvli pass.
Here's Juzhe's analysis:
We have two types of global vsetvl insertion.
One is earliest fusion at the end of each block.
The other is the LCM-suggested edge vsetvls.
So before this patch, the insertion was as follows:
| (insn 2817 2820 2818 361 (set (reg:SI 67 vtype)
| (unspec:SI [
| (const_int 8 [0x8])
| (const_int 7 [0x7])
| (const_int 1 [0x1]) repeated x2
| ] UNSPEC_VSETVL)) 1708 {vsetvl_vtype_change_only}
| (nil))
| (insn 2818 2817 999 361 (set (reg:SI 67 vtype)
| (unspec:SI [
| (const_int 32 [0x20])
| (const_int 1 [0x1]) repeated x3
| ] UNSPEC_VSETVL)) 1708 {vsetvl_vtype_change_only}
| (nil))
After this patch:
| (insn 2817 2820 2819 361 (set (reg:SI 67 vtype)
| (unspec:SI [
| (const_int 32 [0x20])
| (const_int 1 [0x1]) repeated x3
| ] UNSPEC_VSETVL)) 1708 {vsetvl_vtype_change_only}
| (nil))
| (insn 2819 2817 999 361 (set (reg:SI 67 vtype)
| (unspec:SI [
| (const_int 8 [0x8])
| (const_int 7 [0x7])
| (const_int 1 [0x1]) repeated x2
| ] UNSPEC_VSETVL)) 1708 {vsetvl_vtype_change_only}
| (nil))
The original insertion order is incorrect.
We should insert the earliest-fusion vsetvl first, since that vsetvl information
was already present and was seen by the later LCM; we merely delayed its insertion.
So it should come before the LCM-suggested insertion.
PR target/112447
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): Insert
local vsetvl info before LCM suggested one.
Tested-by: Patrick O'Neill <patrick@rivosinc.com> # pre-commit-CI #679
Co-developed-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
RV64 compare and branch instructions only support 64-bit operands.
At expand time, the backend conservatively zero/sign extends
its operands even if not needed, such as incoming function args
which the ABI/ISA guarantees to be sign-extended already (this is true for
SI, HI, QI operands).
Subsequently REE fails to eliminate them, citing
"missing definition(s)" or "multiple definition(s)",
since function args don't have an explicit definition.
So during expand riscv_extend_comparands (), if an operand is a
subreg-promoted SI with inner DI, which is representative of a function
arg, just peel away the subreg to expose the DI, eliding the sign
extension. As Jeff noted this routine is also used in if-conversion so
potentially can also help there.
Note there are currently patches floating around to improve REE and also a
new pass to eliminate unnecessary extensions, but it is still beneficial
to not generate those extra extensions in the first place. It is obviously
less work for post-reload passes such as REE, but even for earlier
passes, such as combine, having to deal with one less thing and ensuing
fewer combinations is a win too.
Way too many existing tests used to observe this issue.
e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc
It eliminates the SEXT.W.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New.
(riscv_extend_comparands): Call new function on operands.
Tested-by: Patrick O'Neill <patrick@rivosinc.com> # pre-commit-CI #676
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
|
|
The NON_DEPENDENT_EXPR removal exposed that is_direct_enum_init can be
called in a template context on a CONSTRUCTOR that isn't type-dependent
but whose element is.
PR c++/112515
gcc/cp/ChangeLog:
* decl.cc (is_direct_enum_init): Check type-dependence of the
single element.
gcc/testsuite/ChangeLog:
* g++.dg/template/non-dependent30.C: New test.
|
|
Here we're ICEing from strip_typedefs for the partially instantiated
requires-expression when walking its REQUIRES_EXPR_EXTRA_ARGS which
in this case is a TREE_LIST with non-empty TREE_PURPOSE (to hold the
captured local specialization 't' as per build_extra_args) which
strip_typedefs doesn't expect.
We can probably skip walking REQUIRES_EXPR_EXTRA_ARGS at all since it
shouldn't contain any typedefs in the first place, but it seems safer
and more generally useful to just teach strip_typedefs to handle non-empty
TREE_PURPOSE the obvious way. (The code has asserted that TREE_PURPOSE is empty
ever since its inception, i.e. r189298.)
PR c++/101043
gcc/cp/ChangeLog:
* tree.cc (strip_typedefs_expr) <case TREE_LIST>: Handle
non-empty TREE_PURPOSE.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-requires37.C: New test.
|
|
Here when building up the non-dependent .* expression, we crash from
fold_convert on 'b.a' due to this (templated) COMPONENT_REF having an
IDENTIFIER_NODE instead of FIELD_DECL operand that middle-end routines
expect. Like in r14-4899-gd80a26cca02587, this patch fixes this by
replacing the problematic piecemeal folding with a single call to
cp_fully_fold. Also, don't bother building the POINTER_PLUS_EXPR in a
template context. This means the returned non-dependent tree might not
have TREE_SIDE_EFFECTS set when it used to, so we need to compensate
by making build_min_non_dep propagate TREE_SIDE_EFFECTS from the original
arguments like buildN and build_min do.
PR c++/112427
gcc/cp/ChangeLog:
* tree.cc (build_min_non_dep): Propagate TREE_SIDE_EFFECTS from
the original arguments.
(build_min_non_dep_call_vec): Likewise.
* typeck2.cc (build_m_component_ref): Use cp_convert, build2 and
cp_fully_fold instead of fold_build_pointer_plus and fold_convert.
Don't build the POINTER_PLUS_EXPR in a template context.
gcc/testsuite/ChangeLog:
* g++.dg/template/non-dependent29.C: New test.
|
|
potential_constant_expression was incorrectly treating most local
variables from a constexpr function as constant because it wasn't
considering the 'now' parameter. This patch fixes this by relaxing
its var_in_maybe_constexpr_fn checks accordingly, which turns out to
partially fix two recently reported regressions:
PR111703 is a regression caused by r11-550-gf65a3299a521a4 for restricting
constexpr evaluation during warning-dependent folding. The mechanism is
intended to restrict only constant evaluation of the instantiated
non-dependent expression, but it also ends up restricting constant
evaluation occurring during instantiation of the expression, in particular
when instantiating the converted argument 'x' (a VIEW_CONVERT_EXPR) into
a copy constructor call. This seems like a flaw in the mechanism, though
I don't know if we want to fix the mechanism or get rid of it completely
since the original testcases which motivated the mechanism are fixed more
simply by r13-1225-gb00b95198e6720. In any case, this patch partially
fixes this by making us correctly treat 'x' as non-constant which prevents
the problematic warning-dependent folding from occurring at all.
PR112269 is caused by r14-4796-g3e3d73ed5e85e7 for merging tsubst_copy
into tsubst_copy_and_build. tsubst_copy used to exit early when 'args'
was empty, behavior which that commit deliberately didn't preserve.
This early exit masked the fact that COMPLEX_EXPR wasn't handled by
tsubst at all, and is a tree code that apparently we could see during
warning-dependent folding on some targets. A complete fix is to add
handling for this tree code in tsubst_expr, but this patch should fix
the reported testsuite failures since the COMPLEX_EXPRs that crop up
in <complex> are considered non-constant expressions after this patch.
PR c++/111703
PR c++/112269
gcc/cp/ChangeLog:
* constexpr.cc (potential_constant_expression_1) <case VAR_DECL>:
Only consider var_in_maybe_constexpr_fn if 'now' is false.
<case INDIRECT_REF>: Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-fn8.C: New test.
|
|
gcc/ChangeLog:
* config/i386/i386.md (*addqi_ext<mode>_1_slp):
Add "&& " before "reload_completed" in split condition.
(*subqi_ext<mode>_1_slp): Ditto.
(*<any_logic:code>qi_ext<mode>_1_slp): Ditto.
|
|
[PR112540]
PR target/112540
gcc/ChangeLog:
* config/i386/i386.md (*addqi_ext<mode>_1_slp):
Correct operand numbers in split pattern. Replace !Q constraint
of operand 1 with !qm. Add insn constraint.
(*subqi_ext<mode>_1_slp): Ditto.
(*<any_logic:code>qi_ext<mode>_1_slp): Ditto.
|
|
Minor fix-up for commit c09471fbc7588db2480f036aa56a2403d3c03ae5
"nvptx: Add suppport for __builtin_nvptx_brev instrinsic".
gcc/
* doc/extend.texi (Nvidia PTX Built-in Functions): Fix
copy'n'paste-o in '__builtin_nvptx_brev' description.
|
|
This minor tweak to the nvptx backend switches the representation of
the brev instruction from an UNSPEC to instead use the new BITREVERSE
rtx. This allows various RTL optimizations including evaluation (constant
folding) of integer constant arguments at compile-time.
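To illustrate what compile-time evaluation of a bit reversal means, here is a portable sketch. The bit_reverse helper is hypothetical and merely stands in for the operation; it is not the nvptx builtin and makes no claim about the code the backend emits.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the low `width` bits of x.  Being constexpr, a call with constant
// arguments can be evaluated by the compiler, which is the effect the
// BITREVERSE representation makes possible for the optimizers.
constexpr std::uint64_t bit_reverse(std::uint64_t x, int width) {
  assert(width >= 1 && width <= 64);
  std::uint64_t r = 0;
  for (int i = 0; i < width; ++i) {
    r = (r << 1) | (x & 1u);  // move the lowest remaining bit of x into r
    x >>= 1;
  }
  return r;
}

// Both checks are resolved at compile time.
static_assert(bit_reverse(0x00000001u, 32) == 0x80000000u, "lowest bit moves to the top");
static_assert(bit_reverse(0x12345678u, 32) == 0x1E6A2C48u, "32-bit reversal");
```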
gcc/
* config/nvptx/nvptx.md (UNSPEC_BITREV): Delete.
(bitrev<mode>2): Represent using bitreverse.
gcc/testsuite/
* gcc.target/nvptx/brev-2-O2.c: Adjust.
* gcc.target/nvptx/brevll-2-O2.c: Likewise.
Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
|
|
In order to observe effects of a later patch, extend the 'brev' test cases
added in commit c09471fbc7588db2480f036aa56a2403d3c03ae5
"nvptx: Add suppport for __builtin_nvptx_brev instrinsic".
gcc/testsuite/
* gcc.target/nvptx/brev-1.c: Extend.
* gcc.target/nvptx/brev-2.c: Rename to...
* gcc.target/nvptx/brev-2-O2.c: ... this, and extend. Copy to...
* gcc.target/nvptx/brev-2-O0.c: ... this, and adapt for '-O0'.
* gcc.target/nvptx/brevll-1.c: Extend.
* gcc.target/nvptx/brevll-2.c: Rename to...
* gcc.target/nvptx/brevll-2-O2.c: ... this, and extend. Copy to...
* gcc.target/nvptx/brevll-2-O0.c: ... this, and adapt for '-O0'.
|
|
Add the new CDNA register file. We don't support any of the specialized
instructions that use these registers, but they're useful to relieve
register pressure without spilling to stack.
Co-authored-by: Andrew Jenner <andrew@codesourcery.com>
gcc/ChangeLog:
* config/gcn/constraints.md: Add "a" AVGPR constraint.
* config/gcn/gcn-valu.md (*mov<mode>): Add AVGPR alternatives.
(*mov<mode>_4reg): Likewise.
(@mov<mode>_sgprbase): Likewise.
(gather<mode>_insn_1offset<exec>): Likewise.
(gather<mode>_insn_1offset_ds<exec>): Likewise.
(gather<mode>_insn_2offsets<exec>): Likewise.
(scatter<mode>_expr<exec_scatter>): Likewise.
(scatter<mode>_insn_1offset_ds<exec_scatter>): Likewise.
(scatter<mode>_insn_2offsets<exec_scatter>): Likewise.
* config/gcn/gcn.cc (MAX_NORMAL_AVGPR_COUNT): Define.
(gcn_class_max_nregs): Handle AVGPR_REGS and ALL_VGPR_REGS.
(gcn_hard_regno_mode_ok): Likewise.
(gcn_regno_reg_class): Likewise.
(gcn_spill_class): Allow spilling to AVGPRs on TARGET_CDNA1_PLUS.
(gcn_sgpr_move_p): Handle AVGPRs.
(gcn_secondary_reload): Reload AVGPRs via VGPRs.
(gcn_conditional_register_usage): Handle AVGPRs.
(gcn_vgpr_equivalent_register_operand): New function.
(gcn_valid_move_p): Check for validity of AVGPR moves.
(gcn_compute_frame_offsets): Handle AVGPRs.
(gcn_memory_move_cost): Likewise.
(gcn_register_move_cost): Likewise.
(gcn_vmem_insn_p): Handle TYPE_VOP3P_MAI.
(gcn_md_reorg): Handle AVGPRs.
(gcn_hsa_declare_function_name): Likewise.
(print_reg): Likewise.
(gcn_dwarf_register_number): Likewise.
* config/gcn/gcn.h (FIRST_AVGPR_REG): Define.
(AVGPR_REGNO): Define.
(LAST_AVGPR_REG): Define.
(SOFT_ARG_REG): Update.
(FRAME_POINTER_REGNUM): Update.
(DWARF_LINK_REGISTER): Update.
(FIRST_PSEUDO_REGISTER): Update.
(AVGPR_REGNO_P): Define.
(enum reg_class): Add AVGPR_REGS and ALL_VGPR_REGS.
(REG_CLASS_CONTENTS): Add new register classes and add entries for
AVGPRs to all classes.
(REGISTER_NAMES): Add AVGPRs.
* config/gcn/gcn.md (FIRST_AVGPR_REG, LAST_AVGPR_REG): Define.
(AP_REGNUM, FP_REGNUM): Update.
(define_attr "type"): Add vop3p_mai.
(define_attr "unit"): Handle vop3p_mai.
(define_attr "gcn_version"): Add "cdna2".
(define_attr "enabled"): Handle cdna2.
(*mov<mode>_insn): Add AVGPR alternatives.
(*movti_insn): Likewise.
* config/gcn/mkoffload.cc (isa_has_combined_avgprs): New.
(process_asm): Process avgpr_count.
* config/gcn/predicates.md (gcn_avgpr_register_operand): New.
(gcn_avgpr_hard_register_operand): New.
* doc/md.texi: Document the "a" constraint.
gcc/testsuite/ChangeLog:
* gcc.target/gcn/avgpr-mem-double.c: New test.
* gcc.target/gcn/avgpr-mem-int.c: New test.
* gcc.target/gcn/avgpr-mem-long.c: New test.
* gcc.target/gcn/avgpr-mem-short.c: New test.
* gcc.target/gcn/avgpr-spill-double.c: New test.
* gcc.target/gcn/avgpr-spill-int.c: New test.
* gcc.target/gcn/avgpr-spill-long.c: New test.
* gcc.target/gcn/avgpr-spill-short.c: New test.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (max_isa_vgprs): New.
(run_kernel): CDNA2 devices have more VGPRs.
|
|
Remove some unnecessary complexity; no functional change is intended,
although LRA appears to use the constraints from the reload_in/out
patterns, so it's probably an improvement for it to see the real sgprbase
constraints.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (mov<mode>_sgprbase): Add @ modifier.
(reload_in<mode>): Delete.
(reload_out<mode>): Delete.
* config/gcn/gcn.cc (CODE_FOR): Delete.
(get_code_for_##PREFIX##vN##SUFFIX): Delete.
(CODE_FOR_OP): Delete.
(get_code_for_##PREFIX): Delete.
(gcn_secondary_reload): Replace "get_code_for" with "code_for".
|
|
By default the preprocessed output includes linemarkers. This leads to
an error if -pedantic is used as e.g. during bootstrap:
s390-gen-builtins.h:1:3: error: style of line directive is a GCC extension [-Werror]
Fixed by omitting linemarkers while generating s390-gen-builtins.h.
gcc/ChangeLog:
* config/s390/t-s390: Generate s390-gen-builtins.h without
linemarkers.
|
|
The following avoids hoisting of invariants from conditionally
executed parts of an if-converted loop. That now makes a difference
since we perform bitfield lowering even when we do not actually
if-convert the loop. if-conversion deals with resetting flow-sensitive
info when necessary already.
PR tree-optimization/112282
* tree-if-conv.cc (ifcvt_hoist_invariants): Only hoist from
the loop header.
* gcc.dg/torture/pr112282.c: New testcase.
|
|
We have to clear the visited flag on stmts.
* tree-vect-slp.cc (vect_slp_region): Also clear visited flag when
we skipped an instance due to -fdbg-cnt.
|
|
2023-11-15 Jakub Jelinek <jakub@redhat.com>
* LOCAL_PATCHES: Update revisions.
|
|
So that we don't have to bump libubsan.so.1 SONAME, the following patch
reverts part of the changes which removed two handlers. While we don't
actually use them from GCC, we shouldn't remove supported entrypoints
unless SONAME is changed (removal of __interceptor_* or ___interceptor_*
is fine). This is the only removal, other libraries just added some
symbols.
2023-11-15 Jakub Jelinek <jakub@redhat.com>
* ubsan/ubsan_handlers_cxx.h (FunctionTypeMismatchData): Forward
declare.
(__ubsan_handle_function_type_mismatch_v1,
__ubsan_handle_function_type_mismatch_v1_abort): Declare.
* ubsan/ubsan_handlers_cxx.cpp (handleFunctionTypeMismatch,
__ubsan_handle_function_type_mismatch_v1,
__ubsan_handle_function_type_mismatch_v1_abort): New functions readded
for backwards compatibility from older ubsan.
* ubsan/ubsan_interface.inc (__ubsan_handle_function_type_mismatch_v1,
__ubsan_handle_function_type_mismatch_v1_abort): Readd.
|
|
The updated libasan doesn't print __interceptor_free (or __interceptor_malloc)
but free (or malloc), the following patch adjusts the testcase so that it
accepts it.
2023-11-15 Jakub Jelinek <jakub@redhat.com>
* c-c++-common/asan/sanity-check-pure-c-1.c: Adjust for interceptor_
or wrap_ substrings possibly not being emitted in newer libasan.
|
|
This patch just reapplies local patches (will be noted in LOCAL_PATCHES).
|
|
The following patch is result of libsanitizer/merge.sh
from c425db2eb558c263 (yesterday evening).
Bootstrapped/regtested on x86_64-linux and i686-linux (together with
the follow-up 3 patches I'm about to post).
BTW, seems upstream has added riscv64 support for I think lsan/tsan,
so if anyone is willing to try it there, it would be a matter of
copying e.g. the s390*-*-linux* libsanitizer/configure.tgt entry
to riscv64-*-linux* with the obvious s/s390x/riscv64/ change in it.
|
|
This is isomorphic to the LLVM changes [1-2].
On LoongArch, the LL and SC instructions have memory barrier semantics:
- LL: <memory-barrier> + <load-exclusive>
- SC: <store-conditional> + <memory-barrier>
But the compare and swap operation is allowed to fail, and if it fails
the SC instruction is not executed, thus the guarantee of acquiring
semantics cannot be ensured. Therefore, an acquire barrier needs to be
generated when failure_memorder includes an acquire operation.
On CPUs implementing LoongArch v1.10 or later, "dbar 0b10100" is an
acquire barrier; on CPUs implementing LoongArch v1.00, it is a full
barrier. So it's always enough for acquire semantics. OTOH if an
acquire semantic is not needed, we still need the "dbar 0x700" as the
load-load barrier like all LL-SC loops.
[1]:https://github.com/llvm/llvm-project/pull/67391
[2]:https://github.com/llvm/llvm-project/pull/69339
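As a user-level illustration of the failure case described above, here is a minimal sketch. The helpers cas_failure_needs_acquire and try_claim are hypothetical and are not the backend code. The sketch assumes that every failure order other than relaxed and release carries an acquire component.

```cpp
#include <atomic>
#include <cassert>

// True when a failed compare-and-swap must still provide acquire ordering.
// Assumption: relaxed and release have no acquire component; every other
// order is treated as needing one.
constexpr bool cas_failure_needs_acquire(std::memory_order failure) {
  return failure != std::memory_order_relaxed &&
         failure != std::memory_order_release;
}

// Try to move `slot` from 0 to 1.  On failure the value that was seen is
// reported through `observed`, and the acquire failure order lets the caller
// safely read whatever the winning thread published before its store.
bool try_claim(std::atomic<int>& slot, int& observed) {
  int expected = 0;
  const bool ok = slot.compare_exchange_strong(
      expected, 1, std::memory_order_acq_rel, std::memory_order_acquire);
  observed = expected;  // unchanged on success, the current value on failure
  return ok;
}
```

In this sketch the second claim attempt on the same slot fails without performing a store, which is exactly the path where the barrier implied by SC is not available.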
gcc/ChangeLog:
* config/loongarch/loongarch.cc
(loongarch_memmodel_needs_release_fence): Remove.
(loongarch_cas_failure_memorder_needs_acquire): New static
function.
(loongarch_print_operand): Redefine 'G' for the barrier on CAS
failure.
* config/loongarch/sync.md (atomic_cas_value_strong<mode>):
Remove the redundant barrier before the LL instruction, and
emit an acquire barrier on failure if needed by
failure_memorder.
(atomic_cas_value_cmp_and_7_<mode>): Likewise.
(atomic_cas_value_add_7_<mode>): Remove the unnecessary barrier
before the LL instruction.
(atomic_cas_value_sub_7_<mode>): Likewise.
(atomic_cas_value_and_7_<mode>): Likewise.
(atomic_cas_value_xor_7_<mode>): Likewise.
(atomic_cas_value_or_7_<mode>): Likewise.
(atomic_cas_value_nand_7_<mode>): Likewise.
(atomic_cas_value_exchange_7_<mode>): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/cas-acquire.c: New test.
|
|
The Xmethod for std::deque::operator[] has the same bug that I recently
fixed for the std::deque::size() Xmethod. The first node might have
unused capacity at the start, which needs to be accounted for when
indexing into the deque.
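The index arithmetic can be modelled as below. This is a simplified sketch, not the Xmethod itself (which is written in Python): the locate helper and its parameters are hypothetical, and the model assumes fixed-size nodes with the first element stored first_offset slots into the first node.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Map a logical deque index to (node number, slot within that node).
// The unused capacity at the start of the first node is added to the index
// before dividing; leaving it out selects the wrong element once elements
// have been removed from the front.
std::pair<std::size_t, std::size_t> locate(std::size_t index,
                                           std::size_t first_offset,
                                           std::size_t node_capacity) {
  assert(node_capacity > 0 && first_offset < node_capacity);
  const std::size_t slot = first_offset + index;
  return {slot / node_capacity, slot % node_capacity};
}
```

For example, with eight slots per node and three unused slots at the front, logical index 5 lands in slot 0 of the second node rather than slot 5 of the first.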
libstdc++-v3/ChangeLog:
PR libstdc++/112491
* python/libstdcxx/v6/xmethods.py (DequeWorkerBase.index):
Correctly handle unused capacity at the start of the first node.
* testsuite/libstdc++-xmethods/deque.cc: Check index operator
when elements have been removed from the front.
|
|
Fix a typo in a string literal and make the new hash.cc test gracefully
handle missing stacktrace data (see PR 112541).
libstdc++-v3/ChangeLog:
* include/std/stacktrace (basic_stacktrace::at): Fix class name
in exception message.
* testsuite/19_diagnostics/stacktrace/hash.cc: Do not fail if
current() returns a non-empty stacktrace.
|
|
My previous patch series added a new function to check for armv6t2
compatible hardware. But the test was not correctly implemented and
also did not follow the standard naming convention for Arm hw
compatibility tests. Fix both of these issues.
gcc/testsuite:
* lib/target-supports.exp (check_effective_target_arm_arch_v6t2_hw_ok):
Rename to...
(check_effective_target_arm_arch_v6t2_hw): ... this. Fix checks.
* gcc.target/arm/acle/data-intrinsics-armv6.c: Update pre-check.
* gcc.target/arm/acle/data-intrinsics-rbit.c: Likewise.
|
|
Add optimization when trailing elements > leading elements.
Consider the following case:
#include <stdint.h>
typedef int64_t v16di __attribute__ ((vector_size (128)));
__attribute__ ((noipa)) void
f_v16di (int64_t a, int64_t b, int64_t c, int64_t d, int64_t *out)
{
v16di v = {a, b, c, d, d, d, d, d, d, d, d, d, d, d, d, d};
*(v16di *) out = v;
}
https://godbolt.org/z/vWTjbrWGf
Before this patch:
f_v16di:
vsetivli zero,16,e64,m8,ta,ma
vmv.v.x v8,a0
vslide1down.vx v8,v8,a1
vslide1down.vx v8,v8,a2
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vslide1down.vx v8,v8,a3
vse64.v v8,0(a4)
ret
After this patch:
f_v16di:
vsetivli zero,16,e64,m8,ta,ma
vmv.v.x v16,a3
vslide1up.vx v8,v16,a2
vslide1up.vx v16,v8,a1
vslide1up.vx v8,v16,a0
vse64.v v8,0(a4)
ret
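The shape of the new sequence can be modelled in plain C++ as follows. This is a sketch of the idea only, not the code in riscv-v.cc; trailing_same_count, slide1up and init_with_trailing_same are hypothetical helpers, and std::vector stands in for a vector register.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements at the end of the initializer equal to the last one.
std::size_t trailing_same_count(const std::vector<long>& elems) {
  assert(!elems.empty());
  std::size_t n = 1;
  while (n < elems.size() && elems[elems.size() - 1 - n] == elems.back())
    ++n;
  return n;
}

// Model of vslide1up.vx: every element moves up one slot, x enters slot 0,
// and the old top element is dropped.
std::vector<long> slide1up(const std::vector<long>& v, long x) {
  assert(!v.empty());
  std::vector<long> r(v.size());
  r[0] = x;
  for (std::size_t i = 1; i < v.size(); ++i)
    r[i] = v[i - 1];
  return r;
}

// Broadcast the repeated trailing value (vmv.v.x), then slide in the leading
// elements from last to first, one slide per leading element.
std::vector<long> init_with_trailing_same(const std::vector<long>& elems) {
  const std::size_t same = trailing_same_count(elems);
  std::vector<long> v(elems.size(), elems.back());
  for (std::size_t i = elems.size() - same; i-- > 0;)
    v = slide1up(v, elems[i]);
  return v;
}
```

For the 16-element example above, the model needs one broadcast and three slides, which matches the instruction count in the "After" listing.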
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_vector_init_trailing_same_elem): New function.
(expand_vec_init): Add trailing optimization.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/def.h: Add trailing tests.
* gcc.target/riscv/rvv/autovec/vls-vlmax/trailing-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/trailing-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/trailing_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/trailing_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trailing-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trailing-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trailing-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trailing-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trailing-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trailing-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/trailing-7.c: New test.
|
|
Jeff reported this testcase newly FAILs on 16-bit targets, the following
patch adjusts the expected diagnostics for that case.
2023-11-15 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/cpp/if-2.c: Adjust expected diagnostics for 16-bit targets.
|
|
Update in v2:
1. Add more test cases for fixed-vlmax.
2. Add test cases for vls mode.
Original log:
We use the integer mode of the vec_init element when generating the mask for
case 2, but we don't actually need as many bits as the element has.
The unnecessarily large mode may introduce some unnecessary insns.
For example, consider the code below:
typedef int64_t v16di __attribute__ ((vector_size (16 * 8)));
void __attribute__ ((noinline, noclone))
foo (int64_t *out, int64_t x, int64_t y)
{
v16di v = {y, x, y, x, y, x, y, x, y, x, y, x, y, x, y, x};
*(v16di *) out = v;
}
We end up with VDImode when generating the 0b0101010101010101 mask, but
VHImode is actually sufficient here. This patch refines
the mask generation to avoid:
1. Unnecessary scalar to generate big constant mask.
2. Unnecessary vector insn to v0 mask.
Before this patch:
foo:
li a5,-1431654400
li a4,-1431654400 <== unnecessary insn
addi a5,a5,-1365 <== unnecessary insn
addi a4,a4,-1366
slli a5,a5,32 <== unnecessary insn
add a5,a5,a4 <== unnecessary insn
vsetivli zero,16,e64,m8,ta,ma
vmv.v.x v8,a2
vmv.s.x v16,a5
vmv1r.v v0,v16 <== unnecessary insn
vmerge.vxm v8,v8,a1,v0
vse64.v v8,0(a0)
ret
After this patch:
foo:
li a5,-20480
addiw a5,a5,-1366
vsetivli zero,16,e64,m8,ta,ma
vmv.s.x v0,a5
vmv.v.x v8,a2
vmerge.vxm v8,v8,a1,v0
vs8r.v v8,0(a0)
ret
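The underlying arithmetic can be sketched as below. Neither helper is taken from riscv-v.cc: merge_mask and mask_int_bits are hypothetical names, and the sketch assumes one mask bit per element with bit i describing element i.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit i is set when element i has to be merged in on top of the broadcast
// value (the role of v0 for vmerge.vxm).
std::uint64_t merge_mask(const std::vector<long>& elems, long merged) {
  assert(elems.size() <= 64);
  std::uint64_t mask = 0;
  for (std::size_t i = 0; i < elems.size(); ++i)
    if (elems[i] == merged)
      mask |= std::uint64_t{1} << i;
  return mask;
}

// Narrowest of 8, 16, 32 or 64 bits that holds one mask bit per element.
// The element width plays no part in this choice.
unsigned mask_int_bits(std::size_t nelts) {
  assert(nelts >= 1 && nelts <= 64);
  unsigned bits = 8;
  while (bits < nelts)
    bits *= 2;
  return bits;
}
```

For the 16-element {y, x, y, x, ...} initializer this gives the mask 0xAAAA in 16 bits. Read as a signed 16-bit value that is -21846, which equals -20480 - 1366, the constant built by the two scalar instructions in the "After" listing.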
gcc/ChangeLog:
* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
Add inner_mode mask arg for mask int mode.
(get_repeating_sequence_dup_machine_mode): Add mask_bit_mode arg
to get the good enough vector int mode on precision.
(expand_vector_init_merge_repeating_sequence): Pass required args
to above func.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-13.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-14.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-15.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/init-repeat-sequence-9.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/init-repeat-sequence-8.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
This is a fairly obvious patch which disallows RVV modes for load/store
address registers.
PR target/112535
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_legitimate_address_p): Disallow RVV modes base address.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr112535.c: New test.
|
|
The following patch implements C++26 P2864R2 by emitting pedwarn enabled by
the same options as the C++20 and later warnings (i.e. -Wenum-compare,
-Wdeprecated-enum-enum-conversion and -Wdeprecated-enum-float-conversion
which are all enabled by default). I think we still want to allow users
some option workaround, so I am not using error directly. Additionally, for
cxx_dialect >= cxx26 && (complain & tf_warning_or_error) == 0, error_mark_node
is silently returned for these newly ill-formed constructs.
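For readers unfamiliar with the affected constructs, here is a small sketch. The enumerations and helper functions are hypothetical. The two commented-out lines show the kind of expression P2864R2 makes ill-formed, and the functions show a cast-based spelling that stays valid in every dialect.

```cpp
#include <cassert>

enum Color { Red = 1, Green = 2 };
enum Weight { Light = 10, Heavy = 20 };

// Deprecated since C++20 and ill-formed in C++26:
//   int bad = Red + Heavy;     // arithmetic between two different enum types
//   double worse = Red * 1.5;  // enumeration with a floating-point operand

// Explicit conversions state the intent and avoid the diagnostics.
constexpr int combined(Color c, Weight w) {
  return static_cast<int>(c) + static_cast<int>(w);
}

constexpr double scaled(Color c, double factor) {
  return static_cast<int>(c) * factor;
}

static_assert(combined(Red, Heavy) == 21, "explicit casts keep this well-formed");
```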
2023-11-15 Jakub Jelinek <jakub@redhat.com>
gcc/cp/
* typeck.cc: Implement C++26 P2864R2 - Remove Deprecated Arithmetic
Conversion on Enumerations From C++26.
(do_warn_enum_conversions): Return bool rather than void, add COMPLAIN
argument. Use pedwarn rather than warning_at for C++26 and remove
" is deprecated" part of the diagnostics in that case. For SFINAE
in C++26 return true on newly erroneous cases.
(cp_build_binary_op): For C++26 call do_warn_enum_conversions
unconditionally, pass complain argument to it and if it returns true,
return error_mark_node.
* call.cc (build_conditional_expr): Use pedwarn rather than warning_at
for C++26 and remove " is deprecated" part of the diagnostics in that
case and check for complain & tf_warning_or_error. Use emit_diagnostic
with cxx_dialect >= cxx26 ? DK_PEDWARN : DK_WARNING. For SFINAE in
C++26 return error_mark_node on newly erroneous cases.
(build_new_op): Use emit_diagnostic with cxx_dialect >= cxx26
? DK_PEDWARN : DK_WARNING and complain & tf_warning_or_error check
for C++26. For SFINAE in C++26 return error_mark_node on newly
erroneous cases.
gcc/testsuite/
* g++.dg/cpp26/enum-conv1.C: New test.
* g++.dg/cpp2a/enum-conv1.C: Adjust expected diagnostics in C++26.
* g++.dg/diagnostic/enum3.C: Likewise.
* g++.dg/parse/attr3.C: Likewise.
* g++.dg/cpp0x/linkage2.C: Likewise.
|
|
This reverts commit a1ad62ee2fd070854d2137f35614af639c1a94f2.
|
|
LTS GNU/Linux distros from 2018, still in use, don't have
pthread_cond_clockwait. There's no trivial way to detect it so as to
make the test conditional, but there's an easy enough way to silence
the fail due to lack of the function in libc, and that has nothing to
do with the false positive that this is testing against.
for gcc/testsuite/ChangeLog
* g++.dg/tsan/pthread_cond_clockwait.C: Add fallback overload.
|
|
gcc.target/i386/pr95126-m32-[34].c expect push instructions that are
only present with -mno-accumulate-outgoing-args, so make that option
explicit rather than dependent on tuning.
for gcc/testsuite/ChangeLog
* gcc.target/i386/pr95126-m32-3.c: Add
-mno-accumulate-outgoing-args.
* gcc.target/i386/pr95126-m32-4.c: Likewise.
|
|
It's customary to undefine temporary internal macros at the end of the
header that defines them, even such widely-usable ones as
_GLIBCXX_ALWAYS_INLINE, so do so in the header where the define was
recently introduced.
for libstdc++-v3/ChangeLog
* include/bits/stl_bvector.h (_GLIBCXX_ALWAYS_INLINE): Undef.
|
|
gcc/ChangeLog:
* json.cc (selftest::assert_print_eq): Add "loc" param and use
ASSERT_STREQ_AT.
(ASSERT_PRINT_EQ): New macro.
(selftest::test_writing_objects): Use ASSERT_PRINT_EQ to capture
source location of assertion.
(selftest::test_writing_arrays): Likewise.
(selftest::test_writing_float_numbers): Likewise.
(selftest::test_writing_integer_numbers): Likewise.
(selftest::test_writing_strings): Likewise.
(selftest::test_writing_literals): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
|
|
libcpp will generate diagnostics when it encounters things in the main file
that only belong in a header file, such as `#pragma once' or `#pragma GCC
system_header'. But sometimes the main file is a header file that is just
being compiled separately, e.g. to produce a C++ module or a PCH, in which
case such diagnostics should be suppressed. libcpp already has an interface
to request that, so make use of it in the C frontends to prevent libcpp from
issuing unwanted diagnostics when compiling a PCH.
gcc/c-family/ChangeLog:
PR pch/9471
PR pch/47857
* c-opts.cc (c_common_post_options): Set cpp_opts->main_search
so libcpp knows it is compiling a header file separately.
gcc/testsuite/ChangeLog:
PR pch/9471
PR pch/47857
* g++.dg/pch/main-file-warnings.C: New test.
* g++.dg/pch/main-file-warnings.Hs: New test.
* gcc.dg/pch/main-file-warnings.c: New test.
* gcc.dg/pch/main-file-warnings.hs: New test.
|
|
The current implementation calls __detail::__modulo which is relatively
expensive.
A better implementation is possible if we assume that x.ok() && y.ok() == true,
so that n = x.c_encoding() - y.c_encoding() is in [-6, 6]. In this case, it
suffices to return n >= 0 ? n : n + 7.
The above is allowed by [time.cal.wd.nonmembers]/5: the returned value is
unspecified when x.ok() == false or y.ok() == false.
The assembly emitted for x86-64 and ARM can be seen in:
https://godbolt.org/z/nMdc5vv9n.
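The arithmetic can be written out on plain integers as follows. This is a sketch on raw encodings rather than the libstdc++ source; weekday_diff is a hypothetical helper and both inputs are assumed to be valid encodings in [0, 6].

```cpp
#include <cassert>

// Days from weekday y forward to weekday x, both given as encodings in
// [0, 6] (0 = Sunday).  The difference n lies in [-6, 6], so one conditional
// addition of 7 replaces a general modulo.
constexpr unsigned weekday_diff(unsigned x, unsigned y) {
  assert(x <= 6 && y <= 6);
  const int n = static_cast<int>(x) - static_cast<int>(y);
  return static_cast<unsigned>(n >= 0 ? n : n + 7);
}
```

For instance, Monday minus Friday is 3, because Friday plus three days is Monday.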
libstdc++-v3/ChangeLog:
* include/std/chrono (operator-(const weekday&, const weekday&)):
Optimize.
|
|
The following has undefined behaviour (signed overflow) [1]:
weekday max{sys_days{days{numeric_limits<days::rep>::max()}}};
The issue is in this line when __n is very large and __n + 4 overflows:
return weekday(__n >= -4 ? (__n + 4) % 7 : (__n + 5) % 7 + 6);
In addition to fixing this bug, the new implementation makes the compiler emit
shorter and branchless code for x86-64 and ARM [2].
[1] https://godbolt.org/z/1s5bv7KfT
[2] https://godbolt.org/z/zKsabzrhs
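One overflow-free way to do this computation is sketched below. It is not the implementation that went into libstdc++ (see the links above for that) and makes no claim about the generated code. weekday_from_days is a hypothetical helper that takes the day count as a plain int, with day 0 being a Thursday.

```cpp
#include <cassert>
#include <climits>

// Weekday encoding (0 = Sunday) for a count of days since 1970-01-01, which
// was a Thursday (encoding 4).  Reducing n modulo 7 before adding the offset
// keeps every intermediate value within [-6, 10], so no signed overflow can
// occur even for INT_MAX or INT_MIN.
constexpr unsigned weekday_from_days(int n) {
  int r = n % 7;  // truncated remainder, in [-6, 6]
  if (r < 0)
    r += 7;       // now in [0, 6]
  assert(r >= 0 && r <= 6);
  return static_cast<unsigned>((r + 4) % 7);
}
```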
libstdc++-v3/ChangeLog:
* include/std/chrono (weekday::_S_from_days): Fix UB.
* testsuite/std/time/weekday/1.cc: Add test for overflow.
|
|
The current implementation returns
(_M_y & (__is_multiple_of_100 ? 15 : 3)) == 0;
where __is_multiple_of_100 is calculated using an obfuscated algorithm which
saves one ror instruction when compared to _M_y % 100 == 0 [1].
In the leap-year calculation, it is correct to replace the divisibility check
by 100 with one by 25. It turns out that _M_y % 25 == 0 also saves the ror
instruction [2]. Therefore, the obfuscation is not required.
[1] https://godbolt.org/z/5PaEv6a6b
[2] https://godbolt.org/z/55G8rn77e
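The equivalence can be checked with a small sketch. The function names are hypothetical and operate on a plain int instead of the year class; is_leap follows the expression quoted above with the check by 25 substituted, and is_leap_textbook is the usual Gregorian rule used only for comparison.

```cpp
// Leap-year test using divisibility by 25 instead of 100.
// If y is a multiple of 25, it is a leap year only when it is also a
// multiple of 16 (together: a multiple of 400). Otherwise it cannot be
// a multiple of 100, so a multiple of 4 is enough.
constexpr bool is_leap(int y) {
    return (y & (y % 25 == 0 ? 15 : 3)) == 0;
}

// Textbook Gregorian rule, for comparison only.
constexpr bool is_leap_textbook(int y) {
    return y % 4 == 0 && (y % 100 != 0 || y % 400 == 0);
}

// True when both formulations agree on every year in [lo, hi].
constexpr bool agree(int lo, int hi) {
    for (int y = lo; y <= hi; ++y)
        if (is_leap(y) != is_leap_textbook(y))
            return false;
    return true;
}

static_assert(agree(-4000, 4000));
```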
libstdc++-v3/ChangeLog:
* include/std/chrono (year::is_leap): Clear code.
|
|
When year_month_day_last::day() was implemented, Dr. Matthias Kretz realised
that the operation "& 1" wasn't necessary but we did not patch it at that
time. This patch removes the unnecessary operation.
libstdc++-v3/ChangeLog:
* include/std/chrono (year_month_day_last::day): Remove &1.
|
|
In <charconv> we pass the int __base parameter to our internal versions
of <bit> functions, __bit_width and __countr_zero. Those functions are
only defined for unsigned types, so we need to convert the base to
unsigned. The base must be in the range [2,36] so we can mask off the
low bits and then convert that to unsigned, so that we don't need to
care about negative values becoming large unsigned values.
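The conversion can be sketched as below. Everything here is illustrative: countr_zero_unsigned is a hypothetical stand-in for the internal __countr_zero, log2_base is a hypothetical caller, and the mask 0x3f is an assumption (the smallest all-ones mask that covers the maximum base of 36).

```cpp
#include <type_traits>

// Hypothetical stand-in for an unsigned-only bit helper.
// Counts trailing zero bits; returns 0 for x == 0, unlike std::countr_zero.
template <typename T>
constexpr int countr_zero_unsigned(T x) {
    static_assert(std::is_unsigned_v<T>, "unsigned types only");
    int n = 0;
    while (x != 0 && (x & 1u) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

// Bits per digit for a power-of-two base.
// Masking first means a negative int can never turn into a huge
// unsigned value; valid bases in [2, 36] are unchanged by the mask.
constexpr int log2_base(int base) {
    return countr_zero_unsigned(static_cast<unsigned>(base & 0x3f));
}

static_assert(log2_base(2) == 1);
static_assert(log2_base(16) == 4);
```

Passing the int directly would be rejected by the static_assert, which mirrors the constraint the commit message describes.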
libstdc++-v3/ChangeLog:
* include/std/charconv (__from_chars_pow2_base): Convert base to
unsigned for call to __countr_zero.
(__from_chars_alnum): Likewise for call to __bit_width.
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/112348
* include/std/stacktrace (hash<basic_stacktrace<Alloc>>): Fix
type of hash function for entries.
* testsuite/19_diagnostics/stacktrace/hash.cc: New test.
|
|
gcc/analyzer/ChangeLog:
PR analyzer/103533
* sm-taint.cc: Remove "experimental" from comment.
* sm.cc (make_checkers): Always add taint state machine.
gcc/ChangeLog:
PR analyzer/103533
* doc/invoke.texi (Static Analyzer Options): Add the six
-Wanalyzer-tainted-* warnings. Update documentation of each
warning to reflect removed requirement to use
-fanalyzer-checker=taint. Remove discussion of
-fanalyzer-checker=taint.
gcc/testsuite/ChangeLog:
PR analyzer/103533
* c-c++-common/analyzer/attr-tainted_args-1.c: Remove use of
-fanalyzer-checker=taint.
* c-c++-common/analyzer/fread-1.c: Likewise.
* c-c++-common/analyzer/pr104029.c: Likewise.
* gcc.dg/analyzer/pr93032-mztools-signed-char.c: Add params to
work around state explosion.
* gcc.dg/analyzer/pr93032-mztools-unsigned-char.c: Likewise.
* gcc.dg/analyzer/pr93382.c: Remove use of
-fanalyzer-checker=taint.
* gcc.dg/analyzer/switch-enum-taint-1.c: Likewise.
* gcc.dg/analyzer/taint-CVE-2011-2210-1.c: Likewise.
* gcc.dg/analyzer/taint-CVE-2020-13143-1.c: Likewise.
* gcc.dg/analyzer/taint-CVE-2020-13143-2.c: Likewise.
* gcc.dg/analyzer/taint-CVE-2020-13143.h: Likewise.
* gcc.dg/analyzer/taint-alloc-1.c: Likewise.
* gcc.dg/analyzer/taint-alloc-2.c: Likewise.
* gcc.dg/analyzer/taint-alloc-3.c: Likewise.
* gcc.dg/analyzer/taint-alloc-4.c: Likewise.
* gcc.dg/analyzer/taint-alloc-5.c: Likewise.
* gcc.dg/analyzer/taint-assert-BUG_ON.c: Likewise.
* gcc.dg/analyzer/taint-assert-macro-expansion.c: Likewise.
* gcc.dg/analyzer/taint-assert-system-header.c: Likewise.
* gcc.dg/analyzer/taint-assert.c: Likewise.
* gcc.dg/analyzer/taint-divisor-1.c: Likewise.
* gcc.dg/analyzer/taint-divisor-2.c: Likewise.
* gcc.dg/analyzer/taint-merger.c: Likewise.
* gcc.dg/analyzer/taint-ops.c: Delete this test: it was a
duplicate of material in operations.c and data-model-1.c, with
-fanalyzer-checker=taint added.
* gcc.dg/analyzer/taint-read-index-1.c: Remove use of
-fanalyzer-checker=taint.
* gcc.dg/analyzer/taint-read-offset-1.c: Likewise.
* gcc.dg/analyzer/taint-realloc.c: Likewise. Add missing
dg-warning for leak now that the malloc state machine is also
active.
* gcc.dg/analyzer/taint-size-1.c: Remove use of
-fanalyzer-checker=taint.
* gcc.dg/analyzer/taint-size-access-attr-1.c: Likewise.
* gcc.dg/analyzer/taint-write-index-1.c: Likewise.
* gcc.dg/analyzer/taint-write-offset-1.c: Likewise.
* gcc.dg/analyzer/torture/taint-read-index-2.c: Likewise.
* gcc.dg/analyzer/torture/taint-read-index-3.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-1-fixed.c: Likewise. Add
-Wno-pedantic.
* gcc.dg/plugin/taint-CVE-2011-0521-1.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-2-fixed.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-2.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-3-fixed.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-3.c: Likewise. Fix C++-style
comment.
* gcc.dg/plugin/taint-CVE-2011-0521-4.c: Remove use of
-fanalyzer-checker=taint and add -Wno-pedantic. Remove xfail and
add missing dg-warning.
* gcc.dg/plugin/taint-CVE-2011-0521-5-fixed.c: Remove use of
-fanalyzer-checker=taint and add -Wno-pedantic.
* gcc.dg/plugin/taint-CVE-2011-0521-5.c: Likewise.
* gcc.dg/plugin/taint-CVE-2011-0521-6.c: Likewise.
* gcc.dg/plugin/taint-antipatterns-1.c: Remove use of
-fanalyzer-checker=taint.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
r14-5312-g040e5b0edbca861196d9e2ea2af5e805769c8d5d commit.
That commit was ignored because the ChangeLog update could not parse its log
message.
|
|
The -w option was used in gcc.dg/20020206-1.c to ignore warnings if the
'-fprefetch-loop-arrays' option is not supported by the target.
When commit r14-5380-g5c432b0efab54e removed the -w option, some targets
(arm-none-eabi, pru and possibly others) started failing the test:
cc1: warning: '-fprefetch-loop-arrays' not supported for this target
FAIL: gcc.dg/20020206-1.c (test for excess errors)
Fix by instructing DejaGnu to prune the '-fprefetch-loop-arrays'
warning.
gcc/testsuite/ChangeLog:
* gcc.dg/20020206-1.c: Prune warning that
-fprefetch-loop-arrays is not supported.
Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
|
|
Virtual cloned functions have distinct vtable indices, so they must be
streamed explicitly. This patch ensures that DECL_VINDEX is written and
read for cloned functions as well.
PR c++/103499
gcc/cp/ChangeLog:
* module.cc (trees_out::decl_node): Write DECL_VINDEX for
virtual clones.
(trees_in::tree_node): Read DECL_VINDEX for virtual clones.
gcc/testsuite/ChangeLog:
* g++.dg/modules/pr103499_a.C: New test.
* g++.dg/modules/pr103499_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Signed-off-by: Nathan Sidwell <nathan@acm.org>
|