Age | Commit message | Author | Files | Lines |
|
Fixes PR go/114463
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/574476
|
|
In some cases combine will "combine" an I2 and I3, but end up putting
exactly the same thing back as I2 as was there before. This is never
progress, so we shouldn't do it: it will lead to oscillating behaviour
and the like.
If we want to canonicalise things, that's fine, but this is not the
way to do it.
2024-03-27 Segher Boessenkool <segher@kernel.crashing.org>
PR rtl-optimization/101523
* combine.cc (try_combine): Don't do a 2-insn combination if
it does not in fact change I2.
|
|
We received an internal question about the Spec File syntax that confused
the literal syntax with the placeholder variables used in the syntax
descriptions.
The following patch attempts to use @var{S} etc. instead of just S
to clarify it stands for any option (or start of option etc.) rather
than literal S, say in %{S:X}. At least in HTML documentation it
then uses italics.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
* doc/invoke.texi (Spec Files): Use @var{S} instead of S,
@var{X} instead of X etc. for other placeholders.
|
|
The following makes sure to record the scalars we add to the BB
reduction vectorization result as scalar uses for the purpose of
computing live lanes. This restores vectorization in the
bondfree.c TU of 435.gromacs.
PR tree-optimization/114057
* tree-vect-slp.cc (vect_bb_slp_mark_live_stmts): Mark
BB reduction remain defs as scalar uses.
|
|
These tests have been FAILing on i686-linux since July last year,
likely since r14-2628. Since that patch, GCC claims _Float16 and __bf16
support even without -msse2, because some functions could be using the
target attribute.
Later r14-2691 added -msse2 to add_options_for_float16, but didn't do that
for bfloat16, plus ext-floating{3,12}.C tests need the added dg-add-options,
so that float16 and bfloat16 effective targets match the __STDCPP_FLOAT16_T__
or __STDCPP_BFLOAT16_T__ macros.
Fixes
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 144)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 146)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 148)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 150)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 152)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++23 (test for errors, line 154)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 144)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 146)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 148)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 150)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 152)
-FAIL: g++.dg/cpp23/ext-floating12.C -std=gnu++26 (test for errors, line 154)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 107)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 114)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 126)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 79)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 86)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for errors, line 98)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 22)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 23)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 24)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++23 (test for warnings, line 25)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 107)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 114)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 126)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 79)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 86)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for errors, line 98)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 22)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 23)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 24)
-FAIL: g++.dg/cpp23/ext-floating3.C -std=gnu++26 (test for warnings, line 25)
on the latter and changes nothing on the former.
2024-03-27 Jakub Jelinek <jakub@redhat.com>
* lib/target-supports.exp (add_options_for_bfloat16): Add -msse2 on
i?86/x86_64.
* g++.dg/cpp23/ext-floating3.C: Add dg-add-options float16.
* g++.dg/cpp23/ext-floating12.C: Add dg-add-options float16 and
bfloat16.
|
|
Due to the Linux kernel exposing the lrcpc3 architectural feature as
"lrcpc3", this patch corrects the relevant FEATURE_STRING entry in the
"rcpc3" AARCH64_OPT_FMV_EXTENSION macro, such that the feature can be
correctly detected when doing native compilation on rcpc3-enabled
targets.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (rcpc3):
Fix FEATURE_STRING field to "lrcpc3".
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/cpunative/info_24: New.
* gcc.target/aarch64/cpunative/native_cpu_24.c: Likewise.
|
|
Given how, at present, the choice of using LSE128 atomic instructions
by the toolchain is delegated to run-time selection in the form of
Libatomic ifuncs, responsible for querying target support, the
`+lse128' target architecture compile-time flag is absent from GCC.
This, however, contrasts with the Binutils implementation, which gates
LSE128 instructions behind the `+lse128' flag. This can lead to
problems in GCC for certain use-cases. One such example is in the use
of inline assembly, whereby the inability of enabling the feature in
the command-line prevents the compiler from automatically issuing the
necessary LSE128 `.arch' directive.
This patch therefore brings GCC into alignment with LLVM and Binutils
in adding support for the `+lse128' architectural extension flag.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def: Add LSE128
AARCH64_OPT_EXTENSION, adding it as a dependency for the D128
feature.
* doc/invoke.texi (AArch64 Options): Document +lse128.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/lse128-flag.c: New.
* gcc.target/aarch64/cpunative/info_23: Likewise.
* gcc.target/aarch64/cpunative/native_cpu_23.c: Likewise.
|
|
For targets where LOGICAL_OP_NON_SHORT_CIRCUIT evaluates to false, two
conditional jumps are emitted instead of a combined conditional which
this test is all about. Thus, set it to true.
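A minimal sketch of the two forms in question, assuming side-effect-free
operands. The function names are hypothetical and only illustrate the idea;
this is not the testcase itself. The first form evaluates b only if a is
nonzero (two conditional jumps), the second evaluates both operands and
combines them into a single condition:

```c
/* Short-circuit form: b is only tested when a is nonzero, which
   typically needs two conditional jumps.  */
int both_short_circuit(int a, int b)
{
    return a && b;
}

/* Combined form: both operands are evaluated and merged into one
   condition.  This is only equivalent when evaluating b has no side
   effects and cannot trap.  */
int both_combined(int a, int b)
{
    return (a != 0) & (b != 0);
}
```

Both functions return the same value for every pair of int arguments; only
the shape of the generated control flow differs.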
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/copy-headers-8.c: Set
LOGICAL_OP_NON_SHORT_CIRCUIT to true.
|
|
|
|
PR libfortran/107031
libgfortran/ChangeLog:
* io/file_pos.c (st_endfile): Remove call to next_record().
gcc/testsuite/ChangeLog:
* gfortran.dg/endfile_5.f90: New test.
|
|
GCC 4.8 complained about the use of const rather than constexpr
for out-of-line static constexprs.
gcc/
* config/aarch64/aarch64-feature-deps.h: Use constexpr for
out-of-line statics.
|
|
GCC was defining bts_offset entry to always contain 0.
When comparing with clang, the same entry is instead a label to the
respective variable or function. The assembler emits relocations for
those labels.
gcc/ChangeLog:
PR target/114431
* btfout.cc (get_name_for_datasec_entry): Add function.
(btf_asm_datasec_entry): Print label when possible.
gcc/testsuite/ChangeLog:
* gcc.dg/debug/btf/btf-datasec-1.c: Correct for new
implementation.
* gcc.dg/debug/btf/btf-datasec-2.c: Likewise.
* gcc.dg/debug/btf/btf-pr106773.c: Likewise.
|
|
Apparently I somehow screwed up the adjustments of the originally tested
testcase: I tweaked it so that in the second/third cases it actually sees
a MAX_EXPR rather than the COND_EXPR the MAX_EXPR has been optimized into,
but didn't update the expected value.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/111151
PR testsuite/114486
* gcc.c-torture/execute/pr111151.c (main): Fix up expected value for
f.
|
|
This patch adds isnormal (and isgreater, isless, isgreaterequal,
islessequal, islessgreater, isunordered) builtins to m2, prototyped
similarly to the corresponding C99 macros.
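For reference, the C99 macros that these builtins mirror behave as in the
following sketch. The demo_* wrapper names are hypothetical; only the
<math.h> macros themselves are standard. Unlike the plain relational
operators, the comparison macros do not raise the invalid floating-point
exception on a quiet NaN operand:

```c
#include <float.h>
#include <math.h>

/* Quiet comparison: 1 if x > y, 0 otherwise (including when either
   operand is a NaN).  */
int demo_isgreater(double x, double y)
{
    return isgreater(x, y) != 0;
}

/* 1 if at least one operand is a NaN.  */
int demo_isunordered(double x, double y)
{
    return isunordered(x, y) != 0;
}

/* 1 if x is neither zero, subnormal, infinite, nor NaN.  */
int demo_isnormal(double x)
{
    return isnormal(x) != 0;
}
```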
gcc/m2/ChangeLog:
PR modula2/114478
* gm2-gcc/m2builtins.cc (struct builtin_macro_definition): New struct.
(lookup_builtin_macro): New function.
(m2builtins_BuildBuiltinTree): Rewrite to lookup builtin function
and builtin macro.
(lookup_builtin_function): New function.
(define_builtin): Rename parameter type to prototype.  Push macro
definition to builtin_macros vector.
(define_builtin_ext): New function.
(define_builtin_math): New function.
(m2builtins_init): Add isgreater, isless, isgreaterequal,
islessequal, islessgreater, isunordered, isnormal to macro definitions.
* gm2-libs/Builtins.def (isgreater): New procedure function.
(isgreaterf): Ditto.
(isgreaterl): Ditto.
(isgreaterequal): Ditto.
(isgreaterequalf): Ditto.
(isgreaterequall): Ditto.
(isless): Ditto.
(islessf): Ditto.
(islessl): Ditto.
(islessequal): Ditto.
(islessequalf): Ditto.
(islessequall): Ditto.
(islessgreater): Ditto.
(islessgreaterf): Ditto.
(islessgreaterl): Ditto.
(isunordered): Ditto.
(isunorderedf): Ditto.
(isunorderedl): Ditto.
(iseqsig): Ditto.
(iseqsigf): Ditto.
(iseqsigl): Ditto.
(isnormal): Ditto.
(isnormalf): Ditto.
(isnormall): Ditto.
(isinf_sign): Ditto.
(isinf_signf): Ditto.
(isinf_signl): Ditto.
* gm2-libs/Builtins.mod (isgreater): New procedure function.
(isgreaterf): Ditto.
(isgreaterl): Ditto.
(isgreaterequal): Ditto.
(isgreaterequalf): Ditto.
(isgreaterequall): Ditto.
(isless): Ditto.
(islessf): Ditto.
(islessl): Ditto.
(islessequal): Ditto.
(islessequalf): Ditto.
(islessequall): Ditto.
(islessgreater): Ditto.
(islessgreaterf): Ditto.
(islessgreaterl): Ditto.
(isunordered): Ditto.
(isunorderedf): Ditto.
(isunorderedl): Ditto.
(iseqsig): Ditto.
(iseqsigf): Ditto.
(iseqsigl): Ditto.
(isnormal): Ditto.
(isnormalf): Ditto.
(isnormall): Ditto.
(isinf_sign): Ditto.
(isinf_signf): Ditto.
(isinf_signl): Ditto.
gcc/testsuite/ChangeLog:
PR modula2/114478
* gm2/builtins/run/pass/builtins-run-pass.exp: New test.
* gm2/builtins/run/pass/testcomparisons.mod: New test.
* gm2/builtins/run/pass/testisnormal.mod: New test.
* gm2/pimlib/run/pass/testchar.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
We used to hit the "Error reporting routines re-entered." ICE here but
it was fixed by Patrick's r14-3809.
PR c++/100557
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-pr100557.C: New test.
|
|
gcc/testsuite/ChangeLog:
* g++.dg/modules/decltype-1_a.C: Add missing } to dg-module-do
directive.
* g++.dg/modules/lambda-5_a.C: Likewise.
|
|
The SCHEDULER_IDENT for these two CPUs was incorrectly set to cortexa55.
This can cause sub-optimal asm to be generated.
gcc/ChangeLog:
PR target/114272
* config/aarch64/aarch64-cores.def (AARCH64_CORE):
Change SCHEDULER_IDENT from cortexa55 to cortexa53
for Cortex-A510 and Cortex-A520.
|
|
I've missed
FAIL: gcc.dg/torture/pr113126.c -O0 (test for excess errors)
etc. regressions on i686-linux since January. The problem is obvious:
Excess errors:
.../gcc/testsuite/gcc.dg/torture/pr113126.c:11:1: warning: MMX vector return without MMX enabled changes the ABI [-Wpsabi]
and I've added -Wno-psabi to dg-additional-options to fix that up.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
* gcc.dg/torture/pr113126.c: Add -Wno-psabi as dg-additional-options.
|
|
As I've tried to explain in the comments, the extract_muldiv_1
MIN/MAX_EXPR optimization is wrong for code == MULT_EXPR.
If the multiplication is done in unsigned type or in signed
type with -fwrapv, it is fairly obvious that max (a, b) * c
in many cases isn't equivalent to max (a * c, b * c) (or min if c is
negative) due to overflows, but even for signed with undefined overflow,
the optimization could turn something without UB into something with UB
(say a * c invokes UB, but max (or min) picks the other operand, for which
b * c doesn't).
As for division/modulo, I think it is in most cases safe, except if
the problematic INT_MIN / -1 case could be triggered, but we can
just punt for MAX_EXPR because for MIN_EXPR if one operand is INT_MIN,
we'd pick that operand already. This is just for completeness: match.pd
already has an optimization which turns x / -1 into -x, so the problematic
INT_MIN / -1 division is mostly theoretical. That is also why in the testcase the
i case isn't actually miscompiled without the patch, while the c and f
cases are.
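The unsigned case can be reproduced with a small sketch. The helper names
below are hypothetical and this is not the pr111151.c testcase, just an
illustration of why distributing the multiplication over max is not valid
when the multiplication can wrap:

```c
#include <stdint.h>

/* Hypothetical helper: unsigned maximum.  */
static uint32_t umax(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/* max (a, b) * c, evaluated as written.  */
uint32_t mul_after_max(uint32_t a, uint32_t b, uint32_t c)
{
    return umax(a, b) * c;
}

/* max (a * c, b * c), the distributed form the optimization would
   produce.  */
uint32_t max_after_mul(uint32_t a, uint32_t b, uint32_t c)
{
    return umax(a * c, b * c);
}
```

With a = 0x80000000, b = 1 and c = 2, the first form wraps to 0, while the
second picks b * c = 2 because a * c has already wrapped to 0.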
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/111151
* fold-const.cc (extract_muldiv_1) <case MAX_EXPR>: Punt for
MULT_EXPR altogether, or for MAX_EXPR if c is -1.
* gcc.c-torture/execute/pr111151.c: New test.
|
|
Similar to the asan and ubsan changes, we shouldn't instrument non-generic
address space accesses with tsan, because we just have library functions
which take address of the objects as generic address space pointers, so they
can't handle anything else.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/111736
* tsan.cc (instrument_expr): Punt on non-generic address space
accesses.
* gcc.dg/tsan/pr111736.c: New test.
|
|
The following fixes too lax verification of vector type compatibility
in vectorizable_operation. When we only have a single vector size then
comparing the number of elements is enough but with SLP we mix those
and thus for operations like BIT_AND_EXPR we need to verify compatible
element types as well. Allow sign changes for ABSU_EXPR though.
PR tree-optimization/114471
* tree-vect-stmts.cc (vectorizable_operation): Verify operand
types are compatible with the result type.
* gcc.dg/vect/pr114471.c: New testcase.
|
|
The following adds missing verification of vector type compatibility
to recurrence vectorization.
PR tree-optimization/114464
* tree-vect-loop.cc (vectorizable_recurr): Verify the latch
vector type is compatible with what we chose for the recurrence.
* g++.dg/vect/pr114464.cc: New testcase.
|
|
I've noticed a comment typo in x86-tune.def and cfgloopmanip.cc has the
same typo as well.
2024-03-26 Jakub Jelinek <jakub@redhat.com>
* cfgloopmanip.cc (update_loop_exit_probability_scale_dom_bbs):
Fix comment typo - multple -> multiple.
* config/i386/x86-tune.def (X86_TUNE_ACCUMULATE_OUTGOING_ARGS):
Likewise.
|
|
I've noticed that the c-c++-common/gomp/depobj-3.c test FAILs on i686-linux:
PASS: c-c++-common/gomp/depobj-3.c -std=c++17 at line 17 (test for warnings, line 15)
FAIL: c-c++-common/gomp/depobj-3.c -std=c++17 at line 39 (test for warnings, line 37)
PASS: c-c++-common/gomp/depobj-3.c -std=c++17 at line 43 (test for errors, line 41)
PASS: c-c++-common/gomp/depobj-3.c -std=c++17 (test for warnings, line 45)
FAIL: c-c++-common/gomp/depobj-3.c -std=c++17 (test for excess errors)
Excess errors:
/home/jakub/src/gcc/gcc/testsuite/c-c++-common/gomp/depobj-3.c:37:38: warning: the 'destroy' expression ''excess_precision_expr' not supported by dump_expr<expression error>' should be the same as the 'depobj' argument 'obj' [-Wopenmp]
The following patch replaces that 'excess_precision_expr' not supported by dump_expr<expression error>
with (float)(((long double)a) + (long double)5)
Still ugly and doesn't actually fix the FAIL (will deal with that
incrementally), but at least valid C/C++ and shows the excess precision
handling in action.
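Written out as a function, the printed expression corresponds to something
like the sketch below. The function name add_five is hypothetical; the
expression is the one quoted above. On a target that evaluates float
arithmetic in long double (excess precision), the addition is carried out
in the wider type and only the final result is converted back to float:

```c
/* The explicit casts mirror what the pretty-printer now shows for an
   EXCESS_PRECISION_EXPR: operands widened to long double, result
   narrowed back to float.  */
float add_five(float a)
{
    return (float)(((long double)a) + (long double)5);
}
```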
2024-03-26 Jakub Jelinek <jakub@redhat.com>
PR c++/112724
gcc/c-family/
* c-pretty-print.cc (pp_c_cast_expression,
c_pretty_printer::expression): Handle EXCESS_PRECISION_EXPR like
NOP_EXPR.
gcc/cp/
* error.cc (dump_expr): Handle EXCESS_PRECISION_EXPR like NOP_EXPR.
|
|
The following fixes out-of-bounds read in the testcase.
PR tree-optimization/114027
* gcc.dg/vect/pr114027.c: Fix iteration count.
|
|
Arm32 predefines __ARM_FEATURE_UNALIGNED unless -mno-unaligned-access is
used, and RISC-V predefines __riscv_misaligned_avoid.
Let's define __mips_strict_alignment when MIPSr6 and -mstrict-align are
used.
Note that this macro is always defined for pre-R6.
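A sketch of how user code might consume such a macro. The AVOID_UNALIGNED
macro and the load_u32_native function are hypothetical names introduced
for illustration; only __mips_strict_alignment and __riscv_misaligned_avoid
come from the text above. Both branches are valid C for any pointer
alignment, the macro merely selects the strategy expected to be cheaper on
the target:

```c
#include <stdint.h>
#include <string.h>

#if defined(__mips_strict_alignment) || defined(__riscv_misaligned_avoid)
#define AVOID_UNALIGNED 1   /* hypothetical: target dislikes unaligned access */
#else
#define AVOID_UNALIGNED 0
#endif

/* Load a 32-bit value in native byte order from a possibly unaligned
   address.  */
uint32_t load_u32_native(const void *p)
{
    uint32_t v;
#if AVOID_UNALIGNED
    /* Copy byte by byte so that no wide load is requested.  */
    const unsigned char *src = (const unsigned char *)p;
    unsigned char *dst = (unsigned char *)&v;
    for (size_t i = 0; i < sizeof v; i++)
        dst[i] = src[i];
#else
    /* A fixed-size memcpy, which the compiler may lower to a single
       (possibly unaligned) load.  */
    memcpy(&v, p, sizeof v);
#endif
    return v;
}
```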
gcc
* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Predefine
__mips_strict_alignment if STRICT_ALIGNMENT.
|
|
|
|
Patrick noticed that my r14-9339-gdc6c3bfb59baab patch is wrong;
we're dealing with a noexcept-spec there, not a noexcept-expr, so
setting cp_noexcept_operand et al is incorrect. Back to the drawing
board then.
To fix noexcept84.C, we should probably avoid doing push_to_top_level
in certain cases. maybe_push_to_top_level didn't work here as-is, so
I changed it to not push to top level if decl_function_context is
non-null, when we are not dealing with a lambda.
This also fixes c++/114349, introduced by r14-9339.
PR c++/114349
gcc/cp/ChangeLog:
* name-lookup.cc (maybe_push_to_top_level): For a non-lambda,
don't push to top level if decl_function_context is non-null.
* pt.cc (maybe_instantiate_noexcept): Use maybe_push_to_top_level.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept85.C: New test.
* g++.dg/cpp0x/noexcept86.C: New test.
|
|
can_init_array_with_p is wrongly saying that the init for 's' here:
struct S {
int *list = arr;
int arr[];
};
struct A {
A() {}
S s[2]{};
};
is invalid. But as process_init_constructor_array says, for "non-constant
initialization of trailing elements with no explicit initializers" we use
a VEC_INIT_EXPR wrapped in a TARGET_EXPR, built in process_init_constructor.
Unfortunately we didn't have a test for this scenario so I didn't
realize can_init_array_with_p must handle it.
PR c++/114439
gcc/cp/ChangeLog:
* init.cc (can_init_array_with_p): Return true for a VEC_INIT_EXPR
wrapped in a TARGET_EXPR.
gcc/testsuite/ChangeLog:
* g++.dg/init/array65.C: New test.
|
|
* de.po: Update.
|
|
* sv.po: Update.
|
|
Add support for the gfx1036 RDNA2 APU integrated graphics devices. The ROCm
documentation warns that these may not be supported, but it seems to work
at least partially.
gcc/ChangeLog:
* config.gcc (amdgcn): Add gfx1036 entries.
* config/gcn/gcn-hsa.h (NO_XNACK): Likewise.
(gcn_local_sym_hash): Likewise.
* config/gcn/gcn-opts.h (enum processor_type): Likewise.
(TARGET_GFX1036): New macro.
* config/gcn/gcn.cc (gcn_option_override): Handle gfx1036.
(gcn_omp_device_kind_arch_isa): Likewise.
(output_file_start): Likewise.
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __gfx1036__.
(TARGET_CPU_CPP_BUILTINS): Rename __gfx1030 to __gfx1030__.
* config/gcn/gcn.opt: Add gfx1036.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX1036): New.
(main): Handle gfx1036.
* config/gcn/t-omp-device: Add gfx1036 isa.
* doc/install.texi (amdgcn): Add gfx1036.
* doc/invoke.texi (-march): Likewise.
libgomp/ChangeLog:
* plugin/plugin-gcn.c (EF_AMDGPU_MACH): GFX1036.
(gcn_gfx1103_s): New.
(isa_hsa_name): Handle gfx1036.
(isa_code): Likewise.
(max_isa_vgprs): Likewise.
|
|
This patch rebuilds the documentation for the target independent
library sections.
gcc/m2/ChangeLog:
* Make-lang.in (doc/m2.pdf): Add line break.
* target-independent/m2/Builtins.texi: Rebuilt.
* target-independent/m2/gm2-libs.texi: Rebuilt.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
This patch allows the RVV intrinsics in a function that carries the
target("arch=+v") attribute when building with rv64gc. For example:
vint32m1_t
__attribute__((target("arch=+v")))
test_1 (vint32m1_t a, vint32m1_t b, size_t vl)
{
return __riscv_vadd_vv_i32m1 (a, b, vl);
}
When built with -march=rv64gc -mabi=lp64d -O3, this produces asm like the following:
test_1:
.option push
.option arch, rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_\
zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl128b1p0_zvl32b1p0_zvl64b1p0
vsetvli zero,a0,e32,m1,ta,ma
vadd.vv v8,v8,v9
ret
riscv_vector.h must be included when using the intrinsic type(s) and
API(s), and the scope of this attribute should not exceed the function
body. Meanwhile, to make the RVV types and API(s) available for this
attribute, including riscv_vector.h no longer reports an error if v is not
present in march.
The following tests pass with this patch:
* The full riscv regression test.
gcc/ChangeLog:
* config/riscv/riscv-c.cc (riscv_pragma_intrinsic): Remove error
when V is disabled and init the RVV types and intrinsic APIs.
* config/riscv/riscv-vector-builtins.cc (expand_builtin): Report
error if V ext is disabled.
* config/riscv/riscv.cc (riscv_return_value_is_vector_type_p):
Ditto.
(riscv_arguments_is_vector_type_p): Ditto.
(riscv_vector_cc_function_p): Ditto.
* config/riscv/riscv_vector.h: Remove error if V is disabled.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pragma-1.c: Remove.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-1.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-2.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-3.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-4.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-5.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-6.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-7.c: New test.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-8.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
|
|
|
|
This patch corrects two error format specifiers.
gcc/m2/ChangeLog:
PR modula2/114444
* gm2-compiler/M2Quads.mod (BuildTruncFunction): Correct
error format specifier.
(BuildFloatFunction): Correct error format specifier.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
This patch inserts a missing quotation mark at the end of a line
if required (after an appropriate error message is generated).
gcc/m2/ChangeLog:
PR modula2/114443
* m2.flex: Call AddTokCharStar with a stringtok if
end of line is reached without a closing quote.
gcc/testsuite/ChangeLog:
PR modula2/114443
* gm2/pim/fail/missingquote.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
[PR114408]
gcc/analyzer/ChangeLog:
PR analyzer/114408
* engine.cc (impl_run_checkers): Free up any dominance info that
we may have created.
* kf.cc (class kf_ubsan_handler): New.
(register_sanitizer_builtins): New.
(register_known_functions): Call register_sanitizer_builtins.
gcc/testsuite/ChangeLog:
PR analyzer/114408
* c-c++-common/analyzer/deref-before-check-pr114408.c: New test.
* c-c++-common/ubsan/analyzer-ice-pr114408.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
This bug was hidden since LO_SUM DLTIND14R addresses are normally
handled by the A constraint in the move patterns.
2024-03-23 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa.cc (pa_output_global_address): Handle
UNSPEC_DLTIND14R addresses.
* config/pa/pa.h (PRINT_OPERAND_ADDRESS): Output "RT'" for
UNSPEC_DLTIND14R address.
|
|
We ICE on the following testcase, because handle_cast was incorrectly
testing !m_first to see whether it should use m_data[m_bitfld_load + 1]
or fresh SSA_NAME for a PHI result.
Now, the routine sometimes temporarily clears m_first between the
prepare_data_in_out call and the !m_first check, and restores it from the
save_first copy only just before returning.
Without this patch, we try to use the same SSA_NAME (_12 here) in 2
different PHI results which is obviously invalid IL and ICEs very quickly.
2024-03-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114433
* gimple-lower-bitint.cc (bitint_large_huge::handle_cast): For
m_bitfld_load check save_first rather than m_first.
* gcc.dg/torture/bitint-68.c: New test.
|
|
The task of the build_bitint_stmt_ssa_conflicts hook for
tree-ssa-coalesce.cc next to special casing the
multiplication/division/modulo is to ignore statements with
large/huge _BitInt lhs which isn't in names bitmap and on the
other side pretend all uses of the stmt are used in a later stmt
(single user of that SSA_NAME or perhaps single user of lhs of
the single user etc.) where the lowering will actually emit the
code.
Unfortunately the function wasn't handling COMPLEX_TYPE of the large/huge
BITINT_TYPE, while the FE doesn't really support such types, they are
used under the hood for __builtin_{add,sub,mul}_overflow{,_p}, they are
also present or absent from the names bitmap and should be treated the same.
Without this patch, the operands of .ADD_OVERFLOW were incorrectly pretended
to be used right in that call statement rather than on the cast stmt from
IMAGPART_EXPR of .ADD_OVERFLOW return value to some integral type.
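For readers unfamiliar with the builtins involved, the following sketch
shows __builtin_add_overflow on plain int; it does not use _BitInt and is
not the bitint-67.c testcase. The wrapper names are hypothetical. Internally
such a call yields a complex-typed value whose real part is the wrapped
result and whose imaginary part is the overflow flag, which is why
COMPLEX_TYPE shows up under the hood:

```c
#include <limits.h>

/* 1 if a + b does not fit in int, 0 otherwise.  */
int add_overflow_flag(int a, int b)
{
    int sum;
    return __builtin_add_overflow(a, b, &sum) ? 1 : 0;
}

/* The result of a + b wrapped to the width of int.  */
int add_wrapped_sum(int a, int b)
{
    int sum;
    (void)__builtin_add_overflow(a, b, &sum);
    return sum;
}
```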
2024-03-23 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/114425
* gimple-lower-bitint.cc (build_bitint_stmt_ssa_conflicts): Handle
_Complex large/huge _BitInt types like the large/huge _BitInt types.
* gcc.dg/torture/bitint-67.c: New test.
|
|
On the following testcases, there is no overlap between data references
within a single iteration, but the data references have size which is twice
as large as the step, which means the data references overlap with the next
iteration which predcom doesn't take into account.
As discussed in the PR, even if the reference size is smaller than step,
if step isn't a multiple of the reference size, there could be overlaps with
some other iteration later on.
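The overlap condition can be illustrated with a brute-force sketch. The
function name and its parameters are hypothetical, and this is not how
tree-predcom.cc implements the check; it only enumerates byte ranges to
show when two references with the same step partially overlap in different
iterations:

```c
#include <stdbool.h>

/* Reference A touches bytes [off_a + i * step, off_a + i * step + size)
   in iteration i, and reference B does the same starting at off_b.
   Return true if, within the first ITERATIONS iterations, some access
   of A overlaps some access of B without covering exactly the same
   bytes.  */
bool partial_overlap_across_iterations(long off_a, long off_b,
                                       long size, long step,
                                       int iterations)
{
    for (int i = 0; i < iterations; i++)
        for (int j = 0; j < iterations; j++) {
            long a = off_a + i * step;
            long b = off_b + j * step;
            bool overlap = a < b + size && b < a + size;
            if (overlap && a != b)
                return true;
        }
    return false;
}
```

With size 2 and step 3, references at offsets 0 and 4 do not overlap within
one iteration, yet A in iteration 1 covers bytes 3..4 and B in iteration 0
covers bytes 4..5. With step 4, a multiple of the size, the same two
references only ever touch identical or disjoint ranges.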
The initial version of the patch regressed (test still passed, but predcom
didn't optimize anymore) pr71083.c which has a packed char, short structure
and was reading/writing the short 2 bytes in there with step 3.
The following patch deals with that by retrying for COMPONENT_REFs also the
aggregate sizes etc., so that it then compares 3 bytes against step 3.
In make check-gcc/check-g++ this patch I believe affects code generation
for only the 2 new testcases according to statistics I've gathered.
2024-03-23 Jakub Jelinek <jakub@redhat.com>
PR middle-end/111683
* tree-predcom.cc (pcom_worker::suitable_component_p): If has_write
and comp_step is RS_NONZERO, return false if any reference in the
component doesn't have DR_STEP a multiple of access size.
* gcc.dg/pr111683-1.c: New test.
* gcc.dg/pr111683-2.c: New test.
|
|
int test(int a) {
return a * 4 + 30000;
}
In the example above, since Xtensa has instructions that add a register value
scaled by 2, 4 or 8 (and corresponding define_insns), we would expect them
to be used. They are not, because the expression is transformed before
reaching the RTL generation pass, as below:
int test(int a) {
return (a + 7500) * 4;
}
Fortunately, the RTL combination pass tries a splitting pattern that matches
the first example, so it is easy to solve by defining that pattern.
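The two forms compute the same value, which can be checked with a small
sketch. The function names are hypothetical, and unsigned arithmetic is
used here (an assumption that differs from the int example above) so that
wraparound is well defined for every input:

```c
#include <stdint.h>

/* Scale first, then add the constant: the shape that maps onto a
   scaled-add instruction.  */
uint32_t scaled_then_added(uint32_t a)
{
    return a * 4u + 30000u;
}

/* Add first, then scale: the shape produced before RTL generation,
   since 30000 == 7500 * 4.  */
uint32_t added_then_scaled(uint32_t a)
{
    return (a + 7500u) * 4u;
}
```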
gcc/ChangeLog:
* config/xtensa/xtensa.md: Add new split pattern described above.
|
|
|
|
gcc/fortran/ChangeLog:
PR fortran/55978
* interface.cc (gfc_compare_actual_formal): Skip size check for
NULL() actual without MOLD argument.
gcc/testsuite/ChangeLog:
PR fortran/55978
* gfortran.dg/null_actual_5.f90: New test.
|
|
gcc/
* config/avr/avr.cc (avr_set_current_function): Adjust diagnostic
for deprecated SIGNAL and INTERRUPT usage without respective header.
|
|
Use dg_add_options riscv_a to add atomic extension when running compile
tests on non-a targets.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/amo-table-ztso-amo-add-1.c: Add
dg_add_options riscv_a.
* gcc.target/riscv/amo-table-ztso-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-amo-add-5.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-5.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-6.c: Ditto.
* gcc.target/riscv/amo-table-ztso-compare-exchange-7.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-1.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-2.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-3.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-4.c: Ditto.
* gcc.target/riscv/amo-table-ztso-subword-amo-add-5.c: Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
The RDNA devices have different cache architectures to the CDNA devices, and
the differences go deeper than just the assembler mnemonics.
I believe this patch is correct according to the documentation in the LLVM
AMDGPU user guide (the ISA manual is less instructive), but I hadn't observed
any real problems before (or after).
gcc/ChangeLog:
* config/gcn/gcn.md (*memory_barrier): Split into RDNA and !RDNA.
(atomic_load<mode>): Adjust RDNA cache settings.
(atomic_store<mode>): Likewise.
(atomic_exchange<mode>): Likewise.
|
|
We run these devices in wavefrontsize64 for compatibility, but they actually
only have 32-lane vectors, natively. If the upper part of a V64 is masked
off (as it is in V32) then RDNA devices will skip execution of the upper part
for most operations, so this adjustment shouldn't leave too much performance on
the table. One exception is memory instructions, so full wavefrontsize32
support would be better.
The advantage is that we avoid the missing V64 operations (such as permute and
vec_extract).
gcc/ChangeLog:
* config/gcn/gcn.cc (gcn_vectorize_preferred_simd_mode): Prefer V32 on
RDNA devices.
|