[...]/c-c++-common/Wfree-nonheap-object-3.c:57:24: warning: 'malloc (dealloc_float)' attribute ignored with deallocation functions declared 'inline' [-Wattributes]
[...]/c-c++-common/Wfree-nonheap-object-3.c:51:1: note: deallocation function declared here
[...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_int(int)':
[...]/c-c++-common/Wfree-nonheap-object-3.c:25:20: warning: 'void __builtin_free(void*)' called on pointer 'p' with nonzero offset 4 [-Wfree-nonheap-object]
[...]/c-c++-common/Wfree-nonheap-object-3.c:24:24: note: returned from 'int* alloc_int(int)'
[...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_long(int)':
[...]/c-c++-common/Wfree-nonheap-object-3.c:45:18: warning: 'void dealloc_long(long int*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object]
[...]/c-c++-common/Wfree-nonheap-object-3.c:44:26: note: returned from 'long int* alloc_long(int)'
In function 'void dealloc_float(float*)',
inlined from 'void test_nowarn_float(int)' at [...]/c-c++-common/Wfree-nonheap-object-3.c:68:19:
[...]/c-c++-common/Wfree-nonheap-object-3.c:53:18: warning: 'void __builtin_free(void*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object]
[...]/c-c++-common/Wfree-nonheap-object-3.c: In function 'void test_nowarn_float(int)':
[...]/c-c++-common/Wfree-nonheap-object-3.c:67:28: note: returned from 'float* alloc_float(int)'
PASS: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 25)
FAIL: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 45)
PASS: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 51)
PASS: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 53)
PASS: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for warnings, line 57)
FAIL: c-c++-common/Wfree-nonheap-object-3.c -std=gnu++98 (test for excess errors)
Excess errors:
[...]/c-c++-common/Wfree-nonheap-object-3.c:45:18: warning: 'void dealloc_long(long int*)' called on pointer '<unknown>' with nonzero offset 8 [-Wfree-nonheap-object]
..., that is: decorated 'void dealloc_long(long int*)' instead of plain
'dealloc_long' -- similar to how all the other 'dg-warning's allow for the
decorated function signature in addition to the plain one.
This issue was latent since the test case was added in
commit fe7f75cf16783589eedbab597e6d0b8d35d7e470
"Correct/improve maybe_emit_free_warning (PR middle-end/98166, PR c++/57111, PR middle-end/98160)",
and was finally exposed by my recent
commit 9c03391ba447ff86038d6a34c90ae737c3915b5f
"Tighten 'dg-warning' alternatives in 'c-c++-common/Wfree-nonheap-object{,-2,-3}.c'".
gcc/testsuite/
* c-c++-common/Wfree-nonheap-object-3.c: Fix 'dg-warning' for C++.
|
|
The following patch introduces {add,sub}c5_optab and pattern recognizes
various forms of add with carry and subtract with carry/borrow, see
pr79173-{1,2,3,4,5,6}.c tests on what is matched.
Primarily forms with 2 __builtin_add_overflow or __builtin_sub_overflow
calls per limb (with just one for the least significant one), for
add with carry even when it is hand written in C (for subtraction
reassoc seems to change it too much so that the pattern recognition
doesn't work). __builtin_{add,sub}_overflow are standardized in C23
under ckd_{add,sub} names, so it isn't any longer a GNU only extension.
Note, clang has for these (IMHO badly designed)
__builtin_{add,sub}c{b,s,,l,ll} builtins which don't add/subtract just
a single bit of carry, but basically add 3 unsigned values or
subtract 2 unsigned values from one, and result in carry out of 0, 1, or 2
because of that. If we wanted to introduce those for clang compatibility,
we could and lower them early to just two __builtin_{add,sub}_overflow
calls and let the pattern matching in this patch recognize it later.
I've added expanders for this on ix86 and in addition to that
added various peephole2s (in preparation patches for this patch) to make
sure we get nice (and small) code for the common cases. I think there are
other PRs which request that e.g. for the _{addcarry,subborrow}_u{32,64}
intrinsics, which the patch also improves.
Would be nice if support for these optabs was added to many other targets,
arm/aarch64 and powerpc* certainly have such instructions, I'd expect
in fact that most targets do.
The _BitInt support I'm working on will also need this to emit reasonable
code.
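As a rough illustration of the hand-written form described above, the sketch
below uses two __builtin_add_overflow calls per limb and sums their carries.
The names add_limb and add_n are made up for this example, and whether a given
compiler and target actually turn this into an add-with-carry instruction is an
assumption, not something the sketch guarantees.

```c
#include <assert.h>
#include <stddef.h>

/* Add one limb with carry-in.  Returns the sum limb and stores the
   carry-out (0 or 1) through CARRY_OUT.  Hypothetical helper name.  */
static unsigned long
add_limb (unsigned long x, unsigned long y, unsigned long carry_in,
          unsigned long *carry_out)
{
  unsigned long r;
  assert (carry_in <= 1);
  unsigned long c1 = __builtin_add_overflow (x, y, &r);
  unsigned long c2 = __builtin_add_overflow (r, carry_in, &r);
  /* At most one of the two additions can overflow, so the sum is 0 or 1.  */
  *carry_out = c1 + c2;
  return r;
}

/* p[0..n-1] += q[0..n-1], least significant limb first.
   Returns the carry out of the most significant limb.  */
unsigned long
add_n (unsigned long *p, const unsigned long *q, size_t n)
{
  unsigned long carry = 0;
  for (size_t i = 0; i < n; i++)
    p[i] = add_limb (p[i], q[i], carry, &carry);
  return carry;
}
```

Subtraction with borrow would follow the same shape using
__builtin_sub_overflow.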
2023-06-15 Jakub Jelinek <jakub@redhat.com>
PR middle-end/79173
* internal-fn.def (UADDC, USUBC): New internal functions.
* internal-fn.cc (expand_UADDC, expand_USUBC): New functions.
(commutative_ternary_fn_p): Return true also for IFN_UADDC.
* optabs.def (uaddc5_optab, usubc5_optab): New optabs.
* tree-ssa-math-opts.cc (uaddc_cast, uaddc_ne0, uaddc_is_cplxpart,
match_uaddc_usubc): New functions.
(math_opts_dom_walker::after_dom_children): Call match_uaddc_usubc
for PLUS_EXPR, MINUS_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR unless
other optimizations have been successful for those.
* gimple-fold.cc (gimple_fold_call): Handle IFN_UADDC and IFN_USUBC.
* fold-const-call.cc (fold_const_call): Likewise.
* gimple-range-fold.cc (adjust_imagpart_expr): Likewise.
* tree-ssa-dce.cc (eliminate_unnecessary_stmts): Likewise.
* doc/md.texi (uaddc<mode>5, usubc<mode>5): Document new named
patterns.
* config/i386/i386.md (uaddc<mode>5, usubc<mode>5): New
define_expand patterns.
(*setcc_qi_addqi3_cconly_overflow_1_<mode>, *setccc): Split
into NOTE_INSN_DELETED note rather than nop instruction.
(*setcc_qi_negqi_ccc_1_<mode>, *setcc_qi_negqi_ccc_2_<mode>):
Likewise.
* gcc.target/i386/pr79173-1.c: New test.
* gcc.target/i386/pr79173-2.c: New test.
* gcc.target/i386/pr79173-3.c: New test.
* gcc.target/i386/pr79173-4.c: New test.
* gcc.target/i386/pr79173-5.c: New test.
* gcc.target/i386/pr79173-6.c: New test.
* gcc.target/i386/pr79173-7.c: New test.
* gcc.target/i386/pr79173-8.c: New test.
* gcc.target/i386/pr79173-9.c: New test.
* gcc.target/i386/pr79173-10.c: New test.
|
|
destination [PR79173]
This patch adds subborrow<mode> alternative so that it can have memory
destination and adds various peephole2s which help to match it.
2023-06-15 Jakub Jelinek <jakub@redhat.com>
PR middle-end/79173
* config/i386/i386.md (subborrow<mode>): Add alternative with
memory destination and add for it define_peephole2
TARGET_READ_MODIFY_WRITE/-Os patterns to prefer using memory
destination in these patterns.
|
|
borrow with memory destination [PR79173]
This patch adds various peephole2s which help to recognize add with
carry or subtract with borrow with memory destination.
2023-06-14 Jakub Jelinek <jakub@redhat.com>
PR middle-end/79173
* config/i386/i386.md (*sub<mode>_3, @add<mode>3_carry,
addcarry<mode>, @sub<mode>3_carry, *add<mode>3_cc_overflow_1): Add
define_peephole2 TARGET_READ_MODIFY_WRITE/-Os patterns to prefer
using memory destination in these patterns.
|
|
into fold-const-call.cc
Here is an incremental patch to handle constant folding of these
in fold-const-call.cc rather than gimple-fold.cc.
Not really sure if that is the way to go because it is replacing 28
lines of former code with 65 of new code, for the overall benefit that say
int
foo (long long *p)
{
int one = 1;
long long max = __LONG_LONG_MAX__;
return __builtin_add_overflow (one, max, p);
}
can be now fully folded already in ccp1 pass while before it was only
cleaned up in forwprop1 pass right after it.
On Wed, Jun 14, 2023 at 12:25:46PM +0000, Richard Biener wrote:
> I think that's still very much desirable so this followup looks OK.
> Maybe you can re-base it as prerequisite though?
Rebased then (of course with the UADDC/USUBC handling removed from this
first patch, will be added in the second one).
2023-06-15 Jakub Jelinek <jakub@redhat.com>
* gimple-fold.cc (gimple_fold_call): Move handling of arg0
as well as arg1 INTEGER_CSTs for .UBSAN_CHECK_{ADD,SUB,MUL}
and .{ADD,SUB,MUL}_OVERFLOW calls from here...
* fold-const-call.cc (fold_const_call): ... here.
|
|
This patch adds new RTL and tests for sabd and uabd
PR tree-optimization/109156
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_<su>abd<mode>):
Rename to <su>abd<mode>3.
* config/aarch64/aarch64-sve.md (<su>abd<mode>_3): Rename
to <su>abd<mode>3.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/abd.h: New file.
* gcc.target/aarch64/abd_2.c: New test.
* gcc.target/aarch64/abd_3.c: New test.
* gcc.target/aarch64/abd_4.c: New test.
* gcc.target/aarch64/abd_none_2.c: New test.
* gcc.target/aarch64/abd_none_3.c: New test.
* gcc.target/aarch64/abd_none_4.c: New test.
* gcc.target/aarch64/abd_run_1.c: New test.
* gcc.target/aarch64/sve/abd_1.c: New test.
* gcc.target/aarch64/sve/abd_none_1.c: New test.
* gcc.target/aarch64/sve/abd_2.c: New test.
* gcc.target/aarch64/sve/abd_none_2.c: New test.
|
|
This adds a recognition pattern for the non-widening
absolute difference (ABD).
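For illustration, a loop of the following shape contains the abs (a - b) idiom
that the ChangeLog below refers to. The function name abs_diff is hypothetical,
and whether the vectorizer really emits an ABD internal call for it depends on
the target, the element types and the optimization level; treat it as a sketch
of the source pattern only.

```c
#include <assert.h>
#include <stdlib.h>

/* Non-widening absolute difference: the result has the same element
   type as the inputs.  Hypothetical example function.  */
void
abs_diff (int *restrict out, const int *restrict a, const int *restrict b,
          int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    out[i] = abs (a[i] - b[i]);
}
```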
gcc/ChangeLog:
* doc/md.texi (sabd, uabd): Document them.
* internal-fn.def (ABD): Use new optab.
* optabs.def (sabd_optab, uabd_optab): New optabs.
* tree-vect-patterns.cc (vect_recog_absolute_difference):
Recognize the following idiom abs (a - b).
(vect_recog_sad_pattern): Refactor to use
vect_recog_absolute_difference.
(vect_recog_abd_pattern): Use patterns found by
vect_recog_absolute_difference to build a new ABD
internal call.
|
|
During regression testing of GCC on the LoongArch architecture, the tests in
pr90883.C were found to fail. Investigation showed that the failure was caused
by setting the macro LARCH_CALL_RATIO to too large a value. Different thresholds
that satisfy the test conditions were then evaluated empirically on the actual
LoongArch architecture (using SPEC CPU 2006), and the results showed that the
optimal threshold is 6.
gcc/ChangeLog:
* config/loongarch/loongarch.h (LARCH_CALL_RATIO): Modify the value
of macro LARCH_CALL_RATIO on LoongArch to make it perform optimally.
|
|
This patch optimizes permutations that are suitable for the merge approach.
Consider the following case:
typedef int8_t vnx16qi __attribute__((vector_size (16)));
void __attribute__ ((noipa))
merge0 (vnx16qi x, vnx16qi y, vnx16qi *out)
{
vnx16qi v = __builtin_shufflevector ((vnx16qi) x, (vnx16qi) y, MASK_16);
*(vnx16qi*)out = v;
}
The gimple IR:
v_3 = VEC_PERM_EXPR <x_1(D), y_2(D), { 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }>;
The selector is { 0, 17, 2, 19, 4, 21, 6, 23, 8, 9, 10, 27, 12, 29, 14, 31 }, which has the general form:
{ 0, nunits + 1, 2, nunits + 3, 4, nunits + 5, ... }
For this selector, we can use vmsltu + vmerge to optimize the codegen.
Before this patch:
merge0:
addi a5,sp,16
vl1re8.v v3,0(a5)
li a5,31
vsetivli zero,16,e8,m1,ta,mu
vmv.v.x v2,a5
lui a5,%hi(.LANCHOR0)
addi a5,a5,%lo(.LANCHOR0)
vl1re8.v v1,0(a5)
vl1re8.v v4,0(sp)
vand.vv v1,v1,v2
vmsgeu.vi v0,v1,16
vrgather.vv v2,v4,v1
vadd.vi v1,v1,-16
vrgather.vv v2,v3,v1,v0.t
vs1r.v v2,0(a0)
ret
After this patch:
merge0:
addi a5,sp,16
vl1re8.v v1,0(a5)
lui a5,%hi(.LANCHOR0)
addi a5,a5,%lo(.LANCHOR0)
vsetivli zero,16,e8,m1,ta,ma
vl1re8.v v0,0(a5)
vl1re8.v v2,0(sp)
vmsltu.vi v0,v0,16
vmerge.vvm v1,v1,v2,v0
vs1r.v v1,0(a0)
ret
The key to this optimization is:
1. mask = vmsltu (selector, nunits)
2. result = vmerge (op0, op1, mask)
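The two steps above can be sketched in scalar C as follows. This is only a
model of the idea, not the code in riscv-v.cc: NUNITS, merge_selector_p,
merge_permute and generic_permute are names invented for the example, and the
mask polarity (selector < nunits picks the first operand) is taken from the
"after" assembly shown earlier.

```c
#include <assert.h>
#include <stdint.h>

#define NUNITS 16

/* Return 1 if every lane i selects either op0[i] (sel[i] == i) or
   op1[i] (sel[i] == i + NUNITS), i.e. no element changes lane.  */
int
merge_selector_p (const uint8_t sel[NUNITS])
{
  for (int i = 0; i < NUNITS; i++)
    if (sel[i] != i && sel[i] != i + NUNITS)
      return 0;
  return 1;
}

/* Merge approach: a compare followed by a lane-wise select.  */
void
merge_permute (int8_t out[NUNITS], const int8_t op0[NUNITS],
               const int8_t op1[NUNITS], const uint8_t sel[NUNITS])
{
  assert (merge_selector_p (sel));
  for (int i = 0; i < NUNITS; i++)
    {
      int mask = sel[i] < NUNITS;        /* 1. mask = vmsltu (selector, nunits) */
      out[i] = mask ? op0[i] : op1[i];   /* 2. result = vmerge (op0, op1, mask) */
    }
}

/* Reference two-input permute, for comparison.  */
void
generic_permute (int8_t out[NUNITS], const int8_t op0[NUNITS],
                 const int8_t op1[NUNITS], const uint8_t sel[NUNITS])
{
  for (int i = 0; i < NUNITS; i++)
    {
      int s = sel[i] % (2 * NUNITS);
      out[i] = s < NUNITS ? op0[s] : op1[s - NUNITS];
    }
}
```

For selectors accepted by merge_selector_p both functions produce the same
result, which is why the gather-based sequence can be replaced.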
gcc/ChangeLog:
* config/riscv/riscv-v.cc (shuffle_merge_patterns): New pattern.
(expand_vec_perm_const_1): Add merge optimization.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-7.c: New test.
|
|
The V2 patch addresses comments from Juzhe, thanks.
Hi,
The reason for this bug is that when the vector register is set to a fixed
length (with the `--param=riscv-autovec-preference=fixed-vlmax` option),
TARGET_PASS_BY_REFERENCE thinks that variables of type vint32m1 can be passed
through two scalar registers, but when GCC calls FUNCTION_VALUE (which calls
riscv_get_arg_info internally) it returns NULL_RTX. The two hooks are
inconsistent. For now, all vector arguments and return values are passed
through the function stack; a new calling convention for vector registers
will be added in the future.
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/
https://github.com/palmer-dabbelt/riscv-elf-psabi-doc/commit/126fa719972ff998a8a239c47d506c7809aea363
Best,
Lehua
gcc/ChangeLog:
PR target/110119
* config/riscv/riscv.cc (riscv_get_arg_info): Return NULL_RTX for vector mode
(riscv_pass_by_reference): Return true for vector mode
gcc/testsuite/ChangeLog:
PR target/110119
* gcc.target/riscv/rvv/base/pr110119-1.c: New test.
* gcc.target/riscv/rvv/base/pr110119-2.c: New test.
|
|
This patch is a follow-up to the patch below.
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621347.html
We aligned the predictor style for the define_insn_and_split as suggested
by Kito, to avoid potential issues before we hit them.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Align the predictor style.
* config/riscv/autovec.md: Ditto.
|
|
When constructing a vector mask from individual elements we wrongly
assumed that we can broadcast BITS_PER_WORD (i.e. XLEN). The maximum is
actually the vector element length (i.e. ELEN). This patch fixes this.
This patch fixes the following failures on RV32:
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/vls-vlmax/repeat_run-3.c -std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax execution test
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* config/riscv/riscv-v.cc (rvv_builder::get_merge_scalar_mask):
Take elen instead of scalar BITS_PER_WORD.
(expand_vector_init_merge_repeating_sequence): Use inner_bits_size
instead of scalar BITS_PER_WORD.
|
|
This patch removes a remnant of mudflap.
gcc/ChangeLog:
* config/moxie/uclinux.h (MFWRAP_SPEC): Remove
|
|
Pushing to fix bootstrap.
gcc/ChangeLog:
* config/aarch64/aarch64-sve-builtins-base.cc (svlast_impl::fold):
Fix signed comparison warning in loop from npats to enelts.
|
|
In discussion of this issue CWG decided that the change of behavior on
well-formed code like overload-conv-4.C is undesirable. In further
discussion of possible resolutions, we discovered that we can avoid that
change while still getting the desired behavior on overload-conv-3.C by
making this a tiebreaker after comparing conversions, rather than before.
This also simplifies the implementation.
The issue resolution has not yet been finalized, but this seems like a clear
improvement.
DR 2327
PR c++/86521
gcc/cp/ChangeLog:
* call.cc (joust_maybe_elide_copy): Don't change cand.
(joust): Move the elided tiebreaker later.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/overload-conv-4.C: Remove warnings.
* g++.dg/cpp1z/elide7.C: New test.
|
|
..., so that users don't manually need to specify
'-foffload-options=-lgfortran', '-foffload-options=-lm' in addition to
'-lgfortran', '-lm' (specified manually, or implicitly by the driver).
gcc/
* gcc.cc (driver_handle_option): Forward host '-lgfortran', '-lm'
to offloading compilation.
* config/gcn/mkoffload.cc (main): Adjust.
* config/nvptx/mkoffload.cc (main): Likewise.
* doc/invoke.texi (foffload-options): Update example.
libgomp/
* testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Don't
set.
* testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags):
Likewise.
* testsuite/libgomp.c/simd-math-1.c: Remove
'-foffload-options=-lm'.
* testsuite/libgomp.fortran/fortran-torture_execute_math.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90:
Likewise.
|
|
..., via 'include'ing the existing 'gfortran.fortran-torture/execute/math.f90',
which therefore is enhanced for optional OpenACC 'serial', OpenMP 'target'
usage.
gcc/testsuite/
* gfortran.fortran-torture/execute/math.f90: Enhance for optional
OpenACC 'serial', OpenMP 'target' usage.
libgomp/
* testsuite/libgomp.fortran/fortran-torture_execute_math.f90: New.
* testsuite/libgomp.oacc-fortran/fortran-torture_execute_math.f90:
Likewise.
|
|
'c-c++-common/Wfree-nonheap-object{,-2,-3}.c'
..., added in commit fe7f75cf16783589eedbab597e6d0b8d35d7e470
"Correct/improve maybe_emit_free_warning (PR middle-end/98166, PR c++/57111, PR middle-end/98160)".
These use alternatives like, for example, "AB|CDE|FG", but what really must've
been meant is "A(B|C)D(E|F)G". The former variant also does "work": it matches
any of "AB", or "CDE", or "FG", which are components of the latter variant.
(That means, the former variant matches too loosely.)
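The difference between the two forms can be checked with a small program.
Note the assumptions: 'dg-warning' patterns are Tcl regular expressions, while
this sketch uses POSIX extended regular expressions from <regex.h> as a
stand-in (alternation has the lowest precedence in both), and the helper name
matches is made up for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if TEXT contains a match for the POSIX extended regular
   expression PATTERN, 0 otherwise.  Hypothetical helper.  */
int
matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, matches ("AB|CDE|FG", "FG") is true although "FG" is only a
fragment of the intended message, whereas matches ("A(B|C)D(E|F)G", "FG") is
false and matches ("A(B|C)D(E|F)G", "ACDFG") is true.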
gcc/testsuite/
* c-c++-common/Wfree-nonheap-object-2.c: Tighten 'dg-warning'
alternatives.
* c-c++-common/Wfree-nonheap-object-3.c: Likewise.
* c-c++-common/Wfree-nonheap-object.c: Likewise.
|
|
..., which, presumably, was added by mistake in
commit dce6c58db87ebf7f4477bd3126228e73e4eeee97
"Add support for detecting mismatched allocation/deallocation calls".
gcc/testsuite/
* g++.dg/warn/Wfree-nonheap-object.s: Remove.
|
|
Since there's no evex version for vpcmpeqd ymm, ymm, ymm.
gcc/ChangeLog:
PR target/110227
* config/i386/sse.md (mov<mode>_internal): Use x instead of v
for alternative 2 since there's no evex version for vpcmpeqd
ymm, ymm, ymm.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr110227.c: New test.
|
|
This patch contains a python3 script to check the meta format error
specifications. It also includes about 20 fixes to M2Quads.mod format
specifications.
gcc/m2/ChangeLog:
* Make-lang.in (check-format-error): New rule.
* gm2-compiler/M2MetaError.mod (op): Add calls to InternalError if
digits are detected.
* gm2-compiler/M2Quads.mod (BuildForToByDo): Bugfix to format
specifier.
(BuildLengthFunction): Bugfix to format specifiers.
(BuildOddFunction): Bugfix to format specifiers.
(BuildAbsFunction): Bugfix to format specifiers.
(BuildCapFunction): Bugfix to format specifiers.
(BuildChrFunction): Bugfix to format specifiers.
(BuildOrdFunction): Bugfix to format specifiers.
(BuildMakeAdrFunction): Bugfix to format specifiers.
(BuildSizeFunction): Bugfix to format specifiers.
(BuildBitSizeFunction): Bugfix to format specifiers.
* tools-src/checkmeta.py: New file.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
Quoting "How a computer should talk to people" (as quoted
in "Concepts Error Messages for Humans"):
"Various negative tones or actions are unfriendly: being manipulative,
not giving a second chance, talking down, using fashionable slang,
blaming. We must not seem to blame the person. We should avoid suggesting
that the person is inadequate. Phrases like "you forgot" may seem
harmless, but what if a computer said this to you four or five times in
two minutes? Anyway, the person may disagree, so why risk offense?"
gcc/c-family/ChangeLog:
PR c/84890
* known-headers.cc
(suggest_missing_header::~suggest_missing_header): Reword note to
avoid negative tone of "forgetting".
gcc/cp/ChangeLog:
PR c/84890
* name-lookup.cc (missing_std_header::~missing_std_header): Reword
note to avoid negative tone of "forgetting".
gcc/testsuite/ChangeLog:
PR c/84890
* g++.dg/cpp2a/srcloc3.C: Update expected message.
* g++.dg/lookup/missing-std-include-2.C: Likewise.
* g++.dg/lookup/missing-std-include-3.C: Likewise.
* g++.dg/lookup/missing-std-include-6.C: Likewise.
* g++.dg/lookup/missing-std-include.C: Likewise.
* g++.dg/spellcheck-inttypes.C: Likewise.
* g++.dg/spellcheck-stdint.C: Likewise.
* g++.dg/spellcheck-stdlib.C: Likewise.
* gcc.dg/spellcheck-inttypes.c: Likewise.
* gcc.dg/spellcheck-stdbool.c: Likewise.
* gcc.dg/spellcheck-stdint.c: Likewise.
* gcc.dg/spellcheck-stdlib.c: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/testsuite/
* gfortran.dg/data_array_7.f90: New test.
|
|
gcc/fortran/ChangeLog:
PR fortran/86277
* trans-array.cc (gfc_trans_allocate_array_storage): When passing a
zero-sized array with fixed (= non-dynamic) size, allocate temporary
by the caller, not by the callee.
gcc/testsuite/ChangeLog:
PR fortran/86277
* gfortran.dg/zero_sized_14.f90: New test.
* gfortran.dg/zero_sized_15.f90: New test.
Co-authored-by: Mikael Morin <mikael@gcc.gnu.org>
|
|
I happened to be digging into the specs to understand a build
failure and spotted mflib and mfwrap. Those were used by the
mudflap system which we ripped out years ago and we just missed
these.
I verified x86 still bootstraps after removing these bits.
Pushed to the trunk as obvious.
gcc/
* gcc.cc (LINK_COMMAND_SPEC): Remove mudflap spec handling.
|
|
Spurred by Akari Takahashi's patch to config/sh/divtab.cc, this removes
divtab.cc completely.
divtab.cc was used to calculate a division table for the sh5 media
processor. GCC dropped support for that (unmanufactured) chip back
in 2016 and this file simply got missed AFAICT.
gcc/
* config/sh/divtab.cc: Remove.
|
|
I've noticed that standard_sse_constant_opcode emits some spurious
whitespace around tab, that isn't something which is done for
any other instruction and looks wrong.
2023-06-13 Jakub Jelinek <jakub@redhat.com>
* config/i386/i386.cc (standard_sse_constant_opcode): Remove
superfluous spaces around \t for vpcmpeqd.
|
|
This middle-end patch avoids some redundant RTL for vector initialization
during RTL expansion. For the simple test case:
typedef __int128 v1ti __attribute__ ((__vector_size__ (16)));
__int128 key;
v1ti foo() {
return (v1ti){key};
}
the middle-end currently expands:
(set (reg:V1TI 85) (const_vector:V1TI [ (const_int 0) ]))
(set (reg:V1TI 85) (mem/c:V1TI (symbol_ref:DI ("key"))))
where we create a dead instruction that initializes the vector to zero,
immediately followed by a set of the entire vector. This patch skips
this zeroing instruction when the vector has only a single element.
It also updates the code to indicate when we've cleared the vector,
so that we don't need to initialize zero elements.
2023-06-13 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* expr.cc (store_constructor) <case VECTOR_TYPE>: Don't bother
clearing vectors with only a single element. Set CLEARED if the
vector was initialized to zero.
|
|
Hi,
This patch removes the duplicate `#include "riscv-vector-switch.def"` statement
and adds #undef for the ENTRY and TUPLE_ENTRY macros afterwards.
Best,
Lehua
gcc/ChangeLog:
* config/riscv/riscv-v.cc (struct mode_vtype_group): Remove duplicate
#include.
(ENTRY): Undef.
(TUPLE_ENTRY): Undef.
|
|
gcc/ChangeLog:
* config/riscv/riscv-v.cc (rvv_builder::single_step_npatterns_p): Add comment.
(shuffle_generic_patterns): Ditto.
(expand_vec_perm_const_1): Ditto.
|
|
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/partial/slp-10.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-11.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-13.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-14.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp-15.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-11.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-13.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-14.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-15.c: New test.
|
|
Sorry for introducing bugs in the previous VLA SLP patch.
Consider the following permutation:
_85 = VEC_PERM_EXPR <{ 99, 17, ... }, { 11, 80, ... }, { 0, POLY_INT_CST [4, 4], 1, POLY_INT_CST [5, 4], 2, POLY_INT_CST [6, 4], ... }>;
The correct result should be:
_85 = { 99, 11, 17, 80, ... }
However, the previous patch got this wrong.
Code sequence before this patch:
set mask = { 0, 1, 0, 1, ... }
set v0 = { 99, 17, 99, 17, ... }
set v1 = { 11, 80, 11, 80, ... }
set index = viota (mask) = { 0, 0, 1, 1, 2, 2, ... }
set result = vrgather_mu (v0, v1, index, mask) = { 99, 11, 99, 80 }
The result is incorrect.
After this patch:
set mask = { 0, 1, 0, 1, ... }
set index = viota (mask) = { 0, 0, 1, 1, 2, 2, ... }
set v0 = vrgather ({ 99, 17, 99, 17, ... }, index) = { 99, 99, 17, 17, ... }
set v1 = { 11, 80, 11, 80, ... }
set result = vrgather_mu (v0, v1, index, mask) = { 99, 11, 17, 80 }
This is the expected result.
This issue was discovered by the test added in this patch, with --param=riscv-autovec-lmul=2.
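The two sequences above can be modelled in scalar C to see where they diverge.
This is a sketch under assumptions: the vectors are truncated to four lanes,
and viota, vrgather and vrgather_mu are plain C stand-ins written for this
example (named after the operations in the pseudo-code), not RVV intrinsics or
the code in riscv-v.cc.

```c
#include <assert.h>

#define N 4

/* viota: index[i] = number of set mask bits in the lanes below i.  */
static void
viota (int index[N], const int mask[N])
{
  int count = 0;
  for (int i = 0; i < N; i++)
    {
      index[i] = count;
      count += mask[i] != 0;
    }
}

/* Unmasked gather: out[i] = src[index[i]].  */
static void
vrgather (int out[N], const int src[N], const int index[N])
{
  for (int i = 0; i < N; i++)
    {
      assert (index[i] >= 0 && index[i] < N);
      out[i] = src[index[i]];
    }
}

/* Masked gather, mask-undisturbed: active lanes receive src[index[i]],
   inactive lanes keep the value already in DEST.  */
static void
vrgather_mu (int dest[N], const int src[N], const int index[N],
             const int mask[N])
{
  for (int i = 0; i < N; i++)
    if (mask[i])
      dest[i] = src[index[i]];
}

static const int decompress_mask[N] = { 0, 1, 0, 1 };

/* Sequence before the fix: op0 is used as the merge operand directly.  */
void
decompress_old (int result[N], const int op0[N], const int op1[N])
{
  int index[N];
  viota (index, decompress_mask);
  for (int i = 0; i < N; i++)
    result[i] = op0[i];
  vrgather_mu (result, op1, index, decompress_mask);
}

/* Sequence after the fix: op0 is gathered with the same index first.  */
void
decompress_new (int result[N], const int op0[N], const int op1[N])
{
  int index[N];
  viota (index, decompress_mask);
  vrgather (result, op0, index);
  vrgather_mu (result, op1, index, decompress_mask);
}
```

With op0 = { 99, 17, 99, 17 } and op1 = { 11, 80, 11, 80 }, decompress_old
yields { 99, 11, 99, 80 } and decompress_new yields { 99, 11, 17, 80 },
matching the values quoted in the commit message.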
gcc/ChangeLog:
* config/riscv/riscv-v.cc (emit_vlmax_decompress_insn): Fix bug.
(shuffle_decompress_patterns): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/partial/slp-12.c: New test.
* gcc.target/riscv/rvv/autovec/partial/slp_run-12.c: New test.
|
|
* tree-ssa-loop-ch.cc (ch_base::copy_headers): Free loop BBs.
|
|
If the type of a temporary has mutable members, we can't set TREE_READONLY
on the VAR_DECL; this is parallel to the check in
cp_apply_type_quals_to_decl.
gcc/cp/ChangeLog:
* tree.cc (build_target_expr): Check TYPE_HAS_MUTABLE_P.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/initlist-opt6.C: New test.
|
|
This patch adds a check for whether a function's argument or return value is of
vector type, and emits a warning if so.
There are two exceptions:
- The vector_size attribute.
- The intrinsic functions.
Some test cases need -Wno-psabi added to ignore the warning.
gcc/ChangeLog:
* config/riscv/riscv-protos.h (riscv_init_cumulative_args): Set
warning flag if func is not builtin
* config/riscv/riscv.cc
(riscv_scalable_vector_type_p): Determine whether the type is scalable vector.
(riscv_arg_has_vector): Determine whether the arg is vector type.
(riscv_pass_in_vector_p): Check the vector type param is passed by value.
(riscv_init_cumulative_args): The same as header.
(riscv_get_arg_info): Add the checking.
(riscv_function_value): Check the func return and set warning flag
* config/riscv/riscv.h (INIT_CUMULATIVE_ARGS): Add a flag to
determine whether warning psabi or not.
gcc/testsuite/ChangeLog:
* g++.target/riscv/rvv/base/pr109244.C: Add the -Wno-psabi.
* g++.target/riscv/rvv/base/pr109535.C: Same
* gcc.target/riscv/rvv/base/binop_vx_constraint-120.c: Same
* gcc.target/riscv/rvv/base/integer_compare_insn_shortcut.c: Same
* gcc.target/riscv/rvv/base/mask_insn_shortcut.c: Same
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: Same
* gcc.target/riscv/rvv/base/pr110109-2.c: Same
* gcc.target/riscv/rvv/base/scalar_move-9.c: Same
* gcc.target/riscv/rvv/base/spill-10.c: Same
* gcc.target/riscv/rvv/base/spill-11.c: Same
* gcc.target/riscv/rvv/base/spill-9.c: Same
* gcc.target/riscv/rvv/base/vlmul_ext-1.c: Same
* gcc.target/riscv/rvv/base/zero_base_load_store_optimization.c: Same
* gcc.target/riscv/rvv/base/zvfh-intrinsic.c: Same
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Same
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Same
* gcc.target/riscv/rvv/vsetvl/vsetvl-1.c: Same
* gcc.target/riscv/vector-abi-1.c: New test.
* gcc.target/riscv/vector-abi-2.c: New test.
* gcc.target/riscv/vector-abi-3.c: New test.
* gcc.target/riscv/vector-abi-4.c: New test.
* gcc.target/riscv/vector-abi-5.c: New test.
* gcc.target/riscv/vector-abi-6.c: New test.
Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
|
|
After discussing the -mtp= option with Arm's LLVM developers we'd like to extend
the functionality of the option somewhat.
There are actually 3 system registers that can be accessed for the thread pointer
in aarch32: tpidrurw, tpidruro, tpidrprw. They are all read through the CP15 co-processor
mechanism. The current -mtp=cp15 option reads the tpidruro register.
This patch extends -mtp to allow for the above three explicit tpidr names and
keeps -mtp=cp15 as an alias of -mtp=tpidruro for backwards compatibility.
Bootstrapped and tested on arm-none-linux-gnueabihf.
gcc/ChangeLog:
* config/arm/arm-opts.h (enum arm_tp_type): Remove TP_CP15.
Add TP_TPIDRURW, TP_TPIDRURO, TP_TPIDRPRW values.
* config/arm/arm-protos.h (arm_output_load_tpidr): Declare prototype.
* config/arm/arm.cc (arm_option_reconfigure_globals): Replace TP_CP15
with TP_TPIDRURO.
(arm_output_load_tpidr): Define.
* config/arm/arm.h (TARGET_HARD_TP): Define in terms of TARGET_SOFT_TP.
* config/arm/arm.md (load_tp_hard): Call arm_output_load_tpidr to output
assembly.
(reload_tp_hard): Likewise.
* config/arm/arm.opt (tpidrurw, tpidruro, tpidrprw): New values for
arm_tp_type.
* doc/invoke.texi (Arm Options, mtp): Document new values.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mtp.c: New test.
* gcc.target/arm/mtp_1.c: New test.
* gcc.target/arm/mtp_2.c: New test.
* gcc.target/arm/mtp_3.c: New test.
* gcc.target/arm/mtp_4.c: New test.
|
|
After discussing the -mtp= option with Arm's LLVM developers we'd like to extend
the functionality of the option somewhat.
First of all, there is another TPIDR register that can be used to read the thread pointer:
TPIDRRO_EL0 (which can also be accessed by AArch32 under another name) so it makes sense
to add -mtp=tpidrro_el0. This makes the existing arguments el0, el1, el2, el3 somewhat
inconsistent in their naming so this patch introduces the more "full" names
tpidr_el0, tpidr_el1, tpidr_el2, tpidr_el3 and makes the above short names aliases of these new ones.
Long story short, we preserve backwards compatibility and add a new TPIDR register to access through
-mtp that wasn't available previously.
There is more relevant discussion of the options at https://reviews.llvm.org/D152433 if you're interested.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
PR target/108779
* config/aarch64/aarch64-opts.h (enum aarch64_tp_reg): Add
AARCH64_TPIDRRO_EL0 value.
* config/aarch64/aarch64.cc (aarch64_output_load_tp): Define.
* config/aarch64/aarch64.opt (tpidr_el0, tpidr_el1, tpidr_el2,
tpidr_el3, tpidrro_el0): New accepted values to -mtp=.
* doc/invoke.texi (AArch64 Options): Document new -mtp= options.
gcc/testsuite/ChangeLog:
PR target/108779
* gcc.target/aarch64/mtp_5.c: New test.
* gcc.target/aarch64/mtp_6.c: New test.
* gcc.target/aarch64/mtp_7.c: New test.
* gcc.target/aarch64/mtp_8.c: New test.
* gcc.target/aarch64/mtp_9.c: New test.
|
|
C++ requires inline functions to be declared inline and defined in
every translation unit that uses them. frange_nextafter is used in
gimple-range-op.cc but it's only defined as inline in
range-op-float.cc. Drop the extraneous inline specifier.
Other non-static inline functions in range-op-float.cc are not
referenced elsewhere, so I'm making them static.
for gcc/ChangeLog
* range-op-float.cc (frange_nextafter): Drop inline.
(frelop_early_resolve): Add static.
(frange_float): Likewise.
|
|
The following fixes native interpretation of a buffer as boolean
vector with bit-precision elements such as AVX512 vectors. The
check whether the buffer covers the whole vector was broken for
bit-precision elements and the following instead implements it
based on the vector type size.
PR middle-end/110232
* fold-const.cc (native_interpret_vector): Use TYPE_SIZE_UNIT
to check whether the buffer covers the whole vector.
* gcc.target/i386/pr110232.c: New testcase.
|
|
Alias analysis was treating .MASK_LOAD as reading a full vector
which means we disambiguate against decls of smaller than vector size.
This complements the previous patch handling .MASK_STORE and fixes
runtime execution FAILs of gfortran.dg/matmul_3.f90 and
gfortran.dg/inline_sum_2.f90 when using AVX512 with full masked loop
vectorization on Zen4.
* tree-ssa-alias.cc (ref_maybe_used_by_call_p_1): For
.MASK_LOAD and friends set the size of the access to unknown.
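The reason a masked load cannot be assumed to touch a full vector can be shown with a scalar model. This is only a sketch: masked_load and kLanes are hypothetical names, and the loop models the semantics of a masked load rather than GCC's internal function.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kLanes = 4;  // hypothetical vector width in lanes

// Scalar model of a masked load: only lanes whose mask bit is set read
// memory; inactive lanes take 'fill' and never touch base[i].
std::array<int, kLanes> masked_load(const int* base,
                                    const std::array<bool, kLanes>& mask,
                                    int fill) {
    std::array<int, kLanes> result{};
    for (std::size_t i = 0; i < kLanes; ++i)
        result[i] = mask[i] ? base[i] : fill;
    return result;
}
```

With a mask of {true, true, false, false}, the load is valid on a two-element array even though the vector has four lanes, so an analysis that assumes a full-vector access would wrongly conclude the load cannot refer to that smaller object.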
|
|
Update powerpc tests for the extra zero_extend removal done by the default ree pass.
2023-06-13 Ajit Kumar Agarwal <aagarwa1@linux.ibm.com>
gcc/testsuite/ChangeLog:
PR testsuite/109880
* gcc.target/powerpc/fold-vec-extract-int.p8.c: Update test.
|
|
This patch makes the newly added test cases pr109932-{1,2}.c
check the int128 effective target to avoid an unsupported type error
on 32-bit. I did hit this failure during testing and fixed
it, but made the stupid mistake of not updating the locally formatted
patch, which was actually out of date.
PR testsuite/110230
PR target/109932
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr109932-1.c: Adjust with int128 effective target.
* gcc.target/powerpc/pr109932-2.c: Ditto.
|
|
This patch is an alternative solution for a recent fix in analysis of
iterated component association.
To recap, if the iterated expression is an aggregate, we want to
propagate the component type downward with a call to Resolve_Aggr_Expr;
otherwise we want this expression to be only preanalysed (since the
association might need to be repeatedly evaluated), but also we need to
apply predicate and range checks to the expression itself (these are
required for GNATprove).
It turns out that Resolve_Aggr_Expr already knows how to deal with a
nested aggregate and also works for GNATprove, where it both preanalyzes
the expression and applies necessary checks.
In other words, the expression of an iterated component association is now
resolved just like the expression of an ordinary array aggregate.
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Component_Association): Simply resolve
the expression.
|
|
If a quantified expression says "for all ... of F(...)"
where F(...) is a function call that returns on the secondary
stack, we need to clean up the secondary stack. This patch
adds the required ss_mark/ss_release in that case.
gcc/ada/
* exp_ch4.adb
(Expand_N_Quantified_Expression): Detect the secondary-stack
case, and find the innermost scope where we should mark/release,
and Set_Uses_Sec_Stack on that. Skip intermediate blocks and loops
that are part of expansion.
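The mark/release discipline mentioned above can be sketched in C++ with a scope guard. This is a conceptual model only: SecondaryStack, MarkReleaseGuard and evaluate_quantified are hypothetical names, and the code does not reproduce the GNAT run-time or the expansion performed in exp_ch4.adb.

```cpp
#include <cstddef>

// Hypothetical stand-in for a secondary stack; it only tracks its top.
class SecondaryStack {
public:
    std::size_t mark() const { return top_; }           // remember the top
    void release(std::size_t saved) { top_ = saved; }   // pop back to a mark
    void allocate(std::size_t bytes) { top_ += bytes; } // push a temporary
    std::size_t used() const { return top_; }
private:
    std::size_t top_ = 0;
};

// Scope guard: takes a mark on entry and releases to it on exit.
class MarkReleaseGuard {
public:
    explicit MarkReleaseGuard(SecondaryStack& stack)
        : stack_(stack), saved_(stack.mark()) {}
    ~MarkReleaseGuard() { stack_.release(saved_); }
    MarkReleaseGuard(const MarkReleaseGuard&) = delete;
    MarkReleaseGuard& operator=(const MarkReleaseGuard&) = delete;
private:
    SecondaryStack& stack_;
    std::size_t saved_;
};

// Models evaluating "for all ... of F(...)": the result of F lives on the
// secondary stack only for the duration of the enclosing scope.
bool evaluate_quantified(SecondaryStack& stack) {
    MarkReleaseGuard guard(stack);
    stack.allocate(64);  // temporary returned by the hypothetical F(...)
    return true;         // guard's destructor releases the temporary here
}
```

After evaluate_quantified returns, the stack is back at the level it had on entry, which is the cleanup the patch arranges for at the innermost suitable scope.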
|
|
As iterated_component_association is an array_component_association
(because of a grammar rule Ada 2022 RM 4.3.3(5/5)), its expression is
repeatedly evaluated (because of Ada 2022 RM 6.1.1(22.14/5)).
With this patch we will now get errors for both conjuncts in this code,
which have semantically equivalent array aggregates that use an ordinary
component association and iterated component association.
procedure Iter (S : String)
with Post => String'(for J in 1 .. 3 => S (S'First)'Old) =
String'( 1 .. 3 => S (S'First)'Old);
gcc/ada/
* sem_util.adb (Is_Repeatedly_Evaluated): Recognize iterated component
association as repeatedly evaluated.
|
|
Routine Is_Potentially_Unevaluated was written for Ada 2012, but now we
use it for Ada 2022 as well, so it must recognize iterated component
associations (which were added by Ada 2022) as an array component
association.
gcc/ada/
* sem_util.adb (Is_Potentially_Unevaluated): Recognize iterated
component association as potentially unevaluated.
|
|
Instead of explicitly disabling inlining in quantified expressions
(which happen to be only preanalysed) and then disabling inlining in
potentially unevaluated contexts that are fully analysed (which happen
to include quantified expressions), we now simply disable inlining in
all potentially unevaluated contexts, regardless of the full analysis
mode.
This also disables inlining in iterated component associations, which
can be either preanalysed or fully analysed depending on their expression,
but nevertheless are potentially unevaluated.
gcc/ada/
* sem_res.adb (Resolve_Call): Replace early call to
In_Quantified_Expression with a call to Is_Potentially_Unevaluated that
was only done when Full_Analysis is true.
|
|
This patch allows subprograms to be annotated with aspect
Always_Terminates that requires a boolean expression. When this
expression evaluates to True, the subprogram is required to terminate or
raise an exception, but not loop infinitely.
This aspect is only meant to be used by GNATprove and it has no
meaningful run-time semantics: either the annotated subprogram
terminates and then the aspect expression doesn't matter, or the
subprogram loops infinitely and there is nothing we can do. (We could
also evaluate the aspect expression just to detect run-time errors in
the expression itself, but this can be implemented later, after
backend support for the aspect is added to GNATprove.)
Implementation of this aspect is heavily based on the implementation of
Subprogram_Variant, which in turn is heavily based on the implementation
of Contract_Cases. Since the new aspect is not yet expanded, there is no
corresponding assertion kind that would control the expansion.
gcc/ada/
* aspects.ads (Aspect_Id): Add new aspect.
(Implementation_Defined_Aspect): New aspect is
implementation-defined.
(Aspect_Argument): New aspect has an expression argument.
(Is_Representation_Aspect): New aspect is not a representation
aspect.
(Aspect_Names): Link new aspect identifier with a name.
(Aspect_Delay): New aspect is never delayed.
* contracts.adb (Expand_Subprogram_Contract): Mention new aspect
in comment.
(Add_Contract_Item): Attach pragma corresponding to the new aspect
to contract items.
(Analyze_Entry_Or_Subprogram_Contract): Analyze pragma
corresponding to the new aspect that appears with subprogram spec.
(Analyze_Subprogram_Body_Stub_Contract): Expand pragma
corresponding to the new aspect.
* contracts.ads
(Add_Contract_Item, Analyze_Entry_Or_Subprogram_Contract)
(Analyze_Entry_Or_Subprogram_Body_Contract)
(Analyze_Subprogram_Body_Stub_Contract): Mention new aspect in
comment.
* einfo-utils.adb (Get_Pragma): Return pragma attached to
contract.
* einfo-utils.ads (Get_Pragma): Mention new contract in comment.
* exp_prag.adb (Expand_Pragma_Always_Terminates): Placeholder for
possibly expanding new aspect.
* exp_prag.ads (Expand_Pragma_Always_Terminates): Dedicated
routine for expansion of the new aspect.
* inline.adb (Remove_Aspects_And_Pragmas): Remove aspect from
inlined bodies.
* par-prag.adb (Prag): Postpone checking of the pragma until
analysis.
* sem_ch12.adb: Mention new aspect in explanation of handling
contracts on generic units.
* sem_ch13.adb (Analyze_Aspect_Specifications): Convert new aspect
into a corresponding pragma.
(Check_Aspect_At_Freeze_Point): Don't expect new aspect.
* sem_prag.adb (Analyze_Always_Terminates_In_Decl_Part): Analyze
pragma corresponding to the new aspect.
(Analyze_Pragma): Handle pragma corresponding to the new aspect.
(Is_Non_Significant_Pragma_Reference): Handle references appearing
within new aspect.
* sem_prag.ads (Aspect_Specifying_Pragma): New aspect can be
emulated with a pragma.
(Assertion_Expression_Pragma): New aspect has an assertion
expression.
(Pragma_Significant_To_Subprograms): New aspect is significant to
subprograms.
(Analyze_Always_Terminates_In_Decl_Part): Add spec for routine
that analyses new aspect.
(Find_Related_Declaration_Or_Body): Mention new aspect in comment.
* sem_util.adb (Is_Subprogram_Contract_Annotation): New aspect is
a subprogram contract annotation.
* sem_util.ads (Is_Subprogram_Contract_Annotation): Mention new
aspect in comment.
* sinfo.ads (Is_Generic_Contract_Pragma): New pragma is a generic
contract.
(Contract): Explain attaching new pragma to subprogram contract.
* snames.ads-tmpl (Name_Always_Terminates): New name for the new
contract.
(Pragma_Always_Terminates): New pragma identifier.
|