Age | Commit message (Collapse) | Author | Files | Lines |
|
gcc/ada/
* sem_prag.adb: (Analyze_Pragma): Reduce the number of nested if
statements.
|
|
Restore the original state of Style_Check pragmas before analyzing
each compilation unit to avoid Style_Check pragmas from unit affecting
the style checks of a different unit.
gcc/ada/
* sem_ch10.adb: (Analyze_Compilation_Unit): Restore the orignal
state of style check pragmas at the end of the analysis.
|
|
This occurs when the component is part of a discriminated type and its
offset depends on a discriminant, the problem being that the front-end
generates an incomplete Bit_Position attribute reference.
gcc/ada/
* exp_pakd.adb (Get_Base_And_Bit_Offset): Use the full component
reference instead of just the selector name for 'Bit_Position.
|
|
Previously, in this patch: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635392.html
I use vect64 && vect128 to represent both RVV and AMDGCN. However, it caused additional FAIL on ARM SVE.
I don't know why ARM SVE vect64 is set as true since their AdvSIMD is 128bit vector and they don't use 64bit vector.
So, here we leverage current AMDGCN solution, just add RISCV like AMDGCN.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/bb-slp-cond-1.c: Add riscv.
|
|
With the latest trunk, case pr106550_1.c runs with failure on ppc under -m32.
Previously, this case failed with ICE due to PR111971. Now, this emission is
exposed.
While, the case is testing 64bit constant building. So, "has_arch_ppc64"
is required.
PR target/112340
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr106550_1.c: Add has_arch_ppc64 target require.
|
|
This patch fixed the fellowing failed testcases on the trunk:
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c scan-assembler-times \\tvfwredusum\\.vs\\tv[0-9]+,v[0-9]+,v[0-9]+,v0\\.t 2
...
FAIL: gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c scan-assembler-times \\tvwredsumu\\.vs\\tv[0-9]+,v[0-9]+,v[0-9]+,v0\\.t 3
...
The reason for these failed testcases is the introduce of .VCOND_MASK_LEN
in midend for other bugfix and further leads to a new vcond_mask_len rtl
pattern after expand. So we need add new combine patterns handle this case.
Consider this code:
int16_t foo (int8_t *restrict a, int8_t *restrict pred)
{
int16_t sum = 0;
for (int i = 0; i < 16; i += 1)
if (pred[i])
sum += a[i];
return sum;
}
Before this patch:
foo:
vsetivli zero,16,e8,m1,ta,ma
vle8.v v0,0(a1)
vsetvli a5,zero,e8,m1,ta,ma
vmsne.vi v0,v0,0
vsetvli zero,zero,e16,m2,ta,ma
li a3,0
vmv.v.i v2,0
vsetivli zero,16,e16,m2,ta,ma
vle8.v v6,0(a0),v0.t
vmv.s.x v1,a3
vsetvli a5,zero,e16,m2,ta,ma
vsext.vf2 v4,v6
vsetivli zero,16,e16,m2,tu,ma
vmerge.vvm v2,v2,v4,v0
vsetvli a5,zero,e16,m2,ta,ma
vredsum.vs v2,v2,v1
vmv.x.s a0,v2
slliw a0,a0,16
sraiw a0,a0,16
ret
After this patch:
foo:
vsetivli zero,16,e16,m2,ta,ma
li a5,0
vle8.v v0,0(a1)
vmv.s.x v1,a5
vsetvli zero,zero,e8,m1,ta,ma
vmsne.vi v0,v0,0
vle8.v v2,0(a0),v0.t
vwredsum.vs v1,v2,v1,v0.t
vsetvli zero,zero,e16,m1,ta,ma
vmv.x.s a0,v1
slliw a0,a0,16
sraiw a0,a0,16
ret
Combine the vsext.vf2, vmerge.vvm, and vredsum.vs instructions while
reducing the corresponding vsetvl instructions.
gcc/ChangeLog:
* config/riscv/autovec-opt.md (*cond_len_<optab><v_double_trunc><mode>):
New combine pattern.
(*cond_len_<optab><v_quad_trunc><mode>): Ditto.
(*cond_len_<optab><v_oct_trunc><mode>): Ditto.
(*cond_len_extend<v_double_trunc><mode>): Ditto.
(*cond_len_widen_reduc_plus_scal_<mode>): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c:
* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c:
|
|
vect-sdiv-pow2-1.c for RVV#
RVV didn't explictly enable DIV_POW2 optab but we cen vectorize it.
We should check pattern recognition instead of explicit pattern check.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-sdiv-pow2-1.c: Fix dump check.
|
|
RVV didn't explicitly enable SAD optab but we can vectorize it
since loop vectorizer is able to recognize SAD pattern for RVV during analysis.
Current scan check of explicit SAD pattern looks odd,
it should be more reasonable to check recognition of SAD pattern during Loop vectorize analysis.
Other SAD tests like slp-reduc-sad-2.c are checking pattern recognition instead of explicit pattern enable.
Fix SAD dump check to fix the FAILS for RVV.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-reduc-sad.c: Fix check.
* gcc.dg/vect/vect-reduc-sad.c: Ditto.
|
|
RVV is variable length vector but also has 256 bit VLS mode vector.
This test is vectorized as:
f:
vsetivli zero,8,e32,m2,ta,ma
vle32.v v2,0(a0)
vmv.v.i v4,1
vle16.v v1,0(a1)
vmseq.vv v0,v2,v4
vsetvli zero,zero,e16,m1,ta,ma
vmseq.vi v1,v1,2
vsetvli zero,zero,e32,m2,ta,ma
vmv.v.i v2,0
vmand.mm v0,v0,v1
vmerge.vvm v2,v2,v4,v0
vse32.v v2,0(a0)
ret
Use 256 bit vector, so remove XFAIL for 256 bits vector.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/bb-slp-43.c: Fix XPASS for RVV.
|
|
I notice we failed to AVL propagate for reduction with more complicate situation:
double foo (double *__restrict a,
double *__restrict b,
double *__restrict c,
int n)
{
double result = 0;
for (int i = 0; i < n; i++)
result += a[i] * b[i] * c[i];
return result;
}
vsetvli a5,a3,e8,mf8,ta,ma -> should be fused into e64m1,TU
slli a4,a5,3
vle64.v v3,0(a0)
vle64.v v1,0(a1)
vsetvli a6,zero,e64,m1,ta,ma -> redundant
vfmul.vv v1,v1,v3
vsetvli zero,a5,e64,m1,tu,ma -> redundant
vle64.v v3,0(a2)
vfmacc.vv v2,v1,v3
add a0,a0,a4
add a1,a1,a4
add a2,a2,a4
sub a3,a3,a5
bne a3,zero,.L3
The failed AVL propgation causes redundant AVL/VL togglling.
The root cause as follows:
vsetvl a5, zero
vadd.vv def r136
vsetvl zero, a3, ... TU
vsub.vv (use r136)
We propagate AVL (r136) from 'vsub.vv' into 'vadd.vv' when 'vsub.vv' is TA policy.
However, it's too restrict so we missed optimization here. We enhance AVL propation
for TU policy for following situation:
vsetvl a5, zero
vadd.vv def r136
vsetvl zero, a3, ... TU
vsub.vv (use r136, merge != r136)
Note that we should only propagate AVL when merge != r136 for 'vsub.vv' doesn't
depend on the tail elements.
After this patch:
vsetvli a5,a3,e64,m1,tu,ma
slli a4,a5,3
vle64.v v3,0(a0)
vle64.v v1,0(a1)
vfmul.vv v1,v1,v3
vle64.v v3,0(a2)
vfmacc.vv v2,v3,v1
add a0,a0,a4
add a1,a1,a4
add a2,a2,a4
sub a3,a3,a5
bne a3,zero,.L3
PR target/112399
gcc/ChangeLog:
* config/riscv/riscv-avlprop.cc
(pass_avlprop::get_vlmax_ta_preferred_avl): Enhance AVL propagation.
* config/riscv/t-riscv: Add new include.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/imm_switch-2.c: Adapt test.
* gcc.target/riscv/rvv/autovec/pr112399.c: New test.
|
|
This patch would like to support the FP below API auto vectorization
with different type size
+---------+-----------+----------+
| API | RV64 | RV32 |
+---------+-----------+----------+
| iceil | DF => SI | DF => SI |
| iceilf | - | - |
| lceil | - | DF => SI |
| lceilf | SF => DI | - |
| llceil | - | - |
| llceilf | SF => DI | SF => DI |
+---------+-----------+----------+
Given below code:
void
test_lceilf (long *out, float *in, unsigned count)
{
for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lceilf (in[i]);
}
Before this patch:
.L3:
flw fa0,0(s0)
addi s0,s0,4
addi s1,s1,8
call ceilf
fcvt.l.s a5,fa0,rtz
sd a5,-8(s1)
bne s2,s0,.L3
ld ra,24(sp)
ld s0,16(sp)
ld s1,8(sp)
ld s2,0(sp)
addi sp,sp,32
jr ra
After this patch:
fsrmi 3 // RUP mode
.L3:
vsetvli a5,a2,e32,mf2,ta,ma
vle32.v v2,0(a1)
slli a3,a5,2
slli a4,a5,3
vfwcvt.x.f.v v1,v2
sub a2,a2,a5
vse64.v v1,0(a0)
add a1,a1,a3
add a0,a0,a4
bne a2,zero,.L3
Unfortunately, the HF mode is not include due to it requires
additional middle-end support from internal-fun.def.
gcc/ChangeLog:
* config/riscv/autovec.md: Remove the size check of lceil.l
* config/riscv/riscv-v.cc (expand_vec_lceil): Leverage
emit_vec_rounding_to_integer for ceil.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-iceil-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iceil-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceil-rv32-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lceilf-rv64-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llceilf-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llceilf-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iceil-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceil-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lceilf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llceilf-0.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
This reverts commit ee7ba242cf43884477f09e59d9b80af4bf91d143.
|
|
This patch fixes:
FAIL: gcc.dg/vect/bb-slp-cond-1.c -flto -ffat-lto-objects scan-tree-dump-times vect "loop vectorized" 1
FAIL: gcc.dg/vect/bb-slp-cond-1.c scan-tree-dump-times vect "loop vectorized" 1
For RVV, "loop vectorized" appears 2 times instead of 1. Because:
optimized: loop vectorized using 16 byte vectors
optimized: loop vectorized using 8 byte vectors
As long as targets have both 64bit and 128bit vectors, it will occur 2 times.
2 targets are same situation, one is AMDGCN, the other is RVV.
Replace it target amdgcn with vect64 && vect128 to make test more general and easy maintain.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/bb-slp-cond-1.c: Fix FAIL.
|
|
Like s390, add riscv to fix the fail.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/bb-slp-39.c: Add RISCV.
|
|
|
|
2023-11-06 John David Anglin <danglin@gcc.gnu.org>
* config/pa/pa.cc (pa_asm_trampoline_template): Fix typo.
|
|
2023-11-06 John David Anglin <danglin@gcc.gnu.org>
* config/pa/pa-linux.h (NEED_INDICATE_EXEC_STACK): Define to 1.
|
|
This patch removes almost all use of diagnostic_context from the
source-printing code.
No functional change intended.
gcc/ChangeLog:
* diagnostic-show-locus.cc (class colorizer): Take just a
pretty_printer rather than a diagnostic_context.
(layout::layout): Make context param a const reference,
and pretty_printer param non-optional.
(layout::m_context): Drop field.
(layout::m_options): New field.
(layout::m_colorize_source_p): Drop field.
(layout::m_show_labels_p): Drop field.
(layout::m_show_line_numbers_p): Drop field.
(layout::print_gap_in_line_numbering): Use m_options.
(layout::calculate_line_spans): Likewise.
(layout::calculate_linenum_width): Likewise.
(layout::calculate_x_offset_display): Likewise.
(layout::print_source_line): Likewise.
(layout::start_annotation_line): Likewise.
(layout::print_annotation_line): Likewise.
(layout::print_line): Likewise.
(gcc_rich_location::add_location_if_nearby): Update for changes to
layout ctor.
(diagnostic_show_locus): Likewise.
(selftest::test_offset_impl): Likewise.
(selftest::test_layout_x_offset_display_utf8): Likewise.
(selftest::test_layout_x_offset_display_tab): Likewise.
(selftest::test_tab_expansion): Likewise.
* diagnostic.h (diagnostic_context::m_source_printing): Move
declaration of struct outside diagnostic_context as...
(struct diagnostic_source_printing_options)... this.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
This patch gathers the 6 fields in diagnostic_context relating to
keeping track of overriding the severity of warnings, and
pushing/popping those severities, moving them all into a new class
diagnostic_option_classifier.
No functional change intended.
gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::push_diagnostics): Convert
to...
(diagnostic_option_classifier::push): ...this.
(diagnostic_context::pop_diagnostics): Convert to...
(diagnostic_option_classifier::pop): ...this.
(diagnostic_context::initialize): Move code to...
(diagnostic_option_classifier::init): ...this new function.
(diagnostic_context::finish): Move code to...
(diagnostic_option_classifier::fini): ...this new function.
(diagnostic_context::classify_diagnostic): Convert to...
(diagnostic_option_classifier::classify_diagnostic): ...this.
(diagnostic_context::update_effective_level_from_pragmas): Convert
to...
(diagnostic_option_classifier::update_effective_level_from_pragmas):
...this.
(diagnostic_context::diagnostic_enabled): Update for refactoring.
* diagnostic.h (struct diagnostic_classification_change_t): Move into...
(class diagnostic_option_classifier): ...this new class.
(diagnostic_context::option_unspecified_p): Update for move of
fields into m_option_classifier.
(diagnostic_context::classify_diagnostic): Likewise.
(diagnostic_context::push_diagnostics): Likewise.
(diagnostic_context::pop_diagnostics): Likewise.
(diagnostic_context::update_effective_level_from_pragmas): Delete.
(diagnostic_context::m_classify_diagnostic): Move into class
diagnostic_option_classifier.
(diagnostic_context::m_option_classifier): Likewise.
(diagnostic_context::m_classification_history): Likewise.
(diagnostic_context::m_n_classification_history): Likewise.
(diagnostic_context::m_push_list): Likewise.
(diagnostic_context::m_n_push): Likewise.
(diagnostic_context::m_option_classifier): New.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
No functional change intended.
gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::set_urlifier): New.
* diagnostic.h (diagnostic_context::set_urlifier): New decl.
(diagnostic_context::m_urlifier): Make private.
* gcc.cc (driver::global_initializations): Use set_urlifier rather
than directly setting field.
* toplev.cc (general_init): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
No functional change intended.
gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::check_max_errors): Replace
uses of diagnostic_kind_count with simple field acesss.
(diagnostic_context::report_diagnostic): Likewise.
(diagnostic_text_output_format::~diagnostic_text_output_format):
Replace use of diagnostic_kind_count with
diagnostic_context::diagnostic_count.
* diagnostic.h (diagnostic_kind_count): Delete.
(errorcount): Replace use of diagnostic_kind_count with
diagnostic_context::diagnostic_count.
(warningcount): Likewise.
(werrorcount): Likewise.
(sorrycount): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
This should be safe because this is a preprocessor test; it
should not exercise implicit function declarations.
* gcc.dg/cpp/wchar-1.c (main): Call __builtin_abort instead of
abort.
|
|
In some configurations of our validation setup, we always call the
compiler with -Wl,-rpath=XXX, which instructs the driver to invoke the
linker if none of -c, -S or -E is used.
This happens to be the case in the PCH tests, where dg-flags-pch sets
dg-do-what-default to precompile.
This works most of the time, in absence of any linker option, the
compiler defaults to generating a precompiled header (otherwise the
linker complains because it cannot find 'main').
This small patch forces the use of '-c' when generating the .gch file,
which is sufficient not to invoke the linker.
Arguably, this could be seen as a dejagnu bug: in gcc-dg-test-1 (in
gcc-dg.exp), we set compile_type to "precompiled_header", which is not
one of the supported values in dejagnu's default_target_compile (in
target.exp).
2023-10-27 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
* lib/dg-pch.exp (dg-flags-pch): Add -c when generating the
precompiled header.
|
|
Some targets like arm-eabi with newlib and default settings rely on
__sync_synchronize() to ensure synchronization. Newlib does not
implement it by default, to make users aware they have to take special
care.
This makes a few tests fail to link.
This patch adds a new thread_fence effective target (similar to the
corresponding one in libstdc++ testsuite), and uses it in the tests
that need it, making them UNSUPPORTED instead of FAIL and UNRESOLVED.
2023-09-10 Christophe Lyon <christophe.lyon@linaro.org>
gcc/
* doc/sourcebuild.texi (Other attributes): Document thread_fence
effective-target.
gcc/testsuite/
* g++.dg/init/array54.C: Require thread_fence.
* gcc.dg/c2x-nullptr-1.c: Likewise.
* gcc.dg/pr103721-2.c: Likewise.
* lib/target-supports.exp (check_effective_target_thread_fence):
New.
|
|
when developing an otherwise unrelated patch I've discovered that the
fnspec for the Fortran library function generate_error is wrong. It is
currently ". R . R " where the first R describes the first parameter
and means that it "is only read and does not escape." The function
itself, however, with signature:
bool
generate_error_common (st_parameter_common *cmp, int family, const char *message)
contains the following:
/* Report status back to the compiler. */
cmp->flags &= ~IOPARM_LIBRETURN_MASK;
which does not correspond to the fnspec and breaks testcase
gfortran.dg/large_unit_2.f90 when my patch is applied, since it tries
to re-use the flags from before the call.
This patch replaces the "R" with "W" which stands for "specifies that
the memory pointed to by the parameter does not escape."
gcc/fortran/ChangeLog:
2023-11-02 Martin Jambor <mjambor@suse.cz>
* trans-decl.cc (gfc_build_builtin_function_decls): Fix fnspec of
generate_error.
|
|
Use "addr" attribute with "gpr8" value to limit address register class
to non-REX registers in instructions with high registers, where REX
registers can not be used in the address.
gcc/ChangeLog:
* config/i386/constraints.md (Bc): Remove constraint.
(Bn): Rewrite to use x86_extended_reg_mentioned_p predicate.
* config/i386/i386.cc (ix86_memory_address_reg_class):
Do not limit processing to TARGET_APX_EGPR. Exit early for
NULL insn. Do not check recog_data.insn before calling
extract_insn_cached.
(ix86_insn_base_reg_class): Handle ADDR_GPR8.
(ix86_regno_ok_for_insn_base_p): Ditto.
(ix86_insn_index_reg_class): Ditto.
* config/i386/i386.md (*cmpqi_ext<mode>_1_mem_rex64):
Remove insn pattern and corresponding peephole2 pattern.
(*cmpi_ext<mode>_1): Remove (m,Q) alternative.
Change (QBc,Q) alternative to (QBn,Q). Add "addr" attribute.
(*cmpqi_ext<mode>_3_mem_rex64): Remove insn pattern
and corresponding peephole2 pattern.
(*cmpi_ext<mode>_3): Remove (Q,m) alternative.
Change (Q,QnBc) alternative to (Q,QnBn). Add "addr" attribute.
(*extzvqi_mem_rex64): Remove insn pattern and
corresponding peephole2 pattern.
(*extzvqi): Remove (Q,m) alternative. Change (Q,QnBc)
alternative to (Q,QnBn). Add "addr" attribute.
(*insvqi_1_mem_rex64): Remove insn pattern and
corresponding peephole2 pattern.
(*insvqi_1): Remove (Q,m) alternative. Change (Q,QnBc)
alternative to (Q,QnBn). Add "addr" attribute.
(@insv<mode>_1): Ditto.
(*addqi_ext<mode>_0): Remove (m,0,Q) alternative. Change (QBc,0,Q)
alternative to (QBn,0,Q). Add "addr" attribute.
(*subqi_ext<mode>_0): Ditto.
(*andqi_ext<mode>_0): Ditto.
(*<any_or:code>qi_ext<mode>_0): Ditto.
(*addqi_ext<mode>_1): Remove (Q,0,m) alternative. Change (Q,0,QnBc)
alternative to (Q,0,QnBn). Add "addr" attribute.
(*andqi_ext<mode>_1): Ditto.
(*andqi_ext<mode>_1_cc): Ditto.
(*<any_or:code>qi_ext<mode>_1): Ditto.
(*xorqi_ext<mode>_1_cc): Ditto.
* config/i386/predicates.md (nonimm_x64constmem_operand):
Remove predicate.
(general_x64constmem_operand): Ditto.
(norex_memory_operand): Ditto.
|
|
At the June WG14 meeting, WG14 decided it preferred to keep C23 as the
informal name for the next revision of the C standard, despite
publication not being before 2024 (publication is due in 2024 whether
or not technical changes at the January meeting result in an FDIS
ballot being needed). At the Cauldron I raised the question of
whether we should thus now add option names such as -std=c23 to GCC,
and there was support for doing so.
Add -std=c23, making -std=c2x a deprecated alias; also add the alias
-std=iso9899:2024. Likewise, add -std=gnu23, making -std=gnu2x a
deprecated alias, and add -Wc11-c23-compat, making -Wc11-c2x-compat a
deprecated alias.
Here, I'm generally just adding the new options and making the minimum
changes required to do so, with documentation changed to refer to C23
instead of C2X only where directly associated with documentation of
these options. It's intended that future changes will update
documentation, diagnostics, comments, variable names, testcase names,
etc. to refer consistently to C23. When such changes are made, the
new tests c23-opts-3.c, c23-opts-5.c and gnu23-opts-2.c are intended
to keep using the old option names they are specifically testing,
while other tests would start using the c23/gnu23 versions of the
names (as well as the tests themselves being renamed).
Updating option names is independent of updating to the final
__STDC_VERSION__ value. There, the question is whether we should
update the value now or wait for the remaining significant features to
be implemented first. (I intend to review Martin's tag compatibility
patches for GCC 14. I'm not aware of anyone working on #embed - or on
the [[unsequenced]] and [[reproducible]] attributes, though support
for standard attributes is optional.)
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
PR c/107954
gcc/
* doc/cpp.texi (__STDC_VERSION__): Refer to -std=c23 and
-std=gnu23 instead of -std=c2x and -std=gnu2x.
* doc/extend.texi (Attribute Syntax): Refer to C23 and -std=c23
instead of C2x and -std=c2x.
* doc/invoke.texi (-Wc11-c23-compat, -std=c23, -std=gnu23)
(-std=iso9899:2024): Document, with -Wc11-c2x-compat, -std=c2x and
-std=gnu2x as deprecated aliases. Update descriptions of C23.
* doc/standards.texi (Standards): Describe C23 with C2X as an old
name.
gcc/c-family/
* c.opt (Wc11-c2x-compat): Rename to Wc11-c23-compat and make into
a deprecated alias of Wc11-c23-compat.
(std=c2x): Rename to std=c23 and make into a deprecated alias of
std=c23.
(std=gnu2x): Rename to std=gnu23 and make into a deprecated alias
of std=gnu23.
(std=iso9899:2024): New option. Alias of std=c23.
* c-lex.cc (interpret_float): Use OPT_Wc11_c23_compat instead of
OPT_Wc11_c2x_compat.
* c-opts.cc (c_common_handle_option): Use OPT_std_c23 instead of
OPT_std_c2x and OPT_std_gnu23 instead of OPT_std_gnu2x.
gcc/c/
* c-errors.cc (pedwarn_c11): Use OPT_Wc11_c23_compat instead of
OPT_Wc11_c2x_compat.
* c-typeck.cc (build_conditional_expr, convert_for_assignment):
Use OPT_Wc11_c23_compat instead of OPT_Wc11_c2x_compat.
gcc/testsuite/
* gcc.dg/c23-opts-1.c, gcc.dg/c23-opts-2.c, gcc.dg/c23-opts-3.c,
gcc.dg/c23-opts-4.c, gcc.dg/c23-opts-5.c, gcc.dg/gnu23-opts-1.c,
gcc.dg/gnu23-opts-2.c: New tests.
|
|
With this 'MAKE_DECL_ONE_ONLY' definition, we get 'SUPPORTS_ONE_ONLY', and thus
'__GXX_WEAK__', and thus '__GXX_TYPEINFO_EQUALITY_INLINE'. This unblocks build
of 'libstdc++-v3/libsupc++/tinfo.cc', which otherwise depends on symbol alias
support, which GCC/nvptx doesn't generally provide. Also, this gets us a
number of FAIL -> PASS progressions in the test suite.
Given that GCC/nvptx support for weak symbols isn't complete, we also get a few
more of the already-known
'error: PTX does not support weak declarations (only weak definitions)':
[-PASS:-]{+FAIL:+} g++.old-deja/g++.other/crash11.C -std=c++14 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.old-deja/g++.other/crash11.C -std=c++17 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.old-deja/g++.other/crash11.C -std=c++20 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.old-deja/g++.other/crash11.C -std=c++98 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.old-deja/g++.pt/crash29.C -std=c++14 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.old-deja/g++.pt/crash29.C -std=c++17 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.old-deja/g++.pt/crash29.C -std=c++20 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.old-deja/g++.pt/crash29.C -std=c++98 (test for excess errors)
[-PASS:-]{+FAIL:+} 23_containers/map/56613.cc -std=gnu++17 (test for excess errors)
... as well as one more of the already-known
'sorry, unimplemented: target cannot support nonlocal goto':
PASS: g++.dg/tree-ssa/pr22488.C -std=gnu++14 (test for excess errors)
PASS: g++.dg/tree-ssa/pr22488.C -std=gnu++17 (test for excess errors)
PASS: g++.dg/tree-ssa/pr22488.C -std=gnu++20 (test for excess errors)
[-PASS:-]{+FAIL:+} g++.dg/tree-ssa/pr22488.C -std=gnu++98 (test for excess errors)
We shall look into these, later.
gcc/
* config/nvptx/nvptx.h (MAKE_DECL_ONE_ONLY): Define.
|
|
The following fixes the mask argument generation for SIMD clone
calls under either loop masking or when the actual call is not
masked but only a inbranch simd clone is available. The issue
was that we tried to directly convert the vector mask to the
call argument type but SIMD clone masks require 1 or 0 (which
could be even float) values for mask elements so we have to
resort to a VEC_COND_EXPR to generate them just like we do for
regular passing of the mask.
PR tree-optimization/112405
* tree-vect-stmts.cc (vectorizable_simd_clone_call):
Properly handle invariant and/or loop mask passing.
|
|
This patch would like to support the FP below API auto vectorization
with different type size
+----------+-----------+----------+
| API | RV64 | RV32 |
+----------+-----------+----------+
| iround | DF => SI | DF => SI |
| iroundf | - | - |
| lround | - | DF => SI |
| lroundf | SF => DI | - |
| llround | - | - |
| llroundf | SF => DI | SF => DI |
+----------+-----------+----------+
Given below code:
void
test_lroundf (long *out, float *in, unsigned count)
{
for (unsigned i = 0; i < count; i++)
out[i] = __builtin_lroundf (in[i]);
}
Before this patch:
.L3:
flw fa5,0(a1)
addi a1,a1,4
addi a0,a0,8
fcvt.l.s a5,fa5,rmm
sd a5,-8(a0)
bne a4,a1,.L3
After this patch:
fsrmi 4 // RMM rounding mode
vsetivli zero,16,e32,m4,ta,ma
.L4:
vle32.v v4,0(a5)
addi a5,a5,64
vfwcvt.x.f.v v8,v4
vse64.v v8,0(a4)
addi a4,a4,128
bne a3,a5,.L4
andi a5,a2,15
andi a4,a2,-16
beq a5,zero,.L16
Unfortunately, the HF mode is not include due to it requires
additional middle-end support from internal-fun.def.
gcc/ChangeLog:
* config/riscv/autovec.md: Remove the size check of lround.
* config/riscv/riscv-v.cc (expand_vec_lround): Leverage
emit_vec_rounding_to_integer for round.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-iround-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-iround-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llroundf-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-llroundf-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lround-rv32-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-0.c: New test.
* gcc.target/riscv/rvv/autovec/unop/math-lroundf-rv64-run-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-iround-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-llroundf-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lround-rv32-0.c: New test.
* gcc.target/riscv/rvv/autovec/vls/math-lroundf-rv64-0.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
An ICE was discovered in recent rounding autovec support:
config/riscv/riscv-v.cc:4314
65 | }
| ^
0x1fa5223 riscv_vector::validate_change_or_fail(rtx_def*, rtx_def**,
rtx_def*, bool)
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-v.cc:4314
0x1fb1aa2 pre_vsetvl::remove_avl_operand()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3342
0x1fb18c1 pre_vsetvl::cleaup()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3308
0x1fb216d pass_vsetvl::lazy_vsetvl()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3480
0x1fb2214 pass_vsetvl::execute(function*)
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3504
The root cause is that the RA reload into (set (reg) vec_duplicate:DI). However, it is not valid in RV32 system
since we don't have a single broadcast instruction DI scalar in RV32 system.
We should expand it early for RV32 system.
gcc/ChangeLog:
* config/riscv/predicates.md: Adapt predicate.
* config/riscv/riscv-protos.h (can_be_broadcasted_p): New function.
* config/riscv/riscv-v.cc (can_be_broadcasted_p): Ditto.
* config/riscv/vector.md (vec_duplicate<mode>): New pattern.
(*vec_duplicate<mode>): Adapt vec_duplicate insn pattern.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/sew64-rv32.c: New test.
|
|
The following simplifies LC-PHI arg population during epilog peeling,
thereby fixing the testcase in this PR.
PR tree-optimization/111950
* tree-vect-loop-manip.cc (slpeel_duplicate_current_defs_from_edges):
Remove.
(find_guard_arg): Likewise.
(slpeel_update_phi_nodes_for_guard2): Likewise.
(slpeel_tree_duplicate_loop_to_edge_cfg): Remove calls to
slpeel_duplicate_current_defs_from_edges, do not elide
LC-PHIs for invariant values.
(vect_do_peeling): Materialize PHI arguments for the edge
around the epilog from the PHI defs of the main loop exit.
* gcc.dg/torture/pr111950.c: New testcase.
|
|
The following fixes an oversight in vect_check_scalar_mask when
the mask is external or constant. When doing BB vectorization
we need to provide a group_size, best via an overload accepting
the SLP node as argument.
When fixed we then run into the issue that we have not analyzed
alignment of the .MASK_LOADs because they were not identified
as loads by vect_gather_slp_loads. Fixed by reworking the
detection.
PR tree-optimization/112404
* tree-vectorizer.h (get_mask_type_for_scalar_type): Declare
overload with SLP node argument.
* tree-vect-stmts.cc (get_mask_type_for_scalar_type): Implement it.
(vect_check_scalar_mask): Use it.
* tree-vect-slp.cc (vect_gather_slp_loads): Properly identify
loads also for nodes with children, like .MASK_LOAD.
* tree-vect-loop.cc (vect_analyze_loop_2): Look at the
representative for load nodes and check whether it is a grouped
access before looking for load-lanes support.
* gfortran.dg/pr112404.f90: New testcase.
|
|
gcc/testsuite/
* gcc.c-torture/compile/20000412-2.c (f): Call
__builtin_strlen instead of strlen.
* gcc.c-torture/compile/20000427-1.c (FindNearestPowerOf2):
Declare.
* gcc.c-torture/compile/20000802-1.c (bar): Call
__builtin_memcpy instead of memcpy.
* gcc.c-torture/compile/20010525-1.c (kind_varread): Likewise.
* gcc.c-torture/compile/20010706-1.c (foo): Add missing int
return type.
* gcc.c-torture/compile/20020314-1.c (add_output_space_event)
(del_tux_atom, add_req_to_workqueue): Declare.
* gcc.c-torture/compile/20020701-1.c (f): Call
__builtin_memcpy instead of memcpy.
* gcc.c-torture/compile/20021015-2.c (f): Call __builtin_bcmp
instead of bcmo.
* gcc.c-torture/compile/20030110-1.c (inb): Declare.
* gcc.c-torture/compile/20030314-1.c (bar): Add missing
void return type.
* gcc.c-torture/compile/20030405-1.c (bar): Add missing int
return type.
* gcc.c-torture/compile/20030416-1.c (bar): Declare.
(main): Add missing int return type.
* gcc.c-torture/compile/20030503-1.c (bar): Declare.
* gcc.c-torture/compile/20030530-1.c: (bar): Declare.
* gcc.c-torture/compile/20031031-2.c (foo, bar, baz): Declare.
* gcc.c-torture/compile/20040101-1.c (test16): Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/20040124-1.c (f2, f3): Declare.
* gcc.c-torture/compile/20040304-1.c (macarg): Declare.
* gcc.c-torture/compile/20040705-1.c (f): Call
__builtin_memcpy instead of memcpy.
* gcc.c-torture/compile/20040908-1.c (bar): Declare.
* gcc.c-torture/compile/20050510-1.c (dont_remove): Declare.
* gcc.c-torture/compile/20051228-1.c (bar): Declare.
* gcc.c-torture/compile/20060109-1.c (cpp_interpret_string):
Declare.
(int_c_lex, cb_ident): Add missing void return type.
(cb_ident): Define as static.
* gcc.c-torture/compile/20060202-1.c (sarray_get): Declare.
* gcc.c-torture/compile/20070129.c (regcurly)
(reguni): Declare.
* gcc.c-torture/compile/20070529-1.c (__fswab16): Declare.
* gcc.c-torture/compile/20070529-2.c (kmem_free): Declare.
* gcc.c-torture/compile/20070605-1.c (quantize_fs_dither):
Add missing void return type.
* gcc.c-torture/compile/20071107-1.c
(settings_install_property_parser): Declare.
* gcc.c-torture/compile/20090907-1.c (load_waveform): Call
__builtin_abort instead of abort.
* gcc.c-torture/compile/20100907.c (t): Add missing void
types.
* gcc.c-torture/compile/20120524-1.c (build_packet): Call
__builtin_memcpy instead of memcpy.
* gcc.c-torture/compile/20120830-2.c
(ubidi_writeReordered_49): Add missing void return type.
* gcc.c-torture/compile/20121010-1.c (read_long): Add missing
int return type.
* gcc.c-torture/compile/920301-1.c (f, g): Add missing void
types.
* gcc.c-torture/compile/920409-1.c (x): Likewise.
* gcc.c-torture/compile/920410-1.c (main): Add missing int
return type. Call __builtin_printf instead of printf.
* gcc.c-torture/compile/920410-2.c (joe): Add missing void
types.
* gcc.c-torture/compile/920411-2.c (x): Likewise.
* gcc.c-torture/compile/920413-1.c (f): Add missing int return
type.
* gcc.c-torture/compile/920428-3.c (x): Add missing int types.
* gcc.c-torture/compile/920428-4.c (x): Add missing void
return type and int parameter type.
* gcc.c-torture/compile/920501-10.c (x): Add missing int
types.
* gcc.c-torture/compile/920501-12.c (x, a, b, A, B): Likewise.
* gcc.c-torture/compile/920501-17.c (x): Add missing void
types.
* gcc.c-torture/compile/920501-19.c (y): Likewise.
* gcc.c-torture/compile/920501-22.c (x): Likewise.
* gcc.c-torture/compile/920501-3.c (x): Likewise.
* gcc.c-torture/compile/920501-4.c (foo): Likewise.
* gcc.c-torture/compile/920529-1.c (f): Call __builtin_abort
instead of abort.
* gcc.c-torture/compile/920615-1.c (f): Add missing void
types.
* gcc.c-torture/compile/920623-1.c (g): Likewise.
* gcc.c-torture/compile/920624-1.c (f): Likewise.
* gcc.c-torture/compile/920711-1.c (f): Add missing int types.
* gcc.c-torture/compile/920729-1.c (f): Add missing void
types.
* gcc.c-torture/compile/920806-1.c (f): Likewise.
* gcc.c-torture/compile/920821-2.c (f): Likewise.
* gcc.c-torture/compile/920825-1.c (f): Likewise.
* gcc.c-torture/compile/920825-2.c (f, g): Add missing void
return type.
* gcc.c-torture/compile/920826-1.c (f): Likewise.
* gcc.c-torture/compile/920828-1.c (f): Add missing int types.
* gcc.c-torture/compile/920829-1.c (f): Add missing void
return type.
* gcc.c-torture/compile/920928-3.c (f): Likewise.
* gcc.c-torture/compile/921012-2.c (f): Likewise.
* gcc.c-torture/compile/921013-1.c (f): Likewise.
* gcc.c-torture/compile/921019-1.c (f): Add missing void
types.
* gcc.c-torture/compile/921026-1.c (f): Add missing void
return type.
* gcc.c-torture/compile/921126-1.c (f): Add missing int
return type and missing void.
* gcc.c-torture/compile/921227-1.c (f): Add missing void
types.
* gcc.c-torture/compile/930109-2.c (f): Add missing int types.
* gcc.c-torture/compile/930210-1.c (f): Add missing void
types.
* gcc.c-torture/compile/930222-1.c (g): Declare.
(f): Add missing int return type.
* gcc.c-torture/compile/930421-1.c (f): Add missing void
return type.
* gcc.c-torture/compile/930503-1.c (f): Likewise.
* gcc.c-torture/compile/930513-1.c (f): Add missing int return
type.
* gcc.c-torture/compile/930513-3.c (test): Add missing void
types.
* gcc.c-torture/compile/930523-1.c (f): Likewise.
* gcc.c-torture/compile/930527-1.c (f): Likewise.
* gcc.c-torture/compile/930603-1.c (f): Likewise.
* gcc.c-torture/compile/930607-1.c (g): Likewise.
* gcc.c-torture/compile/930702-1.c (f): Add missing int
return type and missing void.
* gcc.c-torture/compile/931018-1.c (f): Add missing void
return type.
* gcc.c-torture/compile/931031-1.c (f): Likewise.
* gcc.c-torture/compile/931102-1.c (xxx): Add missing void
types.
* gcc.c-torture/compile/940611-1.c (f): Likewise.
* gcc.c-torture/compile/940712-1.c (f): Add missing int
return type and missing void.
* gcc.c-torture/compile/950512-1.c (g): Declare.
(f): Add missing void return type.
* gcc.c-torture/compile/950530-1.c (f): Add missing int
return type.
* gcc.c-torture/compile/950610-1.c (f): Add missing void
return type.
* gcc.c-torture/compile/950613-1.c (f): Add missing void
types.
* gcc.c-torture/compile/950816-1.c (f): Add missing int return
type and missing void.
* gcc.c-torture/compile/950816-2.c (func): Declare.
(f): Add missing void types.
* gcc.c-torture/compile/950816-3.c (f): Add missing int
return type and missing void.
* gcc.c-torture/compile/950919-1.c (f): Add missing void
types.
* gcc.c-torture/compile/950921-1.c (f): Add missing int
return type and missing void.
* gcc.c-torture/compile/951004-1.c (f): Add missing void
return type.
* gcc.c-torture/compile/951116-1.c (f): Add missing int
return type and missing void.
* gcc.c-torture/compile/951128-1.c (f): Add missing void
return type.
* gcc.c-torture/compile/951220-1.c (f): Add missing int return
type.
* gcc.c-torture/compile/960220-1.c (f): Add missing void
types.
* gcc.c-torture/compile/960221-1.c (foo): Add missing void
return type.
* gcc.c-torture/compile/960704-1.c (main): Add missing int
return type and missing void.
* gcc.c-torture/compile/961031-1.c (f): Add missing void
types.
* gcc.c-torture/compile/961126-1.c (sub, sub2): Declare.
(main): Add missing int return type and missing void.
* gcc.c-torture/compile/961203-1.c (main): Call __builtin_exit
instead of exit.
* gcc.c-torture/compile/981001-1.c (main): Likewise.
* gcc.c-torture/compile/981107-1.c (call): Declare.
* gcc.c-torture/compile/990517-1.c (sdbm__splpage): Call
__builtin_memcpy instead of memcpy.
* gcc.c-torture/compile/990617-1.c (main): Call
__builtin_printf instead of printf.
* gcc.c-torture/compile/991026-2.c (detach): Add missing void
types.
* gcc.c-torture/compile/991229-1.c (ejEval): Likewise.
* gcc.c-torture/compile/991229-3.c (rand): Declare.
|
|
Current glibc headers only declare fputs_unlocked for _GNU_SOURCE,
so define it to obtain an official prototype.
Add a fallback prototype declaration for other systems that do not
have fputs_unlocked. This seems to the most straightforward approach
to avoid an implicit function declaration, without reducing test
coverage and introducing ongoing maintenance requirements (e.g.,
FreeBSD added fputs_unlocked support fairly recently).
gcc/testsuite/
* gcc.c-torture/execute/builtins/fputs.c (_GNU_SOURCE):
Define.
(fputs_unlocked): Declare.
|
|
In order to prevent simplification of a COND_OP with degenerate mask
(CONSTM1_RTX) into just an OP in the presence of length masking this
patch introduces a length-masked analog to VEC_COND_EXPR:
IFN_VCOND_MASK_LEN.
It also adds new match patterns that allow the combination of
unconditional unary, binary and ternay operations with the
VCOND_MASK_LEN into a conditional operation if the target supports it.
gcc/ChangeLog:
PR tree-optimization/111760
* config/riscv/autovec.md (vcond_mask_len_<mode><vm>): Add
expander.
* config/riscv/riscv-protos.h (enum insn_type): Add.
* config/riscv/riscv-v.cc (needs_fp_rounding): Add !pred_mov.
* doc/md.texi: Add vcond_mask_len.
* gimple-match-exports.cc (maybe_resimplify_conditional_op):
Create VCOND_MASK_LEN when length masking.
* gimple-match.h (gimple_match_op::gimple_match_op): Always
initialize len and bias.
* internal-fn.cc (vec_cond_mask_len_direct): Add.
(direct_vec_cond_mask_len_optab_supported_p): Add.
(internal_fn_len_index): Add VCOND_MASK_LEN.
(internal_fn_mask_index): Ditto.
* internal-fn.def (VCOND_MASK_LEN): New internal function.
* match.pd: Combine unconditional unary, binary and ternary
operations into the respective COND_LEN operations.
* optabs.def (OPTAB_D): Add vcond_mask_len optab.
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-cond-arith-2.c: No vect cost model for
riscv_v.
|
|
align_dynamic_address would output alignment operations even
for a required alignment of 1 byte.
gcc/
* explow.cc (align_dynamic_address): Do nothing if the required
alignment is a byte.
|
|
This patch allows allocate_dynamic_stack_space to be called before
or after virtual registers have been instantiated. It uses the
same approach as allocate_stack_local, which already supported this.
gcc/
* function.h (get_stack_dynamic_offset): Declare.
* function.cc (get_stack_dynamic_offset): New function,
split out from...
(get_stack_dynamic_offset): ...here.
* explow.cc (allocate_dynamic_stack_space): Handle calls made
after virtual registers have been instantiated.
|
|
gcc/ChangeLog:
PR target/112393
* config/i386/i386-expand.cc (ix86_expand_vec_perm_vpermt2):
Avoid generating RTL code when d->testing_p.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr112393.c: New test.
|
|
The following fixes an error in strip_float_extensions when facing
vector conversions.
PR tree-optimization/112369
* tree.cc (strip_float_extensions): Use element_precision.
* gcc.dg/pr112369.c: New testcase.
|
|
The FP rint test cases for RV32 need some additional adjust
for types and data. This patch would like to fix this which
is missed in FP rint support PATCH for RV32 only by mistake.
Please note the math-llrintf-run-0.c will trigger one ICE in the
vsetvl pass in RV32 only.
./riscv32-unknown-elf-gcc -march=rv32gcv -mabi=ilp32d \
-O3 -ftree-vectorize -fno-vect-cost-model -ffast-math \
gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c \
-o test.elf -lm
Then there will have ICE similar as below, and will file bugzilla for it.
config/riscv/riscv-v.cc:4314
65 | }
| ^
0x1fa5223 riscv_vector::validate_change_or_fail(rtx_def*, rtx_def**,
rtx_def*, bool)
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-v.cc:4314
0x1fb1aa2 pre_vsetvl::remove_avl_operand()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3342
0x1fb18c1 pre_vsetvl::cleaup()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3308
0x1fb216d pass_vsetvl::lazy_vsetvl()
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3480
0x1fb2214 pass_vsetvl::execute(function*)
/home/pli/repos/gcc/222/riscv-gnu-toolchain/gcc/__RISC-V_BUILD/../gcc/config/riscv/riscv-vsetvl.cc:3504
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: Adjust
test cases.
* gcc.target/riscv/rvv/autovec/unop/math-llrintf-run-0.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/math-lrint-rv32-run-0.c: Ditto.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
The following tries to clarify the __builtin_constant_p documentation,
stating that the argument expression is not evaluated and side-effects
are discarded. I'm struggling to find the correct terms matching
what the C language standard would call things so I'd appreciate
some help here.
OK for trunk?
Shall we diagnose arguments with side-effects? It seems to me
such use is usually unintended? I think rather than dropping
side-effects as a side-effect of folding the frontend should
discard them at parsing time instead, no?
Thanks,
Richard.
PR middle-end/112296
* doc/extend.texi (__builtin_constant_p): Clarify that
side-effects are discarded.
|
|
As discussed in PR111828, rs6000_update_ipa_fn_target_info
is much conservative, currently for any non-empty inline
asm, without any parsing, it would take inline asm could
have HTM insns. It means for one function attributed with
power8 having inline asm, even if it has no HTM insns, we
don't make a function attributed with power10 inline it.
Peter pointed out an inline asm parser can be a slippery
slope, and noticed that the current gnu assembler still
allows HTM insns even with power10 machine type, so he
suggested that we can aggressively ignore the handling on
inline asm, this patch goes for this suggestion.
Considering that there are a few assembler alternatives
and assembler can update its behaviors (complaining HTM
insns at power10 and later cpus sounds reasonable from a
certain point of view), this patch also checks assembler
complains on HTM insns at power10 or not. For a case that
a caller attributed power10 calls a callee attributed
power8 having inline asm with HTM insn, without inlining
at least the compilation succeeds, but if assembler
complains HTM insns at power10, after inlining the
compilation would fail.
The two associated test cases are fine without and with
this patch (effective target takes effect or not).
PR target/111828
gcc/ChangeLog:
* config.in: Regenerate.
* config/rs6000/rs6000.cc (rs6000_update_ipa_fn_target_info): Guard
inline asm handling under !HAVE_AS_POWER10_HTM.
* configure: Regenerate.
* configure.ac: Detect assembler support for HTM insns at power10.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_powerpc_as_p10_htm): New proc.
* g++.target/powerpc/pr111828-1.C: New test.
* g++.target/powerpc/pr111828-2.C: New test.
|
|
Update in v6:
* Rename maybe_require_frm_p to may_require_frm_p.
* Rename maybe_require_vxrm_p to may_require_vxrm_p.
* Move may_require_frm_p and may_require_vxrm_p to function_base.
Update in v5:
* Split has_vxrm_or_frm_p into maybe_require_frm_p and
maybe_require_vxrm_p.
* Adjust comments.
Update in v4:
* Remove class function_resolver.
* Remove function get_non_overloaded_instance.
* Add overloaded hash traits for non-overloaded intrinsic.
* All overloaded intrinsics are implemented, and the tests pass.
Update in v3:
* Rewrite comment for overloaded function add.
* Move get_non_overloaded_instance to function_base.
Update in v2:
* Add get_non_overloaded_instance for function instance.
* Fix overload check for policy function.
* Enrich the test cases check.
Original log:
This patch would like add the framework to support the RVV overloaded
intrinsic API in riscv-xxx-xxx-gcc, like riscv-xxx-xxx-g++ did.
However, it almost leverage the hook TARGET_RESOLVE_OVERLOADED_BUILTIN
with below steps.
* Register overloaded functions.
* Add function_resolver for overloaded function resolving.
* Add resolve API for function shape with default implementation.
* Implement HOOK for navigating the overloaded API to non-overloaded API.
gcc/ChangeLog:
* config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): New function for the hook.
(riscv_register_pragmas): Register the hook.
* config/riscv/riscv-protos.h (resolve_overloaded_builtin): New decl.
* config/riscv/riscv-vector-builtins-bases.cc: New function impl.
* config/riscv/riscv-vector-builtins-shapes.cc (build_one): Register overloaded function.
* config/riscv/riscv-vector-builtins.cc (struct non_overloaded_registered_function_hasher):
New hash table.
(function_builder::add_function): Add overloaded arg.
(function_builder::add_unique_function): Map overloaded function to non-overloaded function.
(function_builder::add_overloaded_function): New API impl.
(registered_function::overloaded_hash): Calculate hash value.
(has_vxrm_or_frm_p): New function impl.
(non_overloaded_registered_function_hasher::hash): Ditto.
(non_overloaded_registered_function_hasher::equal): Ditto.
(handle_pragma_vector): Allocate space for hash table.
(resolve_overloaded_builtin): New function impl.
* config/riscv/riscv-vector-builtins.h (function_base::may_require_frm_p): Ditto.
(function_base::may_require_vxrm_p): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/overloaded_rv32_vadd.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv32_vfadd.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv32_vget_vset.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv32_vloxseg2ei16.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv32_vmv.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv32_vreinterpret.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv64_vadd.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv64_vfadd.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv64_vget_vset.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv64_vloxseg2ei16.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv64_vmv.c: New test.
* gcc.target/riscv/rvv/base/overloaded_rv64_vreinterpret.c: New test.
* gcc.target/riscv/rvv/base/overloaded_vadd.h: New test.
* gcc.target/riscv/rvv/base/overloaded_vfadd.h: New test.
* gcc.target/riscv/rvv/base/overloaded_vget_vset.h: New test.
* gcc.target/riscv/rvv/base/overloaded_vloxseg2ei16.h: New test.
* gcc.target/riscv/rvv/base/overloaded_vmv.h: New test.
* gcc.target/riscv/rvv/base/overloaded_vreinterpret.h: New test.
Signed-off-by: Li Xu <xuli1@eswincomputing.com>
Co-Authored-By: Pan Li <pan2.li@intel.com>
|
|
gcc/ChangeLog:
PR target/111889
* config/i386/avx512bf16intrin.h: Push no-evex512 target.
* config/i386/avx512bf16vlintrin.h: Ditto.
* config/i386/avx512bitalgvlintrin.h: Ditto.
* config/i386/avx512bwintrin.h: Ditto.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h: Ditto.
* config/i386/avx512fp16intrin.h: Ditto.
* config/i386/avx512fp16vlintrin.h: Ditto.
* config/i386/avx512ifmavlintrin.h: Ditto.
* config/i386/avx512vbmi2vlintrin.h: Ditto.
* config/i386/avx512vbmivlintrin.h: Ditto.
* config/i386/avx512vlbwintrin.h: Ditto.
* config/i386/avx512vldqintrin.h: Ditto.
* config/i386/avx512vlintrin.h: Ditto.
* config/i386/avx512vnnivlintrin.h: Ditto.
* config/i386/avx512vp2intersectvlintrin.h: Ditto.
* config/i386/avx512vpopcntdqvlintrin.h: Ditto.
gcc/testsuite/ChangeLog:
PR target/111889
* gcc.target/i386/pr111889.c: New test.
|
|
gcc/ChangeLog:
* config/i386/avx512bf16vlintrin.h
(_mm_avx512_castsi128_ps): New.
(_mm256_avx512_castsi256_ps): Ditto.
(_mm_avx512_slli_epi32): Ditto.
(_mm256_avx512_slli_epi32): Ditto.
(_mm_avx512_cvtepi16_epi32): Ditto.
(_mm256_avx512_cvtepi16_epi32): Ditto.
(__attribute__): Change intrin call.
* config/i386/avx512bwintrin.h
(_mm_avx512_set_epi32): New.
(_mm_avx512_set_epi16): Ditto.
(_mm_avx512_set_epi8): Ditto.
(__attribute__): Change intrin call.
* config/i386/avx512fp16intrin.h: Ditto.
* config/i386/avx512fp16vlintrin.h
(_mm_avx512_set1_ps): New.
(_mm256_avx512_set1_ps): Ditto.
(_mm_avx512_and_si128): Ditto.
(_mm256_avx512_and_si256): Ditto.
(__attribute__): Change intrin call.
* config/i386/avx512vlbwintrin.h
(_mm_avx512_set1_epi32): New.
(_mm_avx512_set1_epi16): Ditto.
(_mm_avx512_set1_epi8): Ditto.
(_mm256_avx512_set_epi16): Ditto.
(_mm256_avx512_set_epi8): Ditto.
(_mm256_avx512_set1_epi16): Ditto.
(_mm256_avx512_set1_epi32): Ditto.
(_mm256_avx512_set1_epi8): Ditto.
(_mm_avx512_max_epi16): Ditto.
(_mm_avx512_min_epi16): Ditto.
(_mm_avx512_max_epu16): Ditto.
(_mm_avx512_min_epu16): Ditto.
(_mm_avx512_max_epi8): Ditto.
(_mm_avx512_min_epi8): Ditto.
(_mm_avx512_max_epu8): Ditto.
(_mm_avx512_min_epu8): Ditto.
(_mm256_avx512_max_epi16): Ditto.
(_mm256_avx512_min_epi16): Ditto.
(_mm256_avx512_max_epu16): Ditto.
(_mm256_avx512_min_epu16): Ditto.
(_mm256_avx512_insertf128_ps): Ditto.
(_mm256_avx512_extractf128_pd): Ditto.
(_mm256_avx512_extracti128_si256): Ditto.
(_MM256_AVX512_REDUCE_OPERATOR_BASIC_EPI16): Ditto.
(_MM256_AVX512_REDUCE_OPERATOR_MAX_MIN_EP16): Ditto.
(_MM256_AVX512_REDUCE_OPERATOR_BASIC_EPI8): Ditto.
(_MM256_AVX512_REDUCE_OPERATOR_MAX_MIN_EP8): Ditto.
(__attribute__): Change intrin call.
|
|
gcc/ChangeLog:
* config/i386/avx512bf16vlintrin.h: Change intrin call.
* config/i386/avx512fintrin.h
(_mm_avx512_undefined_ps): New.
(_mm_avx512_undefined_pd): Ditto.
(__attribute__): Change intrin call.
* config/i386/avx512vbmivlintrin.h: Ditto.
* config/i386/avx512vlbwintrin.h: Ditto.
* config/i386/avx512vldqintrin.h: Ditto.
* config/i386/avx512vlintrin.h
(_mm_avx512_undefined_si128): New.
(_mm256_avx512_undefined_ps): Ditto.
(_mm256_avx512_undefined_pd): Ditto.
(_mm256_avx512_undefined_si256): Ditto.
(__attribute__): Change intrin call.
|
|
The newly added _mm{,256}_avx512* intrins are duplicated from their
_mm{,256}_* forms from AVX2 or before. We need to add them to prevent target
option mismatch when calling AVX512 intrins implemented with these intrins
under no-evex512 function attribute. All AVX512 intrins calling those AVX2
intrins or before will change their calls to these newly added AVX512 version.
gcc/ChangeLog:
* config/i386/avx512bitalgvlintrin.h: Change intrin call.
* config/i386/avx512dqintrin.h: Ditto.
* config/i386/avx512fintrin.h:
(_mm_avx512_setzero_ps): New.
(_mm_avx512_setzero_pd): Ditto.
(__attribute__): Change intrin call.
* config/i386/avx512fp16intrin.h: Ditto.
* config/i386/avx512fp16vlintrin.h: Ditto.
* config/i386/avx512vbmi2vlintrin.h: Ditto.
* config/i386/avx512vbmivlintrin.h: Ditto.
* config/i386/avx512vlbwintrin.h: Ditto.
* config/i386/avx512vldqintrin.h: Ditto.
* config/i386/avx512vlintrin.h
(_mm_avx512_setzero_si128): New.
(_mm256_avx512_setzero_pd): Ditto.
(_mm256_avx512_setzero_ps): Ditto.
(_mm256_avx512_setzero_si256): Ditto.
(__attribute__): Change intrin call.
* config/i386/avx512vpopcntdqvlintrin.h: Ditto.
* config/i386/gfniintrin.h: Ditto.
|
|
|
|
Test is currently failing on x86_64-apple-darwin with
"decimal floating-point not supported for this target".
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr111753.c: Require dfp.
|