Age | Commit message (Collapse) | Author | Files | Lines |
|
This patch fixes the backend pattern that was printing the wrong input
scalar register pair when inserting into lane 1.
Added a new test to force float-abi=hard so we can use scan-assembler to check
correct codegen.
gcc/ChangeLog:
PR target/115611
* config/arm/mve.md (mve_vec_setv2di_internal): Fix printing of input
scalar register pair when lane = 1.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/vsetq_lane_su64.c: New test.
|
|
In this PR, due to the -f flags, we ended up with:
bb1: r10=r10
...
bb2: r10=r10
...
bb3: ...=r10
with bb1->bb2 and bb1->bb3.
late-combine successfully combined the bb1->bb2 def-use and set
the insn code to NOOP_MOVE_INSN_CODE. The bb1->bb3 combination
then failed for... reasons. At this point, everything should have
been rewound to its original state.
However, substituting r10=r10 into r10=r10 gives r10=r10, and
validate_change had an early-out for no-op rtl changes. This meant
that validate_change did not register a change for the bb2 insn and
so did not save its old insn code. The NOOP_MOVE_INSN_CODE therefore
persisted even after the attempt had been rewound.
IMO it'd be too cumbersome and error-prone to expect all users of
validate_change to be aware of this possibility. If code is using
validate_change with in_group=1, I think it has a reasonable expectation
that a change will be registered and that the insn code will be saved
(and restored on cancel). This patch therefore limits the shortcut
to the !in_group case.
gcc/
PR rtl-optimization/115782
* recog.cc (validate_change_1): Suppress early exit for no-op
changes that are part of a group.
gcc/testsuite/
PR rtl-optimization/115782
* gcc.dg/pr115782.c: New test.
|
|
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_conv_array_parameter): Init variable to
NULL_TREE to fix bootstrap.
|
|
The Ada compiler now defers to the gimplifier for ordering comparisons of
arrays of bytes (Ada parlance for <, >, <= and >=) because the gimplifier
in turn defers to memcmp for them, which implements the required semantics.
However, the gimplifier has a special processing for aggregate types whose
mode is not BLKmode and this processing deviates from the memcmp semantics
when the target is little-endian.
gcc/
* gimplify.cc (gimplify_scalar_mode_aggregate_compare): Add support
for ordering comparisons.
(gimplify_expr) <default>: Call gimplify_scalar_mode_aggregate_compare
only for integral scalar modes.
gcc/testsuite/
* gnat.dg/array42.adb, gnat.dg/array42_pkg.ads: New test.
|
|
There are these insns that subtract and zero-extend where
the subtrahend is zero-extended to the mode of the minuend.
This patch uses one insn (and split) with mode iterators
instead of spelling out each variant individually.
This has the additional benefit that u32 - u24 is also supported,
which previously wasn't.
gcc/
* config/avr/avr-protos.h (avr_out_minus): New prototype.
* config/avr/avr.cc (avr_out_minus): New function.
* config/avr/avr.md (*sub<HISI:mode>3.zero_extend.<QIPSI:mode>)
(*sub<HISI:mode>3.zero_extend.<QIPSI:mode>_split): New insns.
(*subpsi3_zero_extend.qi_split): Remove isns_and_split.
(*subpsi3_zero_extend.hi_split): Remove insn_and_split.
(*subhi3_zero_extend1_split): Remove insn_and_split.
(*subsi3_zero_extend_split): Remove insn_and_split.
(*subsi3_zero_extend.hi_split): Remove insn_and_split.
(*subpsi3_zero_extend.qi): Remove insn.
(*subpsi3_zero_extend.hi): Remove insn.
(*subhi3_zero_extend1): Remove insn.
(*subsi3_zero_extend): Remove insn.
(*subsi3_zero_extend.hi): Remove insn.
gcc/testsuite/
* gcc.target/avr/torture/sub-zerox.c: New test.
|
|
When duplicate_decls finds a match with an existing imported
declaration, it clears DECL_LANG_SPECIFIC of the olddecl and replaces it
with the contents of newdecl; this clears DECL_MODULE_ENTITY_P causing
an ICE if the same declaration is imported again later.
This fixes the issue by ensuring that the flag is transferred to newdecl
before clearing so that it ends up on olddecl again.
For future-proofing we also do the same with DECL_MODULE_KEYED_DECLS_P,
though because we don't yet support textual redefinition merging we
can't yet test this works as intended. I don't expect it's possible for
a new declaration already to have extra keyed decls mismatching that of
the old declaration though, so I don't do anything with 'keyed_map' at
this time.
PR c++/99241
gcc/cp/ChangeLog:
* decl.cc (duplicate_decls): Merge module entity information.
gcc/testsuite/ChangeLog:
* g++.dg/modules/pr99241_a.H: New test.
* g++.dg/modules/pr99241_b.H: New test.
* g++.dg/modules/pr99241_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
This patch would like to add the test cases for the vector .SAT_SUB in
the zip benchmark. Aka:
Form in zip benchmark:
#define DEF_VEC_SAT_U_SUB_ZIP(T1, T2) \
void __attribute__((noinline)) \
vec_sat_u_sub_##T1##_##T2##_fmt_zip (T1 *x, T2 b, unsigned limit) \
{ \
T2 a; \
T1 *p = x; \
do { \
a = *--p; \
*p = (T1)(a >= b ? a - b : 0); \
} while (--limit); \
}
DEF_VEC_SAT_U_SUB_ZIP(uint8_t, uint16_t)
vec_sat_u_sub_uint16_t_uint32_t_fmt_zip:
...
vsetvli a4,zero,e32,m1,ta,ma
vmv.v.x v6,a1
vsetvli zero,zero,e16,mf2,ta,ma
vid.v v2
li a4,-1
vnclipu.wi v6,v6,0 // .SAT_TRUNC
.L3:
vle16.v v3,0(a3)
vrsub.vx v5,v2,a6
mv a7,a4
addw a4,a4,t3
vrgather.vv v1,v3,v5
vssubu.vv v1,v1,v6 // .SAT_SUB
vrgather.vv v3,v1,v5
vse16.v v3,0(a3)
sub a3,a3,t1
bgtu t4,a4,.L3
Passed the rv64gcv tests.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test
data for .SAT_SUB in zip benchmark.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
association argument and add un/pack_class. [PR96992]
Removing the assert in trans-expr, lead to initial strides not set
which is now fixed. When the array needs repacking, this is done for
class arrays now, too.
Packing class arrays was done using the regular internal pack
function in the past. But that does not use the vptr's copy
function and breaks OOP paradigms (e.g. deep copy). The new
un-/pack_class functions use the vptr's copy functionality to
implement OOP paradigms correctly.
PR fortran/96992
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_trans_array_bounds): Set a starting
stride, when descriptor expects a variable for the stride.
(gfc_trans_dummy_array_bias): Allow storage association for
dummy class arrays, when they are not elemental.
(gfc_conv_array_parameter): Add more general class support
and packing for classes, too.
* trans-array.h (gfc_conv_array_parameter): Add lbound shift
for class arrays.
* trans-decl.cc (gfc_build_builtin_function_decls): Add decls
for internal_un-/pack_class.
* trans-expr.cc (gfc_reset_vptr): Allow supplying a type-tree
to generate the vtab from.
(gfc_class_set_vptr): Allow supplying a class-tree to take the
vptr from.
(class_array_data_assign): Rename to gfc_class_array_data_assign
and make usable from other compile units.
(gfc_class_array_data_assign): Renamed from class_array_data_
assign.
(gfc_conv_derived_to_class): Remove assert to
allow converting derived to class type arrays with assumed
rank. Reduce code base and use gfc_conv_array_parameter also
for classes.
(gfc_conv_class_to_class): Use gfc_class_data_assign.
(gfc_conv_procedure_call): Adapt to new signature of
gfc_conv_derived_to_class.
* trans-io.cc (transfer_expr): Same.
* trans-stmt.cc (trans_associate_var): Same.
* trans.h (gfc_conv_derived_to_class): Signature changed.
(gfc_class_array_data_assign): Made public.
(gfor_fndecl_in_pack_class): Added declaration.
(gfor_fndecl_in_unpack_class): Same.
libgfortran/ChangeLog:
* Makefile.am: Add in_un-/pack_class.c to build.
* Makefile.in: Regenerated from Makefile.am.
* gfortran.map: Added new functions and bumped ABI.
* libgfortran.h (GFC_CLASS_T): Added for generating class
representation at runtime.
* runtime/in_pack_class.c: New file.
* runtime/in_unpack_class.c: New file.
gcc/testsuite/ChangeLog:
* gfortran.dg/class_dummy_11.f90: New test.
|
|
Add the --include and --exclude flags to gcov to control what functions
to report on. This is meant to make gcov more practical as an when
writing test suites or performing other coverage experiments, which
tends to focus on a few functions at the time. This really shines in
combination with the -t/--stdout flag. With support for more expansive
metrics in gcov like modified condition/decision coverage (MC/DC) and
path coverage, output quickly gets overwhelming without filtering.
The approach is quite simple: filters are egrep regexes and are
evaluated left-to-right, and the last filter "wins", that is, if a
function matches an --include and a subsequent --exclude, it should not
be included in the output. All of the output machinery works on the
function table, so by optionally (not) adding function makes the even
the json output work as expected, and only minor changes are needed to
suppress the filtered-out functions.
Demo: math.c
int mul (int a, int b) {
return a * b;
}
int sub (int a, int b) {
return a - b;
}
int sum (int a, int b) {
return a + b;
}
Plain matches:
$ gcov -t math --include=sum
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 9:int sum (int a, int b) {
#####: 10: return a + b;
-: 11:}
$ gcov -t math --include=mul
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 1:int mul (int a, int b) {
#####: 2: return a * b;
-: 3:}
Regex match:
$ gcov -t math --include=su
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 5:int sub (int a, int b) {
#####: 6: return a - b;
-: 7:}
#####: 9:int sum (int a, int b) {
#####: 10: return a + b;
-: 11:}
And similar for exclude:
$ gcov -t math --exclude=sum
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 1:int mul (int a, int b) {
#####: 2: return a * b;
-: 3:}
#####: 5:int sub (int a, int b) {
#####: 6: return a - b;
-: 7:}
And json, for good measure:
$ gcov -t math --include=sum --json | jq ".files[].lines[]"
{
"line_number": 9,
"function_name": "sum",
"count": 0,
"unexecuted_block": true,
"block_ids": [],
"branches": [],
"calls": []
}
{
"line_number": 10,
"function_name": "sum",
"count": 0,
"unexecuted_block": true,
"block_ids": [
2
],
"branches": [],
"calls": []
}
Matching generally work well for mangled names, as the mangled names
also have the base symbol name in it. By default, functions are matched
by the mangled name, which means matching on base names always work as
expected. The -M flag makes the matching work on the demangled name
which is quite useful when you only want to report on specific
overloads and can use the full type names.
Why not just use grep? grep is not really sufficient as grep is very
line oriented, and the reports that benefit the most from filtering
often unpredictably span multiple lines based on the state of coverage.
For example, a condition coverage report for 3 terms/6 outcomes only
outputs 1 line when all conditions are covered, and 7 with no lines
covered.
gcc/ChangeLog:
* doc/gcov.texi: Add --include, --exclude, --match-on-demangled
documentation.
* gcov.cc (struct fnfilter): New.
(print_usage): Add --include, --exclude, -M,
--match-on-demangled.
(process_args): Likewise.
(release_structures): Release filters.
(read_graph_file): Only add function_infos matching filters.
(output_lines): Likewise.
gcc/testsuite/ChangeLog:
* lib/gcov.exp: Add filtering test function.
* g++.dg/gcov/gcov-19.C: New test.
* g++.dg/gcov/gcov-20.C: New test.
* g++.dg/gcov/gcov-21.C: New test.
* gcc.misc-tests/gcov-25.c: New test.
* gcc.misc-tests/gcov-26.c: New test.
* gcc.misc-tests/gcov-27.c: New test.
* gcc.misc-tests/gcov-28.c: New test.
|
|
Ensure that the function.end_line in the lines vector for the source
file, even if it is not explicitly touched by a basic block. This
ensures consistency with what you would expect. For example, this file
has sources[sum.cc].lines.size () == 23 and main.end_line == 2 without
adjusting sources.lines, which in this case is a no-op.
#####: 17:int main ()
-: 18:{
#####: 19: sum (1, 2);
#####: 20: sum (1.1, 2);
#####: 21: sum (2.2, 2.3);
#####: 22:}
This is a useful property when combined with selective reporting.
gcc/ChangeLog:
* gcov.cc (process_all_functions): Ensure fn.end_line is
included source[fn].lines.
|
|
According to Zc-1.0.4-3.pdf from
https://github.com/riscvarchive/riscv-code-size-reduction/releases/tag/v1.0.4-3
The rule is that:
- C always implies Zca
- C+F implies Zcf (RV32 only)
- C+D implies Zcd
Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc:
c implies zca, and conditionally zcf & zcd.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/attribute-15.c: adapt TC.
* gcc.target/riscv/attribute-16.c: likewise.
* gcc.target/riscv/attribute-17.c: likewise.
* gcc.target/riscv/attribute-18.c: likewise.
* gcc.target/riscv/pr110696.c: likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-zcmp.c: likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-2-zcmp.c: likewise.
* gcc.target/riscv/rvv/base/pr114352-1.c: likewise.
* gcc.target/riscv/rvv/base/pr114352-3.c: likewise.
* gcc.target/riscv/arch-39.c: New test.
* gcc.target/riscv/arch-40.c: New test.
|
|
|
|
To get better vectorized code of .SAT_SUB, we would like to avoid the
truncated operation for the assignment. For example, as below.
unsigned int _1;
unsigned int _2;
unsigned short int _4;
_9 = (unsigned short int).SAT_SUB (_1, _2);
If we make sure that the _1 is in the range of unsigned short int. Such
as a def similar to:
_1 = (unsigned short int)_4;
Then we can do the distribute the truncation operation to:
_3 = (unsigned short int) MIN (65535, _2); // aka _3 = .SAT_TRUNC (_2);
_9 = .SAT_SUB (_4, _3);
Then, we can better vectorized code and avoid the unnecessary narrowing
stmt during vectorization with below stmt(s).
_3 = .SAT_TRUNC(_2); // SI => HI
_9 = .SAT_SUB (_4, _3);
Let's take RISC-V vector as example to tell the changes. For below
sample code:
__attribute__((noinline))
void test (uint16_t *x, unsigned b, unsigned n)
{
unsigned a = 0;
uint16_t *p = x;
do {
a = *--p;
*p = (uint16_t)(a >= b ? a - b : 0);
} while (--n);
}
Before this patch:
...
.L3:
vle16.v v1,0(a3)
vrsub.vx v5,v2,t1
mv t3,a4
addw a4,a4,t5
vrgather.vv v3,v1,v5
vsetvli zero,zero,e32,m1,ta,ma
vzext.vf2 v1,v3
vssubu.vx v1,v1,a1
vsetvli zero,zero,e16,mf2,ta,ma
vncvt.x.x.w v1,v1
vrgather.vv v3,v1,v5
vse16.v v3,0(a3)
sub a3,a3,t4
bgtu t6,a4,.L3
...
After this patch:
test:
...
.L3:
vle16.v v3,0(a3)
vrsub.vx v5,v2,a6
mv a7,a4
addw a4,a4,t3
vrgather.vv v1,v3,v5
vssubu.vv v1,v1,v6
vrgather.vv v3,v1,v5
vse16.v v3,0(a3)
sub a3,a3,t1
bgtu t4,a4,.L3
...
The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The rv64gcv build with glibc.
3. The x86 bootstrap tests.
4. The x86 fully regression tests.
gcc/ChangeLog:
* tree-vect-patterns.cc (vect_recog_sat_sub_pattern_transform):
Add new func impl to perform the truncation distribution.
(vect_recog_sat_sub_pattern): Perform above optimize before
generate .SAT_SUB call.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
A last minute change led to a wrong operand order in the compare insn.
gcc/ChangeLog:
* config/i386/i386.md (ustruncdi<mode>2): Swap compare operands.
(ustruncsi<mode>2): Ditto.
(ustrunchiqi2): Ditto.
|
|
In GCC 14 we deprecated Concepts TS and discussed removing the code
in GCC 15. This patch removes Concepts TS code from the front end,
including support for template-introductions, as in:
template<typename T>
concept C = true;
C{T} void foo (T); // write template<C T> void foo (T);
The biggest part of this patch is adjusting the testsuite. We don't
want to lose coverage so I've converted most of -fconcepts-ts tests
to C++20. That means they no longer have to be c++17_only. Mostly
this meant turning "concept bool" into "concept" and turning function
concepts into C++20 concepts. I've added missing "auto"s where
required, but "auto"s in template-argument-lists are not supported
anymore so I've removed some of the tests; some of them are still
present to verify we don't crash on such autos. I've also added ()
around "requires" expressions.
I plan to add a porting_to.html entry with a few hints.
I've rebased and tested the patch after the recent r15-1103.
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Remove flag_concepts_ts code.
* c-opts.cc (c_common_post_options): Likewise.
* c.opt: Remove -fconcepts-ts.
* c.opt.urls: Regenerate.
gcc/cp/ChangeLog:
* constraint.cc (deduce_concept_introduction, get_deduced_wildcard,
get_introduction_prototype, introduce_type_template_parameter,
introduce_template_template_parameter,
introduce_nontype_template_parameter,
build_introduced_template_parameter, introduce_template_parameter,
introduce_template_parameter_pack, introduce_template_parameter,
introduce_template_parameters, process_introduction_parms,
check_introduction_list, finish_template_introduction): Remove.
(finish_shorthand_constraint): Remove a Concepts TS comment.
* cp-tree.h (check_auto_in_tmpl_args, finish_template_introduction):
Remove.
* decl.cc (function_requirements_equivalent_p): Remove pre-C++20 code.
(grokfndecl): Don't check flag_concepts_ts.
(grokvardecl): Don't check that concept have type bool.
* parser.cc (cp_parser_decl_specifier_seq): Don't check
flag_concepts_ts.
(cp_parser_introduction_list): Remove.
(cp_parser_template_id): Remove dead code.
(cp_parser_simple_type_specifier): Don't check flag_concepts_ts.
(cp_parser_placeholder_type_specifier): Require require auto or
decltype(auto) even pre-C++20. Don't check flag_concepts_ts.
(cp_parser_type_id_1): Don't check flag_concepts_ts.
(cp_parser_template_type_arg): Likewise.
(cp_parser_requires_clause_opt): Remove flag_concepts_ts code.
(cp_parser_compound_requirement): Don't check flag_concepts_ts.
(cp_parser_template_introduction): Remove.
(cp_parser_template_declaration_after_export): Don't call
cp_parser_template_introduction.
* pt.cc (template_heads_equivalent_p): Remove pre-C++20 code.
(find_parameter_pack_data): Remove type_pack_expansion_p.
(find_parameter_packs_r): Remove flag_concepts_ts code. Remove
type_pack_expansion_p code.
(uses_parameter_packs): Remove type_pack_expansion_p code.
(make_pack_expansion): Likewise.
(check_for_bare_parameter_packs): Likewise.
(fixed_parameter_pack_p): Likewise.
(tsubst_qualified_id): Remove dead code.
(extract_autos_r): Remove.
(extract_autos): Remove.
(do_auto_deduction): Remove flag_concepts_ts code.
(type_uses_auto): Likewise.
(check_auto_in_tmpl_args): Remove.
gcc/ChangeLog:
* doc/invoke.texi: Mention that -fconcepts-ts was removed.
libstdc++-v3/ChangeLog:
* testsuite/std/ranges/access/101782.cc: Don't compile with
-fconcepts-ts.
gcc/testsuite/ChangeLog:
* g++.dg/concepts/auto3.C: Compile with -fconcepts. Run in C++17 and
up. Add dg-error.
* g++.dg/concepts/auto5.C: Likewise.
* g++.dg/concepts/auto7.C: Compile with -fconcepts. Add dg-error.
* g++.dg/concepts/auto8a.C: Compile with -fconcepts.
* g++.dg/concepts/class-deduction1.C: Compile with -fconcepts. Run in
C++17 and up. Convert to C++20.
* g++.dg/concepts/class5.C: Likewise.
* g++.dg/concepts/class6.C: Likewise.
* g++.dg/concepts/debug1.C: Likewise.
* g++.dg/concepts/decl-diagnose.C: Compile with -fconcepts. Run in
C++17 and up. Add dg-error.
* g++.dg/concepts/deduction-constraint1.C: Compile with -fconcepts.
Run in C++17 and up. Convert to C++20.
* g++.dg/concepts/diagnostic1.C: Likewise.
* g++.dg/concepts/dr1430.C: Likewise.
* g++.dg/concepts/equiv.C: Likewise.
* g++.dg/concepts/equiv2.C: Likewise.
* g++.dg/concepts/expression.C: Likewise.
* g++.dg/concepts/expression2.C: Likewise.
* g++.dg/concepts/expression3.C: Likewise.
* g++.dg/concepts/fn-concept2.C: Compile with -fconcepts. Run in
C++17 and up. Remove code. Add dg-prune-output.
* g++.dg/concepts/fn-concept3.C: Compile with -fconcepts. Run in
C++17 and up. Convert to C++20.
* g++.dg/concepts/fn1.C: Likewise.
* g++.dg/concepts/fn10.C: Likewise.
* g++.dg/concepts/fn2.C: Likewise.
* g++.dg/concepts/fn3.C: Likewise.
* g++.dg/concepts/fn4.C: Likewise.
* g++.dg/concepts/fn5.C: Likewise.
* g++.dg/concepts/fn6.C: Likewise.
* g++.dg/concepts/fn7.C: Compile with -fconcepts. Add dg-error.
* g++.dg/concepts/fn8.C: Compile with -fconcepts. Run in C++17 and up.
Convert to C++20.
* g++.dg/concepts/fn9.C: Likewise.
* g++.dg/concepts/generic-fn-err.C: Likewise.
* g++.dg/concepts/generic-fn.C: Likewise.
* g++.dg/concepts/inherit-ctor1.C: Likewise.
* g++.dg/concepts/inherit-ctor3.C: Likewise.
* g++.dg/concepts/intro1.C: Likewise.
* g++.dg/concepts/locations1.C: Compile with -fconcepts. Run in C++17
and up. Add dg-prune-output.
* g++.dg/concepts/partial-concept-id1.C: Compile with -fconcepts.
Run in C++17 and up. Convert to C++20.
* g++.dg/concepts/partial-concept-id2.C: Likewise.
* g++.dg/concepts/partial-spec5.C: Likewise.
* g++.dg/concepts/placeholder2.C: Likewise.
* g++.dg/concepts/placeholder3.C: Likewise.
* g++.dg/concepts/placeholder4.C: Likewise.
* g++.dg/concepts/placeholder5.C: Likewise.
* g++.dg/concepts/placeholder6.C: Likewise.
* g++.dg/concepts/pr65634.C: Likewise.
* g++.dg/concepts/pr65636.C: Likewise.
* g++.dg/concepts/pr65681.C: Likewise.
* g++.dg/concepts/pr65848.C: Likewise.
* g++.dg/concepts/pr67249.C: Likewise.
* g++.dg/concepts/pr67595.C: Likewise.
* g++.dg/concepts/pr68434.C: Likewise.
* g++.dg/concepts/pr71127.C: Likewise.
* g++.dg/concepts/pr71128.C: Compile with -fconcepts. Run in C++17
and up. Add dg-error.
* g++.dg/concepts/pr71131.C: Compile with -fconcepts. Run in C++17
and up. Convert to C++20.
* g++.dg/concepts/pr71385.C: Likewise.
* g++.dg/concepts/pr85065.C: Likewise.
* g++.dg/concepts/pr92804-2.C: Compile with -fconcepts. Convert to
C++20.
* g++.dg/concepts/template-parm11.C: Compile with -fconcepts. Run in
C++17 and up. Convert to C++20.
* g++.dg/concepts/template-parm12.C: Likewise.
* g++.dg/concepts/template-parm2.C: Likewise.
* g++.dg/concepts/template-parm3.C: Likewise.
* g++.dg/concepts/template-parm4.C: Likewise.
* g++.dg/concepts/template-template-parm1.C: Likewise.
* g++.dg/concepts/var-concept1.C: Likewise.
* g++.dg/concepts/var-concept2.C: Likewise.
* g++.dg/concepts/var-concept3.C: Likewise.
* g++.dg/concepts/var-concept4.C: Likewise.
* g++.dg/concepts/var-concept5.C: Likewise.
* g++.dg/concepts/var-concept6.C: Likewise.
* g++.dg/concepts/var-concept7.C: Likewise.
* g++.dg/concepts/var-templ1.C: Run in C++17 and up.
* g++.dg/concepts/var-templ2.C: Compile with -fconcepts. Run in C++17
and up. Convert to C++20.
* g++.dg/concepts/var-templ3.C: Likewise.
* g++.dg/concepts/variadic1.C: Likewise.
* g++.dg/concepts/variadic2.C: Likewise.
* g++.dg/concepts/variadic3.C: Likewise.
* g++.dg/concepts/variadic4.C: Likewise.
* g++.dg/cpp2a/concepts-pr65575.C: Likewise.
* g++.dg/cpp2a/concepts-pr66091.C: Likewise.
* g++.dg/cpp2a/concepts-pr67148.C: Compile with -fconcepts. Convert
to C++20.
* g++.dg/cpp2a/concepts-pr67225-1.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-3.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-4.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-5.C: Likewise.
* g++.dg/cpp2a/concepts-pr67319.C: Likewise.
* g++.dg/cpp2a/concepts-pr67427.C: Likewise.
* g++.dg/cpp2a/concepts-pr67654.C: Likewise.
* g++.dg/cpp2a/concepts-pr67658.C: Likewise.
* g++.dg/cpp2a/concepts-pr67684.C: Likewise.
* g++.dg/cpp2a/concepts-pr67697.C: Likewise.
* g++.dg/cpp2a/concepts-pr67719.C: Likewise.
* g++.dg/cpp2a/concepts-pr67774.C: Likewise.
* g++.dg/cpp2a/concepts-pr67825.C: Likewise.
* g++.dg/cpp2a/concepts-pr67860.C: Likewise.
* g++.dg/cpp2a/concepts-pr67862.C: Likewise.
* g++.dg/cpp2a/concepts-pr67969.C: Likewise.
* g++.dg/cpp2a/concepts-pr68093-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr68372.C: Likewise.
* g++.dg/cpp2a/concepts-pr68812.C: Likewise.
* g++.dg/cpp2a/concepts-pr69235.C: Likewise.
* g++.dg/cpp2a/concepts-pr78752-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr78752.C: Likewise.
* g++.dg/cpp2a/concepts-pr79759.C: Likewise.
* g++.dg/cpp2a/concepts-pr80746.C: Likewise.
* g++.dg/cpp2a/concepts-pr80773.C: Likewise.
* g++.dg/cpp2a/concepts-pr82507.C: Likewise.
* g++.dg/cpp2a/concepts-pr82740.C: Likewise.
* g++.dg/cpp2a/concepts-pr84980.C: Compile with -fconcepts. Run in
C++17 and up. Convert to C++20.
* g++.dg/cpp2a/concepts-pr85265.C: Likewise.
* g++.dg/cpp2a/concepts-pr85808.C: Compile with -fconcepts. Convert
to C++20.
* g++.dg/cpp2a/concepts-pr86269.C: Likewise.
* g++.dg/cpp2a/concepts-pr87441.C: Likewise.
* g++.dg/cpp2a/concepts-requires5.C: Compile with -fconcepts.
Adjust dg-error. Add same_as.
* g++.dg/cpp2a/nontype-class50a.C: Compile with -fconcepts.
* g++.dg/concepts/auto1.C: Removed.
* g++.dg/concepts/auto4.C: Removed.
* g++.dg/concepts/auto6.C: Removed.
* g++.dg/concepts/fn-concept1.C: Removed.
* g++.dg/concepts/intro2.C: Removed.
* g++.dg/concepts/intro3.C: Removed.
* g++.dg/concepts/intro4.C: Removed.
* g++.dg/concepts/intro5.C: Removed.
* g++.dg/concepts/intro6.C: Removed.
* g++.dg/concepts/intro7.C: Removed.
* g++.dg/cpp2a/concepts-ts1.C: Removed.
* g++.dg/cpp2a/concepts-ts2.C: Removed.
* g++.dg/cpp2a/concepts-ts3.C: Removed.
* g++.dg/cpp2a/concepts-ts4.C: Removed.
* g++.dg/cpp2a/concepts-ts5.C: Removed.
* g++.dg/cpp2a/concepts-ts6.C: Removed.
|
|
Here we ICE in c_expr_sizeof_expr on an erroneous expr.value. The
code checks for expr.value == error_mark_node but here the e_m_n is
wrapped in a C_MAYBE_CONST_EXPR. I don't think we should have created
such a tree, so let's return earlier in c_cast_expr.
PR c/115642
gcc/c/ChangeLog:
* c-typeck.cc (c_cast_expr): Return error_mark_node if build_c_cast
failed.
gcc/testsuite/ChangeLog:
* gcc.dg/noncompile/sizeof-1.c: New test.
|
|
I had this PR in my open tabs so why not go ahead and fix it.
decl_attributes gets last_decl, the last already pushed declaration,
to be used in common_handle_aligned_attribute. In C++, we look up
the decl via find_last_decl, which returns NULL_TREE if it finds
a decl that had not been declared. In C, we look up the decl via
lookup_last_decl which returns error_mark_node rather than NULL_TREE
in that case.
The error_mark_node causes a crash in common_handle_aligned_attribute.
We can fix this on the C FE side like in the patch below.
PR c/115549
gcc/c/ChangeLog:
* c-decl.cc (c_decl_attributes): If lookup_last_decl returns
error_mark_node, use NULL_TREE as last_decl.
gcc/testsuite/ChangeLog:
* c-c++-common/attr-aligned-2.c: New test.
|
|
Since r13-1006-g2005b9b888eeac, the test case copysign_softfloat_1.c
no longer contains any lsr istruction, so drop the check as per
comment 9 in PR105090.
gcc/testsuite/ChangeLog:
PR target/105090
* gcc.target/arm/copysign_softfloat_1.c: Drop check for lsr
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
|
|
Update all instances of zba_zbb_zbs in the testsuite to use b instead
gcc/testsuite/ChangeLog:
* g++.target/riscv/redundant-bitmap-1.C: Use gcb instead of
zba_zbb_zbs
* g++.target/riscv/redundant-bitmap-2.C: Ditto
* g++.target/riscv/redundant-bitmap-3.C: Ditto
* g++.target/riscv/redundant-bitmap-4.C: Ditto
* gcc.target/riscv/shift-add-1.c: Ditto
* gcc.target/riscv/shift-add-2.c: Ditto
* gcc.target/riscv/synthesis-1.c: Ditto
* gcc.target/riscv/synthesis-2.c: Ditto
* gcc.target/riscv/synthesis-3.c: Ditto
* gcc.target/riscv/synthesis-4.c: Ditto
* gcc.target/riscv/synthesis-5.c: Ditto
* gcc.target/riscv/synthesis-6.c: Ditto
* gcc.target/riscv/synthesis-7.c: Ditto
* gcc.target/riscv/synthesis-8.c: Ditto
* gcc.target/riscv/zba_zbs_and-1.c: Ditto
* gcc.target/riscv/zbs-zext-3.c: Ditto
* lib/target-supports.exp: Add b to riscv_get_arch
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
This patch adds support for recognizing the B standard extension to be the
collection of Zba, Zbb, Zbs extensions for consistency and conciseness
across toolchains
https://github.com/riscv/riscv-b/tags
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add imply rules for B extension
* config/riscv/arch-canonicalize: Ditto
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
expand_fn_using_insn has code to handle SUBREG_PROMOTED_VAR_P
destinations. Specifically, for:
(subreg/v:M1 (reg:M2 R) ...)
it creates a new temporary register T, uses it for the output
operand, then sign- or zero-extends the M1 lowpart of T to M2,
storing the result in R.
This patch splits this handling out into helper routines and
uses them for other instances of:
if (!rtx_equal_p (target, ops[0].value))
emit_move_insn (target, ops[0].value);
It's quite probable that this doesn't help any of the other cases;
in particular, it shouldn't affect vectors. But I think it could
be useful for the CRC work.
gcc/
* internal-fn.cc (create_call_lhs_operand, assign_call_lhs): New
functions, split out from...
(expand_fn_using_insn): ...here.
(expand_load_lanes_optab_fn): Use them.
(expand_GOMP_SIMT_ENTER_ALLOC): Likewise.
(expand_GOMP_SIMT_LAST_LANE): Likewise.
(expand_GOMP_SIMT_ORDERED_PRED): Likewise.
(expand_GOMP_SIMT_VOTE_ANY): Likewise.
(expand_GOMP_SIMT_XCHG_BFLY): Likewise.
(expand_GOMP_SIMT_XCHG_IDX): Likewise.
(expand_partial_load_optab_fn): Likewise.
(expand_vec_cond_optab_fn): Likewise.
(expand_vec_cond_mask_optab_fn): Likewise.
(expand_RAWMEMCHR): Likewise.
(expand_gather_load_optab_fn): Likewise.
(expand_while_optab_fn): Likewise.
(expand_SPACESHIP): Likewise.
|
|
This extends the r11-5179 fix which doesn't work with multidimensional
arrays. In particular,
struct S {
explicit S() { }
};
auto p = new S[1][1]();
should not say "converting to S from initializer list would use
explicit constructor" because there's no {}. However, since we
went into the block where we create a {}, we got confused. We
should not have gotten there but we did because array_p was true.
This patch refines the check once more.
PR c++/115645
gcc/cp/ChangeLog:
* init.cc (build_new): Don't do any deduction for arrays with
bounds if it's value-initialized.
gcc/testsuite/ChangeLog:
* g++.dg/expr/anew7.C: New test.
|
|
insn_propagation would previously only replace (reg:M H) with X
for some hard register H if the uses of H were also in mode M.
This patch extends it to handle simple mode punning too.
The original motivation was to try to get rid of the execution
frequency test in aarch64_split_simd_shift_p, but doing that is
follow-up work.
I tried this on at least one target per CPU directory (as for
the late-combine patches) and it seems to be a small win for
all of them.
The patch includes a couple of updates to the ia32 results.
In pr105033.c, foo3 replaced:
vmovq 8(%esp), %xmm1
vpunpcklqdq %xmm1, %xmm0, %xmm0
with:
vmovhps 8(%esp), %xmm0, %xmm0
In vect-bfloat16-2b.c, 5 of the vec_extract_v32bf_* routines
(specifically the ones with nonzero even indices) replaced
things like:
movl 28(%esp), %eax
vmovd %eax, %xmm0
with:
vpinsrw $0, 28(%esp), %xmm0, %xmm0
(These functions return a bf16, and so only the low 16 bits matter.)
gcc/
* recog.cc (insn_propagation::apply_to_rvalue_1): Handle simple
cases of hardreg propagation in which the register is set and
used in different modes.
gcc/testsuite/
* gcc.target/i386/pr105033.c: Expect vmovhps for the ia32 version
of foo.
* gcc.target/i386/vect-bfloat16-2b.c: Expect more vpinsrws.
|
|
change_insns is used to change multiple instructions at once, so that
the IR on return is valid & self-consistent. These changes can involve
moving instructions, and the new position for one instruction might
be expressed in terms of the old position of another instruction
that is changing at the same time.
change_insns therefore adds placeholder instructions to mark each
new instruction position, then replaces each placeholder with the
corresponding real instruction. This replacement was done in two
steps: removing the old placeholder instruction and inserting the new
real instruction. But it's more convenient for the upcoming fix for
PR115785 if we do the operation as a single step. That should also
be slightly more efficient, since e.g. no splay tree operations are
needed.
This operation happens purely on the rtl-ssa instruction chain.
The placeholders are never represented in rtl.
gcc/
PR rtl-optimization/115785
* rtl-ssa/functions.h (function_info::replace_nondebug_insn): Declare.
* rtl-ssa/insns.h (insn_info::order_node::set_uid): New function.
(insn_info::remove_note): Declare.
* rtl-ssa/insns.cc (insn_info::remove_note): New function.
(function_info::replace_nondebug_insn): Likewise.
* rtl-ssa/changes.cc (function_info::change_insns): Use
replace_nondebug_insn instead of remove_insn + add_insn.
|
|
During contract parsing, in grok_contract(), we proceed even if the
condition contains errors. This results in contracts with embedded errors
which eventually confuse gimplify. Checks for errors have been added in
grok_contract() to exit early if an error is encountered.
PR c++/113968
gcc/cp/ChangeLog:
* contracts.cc (grok_contract): Check for error_mark_node early
exit.
gcc/testsuite/ChangeLog:
* g++.dg/contracts/pr113968.C: New test.
Signed-off-by: Nina Ranns <dinka.ranns@gmail.com>
|
|
The bug fix changes gcc/m2/gm2-gcc/m2builtins.c:m2builtins_BuiltinExists
to recognise both __builtin_<functionname> and functionname as a builtin.
gcc/m2/ChangeLog:
PR modula2/115823
* gm2-gcc/m2builtins.cc (struct builtin_macro_definition): New
field builtinname.
(builtin_function_match): New function.
(builtin_macro_match): Ditto.
(m2builtins_BuiltinExists): Use builtin_function_match and
builtin_macro_match.
(lookup_builtin_macro): Use builtin_macro_match.
(lookup_builtin_function): Use builtin_function_match.
(define_builtin): Assign builtinname field.
gcc/testsuite/ChangeLog:
PR modula2/115823
* gm2/builtins/run/pass/testalloa.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
emit_store_flag_1 calculates scode (swapped condition code) at the
beginning of the function from the value of code variable. However,
code variable may change before scode usage site, resulting in
invalid stalled scode value.
Move calculation of scode value just before its only usage site to
avoid stalled scode value.
PR middle-end/115836
gcc/ChangeLog:
* expmed.cc (emit_store_flag_1): Move calculation of
scode just before its only usage site.
|
|
The arm 'pe' target was removed back in 2012 when the FPA support was
removed, but in a small number of places some conditional code was
accidentally left behind. It's no-longer needed, so remove it.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_dllexport_name_p): Remove prototype.
(arm_dllimport_name_p): Likewise.
(arm_pe_unique_section): Likewise.
(arm_pe_encode_section_info): Likewise.
(arm_dllexport_p): Likewise.
(arm_dllimport_p): Likewise.
(arm_mark_dllexport): Likewise.
(arm_mark_dllimport): Likewise.
(arm_change_mode_p): Likewise.
* config/arm/arm.cc (arm_gnu_attributes): Remove attributes for ARM_PE.
(TARGET_ENCODE_SECTION_INFO): Remove setting for ARM_PE.
(is_called_in_ARM_mode): Remove ARM_PE conditional code.
(thumb1_output_interwork): Remove obsolete ARM_PE code.
(arm_encode_section_info): Remove surrounding #ifndef.
|
|
gcc/ChangeLog:
PR lto/115394
* lto-streamer.h: Remove streamer_debugging definition.
* lto-streamer-out.cc (stream_write_tree_ref): Remove use of streamer_debugging.
(lto_output_tree): Likewise.
* tree-streamer-in.cc (streamer_read_tree_bitfields): Likewise.
(streamer_get_pickled_tree): Likewise.
* tree-streamer-out.cc (pack_ts_base_value_fields): Likewise.
Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
|
|
This patch would like to add form 2 support for the .SAT_TRUNC. Aka:
Form 2:
#define DEF_SAT_U_TRUC_FMT_2(NT, WT) \
NT __attribute__((noinline)) \
sat_u_truc_##WT##_to_##NT##_fmt_2 (WT x) \
{ \
bool overflow = x > (WT)(NT)(-1); \
return overflow ? (NT)-1 : (NT)x; \
}
DEF_SAT_U_TRUC_FMT_2(uint32, uint64)
Before this patch:
3 │
4 │ __attribute__((noinline))
5 │ uint32_t sat_u_truc_uint64_t_to_uint32_t_fmt_2 (uint64_t x)
6 │ {
7 │ uint32_t _1;
8 │ long unsigned int _3;
9 │
10 │ ;; basic block 2, loop depth 0
11 │ ;; pred: ENTRY
12 │ _3 = MIN_EXPR <x_2(D), 4294967295>;
13 │ _1 = (uint32_t) _3;
14 │ return _1;
15 │ ;; succ: EXIT
16 │
17 │ }
After this patch:
3 │
4 │ __attribute__((noinline))
5 │ uint32_t sat_u_truc_uint64_t_to_uint32_t_fmt_2 (uint64_t x)
6 │ {
7 │ uint32_t _1;
8 │
9 │ ;; basic block 2, loop depth 0
10 │ ;; pred: ENTRY
11 │ _1 = .SAT_TRUNC (x_2(D)); [tail call]
12 │ return _1;
13 │ ;; succ: EXIT
14 │
15 │ }
The below test suites are passed for this patch:
1. The x86 bootstrap test.
2. The x86 fully regression test.
3. The rv64gcv fully regresssion test.
gcc/ChangeLog:
* match.pd: Add form 2 for .SAT_TRUNC.
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Add new case NOP_EXPR, and try to match SAT_TRUNC.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
As a follow-up to adding a pattern that folds x/sqrt(x) to sqrt(x) in match.pd, this patch adds a test case for type Float16 for armv8.2-a+fp16.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/testsuite/
* gcc.target/aarch64/sqrt_div_float16.c: New test.
|
|
While working on adding V4QI support to the aarch64 backend,
vect/slp-gap-1.c started to fail but only because the regex
was failing. Before it was loading use SI (int) and afterwards,
we started to use V4QI. The generated code was the same and the
generated gimple was almost the same. The regex was searching
for `zero-padding trick` and it was still doing that but instead
of directly 0, it was V4QI 0 (or rather `{ 0, 0, 0 }`).
This extends regex to support both.
Tested on x86_64-linux-gnu and aarch64-linux-gnu (with the support added).
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-gap-1.c: Support matching `{_1, { 0, 0, 0, 0 }}`
in addition to `{_1, 0}`.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
This code has been dead at least since the move over to tuples
in 0-88576-g726a989a8b74bf, when gimple returns could only have
a simple expression in it. So let's remove it.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR tree-optimization/115721
* tree-complex.cc (expand_complex_comparison): Remove
support for GIMPLE_RETURN.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
No functional changes compared with V1, just spaces to table conversion
in testcases to pass check-function-bodies.
V1 passed regression locally but suprisingly failed in pre-commit CI, after
picking the patch from patchwork, I realize table got coverted to spaces
before sending the patch.
Root cause:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
Commit above tries in targetm.gen_epilogue () to detect if
there's li a0,0 insn at the end of insn chain, if so, cm.popret
is replaced by cm.popretz and li a0,0 insn is deleted.
Insertion of the generated epilogue sequence
into the insn chain doesn't happen at this moment.
If later shrink-wrap decides NOT to insert the epilogue sequence at the end
of insn chain, then the li a0,0 insn has already been mistakeny removed.
Fix this issue by removing generation of cm.popretz in epilogue,
leaving the assignment to a0 and use insn with cm.popret.
That's likely going to result in some kind of code size regression,
but not a correctness regression.
Optimization can be done in future.
Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
PR target/113715
* config/riscv/riscv.cc (riscv_zcmp_can_use_popretz): Removed.
(riscv_gen_multi_pop_insn): Remove generation of cm.popretz.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rv32e_zcmp.c: Adapt TC.
* gcc.target/riscv/rv32i_zcmp.c: Likewise.
|
|
|
|
Some tests added to test the type of redeclarations of enumerators
in r15-1394 fail on architectures where sizeof(long) == sizeof(int).
Adapt tests to use long long and/or accept that long long is selected
as type for the enumerator.
PR testsuite/115545
gcc/testsuite/
* gcc.dg/pr115109.c: Adapt test.
* gcc.dg/c23-tag-enum-6.c: Adapt test.
* gcc.dg/c23-tag-enum-7.c: Adapt test.
|
|
For redeclarations of struct in C23, if one has an alignment attribute
that makes the alignment different, we later get an ICE in verify_types.
This patches disallows such redeclarations by declaring such types to
be different.
PR c/114727
gcc/c/
* c-typeck.cc (tagged_types_tu_compatible): Add test.
gcc/testsuite/
* gcc.dg/pr114727.c: New test.
|
|
The new verification code produces an ICE for incorrect code. Add the
same logic as already used in comptypes to to bail out under certain
conditions.
PR c/115696
gcc/c/
* c-typeck.cc (comptypes_verify): Bail out for
identical, empty, and erroneous input types.
gcc/testsuite/
* gcc.dg/pr115696.c: New test.
|
|
The vector init built-ins:
__builtin_vec_init_v16qi, __builtin_vec_init_v8hi,
__builtin_vec_init_v4si, __builtin_vec_init_v4sf,
__builtin_vec_init_v2di, __builtin_vec_init_v2df,
__builtin_vec_init_v1ti
perform the same operation as initializing the vector in C code. For
example:
result_v4si = __builtin_vec_init_v4si (1, 2, 3, 4);
result_v4si = {1, 2, 3, 4};
These two constructs were tested and verified they generate identical
assembly instructions with no optimization and -O3 optimization.
The vector set built-ins:
__builtin_vec_set_v16qi, __builtin_vec_set_v8hi.
__builtin_vec_set_v4si, __builtin_vec_set_v4sf,
__builtin_vec_set_v1ti, __builtin_vec_set_v2di,
__builtin_vec_set_v2df
perform the same operation as setting a specific element in the vector in
C code. For example:
src_v4si = __builtin_vec_set_v4si (src_v4si, int_val, index);
src_v4si[index] = int_val;
The built-in actually generates more instructions than the inline C code
with no optimization but is identical with -O3 optimizations.
All of the above built-ins that are removed do not have test cases and
are not documented.
Built-ins __builtin_vec_set_v1ti __builtin_vec_set_v2di,
__builtin_vec_set_v2df are not removed as they are used in function
resolve_vec_insert() in file rs6000-c.cc.
The built-ins are removed as they don't provide any benefit over just
using C code.
The code to define the bif_init_bit, bif_is_init, as well as their uses
are removed. The function altivec_expand_vec_init_builtin is also removed.
gcc/ChangeLog:
* config/rs6000/rs6000-builtin.cc (altivec_expand_vec_init_builtin):
Remove the function.
(rs6000_expand_builtin): Remove the if bif_is_int check to call
the altivec_expand_vec_init_builtin function.
* config/rs6000/rs6000-builtins.def: Remove the attribute string
comment for init.
(__builtin_vec_init_v16qi,
__builtin_vec_init_v4sf, __builtin_vec_init_v4si,
__builtin_vec_init_v8hi, __builtin_vec_init_v1ti,
__builtin_vec_init_v2df, __builtin_vec_init_v2di,
__builtin_vec_set_v16qi, __builtin_vec_set_v4sf,
__builtin_vec_set_v4si, __builtin_vec_set_v8hi): Remove
built-in definitions.
* config/rs6000/rs6000-gen-builtins.cc: Remove comment for init
attribute string.
(struct attrinfo): Remove isinit entry.
(parse_bif_attrs): Remove the if statement to check for attribute
init.
(ifdef DEBUG): Remove print for init attribute string.
(write_decls): Remove print for define bif_init_bit and
define for bif_is_init.
(write_bif_static_init): Remove if bifp->attrs.isinit statement.
|
|
The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the overloaded
__builtin_altivec_vcmpeqfp_p built-in. The built-in is undocumented and
there are no test cases for it. The patch removes built-in
__builtin_vsx_xvcmpeqsp_p.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcmpeqsp_p):
Remove built-in definition.
|
|
Add a new signed and unsigned int128 overloaded vector instances for
vec_xxpermdi:
__int128 vec_xxpermdi (__int128, __int128, const int);
__uint128 vec_xxpermdi (__uint128, __uint128, const int);
Update the documentation to include a reference to the new vector built-in
instances of vec_xxpermdi.
Add test cases for the new overloaded instances.
gcc/ChangeLog:
* config/rs6000/rs6000-overload.def (vec_xxpermdi): Add new
overloaded built-in instances of vector signed and unsigned
int128.
* doc/extend.texi: Add documentation for built-in instances of
vector signed and unsigned int128.
gcc/testsuite/ChangeLog:gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vec_perm-runnable-i128.c: New test file.
|
|
The undocumented __builtin_vsx_xvnegdp and __builtin_vsx_xvnegsp are
redundant. The overloaded vec_neg built-in provides the same
functionality. The two built-ins are not documented nor are there any
test cases for them.
Remove the definitions so users will use the overloaded vec_neg built-in
which is documented in the PVIPR.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvnegdp,
__builtin_vsx_xvnegsp): Remove built-in definitions.
|
|
The undocumented built-ins:
__builtin_vsx_vperm_16qi_uns,
__builtin_vsx_vperm_1ti,
__builtin_vsx_vperm_1ti_uns,
__builtin_vsx_vperm_2df,
__builtin_vsx_vperm_2di,
__builtin_vsx_vperm_2di_uns,
__builtin_vsx_vperm_4sf,
__builtin_vsx_vperm_4si,
__builtin_vsx_vperm_4si_uns
are duplicats of the __builtin_altivec_* built-ins that are used by
the overloaded vec_perm built-in that is documented in the PVIPR.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_vperm_16qi_uns,
__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns): Remove
built-in definitions and comments.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_vperm_16qi_uns,
__builtin_vsx_vperm_1ti, __builtin_vsx_vperm_1ti_uns,
__builtin_vsx_vperm_2df, __builtin_vsx_vperm_2di,
__builtin_vsx_vperm_2di_uns, __builtin_vsx_vperm_4sf,
__builtin_vsx_vperm_4si, __builtin_vsx_vperm_4si_uns,
__builtin_vsx_vperm): Change call to built-in to the overloaded
built-in vec_perm.
|
|
The following undocumented built-ins are covered by the existing overloaded
vec_sel built-in definitions.
const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
same as vsc __builtin_vec_sel (vsc, vsc, vuc); (overloaded vec_sel)
const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
same as vuc __builtin_vec_sel (vuc, vuc, vuc); (overloaded vec_sel)
const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
same as vd __builtin_vec_sel (vd, vd, vull); (overloaded vec_sel)
const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
same as vsll __builtin_vec_sel (vsll, vsll, vsll); (overloaded vec_sel)
const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
same as vull __builtin_vec_sel (vull, vull, vsll); (overloaded vec_sel)
const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
same as vf __builtin_vec_sel (vf, vf, vsi) (overloaded vec_sel)
const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
same as vsi __builtin_vec_sel (vsi, vsi, vbi); (overloaded vec_sel)
const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
same as vui __builtin_vec_sel (vui, vui, vui); (overloaded vec_sel)
const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
same as vss __builtin_vec_sel (vss, vss, vbs); (overloaded vec_sel)
const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
same as vus __builtin_vec_sel (vus, vus, vus); (overloaded vec_sel)
This patch removed the duplicate built-in definitions so users will only
use the documented vec_sel built-in. The __builtin_vsx_xxsel_[4si, 8hi,
16qi, 4sf, 2df] tests are also removed.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_16qi,
__builtin_vsx_xxsel_16qi_uns, __builtin_vsx_xxsel_2df,
__builtin_vsx_xxsel_2di, __builtin_vsx_xxsel_2di_uns,
__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_4si,
__builtin_vsx_xxsel_4si_uns, __builtin_vsx_xxsel_8hi,
__builtin_vsx_xxsel_8hi_uns): Remove built-in definitions.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/vsx-builtin-3.c (__builtin_vsx_xxsel_4si,
__builtin_vsx_xxsel_8hi, __builtin_vsx_xxsel_16qi,
__builtin_vsx_xxsel_4sf, __builtin_vsx_xxsel_2df,
__builtin_vsx_xxsel): Change built-in call to overloaded built-in
call vec_sel.
|
|
Extend the vec_sel built-in to take three signed/unsigned/bool int128
arguments and return a signed/unsigned/bool int128 result.
Extending the vec_sel built-in makes the existing buit-ins
__builtin_vsx_xxsel_1ti and __builtin_vsx_xxsel_1ti_uns obsolete. The
patch removes these built-ins.
The patch adds documentation and test cases for the new overloaded
vec_sel built-ins.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxsel_1ti,
__builtin_vsx_xxsel_1ti_uns): Remove built-in definitions.
* config/rs6000/rs6000-overload.def (vec_sel): Add new
overloaded vector signed, unsigned and bool 128-bit definitions.
* doc/extend.texi (vec_sel): Add documentation for new instances
with signed, unsigned and bool 129-bit bool arguments.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-10-runnable.c: New runnable test
file.
* gcc.target/powerpc/builtins-10.c: New compile only test file.
|
|
The following undocumented built-ins are same as existing documented
overloaded builtins.
const vf __builtin_vsx_xxmrghw (vf, vf);
same as vf __builtin_vec_mergeh (vf, vf); (overloaded vec_mergeh)
const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
same as vsi __builtin_vec_mergeh (vsi, vsi); (overloaded vec_mergeh)
const vf __builtin_vsx_xxmrglw (vf, vf);
same as vf __builtin_vec_mergel (vf, vf); (overloaded vec_mergel)
const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
same as vsi __builtin_vec_mergel (vsi, vsi); (overloaded vec_mergel)
This patch removes the duplicate built-in definitions so only the
documented built-ins will be available for use. The case statements in
rs6000_gimple_fold_builtin are removed as they are no longer needed. The
patch removes the now unused define_expands for vsx_xxmrghw_<mode> and
vsx_xxmrglw_<mode>.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xxmrghw,
__builtin_vsx_xxmrghw_4si, __builtin_vsx_xxmrglw,
__builtin_vsx_xxmrglw_4si, __builtin_vsx_xxsel_16qi): Remove
built-in definition.
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin):
remove case entries RS6000_BIF_XXMRGLW_4SI,
RS6000_BIF_XXMRGLW_4SF, RS6000_BIF_XXMRGHW_4SI,
RS6000_BIF_XXMRGHW_4SF.
* config/rs6000/vsx.md (vsx_xxmrghw_<mode>, vsx_xxmrglw_<mode>):
Remove unused define_expands.
|
|
The following built-ins are redundant as they are covered by another
overloaded built-in.
__builtin_vsx_xvcvspdp covered by vec_double{e,o}
__builtin_vsx_xvcvdpsp covered by vec_float{e,o}
__builtin_vsx_xvcvsxwdp covered by vec_double{e,o}
__builtin_vsx_xvcvuxddp_uns covered by vec_double
Remove the redundant built-ins. They are not documented nor do they have
test cases.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspdp,
__builtin_vsx_xvcvdpsp, __builtin_vsx_xvcvsxwdp,
__builtin_vsx_xvcvuxddp_uns): Remove.
|
|
The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
convert a vector of floats to a vector of signed/unsigned long long ints.
Extend the existing vec_{un,}signed{e,o} built-ins to handle the argument
vector of floats to return a vector of even/odd signed/unsigned integers.
The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
built-ins.
The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
now for internal use only. They are not documented and they do not
have test cases.
Add testcases and update documentation.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds,
__builtin_vsx_xvcvspuxds): Rename to __builtin_vsignede_v4sf,
__builtin_vunsignede_v4sf respectively.
(XVCVSPSXDS, XVCVSPUXDS): Rename to VEC_VSIGNEDE_V4SF,
VEC_VUNSIGNEDE_V4SF respectively.
(__builtin_vsignedo_v4sf, __builtin_vunsignedo_v4sf): New
built-in definitions.
* config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
vec_unsignede, vec_unsignedo): Add new overloaded specifications.
* config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
* doc/extend.texi (vec_signedo, vec_signede, vec_unsignedo,
vec_unsignede): Add documentation for new overloaded built-ins to
convert vector float to vector {un,}signed long long.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-3-runnable.c
(test_unsigned_int_result, test_ll_unsigned_int_result): Add
new argument.
(vec_signede, vec_signedo, vec_unsignede, vec_unsignedo): New
tests for the overloaded built-ins.
|
|
The built-in __builtin_vsx_vunsigned_v2df is supposed to take a vector of
doubles and return a vector of unsigned long long ints. Similarly
__builtin_vsx_vunsigned_v4sf takes a vector of floats an is supposed to
return a vector of unsinged ints. The definitions are using the signed
version of the instructions not the unsigned version of the instruction.
The results should also be unsigned. The built-ins are used by the
overloaded vec_unsigned built-in which has an unsigned result.
Similarly the built-ins __builtin_vsx_vunsignede_v2df and
__builtin_vsx_vunsignedo_v2df are supposed to return an unsigned result.
If the floating point argument is negative, the unsigned result is zero.
The built-ins are used in the overloaded built-in vec_unsignede and
vec_unsignedo respectively.
Add a test cases for a negative floating point arguments for each of the
above built-ins.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_vunsigned_v2df,
__builtin_vsx_vunsigned_v4sf, __builtin_vsx_vunsignede_v2df,
__builtin_vsx_vunsignedo_v2df): Change the result type to unsigned.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/builtins-3-runnable.c: Add tests for
vec_unsignede and vec_unsignedo with negative arguments.
|
|
The built-in __builtin_vsx_xvcvspsxws is covered by built-in vec_signed
built-in that is documented in the PVIPR. The __builtin_vsx_xvcvspsxws
built-in is not documented and there are no test cases for it.
The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
vec_unsigned, remove.
The __builtin_vsx_xvcvspuxws is redundant as it is covered by
vec_unsigned, remove.
The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
vec_signed{e,o}, remove.
The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
vec_unsigned{e,o}, remove.
This patch removes the redundant built-ins.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxws,
__builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws,
__builtin_vsx_xvcvdpsxws, __builtin_vsx_xvcvdpuxws): Remove
built-in definitions.
|