Age | Commit message (Collapse) | Author | Files | Lines |
|
The previous commit changed atomic_ref<bool> to not use the integral
specialization. This adds a test to verify that change. We can't
directly test that the primary template is used, but we can check that
the member functions of the integral specializations are not present.
libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_ref/bool.cc: New test.
|
|
Per [atomics.ref.int] `bool` is excluded from the list of integral types
for which there is a specialization of the `atomic_ref` class template
and [Note 1] clearly states that `atomic_ref<bool>` "uses the primary
template" instead.
libstdc++-v3/ChangeLog:
* include/bits/atomic_base.h (__atomic_ref): Do not use integral
specialization for bool.
Signed-off-by: Damien Lebrun-Grandie <dalg24@gmail.com>
|
|
* elf.c (elf_nodebug): Suggest -g.
* macho.c (macho_nodebug): Suggest -g and dsymutil.
* pecoff.c (coff_nodebug): Suggest -g.
|
|
* dwarf.c: Remove trailing whitespace.
* macho.c: Likewise.
|
|
libstdc++-v3:
* doc/xml/manual/using.xml: Switch gcc.gnu.org links to https.
* doc/html/manual/using_concurrency.html: Regenerate.
* doc/html/manual/using_dynamic_or_shared.html: Ditto.
* doc/html/manual/using_headers.html: Ditto.
* doc/html/manual/using_namespaces.html: Ditto.
|
|
str[n]cmp
This patch eliminates an unnecessary sign extension for scalar inlined
string comparisons on rv64.
Conceptually this is pretty simple. Prove all the paths which "return"
a value from the inlined string comparison already have sign extended
values.
FINAL_LABEL is the point after the calculation of the return value. So
if we have a jump to FINAL_LABEL, we must have a properly extended
result value at that point.
Second we're going to arrange in the .md part of the expander to use an
X mode temporary for the result. After computing the result we will (if
necessary) extract the low part of the result using a SUBREG tagged with
the appropriate SUBREG_PROMOTED_* bits.
So with that background.
We find a jump to FINAL_LABEL in emit_strcmp_scalar_compare_byte. Since
we know the result is X mode, we can just emit the subtraction of the
two chars in X mode and we'll have a properly sign extended result.
There's 4 jumps to final_label in emit_strcmp_scalar.
The first is just returning zero and needs trivial simplification to not
force the result into SImode.
The second is after calling strcmp in the library. The ABI mandates
that value is sign extended, so there's nothing to do for that case.
The 3rd occurs after a call to
emit_strcmp_scalar_result_calculation_nonul. If we dive into that
routine it needs simplificationq similar to what we did in
emit_strcmp_scalar_compare_byte
The 4th occurs after a call to emit_strcmp_scalar_result_calculation
which again needs trivial adjustment like we've done in the other routines.
Finally, at the end of expand_strcmp, just store the X mode result
sitting in SUB to RESULT.
The net of all that is we know every path has its result properly
extended to X mode. Standard redundant extension removal will take care
of the rest.
We've been running this within Ventana for about 6 months, so naturally
it's been through various QA cycles, dhrystone, spec2017, etc. It's
also been through a build/test cycle in my tester. Waiting on results
from the pre-commit testing before moving forward.
gcc/
* config/riscv/riscv-string.cc
(emit_strcmp_scalar_compare_byte): Set RESULT directly rather
than using a new temporary.
(emit_strcmp_scalar_result_calculation_nonul): Likewise.
(emit_strcmp_scalar_result_calculation): Likewise.
(riscv_expand_strcmp_scalar): Use CONST0_RTX rather than
generating a new node.
(expand_strcmp): Copy directly from SUB to RESULT.
* config/riscv/riscv.md (cmpstrnsi, cmpstrsi): Pass an X
mode temporary to the expansion routines. If necessary
extract low part of the word to store in final result location.
|
|
I noticed there was a warning from clang about int_range's
dtor being marked as final saying the class cannot be inherited from.
So let's mark the few ranger classes as final for those which we know
will be final.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* value-range.h (class int_range): Mark as final.
(class prange): Likewise.
(class frange): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
This test uses -fkeep-inline-functions and with debug mode enabled that
compiles much slower and times out if the system is under heavy load.
The original problem being tested is independent of debug mode, so just
require normal mode for the test.
libstdc++-v3/ChangeLog:
PR libstdc++/108636
* testsuite/27_io/filesystem/path/108636.cc: Require normal
mode.
|
|
This patch fixes the backend pattern that was printing the wrong input
scalar register pair when inserting into lane 1.
Added a new test to force float-abi=hard so we can use scan-assembler to check
correct codegen.
gcc/ChangeLog:
PR target/115611
* config/arm/mve.md (mve_vec_setv2di_internal): Fix printing of input
scalar register pair when lane = 1.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/vsetq_lane_su64.c: New test.
|
|
In this PR, due to the -f flags, we ended up with:
bb1: r10=r10
...
bb2: r10=r10
...
bb3: ...=r10
with bb1->bb2 and bb1->bb3.
late-combine successfully combined the bb1->bb2 def-use and set
the insn code to NOOP_MOVE_INSN_CODE. The bb1->bb3 combination
then failed for... reasons. At this point, everything should have
been rewound to its original state.
However, substituting r10=r10 into r10=r10 gives r10=r10, and
validate_change had an early-out for no-op rtl changes. This meant
that validate_change did not register a change for the bb2 insn and
so did not save its old insn code. The NOOP_MOVE_INSN_CODE therefore
persisted even after the attempt had been rewound.
IMO it'd be too cumbersome and error-prone to expect all users of
validate_change to be aware of this possibility. If code is using
validate_change with in_group=1, I think it has a reasonable expectation
that a change will be registered and that the insn code will be saved
(and restored on cancel). This patch therefore limits the shortcut
to the !in_group case.
gcc/
PR rtl-optimization/115782
* recog.cc (validate_change_1): Suppress early exit for no-op
changes that are part of a group.
gcc/testsuite/
PR rtl-optimization/115782
* gcc.dg/pr115782.c: New test.
|
|
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_conv_array_parameter): Init variable to
NULL_TREE to fix bootstrap.
|
|
The Ada compiler now defers to the gimplifier for ordering comparisons of
arrays of bytes (Ada parlance for <, >, <= and >=) because the gimplifier
in turn defers to memcmp for them, which implements the required semantics.
However, the gimplifier has a special processing for aggregate types whose
mode is not BLKmode and this processing deviates from the memcmp semantics
when the target is little-endian.
gcc/
* gimplify.cc (gimplify_scalar_mode_aggregate_compare): Add support
for ordering comparisons.
(gimplify_expr) <default>: Call gimplify_scalar_mode_aggregate_compare
only for integral scalar modes.
gcc/testsuite/
* gnat.dg/array42.adb, gnat.dg/array42_pkg.ads: New test.
|
|
There are these insns that subtract and zero-extend where
the subtrahend is zero-extended to the mode of the minuend.
This patch uses one insn (and split) with mode iterators
instead of spelling out each variant individually.
This has the additional benefit that u32 - u24 is also supported,
which previously wasn't.
gcc/
* config/avr/avr-protos.h (avr_out_minus): New prototype.
* config/avr/avr.cc (avr_out_minus): New function.
* config/avr/avr.md (*sub<HISI:mode>3.zero_extend.<QIPSI:mode>)
(*sub<HISI:mode>3.zero_extend.<QIPSI:mode>_split): New insns.
(*subpsi3_zero_extend.qi_split): Remove isns_and_split.
(*subpsi3_zero_extend.hi_split): Remove insn_and_split.
(*subhi3_zero_extend1_split): Remove insn_and_split.
(*subsi3_zero_extend_split): Remove insn_and_split.
(*subsi3_zero_extend.hi_split): Remove insn_and_split.
(*subpsi3_zero_extend.qi): Remove insn.
(*subpsi3_zero_extend.hi): Remove insn.
(*subhi3_zero_extend1): Remove insn.
(*subsi3_zero_extend): Remove insn.
(*subsi3_zero_extend.hi): Remove insn.
gcc/testsuite/
* gcc.target/avr/torture/sub-zerox.c: New test.
|
|
When duplicate_decls finds a match with an existing imported
declaration, it clears DECL_LANG_SPECIFIC of the olddecl and replaces it
with the contents of newdecl; this clears DECL_MODULE_ENTITY_P causing
an ICE if the same declaration is imported again later.
This fixes the issue by ensuring that the flag is transferred to newdecl
before clearing so that it ends up on olddecl again.
For future-proofing we also do the same with DECL_MODULE_KEYED_DECLS_P,
though because we don't yet support textual redefinition merging we
can't yet test this works as intended. I don't expect it's possible for
a new declaration already to have extra keyed decls mismatching that of
the old declaration though, so I don't do anything with 'keyed_map' at
this time.
PR c++/99241
gcc/cp/ChangeLog:
* decl.cc (duplicate_decls): Merge module entity information.
gcc/testsuite/ChangeLog:
* g++.dg/modules/pr99241_a.H: New test.
* g++.dg/modules/pr99241_b.H: New test.
* g++.dg/modules/pr99241_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
|
|
This patch would like to add the test cases for the vector .SAT_SUB in
the zip benchmark. Aka:
Form in zip benchmark:
#define DEF_VEC_SAT_U_SUB_ZIP(T1, T2) \
void __attribute__((noinline)) \
vec_sat_u_sub_##T1##_##T2##_fmt_zip (T1 *x, T2 b, unsigned limit) \
{ \
T2 a; \
T1 *p = x; \
do { \
a = *--p; \
*p = (T1)(a >= b ? a - b : 0); \
} while (--limit); \
}
DEF_VEC_SAT_U_SUB_ZIP(uint8_t, uint16_t)
vec_sat_u_sub_uint16_t_uint32_t_fmt_zip:
...
vsetvli a4,zero,e32,m1,ta,ma
vmv.v.x v6,a1
vsetvli zero,zero,e16,mf2,ta,ma
vid.v v2
li a4,-1
vnclipu.wi v6,v6,0 // .SAT_TRUNC
.L3:
vle16.v v3,0(a3)
vrsub.vx v5,v2,a6
mv a7,a4
addw a4,a4,t3
vrgather.vv v1,v3,v5
vssubu.vv v1,v1,v6 // .SAT_SUB
vrgather.vv v3,v1,v5
vse16.v v3,0(a3)
sub a3,a3,t1
bgtu t4,a4,.L3
Passed the rv64gcv tests.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test
data for .SAT_SUB in zip benchmark.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
* MAINTAINERS: Update my email address.
Signed-off-by: Kugan Vivekanandarajah <kvivekananda@nvidia.com>
|
|
association argument and add un/pack_class. [PR96992]
Removing the assert in trans-expr, lead to initial strides not set
which is now fixed. When the array needs repacking, this is done for
class arrays now, too.
Packing class arrays was done using the regular internal pack
function in the past. But that does not use the vptr's copy
function and breaks OOP paradigms (e.g. deep copy). The new
un-/pack_class functions use the vptr's copy functionality to
implement OOP paradigms correctly.
PR fortran/96992
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_trans_array_bounds): Set a starting
stride, when descriptor expects a variable for the stride.
(gfc_trans_dummy_array_bias): Allow storage association for
dummy class arrays, when they are not elemental.
(gfc_conv_array_parameter): Add more general class support
and packing for classes, too.
* trans-array.h (gfc_conv_array_parameter): Add lbound shift
for class arrays.
* trans-decl.cc (gfc_build_builtin_function_decls): Add decls
for internal_un-/pack_class.
* trans-expr.cc (gfc_reset_vptr): Allow supplying a type-tree
to generate the vtab from.
(gfc_class_set_vptr): Allow supplying a class-tree to take the
vptr from.
(class_array_data_assign): Rename to gfc_class_array_data_assign
and make usable from other compile units.
(gfc_class_array_data_assign): Renamed from class_array_data_
assign.
(gfc_conv_derived_to_class): Remove assert to
allow converting derived to class type arrays with assumed
rank. Reduce code base and use gfc_conv_array_parameter also
for classes.
(gfc_conv_class_to_class): Use gfc_class_data_assign.
(gfc_conv_procedure_call): Adapt to new signature of
gfc_conv_derived_to_class.
* trans-io.cc (transfer_expr): Same.
* trans-stmt.cc (trans_associate_var): Same.
* trans.h (gfc_conv_derived_to_class): Signature changed.
(gfc_class_array_data_assign): Made public.
(gfor_fndecl_in_pack_class): Added declaration.
(gfor_fndecl_in_unpack_class): Same.
libgfortran/ChangeLog:
* Makefile.am: Add in_un-/pack_class.c to build.
* Makefile.in: Regenerated from Makefile.am.
* gfortran.map: Added new functions and bumped ABI.
* libgfortran.h (GFC_CLASS_T): Added for generating class
representation at runtime.
* runtime/in_pack_class.c: New file.
* runtime/in_unpack_class.c: New file.
gcc/testsuite/ChangeLog:
* gfortran.dg/class_dummy_11.f90: New test.
|
|
This reverts commit 7d454cae9d7df1f2936ad02d0742674a85396736.
The change breaks bootstrap on some x86_64 OS versions.
|
|
Add the --include and --exclude flags to gcov to control what functions
to report on. This is meant to make gcov more practical as an when
writing test suites or performing other coverage experiments, which
tends to focus on a few functions at the time. This really shines in
combination with the -t/--stdout flag. With support for more expansive
metrics in gcov like modified condition/decision coverage (MC/DC) and
path coverage, output quickly gets overwhelming without filtering.
The approach is quite simple: filters are egrep regexes and are
evaluated left-to-right, and the last filter "wins", that is, if a
function matches an --include and a subsequent --exclude, it should not
be included in the output. All of the output machinery works on the
function table, so by optionally (not) adding function makes the even
the json output work as expected, and only minor changes are needed to
suppress the filtered-out functions.
Demo: math.c
int mul (int a, int b) {
return a * b;
}
int sub (int a, int b) {
return a - b;
}
int sum (int a, int b) {
return a + b;
}
Plain matches:
$ gcov -t math --include=sum
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 9:int sum (int a, int b) {
#####: 10: return a + b;
-: 11:}
$ gcov -t math --include=mul
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 1:int mul (int a, int b) {
#####: 2: return a * b;
-: 3:}
Regex match:
$ gcov -t math --include=su
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 5:int sub (int a, int b) {
#####: 6: return a - b;
-: 7:}
#####: 9:int sum (int a, int b) {
#####: 10: return a + b;
-: 11:}
And similar for exclude:
$ gcov -t math --exclude=sum
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 1:int mul (int a, int b) {
#####: 2: return a * b;
-: 3:}
#####: 5:int sub (int a, int b) {
#####: 6: return a - b;
-: 7:}
And json, for good measure:
$ gcov -t math --include=sum --json | jq ".files[].lines[]"
{
"line_number": 9,
"function_name": "sum",
"count": 0,
"unexecuted_block": true,
"block_ids": [],
"branches": [],
"calls": []
}
{
"line_number": 10,
"function_name": "sum",
"count": 0,
"unexecuted_block": true,
"block_ids": [
2
],
"branches": [],
"calls": []
}
Matching generally work well for mangled names, as the mangled names
also have the base symbol name in it. By default, functions are matched
by the mangled name, which means matching on base names always work as
expected. The -M flag makes the matching work on the demangled name
which is quite useful when you only want to report on specific
overloads and can use the full type names.
Why not just use grep? grep is not really sufficient as grep is very
line oriented, and the reports that benefit the most from filtering
often unpredictably span multiple lines based on the state of coverage.
For example, a condition coverage report for 3 terms/6 outcomes only
outputs 1 line when all conditions are covered, and 7 with no lines
covered.
gcc/ChangeLog:
* doc/gcov.texi: Add --include, --exclude, --match-on-demangled
documentation.
* gcov.cc (struct fnfilter): New.
(print_usage): Add --include, --exclude, -M,
--match-on-demangled.
(process_args): Likewise.
(release_structures): Release filters.
(read_graph_file): Only add function_infos matching filters.
(output_lines): Likewise.
gcc/testsuite/ChangeLog:
* lib/gcov.exp: Add filtering test function.
* g++.dg/gcov/gcov-19.C: New test.
* g++.dg/gcov/gcov-20.C: New test.
* g++.dg/gcov/gcov-21.C: New test.
* gcc.misc-tests/gcov-25.c: New test.
* gcc.misc-tests/gcov-26.c: New test.
* gcc.misc-tests/gcov-27.c: New test.
* gcc.misc-tests/gcov-28.c: New test.
|
|
Ensure that the function.end_line in the lines vector for the source
file, even if it is not explicitly touched by a basic block. This
ensures consistency with what you would expect. For example, this file
has sources[sum.cc].lines.size () == 23 and main.end_line == 2 without
adjusting sources.lines, which in this case is a no-op.
#####: 17:int main ()
-: 18:{
#####: 19: sum (1, 2);
#####: 20: sum (1.1, 2);
#####: 21: sum (2.2, 2.3);
#####: 22:}
This is a useful property when combined with selective reporting.
gcc/ChangeLog:
* gcov.cc (process_all_functions): Ensure fn.end_line is
included source[fn].lines.
|
|
According to Zc-1.0.4-3.pdf from
https://github.com/riscvarchive/riscv-code-size-reduction/releases/tag/v1.0.4-3
The rule is that:
- C always implies Zca
- C+F implies Zcf (RV32 only)
- C+D implies Zcd
Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc:
c implies zca, and conditionally zcf & zcd.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/attribute-15.c: adapt TC.
* gcc.target/riscv/attribute-16.c: likewise.
* gcc.target/riscv/attribute-17.c: likewise.
* gcc.target/riscv/attribute-18.c: likewise.
* gcc.target/riscv/pr110696.c: likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-zcmp.c: likewise.
* gcc.target/riscv/rvv/base/abi-callee-saved-2-zcmp.c: likewise.
* gcc.target/riscv/rvv/base/pr114352-1.c: likewise.
* gcc.target/riscv/rvv/base/pr114352-3.c: likewise.
* gcc.target/riscv/arch-39.c: New test.
* gcc.target/riscv/arch-40.c: New test.
|
|
|
|
To get better vectorized code of .SAT_SUB, we would like to avoid the
truncated operation for the assignment. For example, as below.
unsigned int _1;
unsigned int _2;
unsigned short int _4;
_9 = (unsigned short int).SAT_SUB (_1, _2);
If we make sure that the _1 is in the range of unsigned short int. Such
as a def similar to:
_1 = (unsigned short int)_4;
Then we can do the distribute the truncation operation to:
_3 = (unsigned short int) MIN (65535, _2); // aka _3 = .SAT_TRUNC (_2);
_9 = .SAT_SUB (_4, _3);
Then, we can better vectorized code and avoid the unnecessary narrowing
stmt during vectorization with below stmt(s).
_3 = .SAT_TRUNC(_2); // SI => HI
_9 = .SAT_SUB (_4, _3);
Let's take RISC-V vector as example to tell the changes. For below
sample code:
__attribute__((noinline))
void test (uint16_t *x, unsigned b, unsigned n)
{
unsigned a = 0;
uint16_t *p = x;
do {
a = *--p;
*p = (uint16_t)(a >= b ? a - b : 0);
} while (--n);
}
Before this patch:
...
.L3:
vle16.v v1,0(a3)
vrsub.vx v5,v2,t1
mv t3,a4
addw a4,a4,t5
vrgather.vv v3,v1,v5
vsetvli zero,zero,e32,m1,ta,ma
vzext.vf2 v1,v3
vssubu.vx v1,v1,a1
vsetvli zero,zero,e16,mf2,ta,ma
vncvt.x.x.w v1,v1
vrgather.vv v3,v1,v5
vse16.v v3,0(a3)
sub a3,a3,t4
bgtu t6,a4,.L3
...
After this patch:
test:
...
.L3:
vle16.v v3,0(a3)
vrsub.vx v5,v2,a6
mv a7,a4
addw a4,a4,t3
vrgather.vv v1,v3,v5
vssubu.vv v1,v1,v6
vrgather.vv v3,v1,v5
vse16.v v3,0(a3)
sub a3,a3,t1
bgtu t4,a4,.L3
...
The below test suites are passed for this patch:
1. The rv64gcv fully regression tests.
2. The rv64gcv build with glibc.
3. The x86 bootstrap tests.
4. The x86 fully regression tests.
gcc/ChangeLog:
* tree-vect-patterns.cc (vect_recog_sat_sub_pattern_transform):
Add new func impl to perform the truncation distribution.
(vect_recog_sat_sub_pattern): Perform above optimize before
generate .SAT_SUB call.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
Users are not supposed to create objects of this type, and there's no
reason it needs to be copyable. LWG 4061 makes it non-copyable and
non-default constructible.
libstdc++-v3/ChangeLog:
PR libstdc++/114387
* include/std/format (basic_format_context): Define copy
operations as deleted, as per LWG 4061.
* testsuite/std/format/context.cc: New test.
|
|
For the C locale we know the encoding is ASCII, so we can avoid using
newlocale and nl_langinfo_l. Similarly, for an unnamed locale, we aren't
going to get a useful answer, so we can just use a default-constrcuted
std::text_encoding representing an unknown encoding.
libstdc++-v3/ChangeLog:
* src/c++26/text_encoding.cc (__locale_encoding): Add to unnamed
namespace.
(std::locale::encoding): Optimize for "C" and "*" names.
|
|
[PR115854]
The consensus in the standard committee is that this change shouldn't be
necessary, and the Allocator requirements should require conversions
between rebound allocators to be implicit. But we can make it work for
now anyway.
libstdc++-v3/ChangeLog:
PR libstdc++/115854
* include/bits/stl_bvector.h (_Bvector_base): Convert allocator
to rebound type explicitly.
* testsuite/23_containers/vector/allocator/115854.cc: New test.
* testsuite/23_containers/vector/bool/allocator/115854.cc: New test.
|
|
For an integer-class type we need to use an explicit conversion to size_t.
libstdc++-v3/ChangeLog:
PR libstdc++/115799
* include/bits/ranges_util.h (__find_fn): Make conversion
from difference type ti size_t explicit.
* testsuite/25_algorithms/find/bytes.cc: Check ranges::find with
__gnu_test::test_contiguous_range.
* testsuite/std/ranges/range.cc: Adjust expected difference_type
for __gnu_test::test_contiguous_range.
* testsuite/util/testsuite_iterators.h (contiguous_iterator_wrapper):
Use __max_diff_type as difference type.
(test_range::sentinel, test_sized_range_sized_sent::sentinel):
Ensure that operator- returns difference_type.
|
|
A last minute change led to a wrong operand order in the compare insn.
gcc/ChangeLog:
* config/i386/i386.md (ustruncdi<mode>2): Swap compare operands.
(ustruncsi<mode>2): Ditto.
(ustrunchiqi2): Ditto.
|
|
In GCC 14 we deprecated Concepts TS and discussed removing the code
in GCC 15. This patch removes Concepts TS code from the front end,
including support for template-introductions, as in:
template<typename T>
concept C = true;
C{T} void foo (T); // write template<C T> void foo (T);
The biggest part of this patch is adjusting the testsuite. We don't
want to lose coverage so I've converted most of -fconcepts-ts tests
to C++20. That means they no longer have to be c++17_only. Mostly
this meant turning "concept bool" into "concept" and turning function
concepts into C++20 concepts. I've added missing "auto"s where
required, but "auto"s in template-argument-lists are not supported
anymore so I've removed some of the tests; some of them are still
present to verify we don't crash on such autos. I've also added ()
around "requires" expressions.
I plan to add a porting_to.html entry with a few hints.
I've rebased and tested the patch after the recent r15-1103.
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Remove flag_concepts_ts code.
* c-opts.cc (c_common_post_options): Likewise.
* c.opt: Remove -fconcepts-ts.
* c.opt.urls: Regenerate.
gcc/cp/ChangeLog:
* constraint.cc (deduce_concept_introduction, get_deduced_wildcard,
get_introduction_prototype, introduce_type_template_parameter,
introduce_template_template_parameter,
introduce_nontype_template_parameter,
build_introduced_template_parameter, introduce_template_parameter,
introduce_template_parameter_pack, introduce_template_parameter,
introduce_template_parameters, process_introduction_parms,
check_introduction_list, finish_template_introduction): Remove.
(finish_shorthand_constraint): Remove a Concepts TS comment.
* cp-tree.h (check_auto_in_tmpl_args, finish_template_introduction):
Remove.
* decl.cc (function_requirements_equivalent_p): Remove pre-C++20 code.
(grokfndecl): Don't check flag_concepts_ts.
(grokvardecl): Don't check that concept have type bool.
* parser.cc (cp_parser_decl_specifier_seq): Don't check
flag_concepts_ts.
(cp_parser_introduction_list): Remove.
(cp_parser_template_id): Remove dead code.
(cp_parser_simple_type_specifier): Don't check flag_concepts_ts.
(cp_parser_placeholder_type_specifier): Require require auto or
decltype(auto) even pre-C++20. Don't check flag_concepts_ts.
(cp_parser_type_id_1): Don't check flag_concepts_ts.
(cp_parser_template_type_arg): Likewise.
(cp_parser_requires_clause_opt): Remove flag_concepts_ts code.
(cp_parser_compound_requirement): Don't check flag_concepts_ts.
(cp_parser_template_introduction): Remove.
(cp_parser_template_declaration_after_export): Don't call
cp_parser_template_introduction.
* pt.cc (template_heads_equivalent_p): Remove pre-C++20 code.
(find_parameter_pack_data): Remove type_pack_expansion_p.
(find_parameter_packs_r): Remove flag_concepts_ts code. Remove
type_pack_expansion_p code.
(uses_parameter_packs): Remove type_pack_expansion_p code.
(make_pack_expansion): Likewise.
(check_for_bare_parameter_packs): Likewise.
(fixed_parameter_pack_p): Likewise.
(tsubst_qualified_id): Remove dead code.
(extract_autos_r): Remove.
(extract_autos): Remove.
(do_auto_deduction): Remove flag_concepts_ts code.
(type_uses_auto): Likewise.
(check_auto_in_tmpl_args): Remove.
gcc/ChangeLog:
* doc/invoke.texi: Mention that -fconcepts-ts was removed.
libstdc++-v3/ChangeLog:
* testsuite/std/ranges/access/101782.cc: Don't compile with
-fconcepts-ts.
gcc/testsuite/ChangeLog:
* g++.dg/concepts/auto3.C: Compile with -fconcepts. Run in C++17 and
up. Add dg-error.
* g++.dg/concepts/auto5.C: Likewise.
* g++.dg/concepts/auto7.C: Compile with -fconcepts. Add dg-error.
* g++.dg/concepts/auto8a.C: Compile with -fconcepts.
* g++.dg/concepts/class-deduction1.C: Compile with -fconcepts. Run in
C++17 and up. Convert to C++20.
* g++.dg/concepts/class5.C: Likewise.
* g++.dg/concepts/class6.C: Likewise.
* g++.dg/concepts/debug1.C: Likewise.
* g++.dg/concepts/decl-diagnose.C: Compile with -fconcepts. Run in
C++17 and up. Add dg-error.
* g++.dg/concepts/deduction-constraint1.C: Compile with -fconcepts.
Run in C++17 and up. Convert to C++20.
* g++.dg/concepts/diagnostic1.C: Likewise.
* g++.dg/concepts/dr1430.C: Likewise.
* g++.dg/concepts/equiv.C: Likewise.
* g++.dg/concepts/equiv2.C: Likewise.
* g++.dg/concepts/expression.C: Likewise.
* g++.dg/concepts/expression2.C: Likewise.
* g++.dg/concepts/expression3.C: Likewise.
* g++.dg/concepts/fn-concept2.C: Compile with -fconcepts. Run in
C++17 and up. Remove code. Add dg-prune-output.
* g++.dg/concepts/fn-concept3.C: Compile with -fconcepts. Run in
C++17 and up. Convert to C++20.
* g++.dg/concepts/fn1.C: Likewise.
* g++.dg/concepts/fn10.C: Likewise.
* g++.dg/concepts/fn2.C: Likewise.
* g++.dg/concepts/fn3.C: Likewise.
* g++.dg/concepts/fn4.C: Likewise.
* g++.dg/concepts/fn5.C: Likewise.
* g++.dg/concepts/fn6.C: Likewise.
* g++.dg/concepts/fn7.C: Compile with -fconcepts. Add dg-error.
* g++.dg/concepts/fn8.C: Compile with -fconcepts. Run in C++17 and up.
Convert to C++20.
* g++.dg/concepts/fn9.C: Likewise.
* g++.dg/concepts/generic-fn-err.C: Likewise.
* g++.dg/concepts/generic-fn.C: Likewise.
* g++.dg/concepts/inherit-ctor1.C: Likewise.
* g++.dg/concepts/inherit-ctor3.C: Likewise.
* g++.dg/concepts/intro1.C: Likewise.
* g++.dg/concepts/locations1.C: Compile with -fconcepts. Run in C++17
and up. Add dg-prune-output.
* g++.dg/concepts/partial-concept-id1.C: Compile with -fconcepts.
Run in C++17 and up. Convert to C++20.
* g++.dg/concepts/partial-concept-id2.C: Likewise.
* g++.dg/concepts/partial-spec5.C: Likewise.
* g++.dg/concepts/placeholder2.C: Likewise.
* g++.dg/concepts/placeholder3.C: Likewise.
* g++.dg/concepts/placeholder4.C: Likewise.
* g++.dg/concepts/placeholder5.C: Likewise.
* g++.dg/concepts/placeholder6.C: Likewise.
* g++.dg/concepts/pr65634.C: Likewise.
* g++.dg/concepts/pr65636.C: Likewise.
* g++.dg/concepts/pr65681.C: Likewise.
* g++.dg/concepts/pr65848.C: Likewise.
* g++.dg/concepts/pr67249.C: Likewise.
* g++.dg/concepts/pr67595.C: Likewise.
* g++.dg/concepts/pr68434.C: Likewise.
* g++.dg/concepts/pr71127.C: Likewise.
* g++.dg/concepts/pr71128.C: Compile with -fconcepts. Run in C++17
and up. Add dg-error.
* g++.dg/concepts/pr71131.C: Compile with -fconcepts. Run in C++17
and up. Convert to C++20.
* g++.dg/concepts/pr71385.C: Likewise.
* g++.dg/concepts/pr85065.C: Likewise.
* g++.dg/concepts/pr92804-2.C: Compile with -fconcepts. Convert to
C++20.
* g++.dg/concepts/template-parm11.C: Compile with -fconcepts. Run in
C++17 and up. Convert to C++20.
* g++.dg/concepts/template-parm12.C: Likewise.
* g++.dg/concepts/template-parm2.C: Likewise.
* g++.dg/concepts/template-parm3.C: Likewise.
* g++.dg/concepts/template-parm4.C: Likewise.
* g++.dg/concepts/template-template-parm1.C: Likewise.
* g++.dg/concepts/var-concept1.C: Likewise.
* g++.dg/concepts/var-concept2.C: Likewise.
* g++.dg/concepts/var-concept3.C: Likewise.
* g++.dg/concepts/var-concept4.C: Likewise.
* g++.dg/concepts/var-concept5.C: Likewise.
* g++.dg/concepts/var-concept6.C: Likewise.
* g++.dg/concepts/var-concept7.C: Likewise.
* g++.dg/concepts/var-templ1.C: Run in C++17 and up.
* g++.dg/concepts/var-templ2.C: Compile with -fconcepts. Run in C++17
and up. Convert to C++20.
* g++.dg/concepts/var-templ3.C: Likewise.
* g++.dg/concepts/variadic1.C: Likewise.
* g++.dg/concepts/variadic2.C: Likewise.
* g++.dg/concepts/variadic3.C: Likewise.
* g++.dg/concepts/variadic4.C: Likewise.
* g++.dg/cpp2a/concepts-pr65575.C: Likewise.
* g++.dg/cpp2a/concepts-pr66091.C: Likewise.
* g++.dg/cpp2a/concepts-pr67148.C: Compile with -fconcepts. Convert
to C++20.
* g++.dg/cpp2a/concepts-pr67225-1.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-3.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-4.C: Likewise.
* g++.dg/cpp2a/concepts-pr67225-5.C: Likewise.
* g++.dg/cpp2a/concepts-pr67319.C: Likewise.
* g++.dg/cpp2a/concepts-pr67427.C: Likewise.
* g++.dg/cpp2a/concepts-pr67654.C: Likewise.
* g++.dg/cpp2a/concepts-pr67658.C: Likewise.
* g++.dg/cpp2a/concepts-pr67684.C: Likewise.
* g++.dg/cpp2a/concepts-pr67697.C: Likewise.
* g++.dg/cpp2a/concepts-pr67719.C: Likewise.
* g++.dg/cpp2a/concepts-pr67774.C: Likewise.
* g++.dg/cpp2a/concepts-pr67825.C: Likewise.
* g++.dg/cpp2a/concepts-pr67860.C: Likewise.
* g++.dg/cpp2a/concepts-pr67862.C: Likewise.
* g++.dg/cpp2a/concepts-pr67969.C: Likewise.
* g++.dg/cpp2a/concepts-pr68093-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr68372.C: Likewise.
* g++.dg/cpp2a/concepts-pr68812.C: Likewise.
* g++.dg/cpp2a/concepts-pr69235.C: Likewise.
* g++.dg/cpp2a/concepts-pr78752-2.C: Likewise.
* g++.dg/cpp2a/concepts-pr78752.C: Likewise.
* g++.dg/cpp2a/concepts-pr79759.C: Likewise.
* g++.dg/cpp2a/concepts-pr80746.C: Likewise.
* g++.dg/cpp2a/concepts-pr80773.C: Likewise.
* g++.dg/cpp2a/concepts-pr82507.C: Likewise.
* g++.dg/cpp2a/concepts-pr82740.C: Likewise.
* g++.dg/cpp2a/concepts-pr84980.C: Compile with -fconcepts. Run in
C++17 and up. Convert to C++20.
* g++.dg/cpp2a/concepts-pr85265.C: Likewise.
* g++.dg/cpp2a/concepts-pr85808.C: Compile with -fconcepts. Convert
to C++20.
* g++.dg/cpp2a/concepts-pr86269.C: Likewise.
* g++.dg/cpp2a/concepts-pr87441.C: Likewise.
* g++.dg/cpp2a/concepts-requires5.C: Compile with -fconcepts.
Adjust dg-error. Add same_as.
* g++.dg/cpp2a/nontype-class50a.C: Compile with -fconcepts.
* g++.dg/concepts/auto1.C: Removed.
* g++.dg/concepts/auto4.C: Removed.
* g++.dg/concepts/auto6.C: Removed.
* g++.dg/concepts/fn-concept1.C: Removed.
* g++.dg/concepts/intro2.C: Removed.
* g++.dg/concepts/intro3.C: Removed.
* g++.dg/concepts/intro4.C: Removed.
* g++.dg/concepts/intro5.C: Removed.
* g++.dg/concepts/intro6.C: Removed.
* g++.dg/concepts/intro7.C: Removed.
* g++.dg/cpp2a/concepts-ts1.C: Removed.
* g++.dg/cpp2a/concepts-ts2.C: Removed.
* g++.dg/cpp2a/concepts-ts3.C: Removed.
* g++.dg/cpp2a/concepts-ts4.C: Removed.
* g++.dg/cpp2a/concepts-ts5.C: Removed.
* g++.dg/cpp2a/concepts-ts6.C: Removed.
|
|
Here we ICE in c_expr_sizeof_expr on an erroneous expr.value. The
code checks for expr.value == error_mark_node but here the e_m_n is
wrapped in a C_MAYBE_CONST_EXPR. I don't think we should have created
such a tree, so let's return earlier in c_cast_expr.
PR c/115642
gcc/c/ChangeLog:
* c-typeck.cc (c_cast_expr): Return error_mark_node if build_c_cast
failed.
gcc/testsuite/ChangeLog:
* gcc.dg/noncompile/sizeof-1.c: New test.
|
|
I had this PR in my open tabs so why not go ahead and fix it.
decl_attributes gets last_decl, the last already pushed declaration,
to be used in common_handle_aligned_attribute. In C++, we look up
the decl via find_last_decl, which returns NULL_TREE if it finds
a decl that had not been declared. In C, we look up the decl via
lookup_last_decl which returns error_mark_node rather than NULL_TREE
in that case.
The error_mark_node causes a crash in common_handle_aligned_attribute.
We can fix this on the C FE side like in the patch below.
PR c/115549
gcc/c/ChangeLog:
* c-decl.cc (c_decl_attributes): If lookup_last_decl returns
error_mark_node, use NULL_TREE as last_decl.
gcc/testsuite/ChangeLog:
* c-c++-common/attr-aligned-2.c: New test.
|
|
Since r13-1006-g2005b9b888eeac, the test case copysign_softfloat_1.c
no longer contains any lsr istruction, so drop the check as per
comment 9 in PR105090.
gcc/testsuite/ChangeLog:
PR target/105090
* gcc.target/arm/copysign_softfloat_1.c: Drop check for lsr
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
|
|
Update all instances of zba_zbb_zbs in the testsuite to use b instead
gcc/testsuite/ChangeLog:
* g++.target/riscv/redundant-bitmap-1.C: Use gcb instead of
zba_zbb_zbs
* g++.target/riscv/redundant-bitmap-2.C: Ditto
* g++.target/riscv/redundant-bitmap-3.C: Ditto
* g++.target/riscv/redundant-bitmap-4.C: Ditto
* gcc.target/riscv/shift-add-1.c: Ditto
* gcc.target/riscv/shift-add-2.c: Ditto
* gcc.target/riscv/synthesis-1.c: Ditto
* gcc.target/riscv/synthesis-2.c: Ditto
* gcc.target/riscv/synthesis-3.c: Ditto
* gcc.target/riscv/synthesis-4.c: Ditto
* gcc.target/riscv/synthesis-5.c: Ditto
* gcc.target/riscv/synthesis-6.c: Ditto
* gcc.target/riscv/synthesis-7.c: Ditto
* gcc.target/riscv/synthesis-8.c: Ditto
* gcc.target/riscv/zba_zbs_and-1.c: Ditto
* gcc.target/riscv/zbs-zext-3.c: Ditto
* lib/target-supports.exp: Add b to riscv_get_arch
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
This patch adds support for recognizing the B standard extension to be the
collection of Zba, Zbb, Zbs extensions for consistency and conciseness
across toolchains
https://github.com/riscv/riscv-b/tags
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add imply rules for B extension
* config/riscv/arch-canonicalize: Ditto
Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
|
|
expand_fn_using_insn has code to handle SUBREG_PROMOTED_VAR_P
destinations. Specifically, for:
(subreg/v:M1 (reg:M2 R) ...)
it creates a new temporary register T, uses it for the output
operand, then sign- or zero-extends the M1 lowpart of T to M2,
storing the result in R.
This patch splits this handling out into helper routines and
uses them for other instances of:
if (!rtx_equal_p (target, ops[0].value))
emit_move_insn (target, ops[0].value);
It's quite probable that this doesn't help any of the other cases;
in particular, it shouldn't affect vectors. But I think it could
be useful for the CRC work.
gcc/
* internal-fn.cc (create_call_lhs_operand, assign_call_lhs): New
functions, split out from...
(expand_fn_using_insn): ...here.
(expand_load_lanes_optab_fn): Use them.
(expand_GOMP_SIMT_ENTER_ALLOC): Likewise.
(expand_GOMP_SIMT_LAST_LANE): Likewise.
(expand_GOMP_SIMT_ORDERED_PRED): Likewise.
(expand_GOMP_SIMT_VOTE_ANY): Likewise.
(expand_GOMP_SIMT_XCHG_BFLY): Likewise.
(expand_GOMP_SIMT_XCHG_IDX): Likewise.
(expand_partial_load_optab_fn): Likewise.
(expand_vec_cond_optab_fn): Likewise.
(expand_vec_cond_mask_optab_fn): Likewise.
(expand_RAWMEMCHR): Likewise.
(expand_gather_load_optab_fn): Likewise.
(expand_while_optab_fn): Likewise.
(expand_SPACESHIP): Likewise.
|
|
This extends the r11-5179 fix which doesn't work with multidimensional
arrays. In particular,
struct S {
explicit S() { }
};
auto p = new S[1][1]();
should not say "converting to S from initializer list would use
explicit constructor" because there's no {}. However, since we
went into the block where we create a {}, we got confused. We
should not have gotten there but we did because array_p was true.
This patch refines the check once more.
PR c++/115645
gcc/cp/ChangeLog:
* init.cc (build_new): Don't do any deduction for arrays with
bounds if it's value-initialized.
gcc/testsuite/ChangeLog:
* g++.dg/expr/anew7.C: New test.
|
|
insn_propagation would previously only replace (reg:M H) with X
for some hard register H if the uses of H were also in mode M.
This patch extends it to handle simple mode punning too.
The original motivation was to try to get rid of the execution
frequency test in aarch64_split_simd_shift_p, but doing that is
follow-up work.
I tried this on at least one target per CPU directory (as for
the late-combine patches) and it seems to be a small win for
all of them.
The patch includes a couple of updates to the ia32 results.
In pr105033.c, foo3 replaced:
vmovq 8(%esp), %xmm1
vpunpcklqdq %xmm1, %xmm0, %xmm0
with:
vmovhps 8(%esp), %xmm0, %xmm0
In vect-bfloat16-2b.c, 5 of the vec_extract_v32bf_* routines
(specifically the ones with nonzero even indices) replaced
things like:
movl 28(%esp), %eax
vmovd %eax, %xmm0
with:
vpinsrw $0, 28(%esp), %xmm0, %xmm0
(These functions return a bf16, and so only the low 16 bits matter.)
gcc/
* recog.cc (insn_propagation::apply_to_rvalue_1): Handle simple
cases of hardreg propagation in which the register is set and
used in different modes.
gcc/testsuite/
* gcc.target/i386/pr105033.c: Expect vmovhps for the ia32 version
of foo.
* gcc.target/i386/vect-bfloat16-2b.c: Expect more vpinsrws.
|
|
change_insns is used to change multiple instructions at once, so that
the IR on return is valid & self-consistent. These changes can involve
moving instructions, and the new position for one instruction might
be expressed in terms of the old position of another instruction
that is changing at the same time.
change_insns therefore adds placeholder instructions to mark each
new instruction position, then replaces each placeholder with the
corresponding real instruction. This replacement was done in two
steps: removing the old placeholder instruction and inserting the new
real instruction. But it's more convenient for the upcoming fix for
PR115785 if we do the operation as a single step. That should also
be slightly more efficient, since e.g. no splay tree operations are
needed.
This operation happens purely on the rtl-ssa instruction chain.
The placeholders are never represented in rtl.
gcc/
PR rtl-optimization/115785
* rtl-ssa/functions.h (function_info::replace_nondebug_insn): Declare.
* rtl-ssa/insns.h (insn_info::order_node::set_uid): New function.
(insn_info::remove_note): Declare.
* rtl-ssa/insns.cc (insn_info::remove_note): New function.
(function_info::replace_nondebug_insn): Likewise.
* rtl-ssa/changes.cc (function_info::change_insns): Use
replace_nondebug_insn instead of remove_insn + add_insn.
|
|
The fix is unnecessary on macOS.
fixincludes/ChangeLog:
* fixincl.x: Regenerate.
* inclhack.def (stdio_stdarg_h): Skip on darwin.
|
|
During contract parsing, in grok_contract(), we proceed even if the
condition contains errors. This results in contracts with embedded errors
which eventually confuse gimplify. Checks for errors have been added in
grok_contract() to exit early if an error is encountered.
PR c++/113968
gcc/cp/ChangeLog:
* contracts.cc (grok_contract): Check for error_mark_node early
exit.
gcc/testsuite/ChangeLog:
* g++.dg/contracts/pr113968.C: New test.
Signed-off-by: Nina Ranns <dinka.ranns@gmail.com>
|
|
Headers are now fixed in the macOS 15 SDK, and the fix should be
bypassed there.
fixincludes/ChangeLog:
* fixincl.x: Regenerate.
* inclhack.def (darwin_objc_runtime_1): Add bypass.
|
|
The bug fix changes gcc/m2/gm2-gcc/m2builtins.c:m2builtins_BuiltinExists
to recognise both __builtin_<functionname> and functionname as a builtin.
gcc/m2/ChangeLog:
PR modula2/115823
* gm2-gcc/m2builtins.cc (struct builtin_macro_definition): New
field builtinname.
(builtin_function_match): New function.
(builtin_macro_match): Ditto.
(m2builtins_BuiltinExists): Use builtin_function_match and
builtin_macro_match.
(lookup_builtin_macro): Use builtin_macro_match.
(lookup_builtin_function): Use builtin_function_match.
(define_builtin): Assign builtinname field.
gcc/testsuite/ChangeLog:
PR modula2/115823
* gm2/builtins/run/pass/testalloa.mod: New test.
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
emit_store_flag_1 calculates scode (swapped condition code) at the
beginning of the function from the value of code variable. However,
code variable may change before scode usage site, resulting in
invalid stalled scode value.
Move calculation of scode value just before its only usage site to
avoid stalled scode value.
PR middle-end/115836
gcc/ChangeLog:
* expmed.cc (emit_store_flag_1): Move calculation of
scode just before its only usage site.
|
|
The arm 'pe' target was removed back in 2012 when the FPA support was
removed, but in a small number of places some conditional code was
accidentally left behind. It's no-longer needed, so remove it.
gcc/ChangeLog:
* config/arm/arm-protos.h (arm_dllexport_name_p): Remove prototype.
(arm_dllimport_name_p): Likewise.
(arm_pe_unique_section): Likewise.
(arm_pe_encode_section_info): Likewise.
(arm_dllexport_p): Likewise.
(arm_dllimport_p): Likewise.
(arm_mark_dllexport): Likewise.
(arm_mark_dllimport): Likewise.
(arm_change_mode_p): Likewise.
* config/arm/arm.cc (arm_gnu_attributes): Remove attributes for ARM_PE.
(TARGET_ENCODE_SECTION_INFO): Remove setting for ARM_PE.
(is_called_in_ARM_mode): Remove ARM_PE conditional code.
(thumb1_output_interwork): Remove obsolete ARM_PE code.
(arm_encode_section_info): Remove surrounding #ifndef.
|
|
gcc/ChangeLog:
PR lto/115394
* lto-streamer.h: Remove streamer_debugging definition.
* lto-streamer-out.cc (stream_write_tree_ref): Remove use of streamer_debugging.
(lto_output_tree): Likewise.
* tree-streamer-in.cc (streamer_read_tree_bitfields): Likewise.
(streamer_get_pickled_tree): Likewise.
* tree-streamer-out.cc (pack_ts_base_value_fields): Likewise.
Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
|
|
This patch would like to add form 2 support for the .SAT_TRUNC. Aka:
Form 2:
#define DEF_SAT_U_TRUC_FMT_2(NT, WT) \
NT __attribute__((noinline)) \
sat_u_truc_##WT##_to_##NT##_fmt_2 (WT x) \
{ \
bool overflow = x > (WT)(NT)(-1); \
return overflow ? (NT)-1 : (NT)x; \
}
DEF_SAT_U_TRUC_FMT_2(uint32, uint64)
Before this patch:
3 │
4 │ __attribute__((noinline))
5 │ uint32_t sat_u_truc_uint64_t_to_uint32_t_fmt_2 (uint64_t x)
6 │ {
7 │ uint32_t _1;
8 │ long unsigned int _3;
9 │
10 │ ;; basic block 2, loop depth 0
11 │ ;; pred: ENTRY
12 │ _3 = MIN_EXPR <x_2(D), 4294967295>;
13 │ _1 = (uint32_t) _3;
14 │ return _1;
15 │ ;; succ: EXIT
16 │
17 │ }
After this patch:
3 │
4 │ __attribute__((noinline))
5 │ uint32_t sat_u_truc_uint64_t_to_uint32_t_fmt_2 (uint64_t x)
6 │ {
7 │ uint32_t _1;
8 │
9 │ ;; basic block 2, loop depth 0
10 │ ;; pred: ENTRY
11 │ _1 = .SAT_TRUNC (x_2(D)); [tail call]
12 │ return _1;
13 │ ;; succ: EXIT
14 │
15 │ }
The below test suites are passed for this patch:
1. The x86 bootstrap test.
2. The x86 fully regression test.
3. The rv64gcv fully regresssion test.
gcc/ChangeLog:
* match.pd: Add form 2 for .SAT_TRUNC.
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Add new case NOP_EXPR, and try to match SAT_TRUNC.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
As a follow-up to adding a pattern that folds x/sqrt(x) to sqrt(x) in match.pd, this patch adds a test case for type Float16 for armv8.2-a+fp16.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/testsuite/
* gcc.target/aarch64/sqrt_div_float16.c: New test.
|
|
While working on adding V4QI support to the aarch64 backend,
vect/slp-gap-1.c started to fail but only because the regex
was failing. Before it was loading use SI (int) and afterwards,
we started to use V4QI. The generated code was the same and the
generated gimple was almost the same. The regex was searching
for `zero-padding trick` and it was still doing that but instead
of directly 0, it was V4QI 0 (or rather `{ 0, 0, 0 }`).
This extends regex to support both.
Tested on x86_64-linux-gnu and aarch64-linux-gnu (with the support added).
gcc/testsuite/ChangeLog:
* gcc.dg/vect/slp-gap-1.c: Support matching `{_1, { 0, 0, 0, 0 }}`
in addition to `{_1, 0}`.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
This code has been dead at least since the move over to tuples
in 0-88576-g726a989a8b74bf, when gimple returns could only have
a simple expression in it. So let's remove it.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/ChangeLog:
PR tree-optimization/115721
* tree-complex.cc (expand_complex_comparison): Remove
support for GIMPLE_RETURN.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
No functional changes compared with V1, just spaces to table conversion
in testcases to pass check-function-bodies.
V1 passed regression locally but suprisingly failed in pre-commit CI, after
picking the patch from patchwork, I realize table got coverted to spaces
before sending the patch.
Root cause:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
Commit above tries in targetm.gen_epilogue () to detect if
there's li a0,0 insn at the end of insn chain, if so, cm.popret
is replaced by cm.popretz and li a0,0 insn is deleted.
Insertion of the generated epilogue sequence
into the insn chain doesn't happen at this moment.
If later shrink-wrap decides NOT to insert the epilogue sequence at the end
of insn chain, then the li a0,0 insn has already been mistakeny removed.
Fix this issue by removing generation of cm.popretz in epilogue,
leaving the assignment to a0 and use insn with cm.popret.
That's likely going to result in some kind of code size regression,
but not a correctness regression.
Optimization can be done in future.
Signed-off-by: Fei Gao <gaofei@eswincomputing.com>
gcc/ChangeLog:
PR target/113715
* config/riscv/riscv.cc (riscv_zcmp_can_use_popretz): Removed.
(riscv_gen_multi_pop_insn): Remove generation of cm.popretz.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rv32e_zcmp.c: Adapt TC.
* gcc.target/riscv/rv32i_zcmp.c: Likewise.
|