This patch adds optabs that check whether a read followed by a write
or a write followed by a read can be divided into interleaved byte
accesses without changing the dependencies between the bytes.
This is one of the uses of the SVE2 WHILERW and WHILEWR instructions.
(The instructions can also be used to limit the VF at runtime,
but that's future work.)
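The property being tested can be modelled in plain C for the
read-followed-by-write (WAR) case. This is only a sketch of the idea,
not the optab's definition: the buffer, the offsets and the names
war_split_ok and war_split_ok_by_simulation are all hypothetical.
It compares "whole read, then whole write" against the interleaved
byte accesses and checks that a simple closed form predicts when
the two agree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 64 };

/* Hypothetical closed form: a read of LEN bytes at offset A followed by
   a write of LEN bytes at offset B can be interleaved if the write does
   not start inside the part of the read that is still to come.  */
static bool
war_split_ok (size_t a, size_t b, size_t len)
{
  return b <= a || b - a >= len;
}

/* Brute-force model of the same question on a small buffer.  */
static bool
war_split_ok_by_simulation (size_t a, size_t b, size_t len)
{
  unsigned char block[BUF_SIZE], interleaved[BUF_SIZE];

  assert (a + len <= BUF_SIZE && b + len <= BUF_SIZE);
  for (size_t i = 0; i < BUF_SIZE; ++i)
    block[i] = interleaved[i] = (unsigned char) i;

  /* Read all LEN bytes at A, then write all LEN bytes at B.  */
  memmove (block + b, block + a, len);

  /* Read a[0], write b[0], read a[1], write b[1], ...  */
  for (size_t i = 0; i < len; ++i)
    interleaved[b + i] = interleaved[a + i];

  return memcmp (block, interleaved, BUF_SIZE) == 0;
}
```

With distinct byte values in the buffer, the two functions agree for
every pair of offsets; the only failing case is a write that starts
fewer than LEN bytes ahead of the read.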
2019-11-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* doc/sourcebuild.texi (vect_check_ptrs): Document.
* optabs.def (check_raw_ptrs_optab, check_war_ptrs_optab): New optabs.
* doc/md.texi: Document them.
* internal-fn.def (IFN_CHECK_RAW_PTRS, IFN_CHECK_WAR_PTRS): New
internal functions.
* internal-fn.h (internal_check_ptrs_fn_supported_p): Declare.
* internal-fn.c (check_ptrs_direct): New macro.
(expand_check_ptrs_optab_fn): Likewise.
(direct_check_ptrs_optab_supported_p): Likewise.
(internal_check_ptrs_fn_supported_p): New function.
* tree-data-ref.c: Include internal-fn.h.
(create_ifn_alias_checks): New function.
(create_intersect_range_checks): Use it.
* config/aarch64/iterators.md (SVE2_WHILE_PTR): New int iterator.
(optab, cmp_op): Handle it.
(raw_war, unspec): New int attributes.
* config/aarch64/aarch64.md (UNSPEC_WHILERW, UNSPEC_WHILEWR): New
constants.
* config/aarch64/predicates.md (aarch64_bytes_per_sve_vector_operand):
New predicate.
* config/aarch64/aarch64-sve2.md (check_<raw_war>_ptrs<mode>): New
expander.
(@aarch64_sve2_while<cmp_op><GPI:mode><PRED_ALL:mode>_ptest): New
pattern.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_check_ptrs):
New procedure.
* gcc.dg/vect/vect-alias-check-14.c: Expect IFN_CHECK_WAR to be
used, if available.
* gcc.dg/vect/vect-alias-check-15.c: Likewise.
* gcc.dg/vect/vect-alias-check-16.c: Likewise IFN_CHECK_RAW.
* gcc.target/aarch64/sve2/whilerw_1.c: New test.
* gcc.target/aarch64/sve2/whilewr_1.c: Likewise.
* gcc.target/aarch64/sve2/whilewr_2.c: Likewise.
From-SVN: r278414
|
|
Empty vector constructors are equivalent to zero vectors. If we handle
that case directly, we can support it for variable-length vectors and
can hopefully make things more efficient for fixed-length vectors.
This is needed by a later C++ patch.
2019-11-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree.c (build_vector_from_ctor): Directly return a zero vector for
empty constructors.
From-SVN: r278413
|
|
SVE has two composite conditions:
pmore == at least one bit set && last bit clear
plast == no bits set || last bit set
So in general we generate them from:
A: CC = test bits
B: reg1 = first condition
C: CC = test bits
D: reg2 = second condition
E: result = (reg1 op reg2) where op is || or &&
To fold all this into a single test, we need to be able to remove
the redundant C (the cse.c patch) and then fold B, D and E down to
a single condition (the simplify-rtx.c patch).
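As a rough illustration of the two composite conditions, here is a
sketch that models a predicate as an array of bools, with "last bit"
taken to be the final array element. That representation and the
helper names are assumptions made for the example; the real
conditions are derived from the condition flags set by the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True if at least one of the N predicate bits is set.  */
static bool
any_set (const bool *pred, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    if (pred[i])
      return true;
  return false;
}

/* pmore == at least one bit set && last bit clear.  N must be nonzero.  */
static bool
pmore (const bool *pred, size_t n)
{
  return any_set (pred, n) && !pred[n - 1];
}

/* plast == no bits set || last bit set.  N must be nonzero.  */
static bool
plast (const bool *pred, size_t n)
{
  return !any_set (pred, n) || pred[n - 1];
}
```

In this model plast is the exact inverse of pmore, which is why each
pair of conditions plus the && or || can in principle be reduced to
a single test.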
The underlying conditions are unsigned, so the simplify-rtx.c part needs
to support both unsigned comparisons and AND. However, to avoid opening
the can of worms that is ANDing FP comparisons for unordered inputs,
I've restricted the new AND handling to cases in which NaNs can be
ignored. I think this is still a strict extension of what we have now;
it just doesn't go as far as it could. Going further would need an
entirely different set of testcases, so I think it would make more sense
as separate work.
2019-11-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* cse.c (cse_insn): Delete no-op register moves too.
* simplify-rtx.c (comparison_to_mask): Handle unsigned comparisons.
Take a second comparison to control the value for NE.
(mask_to_comparison): Handle unsigned comparisons.
(simplify_logical_relational_operation): Likewise. Update call
to comparison_to_mask. Handle AND if !HONOR_NANs.
(simplify_binary_operation_1): Call the above for AND too.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/ptest_pmore.c: New test.
From-SVN: r278411
|
|
This patch handles VIEW_CONVERT_EXPRs of variable-length VECTOR_CSTs
by adding tree-level versions of native_decode_vector_rtx and
simplify_const_vector_subreg. It uses the same code for fixed-length
vectors, both to get more coverage and because operating directly on
the compressed encoding should be more efficient for longer vectors
with a regular pattern.
The structure and comments are very similar between the tree and
rtx routines.
2019-11-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* fold-const.c (native_encode_vector): Turn into a wrapper function,
splitting the main code out into...
(native_encode_vector_part): ...this new function.
(native_decode_vector_tree): New function.
(fold_view_convert_vector_encoding): Likewise.
(fold_view_convert_expr): Use it for converting VECTOR_CSTs
to VECTOR_TYPEs.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/temporaries_1.c: New test.
From-SVN: r278410
|
|
For:
void
f1 (int *x, int *y)
{
for (int i = 0; i < 32; ++i)
x[i] += y[i];
}
we checked at runtime whether one vector at x would overlap one vector
at y. But in cases like this, the vector code would handle x <= y just
fine, since any write to address A still happens after any read from
address A. The only problem is if x is ahead of y by less than a
vector.
The same is true for two writes:
void
f2 (int *x, int *y)
{
for (int i = 0; i < 32; ++i)
{
x[i] = i;
y[i] = 2;
}
}
if y <= x then a vector write at y after a vector write at x would
have the same net effect as the original scalar writes.
This patch optimises the alias checks for these two cases. E.g.,
before the patch, f1 used:
add x2, x0, 15
sub x2, x2, x1
cmp x2, 30
bls .L2
whereas after the patch it uses:
add x2, x1, 4
sub x2, x0, x2
cmp x2, 8
bls .L2
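The two sequences can be read back into C roughly as follows. This is
a sketch under assumptions: x0 holds x and x1 holds y, the vectors are
16 bytes (four ints), and .L2 is presumably the path taken when the
vector loop can't be used. The function names are made up for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Before the patch: add 15, subtract, compare with 30.  True whenever
   x and y are within 15 bytes of each other in either direction.  */
static bool
old_check_conflicts (uintptr_t x, uintptr_t y)
{
  return (uintptr_t) (x + 15 - y) <= 30;
}

/* After the patch: true only when x is ahead of y by 4 to 12 bytes,
   i.e. by one to three ints.  */
static bool
new_check_conflicts (uintptr_t x, uintptr_t y)
{
  return (uintptr_t) (x - (y + 4)) <= 8;
}
```

For int-aligned pointers the new test accepts x == y and every case
with x behind y, which the old test rejected whenever the distance
was less than a vector.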
Read-after-write cases like:
int
f3 (int *x, int *y)
{
int res = 0;
for (int i = 0; i < 32; ++i)
{
x[i] = i;
res += y[i];
}
return res;
}
can cope with x == y, but otherwise don't allow overlap in either
direction. Since checking for x == y at runtime would require extra
code, we're probably better off sticking with the current overlap test.
An overlap test is also needed if the scalar or vector accesses covered
by the alias check are mixed together, rather than all statements for
the second access following all statements for the first access.
The new code for gcc.target/aarch64/sve/var_stride_[135].c is slightly
better than before.
2019-11-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-data-ref.c (create_intersect_range_checks_index): If the
alias pair describes simple WAW and WAR dependencies, just check
whether the first B access overlaps later A accesses.
(create_waw_or_war_checks): New function that performs the same
optimization on addresses.
(create_intersect_range_checks): Call it.
gcc/testsuite/
* gcc.dg/vect/vect-alias-check-8.c: Expect WAR/WAW checks to be used.
* gcc.dg/vect/vect-alias-check-14.c: Likewise.
* gcc.dg/vect/vect-alias-check-15.c: Likewise.
* gcc.dg/vect/vect-alias-check-18.c: Likewise.
* gcc.dg/vect/vect-alias-check-19.c: Likewise.
* gcc.target/aarch64/sve/var_stride_1.c: Update expected sequence.
* gcc.target/aarch64/sve/var_stride_2.c: Likewise.
* gcc.target/aarch64/sve/var_stride_3.c: Likewise.
* gcc.target/aarch64/sve/var_stride_5.c: Likewise.
From-SVN: r278409
|
|
LRA allows address constraints that are more relaxed than "p":
/* Target hooks sometimes don't treat extra-constraint addresses as
legitimate address_operands, so handle them specially. */
if (insn_extra_address_constraint (cn)
&& satisfies_address_constraint_p (&ad, cn))
return change_p;
For SVE it's useful to allow the same thing for memory constraints.
The particular use case is LD1RQ, which is an SVE instruction that
addresses Advanced SIMD vector modes and that accepts some addresses
that normal Advanced SIMD moves don't.
Normally we require every memory to satisfy at least "m", which is
defined to be a memory "with any kind of address that the machine
supports in general". However, LD1RQ is very much special-purpose:
it doesn't really have any relation to normal operations on these
modes. Adding its addressing modes to "m" would lead to bad Advanced
SIMD optimisation decisions in passes like ivopts. LD1RQ therefore
has a memory constraint that accepts things "m" doesn't.
2019-11-18 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* lra-constraints.c (valid_address_p): Take the operand and a
constraint as argument. If the operand is a MEM and the constraint
is a memory constraint, check whether the eliminated form of the
MEM already satisfies the constraint.
(process_address_1): Update calls accordingly.
gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/ld1rq_f16.c: Remove XFAIL.
* gcc.target/aarch64/sve/acle/asm/ld1rq_f32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1rq_f64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1rq_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1rq_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1rq_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1rq_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1rq_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/ld1rq_u64.c: Likewise.
From-SVN: r278408
|
|
I happened to notice that MODIFY_JNI_METHOD_CALL was defined in
cygming.h and documented in tm.texi. However, because it was only
needed for gcj, it is obsolete. This patch removes the vestiges.
Tested by grep, and rebuilding the documentation.
gcc/ChangeLog
2019-11-18 Tom Tromey <tromey@adacore.com>
* doc/tm.texi: Rebuild.
* doc/tm.texi.in (Misc): Don't document MODIFY_JNI_METHOD_CALL.
* config/i386/cygming.h (MODIFY_JNI_METHOD_CALL): Don't define.
From-SVN: r278407
|
|
tree-vect-slp.c:4095 since r278246)
2019-11-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/92516
* tree-vect-slp.c (vect_analyze_slp_instance): Add bst_map
argument, hoist bst_map creation/destruction to ...
(vect_analyze_slp): ... here, forming a true graph with
SLP instances being the entries.
(vect_detect_hybrid_slp_stmts): Remove wrapper.
(vect_detect_hybrid_slp): Use one visited set for all
graph entries.
(vect_slp_analyze_node_operations): Simplify visited/lvisited
to hash-sets of slp_tree.
(vect_slp_analyze_operations): Likewise.
(vect_bb_slp_scalar_cost): Remove wrapper.
(vect_bb_vectorization_profitable_p): Use one visited set for
all graph entries.
(vect_schedule_slp_instance): Elide bst_map use.
(vect_schedule_slp): Likewise.
* g++.dg/vect/slp-pr92516.cc: New testcase.
2019-11-18 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_analyze_slp_instance): When a CTOR
was vectorized with just external refs, fail.
* gcc.dg/vect/vect-ctor-1.c: New testcase.
From-SVN: r278406
|
|
2019-11-18 Martin Liska <mliska@suse.cz>
PR ipa/92525
* ipa-icf.c (sem_function::init): Unset m_checker
at the end of the function.
From-SVN: r278405
|
|
2019-11-18 Martin Liska <mliska@suse.cz>
* gcc.dg/ipa/ipa-icf-36.c: Remove 'all-all-all'.
* gcc.dg/ipa/ipa-icf-37.c: Likewise.
From-SVN: r278404
|
|
From-SVN: r278403
|
|
The std::jthread::get_id() function was missing a return statement.
The is_invocable check needs to be done using decayed types, as they'll
be forwarded to std::invoke as rvalues.
Also reduce header dependencies for the <thread> header. We don't need
to include <functional> for std::jthread because <bits/invoke.h> is
already included, which defines std::__invoke. We can also remove
<bits/functexcept.h> which isn't used at all. Finally, when
_GLIBCXX_HAS_GTHREADS is not defined there's no point including any
other headers, since we're not going to define anything in <thread>
anyway.
* include/std/thread: Reduce header dependencies.
(jthread::get_id()): Add missing return.
(jthread::get_stop_token()): Avoid unnecessary stop_source temporary.
(jthread::_S_create): Check is_invocable using decayed types. Add
static assertion.
* testsuite/30_threads/jthread/1.cc: Add dg-require-gthreads.
* testsuite/30_threads/jthread/2.cc: Likewise.
* testsuite/30_threads/jthread/3.cc: New test.
* testsuite/30_threads/jthread/jthread.cc: Add missing directives for
pthread and gthread support. Use VERIFY instead of assert.
From-SVN: r278402
|
|
* include/bits/alloc_traits.h (allocator_traits::construct)
(allocator_traits::destroy, allocator_traits::max_size): Add unused
attributes to parameters that are not used in C++20.
* include/std/bit (__ceil2): Add braces around assertion to avoid
-Wmissing-braces warning.
From-SVN: r278401
|
|
-march=znver2 -flto since r278289)
2019-11-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/92558
* tree-vect-loop.c (vect_create_epilog_for_reduction): When
reducing the width of a reduction vector def update new_phis.
* gcc.dg/vect/pr92558.c: New testcase.
From-SVN: r278400
|
|
The gthr weak reference based single thread detection is unsafe with
static linking and in case of dynamic linking it's ineffective on musl
since pthread symbols are defined in libc.so.
(Ideally this should be fixed for all targets, since glibc plans to move
libpthread.so into libc.so too and users want to static link to pthread
without --whole-archive: PR87189.)
For now we have to explicitly opt out from the broken behaviour in the
config machinery of each target lib and libgcc was previously missed.
libgcc/ChangeLog:
2019-11-18 Szabolcs Nagy <szabolcs.nagy@arm.com>
* config.host: Add t-gthr-noweak on *-*-musl*.
* config/t-gthr-noweak: New file.
From-SVN: r278399
|
|
On powerpc and s390x the musl ABI requires 64 bit and 128 bit long
double respectively, so adjust the default.
gcc/ChangeLog:
2019-11-18 Szabolcs Nagy <szabolcs.nagy@arm.com>
* configure.ac (gcc_cv_target_ldbl128): Set for powerpc*-*-linux-musl*
and s390*-*-linux-musl* targets.
* configure: Regenerate.
From-SVN: r278398
|
|
Add the musl dynamic linker names.
gcc/ChangeLog:
2019-11-18 Szabolcs Nagy <szabolcs.nagy@arm.com>
* config/s390/linux.h (MUSL_DYNAMIC_LINKER32): Define.
(MUSL_DYNAMIC_LINKER64): Define.
From-SVN: r278397
|
|
2019-11-18 Martin Liska <mliska@suse.cz>
* dbgcnt.c (dbg_cnt_set_limit_by_name): Provide error
message for an unknown counter.
(dbg_cnt_process_single_pair): Support 0 as minimum value.
(dbg_cnt_process_opt): Remove unreachable code.
From-SVN: r278396
|
|
2019-11-18 Martin Liska <mliska@suse.cz>
PR ipa/92529
* ipa-icf-gimple.c (func_checker::compare_gimple_assign):
Compare LHS types of NOP_EXPR.
2019-11-18 Martin Liska <mliska@suse.cz>
PR ipa/92529
* gcc.dg/ipa/pr92529.c: New test.
From-SVN: r278395
|
|
Hi there,
When compiling an __RTL function that has an unspecified "startwith"
pass, we currently don't run the cleanup pass, which means that we ICE on
the next function (if it's a basic function).
This change ensures that the clean_state pass is run even if the
startwith pass is unspecified.
We also ensure the name of the startwith pass is always freed correctly.
As an example, before this change the following code would ICE when compiling
the function `foo_a`.
When compiled with
./aarch64-none-linux-gnu-gcc -O0 -S unspecified-pass-error.c -o test.s
```
int __RTL () badfoo ()
{
(function "badfoo"
(insn-chain
(block 2
(edge-from entry (flags "FALLTHRU"))
(cnote 3 [bb 2] NOTE_INSN_BASIC_BLOCK)
(cinsn 101 (set (reg:DI x19) (reg:DI x0)))
(cinsn 10 (use (reg/i:SI x19)))
(edge-to exit (flags "FALLTHRU"))
) ;; block 2
) ;; insn-chain
) ;; function "badfoo"
}
int
foo_a ()
{
return 200;
}
```
Now it silently ignores the __RTL function and successfully compiles foo_a.
regtest done on aarch64
regtest done on x86_64
OK for trunk?
gcc/ChangeLog:
2019-11-18 Matthew Malcomson <matthew.malcomson@arm.com>
* run-rtl-passes.c (run_rtl_passes): Accept and handle empty
"initial_pass_name" argument -- by running "*clean_state" pass.
Also free the "initial_pass_name" when done.
gcc/c/ChangeLog:
2019-11-18 Matthew Malcomson <matthew.malcomson@arm.com>
* c-parser.c (c_parser_parse_rtl_body): Always call
run_rtl_passes, even if startwith pass is not provided.
gcc/testsuite/ChangeLog:
2019-11-18 Matthew Malcomson <matthew.malcomson@arm.com>
* gcc.dg/rtl/aarch64/unspecified-pass-error.c: New test.
From-SVN: r278393
|
|
hoisted out)
2019-11-18 Richard Biener <rguenther@suse.de>
PR rtl-optimization/92462
* alias.c (find_base_term): Restrict the look through ANDs.
(find_base_value): Likewise.
From-SVN: r278391
|
|
option name
2019-11-18 Christophe Lyon <christophe.lyon@linaro.org>
* lib/target-supports.exp
(check_effective_target_arm_vfp_ok_nocache): Fix typo in option
name.
From-SVN: r278390
|
|
PR target/92545
* config/avr/gen-avr-mmcu-specs.c (print_mcu)
[link_pm_base_address]: Symbol name is __RODATA_PM_OFFSET__.
From-SVN: r278389
|
|
PR target/92545
* doc/avr-mmcu.texi: Regenerate.
From-SVN: r278388
|
|
PR target/92545
* config/avr/avr-arch.h (avr_mcu_t) <flash_pm_offset>: New field.
* config/avr/avr-devices.c (avr_mcu_types): Adjust initializers.
* config/avr/avr-mcus.def (AVR_MCU): Add respective field.
* config/avr/specs.h (LINK_SPEC) <%(link_pm_base_address)>: Add.
* config/avr/gen-avr-mmcu-specs.c (print_mcu)
<*cpp, *cpp_mcu, *cpp_avrlibc, *link_pm_base_address>: Emit code
for spec definitions.
* doc/avr-mmcu.texi: Regenerate.
From-SVN: r278387
|
|
and X86_TUNE_AVX128_OPTIMAL.
Changelog
gcc/
PR target/92448
* config/i386/i386-expand.c (ix86_expand_set_or_cpymem):
Replace TARGET_AVX128_OPTIMAL with TARGET_AVX256_SPLIT_REGS.
* config/i386/i386-option.c (ix86_vec_cost): Ditto.
(ix86_reassociation_width): Ditto.
* config/i386/i386-options.c (ix86_option_override_internal):
Replace TARGET_AVX128_OPTIMAL with
ix86_tune_features[X86_TUNE_AVX128_OPTIMAL]
* config/i386/i386.h (TARGET_AVX256_SPLIT_REGS): New macro.
(TARGET_AVX128_OPTIMAL): Deleted.
* config/i386/x86-tune.def (X86_TUNE_AVX256_SPLIT_REGS): New
DEF_TUNE.
From-SVN: r278385
|
|
Commit r276389 ("configure.ac: Remove GCC_HEADER_STDINT(gstdint.h)") has
not regenerated `testsuite/Makefile.in'. Fix it.
libgomp/
* testsuite/Makefile.in: Regenerate.
From-SVN: r278384
|
|
A change made with r271340 ("libfortran/90038: Use posix_spawn instead
of fork") accidentally brought the obsolete `runstatedir' setting back
in. Fix it.
libgfortran/
* Makefile.in: Regenerate.
From-SVN: r278383
|
|
From-SVN: r278382
|
|
* config/pa/linux-atomic.c (__kernel_cmpxchg): Change argument 1 to
volatile void *. Remove trap check.
(__kernel_cmpxchg2): Likewise.
(FETCH_AND_OP_2): Adjust operand types.
(OP_AND_FETCH_2): Likewise.
(FETCH_AND_OP_WORD): Likewise.
(OP_AND_FETCH_WORD): Likewise.
(COMPARE_AND_SWAP_2): Likewise.
(__sync_val_compare_and_swap_4): Likewise.
(__sync_bool_compare_and_swap_4): Likewise.
(SYNC_LOCK_TEST_AND_SET_2): Likewise.
(__sync_lock_test_and_set_4): Likewise.
(SYNC_LOCK_RELEASE_1): Likewise. Use __kernel_cmpxchg2 for release.
(__sync_lock_release_4): Adjust operand types. Use __kernel_cmpxchg
for release.
(__sync_lock_release_8): Remove.
From-SVN: r278377
|
|
From-SVN: r278376
|
|
the decl.
* method.c (lookup_comparison_result): Use %qD instead of %<%T::%D%>
to print the decl.
(lookup_comparison_category): Use %qD instead of %<std::%D%> to print
the decl.
* g++.dg/cpp2a/spaceship-err3.C: New test.
From-SVN: r278375
|
|
2019-11-16 Edward Smith-Rowland <3dw4rd@verizon.net>
Repair the <tuple> part of C++20 p1032 Misc constexpr bits.
* include/bits/uses_allocator.h (__uses_alloc0::_Sink::operator=)
(__use_alloc(const _Alloc&)) : Constexpr.
From-SVN: r278373
|
|
* include/std/string_view (basic_string_view(It, End)): Add range
constructor and deduction guide from P1391R4.
* testsuite/21_strings/basic_string_view/cons/char/range.cc: New test.
From-SVN: r278371
|
|
This adds another chunk of the <ranges> header.
The changes from P1456R1 (Move-only views) and P1862R1 (Range adaptors
for non-copyable iterators) are included, but not the changes from
P1870R1 (forwarding-range<T> is too subtle).
The tests for subrange and iota_view are poor and should be improved.
* include/bits/regex.h (match_results): Specialize __enable_view_impl.
* include/bits/stl_set.h (set): Likewise.
* include/bits/unordered_set.h (unordered_set, unordered_multiset):
Likewise.
* include/debug/multiset.h (__debug::multiset): Likewise.
* include/debug/set.h (__debug::set): Likewise.
* include/debug/unordered_set (__debug::unordered_set)
(__debug::unordered_multiset): Likewise.
* include/std/ranges (ranges::view, ranges::enable_view)
(ranges::view_interface, ranges::subrange, ranges::empty_view)
(ranges::single_view, ranges::views::single, ranges::iota_view)
(ranges::views::iota): Define for C++20.
* testsuite/std/ranges/empty_view.cc: New test.
* testsuite/std/ranges/iota_view.cc: New test.
* testsuite/std/ranges/single_view.cc: New test.
* testsuite/std/ranges/view.cc: New test.
From-SVN: r278370
|
|
From-SVN: r278369
|
|
Also make it a parameterized name: @cceq_{ior,rev}_compare_<mode>.
* config/rs6000/rs6000.md (cceq_ior_compare): Rename to...
(@cceq_ior_compare_<mode> for GPR): ... this. Allow GPR instead of
just SI.
(cceq_rev_compare): Rename to...
(@cceq_rev_compare_<mode> for GPR): ... this. Allow GPR instead of
just SI.
(define_split for <bd>tf_<mode>): Add SImode first argument to
gen_cceq_ior_compare.
From-SVN: r278366
|
|
This was not meant to be on the branch I committed r278364 from, as it
is not ready to commit yet.
* include/std/ranges: Revert accidentally committed changes.
From-SVN: r278365
|
|
This change avoids storing a copy of a stop_token object that isn't
needed and won't be passed to the callable object. This slightly reduces
memory usage when the callable doesn't use a stop_token. It also removes
indirection in the invocation of the callable in the new thread, as
there is no lambda and no additional calls to std::invoke.
It also adds some missing [[nodiscard]] attributes, and the non-member
swap overload for std::jthread.
* include/std/thread (jthread::jthread()): Use nostopstate constant.
(jthread::jthread(Callable&&, Args&&...)): Use helper function to
create std::thread instead of indirection through a lambda. Use
remove_cvref_t instead of decay_t.
(jthread::joinable(), jthread::get_id(), jthread::native_handle())
(jthread::hardware_concurrency()): Add nodiscard attribute.
(swap(jthread&, jthread&)): Define hidden friend.
(jthread::_S_create): New helper function for constructor.
From-SVN: r278364
|
|
From-SVN: r278363
|
|
I missed this part in r266961. Various people have been editing it
since; I finally noticed.
* common/config/powerpcspe: Delete.
From-SVN: r278361
|
|
* cp-demangle.c (d_print_init): Remove const from 4th param.
(cplus_demangle_fill_name): Initialize d->d_counting.
(cplus_demangle_fill_extended_operator): Likewise.
(cplus_demangle_fill_ctor): Likewise.
(cplus_demangle_fill_dtor): Likewise.
(d_make_empty): Likewise.
(d_count_templates_scopes): Remove const from 3rd param.
Return on dc->d_counting > 1.
Increment dc->d_counting.
* cp-demint.c (cplus_demangle_fill_component): Initialize d->d_counting.
(cplus_demangle_fill_builtin_type): Likewise.
(cplus_demangle_fill_operator): Likewise.
* demangle.h (struct demangle_component): Add member
d_counting.
From-SVN: r278359
|
|
* demangle.h (rust_demangle_callback): Add.
* cplus-dem.c (cplus_demangle): Use rust_demangle directly.
(rust_demangle): Remove.
* rust-demangle.c (is_prefixed_hash): Rename to is_legacy_prefixed_hash.
(parse_lower_hex_nibble): Rename to decode_lower_hex_nibble.
(parse_legacy_escape): Rename to decode_legacy_escape.
(rust_is_mangled): Remove.
(struct rust_demangler): Add.
(peek): Add.
(next): Add.
(struct rust_mangled_ident): Add.
(parse_ident): Add.
(rust_demangle_sym): Remove.
(print_str): Add.
(PRINT): Add.
(print_ident): Add.
(rust_demangle_callback): Add.
(struct str_buf): Add.
(str_buf_reserve): Add.
(str_buf_append): Add.
(str_buf_demangle_callback): Add.
(rust_demangle): Add.
* rust-demangle.h: Remove.
From-SVN: r278358
|
|
From-SVN: r278357
|
|
This patch uses distinct values for the FFR and FFRT outputs of
aarch64_wrffr, so that a following aarch64_copy_ffr_to_ffrt has
an effect. This is needed to avoid regressions with later patches.
The block comment at the head of the file already described
the pattern this way, and there was already an unspec for it.
Not sure what made me change it...
2019-11-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* config/aarch64/aarch64-sve.md (aarch64_wrffr): Wrap the FFRT
output in UNSPEC_WRFFR.
From-SVN: r278356
|
|
This patch rewrites the index-based alias checks to use conditions
of the form:
(unsigned T) (a - b + bias) <= limit
E.g. before the patch:
struct s { int x[100]; };
void
f1 (struct s *s1, int a, int b)
{
for (int i = 0; i < 32; ++i)
s1->x[i + a] += s1->x[i + b];
}
used:
add w3, w1, 3
cmp w3, w2
add w3, w2, 3
ccmp w1, w3, 0, ge
ble .L2
whereas after the patch it uses:
sub w3, w1, w2
add w3, w3, 3
cmp w3, 6
bls .L2
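Reading the two sequences back into C gives something like the sketch
below, assuming w1 holds a and w2 holds b and ignoring overflow. The
function names are invented for the example. Both forms are true
exactly when a and b are within 3 of each other.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: two signed comparisons (cmp + ccmp + ble).  */
static bool
old_index_check (int a, int b)
{
  return a + 3 >= b && a <= b + 3;
}

/* After: one unsigned comparison, (unsigned T) (a - b + bias) <= limit
   with bias 3 and limit 6.  */
static bool
new_index_check (int a, int b)
{
  return (unsigned int) (a - b + 3) <= 6u;
}

/* Check that the two forms agree over a small range of indices.  */
static void
check_index_forms_agree (void)
{
  for (int a = -8; a <= 8; ++a)
    for (int b = -8; b <= 8; ++b)
      assert (old_index_check (a, b) == new_index_check (a, b));
}
```

The unsigned form works because a negative a - b + 3 wraps round to a
large unsigned value, so a single comparison rejects differences on
both sides of the range.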
The patch also fixes the seg_len1 and seg_len2 negation for cases in
which seg_len is a "negative unsigned" value narrower than 64 bits,
like it is for 32-bit targets. Previously we'd end up with values
like 0xffffffff00000001 instead of 1.
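One way such a value can arise, shown as a standalone sketch rather
than the code in tree-data-ref.c: a 32-bit "negative unsigned" length
of 0xffffffff is widened to 64 bits before it is negated, instead of
being negated in its own width. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Widen first, then negate: the upper 32 bits end up set.  */
static uint64_t
negate_after_zero_extend (uint32_t seg_len)
{
  return -(uint64_t) seg_len;
}

/* Negate in the narrow type, then widen: gives the expected result.  */
static uint64_t
negate_in_narrow_type (uint32_t seg_len)
{
  return (uint64_t) (uint32_t) -seg_len;
}
```

For seg_len == 0xffffffff the first function returns
0xffffffff00000001 and the second returns 1.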
2019-11-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-data-ref.c (create_intersect_range_checks_index): Rewrite
the index tests to have the form (unsigned T) (B - A + bias) <= limit.
gcc/testsuite/
* gcc.dg/vect/vect-alias-check-18.c: New test.
* gcc.dg/vect/vect-alias-check-19.c: Likewise.
* gcc.dg/vect/vect-alias-check-20.c: Likewise.
From-SVN: r278354
|
|
This patch prints a message to say how an alias check is being
implemented.
2019-11-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-data-ref.c (create_intersect_range_checks_index)
(create_intersect_range_checks): Print dump messages.
gcc/testsuite/
* gcc.dg/vect/vect-alias-check-1.c: Test for the type of alias check.
* gcc.dg/vect/vect-alias-check-8.c: Likewise.
* gcc.dg/vect/vect-alias-check-9.c: Likewise.
* gcc.dg/vect/vect-alias-check-10.c: Likewise.
* gcc.dg/vect/vect-alias-check-11.c: Likewise.
* gcc.dg/vect/vect-alias-check-12.c: Likewise.
* gcc.dg/vect/vect-alias-check-13.c: Likewise.
* gcc.dg/vect/vect-alias-check-14.c: Likewise.
* gcc.dg/vect/vect-alias-check-15.c: Likewise.
* gcc.dg/vect/vect-alias-check-16.c: Likewise.
* gcc.dg/vect/vect-alias-check-17.c: Likewise.
From-SVN: r278353
|
|
This patch dumps the final (merged) list of alias pairs. It also adds:
- WAW and RAW versions of vect-alias-check-8.c
- a "well-ordered" version of vect-alias-check-9.c (i.e. all reads
before any writes)
- a test with mixed steps in the same alias pair
I also tweaked the test value in vect-alias-check-9.c so that the
result was less likely to be accidentally correct if the alias
isn't honoured.
2019-11-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-data-ref.c (dump_alias_pair): New function.
(prune_runtime_alias_test_list): Use it to dump each merged alias pair.
gcc/testsuite/
* gcc.dg/vect/vect-alias-check-8.c: Test for the RAW flag.
* gcc.dg/vect/vect-alias-check-9.c: Test for the ARBITRARY flag.
(TEST_VALUE): Use a higher value for early iterations.
* gcc.dg/vect/vect-alias-check-14.c: New test.
* gcc.dg/vect/vect-alias-check-15.c: Likewise.
* gcc.dg/vect/vect-alias-check-16.c: Likewise.
* gcc.dg/vect/vect-alias-check-17.c: Likewise.
From-SVN: r278352
|
|
prune_runtime_alias_test_list can merge dr_with_seg_len_pair_ts that
have different steps for the first reference or different steps for the
second reference. This patch adds a flag to record that.
I don't know whether the change to create_intersect_range_checks_index
fixes anything in practice. It would have to be a corner case if so,
since at present we only merge two alias pairs if either the first or
the second references are identical and only the other references differ.
And the vectoriser uses VF-based segment lengths only if both references
in a pair have the same step. Either way, it still seems wrong to use
DR_STEP when it doesn't represent all checks that have been merged into
the pair.
2019-11-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-data-ref.h (DR_ALIAS_MIXED_STEPS): New flag.
* tree-data-ref.c (prune_runtime_alias_test_list): Set it when
merging data references with different steps.
(create_intersect_range_checks_index): Take a
dr_with_seg_len_pair_t instead of two dr_with_seg_lens.
Bail out if DR_ALIAS_MIXED_STEPS is set.
(create_intersect_range_checks): Take a dr_with_seg_len_pair_t
instead of two dr_with_seg_lens. Update call to
create_intersect_range_checks_index.
(create_runtime_alias_checks): Update call accordingly.
From-SVN: r278351
|
|
This patch adds a bunch of flags to dr_with_seg_len_pair_t,
for use by later patches. The update to tree-loop-distribution.c
is conservatively correct, but might be tweakable later.
2019-11-16 Richard Sandiford <richard.sandiford@arm.com>
gcc/
* tree-data-ref.h (DR_ALIAS_RAW, DR_ALIAS_WAR, DR_ALIAS_WAW)
(DR_ALIAS_ARBITRARY, DR_ALIAS_SWAPPED, DR_ALIAS_UNSWAPPED): New flags.
(dr_with_seg_len_pair_t::sequencing): New enum.
(dr_with_seg_len_pair_t::flags): New member variable.
(dr_with_seg_len_pair_t::dr_with_seg_len_pair_t): Take a sequencing
parameter and initialize the flags member variable.
* tree-loop-distribution.c (compute_alias_check_pairs): Update
call accordingly.
* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list): Likewise.
Ensure the two data references in an alias pair are in statement
order, if there is a defined order.
* tree-data-ref.c (prune_runtime_alias_test_list): Use
DR_ALIAS_SWAPPED and DR_ALIAS_UNSWAPPED to record whether we've
swapped the references in a dr_with_seg_len_pair_t. OR together
the flags when merging two dr_with_seg_len_pair_ts. After merging,
try to restore the original dr_with_seg_len order, updating the
flags if that fails.
From-SVN: r278350
|