Age | Commit message | Author | Files | Lines |
|
This makes sure to use splats early when facing uniform internal
operands in BB SLP discovery rather than relying on the late
heuristics re-building nodes from scratch.
2020-10-27 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (vect_build_slp_tree_2): When vectorizing
BBs splat uniform operands and stop SLP discovery.
* gcc.target/i386/pr95866-1.c: Adjust.
|
|
FAIL: gcc.target/powerpc/swaps-p8-22.c (test for excess errors)
Excess errors:
cc1: error: '-mcmodel' not supported in this configuration
* gcc.target/powerpc/swaps-p8-22.c: Enable only for aix and
-m64 linux.
|
|
The allocation of mutex objects for synchronized statements has been
moved to the library as of merging druntime 58560d51. All support code
in the compiler for getting the OS critical section size has been
removed along with it.
Reviewed-on: https://github.com/dlang/dmd/pull/11902
https://github.com/dlang/druntime/pull/3248
gcc/ChangeLog:
* config/aarch64/aarch64-linux.h (GNU_USER_TARGET_D_CRITSEC_SIZE):
Remove.
* config/glibc-d.c (glibc_d_critsec_size): Likewise.
(TARGET_D_CRITSEC_SIZE): Likewise.
* config/i386/linux-common.h (GNU_USER_TARGET_D_CRITSEC_SIZE):
Likewise.
* config/sol2-d.c (solaris_d_critsec_size): Likewise.
(TARGET_D_CRITSEC_SIZE): Likewise.
* doc/tm.texi.in (TARGET_D_CRITSEC_SIZE): Likewise.
* doc/tm.texi: Regenerate.
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd bec5973b0.
* d-target.cc (Target::critsecsize): Remove.
* d-target.def: Remove d_critsec_size.
libphobos/ChangeLog:
* libdruntime/MERGE: Merge upstream druntime 58560d51.
|
|
Fixes a bug where there were undefined template references when compiling
upstream dmd mainline.
In `TemplateInstance::semantic`, there exists special handling of
matching template instances for the same template declaration to ensure
that only at most one instance gets codegen'd.
If the primary instance `inst` originated from a non-root module, the
`minst` field will be updated so it is now coming from a root module,
however all Dsymbol `inst->members` of the instance still have their
`_scope->minst` pointing at the original non-root module. We must now
propagate `minst` to all members so that forward referenced dependencies
that get instantiated will also be appended to the root module,
otherwise there will be undefined references at link-time.
This doesn't affect compilations where all modules are compiled
together, as every module is a root module in that situation. What this
primarily affects are cases where there is a mix of root and non-root
modules, and a template was first instantiated in a non-root context,
then later instantiated again in a root context.
Reviewed-on: https://github.com/dlang/dmd/pull/11867
gcc/d/ChangeLog:
* dmd/MERGE: Merge upstream dmd 0fcdaab32
|
|
gcc/ChangeLog:
PR gcov-profile/97461
* gcov-io.h (GCOV_PREALLOCATED_KVP): Pre-allocate 64
static counters.
libgcc/ChangeLog:
PR gcov-profile/97461
* libgcov.h (gcov_counter_add): Use first static counters
as it should help to have malloc wrappers set up.
gcc/testsuite/ChangeLog:
PR gcov-profile/97461
* gcc.dg/tree-prof/pr97461.c: New test.
|
|
* tree-ssa-alias.c (attr_fnspec::verify): Re-enable checking.
|
|
gcc/ada/
* Makefile.rtl: Add vx7r2cert spec file to ARM, PowerPC and x86
targets.
* vxworks7-cert-rtp-link.spec: New spec file.
|
|
gcc/ada/
* Makefile.rtl (GNATRTL_NONTASKING_OBJS): Add g-spogwa object.
* libgnat/g-spogwa.adb: Fix style errors.
|
|
gcc/ada/
* exp_spark.adb (Expand_SPARK_Array_Aggregate): Dedicated
routine for array aggregates; mostly reuses existing code, but
calls itself recursively for multi-dimensional array aggregates.
(Expand_SPARK_N_Aggregate): Call Expand_SPARK_Array_Aggregate to
do the actual expansion, starting from the first index of the
array type.
|
|
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Component_Association): new
internal subprogram Remove_References, to reset semantic
information on each reference to the index variable of the
association, so that Collect_Aggregate_Bounds can work properly
on multidimensional arrays with nested associations, and
subsequent expansion into loops can verify that dimensions of
each subaggregate are compatible.
|
|
gcc/ada/
* exp_prag.adb (Append_Copies): Handle N_Parameter_Associations.
|
|
gcc/ada/
* ada_get_targ.adb (Digits_From_Size): Delete.
(Width_From_Size): Likewise.
* get_targ.adb (Digits_From_Size): Likewise.
(Width_From_Size): Likewise.
* get_targ.ads (Digits_From_Size): Likewise.
(Width_From_Size): Likewise.
* ttypes.ads: Remove with clause for Get_Targ.
(Standard_Short_Short_Integer_Width): Delete.
(Standard_Short_Integer_Width): Likewise.
(Standard_Integer_Width): Likewise.
(Standard_Long_Integer_Width): Likewise.
(Standard_Long_Long_Integer_Width): Likewise.
(Standard_Long_Long_Long_Integer_Width): Likewise.
(Standard_Short_Float_Digits): Likewise.
(Standard_Float_Digits): Likewise.
(Standard_Long_Float_Digits): Likewise.
(Standard_Long_Long_Float_Digits): Likewise.
* gnat1drv.adb (Adjust_Global_Switches): Adjust.
|
|
gcc/ada/
* exp_ch6.adb, freeze.adb, gnat1drv.adb, opt.ads, sem_ch6.adb
(Transform_Function_Array): New flag, split from Modify_Tree_For_C.
* exp_unst.adb: Minor reformatting.
|
|
gcc/ada/
* libgnat/g-socpol.adb (Wait): Do not exit from loop on EINTR
error and timeout is over.
|
|
* builtin-attrs.def (STRERRNOC): New macro.
(STRERRNOP): New macro.
(ATTR_ERRNOCONST_NOTHROW_LEAF_LIST): New attr list.
(ATTR_ERRNOPURE_NOTHROW_LEAF_LIST): New attr list.
* builtins.def (ATTR_MATHFN_ERRNO): Use
ATTR_ERRNOCONST_NOTHROW_LEAF_LIST.
(ATTR_MATHFN_FPROUNDING_ERRNO): Use ATTR_ERRNOCONST_NOTHROW_LEAF_LIST
or ATTR_ERRNOPURE_NOTHROW_LEAF_LIST.
|
|
- Generalize logic for translating arch to internal flags; this patch
is infrastructure for supporting sub-extension parsing.
gcc/ChangeLog
* common/config/riscv/riscv-common.c (opt_var_ref_t): New.
(riscv_ext_flag_table_t): New.
(riscv_ext_flag_table): New.
(riscv_parse_arch_string): Pass gcc_options* instead of
&opts->x_target_flags only, and use riscv_arch_option_table to
set up flags.
(riscv_handle_option): Update argument for riscv_parse_arch_string.
(riscv_expand_arch): Ditto.
(riscv_expand_arch_from_cpu): Ditto.
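A table-driven translation of this kind might look roughly like the sketch below. Everything in it is an assumption made for illustration: the mask values, the table rows, the option-variable names, and `apply_extensions` are hypothetical, and the dict merely stands in for a whole options structure.

```python
# Hypothetical flag bits; the real masks are defined by GCC's option machinery.
MASK_MUL = 1 << 0
MASK_ATOMIC = 1 << 1
MASK_HARD_FLOAT = 1 << 2
MASK_RVC = 1 << 3

# Each row names an extension, the option variable it updates, and the bit.
EXT_FLAG_TABLE = [
    ("m", "target_flags", MASK_MUL),
    ("a", "target_flags", MASK_ATOMIC),
    ("f", "target_flags", MASK_HARD_FLOAT),
    ("c", "target_flags", MASK_RVC),
]


def apply_extensions(extensions, opts):
    """Set the flag bit of every known extension found in `extensions`.

    `opts` maps option-variable names to integer bit sets, so a row may
    target any variable rather than one fixed flags word.
    """
    for ext, var, mask in EXT_FLAG_TABLE:
        if ext in extensions:
            opts[var] = opts.get(var, 0) | mask
    return opts


opts = apply_extensions({"m", "c"}, {"target_flags": 0})
```

Passing the whole options container rather than a single flags word is what lets a later row point at a different variable without changing the function's signature.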
|
|
* tree-ssa-ccp.c (evaluate_stmt): Use EAF_RETURN_ARG; do not handle
string builtins specially.
|
|
* tree.c (set_call_expr_flags): Fix string for ECF_RET1.
(build_common_builtin_nodes): Do not set ECF_RET1 for memcpy, memmove,
and memset. They are handled by builtin_fnspec.
|
|
* builtins.c (builtin_fnspec): Add bzero, memcmp, memcmp_eq, bcmp,
strncmp, strncmp_eq, strncasecmp, rindex, strlen, strnlen, strcasecmp,
strcspn, strspn, strcmp, strcmp_eq.
|
|
This introduces a global alloc-pool for SLP nodes. It reduces the
overhead of SLP allocation churn, which is expected to get worse, and
eventually lets us release SLP cycles, which currently retain a refcount
of one and are therefore never freed.
2020-10-26 Richard Biener <rguenther@suse.de>
* tree-vectorizer.h (slp_tree_pool): Declare.
(_slp_tree::operator new): Likewise.
(_slp_tree::operator delete): Likewise.
* tree-vectorizer.c (vectorize_loops): Allocate and free the
slp_tree_pool.
(pass_slp_vectorize::execute): Likewise.
* tree-vect-slp.c (slp_tree_pool): Define.
(_slp_tree::operator new): Likewise.
(_slp_tree::operator delete): Likewise.
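The benefit for cyclic graphs can be illustrated with a toy pool. This is a sketch under assumptions: `NodePool`, its methods, and the dict-based nodes are hypothetical and only model the idea that a pool owns every node and can drop them all at once, whatever their reference counts.

```python
class NodePool:
    """Hypothetical pool that owns every node allocated from it."""

    def __init__(self):
        self.live = []

    def allocate(self):
        """Create a node with an initial reference count of one."""
        node = {"children": [], "refcount": 1}
        self.live.append(node)
        return node

    def release_all(self):
        """Drop every node at once and return how many were released."""
        count = len(self.live)
        for node in self.live:
            node["children"].clear()
        self.live.clear()
        return count


pool = NodePool()
a = pool.allocate()
b = pool.allocate()

# Build a cycle: each node keeps the other alive, so neither count can
# reach zero through ordinary reference counting.
a["children"].append(b)
b["refcount"] += 1
b["children"].append(a)
a["refcount"] += 1

released = pool.release_all()
```

Both nodes are released even though each still had a non-zero count, which is the situation the message describes for SLP cycles.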
|
|
We now correctly detect that a job server is not active for
LTO linking:
lto-wrapper: warning: jobserver is not available: '--jobserver-auth=' is not present in 'MAKEFLAGS'
In that situation we should not call make -f abc.mk as it can lead
to N^2 LTRANS units.
gcc/ChangeLog:
* lto-wrapper.c (run_gcc): Do not use sub-make when jobserver is
not detected properly.
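The decision can be sketched as follows. This is a simplified illustration, not the lto-wrapper code: `jobserver_fds` and `use_sub_make` are hypothetical helpers, only the `--jobserver-auth=R,W` file-descriptor form is recognised, and no check is made that the descriptors are actually open.

```python
import re


def jobserver_fds(makeflags):
    """Return (read_fd, write_fd) parsed from a MAKEFLAGS string, or None."""
    match = re.search(r"--jobserver-auth=(\d+),(\d+)", makeflags or "")
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def use_sub_make(makeflags):
    """Hand the LTRANS units to a sub-make only when a jobserver was found."""
    return jobserver_fds(makeflags) is not None
```

For example, `use_sub_make("-j8 --jobserver-auth=3,4")` is true, while a `MAKEFLAGS` value without the option makes the caller fall back to running the units itself.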
|
|
gcc/ChangeLog:
* symbol-summary.h (call_summary_base): Pass symtab hooks to
base and register (or unregister) hooks directly.
|
|
gcc/ChangeLog:
* symbol-summary.h (function_summary_base::unregister_hooks):
Call disable_insertion_hook and disable_duplication_hook.
(function_summary_base::symtab_insertion): New field.
(function_summary_base::symtab_removal): Likewise.
(function_summary_base::symtab_duplication): Likewise.
Register hooks in function_summary_base and directly register
(or unregister) hooks.
|
|
gcc/testsuite/ChangeLog:
PR tree-optimization/97560
* g++.dg/pr97560.C: New test.
|
|
* gcc.target/powerpc/vsx_mask-count-runnable.c: Separate options
passed to dg-require-effective-target.
* gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.
|
|
|
|
When running with -m32
FAIL: gcc.target/powerpc/pr94740.c (test for excess errors)
Excess errors:
cc1: error: '-mpcrel' requires '-mcmodel=medium'
The others don't run for -m32, but remove the unnecessary -mpcrel
anyway.
* gcc.target/powerpc/localentry-1.c: Remove -mpcrel from options.
* gcc.target/powerpc/notoc-direct-1.c: Likewise.
* gcc.target/powerpc/pr94740.c: Likewise.
|
|
* gcc.target/powerpc/bswap64-4.c: Comment.
|
|
* gcc.target/powerpc/pr93122.c: Replace -mcpu with -mdejagnu-cpu.
* gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.
|
|
All these tests fail with -m32 due to lack of int128 support, in some
cases with what I thought was not the best error message. For example
vsx_mask-move-runnable.c:34:3: error: unknown type name 'vector'
is misleading. The problem isn't "vector" but "vector __uint128_t".
* gcc.target/powerpc/vsx-load-element-extend-char.c: Require int128.
* gcc.target/powerpc/vsx-load-element-extend-int.c: Likewise.
* gcc.target/powerpc/vsx-load-element-extend-longlong.c: Likewise.
* gcc.target/powerpc/vsx-load-element-extend-short.c: Likewise.
* gcc.target/powerpc/vsx-store-element-truncate-char.c: Likewise.
* gcc.target/powerpc/vsx-store-element-truncate-int.c: Likewise.
* gcc.target/powerpc/vsx-store-element-truncate-longlong.c: Likewise.
* gcc.target/powerpc/vsx-store-element-truncate-short.c: Likewise.
* gcc.target/powerpc/vsx_mask-count-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-expand-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-extract-runnable.c: Likewise.
* gcc.target/powerpc/vsx_mask-move-runnable.c: Likewise.
|
|
Running the assembler and linker catches more errors.
* gcc.target/powerpc/cfuged-1.c,
gcc.target/powerpc/cntlzdm-1.c,
gcc.target/powerpc/cnttzdm-1.c,
gcc.target/powerpc/dg-future-1.c,
gcc.target/powerpc/lsbb-runnable.c,
gcc.target/powerpc/mma-double-test.c,
gcc.target/powerpc/mma-single-test.c,
gcc.target/powerpc/p10-arch31.c,
gcc.target/powerpc/p10-identify.c,
gcc.target/powerpc/pdep-1.c,
gcc.target/powerpc/pextd-1.c,
gcc.target/powerpc/pr96787-2.c,
gcc.target/powerpc/vec-blend-runnable.c,
gcc.target/powerpc/vec-cfuged-1.c,
gcc.target/powerpc/vec-clrl-1.c,
gcc.target/powerpc/vec-clrl-3.c,
gcc.target/powerpc/vec-clrr-1.c,
gcc.target/powerpc/vec-clrr-3.c,
gcc.target/powerpc/vec-cntlzm-1.c,
gcc.target/powerpc/vec-cnttzm-1.c,
gcc.target/powerpc/vec-extracth-1.c,
gcc.target/powerpc/vec-extracth-3.c,
gcc.target/powerpc/vec-extracth-5.c,
gcc.target/powerpc/vec-extracth-7.c,
gcc.target/powerpc/vec-extractl-1.c,
gcc.target/powerpc/vec-extractl-3.c,
gcc.target/powerpc/vec-extractl-5.c,
gcc.target/powerpc/vec-extractl-7.c,
gcc.target/powerpc/vec-gnb-1.c,
gcc.target/powerpc/vec-insert-word-runnable.c,
gcc.target/powerpc/vec-pdep-1.c,
gcc.target/powerpc/vec-permute-ext-runnable.c,
gcc.target/powerpc/vec-pext-1.c,
gcc.target/powerpc/vec-replace-word-runnable.c,
gcc.target/powerpc/vec-shift-double-runnable.c,
gcc.target/powerpc/vec-splati-runnable.c,
gcc.target/powerpc/vec-stril-1.c,
gcc.target/powerpc/vec-stril-16.c,
gcc.target/powerpc/vec-stril-17.c,
gcc.target/powerpc/vec-stril-18.c,
gcc.target/powerpc/vec-stril-19.c,
gcc.target/powerpc/vec-stril-20.c,
gcc.target/powerpc/vec-stril-21.c,
gcc.target/powerpc/vec-stril-22.c,
gcc.target/powerpc/vec-stril-23.c,
gcc.target/powerpc/vec-stril-3.c,
gcc.target/powerpc/vec-stril-5.c,
gcc.target/powerpc/vec-stril-7.c,
gcc.target/powerpc/vec-stril_p-1.c,
gcc.target/powerpc/vec-stril_p-3.c,
gcc.target/powerpc/vec-stril_p-5.c,
gcc.target/powerpc/vec-stril_p-7.c,
gcc.target/powerpc/vec-strir-1.c,
gcc.target/powerpc/vec-strir-16.c,
gcc.target/powerpc/vec-strir-17.c,
gcc.target/powerpc/vec-strir-18.c,
gcc.target/powerpc/vec-strir-19.c,
gcc.target/powerpc/vec-strir-20.c,
gcc.target/powerpc/vec-strir-21.c,
gcc.target/powerpc/vec-strir-22.c,
gcc.target/powerpc/vec-strir-23.c,
gcc.target/powerpc/vec-strir-3.c,
gcc.target/powerpc/vec-strir-5.c,
gcc.target/powerpc/vec-strir-7.c,
gcc.target/powerpc/vec-strir_p-1.c,
gcc.target/powerpc/vec-strir_p-3.c,
gcc.target/powerpc/vec-strir_p-5.c,
gcc.target/powerpc/vec-strir_p-7.c,
gcc.target/powerpc/vec-ternarylogic-1.c,
gcc.target/powerpc/vec-ternarylogic-3.c,
gcc.target/powerpc/vec-ternarylogic-5.c,
gcc.target/powerpc/vec-ternarylogic-7.c,
gcc.target/powerpc/vec-ternarylogic-9.c,
gcc.target/powerpc/vsx_mask-count-runnable.c,
gcc.target/powerpc/vsx_mask-expand-runnable.c,
gcc.target/powerpc/vsx_mask-extract-runnable.c,
gcc.target/powerpc/vsx_mask-move-runnable.c,
gcc.target/powerpc/xxgenpc-runnable.c: Link testcase when it
can't be run.
|
|
This tests behaviour near the limit of 16-bit signed offsets. If
power10 prefix instructions are enabled, no such testing occurs.
* gcc.target/powerpc/dimode_off.c: Add -mno-prefixed to options.
|
|
These tests require -mno-pcrel because they are testing features
of the non-pcrel ABI.
* gcc.target/powerpc/cprophard.c: Add -mno-pcrel to options.
* gcc.target/powerpc/float128-hw3.c: Likewise.
* gcc.target/powerpc/pr79439-1.c: Likewise.
* gcc.target/powerpc/pr79439-2.c: Likewise.
* gcc.target/powerpc/r2_shrink-wrap.c: Likewise.
|
|
Import additional code from upstream for handling system
calls on BSD systems. This makes the syscall package on
NetBSD complete enough to compile the standard library.
Updates golang/go#38538.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/265123
|
|
When combining logical OR operands with a FALSE result, union the false
ranges for operand1 and operand2... not intersection.
gcc/
PR tree-optimization/97567
* gimple-range-gori.cc (gori_compute::logical_combine): Union the
ranges of operand1 and operand2, not intersect.
gcc/testsuite/
* gcc.dg/pr97567.c: New.
|
|
libstdc++-v3/ChangeLog:
* include/experimental/executor (strand::_State): Fix thinko.
|
|
* attr-fnspec.h: Update toplevel comment.
(attr_fnspec::attr_fnspec): New constructor.
(attr_fnspec::arg_read_p,
attr_fnspec::arg_written_p,
attr_fnspec::arg_access_size_given_by_arg_p,
attr_fnspec::arg_single_access_p,
attr_fnspec::loads_known_p,
attr_fnspec::stores_known_p,
attr_fnspec::clobbers_errno_p): New member functions.
(gimple_call_fnspec): Declare.
(builtin_fnspec): Declare.
* builtins.c: Include attr-fnspec.h.
(builtin_fnspec): New function.
* builtins.def (BUILT_IN_MEMCPY): Do not specify RET1 fnspec.
(BUILT_IN_MEMMOVE): Do not specify RET1 fnspec.
(BUILT_IN_MEMSET): Do not specify RET1 fnspec.
(BUILT_IN_STRCAT): Do not specify RET1 fnspec.
(BUILT_IN_STRCPY): Do not specify RET1 fnspec.
(BUILT_IN_STRNCAT): Do not specify RET1 fnspec.
(BUILT_IN_STRNCPY): Do not specify RET1 fnspec.
(BUILT_IN_MEMCPY_CHK): Do not specify RET1 fnspec.
(BUILT_IN_MEMMOVE_CHK): Do not specify RET1 fnspec.
(BUILT_IN_MEMSET_CHK): Do not specify RET1 fnspec.
(BUILT_IN_STRCAT_CHK): Do not specify RET1 fnspec.
(BUILT_IN_STRCPY_CHK): Do not specify RET1 fnspec.
(BUILT_IN_STRNCAT_CHK): Do not specify RET1 fnspec.
(BUILT_IN_STRNCPY_CHK): Do not specify RET1 fnspec.
* gimple.c (gimple_call_fnspec): Return attr_fnspec.
(gimple_call_arg_flags): Update.
(gimple_call_return_flags): Update.
* tree-ssa-alias.c (check_fnspec): New function.
(ref_maybe_used_by_call_p_1): Use fnspec for builtin handling.
(call_may_clobber_ref_p_1): Likewise.
(attr_fnspec::verify): Update verifier.
* calls.c (decl_fnspec): New function.
(decl_return_flags): Use it.
|
|
The problem here is we are trying to add 1 to a -1 in a signed 1-bit
field and coming up with UNDEFINED because of the overflow.
Signed 1-bits are annoying because you can't really add or subtract
one, because the one is unrepresentable. For invert() we have a
special subtract_one() function that handles 1-bit signed fields.
This patch implements the analogous add_one() function so that invert
works.
gcc/ChangeLog:
PR tree-optimization/97555
* range-op.cc (range_tests): Test 1-bit signed invert.
* value-range.cc (subtract_one): Adjust comment.
(add_one): New.
(irange::invert): Call add_one.
gcc/testsuite/ChangeLog:
* gcc.dg/pr97555.c: New test.
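The arithmetic behind this fix can be modelled in a few lines. The sketch below is an illustration only: `type_bounds`, `add_one`, and `invert` are hypothetical Python helpers, Python integers are unbounded so overflow is modelled with an explicit range check, and `invert` handles just a single range.

```python
def type_bounds(bits):
    """Smallest and largest value of a signed two's-complement field."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def add_one(value, bits):
    """Return value + 1, or None if the result does not fit in the field.

    In a 1-bit signed field the constant 1 is not representable, so the
    increment is written as subtracting -1, which is representable.
    """
    lo, hi = type_bounds(bits)
    result = value - (-1)
    return result if lo <= result <= hi else None


def invert(lo, hi, bits):
    """Complement of the single range [lo, hi] within the field's domain."""
    tmin, tmax = type_bounds(bits)
    pieces = []
    if lo > tmin:
        pieces.append((tmin, lo - 1))
    if hi < tmax:
        pieces.append((add_one(hi, bits), tmax))
    return pieces
```

A 1-bit signed field holds only -1 and 0, so inverting `[-1, -1]` needs "-1 plus one" to produce the lower bound of the result `[0, 0]`.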
|
|
This patch implements the two-state optimize_for_size predicates, so with -Os
and with profile feedback for never executed code it returns OPTIMIZE_SIZE_MAX
while in cases we decide to optimize for size based on branch prediction logic
it returns OPTIMIZE_SIZE_BALLANCED.
The idea is that for places where we guess that code is unlikely we do not
want to do extreme optimizations for size that lead to many-fold slowdowns
(using idiv rather than a few shifts or using rep based inlined stringops).
I will update RTL handling code to also support this with BB granularity (which
we don't currently). LLVM has -Os and -Oz levels where -Oz is our -Os and
LLVM's -Os would correspond to OPTIMIZE_SIZE_BALLANCED. I wonder if we want
to export this to command line somehow? For me it would be definitely useful
to test things, I am not sure how much a "weaker" -Os is desired in practice.
gcc/ChangeLog:
* cgraph.h (cgraph_node::optimize_for_size_p): Return
optimize_size_level.
(cgraph_node::optimize_for_size_p): Update.
* coretypes.h (enum optimize_size_level): New enum.
* predict.c (unlikely_executed_edge_p): Microoptimize.
(optimize_function_for_size_p): Return optimize_size_level.
(optimize_bb_for_size_p): Likewise.
(optimize_edge_for_size_p): Likewise.
(optimize_insn_for_size_p): Likewise.
(optimize_loop_nest_for_size_p): Likewise.
* predict.h (optimize_function_for_size_p): Update declaration.
(optimize_bb_for_size_p): Update declaration.
(optimize_edge_for_size_p): Update declaration.
(optimize_insn_for_size_p): Update declaration.
(optimize_loop_for_size_p): Update declaration.
(optimize_loop_nest_for_size_p): Update declaration.
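The shape of such a three-valued predicate can be sketched as below. This is a hedged illustration: `SizeLevel`, `size_level`, and `use_idiv_instead_of_shifts` are hypothetical names, and the inputs are reduced to three booleans rather than the profile and prediction data the compiler actually consults.

```python
from enum import IntEnum


class SizeLevel(IntEnum):
    """Hypothetical mirror of a three-valued size-optimization level."""
    NO = 0
    BALANCED = 1
    MAX = 2


def size_level(optimize_size, never_executed, predicted_cold):
    """Pick a level from -Os, profile feedback, and branch prediction."""
    if optimize_size or never_executed:
        return SizeLevel.MAX
    if predicted_cold:
        return SizeLevel.BALANCED
    return SizeLevel.NO


def use_idiv_instead_of_shifts(level):
    """An aggressive size trade-off that is only taken at the MAX level."""
    return level == SizeLevel.MAX
```

Because the "no" level is zero, a caller that only tests the result for truth keeps its old yes/no behaviour, while callers making expensive trade-offs can compare against the maximum level.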
|
|
This refactors the toplevel entry to analyze an SLP instance to
expose a worker analyzing from a vector of stmts and an SLP entry
kind.
2020-10-26 Richard Biener <rguenther@suse.de>
* tree-vect-slp.c (enum slp_instance_kind): New.
(vect_build_slp_instance): Split out from...
(vect_analyze_slp_instance): ... this.
|
|
classes.
Initialize zerov to match vr-values.c.
* gimple-range.cc (range_of_builtin_call): Initialize zerov to 0.
|
|
gcc/c-family/ChangeLog:
* c-common.c (__is_nothrow_assignable): New.
(__is_nothrow_constructible): Likewise.
* c-common.h (RID_IS_NOTHROW_ASSIGNABLE): New.
(RID_IS_NOTHROW_CONSTRUCTIBLE): Likewise.
gcc/cp/ChangeLog:
* cp-tree.h (CPTK_IS_NOTHROW_ASSIGNABLE): New.
(CPTK_IS_NOTHROW_CONSTRUCTIBLE): Likewise.
(is_nothrow_xible): Likewise.
* method.c (is_nothrow_xible): New.
(is_trivially_xible): Tweak.
* parser.c (cp_parser_primary_expression): Handle the new RID_*.
(cp_parser_trait_expr): Likewise.
* semantics.c (trait_expr_value): Handle the new RID_*.
(finish_trait_expr): Likewise.
libstdc++-v3/ChangeLog:
* include/std/type_traits (__is_nt_constructible_impl): Remove.
(__is_nothrow_constructible_impl): Adjust.
(is_nothrow_default_constructible): Likewise.
(__is_nt_assignable_impl): Remove.
(__is_nothrow_assignable_impl): Adjust.
|
|
gcc/ChangeLog:
PR ipa/97576
* cgraphclones.c (cgraph_node::materialize_clone): Clear stmt
references.
* cgraphunit.c (mark_functions_to_output): Do not clear them here.
* ipa-inline-transform.c (inline_transform): Clear stmt references.
* symtab.c (symtab_node::clear_stmts_in_references): Make recursive
for clones.
* tree-ssa-structalias.c (ipa_pta_execute): Do not clear references.
gcc/testsuite/ChangeLog:
PR ipa/97576
* gcc.c-torture/compile/pr97576.c: New test.
|
|
2020-10-26 Zhiheng Xie <xiezhiheng@huawei.com>
Nannan Zheng <zhengnannan@huawei.com>
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.c: Add FLAG STORE.
* config/aarch64/aarch64-simd-builtins.def: Add proper FLAG
for store intrinsics.
|
|
libstdc++-v3/ChangeLog:
PR libstdc++/97570
* libsupc++/new_opa.cc: Declare size_t in global namespace.
Remove unused header.
|
|
sizes
This patch fixes the ICE in the PR by bailing out of find_bswap_or_nop
on poly_int sizes.
I don't think it is intended to handle them, and from my reading of the
code this is the most appropriate place to reject them, rather than in
the callers.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
PR tree-optimization/97546
* gimple-ssa-store-merging.c (find_bswap_or_nop): Return NULL if
type is not INTEGER_CST.
gcc/testsuite/
PR tree-optimization/97546
* gcc.target/aarch64/sve/acle/general/pr97546.c: New test.
|
|
This makes us always use a single-bit boolean component type
for integer mode mask VECTOR_BOOLEAN_TYPE_P to match the RTL and target
representation. This avoids the need for magic translation and
the inconsistencies from the translation requirement now that
we expose temporaries of those types on the GIMPLE level.
2020-10-23 Richard Biener <rguenther@suse.de>
PR middle-end/97521
* expr.c (const_scalar_mask_from_tree): Remove.
(expand_expr_real_1): Always VIEW_CONVERT integer mode
vector constants to an integer type.
* tree.c (build_truth_vector_type_for_mode): Use a single-bit
boolean component type for non-vector-mode mask_mode.
* gcc.target/i386/pr97521.c: New testcase.
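A mask held in an integer with one bit per vector element can be pictured as follows. This sketch is illustrative only: `pack_mask` and `unpack_mask` are hypothetical helpers, and the choice of lane 0 as the least significant bit is an assumption made for the example.

```python
def pack_mask(lanes):
    """Pack a list of booleans into an integer, one bit per lane.

    Lane 0 is stored in bit 0 (an assumption made for this example).
    """
    mask = 0
    for index, lane in enumerate(lanes):
        if lane:
            mask |= 1 << index
    return mask


def unpack_mask(mask, nlanes):
    """Inverse of pack_mask for a vector of `nlanes` elements."""
    return [bool((mask >> index) & 1) for index in range(nlanes)]
```

With single-bit elements the integer already is the mask, so no per-element translation step is needed when moving between the two views.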
|
|
Expand strncmp to "repz cmpsb" only with -minline-all-stringops since
"repz cmpsb" can be much slower than strncmp function implemented with
vector instructions, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
gcc/
PR target/95458
* config/i386/i386-expand.c (ix86_expand_cmpstrn_or_cmpmem):
Return false for -mno-inline-all-stringops.
gcc/testsuite/
PR target/95458
* gcc.target/i386/pr95458-1.c: New test.
* gcc.target/i386/pr95458-2.c: Likewise.
|
|
We used to expand memcmp to "repz cmpsb" via cmpstrnsi. It was changed
by
commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
Author: Nick Clifton <nickc@redhat.com>
Date: Fri Aug 12 16:26:11 2011 +0000
builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
* builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
pattern.
* doc/md.texi (cmpstrn): Note that the comparison stops if both
fetched bytes are zero.
(cmpstr): Likewise.
(cmpmem): Note that the comparison does not stop if both of the
fetched bytes are zero.
Duplicate the cmpstrn pattern for cmpmem. The only difference is that
the length argument of cmpmem is guaranteed to be less than or equal to
the lengths of the two memory areas. Since "repz cmpsb" can be much slower
than a memcmp function implemented with vector instructions, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
expand cmpmem to "repz cmpsb" only for -minline-all-stringops.
gcc/
PR target/95151
* config/i386/i386-expand.c (ix86_expand_cmpstrn_or_cmpmem): New
function.
* config/i386/i386-protos.h (ix86_expand_cmpstrn_or_cmpmem): New
prototype.
* config/i386/i386.md (cmpmemsi): New pattern.
gcc/testsuite/
PR target/95151
* gcc.target/i386/pr95151-1.c: New test.
* gcc.target/i386/pr95151-2.c: Likewise.
* gcc.target/i386/pr95151-3.c: Likewise.
* gcc.target/i386/pr95151-4.c: Likewise.
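The semantic difference between the two patterns, as described in the quoted md.texi notes, can be shown on byte strings. This is a sketch with hypothetical Python helpers named after the patterns; both assume each buffer holds at least `length` bytes.

```python
def cmpmem(a, b, length):
    """memcmp-like: compare exactly `length` bytes, NUL bytes included."""
    for x, y in zip(a[:length], b[:length]):
        if x != y:
            return x - y
    return 0


def cmpstrn(a, b, length):
    """strncmp-like: stop early once both fetched bytes are zero."""
    for x, y in zip(a[:length], b[:length]):
        if x != y:
            return x - y
        if x == 0:
            return 0
    return 0
```

For the inputs `b"ab\0x"` and `b"ab\0y"` with length 4, `cmpstrn` stops at the shared NUL and reports equality, whereas `cmpmem` keeps going and finds the differing last byte.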
|
|
After adding vec_cmp expanders we have seen various performance-related
regressions in the testsuite. These appear to be caused by a
missing vcond_mask definition in the backend. Fixed with this patch.
The patch fixes the following testsuite failures:
FAIL: gcc.dg/vect/vect-21.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/vect-21.c scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/vect-23.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/vect-23.c scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/vect-24.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/vect-24.c scan-tree-dump-times vect "vectorized 3 loops" 1
FAIL: gcc.dg/vect/vect-live-6.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.dg/vect/vect-live-6.c scan-tree-dump vect "vectorized 1 loops"
FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times vesrab\\t%v.?,%v.?,7 6
FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times vesraf\\t%v.?,%v.?,31 6
FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times vesrah\\t%v.?,%v.?,15 6
FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times vesrlb\\t%v.?,%v.?,7 4
FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times vesrlf\\t%v.?,%v.?,31 4
FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times vesrlh\\t%v.?,%v.?,15 4
gcc/ChangeLog:
* config/s390/vector.md ("vcond_mask_<mode><mode>"): New expander.
|