Age | Commit message (Collapse) | Author | Files | Lines |
|
Since a889e06ac68 the following fails.
In file included from ../../gcc/tree-ssa-propagate.h:25:0,
from ../../gcc/config/rs6000/rs6000.c:78:
../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared
virtual bool range_of_expr (irange &r, tree name, gimple * = NULL) = 0;
^~~~~~
../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared
virtual bool range_on_edge (irange &r, edge, tree name);
^~~~~~
../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared
virtual bool range_of_stmt (irange &r, gimple *, tree name = NULL);
^~~~~~
In file included from ../../gcc/tree-ssa-propagate.h:25:0,
from ../../gcc/config/rs6000/rs6000-call.c:67:
../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared
virtual bool range_of_expr (irange &r, tree name, gimple * = NULL) = 0;
^~~~~~
../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared
virtual bool range_on_edge (irange &r, edge, tree name);
^~~~~~
../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared
virtual bool range_of_stmt (irange &r, gimple *, tree name = NULL);
gcc/ChangeLog:
* config/rs6000/rs6000-call.c: Include value-range.h.
* config/rs6000/rs6000.c: Likewise.
|
|
When running:
...
$ gcc.sh src/gcc/testsuite/gcc.target/nvptx/abi-complex-arg.c -S -dP
...
we have in abi-complex-arg.s:
...
//(insn 3 5 4 2
// (set
// (reg:QI 23)
// (truncate:QI (reg:SI 22))) "abi-complex-arg.c":38:1 29 {truncsiqi2}
// (nil))
cvt.u32.u32 %r23, %r22; // 3 [c=4] truncsiqi2/0
...
The cvt.u32.u32 can be written shorter and clearer as mov.u32.
Fix this in define_insn "truncsi<QHIM>2".
Tested on nvptx.
gcc/ChangeLog:
2020-10-01 Tom de Vries <tdevries@suse.de>
PR target/80845
* config/nvptx/nvptx.md (define_insn "truncsi<QHIM>2"): Emit mov.u32
instead of cvt.u32.u32.
|
|
This patch does several things at once:
(1) Add vector compare patterns (vec_cmp and vec_cmpu).
(2) Add vector selects between floating-point modes when the
values being compared are integers (affects vcond and vcondu).
(3) Add vector selects between integer modes when the values being
compared are floating-point (affects vcond).
(4) Add standalone vector select patterns (vcond_mask).
(5) Tweak the handling of compound comparisons with zeros.
Unfortunately it proved too difficult (for me) to separate this
out into a series of smaller patches, since everything is so
inter-related. Defining only some of the new patterns does
not leave things in a happy state.
The handling of comparisons is mostly taken from the vcond patterns.
This means that it remains non-compliant with IEEE: “quiet” comparisons
use signalling instructions. But that shouldn't matter for floats,
since we require -funsafe-math-optimizations to vectorize for them
anyway.
It remains the case that comparisons and selects aren't implemented
at all for HF vectors. Implementing those feels like separate work.
gcc/
PR target/96528
PR target/97288
* config/arm/arm-protos.h (arm_expand_vector_compare): Declare.
(arm_expand_vcond): Likewise.
* config/arm/arm.c (arm_expand_vector_compare): New function.
(arm_expand_vcond): Likewise.
* config/arm/neon.md (vec_cmp<VDQW:mode><v_cmp_result>): New pattern.
(vec_cmpu<VDQW:mode><VDQW:mode>): Likewise.
(vcond<VDQW:mode><VDQW:mode>): Require operand 5 to be a register
or zero. Use arm_expand_vcond.
(vcond<V_cvtto><V32:mode>): New pattern.
(vcondu<VDQIW:mode><VDQIW:mode>): Generalize to...
(vcondu<VDQW:mode><v_cmp_result): ...this. Require operand 5
to be a register or zero. Use arm_expand_vcond.
(vcond_mask_<VDQW:mode><v_cmp_result>): New pattern.
(neon_vc<cmp_op><mode>, neon_vc<cmp_op><mode>_insn): Add "@" marker.
(neon_vbsl<mode>): Likewise.
(neon_vc<cmp_op>u<mode>): Reexpress as...
(@neon_vc<code><mode>): ...this.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_cond_mixed): Add
arm neon targets.
* gcc.target/arm/neon-compare-1.c: New test.
* gcc.target/arm/neon-compare-2.c: Likewise.
* gcc.target/arm/neon-compare-3.c: Likewise.
* gcc.target/arm/neon-compare-4.c: Likewise.
* gcc.target/arm/neon-compare-5.c: Likewise.
* gcc.target/arm/neon-vcond-gt.c: Expect comparisons with zero.
* gcc.target/arm/neon-vcond-ltgt.c: Likewise.
* gcc.target/arm/neon-vcond-unordered.c: Likewise.
|
|
gcc/testsuite/
* gcc.target/aarch64/movtf_1.c: Restrict the asm matching to lp64.
* gcc.target/aarch64/movti_1.c: Likewise.
|
|
gcc/testsuite/
PR target/96375
* gcc.target/arm/lob1.c: Fix missing flag.
* gcc.target/arm/lob2.c: Likewise.
* gcc.target/arm/lob3.c: Likewise.
* gcc.target/arm/lob4.c: Likewise.
* gcc.target/arm/lob5.c: Likewise.
* gcc.target/arm/lob6.c: Likewise.
* lib/target-supports.exp
(check_effective_target_arm_v8_1_lob_ok): Return 1 only for
cortex-m targets, add '-mthumb' flag.
|
|
* config/i386/t-rtems: Change from mtune to march when building
multilibs. The mtune argument tunes or optimizes for a specific
CPU model but does not ensure the generated code is appropriate
for the CPU model. Prior to this patch, i386 compatible code
was always generated but tuned for later models.
|
|
gcc/ChangeLog:
* builtins.c (compute_objsize): Replace vr_values with range_query.
(get_range): Same.
(gimple_call_alloc_size): Same.
* builtins.h (class vr_values): Remove.
(gimple_call_alloc_size): Replace vr_values with range_query.
* gimple-ssa-sprintf.c (get_int_range): Same.
(struct directive): Pass gimple context to fmtfunc callback.
(directive::set_width): Replace inline with out-of-line version.
(directive::set_precision): Same.
(format_none): New gimple argument.
(format_percent): New gimple argument.
(format_integer): New gimple argument.
(format_floating): New gimple argument.
(get_string_length): Use range_query API.
(format_character): New gimple argument.
(format_string): New gimple argument.
(format_plain): New gimple argument.
(format_directive): New gimple argument.
(parse_directive): Replace vr_values with range_query.
(compute_format_length): Same.
(handle_printf_call): Same. Adjust for range_query API.
* tree-ssa-strlen.c (get_range): Same.
(compare_nonzero_chars): Same.
(get_addr_stridx) Replace vr_values with range_query.
(get_stridx): Same.
(dump_strlen_info): Same.
(get_range_strlen_dynamic): Adjust for range_query API.
(set_strlen_range): Same
(maybe_warn_overflow): Replace vr_values with range_query.
(handle_builtin_strcpy): Same.
(maybe_diag_stxncpy_trunc): Add FIXME comment.
(handle_builtin_memcpy): Replace vr_values with range_query.
(handle_builtin_memset): Same.
(get_len_or_size): Same.
(strxcmp_eqz_result): Same.
(handle_builtin_string_cmp): Same.
(count_nonzero_bytes_addr): Same, plus adjust for range_query API.
(count_nonzero_bytes): Replace vr_values with range_query.
(handle_store): Same.
(strlen_check_and_optimize_call): Same.
(handle_integral_assign): Same.
(check_and_optimize_stmt): Same.
* tree-ssa-strlen.h (class vr_values): Remove.
(get_range): Replace vr_values with range_query.
(get_range_strlen_dynamic): Same.
(handle_printf_call): Same.
|
|
gcc/ChangeLog:
* gimple-loop-versioning.cc (lv_dom_walker::before_dom_children):
Pass m_range_analyzer instead of get_vr_values.
(loop_versioning::name_prop::get_value): Rename to...
(loop_versioning::name_prop::value_of_expr): ...this.
* gimple-ssa-evrp-analyze.c (evrp_range_analyzer::evrp_range_analyzer):
Adjust for evrp_range_analyzer
inheriting from vr_values.
(evrp_range_analyzer::try_find_new_range): Same.
(evrp_range_analyzer::record_ranges_from_incoming_edge): Same.
(evrp_range_analyzer::record_ranges_from_phis): Same.
(evrp_range_analyzer::record_ranges_from_stmt): Same.
(evrp_range_analyzer::push_value_range): Same.
(evrp_range_analyzer::pop_value_range): Same.
* gimple-ssa-evrp-analyze.h (class evrp_range_analyzer): Inherit from
vr_values. Adjust accordingly.
* gimple-ssa-evrp.c: Adjust for evrp_range_analyzer inheriting from
vr_values.
(evrp_folder::value_of_evrp): Rename from get_value.
* tree-ssa-ccp.c (class ccp_folder): Rename get_value to
value_of_expr.
(ccp_folder::get_value): Rename to...
(ccp_folder::value_of_expr): ...this.
* tree-ssa-copy.c (class copy_folder): Rename get_value to
value_of_expr.
(copy_folder::get_value): Rename to...
(copy_folder::value_of_expr): ...this.
* tree-ssa-dom.c (dom_opt_dom_walker::after_dom_children): Adjust
for evrp_range_analyzer inheriting from vr_values.
(dom_opt_dom_walker::optimize_stmt): Same.
* tree-ssa-propagate.c (substitute_and_fold_engine::replace_uses_in):
Call value_of_* instead of get_value.
(substitute_and_fold_engine::replace_phi_args_in): Same.
(substitute_and_fold_engine::propagate_into_phi_args): Same.
(substitute_and_fold_dom_walker::before_dom_children): Same.
* tree-ssa-propagate.h: Include value-query.h.
(class substitute_and_fold_engine): Inherit from value_query.
* tree-ssa-strlen.c (strlen_dom_walker::before_dom_children):
Adjust for evrp_range_analyzer inheriting from vr_values.
* tree-ssa-threadedge.c (record_temporary_equivalences_from_phis):
Same.
* tree-vrp.c (class vrp_folder): Same.
(vrp_folder::get_value): Rename to value_of_expr.
* vr-values.c (vr_values::get_lattice_entry): Adjust for
vr_values inheriting from range_query.
(vr_values::range_of_expr): New.
(vr_values::value_of_expr): New.
(vr_values::value_on_edge): New.
(vr_values::value_of_stmt): New.
(simplify_using_ranges::op_with_boolean_value_range_p): Call
get_value_range through query.
(check_for_binary_op_overflow): Rename store to query.
(vr_values::vr_values): Remove vrp_value_range_pool.
(vr_values::~vr_values): Same.
(simplify_using_ranges::get_vr_for_comparison): Call get_value_range
through query.
(simplify_using_ranges::compare_names): Same.
(simplify_using_ranges::vrp_evaluate_conditional): Same.
(simplify_using_ranges::vrp_visit_cond_stmt): Same.
(simplify_using_ranges::simplify_abs_using_ranges): Same.
(simplify_using_ranges::simplify_cond_using_ranges_1): Same.
(simplify_cond_using_ranges_2): Same.
(simplify_using_ranges::simplify_switch_using_ranges): Same.
(simplify_using_ranges::two_valued_val_range_p): Same.
(simplify_using_ranges::simplify_using_ranges): Rename store to query.
(simplify_using_ranges::simplify): Assert that we have a query.
* vr-values.h (class range_query): Remove.
(class simplify_using_ranges): Remove inheritance of range_query.
(class vr_values): Add virtuals for range_of_expr, value_of_expr,
value_on_edge, value_of_stmt, and get_value_range.
Call range_query allocator instead of using vrp_value_range_pool.
Remove vrp_value_range_pool.
(simplify_using_ranges::get_value_range): Remove.
|
|
This avoids using VMAT_CONTIGUOUS with single-element interleaving
when using V1mode vectors. Instead keep VMAT_ELEMENTWISE but
continue to avoid load-lanes and gathers.
2020-10-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/97236
* tree-vect-stmts.c (get_group_load_store_type): Keep
VMAT_ELEMENTWISE for single-element vectors.
* gcc.dg/vect/pr97236.c: New testcase.
|
|
I discovered pushdecl_top_level was not setting the decl's context,
and we ended up with namespace-scope decls with NULL context. That
broke modules. Then I discovered a couple of places where we set the
context to a FUNCTION_DECL, which is also wrong. AFAICT the literals
in question belong in global scope, as they're comdatable entities.
But create_temporary would use current_scope for the context before we
pushed it into namespace scope.
This patch asserts the context is NULL and then sets it to the frobbed
global_namespace.
gcc/cp/
* name-lookup.c (pushdecl_top_level): Assert incoming context is
null, add global_namespace context.
(pushdecl_top_level_and_finish): Likewise.
* pt.c (get_template_parm_object): Clear decl context before
pushing.
* semantics.c (finish_compound_literal): Likewise.
|
|
PR ipa/97243
* gcc.c-torture/compile/pr97243.c: New test.
|
|
gcc/ChangeLog:
* ipa-modref.c (compute_parm_map): Be ready for callee_pi to be NULL.
|
|
PR ipa/97244
* gcc.dg/ipa/remref-2a.c: Add -fno-ipa-modref
|
|
PR ipa/97244
* ipa-fnsummary.c (pass_free_fnsummary::execute): Free
also indirect inlining datastructure.
* ipa-modref.c (pass_ipa_modref::execute): Do not free them here.
* ipa-prop.c (ipa_free_all_node_params): Do not crash when info does
not exist.
(ipa_unregister_cgraph_hooks): Likewise.
|
|
* internal-fn.c (DEF_INTERNAL_FN): Fix handling of fnspec
|
|
gcc/ChangeLog:
* Makefile.in: Add value-query.o.
* value-query.cc: New file.
* value-query.h: New file.
|
|
It turns out I'd already found lookup_and_check_tag's control flow
confusing, and had refactored it on the modules branch. For instance,
it continually checks 'if (decl &&$ condition)' before finally getting
to 'else if (!decl)'. why not just check !decl first and be done?
Well, it is done thusly.
gcc/cp/
* decl.c (lookup_and_check_tag): Refactor.
|
|
This moves the recent entry for Neoverse N2 down and adds a comment in
order to preserve the existing order/structure in arm-cpus.in.
gcc/ChangeLog:
* config/arm/arm-cpus.in: Fix ordering, move Neoverse N2 down.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
|
|
When compiling test-case pr94600-1.c for nvptx, this gimple mem move:
...
MEM[(volatile struct t0 *)655404B] ={v} a0[0];
...
is expanded into a memcpy, but when compiling pr94600-2.c instead, this similar
gimple mem move:
...
MEM[(volatile struct t0 *)655404B] ={v} a00;
...
is expanded into a 32-bit load/store pair.
In both cases, emit_block_move is called.
In the latter case, can_move_by_pieces (4 /* byte-size */, 32 /* bit-align */)
is called, which returns true (because by_pieces_ninsns returns 1, which is
smaller than the MOVE_RATIO of 4).
In the former case, can_move_by_pieces (4 /* byte-size */, 8 /* bit-align */)
is called, which returns false (because by_pieces_ninsns returns 4, which is
not smaller than the MOVE_RATIO of 4).
So the difference in code generation is explained by the alignment. The
difference in alignment comes from the move sources: a0[0] vs. a00. Both
have the same type with 8-bit alignment, but a00 is on stack, which based on
the base stack align and stack variable placement happens to result in a
32-bit alignment.
Enable test-cases pr94600-{1,3}.c for nvptx by forcing the currently 8-byte
aligned variables to have a 32-bit alignment for STRICT_ALIGNMENT targets.
Tested on nvptx.
gcc/testsuite/ChangeLog:
2020-10-01 Tom de Vries <tdevries@suse.de>
* gcc.dg/pr94600-1.c: Force 32-bit alignment for a0 for !non_strict_align
targets. Remove target clauses from scan tests.
* gcc.dg/pr94600-3.c: Same.
|
|
> > The following testcase is miscompiled (in particular the a and i
> > initialization). The problem is that build_special_member_call due to
> > the immediate constructors (but not evaluated in constant expression mode)
> > doesn't create a CALL_EXPR, but returns a TARGET_EXPR with CONSTRUCTOR
> > as the initializer for it,
>
> That seems like the bug; at the end of build_over_call, after you
>
> > call = cxx_constant_value (call, obj_arg);
>
> You need to build an INIT_EXPR if obj_arg isn't a dummy.
That works. obj_arg is NULL if it is a dummy from the earlier code.
2020-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/96994
* call.c (build_over_call): If obj_arg is non-NULL, return INIT_EXPR
setting obj_arg to call.
* g++.dg/cpp2a/consteval18.C: New test.
|
|
[PR97195]
As mentioned in the PR, we only support due to a bug in constant expressions
std::construct_at on non-automatic variables, because we VERIFY_CONSTANT the
second argument of placement new, which fails verification if it is an
address of an automatic variable.
The following patch fixes it by not performing that verification, the
placement new evaluation later on will verify it after it is dereferenced.
2020-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/97195
* constexpr.c (cxx_eval_call_expression): Don't VERIFY_CONSTANT the
second argument.
* g++.dg/cpp2a/constexpr-new14.C: New test.
|
|
The following patch fixes
-FAIL: gcc.dg/pr94780.c (internal compiler error)
-FAIL: gcc.dg/pr94780.c (test for excess errors)
-FAIL: gcc.dg/pr94842.c (internal compiler error)
-FAIL: gcc.dg/pr94842.c (test for excess errors)
on s390x-linux. The fix is essentially the same as has been applied to many
other targets (i386, aarch64, arm, rs6000, alpha, riscv).
2020-10-01 Jakub Jelinek <jakub@redhat.com>
* config/s390/s390.c (s390_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
fenv_var and old_fpc. Formatting fixes.
|
|
SRA tends to use VIEW_CONVERT_EXPR when replacing bool fields with
unsigned char fields. Those are not handled in vector bool pattern
detection causing vector true values to leak. The following fixes
this by turning those into b ? 1 : 0 as well.
2020-10-01 Richard Biener <rguenther@suse.de>
* tree-vect-patterns.c (vect_recog_bool_pattern): Also handle
VIEW_CONVERT_EXPR.
* g++.dg/vect/pr97255.cc: New testcase.
|
|
levels for x86-64
These micro-architecture levels are defined in the x86-64 psABI:
https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.
The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").
gcc/:
PR target/97250
* config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
(PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
* common/config/i386/i386-common.c (processor_alias_table):
Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
* config/i386/i386-options.c (ix86_option_override_internal):
Handle new PTA_NO_TUNE processor table entries.
* doc/invoke.texi (x86 Options): Document new -march values.
gcc/testsuite/:
PR target/97250
* gcc.target/i386/x86-64-v2.c: New test.
* gcc.target/i386/x86-64-v3.c: New test.
* gcc.target/i386/x86-64-v3-haswell.c: New test.
* gcc.target/i386/x86-64-v3-skylake.c: New test.
* gcc.target/i386/x86-64-v4.c: New test.
|
|
Add support for the 32-bit RISC-V (RV32) ISA matching the 64-bit RISC-V
(RV64) port except for async preemption added as a stub only.
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/251179
|
|
Generate assembly with .localentry,1 functions using @notoc calls.
This patch makes libgcc.a asm look the same as power10 pcrel as far as
toc/notoc is concerned.
Otherwise calling between functions that advertise as using the TOC
and those that don't, will require linker call stubs in statically
linked code.
gcc/
* config/rs6000/ppc-asm.h: Support __PCREL__ code.
libgcc/
* config/rs6000/morestack.S,
* config/rs6000/tramp.S: Support __PCREL__ code.
libitm/
* config/powerpc/sjlj.S: Support __PCREL__ code.
|
|
We've had this hack in the libgcc config to build libgcc with
-mcmodel=small for powerpc64 for a long time. It wouldn't be a bad
thing if someone who knows the multilib machinery well could arrange
for -mcmodel=small to be passed just for ppc64 when building for
earlier than power10. But for now, make -mno-minimal-toc do nothing
when pcrel. Which will do the right thing for any project that has
copied libgcc's trick.
We want this if configuring using --with-cpu=power10 to build a
power10 pcrel libgcc. --mcmodel=small turns off pcrel.
gcc/
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't
set -mcmodel=small for -mno-minimal-toc when pcrel.
libgcc/
* config/rs6000/t-linux: Document purpose of -mno-minimal-toc.
|
|
This PR points out that we accept
template<typename T> struct tuple { tuple(T); }; // #1
template<typename T> explicit tuple(T t) -> tuple<T>; // #2
tuple t = { 1 };
despite the 'explicit' deduction guide in a copy-list-initialization
context. That's because in deduction_guides_for we first find the
user-defined deduction guide (#2), and then ctor_deduction_guides_for
creates artificial deduction guides: one from the tuple(T) constructor and
a copy guide. So we end up with these three guides:
(1) template<class T> tuple(T) -> tuple<T> [DECL_NONCONVERTING_P]
(2) template<class T> tuple(tuple<T>) -> tuple<T>
(3) template<class T> tuple(T) -> tuple<T>
Then, in do_class_deduction, we prune this set, and get rid of (1).
Then overload resolution selects (3) and we succeed.
But [over.match.list]p1 says "In copy-list-initialization, if an explicit
constructor is chosen, the initialization is ill-formed." It also goes
on to say that this differs from other situations where only converting
constructors are considered for copy-initialization. Therefore for
list-initialization we consider explicit constructors and complain if one
is chosen. E.g. convert_like_internal/ck_user can give an error.
So my logic runs that we should not prune the deduction_guides_for guides
in a copy-list-initialization context, and only complain if we actually
choose an explicit deduction guide. This matches clang++/EDG/msvc++.
gcc/cp/ChangeLog:
PR c++/90210
* pt.c (do_class_deduction): Don't prune explicit deduction guides
in copy-list-initialization. In copy-list-initialization, if an
explicit deduction guide was selected, give an error.
gcc/testsuite/ChangeLog:
PR c++/90210
* g++.dg/cpp1z/class-deduction73.C: New test.
|
|
|
|
(PR middle-end/97189).
Resolves:
PR middle-end/97189 - ICE on redeclaration of a function with VLA argument and attribute access
gcc/ChangeLog:
PR middle-end/97189
* attribs.c (attr_access::array_as_string): Avoid assuming a VLA
access specification string contains a closing bracket.
gcc/c-family/ChangeLog:
PR middle-end/97189
* c-attribs.c (append_access_attr): Use the function declaration
location for a warning about an attribute access argument.
gcc/testsuite/ChangeLog:
PR middle-end/97189
* gcc.dg/attr-access-2.c: Adjust caret location.
* gcc.dg/Wvla-parameter-6.c: New test.
* gcc.dg/Wvla-parameter-7.c: New test.
|
|
gcc/ChangeLog:
PR c/97206
* attribs.c (attr_access::array_as_string): Avoid modifying a shared
type in place and use build_type_attribute_qual_variant instead.
gcc/testsuite/ChangeLog:
PR c/97206
* gcc.dg/Warray-parameter-7.c: New test.
* gcc.dg/Warray-parameter-8.c: New test.
* gcc.dg/Wvla-parameter-5.c: New test.
|
|
* trans-decl.c (gfc_build_intrinsic_function_decls): Add traling dots
to spec strings so they match the number of parameters; do not use
R and W for non-pointer parameters. Drop pointless specifier on
caf_stop_numeric and caf_get_team.
|
|
2020-09-30 Jan Hubicka <hubicka@ucw.cz>
* trans-io.c (gfc_build_io_library_fndecls): Add trailing dots so
length of spec string matches number of arguments.
|
|
Add a testcase for PR target/96827 which was fixed by r11-3559:
commit 97b798d80baf945ea28236eef3fa69f36626b579
Author: Joel Hutton <joel.hutton@arm.com>
Date: Wed Sep 30 15:08:13 2020 +0100
[SLP][VECT] Add check to fix 96837
PR target/96827
* gcc.target/i386/pr96827.c: New test.
|
|
Since r204778 (g571880a0a4c512195aa7d41929ba6795190887b2), we favor
branches over IT blocks on Cortex-M. As a result, instead of
generating two nested IT blocks in thumb2-cond-cmp-[1234].c, we
generate either a single IT block, or use branches depending on
conditions tested by the program.
Since this was a deliberate change and the tests still pass as
expected on Cortex-A, this patch skips them when targetting
Cortex-M. The avoids the failures on Cortex M3, M4, and M33. This
patch makes the testcases unsupported on Cortex-M7 although they pass
in this case because this CPU has different branch costs.
I tried to relax the scan-assembler directives using eg. cmpne|subne
or cmpgt|ble but that seemed fragile.
2020-09-07 Christophe Lyon <christophe.lyon@linaro.org>
gcc/testsuite/
PR target/94595
* gcc.target/arm/thumb2-cond-cmp-1.c: Skip if arm_cortex_m.
* gcc.target/arm/thumb2-cond-cmp-2.c: Skip if arm_cortex_m.
* gcc.target/arm/thumb2-cond-cmp-3.c: Skip if arm_cortex_m.
* gcc.target/arm/thumb2-cond-cmp-4.c: Skip if arm_cortex_m.
|
|
This amends SLP reduction testcases that currently trigger
vect_attempt_slp_rearrange_stmts eliding load permutations to
verify this is actually happening.
2020-09-30 Richard Biener <rguenther@suse.de>
* gcc.dg/vect/pr37027.c: Amend.
* gcc.dg/vect/pr67790.c: Likewise.
* gcc.dg/vect/pr92324-4.c: Likewise.
* gcc.dg/vect/pr92558.c: Likewise.
* gcc.dg/vect/pr95495.c: Likewise.
* gcc.dg/vect/slp-reduc-1.c: Likewise.
* gcc.dg/vect/slp-reduc-2.c: Likewise.
* gcc.dg/vect/slp-reduc-3.c: Likewise.
* gcc.dg/vect/slp-reduc-4.c: Likewise.
* gcc.dg/vect/slp-reduc-5.c: Likewise.
* gcc.dg/vect/slp-reduc-7.c: Likewise.
* gcc.dg/vect/vect-reduc-in-order-4.c: Likewise.
|
|
This patch introduces support for Cortex-A78 [0] and Cortex-A78AE [1]
cpus.
[0]: https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78
[1]: https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78ae
OK for master branch ?
kind regards
Przemyslaw Wirkus
gcc/ChangeLog:
* config/arm/arm-cpus.in: Add Cortex-A78 and Cortex-A78AE cores.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Update docs.
|
|
This patch introduces support for Cortex-A78 [0] and Cortex-A78AE [1]
cpus.
[0]: https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78
[1]: https://www.arm.com/products/silicon-ip-cpu/cortex-a/cortex-a78ae
OK for master branch ?
kind regards
Przemyslaw Wirkus
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def: Add Cortex-A78 and Cortex-A78AE cores.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi: Add -mtune=cortex-a78 and -mtune=cortex-a78ae.
|
|
__ARM_undef type (pr96795).
Hello,
This patch fixes (PR96795) MVE intrinsic polymorphic variants vaddq, vaddq_m, vaddq_x, vcmpeqq_m,
vcmpeqq, vcmpgeq_m, vcmpgeq, vcmpgtq_m, vcmpgtq, vcmpleq_m, vcmpleq, vcmpltq_m, vcmpltq,
vcmpneq_m, vcmpneq, vfmaq_m, vfmaq, vfmasq_m, vfmasq, vmaxnmavq, vmaxnmavq_p, vmaxnmvq,
vmaxnmvq_p, vminnmavq, vminnmavq_p, vminnmvq, vminnmvq_p, vmulq_m, vmulq, vmulq_x, vsetq_lane,
vsubq_m, vsubq and vsubq_x which are incorrectly generating __ARM_undef and mismatching the passed
floating point scalar arguments.
Bootstrapped on arm-none-linux-gnueabihf and regression tested on arm-none-eabi and found no regressions.
Ok for master? Ok for GCC-10 branch?
Regards,
Srinath.
gcc/ChangeLog:
2020-09-30 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/96795
* config/arm/arm_mve.h (__ARM_mve_coerce2): Define.
(__arm_vaddq): Correct the scalar argument.
(__arm_vaddq_m): Likewise.
(__arm_vaddq_x): Likewise.
(__arm_vcmpeqq_m): Likewise.
(__arm_vcmpeqq): Likewise.
(__arm_vcmpgeq_m): Likewise.
(__arm_vcmpgeq): Likewise.
(__arm_vcmpgtq_m): Likewise.
(__arm_vcmpgtq): Likewise.
(__arm_vcmpleq_m): Likewise.
(__arm_vcmpleq): Likewise.
(__arm_vcmpltq_m): Likewise.
(__arm_vcmpltq): Likewise.
(__arm_vcmpneq_m): Likewise.
(__arm_vcmpneq): Likewise.
(__arm_vfmaq_m): Likewise.
(__arm_vfmaq): Likewise.
(__arm_vfmasq_m): Likewise.
(__arm_vfmasq): Likewise.
(__arm_vmaxnmavq): Likewise.
(__arm_vmaxnmavq_p): Likewise.
(__arm_vmaxnmvq): Likewise.
(__arm_vmaxnmvq_p): Likewise.
(__arm_vminnmavq): Likewise.
(__arm_vminnmavq_p): Likewise.
(__arm_vminnmvq): Likewise.
(__arm_vminnmvq_p): Likewise.
(__arm_vmulq_m): Likewise.
(__arm_vmulq): Likewise.
(__arm_vmulq_x): Likewise.
(__arm_vsetq_lane): Likewise.
(__arm_vsubq_m): Likewise.
(__arm_vsubq): Likewise.
(__arm_vsubq_x): Likewise.
gcc/testsuite/ChangeLog:
PR target/96795
* gcc.target/arm/mve/intrinsics/mve_fp_vaddq_n.c: New Test.
* gcc.target/arm/mve/intrinsics/mve_vaddq_n.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpeqq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpeqq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgeq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgeq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpleq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpleq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpleq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpleq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpltq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpltq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpltq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpltq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmpneq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmaq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vfmasq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_p_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_p_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_p_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_p_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_f32-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_f16-1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_f32-1.c: Likewise.
|
|
The following patch adds a simple check to prevent slp stmts from
vector constructors being rearranged. vect_attempt_slp_rearrange_stmts
tries to rearrange to avoid a load permutation.
This fixes PR target/96837
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96827
gcc/ChangeLog:
2020-09-29 Joel Hutton <joel.hutton@arm.com>
PR target/96837
* tree-vect-slp.c (vect_analyze_slp): Do not call
vect_attempt_slp_rearrange_stmts for vector constructors.
gcc/testsuite/ChangeLog:
2020-09-29 Joel Hutton <joel.hutton@arm.com>
PR target/96837
* gcc.dg/vect/bb-slp-49.c: New test.
|
|
This is a small refactoring which introduces SLP_TREE_REF_COUNT and replaces
the uses of refcnt with it. This for consistency between the other properties.
A similar patch was pre-approved last year but since there are more use now I am
sending it for review anyway.
gcc/ChangeLog:
* tree-vectorizer.h (SLP_TREE_REF_COUNT): New.
* tree-vect-slp.c (_slp_tree::_slp_tree, _slp_tree::~_slp_tree,
vect_free_slp_tree, vect_build_slp_tree, vect_print_slp_tree,
slp_copy_subtree, vect_attempt_slp_rearrange_stmts): Use it.
|
|
Now hiddenness is managed by name-lookup, we no longer need DECL_HIDDEN_FRIEND_P.
This removes it. Mainly by deleting its bookkeeping, but there are a couple of uses
1) two name lookups look at it to see if they found a hidden thing.
In one we have the OVERLOAD, so can record OVL_HIDDEN_P. In the other
we're repeating a lookup that failed, but asking for hidden things --
so if that succeeds we know the thing was hidden. (FWIW CWG recently
discussed whether template specializations and instantiations should
see such hidden templates anyway, there is compiler divergence.)
2) We had a confusing setting of KOENIG_P when building a
non-dependent call. We don't repeat that lookup at instantiation time
anyway.
gcc/cp/
* cp-tree.h (struct lang_decl_fn): Remove hidden_friend_p.
(DECL_HIDDEN_FRIEND_P): Delete.
* call.c (add_function_candidate): Drop assert about anticipated
decl.
(build_new_op_1): Drop koenig lookup flagging for hidden friend.
* decl.c (duplicate_decls): Drop HIDDEN_FRIEND_P updating.
* name-lookup.c (do_pushdecl): Likewise.
(set_decl_namespace): Discover hiddenness from OVL_HIDDEN_P.
* pt.c (check_explicit_specialization): Record found_hidden
explicitly.
|
|
gcc/fortran/ChangeLog:
PR fortran/97242
* expr.c (gfc_is_not_contiguous): Fix check.
(gfc_check_pointer_assign): Use it.
gcc/testsuite/ChangeLog:
PR fortran/97242
* gfortran.dg/contiguous_11.f90: New test.
* gfortran.dg/contiguous_4.f90: Update.
* gfortran.dg/contiguous_7.f90: Update.
|
|
gcc/ChangeLog:
* omp-offload.c (omp_discover_implicit_declare_target): Also
handled nested functions.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/declare-target-3.f90: New test.
|
|
2020-30-09 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/97045
* trans-array.c (gfc_conv_array_ref): Make sure that the class
decl is passed to build_array_ref in the case of unlimited
polymorphic entities.
* trans-expr.c (gfc_conv_derived_to_class): Ensure that array
refs do not preceed the _len component. Free the _len expr.
* trans-stmt.c (trans_associate_var): Reset 'need_len_assign'
for polymorphic scalars.
* trans.c (gfc_build_array_ref): When the vptr size is used for
span, multiply by the _len field of unlimited polymorphic
entities, when non-zero.
gcc/testsuite/
PR fortran/97045
* gfortran.dg/select_type_50.f90 : New test.
|
|
GCC has a target hook TARGET_LIBC_HAS_FUNCTION, which tells the compiler
which functions it can expect to be present in libc.
The default target hook does not include the sincos functions.
The nvptx port of newlib does include sincos and sincosf, but not sincosl.
The target hook TARGET_LIBC_HAS_FUNCTION does not distinguish between sincos,
sincosf and sincosl, so if we enable it for the sincos functions, then for
test.c:
...
long double x, a, b;
int main (void) {
x = 0.5;
a = sinl (x);
b = cosl (x);
printf ("a: %f\n", (double)a);
printf ("b: %f\n", (double)b);
return 0;
}
...
we introduce a regression:
...
$ gcc test.c -lm -O2
unresolved symbol sincosl
collect2: error: ld returned 1 exit status
...
Add a type argument to target hook TARGET_LIBC_HAS_FUNCTION_TYPE, and use it
in nvptx_libc_has_function_type to enable sincos and sincosf, but not sincosl.
Build and reg-tested on x86_64-linux.
Build and tested on nvptx.
gcc/ChangeLog:
2020-09-28 Tobias Burnus <tobias@codesourcery.com>
Tom de Vries <tdevries@suse.de>
* builtins.c (expand_builtin_cexpi, fold_builtin_sincos): Update
targetm.libc_has_function call.
* builtins.def (DEF_C94_BUILTIN, DEF_C99_BUILTIN, DEF_C11_BUILTIN):
(DEF_C2X_BUILTIN, DEF_C99_COMPL_BUILTIN, DEF_C99_C90RES_BUILTIN):
Same.
* config/darwin-protos.h (darwin_libc_has_function): Update prototype.
* config/darwin.c (darwin_libc_has_function): Add arg.
* config/linux-protos.h (linux_libc_has_function): Update prototype.
* config/linux.c (linux_libc_has_function): Add arg.
* config/i386/i386.c (ix86_libc_has_function): Update
targetm.libc_has_function call.
* config/nvptx/nvptx.c (nvptx_libc_has_function): New function.
(TARGET_LIBC_HAS_FUNCTION): Redefine to nvptx_libc_has_function.
* convert.c (convert_to_integer_1): Update targetm.libc_has_function
call.
* match.pd: Same.
* target.def (libc_has_function): Add arg.
* doc/tm.texi: Regenerate.
* targhooks.c (default_libc_has_function, gnu_libc_has_function)
(no_c99_libc_has_function): Add arg.
* targhooks.h (default_libc_has_function, no_c99_libc_has_function)
(gnu_libc_has_function): Update prototype.
* tree-ssa-math-opts.c (pass_cse_sincos::execute): Update
targetm.libc_has_function call.
gcc/fortran/ChangeLog:
2020-09-30 Tom de Vries <tdevries@suse.de>
* f95-lang.c (gfc_init_builtin_functions): Update
targetm.libc_has_function call.
|
|
Since MOVDIRI and MOVDIR64B write to memory, similar to UNSPEC_MOVNT,
use SET operation in MOVDIRI and MOVDIR64B patterns with UNSPEC instead
of UNSPECV.
gcc/
PR target/97184
* config/i386/i386.md (UNSPECV_MOVDIRI): Renamed to ...
(UNSPEC_MOVDIRI): This.
(UNSPECV_MOVDIR64B): Renamed to ...
(UNSPEC_MOVDIR64B): This.
(movdiri<mode>): Use SET operation.
(@movdir64b_<mode>): Likewise.
gcc/testsuite/
PR target/97184
* gcc.target/i386/movdir64b.c: New test.
* gcc.target/i386/movdiri32.c: Likewise.
* gcc.target/i386/movdiri64.c: Likewise.
* lib/target-supports.exp (check_effective_target_movdir): New.
|
|
Before commit 7e437162001 "[testsuite] Require non_strict_align in
pr94600-{1,3}.c", some tests were failing for nvptx, because volatile stores
were expected, but memcpy's were found instead.
This was traced back to this bit in compute_record_mode:
...
/* If structure's known alignment is less than what the scalar
mode would need, and it matters, then stick with BLKmode. */
if (mode != BLKmode
&& STRICT_ALIGNMENT
&& ! (TYPE_ALIGN (type) >= BIGGEST_ALIGNMENT
|| TYPE_ALIGN (type) >= GET_MODE_ALIGNMENT (mode)))
{
/* If this is the only reason this type is BLKmode, then
don't force containing types to be BLKmode. */
TYPE_NO_FORCE_BLK (type) = 1;
mode = BLKmode;
}
...
which got triggered for nvptx, but not for x86_64.
The commit disabled the tests for non_strict_align effective target, but
that had the effect for the arm target that those tests were disabled, even
though they were passing before.
Further investigation in compute_record_mode shows that the if-condition
evaluates to false for arm because, because TYPE_ALIGN (type) == 32, while
it's 8 for nvptx. This again can be explained by the
PCC_BITFIELD_TYPE_MATTERS setting, which is 1 for arm, but 0 for nvptx.
Re-enable the test for arm by using effective target
(non_strict_align || pcc_bitfield_type_matters).
Tested on arm-eabi and nvptx.
gcc/testsuite/ChangeLog:
2020-09-30 Tom de Vries <tdevries@suse.de>
* gcc.dg/pr94600-1.c: Use effective target
(non_strict_align || pcc_bitfield_type_matters).
* gcc.dg/pr94600-3.c: Same.
|
|
gcc/
* config/i386/i386-c.c (ix86_target_macros_internal): Define
__LAHF_SAHF__ and __MOVBE__ based on ISA flags.
|
|
These tests were missing dg-requires-effective-targets to ensure they
are UNSUPPORTED if the assembler doesn't have AMX support.
2020-09-30 Jakub Jelinek <jakub@redhat.com>
* gcc.target/i386/amxint8-dpbssd-2.c: Require effective targets
amx_tile and amx_int8.
* gcc.target/i386/amxint8-dpbsud-2.c: Likewise.
* gcc.target/i386/amxint8-dpbusd-2.c: Likewise.
* gcc.target/i386/amxint8-dpbuud-2.c: Likewise.
* gcc.target/i386/amxbf16-dpbf16ps-2.c: Require effective targets
amx_tile and amx_bf16.
* gcc.target/i386/amxtile-2.c: Require effective target amx_tile.
|