Age | Commit message (Collapse) | Author | Files | Lines |
|
Hi,
as we discussed with Honza on the mailin glist last week, making
cached call context structure distinct from the normal one may make it
clearer that the cached data need to be explicitely deallocated.
This patch does that division. It is not mandatory for the overall
main goals of the patch set and can be dropped if deemed superfluous.
gcc/ChangeLog:
2020-09-02 Martin Jambor <mjambor@suse.cz>
* ipa-fnsummary.h (ipa_cached_call_context): New forward declaration
and class.
(class ipa_call_context): Make friend ipa_cached_call_context. Moved
methods duplicate_from and release to it too.
* ipa-fnsummary.c (ipa_call_context::duplicate_from): Moved to class
ipa_cached_call_context.
(ipa_call_context::release): Likewise, removed the parameter.
* ipa-inline-analysis.c (node_context_cache_entry): Change the type of
ctx to ipa_cached_call_context.
(do_estimate_edge_time): Remove parameter from the call to
ipa_cached_call_context::release.
|
|
Hi,
this large patch is mostly mechanical change which aims to replace
uses of separate vectors about known scalar values (usually called
known_vals or known_csts), known aggregate values (known_aggs), known
virtual call contexts (known_contexts) and known value
ranges (known_value_ranges) with uses of either new type
ipa_call_arg_values or ipa_auto_call_arg_values, both of which simply
contain these vectors inside them.
The need for two distinct comes from the fact that when the vectors
are constructed from jump functions or lattices, we really should use
auto_vecs with embedded storage allocated on stack. On the other hand,
the bundle in ipa_call_context can be allocated on heap when in cache,
one time for each call_graph node.
ipa_call_context is constructible from ipa_auto_call_arg_values but
then its vectors must not be resized, otherwise the vectors will stop
pointing to the stack ones. Unfortunately, I don't think the
structure embedded in ipa_call_context can be made constant because we
need to manipulate and deallocate it when in cache.
gcc/ChangeLog:
2020-09-01 Martin Jambor <mjambor@suse.cz>
* ipa-prop.h (ipa_auto_call_arg_values): New type.
(class ipa_call_arg_values): Likewise.
(ipa_get_indirect_edge_target): Replaced vector arguments with
ipa_call_arg_values in declaration. Added an overload for
ipa_auto_call_arg_values.
* ipa-fnsummary.h (ipa_call_context): Removed members m_known_vals,
m_known_contexts, m_known_aggs, duplicate_from, release and equal_to,
new members m_avals, store_to_cache and equivalent_to_p. Adjusted
construcotr arguments.
(estimate_ipcp_clone_size_and_time): Replaced vector arguments
with ipa_auto_call_arg_values in declaration.
(evaluate_properties_for_edge): Likewise.
* ipa-cp.c (ipa_get_indirect_edge_target): Adjusted to work on
ipa_call_arg_values rather than on separate vectors. Added an
overload for ipa_auto_call_arg_values.
(devirtualization_time_bonus): Adjusted to work on
ipa_auto_call_arg_values rather than on separate vectors.
(gather_context_independent_values): Adjusted to work on
ipa_auto_call_arg_values rather than on separate vectors.
(perform_estimation_of_a_value): Likewise.
(estimate_local_effects): Likewise.
(modify_known_vectors_with_val): Adjusted both variants to work on
ipa_auto_call_arg_values and rename them to
copy_known_vectors_add_val.
(decide_about_value): Adjusted to work on ipa_call_arg_values rather
than on separate vectors.
(decide_whether_version_node): Likewise.
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Likewise.
(evaluate_properties_for_edge): Likewise.
(ipa_fn_summary_t::duplicate): Likewise.
(estimate_edge_devirt_benefit): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(estimate_edge_size_and_time): Likewise.
(estimate_calls_size_and_time_1): Likewise.
(summarize_calls_size_and_time): Adjusted calls to
estimate_edge_size_and_time.
(estimate_calls_size_and_time): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(ipa_call_context::ipa_call_context): Construct from a pointer to
ipa_auto_call_arg_values instead of inividual vectors.
(ipa_call_context::duplicate_from): Adjusted to access vectors within
m_avals.
(ipa_call_context::release): Likewise.
(ipa_call_context::equal_to): Likewise.
(ipa_call_context::estimate_size_and_time): Adjusted to work on
ipa_call_arg_values rather than on separate vectors.
(estimate_ipcp_clone_size_and_time): Adjusted to work with
ipa_auto_call_arg_values rather than on separate vectors.
(ipa_merge_fn_summary_after_inlining): Likewise. Adjusted call to
estimate_edge_size_and_time.
(ipa_update_overall_fn_summary): Adjusted call to
estimate_edge_size_and_time.
* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to work with
ipa_auto_call_arg_values rather than with separate vectors.
(do_estimate_edge_size): Likewise.
(do_estimate_edge_hints): Likewise.
* ipa-prop.c (ipa_auto_call_arg_values::~ipa_auto_call_arg_values):
New destructor.
|
|
I noticed that the following changes from this paper were not yet
implemented.
libstdc++-v3/ChangeLog:
* include/bits/stl_iterator.h (reverse_iterator::iter_move):
Define for C++20 as per P0896.
(reverse_iterator::iter_swap): Likewise.
(move_iterator::operator*): Apply P0896 changes for C++20.
(move_iterator::operator[]): Likewise.
* testsuite/24_iterators/reverse_iterator/cust.cc: New test.
|
|
This patch fixes an issue with vmin* and vmax* intrinsics which accept
a scalar argument. Previously when the scalar was of different width
to the vector elements this would generate __ARM_undef. This change
allows the scalar argument to be implicitly converted to the correct
width. Also tidied up the relevant unit tests, some of which would
have passed even if only one of two or three intrinsic calls had
compiled correctly.
Bootstrapped and tested on arm-none-eabi, gcc and CMSIS_DSP
testsuites are clean. OK for trunk?
Thanks,
Joe
gcc/ChangeLog:
2020-08-10 Joe Ramsay <joe.ramsay@arm.com>
* config/arm/arm_mve.h (__arm_vmaxnmavq): Remove coercion of scalar
argument.
(__arm_vmaxnmvq): Likewise.
(__arm_vminnmavq): Likewise.
(__arm_vminnmvq): Likewise.
(__arm_vmaxnmavq_p): Likewise.
(__arm_vmaxnmvq_p): Likewise (and delete duplicate definition).
(__arm_vminnmavq_p): Likewise.
(__arm_vminnmvq_p): Likewise.
(__arm_vmaxavq): Likewise.
(__arm_vmaxavq_p): Likewise.
(__arm_vmaxvq): Likewise.
(__arm_vmaxvq_p): Likewise.
(__arm_vminavq): Likewise.
(__arm_vminavq_p): Likewise.
(__arm_vminvq): Likewise.
(__arm_vminvq_p): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/intrinsics/vmaxavq_p_s16.c: Add test for mismatched
width of scalar argument.
* gcc.target/arm/mve/intrinsics/vmaxavq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_u8.c: Likewise.
|
|
This patch adds a Neoverse V1-specific tuning struct that currently is
just a deduplication of the N1 struct it was using before and specifying
the SVE width.
This will allow us to tweak Neoverse V1 things in the future as needed.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/
* config/aarch64/aarch64.c (neoversev1_tunings): Define.
* config/aarch64/aarch64-cores.def (zeus): Use it.
(neoverse-v1): Likewise.
|
|
gcc/ChangeLog:
2020-10-02 Jan Hubicka <hubicka@ucw.cz>
* attr-fnspec.h: Update documentation.
(attr_fnsec::return_desc_size): Set to 2
(attr_fnsec::arg_desc_size): Set to 2
* builtin-attrs.def (STR1): Update fnspec.
* internal-fn.def (UBSAN_NULL): Update fnspec.
(UBSAN_VPTR): Update fnspec.
(UBSAN_PTR): Update fnspec.
(ASAN_CHECK): Update fnspec.
(GOACC_DIM_SIZE): Remove fnspec.
(GOACC_DIM_POS): Remove fnspec.
* tree-ssa-alias.c (attr_fnspec::verify): Update verification.
gcc/fortran/ChangeLog:
2020-10-02 Jan Hubicka <hubicka@ucw.cz>
* trans-decl.c (gfc_build_library_function_decl_with_spec): Verify
fnspec.
(gfc_build_intrinsic_function_decls): Update fnspecs.
(gfc_build_builtin_function_decls): Update fnspecs.
* trans-io.c (gfc_build_io_library_fndecls): Update fnspecs.
* trans-types.c (create_fn_spec): Update fnspecs.
|
|
I had reason to wander into cp_make_fname, and noticed it's the only
caller of cp_fname_init. Folding it in makes the code simpler.
gcc/cp/
* cp-tree.h (cp_fname_init): Delete declaration.
* decl.c (cp_fname_init): Merge into only caller ...
(cp_make_fname): ... here & refactor.
|
|
* attr-fnspec.h: New file.
* calls.c (decl_return_flags): Use attr_fnspec.
* gimple.c (gimple_call_arg_flags): Use attr_fnspec.
(gimple_call_return_flags): Use attr_fnspec.
* tree-into-ssa.c (pass_build_ssa::execute): Use attr_fnspec.
* tree-ssa-alias.c (attr_fnspec::verify): New member fuction.
|
|
* tree-ssa-alias.c (ao_ref_init_from_ptr_and_range): Break out from ...
(ao_ref_init_from_ptr_and_size): ... here.
|
|
2020-10-02 Jan Hubicka <hubicka@ucw.cz>
* data-streamer-in.c (streamer_read_poly_int64): New function.
* data-streamer-out.c (streamer_write_poly_int64): New function.
* data-streamer.h (streamer_write_poly_int64): Declare.
(streamer_read_poly_int64): Declare.
|
|
In r11-2922, Przemek fixed a post-RA instruction match failure
caused by the SVE FP subtraction patterns.. This patch applies
the same fix to the other patterns.
To recap, the issue is around the handling of predication.
We want to do two things:
- Optimise cases in which a predicate is known to be all-true.
- Differentiate cases in which the predicate on an _x ACLE function has
to be kept as-is from cases in which we can make more lanes active.
The former is true by default, the latter is true for certain
combinations of flags in the -ffast-math group.
This is handled by a boolean flag in the unspecs to say whether the
predicate is “strict” or “relaxed”. When combining multiple strict
operations, the predicates used in the operations generally need to
match. When combining multiple relaxed operations, we can ignore the
predicates on nested operations and just use the predicate on the
“outermost” operation.
Originally I'd tried to reduce combinatorial explosion by using
aarch64_sve_pred_dominates_p. This required matching predicates
for strict operations but allowed more combinations for relaxed
operations.
The problem (as I should have remembered) is that C conditions on
insn patterns can't reliably enforce matching operands. If the
same register is used in two different input operands, the RA is
allowed to use different hard registers for those input operands
(and sometimes it has to). So operands that match before RA
might not match afterwards. The only sure way to force a match
is via match_dup.
This patch splits the cases into two. I cry bitter tears at having
to do this, but I think it's the only backportable fix. There might
be some way of using define_subst to generate the cond_* patterns from
the pred_* patterns, with some alternatives strategically disabled in
each case, but that's future work and might not be an improvement.
Since so many patterns now do this, I moved the comments from the
subtraction pattern to a new banner comment at the head of the file.
gcc/
* config/aarch64/aarch64-protos.h (aarch64_sve_pred_dominates_p):
Delete.
* config/aarch64/aarch64.c (aarch64_sve_pred_dominates_p): Likewise.
* config/aarch64/aarch64-sve.md: Add banner comment describing
how merging predicated FP operations are represented.
(*cond_<SVE_COND_FP_UNARY:optab><mode>_2): Split into...
(*cond_<SVE_COND_FP_UNARY:optab><mode>_2_relaxed): ...this and...
(*cond_<SVE_COND_FP_UNARY:optab><mode>_2_strict): ...this.
(*cond_<SVE_COND_FP_UNARY:optab><mode>_any): Split into...
(*cond_<SVE_COND_FP_UNARY:optab><mode>_any_relaxed): ...this and...
(*cond_<SVE_COND_FP_UNARY:optab><mode>_any_strict): ...this.
(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): Split into...
(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2_relaxed): ...this and...
(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2_strict): ...this.
(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Split into...
(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any_relaxed): ...this
and...
(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any_strict): ...this.
(*cond_<SVE_COND_FP_BINARY:optab><mode>_2): Split into...
(*cond_<SVE_COND_FP_BINARY:optab><mode>_2_relaxed): ...this and...
(*cond_<SVE_COND_FP_BINARY:optab><mode>_2_strict): ...this.
(*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const): Split into...
(*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const_relaxed): ...this
and...
(*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_2_const_strict): ...this.
(*cond_<SVE_COND_FP_BINARY:optab><mode>_3): Split into...
(*cond_<SVE_COND_FP_BINARY:optab><mode>_3_relaxed): ...this and...
(*cond_<SVE_COND_FP_BINARY:optab><mode>_3_strict): ...this.
(*cond_<SVE_COND_FP_BINARY:optab><mode>_any): Split into...
(*cond_<SVE_COND_FP_BINARY:optab><mode>_any_relaxed): ...this and...
(*cond_<SVE_COND_FP_BINARY:optab><mode>_any_strict): ...this.
(*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const): Split into...
(*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const_relaxed): ...this
and...
(*cond_<SVE_COND_FP_BINARY_I1:optab><mode>_any_const_strict): ...this.
(*cond_add<mode>_2_const): Split into...
(*cond_add<mode>_2_const_relaxed): ...this and...
(*cond_add<mode>_2_const_strict): ...this.
(*cond_add<mode>_any_const): Split into...
(*cond_add<mode>_any_const_relaxed): ...this and...
(*cond_add<mode>_any_const_strict): ...this.
(*cond_<SVE_COND_FCADD:optab><mode>_2): Split into...
(*cond_<SVE_COND_FCADD:optab><mode>_2_relaxed): ...this and...
(*cond_<SVE_COND_FCADD:optab><mode>_2_strict): ...this.
(*cond_<SVE_COND_FCADD:optab><mode>_any): Split into...
(*cond_<SVE_COND_FCADD:optab><mode>_any_relaxed): ...this and...
(*cond_<SVE_COND_FCADD:optab><mode>_any_strict): ...this.
(*cond_sub<mode>_3_const): Split into...
(*cond_sub<mode>_3_const_relaxed): ...this and...
(*cond_sub<mode>_3_const_strict): ...this.
(*aarch64_pred_abd<mode>): Split into...
(*aarch64_pred_abd<mode>_relaxed): ...this and...
(*aarch64_pred_abd<mode>_strict): ...this.
(*aarch64_cond_abd<mode>_2): Split into...
(*aarch64_cond_abd<mode>_2_relaxed): ...this and...
(*aarch64_cond_abd<mode>_2_strict): ...this.
(*aarch64_cond_abd<mode>_3): Split into...
(*aarch64_cond_abd<mode>_3_relaxed): ...this and...
(*aarch64_cond_abd<mode>_3_strict): ...this.
(*aarch64_cond_abd<mode>_any): Split into...
(*aarch64_cond_abd<mode>_any_relaxed): ...this and...
(*aarch64_cond_abd<mode>_any_strict): ...this.
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_2): Split into...
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_2_relaxed): ...this and...
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_2_strict): ...this.
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_4): Split into...
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_4_relaxed): ...this and...
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_4_strict): ...this.
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_any): Split into...
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_any_relaxed): ...this and...
(*cond_<SVE_COND_FP_TERNARY:optab><mode>_any_strict): ...this.
(*cond_<SVE_COND_FCMLA:optab><mode>_4): Split into...
(*cond_<SVE_COND_FCMLA:optab><mode>_4_relaxed): ...this and...
(*cond_<SVE_COND_FCMLA:optab><mode>_4_strict): ...this.
(*cond_<SVE_COND_FCMLA:optab><mode>_any): Split into...
(*cond_<SVE_COND_FCMLA:optab><mode>_any_relaxed): ...this and...
(*cond_<SVE_COND_FCMLA:optab><mode>_any_strict): ...this.
(*aarch64_pred_fac<cmp_op><mode>): Split into...
(*aarch64_pred_fac<cmp_op><mode>_relaxed): ...this and...
(*aarch64_pred_fac<cmp_op><mode>_strict): ...this.
(*cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>): Split
into...
(*cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>_relaxed):
...this and...
(*cond_<optab>_nontrunc<SVE_FULL_F:mode><SVE_FULL_HSDI:mode>_strict):
...this.
(*cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>): Split
into...
(*cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>_relaxed):
...this and...
(*cond_<optab>_nonextend<SVE_FULL_HSDI:mode><SVE_FULL_F:mode>_strict):
...this.
* config/aarch64/aarch64-sve2.md
(*cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>): Split into...
(*cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>_relaxed): ...this and...
(*cond_<SVE2_COND_FP_UNARY_LONG:optab><mode>_strict): ...this.
(*cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any): Split into...
(*cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any_relaxed): ...this
and...
(*cond_<SVE2_COND_FP_UNARY_NARROWB:optab><mode>_any_strict): ...this.
(*cond_<SVE2_COND_INT_UNARY_FP:optab><mode>): Split into...
(*cond_<SVE2_COND_INT_UNARY_FP:optab><mode>_relaxed): ...this and...
(*cond_<SVE2_COND_INT_UNARY_FP:optab><mode>_strict): ...this.
|
|
As Christophe pointed out, my r11-3522 patch didn't in fact fix
all of the armv8_2-fp16-arith-2.c failures introduced by allowing
FP16 vectorisation without -funsafe-math-optimizations. I must have
only tested the final patch on my usual arm-linux-gnueabihf bootstrap,
which it turns out treats the test as unsupported.
The focus of the original patch was to use mode macros for
patterns that are shared between Advanced SIMD, iwMMXt and MVE.
This patch uses the mode macros for general neon.md patterns too.
gcc/
* config/arm/neon.md (*sub<VDQ:mode>3_neon): Use the new mode macros
for the insn condition.
(sub<VH:mode>3, *mul<VDQW:mode>3_neon): Likewise.
(mul<VDQW:mode>3add<VDQW:mode>_neon): Likewise.
(mul<VH:mode>3add<VH:mode>_neon): Likewise.
(mul<VDQW:mode>3neg<VDQW:mode>add<VDQW:mode>_neon): Likewise.
(fma<VCVTF:mode>4, fma<VH:mode>4, *fmsub<VCVTF:mode>4): Likewise.
(quad_halves_<code>v4sf, reduc_plus_scal_<VD:mode>): Likewise.
(reduc_plus_scal_<VQ:mode>, reduc_smin_scal_<VD:mode>): Likewise.
(reduc_smin_scal_<VQ:mode>, reduc_smax_scal_<VD:mode>): Likewise.
(reduc_smax_scal_<VQ:mode>, mul<VH:mode>3): Likewise.
(neon_vabd<VF:mode>_2, neon_vabd<VF:mode>_3): Likewise.
(fma<VH:mode>4_intrinsic): Delete.
(neon_vadd<VCVTF:mode>): Use the new mode macros to decide which
form of instruction to generate.
(neon_vmla<VDQW:mode>, neon_vmls<VDQW:mode>): Likewise.
(neon_vsub<VCVTF:mode>): Likewise.
(neon_vfma<VH:mode>): Generate the main fma<mode>4 form instead
of using fma<mode>4_intrinsic.
gcc/testsuite/
* gcc.target/arm/armv8_2-fp16-arith-2.c (float16_t): Use _Float16_t
rather than __fp16.
(float16x4_t, float16x4_t): Likewise.
(fp16_abs): Use __builtin_fabsf16.
|
|
This fixes test failures on ilp32 introduced in
r11-3032-gd4febc75e8dfab23bd3132d5747eded918f85107.
The assembler checks in extend-syntax.c simply needed adjusting for
32-bit pointers.
It appears the subsp.c test has never passed on ILP32 due to a missed
optimisation there. Since this isn't a code quality regression, disable
that check on ILP32.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/extend-syntax.c: Fix assembler checks for
ilp32, disable check-function-bodies on ilp32.
* gcc.target/aarch64/subsp.c: Only check second scan-assembler
on lp64 since the code on ilp32 is missing the optimization
needed for this test to pass.
|
|
gcc/ChangeLog:
PR gcov-profile/97193
* coverage.c (coverage_init): GCDA note files should not be
mangled and should end in output directory.
|
|
libgomp/ChangeLog:
* Makefile.in: Regenerate with automake 1.15.1.
* aclocal.m4: Likewise.
* configure: Likewise.
* testsuite/Makefile.in: Likewise.
|
|
We were failing to set the flag on a delete call in a new expression, in a
deleting destructor, and in a coroutine. Fixed by setting it in the
function that builds the call.
2020-10-02 Jason Merril <jason@redhat.com>
gcc/cp/ChangeLog:
* call.c (build_operator_new_call): Set CALL_FROM_NEW_OR_DELETE_P.
(build_op_delete_call): Likewise.
* init.c (build_new_1, build_vec_delete_1, build_delete): Not here.
(build_delete):
gcc/ChangeLog:
* gimple.h (gimple_call_operator_delete_p): Rename from
gimple_call_replaceable_operator_delete_p.
* gimple.c (gimple_call_operator_delete_p): Likewise.
* tree.h (DECL_IS_REPLACEABLE_OPERATOR_DELETE_P): Remove.
* tree-ssa-dce.c (mark_all_reaching_defs_necessary_1): Adjust.
(propagate_necessity): Likewise.
(eliminate_unnecessary_stmts): Likewise.
* tree-ssa-structalias.c (find_func_aliases_for_call): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/pr94314.C: new/delete no longer omitted.
|
|
This fixes points-to analysis and DCE to only consider new/delete
operator calls from new or delete expressions and not direct calls.
2020-10-01 Richard Biener <rguenther@suse.de>
* gimple.h (GF_CALL_FROM_NEW_OR_DELETE): New call flag.
(gimple_call_set_from_new_or_delete): New.
(gimple_call_from_new_or_delete): Likewise.
* gimple.c (gimple_build_call_from_tree): Set
GF_CALL_FROM_NEW_OR_DELETE appropriately.
* ipa-icf-gimple.c (func_checker::compare_gimple_call):
Compare gimple_call_from_new_or_delete.
* tree-ssa-dce.c (mark_all_reaching_defs_necessary_1): Make
sure to only consider new/delete calls from new or delete
expressions.
(propagate_necessity): Likewise.
(eliminate_unnecessary_stmts): Likewise.
* tree-ssa-structalias.c (find_func_aliases_for_call):
Likewise.
* g++.dg/tree-ssa/pta-delete-1.C: New testcase.
|
|
As discussed with richi, we should be able to use TREE_PROTECTED for this
flag, since CALL_FROM_THUNK_P will never be set on a call to an operator new
or delete.
2020-10-01 Jason Merril <jason@redhat.com>
gcc/cp/ChangeLog:
* lambda.c (call_from_lambda_thunk_p): New.
* cp-gimplify.c (cp_genericize_r): Use it.
* pt.c (tsubst_copy_and_build): Use it.
* typeck.c (check_return_expr): Use it.
* cp-tree.h: Declare it.
(CALL_FROM_NEW_OR_DELETE_P): Move to gcc/tree.h.
gcc/ChangeLog:
* tree.h (CALL_FROM_NEW_OR_DELETE_P): Move from cp-tree.h.
* tree-core.h: Document new usage of protected_flag.
|
|
This should have been included in the irange_allocator patch, as
a method to see if the current object can hold a passed range
without truncation.
gcc/ChangeLog:
* value-range.h (irange::fits_p): New.
|
|
|
|
Fixes golang/go#41737
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/258977
|
|
during RTL pass: fwprop1
gcc.dg/pr82596.c: In function 'test_cststring':
gcc.dg/pr82596.c:27:1: internal compiler error: in decompose, at rtl.h:2282
-m32 gcc/testsuite/gcc.dg/pr82596.c fails along with other tests after
applying rtx_cost patches, which exposed a backend bug.
legitimize_address when presented with the following address
(plus (reg) (const_int 0x7ffffffff))
attempts to rewrite it as a high/low sum. The low part is 0xffff, or
-1, making the high part 0x80000000. But this is no longer canonical
for SImode.
* config/rs6000/rs6000.c (rs6000_legitimize_address): Use
gen_int_mode for high part of address constant.
|
|
Commit c6be439b37 wrongly left a block of code inside an "else" block,
which changed the default for power10 TARGET_NO_FP_IN_TOC
accidentally. We don't want FP constants in the TOC when
-mcmodel=medium can address them just as efficiently outside the TOC.
* config/rs6000/rs6000.c (rs6000_linux64_override_options):
Formatting. Correct setting of TARGET_NO_FP_IN_TOC and
TARGET_NO_SUM_IN_TOC.
|
|
* config/rs6000/freebsd64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Use
rs6000_linux64_override_options.
* config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Break
out to..
* config/rs6000/rs6000.c (rs6000_linux64_override_options): ..this,
new function. Tweak non-biarch test and clearing of
profile_kernel to work with freebsd64.h.
|
|
There are only a couple of asserts remaining using this macro, and
nothing using TYPE_HIDDEN_P. Killed thusly.
gcc/cp/
* cp-tree.h (DECL_ANTICIPATED): Adjust comment.
(DECL_HIDDEN_P, TYPE_HIDDEN_P): Delete.
* tree.c (ovl_insert): Delete DECL_HIDDEN_P assert.
(ovl_skip_hidden): Likewise.
|
|
Since a889e06ac68 the following fails.
In file included from ../../gcc/tree-ssa-propagate.h:25:0,
from ../../gcc/config/rs6000/rs6000.c:78:
../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared
virtual bool range_of_expr (irange &r, tree name, gimple * = NULL) = 0;
^~~~~~
../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared
virtual bool range_on_edge (irange &r, edge, tree name);
^~~~~~
../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared
virtual bool range_of_stmt (irange &r, gimple *, tree name = NULL);
^~~~~~
In file included from ../../gcc/tree-ssa-propagate.h:25:0,
from ../../gcc/config/rs6000/rs6000-call.c:67:
../../gcc/value-query.h:90:31: error: ‘irange’ has not been declared
virtual bool range_of_expr (irange &r, tree name, gimple * = NULL) = 0;
^~~~~~
../../gcc/value-query.h:91:31: error: ‘irange’ has not been declared
virtual bool range_on_edge (irange &r, edge, tree name);
^~~~~~
../../gcc/value-query.h:92:31: error: ‘irange’ has not been declared
virtual bool range_of_stmt (irange &r, gimple *, tree name = NULL);
gcc/ChangeLog:
* config/rs6000/rs6000-call.c: Include value-range.h.
* config/rs6000/rs6000.c: Likewise.
|
|
When running:
...
$ gcc.sh src/gcc/testsuite/gcc.target/nvptx/abi-complex-arg.c -S -dP
...
we have in abi-complex-arg.s:
...
//(insn 3 5 4 2
// (set
// (reg:QI 23)
// (truncate:QI (reg:SI 22))) "abi-complex-arg.c":38:1 29 {truncsiqi2}
// (nil))
cvt.u32.u32 %r23, %r22; // 3 [c=4] truncsiqi2/0
...
The cvt.u32.u32 can be written shorter and clearer as mov.u32.
Fix this in define_insn "truncsi<QHIM>2".
Tested on nvptx.
gcc/ChangeLog:
2020-10-01 Tom de Vries <tdevries@suse.de>
PR target/80845
* config/nvptx/nvptx.md (define_insn "truncsi<QHIM>2"): Emit mov.u32
instead of cvt.u32.u32.
|
|
This patch does several things at once:
(1) Add vector compare patterns (vec_cmp and vec_cmpu).
(2) Add vector selects between floating-point modes when the
values being compared are integers (affects vcond and vcondu).
(3) Add vector selects between integer modes when the values being
compared are floating-point (affects vcond).
(4) Add standalone vector select patterns (vcond_mask).
(5) Tweak the handling of compound comparisons with zeros.
Unfortunately it proved too difficult (for me) to separate this
out into a series of smaller patches, since everything is so
inter-related. Defining only some of the new patterns does
not leave things in a happy state.
The handling of comparisons is mostly taken from the vcond patterns.
This means that it remains non-compliant with IEEE: “quiet” comparisons
use signalling instructions. But that shouldn't matter for floats,
since we require -funsafe-math-optimizations to vectorize for them
anyway.
It remains the case that comparisons and selects aren't implemented
at all for HF vectors. Implementing those feels like separate work.
gcc/
PR target/96528
PR target/97288
* config/arm/arm-protos.h (arm_expand_vector_compare): Declare.
(arm_expand_vcond): Likewise.
* config/arm/arm.c (arm_expand_vector_compare): New function.
(arm_expand_vcond): Likewise.
* config/arm/neon.md (vec_cmp<VDQW:mode><v_cmp_result>): New pattern.
(vec_cmpu<VDQW:mode><VDQW:mode>): Likewise.
(vcond<VDQW:mode><VDQW:mode>): Require operand 5 to be a register
or zero. Use arm_expand_vcond.
(vcond<V_cvtto><V32:mode>): New pattern.
(vcondu<VDQIW:mode><VDQIW:mode>): Generalize to...
(vcondu<VDQW:mode><v_cmp_result): ...this. Require operand 5
to be a register or zero. Use arm_expand_vcond.
(vcond_mask_<VDQW:mode><v_cmp_result>): New pattern.
(neon_vc<cmp_op><mode>, neon_vc<cmp_op><mode>_insn): Add "@" marker.
(neon_vbsl<mode>): Likewise.
(neon_vc<cmp_op>u<mode>): Reexpress as...
(@neon_vc<code><mode>): ...this.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_vect_cond_mixed): Add
arm neon targets.
* gcc.target/arm/neon-compare-1.c: New test.
* gcc.target/arm/neon-compare-2.c: Likewise.
* gcc.target/arm/neon-compare-3.c: Likewise.
* gcc.target/arm/neon-compare-4.c: Likewise.
* gcc.target/arm/neon-compare-5.c: Likewise.
* gcc.target/arm/neon-vcond-gt.c: Expect comparisons with zero.
* gcc.target/arm/neon-vcond-ltgt.c: Likewise.
* gcc.target/arm/neon-vcond-unordered.c: Likewise.
|
|
gcc/testsuite/
* gcc.target/aarch64/movtf_1.c: Restrict the asm matching to lp64.
* gcc.target/aarch64/movti_1.c: Likewise.
|
|
gcc/testsuite/
PR target/96375
* gcc.target/arm/lob1.c: Fix missing flag.
* gcc.target/arm/lob2.c: Likewise.
* gcc.target/arm/lob3.c: Likewise.
* gcc.target/arm/lob4.c: Likewise.
* gcc.target/arm/lob5.c: Likewise.
* gcc.target/arm/lob6.c: Likewise.
* lib/target-supports.exp
(check_effective_target_arm_v8_1_lob_ok): Return 1 only for
cortex-m targets, add '-mthumb' flag.
|
|
* config/i386/t-rtems: Change from mtune to march when building
multilibs. The mtune argument tunes or optimizes for a specific
CPU model but does not ensure the generated code is appropriate
for the CPU model. Prior to this patch, i386 compatible code
was always generated but tuned for later models.
|
|
gcc/ChangeLog:
* builtins.c (compute_objsize): Replace vr_values with range_query.
(get_range): Same.
(gimple_call_alloc_size): Same.
* builtins.h (class vr_values): Remove.
(gimple_call_alloc_size): Replace vr_values with range_query.
* gimple-ssa-sprintf.c (get_int_range): Same.
(struct directive): Pass gimple context to fmtfunc callback.
(directive::set_width): Replace inline with out-of-line version.
(directive::set_precision): Same.
(format_none): New gimple argument.
(format_percent): New gimple argument.
(format_integer): New gimple argument.
(format_floating): New gimple argument.
(get_string_length): Use range_query API.
(format_character): New gimple argument.
(format_string): New gimple argument.
(format_plain): New gimple argument.
(format_directive): New gimple argument.
(parse_directive): Replace vr_values with range_query.
(compute_format_length): Same.
(handle_printf_call): Same. Adjust for range_query API.
* tree-ssa-strlen.c (get_range): Same.
(compare_nonzero_chars): Same.
(get_addr_stridx) Replace vr_values with range_query.
(get_stridx): Same.
(dump_strlen_info): Same.
(get_range_strlen_dynamic): Adjust for range_query API.
(set_strlen_range): Same
(maybe_warn_overflow): Replace vr_values with range_query.
(handle_builtin_strcpy): Same.
(maybe_diag_stxncpy_trunc): Add FIXME comment.
(handle_builtin_memcpy): Replace vr_values with range_query.
(handle_builtin_memset): Same.
(get_len_or_size): Same.
(strxcmp_eqz_result): Same.
(handle_builtin_string_cmp): Same.
(count_nonzero_bytes_addr): Same, plus adjust for range_query API.
(count_nonzero_bytes): Replace vr_values with range_query.
(handle_store): Same.
(strlen_check_and_optimize_call): Same.
(handle_integral_assign): Same.
(check_and_optimize_stmt): Same.
* tree-ssa-strlen.h (class vr_values): Remove.
(get_range): Replace vr_values with range_query.
(get_range_strlen_dynamic): Same.
(handle_printf_call): Same.
|
|
gcc/ChangeLog:
* gimple-loop-versioning.cc (lv_dom_walker::before_dom_children):
Pass m_range_analyzer instead of get_vr_values.
(loop_versioning::name_prop::get_value): Rename to...
(loop_versioning::name_prop::value_of_expr): ...this.
* gimple-ssa-evrp-analyze.c (evrp_range_analyzer::evrp_range_analyzer):
Adjust for evrp_range_analyzer
inheriting from vr_values.
(evrp_range_analyzer::try_find_new_range): Same.
(evrp_range_analyzer::record_ranges_from_incoming_edge): Same.
(evrp_range_analyzer::record_ranges_from_phis): Same.
(evrp_range_analyzer::record_ranges_from_stmt): Same.
(evrp_range_analyzer::push_value_range): Same.
(evrp_range_analyzer::pop_value_range): Same.
* gimple-ssa-evrp-analyze.h (class evrp_range_analyzer): Inherit from
vr_values. Adjust accordingly.
* gimple-ssa-evrp.c: Adjust for evrp_range_analyzer inheriting from
vr_values.
(evrp_folder::value_of_evrp): Rename from get_value.
* tree-ssa-ccp.c (class ccp_folder): Rename get_value to
value_of_expr.
(ccp_folder::get_value): Rename to...
(ccp_folder::value_of_expr): ...this.
* tree-ssa-copy.c (class copy_folder): Rename get_value to
value_of_expr.
(copy_folder::get_value): Rename to...
(copy_folder::value_of_expr): ...this.
* tree-ssa-dom.c (dom_opt_dom_walker::after_dom_children): Adjust
for evrp_range_analyzer inheriting from vr_values.
(dom_opt_dom_walker::optimize_stmt): Same.
* tree-ssa-propagate.c (substitute_and_fold_engine::replace_uses_in):
Call value_of_* instead of get_value.
(substitute_and_fold_engine::replace_phi_args_in): Same.
(substitute_and_fold_engine::propagate_into_phi_args): Same.
(substitute_and_fold_dom_walker::before_dom_children): Same.
* tree-ssa-propagate.h: Include value-query.h.
(class substitute_and_fold_engine): Inherit from value_query.
* tree-ssa-strlen.c (strlen_dom_walker::before_dom_children):
Adjust for evrp_range_analyzer inheriting from vr_values.
* tree-ssa-threadedge.c (record_temporary_equivalences_from_phis):
Same.
* tree-vrp.c (class vrp_folder): Same.
(vrp_folder::get_value): Rename to value_of_expr.
* vr-values.c (vr_values::get_lattice_entry): Adjust for
vr_values inheriting from range_query.
(vr_values::range_of_expr): New.
(vr_values::value_of_expr): New.
(vr_values::value_on_edge): New.
(vr_values::value_of_stmt): New.
(simplify_using_ranges::op_with_boolean_value_range_p): Call
get_value_range through query.
(check_for_binary_op_overflow): Rename store to query.
(vr_values::vr_values): Remove vrp_value_range_pool.
(vr_values::~vr_values): Same.
(simplify_using_ranges::get_vr_for_comparison): Call get_value_range
through query.
(simplify_using_ranges::compare_names): Same.
(simplify_using_ranges::vrp_evaluate_conditional): Same.
(simplify_using_ranges::vrp_visit_cond_stmt): Same.
(simplify_using_ranges::simplify_abs_using_ranges): Same.
(simplify_using_ranges::simplify_cond_using_ranges_1): Same.
(simplify_cond_using_ranges_2): Same.
(simplify_using_ranges::simplify_switch_using_ranges): Same.
(simplify_using_ranges::two_valued_val_range_p): Same.
(simplify_using_ranges::simplify_using_ranges): Rename store to query.
(simplify_using_ranges::simplify): Assert that we have a query.
* vr-values.h (class range_query): Remove.
(class simplify_using_ranges): Remove inheritance of range_query.
(class vr_values): Add virtuals for range_of_expr, value_of_expr,
value_on_edge, value_of_stmt, and get_value_range.
Call range_query allocator instead of using vrp_value_range_pool.
Remove vrp_value_range_pool.
(simplify_using_ranges::get_value_range): Remove.
|
|
This avoids using VMAT_CONTIGUOUS with single-element interleaving
when using V1mode vectors. Instead keep VMAT_ELEMENTWISE but
continue to avoid load-lanes and gathers.
2020-10-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/97236
* tree-vect-stmts.c (get_group_load_store_type): Keep
VMAT_ELEMENTWISE for single-element vectors.
* gcc.dg/vect/pr97236.c: New testcase.
|
|
I discovered pushdecl_top_level was not setting the decl's context,
and we ended up with namespace-scope decls with NULL context. That
broke modules. Then I discovered a couple of places where we set the
context to a FUNCTION_DECL, which is also wrong. AFAICT the literals
in question belong in global scope, as they're comdatable entities.
But create_temporary would use current_scope for the context before we
pushed it into namespace scope.
This patch asserts the context is NULL and then sets it to the frobbed
global_namespace.
gcc/cp/
* name-lookup.c (pushdecl_top_level): Assert incoming context is
null, add global_namespace context.
(pushdecl_top_level_and_finish): Likewise.
* pt.c (get_template_parm_object): Clear decl context before
pushing.
* semantics.c (finish_compound_literal): Likewise.
|
|
PR ipa/97243
* gcc.c-torture/compile/pr97243.c: New test.
|
|
gcc/ChangeLog:
* ipa-modref.c (compute_parm_map): Be ready for callee_pi to be NULL.
|
|
PR ipa/97244
* gcc.dg/ipa/remref-2a.c: Add -fno-ipa-modref
|
|
PR ipa/97244
* ipa-fnsummary.c (pass_free_fnsummary::execute): Free
also indirect inlining datastructure.
* ipa-modref.c (pass_ipa_modref::execute): Do not free them here.
* ipa-prop.c (ipa_free_all_node_params): Do not crash when info does
not exist.
(ipa_unregister_cgraph_hooks): Likewise.
|
|
* internal-fn.c (DEF_INTERNAL_FN): Fix handling of fnspec
|
|
gcc/ChangeLog:
* Makefile.in: Add value-query.o.
* value-query.cc: New file.
* value-query.h: New file.
|
|
It turns out I'd already found lookup_and_check_tag's control flow
confusing, and had refactored it on the modules branch. For instance,
it continually checks 'if (decl &&$ condition)' before finally getting
to 'else if (!decl)'. why not just check !decl first and be done?
Well, it is done thusly.
gcc/cp/
* decl.c (lookup_and_check_tag): Refactor.
|
|
libstdc++-v3/ChangeLog:
* config/cpu/arm/cxxabi_tweaks.h (_GLIBCXX_GUARD_TEST_AND_ACQUIRE):
Do not try to dereference return value of __atomic_load_n.
|
|
This moves the recent entry for Neoverse N2 down and adds a comment in
order to preserve the existing order/structure in arm-cpus.in.
gcc/ChangeLog:
* config/arm/arm-cpus.in: Fix ordering, move Neoverse N2 down.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
|
|
When compiling test-case pr94600-1.c for nvptx, this gimple mem move:
...
MEM[(volatile struct t0 *)655404B] ={v} a0[0];
...
is expanded into a memcpy, but when compiling pr94600-2.c instead, this similar
gimple mem move:
...
MEM[(volatile struct t0 *)655404B] ={v} a00;
...
is expanded into a 32-bit load/store pair.
In both cases, emit_block_move is called.
In the latter case, can_move_by_pieces (4 /* byte-size */, 32 /* bit-align */)
is called, which returns true (because by_pieces_ninsns returns 1, which is
smaller than the MOVE_RATIO of 4).
In the former case, can_move_by_pieces (4 /* byte-size */, 8 /* bit-align */)
is called, which returns false (because by_pieces_ninsns returns 4, which is
not smaller than the MOVE_RATIO of 4).
So the difference in code generation is explained by the alignment. The
difference in alignment comes from the move sources: a0[0] vs. a00. Both
have the same type with 8-bit alignment, but a00 is on stack, which based on
the base stack align and stack variable placement happens to result in a
32-bit alignment.
Enable test-cases pr94600-{1,3}.c for nvptx by forcing the currently 8-byte
aligned variables to have a 32-bit alignment for STRICT_ALIGNMENT targets.
Tested on nvptx.
gcc/testsuite/ChangeLog:
2020-10-01 Tom de Vries <tdevries@suse.de>
* gcc.dg/pr94600-1.c: Force 32-bit alignment for a0 for !non_strict_align
targets. Remove target clauses from scan tests.
* gcc.dg/pr94600-3.c: Same.
|
|
> > The following testcase is miscompiled (in particular the a and i
> > initialization). The problem is that build_special_member_call due to
> > the immediate constructors (but not evaluated in constant expression mode)
> > doesn't create a CALL_EXPR, but returns a TARGET_EXPR with CONSTRUCTOR
> > as the initializer for it,
>
> That seems like the bug; at the end of build_over_call, after you
>
> > call = cxx_constant_value (call, obj_arg);
>
> You need to build an INIT_EXPR if obj_arg isn't a dummy.
That works. obj_arg is NULL if it is a dummy from the earlier code.
2020-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/96994
* call.c (build_over_call): If obj_arg is non-NULL, return INIT_EXPR
setting obj_arg to call.
* g++.dg/cpp2a/consteval18.C: New test.
|
|
[PR97195]
As mentioned in the PR, we only support due to a bug in constant expressions
std::construct_at on non-automatic variables, because we VERIFY_CONSTANT the
second argument of placement new, which fails verification if it is an
address of an automatic variable.
The following patch fixes it by not performing that verification, the
placement new evaluation later on will verify it after it is dereferenced.
2020-10-01 Jakub Jelinek <jakub@redhat.com>
PR c++/97195
* constexpr.c (cxx_eval_call_expression): Don't VERIFY_CONSTANT the
second argument.
* g++.dg/cpp2a/constexpr-new14.C: New test.
|
|
The following patch fixes
-FAIL: gcc.dg/pr94780.c (internal compiler error)
-FAIL: gcc.dg/pr94780.c (test for excess errors)
-FAIL: gcc.dg/pr94842.c (internal compiler error)
-FAIL: gcc.dg/pr94842.c (test for excess errors)
on s390x-linux. The fix is essentially the same as has been applied to many
other targets (i386, aarch64, arm, rs6000, alpha, riscv).
2020-10-01 Jakub Jelinek <jakub@redhat.com>
* config/s390/s390.c (s390_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
fenv_var and old_fpc. Formatting fixes.
|
|
SRA tends to use VIEW_CONVERT_EXPR when replacing bool fields with
unsigned char fields. Those are not handled in vector bool pattern
detection causing vector true values to leak. The following fixes
this by turning those into b ? 1 : 0 as well.
2020-10-01 Richard Biener <rguenther@suse.de>
* tree-vect-patterns.c (vect_recog_bool_pattern): Also handle
VIEW_CONVERT_EXPR.
* g++.dg/vect/pr97255.cc: New testcase.
|
|
levels for x86-64
These micro-architecture levels are defined in the x86-64 psABI:
https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/77566eb03bc6a326811cb7e9
PTA_NO_TUNE is introduced so that the new processor alias table entries
do not affect the CPU tuning setting in ix86_tune.
The tests depend on the macros added in commit 92e652d8c21bd7e66cbb0f900
("i386: Define __LAHF_SAHF__ and __MOVBE__ macros, based on ISA flags").
gcc/:
PR target/97250
* config/i386/i386.h (PTA_NO_TUNE, PTA_X86_64_BASELINE)
(PTA_X86_64_V2, PTA_X86_64_V3, PTA_X86_64_V4): New.
* common/config/i386/i386-common.c (processor_alias_table):
Add "x86-64-v2", "x86-64-v3", "x86-64-v4".
* config/i386/i386-options.c (ix86_option_override_internal):
Handle new PTA_NO_TUNE processor table entries.
* doc/invoke.texi (x86 Options): Document new -march values.
gcc/testsuite/:
PR target/97250
* gcc.target/i386/x86-64-v2.c: New test.
* gcc.target/i386/x86-64-v3.c: New test.
* gcc.target/i386/x86-64-v3-haswell.c: New test.
* gcc.target/i386/x86-64-v3-skylake.c: New test.
* gcc.target/i386/x86-64-v4.c: New test.
|