and types
The functions in subpackage Storage_Model_Support (apart from the
Has_*_Aspect functions) are revised to have assertions that will fail
when passed a parameter that doesn't specify the appropriate aspect
(either aspect Storage_Model_Type or Designated_Storage_Model), instead
of returning Empty for bad arguments. Also, various of the functions now
allow either a type with aspect Storage_Model_Type or an object of such
a type.
gcc/ada/
* sem_util.ads (Storage_Model_Support): Revise comments on most
operations within this nested package to reflect that they can
now be passed either a type that has aspect Storage_Model_Type
or an object of such a type. Change the names of the relevant
formals to SM_Obj_Or_Type. Also, add more precise semantic
descriptions in some cases, and declare the subprograms in a
more logical order.
* sem_util.adb (Storage_Model_Support.Storage_Model_Object): Add
an assertion that the type must specify aspect
Designated_Storage_Model, rather than returning Empty when it
doesn't specify that aspect.
(Storage_Model_Support.Storage_Model_Type): Add an assertion
that formal must be an object whose type specifies aspect
Storage_Model_Type, rather than returning Empty when it
doesn't have such a type (and test Has_Storage_Model_Type_Aspect
rather than Find_Value_Of_Aspect).
(Storage_Model_Support.Get_Storage_Model_Type_Entity): Allow
both objects and types, and add an assertion that the type (or
the type of the object) has a value for aspect
Storage_Model_Type.
|
|
gcc/ada/
* checks.adb (Apply_Arithmetic_Overflow_Minimized_Eliminated):
Fix condition to return.
|
|
gcc/ada/
* inline.adb (Can_Be_Inlined_In_GNATprove_Mode): Update comment.
|
|
Fix the escaping of the loop variable from the loop scope in both forms
of iterated element associations (i.e. "for J in ..." and "for J of
..."). Create a dedicated scope around the analyses of both loops. Also
create a copy of the Loop_Parameter_Specification instead of analyzing
(and modifying) the original Tree as it will be reanalyzed later.
gcc/ada/
* sem_aggr.adb (Resolve_Iterated_Association): Create scope
around N_Iterated_Element_Association handling. Analyze a copy
of the Loop_Parameter_Specification. Call Analyze instead of
Analyze_* to be more homogeneous.
(Sem_Ch5): Remove now unused package.
|
|
The front-end drops the declaration of a temporary on the floor because
Insert_Actions fails to climb up out of an N_Iterated_Component_Association
when the temporary is created during the analysis of its Discrete_Choices.
gcc/ada/
* exp_util.adb (Insert_Actions) <N_Iterated_Component_Association>:
Climb up out of the node if the actions come from Discrete_Choices.
|
|
Fix a regression in the support for Ada 2022's treatment of calls to
abstract subprograms in pre/post-conditions (thanks to Javier Miranda
for producing this patch).
gcc/ada/
* sem_disp.adb (Check_Dispatching_Context): When checking to see
whether an expression occurs in a class-wide pre/post-condition,
also check for the possibility that it occurs in a class-wide
preconditions subprogram that was introduced as part of
expansion. Without this fix, some legal calls occurring in
class-wide preconditions may be incorrectly flagged as violating
the "a call to an abstract subprogram must be dispatching" rule.
|
|
The key is that the protected type is a (limited) private type, which
fools a test in Cleanup_Scopes.
gcc/ada/
* inline.adb (Cleanup_Scopes): Test the underlying type.
|
|
The semantic analysis of predicates involves a fair amount of tree
copying because of both semantic and implementation considerations, and
there is a difficulty with quantified expressions since they declare a
new entity that cannot be shared between the various copies of the tree.
This change implements a specific processing for it in New_Copy_Tree
that subsumes a couple of fixes made earlier for variants of the issue.
gcc/ada/
* sem_util.ads (Is_Entity_Of_Quantified_Expression): Declare.
* sem_util.adb (Is_Entity_Of_Quantified_Expression): New
predicate.
(New_Copy_Tree): Deal with all entities of quantified
expressions.
* sem_ch13.adb (Build_Predicate_Functions): Get rid of
superfluous tree copying and remove obsolete code.
* sem_ch6.adb (Fully_Conformant_Expressions): Deal with all
entities of quantified expressions.
|
|
Finalization of a record object is required to finalize any components
that have an access discriminant constrained by a per-object expression
before other components. This includes the case of a type extension;
"early finalization" components of the parent type are required to be
finalized before non-early-finalization extension components. This is
implemented in the extension type's finalization procedure by placing
the call to the parent type's finalization procedure between the
finalization of the "early finalization" extension components and the
finalization of the other extension components. Previously that call was
executed after finalizing all of the extension components.
gcc/ada/
* exp_ch7.adb (Build_Finalize_Statements): Add Last_POC_Call
variable to keep track of the last "early finalization" call
generated for type extension's finalization procedure. If
non-empty, then this will indicate the point at which to insert
the call to the parent type's finalization procedure. Modify
nested function Process_Component_List_For_Finalize to set this
variable (and avoid setting it during a recursive call). If
Last_POC_Call is empty, then insert the parent finalization call
before, rather than after, the finalization code for the
extension components.
|
|
This moves the implementation of AI12-0101 + AI05-0123 from the expander
to the semantic analyzer and completes the implementation of AI12-0413,
which are both binding interpretations in Ada 2012, fixing a few bugs in
the process and removing a fair amount of duplicated code throughout.
gcc/ada/
* einfo-utils.adb (Remove_Entity): Fix a couple of oversights.
* exp_ch3.adb (Is_User_Defined_Equality): Delete.
(User_Defined_Eq): Call Get_User_Defined_Equality.
(Make_Eq_Body): Likewise.
(Predefined_Primitive_Eq_Body): Call Is_User_Defined_Equality.
* exp_ch4.adb (Build_Eq_Call): Call Get_User_Defined_Equality.
(Is_Equality): Delete.
(User_Defined_Primitive_Equality_Op): Likewise.
(Find_Aliased_Equality): Call Is_User_Defined_Equality.
(Expand_N_Op_Eq): Call Underlying_Type unconditionally.
Do not implement AI12-0101 + AI05-0123 here.
(Expand_Set_Membership): Call Resolve_Membership_Equality.
* exp_ch6.adb (Expand_Call_Helper): Remove obsolete code.
* sem_aux.ads (Is_Record_Or_Limited_Type): Delete.
* sem_aux.adb (Is_Record_Or_Limited_Type): Likewise.
* sem_ch4.ads (Nondispatching_Call_To_Abstract_Operation): Declare.
* sem_ch4.adb (Analyze_Call): Call Call_Abstract_Operation.
(Analyze_Membership_Op): Call Resolve_Membership_Equality.
(Nondispatching_Call_To_Abstract_Operation): New procedure.
(Remove_Abstract_Operations): Call it.
* sem_ch6.adb (Check_Untagged_Equality): Remove obsolete error and
call Is_User_Defined_Equality.
* sem_ch7.adb (Inspect_Untagged_Record_Completion): New procedure
implementing AI12-0101 + AI05-0123.
(Analyze_Package_Specification): Call it.
(Declare_Inherited_Private_Subprograms): Minor tweak.
(Uninstall_Declarations): Likewise.
* sem_disp.adb (Check_Direct_Call): Adjust to new implementation
of Is_User_Defined_Equality.
* sem_res.ads (Resolve_Membership_Equality): Declare.
* sem_res.adb (Resolve): Replace direct error handling with call to
Nondispatching_Call_To_Abstract_Operation.
(Resolve_Call): Likewise.
(Resolve_Equality_Op): Likewise. Implement AI12-0413.
(Resolve_Membership_Equality): New procedure.
(Resolve_Membership_Op): Call Get_User_Defined_Equality.
* sem_util.ads (Get_User_Defined_Eq): Rename into...
(Get_User_Defined_Equality): ...this.
* sem_util.adb (Get_User_Defined_Eq): Rename into...
(Get_User_Defined_Equality): ...this. Call Is_User_Defined_Equality.
(Is_User_Defined_Equality): Also check the profile but remove tests
on Comes_From_Source and Parent.
* sinfo.ads (Generic_Parent_Type): Adjust field description.
* uintp.ads (Ubool): Invoke user-defined equality in predicate.
|
|
Cleanup related to handling of user-defined equality in GNATprove.
gcc/ada/
* exp_ch3.adb (User_Defined_Eq): Replace duplicated code with a
call to Get_User_Defined_Eq.
|
|
When checking components of a record type for their own user-defined
equality function it is enough to find just one such component.
Cleanup related to handling of user-defined equality in GNATprove.
gcc/ada/
* exp_ch3.adb (Build_Untagged_Equality): Exit early when the
outcome of a loop is already known.
|
|
This is an incremental change towards supporting shared libraries
for VxWorks on aarch64.
The aarch64-vx7r2 compiler supports compilation with -fpic/PIC. This
change adds aarch64 to the list of CPUs for which GNATLIB_SHARED maps to
gnatlib-shared-dual for vxworks7r2, so "make gnatlib-shared" actually
builds a shared lib.
While other adjustments will be needed to get the runtime tests to pass,
this one is a necessary step and doesn't impair the rest.
gcc/ada/
* Makefile.rtl: Add aarch64 to the list of CPUs for which
GNATLIB_SHARED maps to gnatlib-shared-dual for vxworks7r2.
|
|
This aligns Analyze_Negation and Analyze_Unary_Op with the other similar
procedures in Sem_Ch4. No functional changes.
gcc/ada/
* sem_ch4.adb (Analyze_Negation): Minor tweak.
(Analyze_Unary_Op): Likewise.
|
|
The problem is that Install_Limited_With_Clause does not fully implement
AI05-0129, in the case where a regular with clause is processed before a
limited_with clause of the same package: the visible "shadow" entity is
that of the incomplete type, instead of that of the full type per the AI.
This requires adjusting Remove_Limited_With_Unit to match the change in
Install_Limited_With_Clause and also Build_Incomplete_Type_Declaration,
which is responsible for synthesizing incomplete types out of full type
declarations for self-referential types.
A small tweak is also needed in Analyze_Subprogram_Body_Helper to align
it with an equivalent processing for CW types in Find_Type_Name. And the
patch also changes the Incomplete_View field in full type declarations
to point to the entity of the view instead of its declaration.
gcc/ada/
* exp_ch3.adb (Build_Assignment): Adjust to the new definition of
Incomplete_View field.
* sem_ch10.ads (Decorate_Type): Declare.
* sem_ch10.adb (Decorate_Type): Move to library level.
(Install_Limited_With_Clause): In the already analyzed case, also
deal with incomplete type declarations present in the sources and
simplify the replacement code.
(Build_Shadow_Entity): Deal with swapped views in package body.
(Restore_Chain_For_Shadow): Deal with incomplete type declarations
present in the sources.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Adjust to the new
definition of Incomplete_View field.
(Build_Incomplete_Type_Declaration): Small consistency tweak.
Set the incomplete type as the Incomplete_View of the full type.
If the scope is a package with a limited view, build a shadow
entity for the incomplete type.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): When replacing
the limited view of a CW type as designated type of an anonymous
access return type, get to the CW type of the incomplete view of
the tagged type, if any.
(Collect_Primitive_Operations): Adjust to the new definition of
Incomplete_View field.
* sinfo.ads (Incomplete_View): Denote the entity itself instead
of its declaration.
* sem_util.adb: Remove call to Defining_Entity.
|
|
Volatile refinement properties (e.g. Async_Writers), which refine the
Volatile aspect in SPARK, are inherited by subtypes from their base
types. In particular, this patch fixes handling of those properties for
subtypes of private types.
gcc/ada/
* sem_util.adb (Type_Or_Variable_Has_Enabled_Property): Given a
subtype, recurse into its base type.
|
|
Routine Type_Or_Variable_Has_Enabled_Property handles either types or
objects; replace negation with an explicit positive condition.
Cleanup related to handling of volatile refinement aspects in SPARK;
behaviour is unaffected.
gcc/ada/
* sem_util.adb (Type_Or_Variable_Has_Enabled_Property): Clarify.
|
|
Routines Is_Enabled and Is_Enabled_Pragma are identical (except for
comments); remove this duplication.
Cleanup related to handling of volatile refinement aspects in SPARK;
behaviour is unaffected.
gcc/ada/
* sem_util.adb (Is_Enabled): Remove; use Is_Enabled_Pragma
instead.
|
|
gcc/ada/ChangeLog:
* locales.c (iso_639_1_to_639_3): Use ARRAY_SIZE.
(language_name_to_639_3): Likewise.
(country_name_to_3166): Likewise.
gcc/analyzer/ChangeLog:
* engine.cc (exploded_node::get_dot_fillcolor): Use ARRAY_SIZE.
* function-set.cc (test_stdio_example): Likewise.
* sm-file.cc (get_file_using_fns): Likewise.
* sm-malloc.cc (malloc_state_machine::unaffected_by_call_p): Likewise.
* sm-signal.cc (get_async_signal_unsafe_fns): Likewise.
gcc/ChangeLog:
* attribs.cc (diag_attr_exclusions): Use ARRAY_SIZE.
(decls_mismatched_attributes): Likewise.
* builtins.cc (c_strlen): Likewise.
* cfg.cc (DEF_BASIC_BLOCK_FLAG): Likewise.
* common/config/aarch64/aarch64-common.cc (aarch64_option_init_struct): Likewise.
* config/aarch64/aarch64-builtins.cc (aarch64_lookup_simd_builtin_type): Likewise.
(aarch64_init_simd_builtin_types): Likewise.
(aarch64_init_builtin_rsqrt): Likewise.
* config/aarch64/aarch64.cc (is_madd_op): Likewise.
* config/arm/arm-builtins.cc (arm_lookup_simd_builtin_type): Likewise.
(arm_init_simd_builtin_types): Likewise.
* config/avr/gen-avr-mmcu-texi.cc (mcus[ARRAY_SIZE): Likewise.
(c_prefix): Likewise.
(main): Likewise.
* config/c6x/c6x.cc (N_SAVE_ORDER): Likewise.
* config/darwin-c.cc (darwin_register_frameworks): Likewise.
* config/gcn/mkoffload.cc (process_obj): Likewise.
* config/i386/i386-builtins.cc (get_builtin_code_for_version): Likewise.
(fold_builtin_cpu): Likewise.
* config/m32c/m32c.cc (PUSHM_N): Likewise.
* config/nvptx/mkoffload.cc (process): Likewise.
* config/rs6000/driver-rs6000.cc (host_detect_local_cpu): Likewise.
* config/s390/s390.cc (NR_C_MODES): Likewise.
* config/tilepro/gen-mul-tables.cc (find_sequences): Likewise.
(create_insn_code_compression_table): Likewise.
* config/vms/vms.cc (NBR_CRTL_NAMES): Likewise.
* diagnostic-format-json.cc (json_from_expanded_location): Likewise.
* dwarf2out.cc (ARRAY_SIZE): Likewise.
* genhooks.cc (emit_documentation): Likewise.
(emit_init_macros): Likewise.
* gimple-ssa-sprintf.cc (format_floating): Likewise.
* gimple-ssa-warn-access.cc (memmodel_name): Likewise.
* godump.cc (keyword_hash_init): Likewise.
* hash-table.cc (hash_table_higher_prime_index): Likewise.
* input.cc (for_each_line_table_case): Likewise.
* ipa-free-lang-data.cc (free_lang_data): Likewise.
* ipa-inline.cc (sanitize_attrs_match_for_inline_p): Likewise.
* optc-save-gen.awk: Likewise.
* spellcheck.cc (test_metric_conditions): Likewise.
* tree-vect-slp-patterns.cc (sizeof): Likewise.
(ARRAY_SIZE): Likewise.
* tree.cc (build_common_tree_nodes): Likewise.
gcc/c-family/ChangeLog:
* c-common.cc (ARRAY_SIZE): Use ARRAY_SIZE.
(c_common_nodes_and_builtins): Likewise.
* c-format.cc (check_tokens): Likewise.
(check_plain): Likewise.
* c-pragma.cc (c_pp_lookup_pragma): Likewise.
(init_pragma): Likewise.
* known-headers.cc (get_string_macro_hint): Likewise.
(get_stdlib_header_for_name): Likewise.
* c-attribs.cc: Likewise.
gcc/c/ChangeLog:
* c-decl.cc (match_builtin_function_types): Use ARRAY_SIZE.
gcc/cp/ChangeLog:
* module.cc (depset::entity_kind_name): Use ARRAY_SIZE.
* name-lookup.cc (get_std_name_hint): Likewise.
* parser.cc (cp_parser_new): Likewise.
gcc/fortran/ChangeLog:
* frontend-passes.cc (gfc_code_walker): Use ARRAY_SIZE.
* openmp.cc (gfc_match_omp_context_selector_specification): Likewise.
* trans-intrinsic.cc (conv_intrinsic_ieee_builtin): Likewise.
* trans-types.cc (gfc_get_array_descr_info): Likewise.
gcc/jit/ChangeLog:
* jit-builtins.cc (find_builtin_by_name): Use ARRAY_SIZE.
(get_string_for_type_id): Likewise.
* jit-recording.cc (recording::context::context): Likewise.
gcc/lto/ChangeLog:
* lto-common.cc (lto_resolution_read): Use ARRAY_SIZE.
* lto-lang.cc (lto_init): Likewise.
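As an illustration of the kind of substitution listed above, here is a minimal sketch of the idiom. The macro is written out with its usual definition so the example stands alone, and `demo_lang_names`, `count_open_coded` and `count_with_macro` are hypothetical names that do not come from the GCC sources.

```c
#include <stddef.h>

/* Assumed definition, repeated here so the example is self-contained.  */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
#endif

/* Hypothetical table, for illustration only.  */
static const char *const demo_lang_names[]
  = { "ada", "c", "c++", "fortran", "go" };

/* Before: the element count is open-coded at the point of use.  */
size_t
count_open_coded (void)
{
  return sizeof (demo_lang_names) / sizeof (demo_lang_names[0]);
}

/* After: the same count, spelled with ARRAY_SIZE.  */
size_t
count_with_macro (void)
{
  return ARRAY_SIZE (demo_lang_names);
}
```

Both functions return the same value; the macro form only removes the chance of dividing by the size of the wrong element type.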
|
|
This patch adds support for list items in the has_device_addr clause whose type
is given by C++ template parameters.
gcc/cp/ChangeLog:
* pt.cc (tsubst_omp_clauses): Added OMP_CLAUSE_HAS_DEVICE_ADDR.
* semantics.cc (finish_omp_clauses): Added template decl processing.
libgomp/ChangeLog:
* testsuite/libgomp.c++/target-has-device-addr-7.C: New test.
* testsuite/libgomp.c++/target-has-device-addr-8.C: New test.
* testsuite/libgomp.c++/target-has-device-addr-9.C: New test.
|
|
Fixes:
opts-global.cc:75:15: runtime error: store to address 0x00000bc9be70 with insufficient space for an object of type 'char'
which happens when mask == 0, len == 0 and we allocate zero elements.
Eventually, result[0] is written, which triggers the UBSAN error.
gcc/ChangeLog:
* opts-global.cc (write_langs): Allocate at least one byte.
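The failure mode described above can be sketched as follows. This is a simplified stand-in rather than the code in opts-global.cc: `demo_lang_names` and `demo_write_langs` are hypothetical, and plain `malloc` is an assumption made to keep the example self-contained.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, NULL-terminated name table; bit N of MASK selects entry N.  */
static const char *const demo_lang_names[] = { "c", "c++", "fortran", NULL };

/* Return a freshly allocated "a/b/c" string naming the entries selected by
   MASK, or NULL if allocation fails.  The caller frees the result.  */
char *
demo_write_langs (unsigned int mask)
{
  size_t len = 0;
  for (unsigned int n = 0; demo_lang_names[n] != NULL; n++)
    if (mask & (1U << n))
      len += strlen (demo_lang_names[n]) + 1;

  /* With an empty selection LEN is still 0, yet the terminating NUL written
     at the end needs one byte, so never allocate less than that.  */
  if (len == 0)
    len = 1;

  char *result = malloc (len);
  if (result == NULL)
    return NULL;

  len = 0;
  for (unsigned int n = 0; demo_lang_names[n] != NULL; n++)
    if (mask & (1U << n))
      {
        if (len)
          result[len++] = '/';
        strcpy (result + len, demo_lang_names[n]);
        len += strlen (demo_lang_names[n]);
      }

  result[len] = '\0';
  return result;
}
```

Without the `len == 0` guard, a zero mask would request zero bytes and the final store to `result[0]` would land outside the allocation.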
|
|
The following adds MIN/MAX folding from fold_cond_expr_with_comparison
to the GIMPLE part of match.pd, leaving the GENERIC part in
fold-const.cc since that's constrained on frontend specific things
I did not want to carry to match.pd.
The effect becomes apparent when we no longer can rely on GENERIC
folding of COND_EXPRs in gcc.dg/tree-ssa/pr92834.c and
gcc.dg/tree-ssa/pr94786.c.
2022-05-13 Richard Biener <rguenther@suse.de>
* match.pd (A cmp B ? A : B -> min/max): New patterns
carried over from fold_cond_expr_with_comparison.
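For readers unfamiliar with the pattern name, the source-level shape "A cmp B ? A : B" looks like the sketch below. The function names are hypothetical, and the example only shows the input form; whether a particular compiler version folds each one to a min/max operation is not something this snippet demonstrates.

```c
/* "a < b ? a : b" selects the smaller operand: a candidate for MIN.  */
int
demo_min_via_cond (int a, int b)
{
  return a < b ? a : b;
}

/* "a > b ? a : b" selects the larger operand: a candidate for MAX.  */
int
demo_max_via_cond (int a, int b)
{
  return a > b ? a : b;
}
```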
|
|
When d->perm[i] == d->perm[i-1] + 1 and d->perm[i] == nelt, it's not
continuous. It should fail if there are more than 2 continuous areas.
gcc/ChangeLog:
PR target/105587
* config/i386/i386-expand.cc
(expand_vec_perm_pslldq_psrldq_por): Fail when (d->perm[i] ==
d->perm[i-1] + 1) && d->perm[i] == nelt && start != -1.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr105587.c: New test.
|
|
|
|
const_int_operand and other const*_operand predicates do not need
constraints when the constraint is inherited from the range of
constant integer predicate. Remove the constraint in case all
alternatives use the same inherited constraint.
2022-05-15 Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog:
* config/i386/i386.md: Remove constraints when used with
const_int_operand, const0_operand, const_1_operand, constm1_operand,
const8_operand, const128_operand, const248_operand, const123_operand,
const2367_operand, const1248_operand, const359_operand,
const_4_or_8_to_11_operand, const48_operand, const_0_to_1_operand,
const_0_to_3_operand, const_0_to_4_operand, const_0_to_5_operand,
const_0_to_7_operand, const_0_to_15_operand, const_0_to_31_operand,
const_0_to_63_operand, const_0_to_127_operand, const_0_to_255_operand,
const_0_to_255_mul_8_operand, const_1_to_31_operand,
const_1_to_63_operand, const_2_to_3_operand, const_4_to_5_operand,
const_4_to_7_operand, const_6_to_7_operand, const_8_to_9_operand,
const_8_to_11_operand, const_8_to_15_operand, const_10_to_11_operand,
const_12_to_13_operand, const_12_to_15_operand, const_14_to_15_operand,
const_16_to_19_operand, const_16_to_31_operand, const_20_to_23_operand,
const_24_to_27_operand and const_28_to_31_operand.
* config/i386/mmx.md: Ditto.
* config/i386/sse.md: Ditto.
* config/i386/subst.md: Ditto.
* config/i386/sync.md: Ditto.
|
|
It has come up several times that Clang considers hidden friends of a class
to be sufficiently memberly to be covered by a friend declaration naming the
class. This is somewhat unclear in the standard: [class.friend] says
"Declaring a class to be a friend implies that private and protected members
of the class granting friendship can be named in the base-specifiers and
member declarations of the befriended class."
A hidden friend is a syntactic member-declaration, but is it a "member
declaration"? CWG was ambivalent, and referred the question to EWG as a
design choice. But recently Patrick mentioned that the current G++ choice
not to treat it as a "member declaration" was making his library work
significantly more cumbersome, so let's go ahead and vote the other way.
This means that the testcases for 100502 and 58993 are now accepted.
DR1699
PR c++/100502
PR c++/58993
gcc/cp/ChangeLog:
* friend.cc (is_friend): Hidden friends count as members.
* search.cc (friend_accessible_p): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/template/access37.C: Now OK.
* g++.dg/template/friend69.C: Now OK.
* g++.dg/lookup/friend23.C: New test.
|
|
While I was backporting the patch for PR102300, it occurred to me that it
would be cleaner to look through the injected-class-name earlier in the
function. I don't think this changes any test results.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_template_name): Look through
injected-class-name.
|
|
My patch for 105191 made us use build_value_init more frequently from
build_vec_init_expr, but build_value_init doesn't like to be called to
initialize a class in a template. That's caused trouble in the past, and
seems like a strange restriction, so let's fix it.
PR c++/105589
PR c++/105191
PR c++/92385
gcc/cp/ChangeLog:
* init.cc (build_value_init): Handle class in template.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/initlist-array16.C: New test.
|
|
|
|
|
|
This was fixed by r258755:
PR c++/81311 - wrong C++17 overload resolution.
PR c++/81952
gcc/testsuite/ChangeLog:
* g++.dg/overload/conv-op4.C: New test.
|
|
The exporter relies on sorting interface parse methods. It would sort
them as it encountered interface types. However, when an interface
type is an element of a struct or array type, the exporter might
encounter that interface type before sorting the parse methods. If it
then encountered an identical interface type again, it could get
confused about whether the two types are identical or not.
Fix the problem by always sorting the parse methods in the
finalize_methods pass.
Also firm up the export type sorting to make sure we never have this
kind of confusion again. Doing this revealed that we need to be more
careful about sorting in order to handle aliases correctly.
Also fix the interface type hash computation to use the right hash
value when looking at parse methods rather than all methods.
The test case for this is https://go.dev/cl/405759.
Fixes golang/go#52841
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/405556
|
|
This patch improves support for vector equality and inequality of
V1TImode vectors, and V2DImode vectors with sse2 but not sse4.
Consider the three functions below:
typedef unsigned int uv4si __attribute__ ((__vector_size__ (16)));
typedef unsigned long long uv2di __attribute__ ((__vector_size__ (16)));
typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));
uv4si eq_v4si(uv4si x, uv4si y) { return x == y; }
uv2di eq_v2di(uv2di x, uv2di y) { return x == y; }
uv1ti eq_v1ti(uv1ti x, uv1ti y) { return x == y; }
These all perform vector comparisons of 128bit SSE2 registers, generating
the result as a vector, where ~0 (all 1 bits) represents true and a zero
represents false. eq_v4si is trivially implemented by x86_64's pcmpeqd
instruction. This patch improves the other two cases:
For v2di, gcc -O2 currently generates:
movq %xmm0, %rdx
movq %xmm1, %rax
movdqa %xmm0, %xmm2
cmpq %rax, %rdx
movhlps %xmm2, %xmm3
movhlps %xmm1, %xmm4
sete %al
movq %xmm3, %rdx
movzbl %al, %eax
negq %rax
movq %rax, %xmm0
movq %xmm4, %rax
cmpq %rax, %rdx
sete %al
movzbl %al, %eax
negq %rax
movq %rax, %xmm5
punpcklqdq %xmm5, %xmm0
ret
but with this patch we now generate:
pcmpeqd %xmm0, %xmm1
pshufd $177, %xmm1, %xmm0
pand %xmm1, %xmm0
ret
where the results of a V4SI comparison are shuffled and bit-wise ANDed
to produce the desired result. There's no change in the code generated
for "-O2 -msse4" where the compiler generates a single "pcmpeqq" insn.
For V1TI mode, the results are equally dramatic, where the current -O2
output looks like:
movaps %xmm0, -40(%rsp)
movq -40(%rsp), %rax
movq -32(%rsp), %rdx
movaps %xmm1, -24(%rsp)
movq -24(%rsp), %rcx
movq -16(%rsp), %rsi
xorq %rcx, %rax
xorq %rsi, %rdx
orq %rdx, %rax
sete %al
xorl %edx, %edx
movzbl %al, %eax
negq %rax
adcq $0, %rdx
movq %rax, %xmm2
negq %rdx
movq %rdx, -40(%rsp)
movhps -40(%rsp), %xmm2
movdqa %xmm2, %xmm0
ret
with this patch we now generate:
pcmpeqd %xmm0, %xmm1
pshufd $177, %xmm1, %xmm0
pand %xmm1, %xmm0
pshufd $78, %xmm0, %xmm1
pand %xmm1, %xmm0
ret
performing a V2DI comparison, followed by a shuffle and pand, and with
-O2 -msse4 takes advantage of SSE4.1's pcmpeqq:
pcmpeqq %xmm0, %xmm1
pshufd $78, %xmm1, %xmm0
pand %xmm1, %xmm0
ret
2022-05-13 Roger Sayle <roger@nextmovesoftware.com>
Uroš Bizjak <ubizjak@gmail.com>
gcc/ChangeLog
* config/i386/sse.md (vec_cmpeqv2div2di): Enable for TARGET_SSE2.
For !TARGET_SSE4_1, expand as a V4SI vector comparison, followed
by a pshufd and pand.
(vec_cmpeqv1tiv1ti): New define_expand implementing V1TImode
vector equality as a V2DImode vector comparison (see above),
followed by a pshufd and pand.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-v1ti-veq.c: New test case.
* gcc.target/i386/sse2-v1ti-vne.c: New test case.
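The compare, shuffle and AND sequences quoted above can be modelled in portable scalar C to see why they yield the wider comparison. This is a sketch for reasoning only: `v4si_model` and the `model_*` helpers are hypothetical names, and nothing here uses real SSE instructions or intrinsics.

```c
#include <stdint.h>

/* Four 32-bit lanes standing in for one 128-bit SSE register.  */
typedef struct { uint32_t lane[4]; } v4si_model;

v4si_model
make_v4si (uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
  v4si_model r = { { a, b, c, d } };
  return r;
}

/* Models pcmpeqd: each lane becomes ~0 if equal, 0 otherwise.  */
static v4si_model
model_pcmpeqd (v4si_model x, v4si_model y)
{
  v4si_model r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = x.lane[i] == y.lane[i] ? UINT32_MAX : 0;
  return r;
}

/* Models pshufd: result lane I takes source lane (IMM >> 2*I) & 3.  */
static v4si_model
model_pshufd (v4si_model x, unsigned imm)
{
  v4si_model r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = x.lane[(imm >> (2 * i)) & 3];
  return r;
}

/* Models pand: lane-wise bitwise AND.  */
static v4si_model
model_pand (v4si_model x, v4si_model y)
{
  v4si_model r;
  for (int i = 0; i < 4; i++)
    r.lane[i] = x.lane[i] & y.lane[i];
  return r;
}

/* 64-bit lane equality: $177 swaps the two 32-bit halves of each 64-bit
   lane, so the AND is all-ones only where both halves compared equal.  */
v4si_model
model_eq_v2di (v4si_model x, v4si_model y)
{
  v4si_model m = model_pcmpeqd (x, y);
  return model_pand (m, model_pshufd (m, 177));
}

/* 128-bit equality: $78 swaps the two 64-bit halves, so the second AND
   is all-ones only where both 64-bit lanes compared equal.  */
v4si_model
model_eq_v1ti (v4si_model x, v4si_model y)
{
  v4si_model m = model_eq_v2di (x, y);
  return model_pand (m, model_pshufd (m, 78));
}
```

For example, comparing lanes (1, 2, 3, 4) with (1, 2, 3, 5) gives an all-ones low 64-bit lane and a zero high lane in the V2DI model, and all zeros in the V1TI model.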
|
|
A few tests need not be restricted to 'lp64', so remove the restriction.
A few of those need a simple change to the DejaGnu directives to suppress
'-mcmodel' flags for '-m32'.
2022-05-13 Paul A. Clarke <pc@us.ibm.com>
gcc/testsuite
* g++.target/powerpc/pr65240-1.C: Adjust DejaGnu directives.
* g++.target/powerpc/pr65240-2.C: Likewise.
* g++.target/powerpc/pr65240-3.C: Likewise.
* g++.target/powerpc/pr65240-4.C: Likewise.
* g++.target/powerpc/pr65242.C: Likewise.
* g++.target/powerpc/pr67211.C: Likewise.
* g++.target/powerpc/pr69667.C: Likewise.
* g++.target/powerpc/pr71294.C: Likewise.
|
|
Also adjust DejaGnu directives, as specifically requiring "powerpc*-*-*" is no
longer necessary.
2021-05-13 Paul A. Clarke <pc@us.ibm.com>
gcc/testsuite
* g++.dg/pr65240.h: Move to g++.target/powerpc.
* g++.dg/pr93974.C: Likewise.
* g++.dg/pr65240-1.C: Move to g++.target/powerpc, adjust dg directives.
* g++.dg/pr65240-2.C: Likewise.
* g++.dg/pr65240-3.C: Likewise.
* g++.dg/pr65240-4.C: Likewise.
* g++.dg/pr65242.C: Likewise.
* g++.dg/pr67211.C: Likewise.
* g++.dg/pr69667.C: Likewise.
* g++.dg/pr71294.C: Likewise.
* g++.dg/pr84264.C: Likewise.
* g++.dg/pr84279.C: Likewise.
* g++.dg/pr85657.C: Likewise.
|
|
This patch implements the missed optimization enhancement PR 83907,
by handling memset with a constant byte value in tree-ssa's strlen
optimization pass. Effectively, this treats memset(dst,'x',3) as
it would memcpy(dst,"xxx",3).
This patch also includes a tweak to handle_store to address another
missed optimization observed in the related test case pr83907-2.c.
The consecutive byte stores to memory get coalesced into a vector
write of a vector const, but unfortunately tree-ssa-strlen's
handle_store didn't previously handle the (unusual) case where the
stored "string" starts with a zero byte but also contains non-zero
bytes.
2022-05-13 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR tree-optimization/83907
* tree-ssa-strlen.cc (handle_builtin_memset): Record a strinfo
for memset with a constant char value.
(handle_store): Improved handling of stores with a first byte
of zero, but not storing_all_zeros_p.
gcc/testsuite/ChangeLog
PR tree-optimization/83907
* gcc.dg/tree-ssa/pr83907-1.c: New test case.
* gcc.dg/tree-ssa/pr83907-2.c: New test case.
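The equivalence the commit message describes can be illustrated with the sketch below. It is not taken from the new test cases; `len_after_memset` and `len_after_memcpy` are hypothetical functions that merely show two spellings whose `strlen` result is knowable at compile time.

```c
#include <stddef.h>
#include <string.h>

/* Fill three bytes with 'x', terminate, and measure.  Once the pass records
   the memset as writing a known string, strlen can be answered as 3.  */
size_t
len_after_memset (void)
{
  char dst[8];
  memset (dst, 'x', 3);
  dst[3] = '\0';
  return strlen (dst);
}

/* The memcpy spelling that the memset above is treated like.  */
size_t
len_after_memcpy (void)
{
  char dst[8];
  memcpy (dst, "xxx", 3);
  dst[3] = '\0';
  return strlen (dst);
}
```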
|
|
The Zbb support has introduced ctz and clz to the backend, but some
transformations in GCC need to know what the value of c[lt]z at zero
is. This affects how the optab is generated and may suppress use of
CLZ/CTZ in tree passes.
Among other things, this is needed for the transformation of
table-based ctz-implementations, such as in deepsjeng, to work
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838).
Prior to this change, the test case from PR90838 would compile to
the following on RISC-V targets with Zbb:
myctz:
lui a4,%hi(.LC0)
ld a4,%lo(.LC0)(a4)
neg a5,a0
and a5,a5,a0
mul a5,a5,a4
lui a4,%hi(.LANCHOR0)
addi a4,a4,%lo(.LANCHOR0)
srli a5,a5,58
sh2add a5,a5,a4
lw a0,0(a5)
ret
After this change, we get:
myctz:
ctz a0,a0
andi a0,a0,63
ret
Testing this with deepsjeng_r (from SPEC 2017) against QEMU, this
shows a clear reduction in dynamic instruction count:
- before 1961888067076
- after 1907928279874 (2.75% reduction)
This also merges the various target-specific test-cases (for x86-64,
aarch64 and riscv) within gcc.dg/pr90838.c.
This extends the macros (i.e., effective-target keywords) used in
testing (lib/target-supports.exp) to reliably distinguish between RV32
and RV64 via __riscv_xlen (i.e., the integer register bitwidth) :
testing for ILP32 could be misleading (as ILP32 is a valid memory
model for 64bit systems).
gcc/ChangeLog:
* config/riscv/riscv.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement.
(CTZ_DEFINED_VALUE_AT_ZERO): Same.
* doc/sourcebuild.texi: Add documentation for RISC-V specific
test target keywords.
gcc/testsuite/ChangeLog:
* gcc.dg/pr90838.c: Add additional flags (dg-additional-options)
when compiling for riscv64 and subsume gcc.target/aarch64/pr90838.c
and gcc.target/i386/pr95863-2.c.
* gcc.target/aarch64/pr90838.c: Removed.
* gcc.target/i386/pr95863-2.c: Removed.
* lib/target-supports.exp: Recognize RV32 or RV64 via XLEN.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
Co-authored-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
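For context, a table-based ctz implementation of the kind mentioned above typically looks like the sketch below. This is the widely known 32-bit de Bruijn variant, shown as an assumption about the general idiom; the assembly quoted in the commit message corresponds to a 64-bit variant, and `table_ctz32` and `loop_ctz32` are hypothetical names.

```c
#include <stdint.h>

/* Table-based count of trailing zeros.  x & -x isolates the lowest set bit;
   multiplying by the de Bruijn constant moves a unique 5-bit pattern into
   the top bits, which then indexes the table.  */
int
table_ctz32 (uint32_t x)
{
  static const unsigned char table[32] = {
     0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
    31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
  };
  return table[((x & -x) * UINT32_C (0x077CB531)) >> 27];
}

/* Straightforward reference: shift right until the low bit is set.
   Returns 32 for a zero input.  */
int
loop_ctz32 (uint32_t x)
{
  int n = 0;
  if (x == 0)
    return 32;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}
```

Note that the table form returns `table[0]`, that is 0, for a zero input, whereas a hardware instruction may return something else. That difference is why a compiler replacing the table lookup with a ctz instruction needs to know the instruction's value at zero.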
|
|
When folding, the LHS has not been set, so we should be checking the type of
op1. We should also make sure op1 is not undefined.
PR tree-optimization/105597
gcc/
* range-op.cc (operator_minus::lhs_op1_relation): Use op1 instead
of the lhs and make sure it is not undefined.
gcc/testsuite/
* gcc.dg/pr105597.c: New.
|
|
For a non-descriptor array, map(A(n:m)) was mapped as
map(tofrom:A[n-1] [len: ...]) map(alloc:A [pointer assign, bias: ...])
with this patch, it is changed to
map(tofrom:A[n-1] [len: ...]) map(firstprivate:A [pointer assign, bias: ...])
The latter avoids an alloc - and also avoids the race condition with
nowait in the enclosed testcase. (Note: pedantically, the testcase is
invalid since OpenMP 5.1, violating the map clause restriction at [354:10-13].)
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_clauses): When mapping nondescriptor
array sections, use GOMP_MAP_FIRSTPRIVATE_POINTER instead of
GOMP_MAP_POINTER for the pointer attachment.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/target-nowait-array-section.f90: New test.
|
|
2022-05-13 Sebastian Pop <spop@amazon.com>
gcc/
PR target/105162
* config/aarch64/aarch64-protos.h (atomic_ool_names): Increase dimension
of str array.
* config/aarch64/aarch64.cc (aarch64_atomic_ool_func): Call
memmodel_from_int and handle MEMMODEL_SYNC_*.
(DEF0): Add __aarch64_*_sync functions.
gcc/testsuite/
PR target/105162
* gcc.target/aarch64/sync-comp-swap-ool.c: New.
* gcc.target/aarch64/sync-op-acquire-ool.c: New.
* gcc.target/aarch64/sync-op-full-ool.c: New.
* gcc.target/aarch64/target_attr_20.c: Update check.
* gcc.target/aarch64/target_attr_21.c: Same.
libgcc/
PR target/105162
* config/aarch64/lse.S: Define BARRIER and handle memory MODEL 5.
* config/aarch64/t-lse: Add a 5th memory model for _sync functions.
|
|
Similar to 37e65643d3e ("testsuite/101269: fix testcase when used with
-m32"), RISC-V needs to be told not to put symbols in the
sdata/srodata/sbss sections.
gcc/testsuite/ChangeLog
* gcc.dg/debug/btf/btf-datasec-1.c: Don't use small data on RISC-V.
|
|
Similar to pr593993, RISC-V needs to limit the symbols sent to sdata.
gcc/testsuite/ChangeLog:
* g++.dg/opt/const7.C: Don't use small data on RISC-V.
|
|
Re-using some common things like EQ_EXPR and other relationals made
certain things easier, but complicated debugging and added extra overhead
when accessing lookup tables. With forthcoming additional relation types,
it makes more sense to simply have a distinct relation kind.
* gimple-range-fold.cc (fold_using_range::range_of_phi): Use new VREL_*
enumerated values.
* gimple-range-path.cc (maybe_register_phi_relation): Ditto.
* range-op.cc (*::lhs_op1_relation): Return relation_kind, and use
new VREL enumerated values.
(*::lhs_op2_relation): Ditto.
(*::op1_op2_relation): Ditto.
(*::fold_range): Use new VREL enumerated values.
(minus_op1_op2_relation_effect): Ditto.
(range_relational_tests): Ditto.
* range-op.h (fold_range, op1_range, op2_range): Use VREL_VARYING.
(lhs_op1_relation, lhs_op2_relation, op1_op2_relation): Return
relation_kind.
(*_op1_op2_relation): Return relation_kind.
(relop_early_resolve): Use VREL_UNDEFINED.
* value-query.cc (range_query::query_relation): Use VREL_VARYING.
* value-relation.cc (VREL_LAST): Change enumerated value.
(vrel_range_assert): Delete.
(print_relation): Remove range assert.
(rr_negate_table): Adjust table to use new enumerated values.
(relation_negate): Remove range assert.
(rr_swap_table): Adjust.
(relation_swap): Remove range assert.
(rr_intersect_table): Adjust.
(relation_intersect): Remove range assert.
(rr_union_table): Adjust.
(relation_union): Remove range assert.
(rr_transitive_table): Adjust.
(relation_transitive): Remove range assert.
(equiv_oracle::query_relation): Use new VREL enumerated values.
(equiv_oracle::register_relation): Ditto.
(relation_oracle::register_stmt): Ditto.
(dom_oracle::set_one_relation): Ditto.
(dom_oracle::register_transitives): Ditto.
(dom_oracle::query_relation): Ditto.
(path_oracle::register_relation): Ditto.
(path_oracle::query_relation): Ditto.
* value-relation.h (enum relation_kind_t): New relation_kind.
(*_op1_op2_relation): Adjust prototypes.
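For illustration only, the idea of a distinct relation kind that indexes lookup tables directly can be sketched as below. The type name relation_kind_sketch, the ordering of the enumerators, the table contents and the helper negate_relation are assumptions made for this sketch, not the definitions in value-relation.h or value-relation.cc.

```cpp
#include <cassert>

// Hypothetical stand-in for a dedicated relation kind.  The enumerator
// order is an assumption; the point is that the values are small and
// contiguous, so they can index a table without any offset arithmetic.
enum relation_kind_sketch
{
  VREL_VARYING = 0,  // nothing is known about the two operands
  VREL_UNDEFINED,    // the relation cannot hold (unreachable)
  VREL_LT,
  VREL_LE,
  VREL_GT,
  VREL_GE,
  VREL_EQ,
  VREL_NE,
  VREL_LAST          // number of kinds, used to size the tables
};

// Negation table, indexed directly by relation kind.
static const relation_kind_sketch negate_table[VREL_LAST] = {
  VREL_VARYING,    // !VARYING   -> VARYING
  VREL_UNDEFINED,  // !UNDEFINED -> UNDEFINED
  VREL_GE,         // !(a <  b)  -> a >= b
  VREL_GT,         // !(a <= b)  -> a >  b
  VREL_LE,         // !(a >  b)  -> a <= b
  VREL_LT,         // !(a >= b)  -> a <  b
  VREL_NE,         // !(a == b)  -> a != b
  VREL_EQ          // !(a != b)  -> a == b
};

// Look up the negated relation with a single array access.
relation_kind_sketch
negate_relation (relation_kind_sketch r)
{
  return negate_table[r];
}
```

With tree codes such as EQ_EXPR the table index would first have to be rebased and range-checked; a zero-based enumeration removes that step, which is presumably why the range asserts could be deleted.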
|
|
Union_ returns a boolean indicating if the operation changes the range.
Also optimize the common single-pair UNION single-pair case.
* gimple-range-edge.cc (calc_switch_ranges): Check union return value.
* value-range.cc (irange::legacy_verbose_union_): Add return value.
(irange::irange_single_pair_union): New.
(irange::irange_union): Add return value.
* value-range.h (class irange): Adjust prototypes.
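A minimal sketch of a union that reports whether it changed anything, specialised to the single-pair case. The types pair_t and range_t and the function single_pair_union are hypothetical and are not the irange implementation; they only show the shape of the fast path.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical representation: a range is a sorted list of disjoint,
// non-adjacent [low, high] pairs with inclusive bounds.
using pair_t = std::pair<int, int>;
using range_t = std::vector<pair_t>;

// Union SRC into DST, assuming each holds exactly one pair.
// Returns true if DST changed, false if SRC was already contained in DST.
bool
single_pair_union (range_t &dst, const range_t &src)
{
  const pair_t a = dst[0];
  const pair_t b = src[0];

  // SRC lies entirely inside DST: nothing to do.
  if (a.first <= b.first && b.second <= a.second)
    return false;

  // Widen before adding 1 so the adjacency test cannot overflow.
  const bool touching
    = (long long) b.first <= (long long) a.second + 1
      && (long long) a.first <= (long long) b.second + 1;

  if (touching)
    // Overlapping or adjacent pairs collapse into one pair.
    dst = { pair_t (std::min (a.first, b.first),
                    std::max (a.second, b.second)) };
  else if (a.first < b.first)
    dst = { a, b };
  else
    dst = { b, a };
  return true;
}
```

A caller such as a switch-range computation can then skip further work whenever the return value is false.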
|
|
Return true if the intersection of ranges changed the original value.
Speed up the case when there is no change by calling an efficient
contains routine.
* value-range.cc (irange::legacy_verbose_intersect): Add return value.
(irange::irange_contains_p): New.
(irange::irange_intersect): Add return value.
* value-range.h (class irange): Adjust prototypes.
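The fast path described above can be sketched as follows: check containment first, and only build a new range when the intersection really removes something. The names pair_t, range_t, contains_all and intersect are hypothetical and the code is not the irange implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical representation: a sorted list of disjoint, non-adjacent
// [low, high] pairs with inclusive bounds.
using pair_t = std::pair<int, int>;
using range_t = std::vector<pair_t>;

// Return true if every pair of INNER lies within some pair of OUTER.
// One linear pass, no allocation.
bool
contains_all (const range_t &outer, const range_t &inner)
{
  std::size_t o = 0;
  for (const pair_t &p : inner)
    {
      // Skip outer pairs that end before P starts.
      while (o < outer.size () && outer[o].second < p.first)
        ++o;
      if (o == outer.size ()
          || p.first < outer[o].first
          || p.second > outer[o].second)
        return false;
    }
  return true;
}

// Intersect DST with SRC.  Returns true if DST changed.
bool
intersect (range_t &dst, const range_t &src)
{
  // Fast path: SRC already covers DST, so the result is DST itself.
  if (contains_all (src, dst))
    return false;

  range_t out;
  std::size_t i = 0, j = 0;
  while (i < dst.size () && j < src.size ())
    {
      const int lo = std::max (dst[i].first, src[j].first);
      const int hi = std::min (dst[i].second, src[j].second);
      if (lo <= hi)
        out.push_back (pair_t (lo, hi));
      // Advance whichever pair ends first.
      if (dst[i].second < src[j].second)
        ++i;
      else
        ++j;
    }
  dst = out;
  return true;
}
```

Because the inputs are assumed to be normalised, a failed containment check implies that at least one value is removed, so returning true in the slow path is accurate.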
|
|
The "is_current" status is returned by parameter, but the function was
also returning it as its result, instead of returning TRUE if NAME had a
global range and FALSE if it did not.
* gimple-range-cache.cc (ranger_cache::get_global_range): Return the
had_global value instead.
|
|
We use the relation between op1 and op2 to help fold a statement, but
it was not provided to the lhs_op1_relation and lhs_op2_relation routines
to determine if it also creates a relation between the LHS and either operand.
gcc/
PR tree-optimization/104547
* gimple-range-fold.cc (fold_using_range::range_of_range_op): Add
the op1/op2 relation to the relation call.
* range-op.cc (*::lhs_op1_relation): Add param.
(*::lhs_op2_relation): Ditto.
(operator_minus::lhs_op1_relation): New.
(range_relational_tests): Add relation param.
* range-op.h (lhs_op1_relation, lhs_op2_relation): Adjust prototype.
gcc/testsuite/
* g++.dg/pr104547.C: New.
|
|
Internal-linkage entity mangling is entirely implementation defined --
there's no ABI issue. Let's not mangle in any module attachment to
them, it makes the symbols unnecessarily longer.
gcc/cp/
* mangle.cc (maybe_write_module): Check external linkage.
gcc/testsuite/
* g++.dg/modules/mod-sym-4.C: New.
|
|
VRP currently searches the ssa_name list for globals to export after it
finishes running. Recent changes have VRP calling a side-effect routine for
each stmt during the walk. This change simply exports globals as they are
calculated the final time during the walk.
* gimple-range.cc (gimple_ranger::register_side_effects): First check
if the DEF should be exported as a global.
* tree-vrp.cc (rvrp_folder::pre_fold_bb): Process PHI side effects,
which will export globals.
(execute_ranger_vrp): Remove call to export_global_ranges.
|
|
When we reset the path oracle, we should clear the killing defs vector.
* value-relation.cc (path_oracle::reset_path): Clear killing_defs.
|