|
In a define_insn, you can either use an explicit parallel for
the insns, or genrecog/genemit will add one for you.
The problem is that when genemit processes the pattern for clobbers
(to create the function add_clobbers), it has not added the implicit
parallel yet, but it also failed to account for the fact that there
could be an explicit parallel there.
This means that in some cases (like in the sh backend), add_clobbers
and recog had different ideas about whether there were clobbers on the insn.
This fixes the problem by looking through the explicit parallel
for the instruction in genemit.
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end/116058
gcc/ChangeLog:
* genemit.cc (struct clobber_pat): Change pattern to be rtvec.
Add code field.
(gen_insn): Look through an explicit parallel if there was one.
Update store to new clobber_pat.
(output_add_clobbers): Update call to gen_exp for the changed
clobber_pat.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
gcc/ChangeLog:
* config/riscv/sync-rvwmo.md: Add conditional length attributes.
* config/riscv/sync-ztso.md: Ditto.
* config/riscv/sync.md: Fix incorrect insn length attributes and
reformat existing conditional checks.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
This patch ensures this testcase is run for armv8.1-m.main+mve as this is
testing that doloops with function calls that aren't intrinsics get rejected
as potential doloop targets during ivopts. For other targets this loop gets
rejected for different reasons.
gcc/testsuite/ChangeLog:
* gcc.target/arm/mve/ivopts-3.c: Add require target and options.
|
|
According to the Neoverse V2 Software Optimization Guide (section 4.14), the
instruction pairs CMP+CSEL and CMP+CSET can be fused, which had not been
implemented so far. This patch implements and tests the two fusion pairs.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
There was also no non-noise impact on SPEC CPU2017 benchmark.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64.cc (aarch_macro_fusion_pair_p): Implement
fusion logic.
* config/aarch64/aarch64-fusion-pairs.def (cmp+csel): New entry.
(cmp+cset): Likewise.
* config/aarch64/tuning_models/neoversev2.h: Enable logic in
field fusible_ops.
gcc/testsuite/
* gcc.target/aarch64/fuse_cmp_csel.c: New test.
* gcc.target/aarch64/fuse_cmp_cset.c: Likewise.
|
|
The testcase contains the constant:
    arr2 = svreinterpret_u8(svdup_u32(0x0a0d5c3f));
which was initially hoisted by hand, but which gimple optimisers later
propagated to each use (as expected). The constant was then expanded
as a load-and-duplicate from the constant pool. Normally that load
should then be hoisted back out of the loop, but may_trap_or_fault_p
stopped that from happening in this case.
The code responsible was:
    if (/* MEM_NOTRAP_P only relates to the actual position of the memory
           reference; moving it out of context such as when moving code
           when optimizing, might cause its address to become invalid.  */
        code_changed
        || !MEM_NOTRAP_P (x))
      {
        poly_int64 size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1;
        return rtx_addr_can_trap_p_1 (XEXP (x, 0), 0, size,
                                      GET_MODE (x), code_changed);
      }
where code_changed is true. (Arguably it doesn't need to be true in
this case, if we inserted invariants on the preheader edge, but it
would still need to be true for conditionally executed loads.)
Normally this wouldn't be a problem, since rtx_addr_can_trap_p_1
would recognise that the address refers to the constant pool.
However, the SVE load-and-replicate instructions have a limited
offset range, so it isn't possible for them to have a LO_SUM address.
All we have is a plain pseudo base register.
MEM_READONLY_P is defined as:
    /* 1 if RTX is a mem that is statically allocated in read-only memory.  */
    #define MEM_READONLY_P(RTX) \
      (RTL_FLAG_CHECK1 ("MEM_READONLY_P", (RTX), MEM)->unchanging)
and so I think it should be safe to move memory references if both
MEM_READONLY_P and MEM_NOTRAP_P are true.
The testcase isn't a minimal reproducer, but I think it's good
to have a realistic full routine in the testsuite.
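The rule described above can be sketched as a small stand-alone model. This is not GCC code: struct toy_mem, its fields and toy_mem_may_trap are hypothetical stand-ins for the MEM flags and for the check in may_trap_p_1, and addr_can_trap stands in for whatever rtx_addr_can_trap_p_1 would answer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the flags on a MEM rtx.  */
struct toy_mem
{
  bool notrap;        /* models MEM_NOTRAP_P */
  bool readonly;      /* models MEM_READONLY_P */
  bool addr_can_trap; /* models the answer of rtx_addr_can_trap_p_1 */
};

/* Return true if the toy MEM might trap.  CODE_CHANGED models moving the
   reference away from its original position.  */
bool
toy_mem_may_trap (const struct toy_mem *m, bool code_changed)
{
  assert (m);

  /* Trust the no-trap flag if the reference stays in place, or if the
     memory is also statically allocated and read-only.  */
  if (m->notrap && (!code_changed || m->readonly))
    return false;

  /* Otherwise fall back to analysing the address itself.  */
  return m->addr_can_trap;
}
```

In this model a reference with both flags set is treated as safe to move even when the address analysis alone would be pessimistic, which is the case of a plain pseudo base register described above.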
gcc/
PR rtl-optimization/116145
* rtlanal.cc (may_trap_p_1): Trust MEM_NOTRAP_P even for code
movement if MEM_READONLY_P is also true.
gcc/testsuite/
PR rtl-optimization/116145
* gcc.target/aarch64/sve/acle/general/pr116145.c: New test.
|
|
This DR clarifies that "int main() = delete;" is ill-formed.
PR c++/116169
gcc/cp/ChangeLog:
* decl.cc (cp_finish_decl): Disallow deleting ::main.
gcc/testsuite/ChangeLog:
* g++.dg/DRs/dr882.C: New test.
|
|
This provides and uses a CTOR to initialize the object used in
tree walks to track local variable uses. This makes the idiom
used consistent.
gcc/cp/ChangeLog:
* coroutines.cc (struct local_vars_frame_data): Add a
CTOR.
(morph_fn_to_coro): Use CTOR for local_vars_frame_data
instead of brace init.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
We maintain state on the progress of await analysis in an object that
is passed to the various tree walks used. Some of the state had become
stale (i.e. unused members). Remove those and provide a CTOR so that
updates are localised.
Remove the file scope hash_map used to collect the final state for the
actor function and make that part of the suspend point state.
gcc/cp/ChangeLog:
* coroutines.cc (struct susp_frame_data): Remove unused members,
provide a CTOR.
(morph_fn_to_coro): Use susp_frame_data CTOR, and make the suspend
state hash map local to the morph function.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
The current code fails to check for void expression types because it does
not look up the type. Fixed thus.
gcc/cp/ChangeLog:
* coroutines.cc (replace_continue): Look up expression type.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
The problem here is that when forwprop does a copy prop into a statement,
we mark the uses of that statement as possibly needing to be removed. But if
that statement happens to be a debug statement, there will be a difference when
compiling with debugging info turned on vs. off; this is not expected.
So the fix is not to add the old use to the DCE list for processing if the
statement was a debug statement.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/116156
gcc/ChangeLog:
* tree-ssa-forwprop.cc (pass_forwprop::execute): Don't add
uses if the statement was a debug statement.
gcc/testsuite/ChangeLog:
* c-c++-common/torture/pr116156-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
Getting correct aliasing behavior requires that structures and unions
that contain a byte array, i.e. an array of non-atomic character
type (N3254), are marked with TYPE_TYPELESS_STORAGE. This change
also affects earlier language modes.
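A minimal sketch of the kind of code this is about, assuming N3254 allows a byte array to serve as storage for an object of another type. struct int_storage and store_and_load are illustrative names, not taken from the new tests.

```c
#include <assert.h>

/* A structure whose only member is a byte array used as raw storage.
   _Alignas makes the bytes suitably aligned for an int.  */
struct int_storage
{
  _Alignas (int) unsigned char bytes[sizeof (int)];
};

/* Write an int into the byte array and read it back through an int
   lvalue.  Accesses like these rely on the byte array being allowed to
   hold an object of another type.  */
int
store_and_load (struct int_storage *s, int value)
{
  assert (s);
  int *p = (int *) s->bytes;
  *p = value;
  return *p;
}
```

Marking the enclosing structure with TYPE_TYPELESS_STORAGE is what lets type-based alias analysis treat the int accesses as possibly touching the structure.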
gcc/c/
* c-decl.cc (grokdeclarator, finish_struct): Set and
propagate TYPE_TYPELESS_STORAGE.
gcc/testsuite/
* gcc.dg/c2y-byte-alias-1.c: New test.
* gcc.dg/c2y-byte-alias-2.c: New test.
* gcc.dg/c2y-byte-alias-3.c: New test.
|
|
gcc/ChangeLog:
* config/i386/constraints.md: Fixed the comment/naming for je/jM/jO.
* config/i386/predicates.md (apx_ndd_memory_operand): Renamed and
fixed the comment.
(apx_evex_memory_operand): New name.
(apx_ndd_add_memory_operand): Ditto.
(apx_evex_add_memory_operand): Ditto.
|
|
SPARK_Mode aspect was not properly propagated to the body of
a standalone child subprogram from the generated spec for that subprogram,
leading GNATprove to not analyze this body. Now fixed.
gcc/ada/
* aspects.adb (Find_Aspect): Take into account the case of a node
of kind N_Defining_Program_Unit_Name.
* sem_ch10.adb (Analyze_Compilation_Unit): Copy the SPARK aspect
from the spec to the body. Delay semantic analysis after that
point to ensure that SPARK_Mode is properly analyzed.
|
|
Fix a number of problems in handling of actions generated for a
2-dimensional array aggregate where the outer aggregate has iterated
component association and the inner aggregate involves run-time checks.
gcc/ada/
* exp_aggr.adb (Add_Loop_Actions): Actions are now attached to
iterated component association just like they are attached to
ordinary component association.
(Build_Array_Aggr_Code): If resolution of the array aggregate
generated some actions, e.g. for run-time checks, then we must
keep them; same for the Other_Clause.
* sem_aggr.adb (Resolve_Iterated_Component_Association): Unset
references to iterator variable in loop actions (which might come
from run-time check), just as these references are unset in the
expression itself.
|
|
Code cleanup; semantics is unaffected.
gcc/ada/
* exp_util.adb (Insert_Actions): Remove null ELSE branch.
|
|
Code cleanup; behavior is unaffected.
gcc/ada/
* exp_aggr.adb (Add_Loop_Actions): Change manipulation of list
to avoid unnecessary calls to Parent and Loop_Actions.
|
|
Code cleanup; semantics is unaffected.
gcc/ada/
* exp_util.adb (Insert_Actions): Move negation in front of
complex conjunctions.
|
|
Code cleanup; semantics is unaffected.
gcc/ada/
* exp_aggr.adb (Gen_Assign): Fix layout.
* sem_aggr.adb (Empty_Range): Reuse Choice_List.
|
|
The compiler rejects various cases of container aggregates with
iterated_element_associations that include a loop_parameter_subtype_indication
or that include the "reverse" keyword. The fixes are in the parser, for
accepting the syntax for these cases, as well as for properly accounting
for reverse iterators in the analyzer and expander.
gcc/ada/
* exp_aggr.adb
(Expand_Container_Aggregate.Expand_Iterated_Component): Set the
Reverse_Present flag when creating the loop's iteration_scheme.
* gen_il-gen-gen_nodes.adb: Add flag Reverse_Present to
N_Iterated_Component_Association nodes.
* par-ch3.adb (P_Constraint_Op): Remove testing for and ignoring
of Tok_In following a constraint. It's allowed for "in" to follow
a constraint of loop_parameter_subtype_indication of an
iterator_specification, so it shouldn't be ignored.
* par-ch4.adb (P_Iterated_Component_Association): Account for
"reverse" following the "in" in an iterated_component_association,
and set the Reverse_Present flag on the
N_Iterated_Component_Association node. Add handling for a ":"
following the identifier in an iterator_specification of an
iterated_element_association, sharing the code with the "of" case
(which backs up to the identifier at the beginning of the
iterator_specification). Fix incorrect trailing comment following
the call to Scan.
(Build_Iterated_Element_Association): Set the Reverse_Present flag
on an N_Loop_Parameter_Specification node of an
N_Iterated_Element_Association.
* par-ch5.adb (P_Iterator_Specification): Remove error-recovery
and error code that reports "subtype indication is only legal on
an element iterator", as that error can no longer be emitted (and
was formerly only reported on one fixedbugs test).
* sem_aggr.adb
(Resolve_Container_Aggregate.Resolve_Iterated_Association): When
creating an N_Iterator_Specification for an
N_Iterated_Component_Association, set the Reverse_Present flag of
the N_Iterator_Specification from the flag on the latter.
* sinfo.ads: Add comments for the Reverse_Present flag, which is
now allowed on nodes of kind N_Iterated_Component_Association.
|
|
Unlike the aspect, the pragma needs to be propagated explicitly from a
generic subprogram to its instances.
gcc/ada/
* sem_ch12.adb (Analyze_Subprogram_Instantiation): Propagate the
No_Raise flag like the No_Return flag.
|
|
This patch makes a minor modification to Expand_Container_Aggregate
in order to silence a GNAT SAS false positive.
gcc/ada/
* exp_aggr.adb (Expand_Container_Aggregate): Remove variables.
(To_Int): New function.
(Add_Range_Size): Use newly introduced function.
|
|
Add complete functional contracts to all subprograms in
Ada.Strings.Unbounded, except Count, following the specification from
Ada RM A.4.5. These contracts are similar to the contracts found in
Ada.Strings.Fixed and Ada.Strings.Bounded.
A difference is that type Unbounded_String is controlled, thus we avoid
performing copies of a parameter Source with Source'Old, and instead
apply 'Old attribute on the enclosing call, such as Length(Source)'Old.
As Unbounded_String is controlled, the implementation is not in SPARK.
Instead, we have separately proved a slightly different implementation
for which Unbounded_String is not controlled, against the same
specification. This ensures that the specification is consistent.
To minimize differences between this test from the SPARK testsuite and
the actual implementation (the one in a-strunb.adb), and to avoid
overflows in the actual implementation, some code is slightly rewritten.
Delete and Insert are modified to return the correct result in all
cases allowed by the standard.
The same contracts are added to the version in a-strunb__shared.ads and
similar implementation patches are applied to the body
a-strunb__shared.adb. In particular, tests are added to avoid overflows
on strings for which the last index is Natural'Last, and the computations
that involve Sum to guarantee that an exception is raised in case of
overflow are rewritten to guarantee correct detection and no intermediate
overflows (and such tests are applied consistently between the procedure
and the function when both exist).
gcc/ada/
* libgnat/a-strunb.adb (Sum, Saturated_Sum, Saturated_Mul): Adapt
function signatures to more precise types that allow proof.
(function "&"): Conditionally assign a slice to avoid possible
overflow which only occurs when the assignment is a noop (because
the slice is empty in that case).
(Append): Same.
(function "*"): Retype K to avoid a possible overflow. Add early
return on null length for proof.
(Delete): Fix implementation to return the correct result in all
cases allowed by the Ada standard.
(Insert): Same. Also avoid possible overflows.
(Length): Rewrite as expression function for proof.
(Overwrite): Avoid possible overflows.
(Slice): Same.
(To_String): Rewrite as expression function for proof.
* libgnat/a-strunb.ads: Extend Assertion_Policy to new contracts
used. Add complete functional contracts to all subprograms of the
public API except Count.
* libgnat/a-strunb__shared.adb (Sum): Adapt function signature to
more precise types that allow proof.
(function "&"): Conditionally assign a slice to avoid possible
overflow.
(function "*"): Retype K to avoid a possible overflow.
(Delete): Fix implementation to return the correct result in all
cases allowed by the Ada standard.
(Insert): Avoid possible overflows.
(Overwrite): Avoid possible overflows.
(Replace_Slice): Same.
(Slice): Same.
(To_String): Rewrite as expression function for proof.
* libgnat/a-strunb__shared.ads: Extend Assertion_Policy to new
contracts used. Add complete functional contracts to all
subprograms of the public API except Count. Mark public part of
spec as in SPARK.
|
|
This patch is motivated by a GNAT SAS report.
gcc/ada/
* scng.adb (Slit): Initialize object in uncommon path.
|
|
gcc/ada/
* exp_ch4.adb (Generate_Temporary): Remove unused procedure.
|
|
Change Is_Finalizer from synthesized attribute into flag. Remove duplicate
Is_Finalizer_Proc. Add new Try_Inline_Always for backend usage.
gcc/ada/
* einfo-utils.ads (Is_Finalizer): Delete.
* einfo-utils.adb (Is_Finalizer): Delete.
* einfo.ads: Adjust comment.
* gen_il-fields.ads, gen_il-gen-gen_entities.adb: Add Is_Finalizer
flag.
* exp_ch3.adb (Build_Init_Procedure): Set it.
* exp_ch7.adb (Create_Finalizer): Likewise.
* exp_util.adb (Try_Inline_Always): New function.
* exp_util.ads (Try_Inline_Always): New function.
* sem_elab.adb (Is_Finalizer_Proc): Replace with Is_Finalizer.
|
|
Unix timestamp jumps one second back when a leap second
is applied and doesn't count cumulative leap seconds.
This was not taken into account in conversions between
Unix time and Ada time. Now fixed.
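A toy sketch of the mismatch, not the a-calend.adb implementation. The table is deliberately partial: it holds only the Unix timestamps of 2015-07-01 00:00:00 UTC and 2017-01-01 00:00:00 UTC, the instants right after the two most recent inserted leap seconds. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unix timestamps of the first second after an inserted leap second.
   Partial table, for illustration only.  */
static const int64_t leap_after[] = { 1435708800, 1483228800 };

/* Number of leap seconds from the table that have elapsed at Unix time T.  */
int
toy_leaps_elapsed (int64_t t)
{
  int n = 0;
  for (size_t i = 0; i < sizeof leap_after / sizeof leap_after[0]; i++)
    {
      /* The table must be in ascending order.  */
      assert (i == 0 || leap_after[i - 1] < leap_after[i]);
      if (t >= leap_after[i])
        n++;
    }
  return n;
}

/* Convert Unix time to a count that also includes the elapsed leap
   seconds from the table.  */
int64_t
toy_unix_to_continuous (int64_t t)
{
  return t + toy_leaps_elapsed (t);
}
```

Two Unix timestamps that are one second apart across a leap second end up two seconds apart in the continuous count, which is the gap a conversion has to account for.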
gcc/ada/
* libgnat/a-calend.adb: Modify unix time handling.
|
|
gcc/ada/
* doc/gnat_rm/implementation_defined_pragmas.rst: Add examples.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
|
|
This patch enhances support for this language feature by rejecting
more ambiguous function calls. In terms of name resolution, the
analysis of interpolated expressions is now treated as an expression
of any type, as required by the documentation. Additionally, support
for nested interpolated strings has been removed.
gcc/ada/
* gen_il-fields.ads (Is_Interpolated_String_Literal): New field.
* gen_il-gen-gen_nodes.adb (Is_Interpolated_String_Literal): The
new field is a flag handled by the parser (syntax flag).
* par-ch2.adb (P_Interpolated_String_Literal): Decorate the new
flag.
* sem_ch2.adb (Analyze_Interpolated_String_Literal): Improve code
detecting and reporting ambiguous function calls.
* sem_res.adb (Resolve_Interpolated_String_Literal): Restrict
resolution imposed by the context type to string literals that
have the new flag.
* sinfo.ads (Is_Interpolated_String_Literal): New field defined in
string literals. Fix documentation of the syntax rule of
interpolated string literal.
|
|
An assignment statement whose LHS is of a reference type is never legal. If
no other legality rule is violated, then it is ambiguous. In some cases this
ambiguity was not correctly detected.
gcc/ada/
* sem_ch5.adb (Analyze_Assignment): Delete code that was
incorrectly implementing a preference rule.
|
|
This adds a variant of the System.Finalization_Primitives unit that supports
only controlled types with relaxed finalization, and adds the description of
its implementation to Exp_Ch7.
gcc/ada/
* exp_ch7.adb (Relaxed Finalization): New paragraph in head
comment.
* sem_ch13.adb (Validate_Finalizable_Aspect): Give an error
message if strict finalization is required but not supported by
the runtime.
|
|
gcc/ada/
* sem_util.adb (Set_Referenced_Modified): Set referenced as LHS
for the prefixes of array slices.
|
|
The current instance of a type or subtype (see RM 8.6) is an object or
value, not a type or subtype. So a name denoting such a current instance is
illegal in any context that requires a name denoting a type or subtype.
In some cases this error was not detected.
gcc/ada/
* sem_ch8.adb (Find_Type): If Is_Current_Instance returns True for
N (and Comes_From_Source (N) is also True) then flag an error.
Call Is_Current_Instance (twice) instead of duplicating (twice)
N_Access_Definition-related code in Is_Current_Instance.
* sem_util.adb (Is_Current_Instance): Implement
access-type-related clauses of the RM 8.6 current instance rule.
For pragmas Predicate and Predicate_Failure, distinguish between
the first and subsequent pragma arguments.
|
|
In some cases, a legal type conversion in a generic package is correctly
accepted but the corresponding type conversion in an instance of the generic
is incorrectly rejected.
gcc/ada/
* sem_res.adb (Valid_Conversion): Test In_Instance instead of
In_Instance_Body.
|
|
The new aspect is automatically set on the Adjust and Finalize primitives of
finalizable types, unless Relaxed_Finalization is explicitly set to False,
but it can also be specified directly on subprograms. It is also available
in earlier versions of the language by means of the associated pragma.
gcc/ada/
* aspects.ads (Aspect_Id): Add Aspect_No_Raise identifier.
(Implementation_Defined_Aspect): Add True for Aspect_No_Raise.
(Is_Representation_Aspect): Add False for Aspect_No_Raise.
(Aspect_Names): Add Name_No_Raise for Aspect_No_Raise.
(Aspect_Delay): Add Always_Delay for Aspect_No_Raise.
* checks.ads (Raise_Checks_Suppressed): New function.
(Apply_Raise_Check): New procedure.
* checks.adb (Apply_Raise_Check): New procedure.
(Raise_Checks_Suppressed): New function.
* doc/gnat_rm/gnat_language_extensions.rst (Generalized
Finalization): Update.
* doc/gnat_rm/implementation_defined_aspects.rst (No_Raise): New.
* doc/gnat_rm/implementation_defined_characteristics.rst (Check
names): Document Raise_Check and alphabetize others.
* doc/gnat_rm/implementation_defined_pragmas.rst (No_Raise): New.
* einfo.ads (No_Raise): New flag defined in subprograms and
generic subprograms.
* exp_ch6.adb (Expand_N_Subprogram_Body): Call Apply_Raise_Check
at the end of the processing.
* exp_ch11.adb (Get_RT_Exception_Name): Add alternative for
PE_Raise_Check_Failed to case statement.
* gen_il-fields.ads (Opt_Field_Enum): Add No_Raise identifier.
* gen_il-gen-gen_entities.adb (Subprogram_Kind): Add No_Raise as
semantical flag.
(Generic_Subprogram_Kind): Likewise.
* par-prag.adb (Prag): Add alternative for Pragma_No_Raise to case
statement.
* sem_ch13.adb (Validate_Finalizable_Aspect): Set No_Raise on the
Adjust and Finalize primitives if Relaxed_Finalization is set.
* sem_prag.adb (Analyze_Pragma): Add alternative for
Pragma_No_Raise to case statement.
(Sig_Flag): Add 0 for Pragma_No_Raise.
* snames.ads-tmpl (Remaining pragma names): Add Name_No_Raise.
(Names of recognized checks): Add Name_Raise_Check.
(Pragma_Id): Add Pragma_No_Raise identifier.
* types.ads (Raise_Check): New named number.
(All_Checks): Adjust.
(RT_Exception_Code): Add PE_Raise_Check_Failed identifier.
(Rkind): Add PE_Reason for PE_Raise_Check_Failed and alphabetize.
* types.h (RT_Exception_Code): Add PE_Raise_Check_Failed as 38.
(LAST_REASON_CODE): Adjust.
* libgnat/a-except.adb (Rcheck_PE_Raise_Check): New procedure with
pragmas Export, No_Return and Machine_Attributes.
(Rmsg_38): New string constant.
* gnat_rm.texi: Regenerate.
|
|
The pseudo random number generators used in GNAT are not
suitable for applications that require cryptographic
security. While this was mentioned in some places, others
did not have a corresponding note, leading to these
generators being used in a non-suitable context.
gcc/ada/
* doc/gnat_rm/standard_library_routines.rst: Add note to section
of Ada.Numerics.Discrete_Random and Ada.Numerics.Float_Random.
* doc/gnat_rm/the_gnat_library.rst: Add note to section about
GNAT.Random_Numbers.
* libgnat/a-nudira.ads: Add note about cryptographic properties.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
|
|
gcc/ada/
* doc/gnat_rm/gnat_language_extensions.rst: Fix layout of section.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
|
|
This happens when the expression is a reference to a formal parameter of
the function, or a conditional expression with such a reference as one of
its dependent expressions, because the RM 6.5(8/5) subclause prescribes a
tag reassignment in this case, which requires freezing the tagged type in
the GNAT freezing model, although the language says there is no freezing.
In other words, it's another occurrence of the discrepancy between this
model tailored to Ada 95 and the freezing rules introduced in Ada 2012,
that is papered over by Should_Freeze_Type and the associated processing.
gcc/ada/
* exp_util.ads (Is_Conversion_Or_Reference_To_Formal): New
function declaration.
* exp_util.adb (Is_Conversion_Or_Reference_To_Formal): New
function body.
* exp_ch6.adb (Expand_Simple_Function_Return): Call the predicate
Is_Conversion_Or_Reference_To_Formal in order to decide whether a
tag check or reassignment is needed.
* freeze.adb (Should_Freeze_Type): Move declaration and body to
the appropriate places. Also return True for tagged results
subject to the expansion done in Expand_Simple_Function_Return
that is guarded by the predicate
Is_Conversion_Or_Reference_To_Formal.
|
|
This patch fixes an assertion failure in some cases in the code to
warn about possible misuse of range attributes in loops. The root of
the problem is that this code failed to consider the case where the
outer loop is a while loop.
Also fix a typo in a nearby comment.
gcc/ada/
* sem_ch5.adb (Analyze_Loop_Statement): Fix loop pattern detection
code. Fix typo.
|
|
The je constraint should be used for APX NDD ADD with register source
operand. The jM is for APX NDD patterns with immediate operand.
gcc/ChangeLog:
* config/i386/i386.md (nf_mem_constraint): Fixed the constraint
for the define_subst_attr.
(nf_mem_constraint): Added new define_subst_attr.
(*add<mode>_1<nf_name>): Fixed the constraint.
|
|
gcc/ChangeLog:
* config/loongarch/genopts/gen-evolution.awk: Do not use
"length()" to compute the size of an array.
|
|
This patch improves the Advanced SIMD popcount expansion by using SVE if
available.
For example, GCC currently generates the following code sequence for V2DI:
    cnt     v31.16b, v31.16b
    uaddlp  v31.8h, v31.16b
    uaddlp  v31.4s, v31.8h
    uaddlp  v31.2d, v31.4s
However, by using SVE, we can generate the following sequence instead:
    ptrue   p7.b, all
    cnt     z31.d, p7/m, z31.d
Similar improvements can be made for V4HI, V8HI, V2SI and V4SI too.
The scalar popcount expansion can also be improved similarly by using SVE and
those changes will be included in a separate patch.
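To see why the two sequences compute the same thing, here is a scalar C model of each for V2DI. It is only a sketch: the function names are made up, and the byte order within a lane is assumed to be little-endian, which does not affect the result.

```c
#include <assert.h>
#include <stdint.h>

/* Count the set bits in one byte (what CNT does per byte lane).  */
static uint8_t
popcount_byte (uint8_t b)
{
  uint8_t n = 0;
  while (b)
    {
      n += b & 1;
      b >>= 1;
    }
  return n;
}

/* Model of the Advanced SIMD sequence: per-byte counts followed by three
   pairwise widening additions (16 bytes -> 8 -> 4 -> 2 lanes).  */
void
popcount_v2di_pairwise (const uint64_t in[2], uint64_t out[2])
{
  uint8_t b[16];
  uint16_t h[8];
  uint32_t s[4];

  assert (in && out);
  for (int i = 0; i < 16; i++)
    b[i] = popcount_byte ((uint8_t) (in[i / 8] >> (8 * (i % 8))));
  for (int i = 0; i < 8; i++)
    h[i] = (uint16_t) (b[2 * i] + b[2 * i + 1]);
  for (int i = 0; i < 4; i++)
    s[i] = (uint32_t) h[2 * i] + h[2 * i + 1];
  for (int i = 0; i < 2; i++)
    out[i] = (uint64_t) s[2 * i] + s[2 * i + 1];
}

/* Model of the SVE form: one count per 64-bit lane.  */
void
popcount_v2di_direct (const uint64_t in[2], uint64_t out[2])
{
  assert (in && out);
  for (int i = 0; i < 2; i++)
    {
      uint64_t x = in[i], n = 0;
      while (x)
        {
          x &= x - 1; /* clear the lowest set bit */
          n++;
        }
      out[i] = n;
    }
}
```

Both functions return the same per-lane counts; the second needs no widening steps because the count is already done at the final element width.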
PR target/113860
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (popcount<mode>2): Add TARGET_SVE
support.
* config/aarch64/aarch64-sve.md (@aarch64_pred_<optab><mode>): Use new
iterator SVE_VDQ_I.
* config/aarch64/iterators.md (SVE_VDQ_I): New mode iterator.
(VPRED): Add V8QI, V16QI, V4HI, V8HI and V2SI.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/popcnt-sve.c: New test.
Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
|
|
As Andrew pointed out in PR116148, fam-in-union-alone-in-struct-2.c
was designed for little-endian targets; the recent commit r15-2403 made it
run on big-endian (BE) targets as well, which exposed PR116148.
This patch adjusts the expected data for members in with_fam_2_v
and with_fam_3_v to take endianness into account, and also updates
with_fam_3_v.b[1] from 0x5f6f7f7f to 0x5f6f7f8f to avoid two "7f"s.
PR testsuite/116148
gcc/testsuite/ChangeLog:
* c-c++-common/fam-in-union-alone-in-struct-2.c: Define macros
WITH_FAM_2_V_B[03] and WITH_FAM_3_V_A[07] according to endianness, update
the checks to use these macros and initialize with_fam_3_v.b[1] with
0x5f6f7f8f instead of 0x5f6f7f7f.
|
|
In PR116149 we choose a wrong vector length which causes wrong values in
a reduction. The problem happens in avlprop where we choose the
number of units in the instruction's mode as vector length. For the
non-scalar variants the respective operand has the correct non-widened
mode. For the scalar variants, however, the same operand has a scalar
mode which obviously only has one unit. This makes us choose VL = 1
leaving three elements undisturbed (so potentially -1). Those end up
in the reduction causing the wrong result.
This patch adjusts the mode_idx just for the scalar variants of the
affected instruction patterns.
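The effect of the wrong vector length can be modelled in plain C. This is a toy sketch, not RVV code: the four-element vectors, the function names and the initial contents of -1 are assumptions chosen to mirror the description above.

```c
#include <assert.h>
#include <stdint.h>

#define NUNITS 4

/* Toy widening add of a vector and a scalar.  Only the first VL elements
   of DEST are written; the rest keep their previous contents, which
   models elements left undisturbed.  */
void
toy_widen_add_scalar (int32_t dest[NUNITS], const int16_t src[NUNITS],
                      int16_t scalar, int vl)
{
  assert (vl >= 0 && vl <= NUNITS);
  for (int i = 0; i < vl; i++)
    dest[i] = (int32_t) src[i] + (int32_t) scalar;
}

/* Sum all NUNITS elements, as a reduction over the full vector would.  */
int32_t
toy_reduce_sum (const int32_t v[NUNITS])
{
  int32_t sum = 0;
  for (int i = 0; i < NUNITS; i++)
    sum += v[i];
  return sum;
}
```

With src = {1, 2, 3, 4}, scalar = 10 and a destination that previously held -1 in every element, VL = 4 gives a sum of 50, while VL = 1 gives 11 + (-1) * 3 = 8, because the three stale elements still take part in the reduction.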
gcc/ChangeLog:
PR target/116149
* config/riscv/vector.md: Fix mode_idx attribute of scalar
widen add/sub variants.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr116149.c: New test.
|
|
A static analyzer found a pasto in gfc_get_array_descr_info.
The code does
    t = base_decl;
    if (!integer_zerop (dtype_off))
      t = fold_build_pointer_plus (t, dtype_off);
    dtype = TYPE_MAIN_VARIANT (get_dtype_type_node ());
    field = gfc_advance_chain (TYPE_FIELDS (dtype), GFC_DTYPE_RANK);
    rank_off = byte_position (field);
    if (!integer_zerop (dtype_off))
      t = fold_build_pointer_plus (t, rank_off);
i.e. uses the same !integer_zerop check between both, while it should
be checking rank_off in the latter case.
This actually doesn't change anything on the generated code, because
both the dtype_off and rank_off aren't zero,
    typedef struct dtype_type
    {
      size_t elem_len;
      int version;
      signed char rank;
      signed char type;
      signed short attribute;
    }
    dtype_type;
    struct {
      type *base_addr;
      size_t offset;
      dtype_type dtype;
      index_type span;
      descriptor_dimension dim[];
    };
dtype_off is 16 on 64-bit arches and 8 on 32-bit ones and rank_off is
12 on 64-bit arches and 8 on 32-bit arches, so this patch is just to
pacify those static analyzers or be prepared if the ABI changes in the
future. Because in the current ABI both of those are actually non-zero,
doing
    if (!integer_zerop (something)) t = fold_build_pointer_plus (t, something);
isn't actually an optimization; it just consumes more compile time. If
the ABI changes and we forget to re-add it, nothing bad happens:
fold_build_pointer_plus handles 0 addends fine, it just takes some compile
time to handle that.
I've kept this if (!integer_zerop (data_off)) guard earlier because
data_off is 0 in the current ABI, so it is an optimization there.
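The offsets quoted above can be checked with offsetof on a stand-alone copy of the layout. This is a sketch: index_type, descriptor_dimension and the void element type are assumed stand-ins for definitions not shown here, and toy_dtype_off and toy_rank_off are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins for types that are not shown in the message.  */
typedef ptrdiff_t index_type;
typedef struct
{
  index_type stride, lower_bound, upper_bound;
} descriptor_dimension;

typedef struct dtype_type
{
  size_t elem_len;
  int version;
  signed char rank;
  signed char type;
  signed short attribute;
}
dtype_type;

struct toy_descriptor
{
  void *base_addr;
  size_t offset;
  dtype_type dtype;
  index_type span;
  descriptor_dimension dim[];
};

/* rank directly follows elem_len and version, so its offset is never 0.  */
static_assert (offsetof (dtype_type, rank) == sizeof (size_t) + sizeof (int),
               "unexpected padding before rank");

/* Offset of the dtype member within the descriptor.  */
size_t
toy_dtype_off (void)
{
  return offsetof (struct toy_descriptor, dtype);
}

/* Offset of the rank member within dtype_type.  */
size_t
toy_rank_off (void)
{
  return offsetof (dtype_type, rank);
}
```

On a typical LP64 target this gives 16 and 12, and on a typical ILP32 target 8 and 8, matching the numbers above; neither offset is zero.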
2024-08-01 Jakub Jelinek <jakub@redhat.com>
* trans-types.cc (gfc_get_array_descr_info): Don't test if
!integer_zerop (dtype_off), use fold_build_pointer_plus
unconditionally.
|
|
Also add a testcase for -mabi=lp64d where 'd' is required.
gcc/ChangeLog:
PR target/116111
* config/riscv/riscv.cc (riscv_option_override): Add error.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/arch-41.c: New test.
* gcc.target/riscv/pr116111.c: New test.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
[PR116113]
The following testcase ICEs, because for structured binding error recovery
DECL_DECOMP_BASE is kept NULL and the newly added code to pick up saved
value from the base assumes that on structured binding bases the
TARGET_EXPR will be always there (that is the case if there are no errors).
The following patch fixes it by testing DECL_DECOMP_BASE before
dereferencing it; another option would be not to do that if
error_operand_p (cond).
2024-08-01 Jakub Jelinek <jakub@redhat.com>
PR c++/116113
* semantics.cc (maybe_convert_cond): Check DECL_DECOMP_BASE
is non-NULL before dereferencing it.
(finish_switch_cond): Likewise.
* g++.dg/cpp26/decomp11.C: New test.
|
|
This adds a cost model and core definition for Cortex-X925.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (cortex-x925): New.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/tuning_models/cortexx925.h: New file.
* config/aarch64/aarch64.cc: Use it.
* doc/invoke.texi: Document it.
|
|
This updates the cost for Neoverse N2 to reflect the updated
Software Optimization Guide.
gcc/ChangeLog:
* config/aarch64/tuning_models/neoversen2.h: Update costs.
|
|
This updates the costs for generic-armv9-a based on the updated costs for
Neoverse V2 and Neoverse N2.
gcc/ChangeLog:
* config/aarch64/tuning_models/generic_armv9_a.h: Update costs.
|