Age | Commit message (Collapse) | Author | Files | Lines |
|
We have been requiring the get_return_on_allocation_fail() call to have the
same type as the ramp. This is not intended by the standard, so relax that
to allow anything convertible to the ramp return.
PR c++/109682
gcc/cp/ChangeLog:
* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Allow for cases where
get_return_on_allocation_fail has a type convertible to the ramp
return type.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/pr109682.C: New test.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
[PR100476].
Require that the value returned by get_return_object is convertible to
the ramp return. This means that the only time we allow a void
get_return_object, is when the ramp is also a void function.
We diagnose this early to allow us to exit the ramp build if the return
values are incompatible.
PR c++/100476
gcc/cp/ChangeLog:
* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Remove special
handling of void get_return_object expressions.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/coro-bad-gro-01-void-gro-non-class-coro.C:
Adjust expected diagnostic.
* g++.dg/coroutines/pr102489.C: Avoid void get_return_object.
* g++.dg/coroutines/pr103868.C: Likewise.
* g++.dg/coroutines/pr94879-folly-1.C: Likewise.
* g++.dg/coroutines/pr94883-folly-2.C: Likewise.
* g++.dg/coroutines/pr96749-2.C: Likewise.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
The responsibility for destroying part of the frame content (promise,
arg copies and the frame itself) transitions from the ramp to the
body of the coroutine once we reach the await_resume () for the
initial suspend.
We added the variable that flags the transition, but failed to act on
it. This corrects that so that the ramp only tries to run DTORs for
objects when an exception occurs before the initial suspend await
resume has started.
PR c++/113773
gcc/cp/ChangeLog:
* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Only cleanup the
frame state on exceptions that occur before the initial await
resume has begun.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/torture/pr113773.C: New test.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
This splits out the building of the allocation and deallocation expressions
and runs them early in the ramp build, so that we can exit if they are not
usable, before we start building the ramp body.
Likewise move checks for other required resources to the begining of the
ramp builder.
This is preparation for work needed to update the allocation/destruction
in cases where we have excess alignment of the promise or other saved frame
state.
gcc/cp/ChangeLog:
* call.cc (build_op_delete_call_1): Renamed and added a param
to allow the caller to prioritize two argument usual deleters.
(build_op_delete_call): New.
(build_coroutine_op_delete_call): New.
* coroutines.cc (coro_get_frame_dtor): Rename...
(build_coroutine_frame_delete_expr):... to this; simplify to
use build_op_delete_call for all cases.
(build_actor_fn): Use revised frame delete function.
(build_coroutine_frame_alloc_expr): New.
(cp_coroutine_transform::complete_ramp_function): Rename...
(cp_coroutine_transform::build_ramp_function): ... to this.
Reorder code to carry out checks for prerequisites before the
codegen. Split out the allocation/delete code.
(cp_coroutine_transform::apply_transforms): Use revised name.
* coroutines.h: Rename function.
* cp-tree.h (build_coroutine_op_delete_call): New.
gcc/testsuite/ChangeLog:
* g++.dg/coroutines/coro-bad-alloc-01-bad-op-del.C: Use revised
diagnostics.
* g++.dg/coroutines/coro-bad-gro-00-class-gro-scalar-return.C:
Likewise.
* g++.dg/coroutines/coro-bad-gro-01-void-gro-non-class-coro.C:
Likewise.
* g++.dg/coroutines/coro-bad-grooaf-00-static.C: Likewise.
* g++.dg/coroutines/ramp-return-b.C: Likewise.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
This change is preparation for fixes to the ramp and codegen to follow.
The primary motivation is that we have thee activities; analysis, ramp
synthesis and outlined coroutine body synthesis. These are currently
carried out in sequence in the 'morph_fn_to_coro' code, which means that
we are nesting the synthesis of the outlined coroutine body inside the
finish_function call for the original function (which becomes the ramp).
The revised code splits the three interests so that the analysis can be
used independently by the ramp and body synthesis. This avoids some issues
seen with global state that start/finish function use and allows us to
use more of the high-level APIs in fixing bugs.
The resultant implementation is more self-contained, and has less impact
on finish_function.
gcc/cp/ChangeLog:
* coroutines.cc (struct suspend_point_info, struct param_info,
struct local_var_info, struct susp_frame_data,
struct local_vars_frame_data): Move to coroutines.h.
(build_actor_fn): Use start/finish function APIs.
(build_destroy_fn): Likewise.
(coro_build_actor_or_destroy_function): No longer mark the
actor / destroyer as DECL_COROUTINE_P.
(coro_rewrite_function_body): Use class members.
(cp_coroutine_transform::wrap_original_function_body): Likewise.
(build_ramp_function): Replace by...
(cp_coroutine_transform::complete_ramp_function): ...this.
(cp_coroutine_transform::cp_coroutine_transform): New.
(cp_coroutine_transform::~cp_coroutine_transform): New
(morph_fn_to_coro): Replace by...
(cp_coroutine_transform::apply_transforms): ...this.
(cp_coroutine_transform::finish_transforms): New.
* cp-tree.h (morph_fn_to_coro): Remove.
* decl.cc (emit_coro_helper): Remove.
(finish_function): Revise handling of coroutine transforms.
* coroutines.h: New file.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Co-authored-by: Arsen Arsenović <arsen@aarsen.me>
|
|
This is primarily preparation to partition the functionality of the
coroutine transform into analysis, ramp generation and then (later)
synthesis of the coroutine body. The patch does fix one latent
issue in the ordering of DTORs for frame parameter copies (to ensure
that they are processed in reverse order to the copy creation).
gcc/cp/ChangeLog:
* coroutines.cc (build_actor_fn): Arrange to apply any
required parameter copy DTORs in reverse order to their
creation.
(coro_rewrite_function_body): Handle revised param uses.
(morph_fn_to_coro): Split the ramp function completion
into a separate function.
(build_ramp_function): New.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
When we build an await expression, we might need to materialise the awaiter
if it is a prvalue. This re-implements this using core APIs instead of local
code.
gcc/cp/ChangeLog:
* coroutines.cc (build_co_await): Simplify checks for the cases that
we need to materialise an awaiter.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
|
|
The case in PR113746 used to ICE until commit r15-123-gf04dc89a991ddc.
This patch simply adds the case to the testsuite.
PR c++/113746
gcc/testsuite/ChangeLog:
* g++.dg/parse/crash76.C: New test.
|
|
set -fschedule-insns.
gcc/testsuite/
* gcc.dg/torture/pr115929-2.c: Add dg-require-effective-target scheduling.
* gcc.dg/torture/pr116343.c: Same.
|
|
|
|
repeating_sequence_p operates directly on the encoded pattern and does
not derive elements using the .elt() accessor. Passing in the length of
the unencoded vector can cause an out-of-bounds read of the encoded
pattern.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (rvv_builder::can_duplicate_repeating_sequence_p):
Use encoded_nelts when calling repeating_sequence_p.
(rvv_builder::is_repeating_sequence): Ditto.
(rvv_builder::repeating_sequence_use_merge_profitable_p): Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
|
|
PR116405]
Now that more operations are allowed for noce_convert_multiple_sets,
it is possible that the same register appears multiple times as target
in a basic block. After noce_convert_multiple_sets_1 is called we
potentially also emit register moves from temporaries back to the
original targets. In some cases where the target registers overlap
with the block's condition, these register moves may overwrite
intermediate variables because they're emitted after the if-converted
code. To address this issue we now iterate backwards and keep track
of seen registers when emitting these final register moves.
PR rtl-optimization/116372
PR rtl-optimization/116405
gcc/ChangeLog:
* ifcvt.cc (noce_convert_multiple_sets): Iterate backwards and track
target registers.
gcc/testsuite/ChangeLog:
* gcc.dg/pr116372.c: New test.
* gcc.dg/pr116405.c: New test.
|
|
Similar to not allowing jump instructions in the generated code, we
also shouldn't allow call instructions in noce_convert_multiple_sets.
In the case of PR116358 a libcall was generated from force_operand.
PR middle-end/116358
gcc/ChangeLog:
* ifcvt.cc (noce_convert_multiple_sets): Disallow call insns.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr116358.c: New test.
|
|
Our power8 swap optimization pass has some special handling for optimizing
swaps of TImode variables. The test case reported in bugzilla uses a call
to __atomic_compare_exchange, which introduces a variable of PTImode and
that does not get the same treatment as TImode leading to wrong code
generation. The simple fix is to treat PTImode identically to TImode.
2024-08-23 Peter Bergner <bergner@linux.ibm.com>
gcc/
PR target/116415
* config/rs6000/rs6000.h (TI_OR_PTI_MODE): New define.
* config/rs6000/rs6000-p8swap.cc (rs6000_analyze_swaps): Use it to
handle PTImode identically to TImode.
gcc/testsuite/
PR target/116415
* gcc.target/powerpc/pr116415.c: New test.
|
|
Complex lowering generally replaces existing complex defs with
COMPLEX_EXPRs but those might be dead when it can always refer to
components from the lattice. This in turn can pessimize followup
transforms like forwprop and reassoc, the following makes sure to
get rid of dead COMPLEX_EXPRs generated by using
simple_dce_from_worklist.
PR tree-optimization/116463
* tree-complex.cc: Include tree-ssa-dce.h.
(dce_worklist): New global.
(update_complex_assignment): Add SSA def to the DCE worklist.
(tree_lower_complex): Perform DCE.
|
|
This reverts commit 4cb07a38233aadb4b389a6e5236c95f52241b6e0.
|
|
This patch would like to support the form 4 of the unsigned integer
.SAT_TRUNC. Aka below example:
Form 4:
#define DEF_SAT_U_TRUC_FMT_4(NT, WT) \
NT __attribute__((noinline)) \
sat_u_truc_##WT##_to_##NT##_fmt_4 (WT x) \
{ \
bool not_overflow = x <= (WT)(NT)(-1); \
return ((NT)x) | (NT)((NT)not_overflow - 1); \
}
DEF_SAT_U_TRUC_FMT_4(uint32_t, uint64_t)
Before this patch:
4 │ __attribute__((noinline))
5 │ uint8_t sat_u_truc_uint32_t_to_uint8_t_fmt_4 (uint32_t x)
6 │ {
7 │ _Bool not_overflow;
8 │ unsigned char _1;
9 │ unsigned char _2;
10 │ unsigned char _3;
11 │ uint8_t _6;
12 │
13 │ ;; basic block 2, loop depth 0
14 │ ;; pred: ENTRY
15 │ not_overflow_5 = x_4(D) <= 255;
16 │ _1 = (unsigned char) x_4(D);
17 │ _2 = (unsigned char) not_overflow_5;
18 │ _3 = _2 + 255;
19 │ _6 = _1 | _3;
20 │ return _6;
21 │ ;; succ: EXIT
22 │
23 │ }
After this patch:
4 │ __attribute__((noinline))
5 │ uint8_t sat_u_truc_uint32_t_to_uint8_t_fmt_4 (uint32_t x)
6 │ {
7 │ uint8_t _6;
8 │
9 │ ;; basic block 2, loop depth 0
10 │ ;; pred: ENTRY
11 │ _6 = .SAT_TRUNC (x_4(D)); [tail call]
12 │ return _6;
13 │ ;; succ: EXIT
14 │
15 │ }
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
* The x86 bootstrap test.
* The x86 fully regression test.
gcc/ChangeLog:
* match.pd: Add form 4 for unsigned .SAT_TRUNC matching.
Signed-off-by: Pan Li <pan2.li@intel.com>
|
|
In get_best_extraction_insn we use smallest_int_mode_for_size with
struct_bits as size argument. PR115495 has struct_bits = 256 and we
don't have a mode for that. This patch makes smallest_mode_for_size
and smallest_int_mode_for_size return opt modes so we can just skip
over the loop when there is no mode.
PR middle-end/115495
gcc/ChangeLog:
* cfgexpand.cc (expand_debug_expr): Require mode.
* combine.cc (make_extraction): Ditto.
* config/aarch64/aarch64.cc (aarch64_expand_cpymem): Ditto.
(aarch64_expand_setmem): Ditto.
* config/arc/arc.cc (arc_expand_cpymem): Ditto.
* config/arm/arm.cc (arm_expand_divmod_libfunc): Ditto.
* config/i386/i386.cc (ix86_get_mask_mode): Ditto.
* config/rs6000/predicates.md: Ditto.
* config/rs6000/rs6000.cc (vspltis_constant): Ditto.
* config/s390/s390.cc (s390_expand_insv): Ditto.
* config/sparc/sparc.cc (assign_int_registers): Ditto.
* coverage.cc (get_gcov_type): Ditto.
(get_gcov_unsigned_t): Ditto.
* dse.cc (find_shift_sequence): Ditto.
* expmed.cc (store_integral_bit_field): Ditto.
* expr.cc (convert_mode_scalar): Ditto.
(op_by_pieces_d::smallest_fixed_size_mode_for_size): Ditto.
(emit_block_move_via_oriented_loop): Ditto.
(copy_blkmode_to_reg): Ditto.
(store_field): Ditto.
* internal-fn.cc (expand_arith_overflow): Ditto.
* machmode.h (HAVE_MACHINE_MODES): Ditto.
(smallest_mode_for_size): Use opt_machine_mode.
(smallest_int_mode_for_size): Use opt_scalar_int_mode.
* optabs-query.cc (get_best_extraction_insn): Require mode.
* optabs.cc (expand_twoval_binop_libfunc): Ditto.
* stor-layout.cc (smallest_mode_for_size): Return
opt_machine_mode.
(layout_type): Require mode.
(initialize_sizetypes): Ditto.
* tree-ssa-loop-manip.cc (canonicalize_loop_ivs): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr115495.c: New test.
gcc/ada/ChangeLog:
* gcc-interface/utils2.cc (fast_modulo_reduction): Require mode.
(nonbinary_modular_operation): Ditto.
|
|
Standard abs synthesis during expand is max (a, -a). This
expansion has the advantage of avoiding masking and is thus potentially
faster than the a < 0 ? -a : a synthesis.
gcc/ChangeLog:
* config/riscv/autovec.md (abs<mode>2): Expand via max (a, -a).
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/unop/abs-rv32gcv.c: Adjust test
expectation.
* gcc.target/riscv/rvv/autovec/unop/abs-rv64gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/abs-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/cond/cond_unary-8.c: Ditto.
|
|
Apparently due to slightly different optimization levels
not always both subroutines have multiple subranges,
but having at least one such, and no lexical blocks
is sufficient to prove that the fix worked. Q.E.D.
So reduce the test expectations to only at least one
inlined subroutine with multiple subranges.
gcc/testsuite/ChangeLog:
PR other/116462
* gcc.dg/debug/dwarf2/inline7.c: Reduce test expectations.
|
|
This comes from a loophole in gnat_get_array_descr_info for record types
containing a template, which represent an aliased array, when this array
type is bit-packed and implemented as a modular integer.
gcc/ada/
* gcc-interface/misc.cc (gnat_get_array_descr_info): Test the
BIT_PACKED_ARRAY_TYPE_P flag only once on the final debug type. In
the case of records containing a template, replay the entire
processing for the array type contained therein.
|
|
The compiler does not report the correct error in occurrences
of interpolated strings, when the sources are compiled without
language extensions allowed.
gcc/ada/
* scng.adb (Scan): Call Error_Msg_GNAT_Extension() to report an
error, when the sources are compiled without Core_Extensions_
Allowed, and the scanner detects the beginning of an interpolated
string.
|
|
PECOFF symbols don't have a size attached to them. The symbol size that
System.Object_Reader.Read_Symbol guesses to make up for the lack of
information can be wrong when the symbol table doesn't match the
algorithm's expectations; in particular that's the case when function
symbols aren't sorted by address.
To avoid incorrect tracebacks caused by wrong symbol size guesses, don't
use the symbol size for PECOFF files when producing a traceback and
instead pick the symbol with the highest address lower than the target
address.
gcc/ada/
* libgnat/s-dwalin.adb (Symbolic_Address): Ignore symbol size in
address-to-symbol translation for PECOFF files.
|
|
The compiler crashes when processing an object declaration
of a custom string type initialized with an interpolated
string.
gcc/ada/
* exp_attr.adb (Expand_N_Attribute_Reference: [Put_Image]): Add
support for custom string types.
* exp_ch2.adb (Expand_N_Interpolated_String_Literal): Add a type
conversion to the result object declaration of custom string
types.
* exp_put_image.adb (Build_String_Put_Image_Call): Handle custom
string types.
|
|
Implicit_Dereference is a type-specific aspect and therefore cannot be
legally specified as part of a subtype declaration.
gcc/ada/
* sem_ch13.adb (Analyze_Aspect_Implicit_Dereference): Generate
error if an aspect specification specifies the
Implicit_Dereference aspect of a non-first subtype.
|
|
If the Overflow_Mode in effect is Eliminated, then evaluating an arithmetic
op such as addition or subtraction should not fail an overflow check.
Fix a bug which resulted in such an overflow check failure.
gcc/ada/
* checks.adb (Is_Signed_Integer_Arithmetic_Op): Return True in the
case of relational operator whose operands are of a signed integer
type.
|
|
Records without a limited keyword now emit a warning if
they contain a member that has an inherently limited type.
gcc/ada/
* libgnat/a-coinho__shared.ads: add limited keyword.
* libgnat/g-awk.adb: add limited keyword.
* libgnat/g-comlin.ads: add limited keyword.
* libgnat/s-excmac__arm.ads: add limited keyword.
* libgnat/s-excmac__gcc.ads: add limited keyword.
* libgnat/s-soflin.ads: add limited keyword.
|
|
Record types that do not have a limited keyword but have a
member with a limited type are also considered to be limited types.
This can be confusing to understand for newer Ada users. It is
better to emit a warning in this scenario and suggest that the
type should be marked with a limited keyword. This diagnostic will
be acticated when the -gnatw_l switch is used.
gcc/ada/
* sem_ch3.adb: Add method Check_Inherited_Limted_Record for
emitting the warning for an inherited limited type.
* warnsw.adb: Add processing for the -gnatw_l switch that
triggeres the inheritly limited type warning.
* warnsw.ads: same as above.
* doc/gnat_ugn/building_executable_programs_with_gnat.rst: Add
entry for -gnatw_l switch.
* gnat_ugn.texi: Regenerate.
|
|
gcc/ada/
* sem_ch6.adb (Check_Private_Overriding): Improve code detecting
error on private function with controlling result. Fixes the
regression of ACATS bde0003.
|
|
Style cleanup; semantics is unaffected. Offending occurrences found with
grep "^ *:=" and fixed manually.
gcc/ada/
* checks.ads, cstand.adb, exp_aggr.adb, exp_ch4.adb, exp_ch5.adb,
exp_dbug.adb, exp_util.adb, gnatlink.adb, lib-util.adb,
libgnat/a-except.adb, libgnat/a-exexpr.adb, libgnat/a-ngcoar.adb,
libgnat/s-rannum.adb, libgnat/s-trasym__dwarf.adb, osint.adb,
rtsfind.adb, sem_case.adb, sem_ch12.adb, sem_ch13.adb,
sem_ch3.adb, sem_ch6.adb, sem_eval.adb, sem_prag.adb,
sem_util.adb: Fix style.
|
|
Move detection of always valid expressions from routine
Ensure_Valid (which inserts validity checks) to Expr_Known_Valid
(which decides their validity). In particular, this patch removes
duplicated detection of boolean operators, which were recognized
in both these routines.
Code cleanup; behavior is unaffected.
gcc/ada/
* checks.adb (Ensure_Valid): Remove detection of boolean and
short-circuit operators.
(Expr_Known_Valid): Detect short-circuit operators; detection of
boolean operators was already done in this routine.
|
|
Replace low-level iteration over formal and actual parameters with a
call to high-level Find_Actual routine. Code cleanup; behavior is
unaffected.
gcc/ada/
* checks.adb (Ensure_Valid): Use Find_Actual.
|
|
When iterating over actual and formal parameters, we should use
First_Actual/Next_Actual and not simply First/Next, because the
order of actual parameters might be different than the order of
formal parameters obtained with First_Formal/Next_Formal.
This patch fixes a glitch in validity checks for actual parameters
and applies the same fix to other misuses of First/Next as well.
gcc/ada/
* checks.adb (Ensure_Valid): Use First_Actual/Next_Actual.
* exp_ch6.adb (Is_Direct_Deep_Call): Likewise.
* exp_util.adb (Type_Of_Formal): Likewise.
* sem_util.adb (Is_Container_Element): Likewise; cleanup
membership test by using a subtype.
|
|
gcc/ada/
* sem_ch13.adb (Analyze_One_Aspect): Temporarily remove reporting
an error when the new aspect is set to True and the extensions are
not enabled.
|
|
The compiler does not report an error when 'access is applied to
a non-aliased class-wide interface type object.
gcc/ada/
* exp_util.ads (Is_Expanded_Class_Wide_Interface_Object_Decl): New
subprogram.
* exp_util.adb (Is_Expanded_Class_Wide_Interface_Object_Decl):
ditto.
* sem_util.adb (Is_Aliased_View): Handle expanded class-wide type
object declaration.
* checks.adb (Is_Aliased_Unconstrained_Component): Protect the
frontend against calling Is_Aliased_View with Empty. Found working
on this issue.
|
|
This patch adds support for a new GNAT aspect/pragma that modifies
the semantics of dispatching primitives. When a tagged type has
this aspect/pragma, only subprograms that have the first parameter
of this type will be considered dispatching primitives; this new
pragma/aspect is inherited by all descendant types.
gcc/ada/
* aspects.ads (Aspect_First_Controlling_Parameter): New aspect.
Defined as implementation defined aspect that has a static boolean
value and it is converted to pragma when the value is True.
* einfo.ads (Has_First_Controlling_Parameter): New attribute.
* exp_ch9.adb (Build_Corresponding_Record): Propagate the aspect
to the corresponding record type.
(Expand_N_Protected_Type_Declaration): Analyze the inherited
aspect to add the pragma.
(Expand_N_Task_Type_Declaration): ditto.
* freeze.adb (Warn_If_Implicitly_Inherited_Aspects): New
subprogram.
(Has_First_Ctrl_Param_Aspect): New subprogram.
(Freeze_Record_Type): Call Warn_If_Implicitly_Inherited_Aspects.
(Freeze_Subprogram): Check illegal subprograms of tagged types and
interface types that have this new aspect.
* gen_il-fields.ads (Has_First_Controlling_Parameter): New entity
field.
* gen_il-gen-gen_entities.adb (Has_First_Controlling_Parameter):
The new field is a semantic flag.
* gen_il-internals.adb (Image): Add
Has_First_Controlling_Parameter.
* par-prag.adb (Prag): No action for
Pragma_First_Controlling_Parameter since processing is handled
entirely in Sem_Prag.
* sem_ch12.adb (Validate_Private_Type_Instance): When the generic
formal has this new aspect, check that the actual type also has
this aspect.
* sem_ch13.adb (Analyze_One_Aspect): Check that the aspect is
applied to a tagged type or a concurrent type.
* sem_ch3.adb (Analyze_Full_Type_Declaration): Derived tagged
types inherit this new aspect, and also from their implemented
interface types.
(Process_Full_View): Propagate the aspect to the full view.
* sem_ch6.adb (Is_A_Primitive): New subprogram; used to factor
code and also clarify detection of primitives.
* sem_ch9.adb (Check_Interfaces): Propagate this new aspect to the
type implementing interface types.
* sem_disp.adb (Check_Controlling_Formals): Handle tagged type
that has the aspect and has subprograms overriding primitives of
tagged types that lack this aspect.
(Check_Dispatching_Operation): Warn on dispatching primitives
disallowed by this new aspect.
(Has_Predefined_Dispatching_Operation_Name): New subprogram.
(Find_Dispatching_Type): Handle dispatching functions of tagged
types that have the new aspect.
(Find_Primitive_Covering_Interface): For primitives of tagged
types that have the aspect and override a primitive of a parent
type that does not have the aspect, we must temporarily unset
attribute First_Controlling_ Parameter to properly check
conformance.
* sem_prag.ads (Aspect_Specifying_Pragma): Add new pragma.
* sem_prag.adb (Pragma_First_Controlling_Parameter): Handle new
pragma.
* snames.ads-tmpl (Name_First_Controlling_Parameter): New name.
* warnsw.ads (Warn_On_Non_Dispatching_Primitives): New warning.
* warnsw.adb (Warn_On_Non_Dispatching_Primitives): New warning;
not set by default when GNAT_Mode warnings are enabled, nor when
all warnings are enabled (-gnatwa).
|
|
gcc/fortran:
* invoke.texi (Code Gen Options): Add a missing word.
|
|
The generic GPL link redirects to GPL v3.0 right now, but may redirect
to a different version at one point. Specifically link to the version we
are using
gcc:
* doc/gm2.texi (License): Specifically link to GPL v3.0
|
|
This patch removes an unnecessary view_convert in trans_associate to
prevent hard to find runtime errors in the future. The view_convert was
erroneously introduced not understanding why ranks of the arrays to
assign are different. The ranks are fixed by PR86468 now and the
view_convert is obsolete.
gcc/fortran/ChangeLog:
PR fortran/86468
* trans-stmt.cc (trans_associate_var): Remove superfluous
view_convert.
|
|
The testcase cc.dg/vect/vect-mod-var.c has an division by 0
which is undefined. On some targets (aarch64), the scalar and
the vectorized version, the result of division by 0 is the same.
While on other targets (x86), we get a SIGFAULT. On other targets (powerpc),
the results are different.
The fix is to make sure the testcase does not test division by 0 (or really mod by 0).
Pushed as obvious after testing on x86_64-linux-gnu to make sure the testcase passes
now.
PR testsuite/116461
gcc/testsuite/ChangeLog:
* gcc.dg/vect/vect-mod-var.c: Change the initialization loop so that
`b[i]` is never 0. Use 1 in those places.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
|
|
[PR116464]
This is an obvious fix to the gcc.dg/torture/pr116420.c testcase which simplier
changes from plain `char` to `signed char` so it works on targets where plain char defaults
to unsigned.
Pushed as obvious after a quick test for aarch64-linux-gnu to make sure the testcase
passes now.
PR testsuite/116464
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr116420.c:
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
|
|
The DF framework provides us a way to run dataflow problems on sub-graphs.
Naturally a bitmap of interesting blocks is passed into those routines. At a
confluence point, the DF framework will not mark a block for re-processing if
it's not in that set of interesting blocks.
When ext-dce sets up that set of interesting blocks it's using the wrong
counter. ie, it's using n_basic_blocks rather than last_basic_block. If there
are holes in the block indices, some number of blocks won't get marked as
interesting.
In this case the block needing reprocessing has an index higher than
n_basic_blocks. It never gets reprocessed and the newly found live chunks
don't propagate further up the CFG -- ultimately resulting in a pseudo
appearing to have only the low 8 bits live, when in fact the low 32 bits are
actually live.
Fixed in the obvious way, by using last_basic_block instead.
Bootstrapped and regression tested on x86_64. Pushing to the trunk.
PR rtl-optimization/116420
gcc/
* ext-dce.cc (ext_dce_init): Fix loop iteration when setting up the
interesting block for DF to analyze.
gcc/testsuite
* gcc.dg/torture/pr116420.c: New test.
|
|
The patch streams out VOIDmode for aggregate types with offloading enabled,
and recomputes appropriate TYPE_MODE and DECL_MODE while streaming-in on accel
side. The rationale for this change is to avoid streaming out host-specific
modes that may be used for aggregate types, which may not be representable on
the accelerator. For eg, AArch64 uses OImode for ARRAY_TYPE whose size is 256-bits,
and nvptx doesn't have OImode, and thus ends up emitting an error from
lto_input_mode_table.
gcc/ChangeLog:
* lto-streamer-in.cc: (lto_read_tree_1): Set DECL_MODE (expr) to
TREE_TYPE (TYPE_MODE (expr)) if TREE_TYPE (expr) is aggregate type and
offloading is enabled.
* stor-layout.cc (layout_type): Move computation of mode for
ARRAY_TYPE from ...
(compute_array_mode): ... to here.
* stor-layout.h (compute_array_mode): Declare.
* tree-streamer-in.cc: Include stor-layout.h.
(unpack_ts_common_value_fields): Call compute_array_mode if offloading
is enabled.
* tree-streamer-out.cc (pack_ts_fixed_cst_value_fields): Stream out
VOIDmode if decl has aggregate type and offloading is enabled.
(pack_ts_type_common_value_fields): Stream out VOIDmode for aggregate
type if offloading is enabled.
Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
|
|
The stack-clash code is generating wrong cfi directives in
riscv_v_adjust_scalable_frame because REG_CFA_DEF_CFA has a different
encoding than REG_FRAME_RELATED_EXPR, this patch fixes the offset sign
in prologue and starts using REG_CFA_DEF_CFA in the epilogue.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Add
epilogue code for stack-clash and fix prologue cfi note.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/stack-check-cfa-3.c: Fix the expected output.
|
|
The problem here was a missing save_expr around arg0 since
it is used twice, once in REALPART_EXPR and once in IMAGPART_EXPR.
Thia adds the save_expr and reformats the code slightly so it is a
little easier to understand. It excludes the case when arg0 is
a COMPLEX_EXPR since in that case we'll end up with the distinct
real and imaginary parts. This is important to retain early
optimization in some testcases.
Bootstapped and tested on x86_64-linux-gnu with no regressions.
PR middle-end/116454
gcc/ChangeLog:
* fold-const.cc (fold_binary_loc): Fix `a * +-1i`
by wrapping arg0 with save_expr when it is not COMPLEX_EXPR.
gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr116454-1.c: New test.
* gcc.dg/torture/pr116454-2.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Co-Authored-By: Richard Biener <rguenther@suse.de>
|
|
aarch64-autovec-preference=N
The param aarch64-autovec-preference=N is a useful tool for testing
auto-vectorisation in GCC as it allows the user to force a particular
strategy. So far, N could be a numerical value between 0 and 4.
This patch replaces the numerical values by more user-friendly
names to distinguish the options.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
Ok for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
PR target/116365
* config/aarch64/aarch64-opts.h
(enum aarch64_autovec_preference_enum): New enum.
* config/aarch64/aarch64.cc (aarch64_cmp_autovec_modes):
Change numerical to enum values.
(aarch64_autovectorize_vector_modes): Change numerical to enum
values.
(aarch64_vector_costs::record_potential_advsimd_unrolling):
Change numerical to enum values.
* config/aarch64/aarch64.opt: Change param type to enum.
* doc/invoke.texi: Update documentation.
gcc/testsuite/
PR target/116365
* gcc.target/aarch64/autovec_param_asimd-only.c: New test.
* gcc.target/aarch64/autovec_param_default.c: Likewise.
* gcc.target/aarch64/autovec_param_prefer-asimd.c: Likewise.
* gcc.target/aarch64/autovec_param_prefer-sve.c: Likewise.
* gcc.target/aarch64/autovec_param_sve-only.c: Likewise.
* gcc.target/aarch64/neoverse_v1_2.c: Update parameter value.
* gcc.target/aarch64/neoverse_v1_3.c: Likewise.
* gcc.target/aarch64/sve/cond_asrd_1.c: Likewise.
* gcc.target/aarch64/sve/cond_cnot_4.c: Likewise.
* gcc.target/aarch64/sve/cond_unary_5.c: Likewise.
* gcc.target/aarch64/sve/cond_uxt_5.c: Likewise.
* gcc.target/aarch64/sve/cond_xorsign_2.c: Likewise.
* gcc.target/aarch64/sve/pr98268-1.c: Likewise.
* gcc.target/aarch64/sve/pr98268-2.c: Likewise.
|
|
This affects only the RISC-V targets, where the compiler options
-gvariable-location-views and consequently also -ginline-points
are disabled by default, which is unexpected and disables some
useful features of the generated debug info.
Due to a bug in the gas assembler the .loc statement
is not usable to generate location view debug info.
That is detected by configure:
configure:31500: checking assembler for dwarf2 debug_view support
configure:31509: .../riscv-unknown-elf/bin/as -o conftest.o conftest.s >&5
conftest.s: Assembler messages:
conftest.s:5: Error: .uleb128 only supports constant or subtract expressions
conftest.s:6: Error: .uleb128 only supports constant or subtract expressions
configure:31512: $? = 1
configure: failed program was
.file 1 "conftest.s"
.loc 1 3 0 view .LVU1
nop
.data
.uleb128 .LVU1
.uleb128 .LVU1
configure:31523: result: no
This results in dwarf2out_as_locview_support being set to false,
and that creates a sequence of events, with the end result that
most inlined functions either have no DW_AT_entry_pc, or one
with a wrong entry pc value.
But the location views can also be generated without using any
.loc statements, therefore we should enable the option
-gvariable-location-views by default, regardless of the status
of -gas-locview-support.
Note however, that the combination of the following compiler options
-g -O2 -gvariable-location-views -gno-as-loc-support
turned out to create invalid assembler intermediate files,
with lots of assembler errors like:
Error: leb128 operand is an undefined symbol: .LVU3
This affects all targets, except RISC-V of course ;-)
and is fixed by the changes in dwarf2out.cc
Finally the .debug_loclists created without assembler locview support
did contain location view pairs like v0000000ffffffff v000000000000000
which is the value from FORCE_RESET_NEXT_VIEW, but that is most likely
not as expected either, so change that as well.
gcc/ChangeLog:
* dwarf2out.cc (dwarf2out_maybe_output_loclist_view_pair,
output_loc_list): Correct handling of -gno-as-loc-support,
use ZERO_VIEW_P to output view number as zero value.
* toplev.cc (process_options): Do not automatically disable
-gvariable-location-views when -gno-as-loc-support or
-gno-as-locview-support is used, instead do automatically
disable -gas-locview-support if -gno-as-loc-support is used.
gcc/testsuite/ChangeLog:
* gcc.dg/debug/dwarf2/inline2.c: Add checks for inline entry_pc.
* gcc.dg/debug/dwarf2/inline6.c: Add -gno-as-loc-support and check
the resulting location views.
|
|
While this already works correctly for the case when an inlined
subroutine contains only one subrange, a redundant DW_TAG_lexical_block
is still emitted when the subroutine has multiple blocks.
Fixes: ac02e5b75451 ("re PR debug/37801 (DWARF output for inlined functions
doesn't always use DW_TAG_inlined_subroutine)")
gcc/ChangeLog:
PR debug/87440
* dwarf2out.cc (gen_inlined_subroutine_die): Handle the case
of multiple subranges correctly.
gcc/testsuite/ChangeLog:
* gcc.dg/debug/dwarf2/inline7.c: New test.
|
|
This patch adds a new vectorization pattern that detects the modulo
operation where the second operand is a variable.
It replaces the statement by division, multiplication, and subtraction.
The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
Ok for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
PR tree-optimization/101390
* tree-vect-patterns.cc (vect_recog_mod_var_pattern): Add new pattern.
gcc/testsuite/
PR tree-optimization/101390
* gcc.dg/vect/vect-mod-var.c: New test.
* gcc.target/aarch64/sve/mod_1.c: Likewise.
* lib/target-supports.exp: New selector expression.
|