Age | Commit message | Author | Files | Lines |
|
This fixes the output of -gnatRj for an extension of a tagged type which has
a variant part and also deals with the case where the parent type is private
with unknown discriminants.
gcc/ada/
* repinfo.ads (JSON output format): Document special case of
Present member of a Variant object.
* repinfo.adb (List_Structural_Record_Layout): Change the type of
Ext_Level parameter to Integer. Restrict the first recursion with
increasing levels to the fixed part and implement a second
recursion with decreasing levels for the variant part. Deal with
an extension of a type with unknown discriminants.
|
|
The varying case currently falls through to the 1/true case.
gcc/
* gimple-range-gori.cc (gori_compute::logical_combine): Add missing
return statement in the varying case.
gcc/testsuite/
* gnat.dg/opt102.adb: New test.
* gnat.dg/opt102_pkg.adb, gnat.dg/opt102_pkg.ads: New helper.
|
|
This patch fixes the runtime bug above. The full runtime message is:
findChildAndParent has caused internal runtime error, RTentity is either
corrupt or the module storage has not been initialized yet. The bug is
due to a non-nul-terminated string being used to determine the module
initialization order. This results in modules being left uninitialized and
the crash above. The bug manifests itself on 32-bit systems but is clearly
latent on all targets, and the fix should be applied to both gcc-14 and gcc-13.
gcc/m2/ChangeLog:
PR modula2/111510
* gm2-compiler/M2GenGCC.mod (IsExportedGcc): Minor spacing changes.
(BuildTrashTreeFromInterface): Minor spacing changes.
* gm2-compiler/M2Options.mod (GetRuntimeModuleOverride): Call
string to generate a nul-terminated C-style string.
* gm2-compiler/M2Quads.mod (BuildStringAdrParam): New procedure.
(BuildM2InitFunction): Replace inline parameter generation with
calls to BuildStringAdrParam.
(cherry picked from commit 53daf67fd55e005e37cb3ab33ac0783a71761de9)
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
This patch adds the ability to resize ranges as needed, defaulting to no
resizing. int_range_max now defaults to 3 sub-ranges (instead of 255)
and grows to 255 when the range being calculated does not fit.
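As a rough, self-contained sketch of the resize-on-demand idea (illustrative
only; the names below are not GCC's actual irange code):
  struct subrange { long lo, hi; };
  struct small_range
  {
    static const unsigned initial = 3, maximum = 255;
    subrange inline_buf[initial];      // the cheap, common case
    subrange *subranges = inline_buf;  // points at inline_buf until resized
    unsigned capacity = initial, used = 0;
    void maybe_resize (unsigned needed)
    {
      if (needed <= capacity)
        return;
      subrange *bigger = new subrange[maximum]; // grow once, to the maximum
      for (unsigned i = 0; i < used; ++i)
        bigger[i] = subranges[i];               // keep existing sub-ranges
      subranges = bigger;
      capacity = maximum;
    }
    ~small_range () { if (subranges != inline_buf) delete[] subranges; }
  };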
PR tree-optimization/110315
* value-range-storage.h (vrange_allocator::alloc_irange): Adjust
new params.
* value-range.cc (irange::operator=): Resize range.
(irange::irange_union): Same.
(irange::irange_intersect): Same.
(irange::invert): Same.
* value-range.h (irange::maybe_resize): New.
(~int_range): New.
(int_range_max): Default to 3 sub-ranges and resize as needed.
(int_range::int_range): Adjust for resizing.
(int_range::operator=): Same.
|
|
This recent regression occurs when the nominal subtype of the constant is a
discriminated record type with default discriminants.
gcc/ada/
PR ada/110488
* sem_ch3.adb (Analyze_Object_Declaration): Do not build a default
subtype for a deferred constant in the definite case too.
|
|
We should guard both the diagnostic and backward compatibility fallback
code with tf_error, so that in an SFINAE context we don't issue any
diagnostics and correctly treat ill-formed C++23 multidimensional
subscript operator expressions as such.
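As a rough illustration of the affected situation (assumed shape, compiled
with -std=c++23; this is not the new subscript15.C test):
  struct S { int operator[] (int) const { return 0; } };  // only a 1-argument []
  template<typename T>
  auto f (T t) -> decltype (t[1, 2]) { return t[1, 2]; }   // C++23 multi-argument []
  int f (...) { return -1; }
  int r = f (S ());   // substitution should fail quietly and pick f(...)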
PR c++/111493
gcc/cp/ChangeLog:
* decl2.cc (grok_array_decl): Guard diagnostic and backward
compatibility fallback code paths with tf_error.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/subscript15.C: New test.
(cherry picked from commit 1fea14def849dd38b098b0e2d54e64801f9c1f43)
|
|
In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters. The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2. This patch
fixes this by including the outer template arguments in the substitution,
which ought to match the depth of the ttp.
The second testcase demonstrates it's better to substitute the concrete
outer template arguments instead of generic ones since a ttp's constraints
could depend on outer parameters.
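A hedged sketch of the kind of code involved (the names Pick and OnlyInt are
illustrative; this is not the actual concepts-ttp6.C test), where the ttp's
constraint mentions the outer parameter T:
  #include <concepts>
  // TT's constraint refers to the enclosing parameter T, so matching
  // OnlyInt against TT only works once the concrete outer argument (int)
  // has been substituted into that constraint.
  template<typename T, template<std::same_as<T> U> typename TT> struct Pick {};
  template<std::same_as<int> U> struct OnlyInt {};
  Pick<int, OnlyInt> p;   // should be accepted
  int main () {}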
PR c++/111485
gcc/cp/ChangeLog:
* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
(cherry picked from commit 6f902a42b0afe3f3145bcb864695fc290b5acc3e)
|
|
This corrects resolving decltype of a (class) NTTP object as per
[dcl.type.decltype]/1.2 and [temp.param]/6 in the type-dependent case.
Note that in the non-dependent case we resolve the decltype ahead of
time, in which case finish_decltype_type drops the const VIEW_CONVERT_EXPR
wrapper around the TEMPLATE_PARM_INDEX, and the latter has the desired
non-const type.
In the type-dependent case, at instantiation time tsubst drops the
VIEW_CONVERT_EXPR since the substituted NTTP is the already-const object
created by get_template_parm_object. So in this case finish_decltype_type
sees the const object, which this patch now adds special handling for.
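A hedged sketch of the behaviour being fixed (assumed shape, not the actual
nontype-class60.C test):
  #include <type_traits>
  struct A { int n; };
  // 'a' has dependent type here, so decltype (a) is only resolved at
  // instantiation time (the case handled by this patch).
  template<auto a>
  void f ()
  {
    static_assert (std::is_same_v<decltype (a), A>, "expected A, not const A");
  }
  int main () { f<A{42}> (); }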
PR c++/99631
gcc/cp/ChangeLog:
* semantics.cc (finish_decltype_type): For an NTTP object,
return its type modulo cv-quals.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/nontype-class60.C: New test.
(cherry picked from commit ddd064e3571c4a9e6258c75eba65585a07367712)
|
|
2023-09-24 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/92586
* trans-expr.cc (gfc_trans_arrayfunc_assign): Supply a missing
dereference for the call to gfc_deallocate_alloc_comp_no_caf.
gcc/testsuite/
PR fortran/92586
* gfortran.dg/pr92586.f90: New test.
|
|
2023-09-24 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/68155
* decl.cc (fix_initializer_charlen): New function broken out of
add_init_expr_to_sym.
(add_init_expr_to_sym, build_struct): Call the new function.
gcc/testsuite/
PR fortran/68155
* gfortran.dg/pr68155.f90: New test.
|
|
1. Move class NTTP object pretty printing to a more general spot in
the pretty printer, so that we always print its value instead of
its (mangled) name even when it appears outside of a template
argument list.
2. Print the type of a class NTTP object alongside its CONSTRUCTOR
value, like dump_expr would have done.
3. Don't print const VIEW_CONVERT_EXPR wrappers for class NTTPs.
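As a rough illustration of where such an object can appear in a diagnostic
(assumed shape, not the actual diagnostic19.C test); the constraint failure
below should show the NTTP object's value, not a mangled symbol name:
  struct A { int n; };
  template<A a> concept positive = (a.n > 0);
  template<A a> requires positive<a>
  void f () {}
  int main () { f<A{-1}> (); }   // deliberately fails to satisfy 'positive'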
PR c++/111471
gcc/cp/ChangeLog:
* cxx-pretty-print.cc (cxx_pretty_printer::expression)
<case VAR_DECL>: Handle class NTTP objects by printing
their type and value.
<case VIEW_CONVERT_EXPR>: Strip const VIEW_CONVERT_EXPR
wrappers for class NTTPs.
(pp_cxx_template_argument_list): Don't handle class NTTP
objects here.
gcc/testsuite/ChangeLog:
* g++.dg/concepts/diagnostic19.C: New test.
(cherry picked from commit 75c4b0cde4835b45350da0a5cd82f1d1a0a7a2f1)
|
|
Call GOMP_alloc/free for 'omp allocate' allocated variables. This is
for C only as C++ and Fortran show a sorry already in the FE. Note that
this only applies to stack variables as the C FE shows a sorry for
static variables.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Call GOMP_alloc/free for
'omp allocate' variables; move stack cleanup after other
cleanup.
(omp_notice_variable): Process original decl when decl
of the value-expression for a 'omp allocate' variable is passed.
* omp-low.cc (scan_omp_1_op): Handle 'omp allocate' variables.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1 Impl.): Mark 'omp allocate' as
implemented for C only.
* testsuite/libgomp.c/allocate-4.c: New test.
* testsuite/libgomp.c/allocate-5.c: New test.
* testsuite/libgomp.c/allocate-6.c: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-11.c: Remove C-only dg-message
for 'sorry, unimplemented'.
* c-c++-common/gomp/allocate-12.c: Likewise.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Likewise.
* c-c++-common/gomp/allocate-10.c: New test.
* c-c++-common/gomp/allocate-17.c: New test.
(cherry picked from commit 1a554a2c9f33fdb3c170f1c37274037ece050114)
|
|
aarch64_operands_ok_for_ldpstp contained the code:
/* One of the memory accesses must be a mempair operand.
If it is not the first one, they need to be swapped by the
peephole. */
if (!aarch64_mem_pair_operand (mem_1, GET_MODE (mem_1))
&& !aarch64_mem_pair_operand (mem_2, GET_MODE (mem_2)))
return false;
But the requirement isn't just that one of the accesses must be a
valid mempair operand. It's that the lower access must be, since
that's the access that will be used for the instruction operand.
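A rough sketch of the tightened check (names assumed; this is not the
committed hunk):
  /* Sketch only: check the access with the lower address (assumed here to
     already be mem_1), since that access is the one that becomes the
     instruction operand.  */
  if (!aarch64_mem_pair_operand (mem_1, GET_MODE (mem_1)))
    return false;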
gcc/
PR target/111411
* config/aarch64/aarch64.cc (aarch64_operands_ok_for_ldpstp): Require
the lower memory access to be a mem-pair operand.
gcc/testsuite/
PR target/111411
* gcc.dg/rtl/aarch64/pr111411.c: New test.
(cherry picked from commit 2d38f45bcca62ca0c7afef4b579f82c5c2a01610)
|
|
While working on another patch, I hit a problem with the aarch64
expansion of untyped_call. The expander emits the usual:
(set (mem ...) (reg resN))
instructions to store the result registers to memory, but it didn't
say in RTL where those resN results came from. This eventually led
to a failure of gcc.dg/torture/stackalign/builtin-return-2.c,
via regrename.
This patch turns the untyped call from a plain call to a call_value,
to represent that the call returns (or might return) a useful value.
The patch also uses a PARALLEL return rtx to represent all the possible
return registers.
gcc/
* config/aarch64/aarch64.md (untyped_call): Emit a call_value
rather than a call. List each possible destination register
in the call pattern.
(cherry picked from commit 629efe27744d13c3b83bbe8338b84c37c83dbe4f)
|
|
This ICE was caused by an invalid assumption that all BIND_EXPRs have
a non-null BIND_EXPR_BLOCK. In C++, BIND_EXPRs without a block do exist and are used for
temporaries introduced in expressions that are not full-expressions.
Since they have no block to fix up, the traversal can just ignore
these tree nodes.
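A hedged sketch of the shape of the guard (assumed, not the committed hunk):
  static tree
  fixup_blocks_walker (tree *tp, int *walk_subtrees, void *data)
  {
    if (TREE_CODE (*tp) == BIND_EXPR && !BIND_EXPR_BLOCK (*tp))
      return NULL_TREE;   /* No block to fix up; just keep walking.  */
    /* ... existing scope-block fix-up logic ...  */
    return NULL_TREE;
  }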
gcc/cp/ChangeLog
PR c++/111274
* parser.cc (fixup_blocks_walker): Check for null BIND_EXPR_BLOCK.
gcc/testsuite/ChangeLog
PR c++/111274
* g++.dg/gomp/pr111274.C: New test case.
(cherry picked from commit ab4bdad49716eb1c60e22e0e617d5eb56b0bac6f)
|
|
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_allocate): Handle
error_mark_node.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-13.c: New test.
(cherry picked from commit 55243898f8f9560371f258fe0c6ca202ab7b2085)
|
|
Both specifying no category and specifying 'all' imply
that the implicit-behavior applies to all categories.
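A rough illustration of the newly accepted syntax (illustrative only, not one
of the added tests):
  void
  f (void)
  {
    int x = 0, arr[4] = { 0 };
  #pragma omp target defaultmap(firstprivate: all)
    {
      x += arr[0];   /* both the scalar and the aggregate become firstprivate */
    }
  }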
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_defaultmap): Parse
'all' as category.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_defaultmap): Parse
'all' as category.
gcc/fortran/ChangeLog:
* gfortran.h (enum gfc_omp_defaultmap_category):
Add OMP_DEFAULTMAP_CAT_ALL.
* openmp.cc (gfc_match_omp_clauses): Parse
'all' as category.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle it.
gcc/ChangeLog:
* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_CATEGORY_ALL.
* gimplify.cc (gimplify_scan_omp_clauses): Handle it.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2 status): Add depobj with
destroy-var argument as 'N'. Mark defaultmap with
'all' category as 'Y'.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
* c-c++-common/gomp/defaultmap-5.c: New test.
* c-c++-common/gomp/defaultmap-6.c: New test.
* gfortran.dg/gomp/defaultmap-10.f90: New test.
* gfortran.dg/gomp/defaultmap-9.f90: New test.
(cherry picked from commit 0698c9fddfc5a41dd7f233928b7a486cb044fea3)
|
|
Merge up to r13-7822-g10c7edcc65d4bf1d05a9f0791e77e7b953e3e796 (18th Sep 2023)
|
|
The vsetvl pass has been refactored in GCC 14, where the optimization is
more robust than in releases/gcc-13, so this problem does not exist in
GCC 14.
Phase 6 of the GCC 13 pass is an optimization phase. Because it does not
consider all cases, it contains hidden bugs, so we decided to remove phase 6.
Although the generated code will be more redundant, the program is correct.
PR target/111412
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (vector_infos_manager::release): Remove.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::propagate_avl): Ditto.
(pass_vsetvl::lazy_vsetvl): Ditto.
* config/riscv/riscv-vsetvl.h: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/avl_single-79.c: Adjust case.
* gcc.target/riscv/rvv/vsetvl/avl_single-80.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-86.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-87.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-88.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-90.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-1.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-6.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-8.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-2.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-4.c: Ditto.
* gcc.target/riscv/rvv/base/pr111412.c: New test.
|
|
This patch generates a singular or plural message relating to the
number of missing enums. It also enables -Wcase-enum when building the
Modula-2 libraries and m2/stage2/cc1gm2.
gcc/m2/ChangeLog:
* Make-lang.in (GM2_FLAGS): Add -Wcase-enum.
(GM2_ISO_FLAGS): Add -Wcase-enum.
* gm2-compiler/M2CaseList.mod (EnumerateErrors): Issue
singular or plural start text prior to the enum list.
Remove unused parameter tokenno.
(EmitMissingRangeErrors): New procedure.
(MissingCaseBounds): Call EmitMissingRangeErrors.
(MissingCaseStatementBounds): Call EmitMissingRangeErrors.
* gm2-libs-iso/TextIO.mod: Fix spacing.
libgm2/ChangeLog:
* libm2cor/Makefile.am (libm2cor_la_M2FLAGS): Add
-Wcase-enum.
* libm2cor/Makefile.in: Regenerate.
* libm2iso/Makefile.am (libm2iso_la_M2FLAGS): Add
-Wcase-enum.
* libm2iso/Makefile.in: Regenerate.
* libm2log/Makefile.am (libm2log_la_M2FLAGS): Add
-Wcase-enum.
* libm2log/Makefile.in: Regenerate.
* libm2pim/Makefile.am (libm2pim_la_M2FLAGS): Add
-Wcase-enum.
* libm2pim/Makefile.in: Regenerate.
(cherry picked from commit 3af2af15798cb6243e2643f98f62c9270b1ca5d2)
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
AArch64 normally puts the saved registers near the bottom of the frame,
immediately above any dynamic allocations. But this means that a
stack-smash attack on those dynamic allocations could overwrite the
saved registers without needing to reach as far as the stack smash
canary.
The same thing could also happen for variable-sized arguments that are
passed by value, since those are allocated before a call and popped on
return.
This patch avoids that by putting the locals (and thus the canary) below
the saved registers when stack smash protection is active.
The patch fixes CVE-2023-4039.
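A hedged sketch of the new predicate (assumed shape, not necessarily the
committed code):
  /* Return true if the callee-saved registers should be placed above,
     rather than below, the local variables (and thus the canary).  */
  static bool
  aarch64_save_regs_above_locals_p ()
  {
    /* Assumed condition: stack-smashing protection is active for this
       function when crtl->stack_protect_guard is set.  */
    return crtl->stack_protect_guard != nullptr;
  }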
gcc/
* config/aarch64/aarch64.cc (aarch64_save_regs_above_locals_p):
New function.
(aarch64_layout_frame): Use it to decide whether locals should
go above or below the saved registers.
(aarch64_expand_prologue): Update stack layout comment.
Emit a stack tie after the final adjustment.
gcc/testsuite/
* gcc.target/aarch64/stack-protector-8.c: New test.
* gcc.target/aarch64/stack-protector-9.c: Likewise.
|
|
After previous patches, it's no longer necessary to store
saved_regs_size and below_hard_fp_saved_regs_size in the frame info.
All measurements instead use the top or bottom of the frame as
reference points.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::saved_regs_size)
(aarch64_frame::below_hard_fp_saved_regs_size): Delete.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Update accordingly.
|
|
The stack frame is currently divided into three areas:
A: the area above the hard frame pointer
B: the SVE saves below the hard frame pointer
C: the outgoing arguments
If the stack frame is allocated in one chunk, the allocation needs a
probe if the frame size is >= guard_size - 1KiB. In addition, if the
function is not a leaf function, it must probe an address no more than
1KiB above the outgoing SP. We ensured the second condition by
(1) using single-chunk allocations for non-leaf functions only if
the link register save slot is within 512 bytes of the bottom
of the frame; and
(2) using the link register save as a probe (meaning, for instance,
that it can't be individually shrink-wrapped)
If instead the stack is allocated in multiple chunks, then:
* an allocation involving only the outgoing arguments (C above) requires
a probe if the allocation size is > 1KiB
* any other allocation requires a probe if the allocation size
is >= guard_size - 1KiB
* second and subsequent allocations require the previous allocation
to probe at the bottom of the allocated area, regardless of the size
of that previous allocation
The final point means that, unlike for single allocations,
it can be necessary to have both a non-SVE register probe and
an SVE register probe. For example:
* allocate A, probe using a non-SVE register save
* allocate B, probe using an SVE register save
* allocate C
The non-SVE register used in this case was again the link register.
It was previously used even if the link register save slot was some
bytes above the bottom of the non-SVE register saves, but an earlier
patch avoided that by putting the link register save slot first.
As a belt-and-braces fix, this patch explicitly records which
probe registers we're using and allows the non-SVE probe to be
whichever register comes first (as for SVE).
The patch also avoids unnecessary probes in sve/pcs/stack_clash_3.c.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::sve_save_and_probe)
(aarch64_frame::hard_fp_save_and_probe): New fields.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize them.
Rather than asserting that a leaf function saves LR, instead assert
that a leaf function saves something.
(aarch64_get_separate_components): Prevent the chosen probe
registers from being individually shrink-wrapped.
(aarch64_allocate_and_probe_stack_space): Remove workaround for
probe registers that aren't at the bottom of the previous allocation.
gcc/testsuite/
* gcc.target/aarch64/sve/pcs/stack_clash_3.c: Avoid redundant probes.
|
|
Previous patches ensured that the final frame allocation only needs
a probe when the size is strictly greater than 1KiB. It's therefore
safe to use the normal 1024 probe offset in all cases.
The main motivation for doing this is to simplify the code and
remove the number of special cases.
gcc/
* config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space):
Always probe the residual allocation at offset 1024, asserting
that that is in range.
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-17.c: Expect the probe
to be at offset 1024 rather than offset 0.
* gcc.target/aarch64/stack-check-prologue-18.c: Likewise.
* gcc.target/aarch64/stack-check-prologue-19.c: Likewise.
|
|
-fstack-clash-protection uses the save of LR as a probe for the next
allocation. The next allocation could be:
* another part of the static frame, e.g. when allocating SVE save slots
or outgoing arguments
* an alloca in the same function
* an allocation made by a callee function
However, when -fomit-frame-pointer is used, the LR save slot is placed
above the other GPR save slots. It could therefore be up to 80 bytes
above the base of the GPR save area (which is also the hard fp address).
aarch64_allocate_and_probe_stack_space took this into account when
deciding how much subsequent space could be allocated without needing
a probe. However, it interacted badly with:
/* If doing a small final adjustment, we always probe at offset 0.
This is done to avoid issues when LR is not at position 0 or when
the final adjustment is smaller than the probing offset. */
else if (final_adjustment_p && rounded_size == 0)
residual_probe_offset = 0;
which forces any allocation that is smaller than the guard page size
to be probed at offset 0 rather than the usual offset 1024. It was
therefore possible to construct cases in which we had:
* a probe using LR at SP + 80 bytes (or some other value >= 16)
* an allocation of the guard page size - 16 bytes
* a probe at SP + 0
which allocates guard page size + 64 consecutive unprobed bytes.
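For example, with a 64KiB guard page: the LR probe touches the old SP + 80,
the allocation is 65536 - 16 = 65520 bytes, and the final probe touches the
new SP + 0 = old SP - 65520, so 80 + 65520 = 65600 bytes (64KiB + 64) between
consecutive probes are never probed.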
This patch requires the LR probe to be in the first 16 bytes of the
save area when stack clash protection is active. Doing it
unconditionally would cause code-quality regressions.
Putting LR before other registers prevents push/pop allocation
when shadow call stacks are enabled, since LR is restored
separately from the other callee-saved registers.
The new comment doesn't say that the probe register is required
to be LR, since a later patch removes that restriction.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Ensure that
the LR save slot is in the first 16 bytes of the register save area.
Only form STP/LDP push/pop candidates if both registers are valid.
(aarch64_allocate_and_probe_stack_space): Remove workaround for
when LR was not in the first 16 bytes.
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-18.c: New test.
* gcc.target/aarch64/stack-check-prologue-19.c: Likewise.
* gcc.target/aarch64/stack-check-prologue-20.c: Likewise.
|
|
The AArch64 ABI says that, when stack clash protection is used,
there can be a maximum of 1KiB of unprobed space at sp on entry
to a function. Therefore, we need to probe when allocating
>= guard_size - 1KiB of data (>= rather than >). This is what
GCC does.
If an allocation is exactly guard_size bytes, it is enough to allocate
those bytes and probe once at offset 1024. It isn't possible to use a
single probe at any other offset: higher would conmplicate later code,
by leaving more unprobed space than usual, while lower would risk
leaving an entire page unprobed. For simplicity, the code probes all
allocations at offset 1024.
Some register saves also act as probes. If we need to allocate
more space below the last such register save probe, we need to
probe the allocation if it is > 1KiB. Again, this allocation is
then sometimes (but not always) probed at offset 1024. This sort of
allocation is currently only used for outgoing arguments, which are
rarely this big.
However, the code also probed if this final outgoing-arguments
allocation was == 1KiB, rather than just > 1KiB. This isn't
necessary, since the register save then probes at offset 1024
as required. Continuing to probe allocations of exactly 1KiB
would complicate later patches.
gcc/
* config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space):
Don't probe final allocations that are exactly 1KiB in size (after
unprobed space above the final allocation has been deducted).
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-17.c: New test.
|
|
This patch just changes a calculation of initial_adjust
to one that makes it slightly more obvious that the total
adjustment is frame.frame_size.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Tweak
calculation of initial_adjust for frames in which all saves
are SVE saves.
|
|
After previous patches, it no longer really makes sense to allocate
the top of the frame in terms of varargs_and_saved_regs_size and
saved_regs_and_above.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Simplify
the allocation of the top of the frame.
|
|
reg_offset was measured from the bottom of the saved register area.
This made perfect sense with the original layout, since the bottom
of the saved register area was also the hard frame pointer address.
It became slightly less obvious with SVE, since we save SVE
registers below the hard frame pointer, but it still made sense.
However, if we want to allow different frame layouts, it's more
convenient and obvious to measure reg_offset from the bottom of
the frame. After previous patches, it's also a slight simplification
in its own right.
gcc/
* config/aarch64/aarch64.h (aarch64_frame): Add comment above
reg_offset.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Walk offsets
from the bottom of the frame, rather than the bottom of the saved
register area. Measure reg_offset from the bottom of the frame
rather than the bottom of the saved register area.
(aarch64_save_callee_saves): Update accordingly.
(aarch64_restore_callee_saves): Likewise.
(aarch64_get_separate_components): Likewise.
(aarch64_process_components): Likewise.
|
|
This patch fixes another case in which a value was described with
an “upside-down” view.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::frame_size): Tweak comment.
|
|
Similarly to the previous locals_offset patch, hard_fp_offset
was described as:
/* Offset from the base of the frame (incomming SP) to the
hard_frame_pointer. This value is always a multiple of
STACK_BOUNDARY. */
poly_int64 hard_fp_offset;
which again took an “upside-down” view: higher offsets meant lower
addresses. This patch renames the field to bytes_above_hard_fp instead.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::hard_fp_offset): Rename
to...
(aarch64_frame::bytes_above_hard_fp): ...this.
* config/aarch64/aarch64.cc (aarch64_layout_frame)
(aarch64_expand_prologue): Update accordingly.
(aarch64_initial_elimination_offset): Likewise.
|
|
locals_offset was described as:
/* Offset from the base of the frame (incomming SP) to the
top of the locals area. This value is always a multiple of
STACK_BOUNDARY. */
This is implicitly an “upside down” view of the frame: the incoming
SP is at offset 0, and anything N bytes below the incoming SP is at
offset N (rather than -N).
However, reg_offset instead uses a “right way up” view; that is,
it views offsets in address terms. Something above X is at a
positive offset from X and something below X is at a negative
offset from X.
Also, even on FRAME_GROWS_DOWNWARD targets like AArch64,
target-independent code views offsets in address terms too:
locals are allocated at negative offsets to virtual_stack_vars.
It seems confusing to have *_offset fields of the same structure
using different polarities like this. This patch tries to avoid
that by renaming locals_offset to bytes_above_locals.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::locals_offset): Rename to...
(aarch64_frame::bytes_above_locals): ...this.
* config/aarch64/aarch64.cc (aarch64_layout_frame)
(aarch64_initial_elimination_offset): Update accordingly.
|
|
After previous patches, it is no longer necessary to calculate
a chain_offset in cases where there is no chain record.
gcc/
* config/aarch64/aarch64.cc (aarch64_expand_prologue): Move the
calculation of chain_offset into the emit_frame_chain block.
|
|
aarch64_save_callee_saves and aarch64_restore_callee_saves took
a parameter called start_offset that gives the offset of the
bottom of the saved register area from the current stack pointer.
However, it's more convenient for later patches if we use the
bottom of the entire frame as the reference point, rather than
the bottom of the saved registers.
Doing that removes the need for the callee_offset field.
Other than that, this is not a win on its own. It only really
makes sense in combination with the follow-on patches.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::callee_offset): Delete.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Remove
callee_offset handling.
(aarch64_save_callee_saves): Replace the start_offset parameter
with a bytes_below_sp parameter.
(aarch64_restore_callee_saves): Likewise.
(aarch64_expand_prologue): Update accordingly.
(aarch64_expand_epilogue): Likewise.
|
|
Following on from the previous bytes_below_saved_regs patch, this one
records the number of bytes that are below the hard frame pointer.
This eventually replaces below_hard_fp_saved_regs_size.
If a frame pointer is not needed, the epilogue adds final_adjust
to the stack pointer before restoring registers:
aarch64_add_sp (tmp1_rtx, tmp0_rtx, final_adjust, true);
Therefore, if the epilogue needs to restore the stack pointer from
the hard frame pointer, the directly corresponding offset is:
-bytes_below_hard_fp + final_adjust
i.e. go from the hard frame pointer to the bottom of the frame,
then add the same amount as if we were using the stack pointer
from the outset.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::bytes_below_hard_fp): New
field.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize it.
(aarch64_expand_epilogue): Use it instead of
below_hard_fp_saved_regs_size.
|
|
The frame layout code currently hard-codes the assumption that
the number of bytes below the saved registers is equal to the
size of the outgoing arguments. This patch abstracts that
value into a new field of aarch64_frame.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::bytes_below_saved_regs): New
field.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize it,
and use it instead of crtl->outgoing_args_size.
(aarch64_get_separate_components): Use bytes_below_saved_regs instead
of outgoing_args_size.
(aarch64_process_components): Likewise.
|