Age | Commit message | Author | Files | Lines |
|
This fixes the output of -gnatRj for an extension of a tagged type which has
a variant part and also deals with the case where the parent type is private
with unknown discriminants.
gcc/ada/
* repinfo.ads (JSON output format): Document special case of
Present member of a Variant object.
* repinfo.adb (List_Structural_Record_Layout): Change the type of
Ext_Level parameter to Integer. Restrict the first recursion with
increasing levels to the fixed part and implement a second
recursion with decreasing levels for the variant part. Deal with
an extension of a type with unknown discriminants.
|
|
The varying case currently falls through to the 1/true case.
gcc/
* gimple-range-gori.cc (gori_compute::logical_combine): Add missing
return statement in the varying case.
gcc/testsuite/
* gnat.dg/opt102.adb: New test.
* gnat.dg/opt102_pkg.adb, gnat.dg/opt102_pkg.ads: New helper.
|
|
This patch fixes the runtime bug above. The full runtime message is:
findChildAndParent has caused internal runtime error, RTentity is either
corrupt or the module storage has not been initialized yet. The bug is
due to a non-nul-terminated string being used to determine the module
initialization order. This results in modules being left uninitialized and
the crash above. The bug manifests itself on 32-bit systems but is clearly
latent on all targets, and the fix should be applied to both gcc-14 and gcc-13.
gcc/m2/ChangeLog:
PR modula2/111510
* gm2-compiler/M2GenGCC.mod (IsExportedGcc): Minor spacing changes.
(BuildTrashTreeFromInterface): Minor spacing changes.
* gm2-compiler/M2Options.mod (GetRuntimeModuleOverride): Call
string to generate a nul-terminated C-style string.
* gm2-compiler/M2Quads.mod (BuildStringAdrParam): New procedure.
(BuildM2InitFunction): Replace inline parameter generation with
calls to BuildStringAdrParam.
(cherry picked from commit 53daf67fd55e005e37cb3ab33ac0783a71761de9)
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
This patch adds the ability to resize ranges as needed, defaulting to no
resizing. int_range_max now defaults to 3 sub-ranges (instead of 255)
and grows to 255 when the range being calculated does not fit.
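As a rough, self-contained sketch of the resize-on-demand idea (illustrative
only; the names below are not GCC's actual irange code):
  struct subrange { long lo, hi; };
  struct small_range
  {
    static const unsigned initial = 3, maximum = 255;
    subrange inline_buf[initial];      // the cheap, common case
    subrange *subranges = inline_buf;  // points at inline_buf until resized
    unsigned capacity = initial, used = 0;
    void maybe_resize (unsigned needed)
    {
      if (needed <= capacity)
        return;
      subrange *bigger = new subrange[maximum]; // grow once, to the maximum
      for (unsigned i = 0; i < used; ++i)
        bigger[i] = subranges[i];               // keep existing sub-ranges
      subranges = bigger;
      capacity = maximum;
    }
    ~small_range () { if (subranges != inline_buf) delete[] subranges; }
  };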
PR tree-optimization/110315
* value-range-storage.h (vrange_allocator::alloc_irange): Adjust
new params.
* value-range.cc (irange::operator=): Resize range.
(irange::irange_union): Same.
(irange::irange_intersect): Same.
(irange::invert): Same.
* value-range.h (irange::maybe_resize): New.
(~int_range): New.
(int_range_max): Default to 3 sub-ranges and resize as needed.
(int_range::int_range): Adjust for resizing.
(int_range::operator=): Same.
|
|
This recent regression occurs when the nominal subtype of the constant is a
discriminated record type with default discriminants.
gcc/ada/
PR ada/110488
* sem_ch3.adb (Analyze_Object_Declaration): Do not build a default
subtype for a deferred constant in the definite case too.
|
|
We should guard both the diagnostic and backward compatibility fallback
code with tf_error, so that in an SFINAE context we don't issue any
diagnostics and correctly treat ill-formed C++23 multidimensional
subscript operator expressions as such.
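As a rough illustration of the affected situation (assumed shape, compiled
with -std=c++23; this is not the new subscript15.C test):
  struct S { int operator[] (int) const { return 0; } };  // only a 1-argument []
  template<typename T>
  auto f (T t) -> decltype (t[1, 2]) { return t[1, 2]; }   // C++23 multi-argument []
  int f (...) { return -1; }
  int r = f (S ());   // substitution should fail quietly and pick f(...)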
PR c++/111493
gcc/cp/ChangeLog:
* decl2.cc (grok_array_decl): Guard diagnostic and backward
compatibility fallback code paths with tf_error.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/subscript15.C: New test.
(cherry picked from commit 1fea14def849dd38b098b0e2d54e64801f9c1f43)
|
|
In order to compare the constraints of a ttp with that of its argument,
we rewrite the ttp's constraints in terms of the argument template's
template parameters. The substitution to achieve this currently uses a
single level of template arguments, but that never does the right thing
because a ttp's template parameters always have level >= 2. This patch
fixes this by including the outer template arguments in the substitution,
which ought to match the depth of the ttp.
The second testcase demonstrates it's better to substitute the concrete
outer template arguments instead of generic ones since a ttp's constraints
could depend on outer parameters.
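A hedged sketch of the kind of code involved (the names Pick and OnlyInt are
illustrative; this is not the actual concepts-ttp6.C test), where the ttp's
constraint mentions the outer parameter T:
  #include <concepts>
  // TT's constraint refers to the enclosing parameter T, so matching
  // OnlyInt against TT only works once the concrete outer argument (int)
  // has been substituted into that constraint.
  template<typename T, template<std::same_as<T> U> typename TT> struct Pick {};
  template<std::same_as<int> U> struct OnlyInt {};
  Pick<int, OnlyInt> p;   // should be accepted
  int main () {}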
PR c++/111485
gcc/cp/ChangeLog:
* pt.cc (is_compatible_template_arg): New parameter 'args'.
Add the outer template arguments 'args' to 'new_args'.
(convert_template_argument): Pass 'args' to
is_compatible_template_arg.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/concepts-ttp5.C: New test.
* g++.dg/cpp2a/concepts-ttp6.C: New test.
(cherry picked from commit 6f902a42b0afe3f3145bcb864695fc290b5acc3e)
|
|
This corrects resolving decltype of a (class) NTTP object as per
[dcl.type.decltype]/1.2 and [temp.param]/6 in the type-dependent case.
Note that in the non-dependent case we resolve the decltype ahead of
time, in which case finish_decltype_type drops the const VIEW_CONVERT_EXPR
wrapper around the TEMPLATE_PARM_INDEX, and the latter has the desired
non-const type.
In the type-dependent case, at instantiation time tsubst drops the
VIEW_CONVERT_EXPR since the substituted NTTP is the already-const object
created by get_template_parm_object. So in this case finish_decltype_type
sees the const object, which this patch now adds special handling for.
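A hedged sketch of the behaviour being fixed (assumed shape, not the actual
nontype-class60.C test):
  #include <type_traits>
  struct A { int n; };
  // 'a' has dependent type here, so decltype (a) is only resolved at
  // instantiation time (the case handled by this patch).
  template<auto a>
  void f ()
  {
    static_assert (std::is_same_v<decltype (a), A>, "expected A, not const A");
  }
  int main () { f<A{42}> (); }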
PR c++/99631
gcc/cp/ChangeLog:
* semantics.cc (finish_decltype_type): For an NTTP object,
return its type modulo cv-quals.
gcc/testsuite/ChangeLog:
* g++.dg/cpp2a/nontype-class60.C: New test.
(cherry picked from commit ddd064e3571c4a9e6258c75eba65585a07367712)
|
|
2023-09-24 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/92586
* trans-expr.cc (gfc_trans_arrayfunc_assign): Supply a missing
dereference for the call to gfc_deallocate_alloc_comp_no_caf.
gcc/testsuite/
PR fortran/92586
* gfortran.dg/pr92586.f90: New test.
|
|
2023-09-24 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/68155
* decl.cc (fix_initializer_charlen): New function broken out of
add_init_expr_to_sym.
(add_init_expr_to_sym, build_struct): Call the new function.
gcc/testsuite/
PR fortran/68155
* gfortran.dg/pr68155.f90: New test.
|
|
1. Move class NTTP object pretty printing to a more general spot in
the pretty printer, so that we always print its value instead of
its (mangled) name even when it appears outside of a template
argument list.
2. Print the type of a class NTTP object alongside its CONSTRUCTOR
value, like dump_expr would have done.
3. Don't print const VIEW_CONVERT_EXPR wrappers for class NTTPs.
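As a rough illustration of where such an object can appear in a diagnostic
(assumed shape, not the actual diagnostic19.C test); the constraint failure
below should show the NTTP object's value, not a mangled symbol name:
  struct A { int n; };
  template<A a> concept positive = (a.n > 0);
  template<A a> requires positive<a>
  void f () {}
  int main () { f<A{-1}> (); }   // deliberately fails to satisfy 'positive'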
PR c++/111471
gcc/cp/ChangeLog:
* cxx-pretty-print.cc (cxx_pretty_printer::expression)
<case VAR_DECL>: Handle class NTTP objects by printing
their type and value.
<case VIEW_CONVERT_EXPR>: Strip const VIEW_CONVERT_EXPR
wrappers for class NTTPs.
(pp_cxx_template_argument_list): Don't handle class NTTP
objects here.
gcc/testsuite/ChangeLog:
* g++.dg/concepts/diagnostic19.C: New test.
(cherry picked from commit 75c4b0cde4835b45350da0a5cd82f1d1a0a7a2f1)
|
|
Call GOMP_alloc/free for 'omp allocate' allocated variables. This is
for C only as C++ and Fortran show a sorry already in the FE. Note that
this only applies to stack variables as the C FE shows a sorry for
static variables.
gcc/ChangeLog:
* gimplify.cc (gimplify_bind_expr): Call GOMP_alloc/free for
'omp allocate' variables; move stack cleanup after other
cleanup.
(omp_notice_variable): Process original decl when decl
of the value-expression for a 'omp allocate' variable is passed.
* omp-low.cc (scan_omp_1_op): Handle 'omp allocate' variables.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1 Impl.): Mark 'omp allocate' as
implemented for C only.
* testsuite/libgomp.c/allocate-4.c: New test.
* testsuite/libgomp.c/allocate-5.c: New test.
* testsuite/libgomp.c/allocate-6.c: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-11.c: Remove C-only dg-message
for 'sorry, unimplemented'.
* c-c++-common/gomp/allocate-12.c: Likewise.
* c-c++-common/gomp/allocate-15.c: Likewise.
* c-c++-common/gomp/allocate-9.c: Likewise.
* c-c++-common/gomp/allocate-10.c: New test.
* c-c++-common/gomp/allocate-17.c: New test.
(cherry picked from commit 1a554a2c9f33fdb3c170f1c37274037ece050114)
|
|
aarch64_operands_ok_for_ldpstp contained the code:
/* One of the memory accesses must be a mempair operand.
If it is not the first one, they need to be swapped by the
peephole. */
if (!aarch64_mem_pair_operand (mem_1, GET_MODE (mem_1))
&& !aarch64_mem_pair_operand (mem_2, GET_MODE (mem_2)))
return false;
But the requirement isn't just that one of the accesses must be a
valid mempair operand. It's that the lower access must be, since
that's the access that will be used for the instruction operand.
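A rough sketch of the tightened check (names assumed; this is not the
committed hunk):
  /* Sketch only: check the access with the lower address (assumed here to
     already be mem_1), since that access is the one that becomes the
     instruction operand.  */
  if (!aarch64_mem_pair_operand (mem_1, GET_MODE (mem_1)))
    return false;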
gcc/
PR target/111411
* config/aarch64/aarch64.cc (aarch64_operands_ok_for_ldpstp): Require
the lower memory access to be a mem-pair operand.
gcc/testsuite/
PR target/111411
* gcc.dg/rtl/aarch64/pr111411.c: New test.
(cherry picked from commit 2d38f45bcca62ca0c7afef4b579f82c5c2a01610)
|
|
While working on another patch, I hit a problem with the aarch64
expansion of untyped_call. The expander emits the usual:
(set (mem ...) (reg resN))
instructions to store the result registers to memory, but it didn't
say in RTL where those resN results came from. This eventually led
to a failure of gcc.dg/torture/stackalign/builtin-return-2.c,
via regrename.
This patch turns the untyped call from a plain call to a call_value,
to represent that the call returns (or might return) a useful value.
The patch also uses a PARALLEL return rtx to represent all the possible
return registers.
gcc/
* config/aarch64/aarch64.md (untyped_call): Emit a call_value
rather than a call. List each possible destination register
in the call pattern.
(cherry picked from commit 629efe27744d13c3b83bbe8338b84c37c83dbe4f)
|
|
This ICE was caused by an invalid assumption that all BIND_EXPRs have
a non-null BIND_EXPR_BLOCK. In C++, BIND_EXPRs without a block do exist and are used for
temporaries introduced in expressions that are not full-expressions.
Since they have no block to fix up, the traversal can just ignore
these tree nodes.
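A hedged sketch of the shape of the guard (assumed, not the committed hunk):
  static tree
  fixup_blocks_walker (tree *tp, int *walk_subtrees, void *data)
  {
    if (TREE_CODE (*tp) == BIND_EXPR && !BIND_EXPR_BLOCK (*tp))
      return NULL_TREE;   /* No block to fix up; just keep walking.  */
    /* ... existing scope-block fix-up logic ...  */
    return NULL_TREE;
  }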
gcc/cp/ChangeLog
PR c++/111274
* parser.cc (fixup_blocks_walker): Check for null BIND_EXPR_BLOCK.
gcc/testsuite/ChangeLog
PR c++/111274
* g++.dg/gomp/pr111274.C: New test case.
(cherry picked from commit ab4bdad49716eb1c60e22e0e617d5eb56b0bac6f)
|
|
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_allocate): Handle
error_mark_node.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-13.c: New test.
(cherry picked from commit 55243898f8f9560371f258fe0c6ca202ab7b2085)
|
|
Both specifying no category and specifying 'all' imply
that the implicit-behavior applies to all categories.
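A rough illustration of the newly accepted syntax (illustrative only, not one
of the added tests):
  void
  f (void)
  {
    int x = 0, arr[4] = { 0 };
  #pragma omp target defaultmap(firstprivate: all)
    {
      x += arr[0];   /* both the scalar and the aggregate become firstprivate */
    }
  }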
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_defaultmap): Parse
'all' as category.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_defaultmap): Parse
'all' as category.
gcc/fortran/ChangeLog:
* gfortran.h (enum gfc_omp_defaultmap_category):
Add OMP_DEFAULTMAP_CAT_ALL.
* openmp.cc (gfc_match_omp_clauses): Parse
'all' as category.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle it.
gcc/ChangeLog:
* tree-core.h (enum omp_clause_defaultmap_kind): Add
OMP_CLAUSE_DEFAULTMAP_CATEGORY_ALL.
* gimplify.cc (gimplify_scan_omp_clauses): Handle it.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.2 status): Add depobj with
destroy-var argument as 'N'. Mark defaultmap with
'all' category as 'Y'.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
* c-c++-common/gomp/defaultmap-5.c: New test.
* c-c++-common/gomp/defaultmap-6.c: New test.
* gfortran.dg/gomp/defaultmap-10.f90: New test.
* gfortran.dg/gomp/defaultmap-9.f90: New test.
(cherry picked from commit 0698c9fddfc5a41dd7f233928b7a486cb044fea3)
|
|
Merge up to r13-7822-g10c7edcc65d4bf1d05a9f0791e77e7b953e3e796 (18th Sep 2023)
|
|
The vsetvl pass has been refactored in GCC 14, where the optimization is
more robust than in releases/gcc-13, so this problem does not exist in
GCC 14.
Phase 6 of the GCC 13 pass is an optimization phase. Because it does not
consider all cases, it contains hidden bugs, so we decided to remove phase 6.
Although the generated code will be more redundant, the program is correct.
PR target/111412
gcc/ChangeLog:
* config/riscv/riscv-vsetvl.cc (vector_infos_manager::release): Remove.
(pass_vsetvl::refine_vsetvls): Ditto.
(pass_vsetvl::cleanup_vsetvls): Ditto.
(pass_vsetvl::propagate_avl): Ditto.
(pass_vsetvl::lazy_vsetvl): Ditto.
* config/riscv/riscv-vsetvl.h: Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/vsetvl/avl_single-79.c: Adjust case.
* gcc.target/riscv/rvv/vsetvl/avl_single-80.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-86.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-87.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-88.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/avl_single-90.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-1.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-6.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvl-8.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-2.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vsetvlmax-4.c: Ditto.
* gcc.target/riscv/rvv/base/pr111412.c: New test.
|
|
This patch generates a singular or plural message relating to the
number of missing enums. It also enables -Wcase-enum when building the
Modula-2 libraries and m2/stage2/cc1gm2.
gcc/m2/ChangeLog:
* Make-lang.in (GM2_FLAGS): Add -Wcase-enum.
(GM2_ISO_FLAGS): Add -Wcase-enum.
* gm2-compiler/M2CaseList.mod (EnumerateErrors): Issue
singular or plural start text prior to the enum list.
Remove unused parameter tokenno.
(EmitMissingRangeErrors): New procedure.
(MissingCaseBounds): Call EmitMissingRangeErrors.
(MissingCaseStatementBounds): Call EmitMissingRangeErrors.
* gm2-libs-iso/TextIO.mod: Fix spacing.
libgm2/ChangeLog:
* libm2cor/Makefile.am (libm2cor_la_M2FLAGS): Add
-Wcase-enum.
* libm2cor/Makefile.in: Regenerate.
* libm2iso/Makefile.am (libm2iso_la_M2FLAGS): Add
-Wcase-enum.
* libm2iso/Makefile.in: Regenerate.
* libm2log/Makefile.am (libm2log_la_M2FLAGS): Add
-Wcase-enum.
* libm2log/Makefile.in: Regenerate.
* libm2pim/Makefile.am (libm2pim_la_M2FLAGS): Add
-Wcase-enum.
* libm2pim/Makefile.in: Regenerate.
(cherry picked from commit 3af2af15798cb6243e2643f98f62c9270b1ca5d2)
Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
|
|
AArch64 normally puts the saved registers near the bottom of the frame,
immediately above any dynamic allocations. But this means that a
stack-smash attack on those dynamic allocations could overwrite the
saved registers without needing to reach as far as the stack smash
canary.
The same thing could also happen for variable-sized arguments that are
passed by value, since those are allocated before a call and popped on
return.
This patch avoids that by putting the locals (and thus the canary) below
the saved registers when stack smash protection is active.
The patch fixes CVE-2023-4039.
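A hedged sketch of the new predicate (assumed shape, not necessarily the
committed code):
  /* Return true if the callee-saved registers should be placed above,
     rather than below, the local variables (and thus the canary).  */
  static bool
  aarch64_save_regs_above_locals_p ()
  {
    /* Assumed condition: stack-smashing protection is active for this
       function when crtl->stack_protect_guard is set.  */
    return crtl->stack_protect_guard != nullptr;
  }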
gcc/
* config/aarch64/aarch64.cc (aarch64_save_regs_above_locals_p):
New function.
(aarch64_layout_frame): Use it to decide whether locals should
go above or below the saved registers.
(aarch64_expand_prologue): Update stack layout comment.
Emit a stack tie after the final adjustment.
gcc/testsuite/
* gcc.target/aarch64/stack-protector-8.c: New test.
* gcc.target/aarch64/stack-protector-9.c: Likewise.
|
|
After previous patches, it's no longer necessary to store
saved_regs_size and below_hard_fp_saved_regs_size in the frame info.
All measurements instead use the top or bottom of the frame as
reference points.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::saved_regs_size)
(aarch64_frame::below_hard_fp_saved_regs_size): Delete.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Update accordingly.
|
|
The stack frame is currently divided into three areas:
A: the area above the hard frame pointer
B: the SVE saves below the hard frame pointer
C: the outgoing arguments
If the stack frame is allocated in one chunk, the allocation needs a
probe if the frame size is >= guard_size - 1KiB. In addition, if the
function is not a leaf function, it must probe an address no more than
1KiB above the outgoing SP. We ensured the second condition by
(1) using single-chunk allocations for non-leaf functions only if
the link register save slot is within 512 bytes of the bottom
of the frame; and
(2) using the link register save as a probe (meaning, for instance,
that it can't be individually shrink-wrapped)
If instead the stack is allocated in multiple chunks, then:
* an allocation involving only the outgoing arguments (C above) requires
a probe if the allocation size is > 1KiB
* any other allocation requires a probe if the allocation size
is >= guard_size - 1KiB
* second and subsequent allocations require the previous allocation
to probe at the bottom of the allocated area, regardless of the size
of that previous allocation
The final point means that, unlike for single allocations,
it can be necessary to have both a non-SVE register probe and
an SVE register probe. For example:
* allocate A, probe using a non-SVE register save
* allocate B, probe using an SVE register save
* allocate C
The non-SVE register used in this case was again the link register.
It was previously used even if the link register save slot was some
bytes above the bottom of the non-SVE register saves, but an earlier
patch avoided that by putting the link register save slot first.
As a belt-and-braces fix, this patch explicitly records which
probe registers we're using and allows the non-SVE probe to be
whichever register comes first (as for SVE).
The patch also avoids unnecessary probes in sve/pcs/stack_clash_3.c.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::sve_save_and_probe)
(aarch64_frame::hard_fp_save_and_probe): New fields.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize them.
Rather than asserting that a leaf function saves LR, instead assert
that a leaf function saves something.
(aarch64_get_separate_components): Prevent the chosen probe
registers from being individually shrink-wrapped.
(aarch64_allocate_and_probe_stack_space): Remove workaround for
probe registers that aren't at the bottom of the previous allocation.
gcc/testsuite/
* gcc.target/aarch64/sve/pcs/stack_clash_3.c: Avoid redundant probes.
|
|
Previous patches ensured that the final frame allocation only needs
a probe when the size is strictly greater than 1KiB. It's therefore
safe to use the normal 1024 probe offset in all cases.
The main motivation for doing this is to simplify the code and
remove the number of special cases.
gcc/
* config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space):
Always probe the residual allocation at offset 1024, asserting
that that is in range.
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-17.c: Expect the probe
to be at offset 1024 rather than offset 0.
* gcc.target/aarch64/stack-check-prologue-18.c: Likewise.
* gcc.target/aarch64/stack-check-prologue-19.c: Likewise.
|
|
-fstack-clash-protection uses the save of LR as a probe for the next
allocation. The next allocation could be:
* another part of the static frame, e.g. when allocating SVE save slots
or outgoing arguments
* an alloca in the same function
* an allocation made by a callee function
However, when -fomit-frame-pointer is used, the LR save slot is placed
above the other GPR save slots. It could therefore be up to 80 bytes
above the base of the GPR save area (which is also the hard fp address).
aarch64_allocate_and_probe_stack_space took this into account when
deciding how much subsequent space could be allocated without needing
a probe. However, it interacted badly with:
/* If doing a small final adjustment, we always probe at offset 0.
This is done to avoid issues when LR is not at position 0 or when
the final adjustment is smaller than the probing offset. */
else if (final_adjustment_p && rounded_size == 0)
residual_probe_offset = 0;
which forces any allocation that is smaller than the guard page size
to be probed at offset 0 rather than the usual offset 1024. It was
therefore possible to construct cases in which we had:
* a probe using LR at SP + 80 bytes (or some other value >= 16)
* an allocation of the guard page size - 16 bytes
* a probe at SP + 0
which allocates guard page size + 64 consecutive unprobed bytes.
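For example, with a 64KiB guard page: the LR probe touches the old SP + 80,
the allocation is 65536 - 16 = 65520 bytes, and the final probe touches the
new SP + 0 = old SP - 65520, so 80 + 65520 = 65600 bytes (64KiB + 64) between
consecutive probes are never probed.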
This patch requires the LR probe to be in the first 16 bytes of the
save area when stack clash protection is active. Doing it
unconditionally would cause code-quality regressions.
Putting LR before other registers prevents push/pop allocation
when shadow call stacks are enabled, since LR is restored
separately from the other callee-saved registers.
The new comment doesn't say that the probe register is required
to be LR, since a later patch removes that restriction.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Ensure that
the LR save slot is in the first 16 bytes of the register save area.
Only form STP/LDP push/pop candidates if both registers are valid.
(aarch64_allocate_and_probe_stack_space): Remove workaround for
when LR was not in the first 16 bytes.
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-18.c: New test.
* gcc.target/aarch64/stack-check-prologue-19.c: Likewise.
* gcc.target/aarch64/stack-check-prologue-20.c: Likewise.
|
|
The AArch64 ABI says that, when stack clash protection is used,
there can be a maximum of 1KiB of unprobed space at sp on entry
to a function. Therefore, we need to probe when allocating
>= guard_size - 1KiB of data (>= rather than >). This is what
GCC does.
If an allocation is exactly guard_size bytes, it is enough to allocate
those bytes and probe once at offset 1024. It isn't possible to use a
single probe at any other offset: higher would conmplicate later code,
by leaving more unprobed space than usual, while lower would risk
leaving an entire page unprobed. For simplicity, the code probes all
allocations at offset 1024.
Some register saves also act as probes. If we need to allocate
more space below the last such register save probe, we need to
probe the allocation if it is > 1KiB. Again, this allocation is
then sometimes (but not always) probed at offset 1024. This sort of
allocation is currently only used for outgoing arguments, which are
rarely this big.
However, the code also probed if this final outgoing-arguments
allocation was == 1KiB, rather than just > 1KiB. This isn't
necessary, since the register save then probes at offset 1024
as required. Continuing to probe allocations of exactly 1KiB
would complicate later patches.
gcc/
* config/aarch64/aarch64.cc (aarch64_allocate_and_probe_stack_space):
Don't probe final allocations that are exactly 1KiB in size (after
unprobed space above the final allocation has been deducted).
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-17.c: New test.
|
|
This patch just changes a calculation of initial_adjust
to one that makes it slightly more obvious that the total
adjustment is frame.frame_size.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Tweak
calculation of initial_adjust for frames in which all saves
are SVE saves.
|
|
After previous patches, it no longer really makes sense to allocate
the top of the frame in terms of varargs_and_saved_regs_size and
saved_regs_and_above.
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_frame): Simplify
the allocation of the top of the frame.
|
|
reg_offset was measured from the bottom of the saved register area.
This made perfect sense with the original layout, since the bottom
of the saved register area was also the hard frame pointer address.
It became slightly less obvious with SVE, since we save SVE
registers below the hard frame pointer, but it still made sense.
However, if we want to allow different frame layouts, it's more
convenient and obvious to measure reg_offset from the bottom of
the frame. After previous patches, it's also a slight simplification
in its own right.
gcc/
* config/aarch64/aarch64.h (aarch64_frame): Add comment above
reg_offset.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Walk offsets
from the bottom of the frame, rather than the bottom of the saved
register area. Measure reg_offset from the bottom of the frame
rather than the bottom of the saved register area.
(aarch64_save_callee_saves): Update accordingly.
(aarch64_restore_callee_saves): Likewise.
(aarch64_get_separate_components): Likewise.
(aarch64_process_components): Likewise.
|
|
This patch fixes another case in which a value was described with
an “upside-down” view.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::frame_size): Tweak comment.
|
|
Similarly to the previous locals_offset patch, hard_fp_offset
was described as:
/* Offset from the base of the frame (incomming SP) to the
hard_frame_pointer. This value is always a multiple of
STACK_BOUNDARY. */
poly_int64 hard_fp_offset;
which again took an “upside-down” view: higher offsets meant lower
addresses. This patch renames the field to bytes_above_hard_fp instead.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::hard_fp_offset): Rename
to...
(aarch64_frame::bytes_above_hard_fp): ...this.
* config/aarch64/aarch64.cc (aarch64_layout_frame)
(aarch64_expand_prologue): Update accordingly.
(aarch64_initial_elimination_offset): Likewise.
|
|
locals_offset was described as:
/* Offset from the base of the frame (incomming SP) to the
top of the locals area. This value is always a multiple of
STACK_BOUNDARY. */
This is implicitly an “upside down” view of the frame: the incoming
SP is at offset 0, and anything N bytes below the incoming SP is at
offset N (rather than -N).
However, reg_offset instead uses a “right way up” view; that is,
it views offsets in address terms. Something above X is at a
positive offset from X and something below X is at a negative
offset from X.
Also, even on FRAME_GROWS_DOWNWARD targets like AArch64,
target-independent code views offsets in address terms too:
locals are allocated at negative offsets to virtual_stack_vars.
It seems confusing to have *_offset fields of the same structure
using different polarities like this. This patch tries to avoid
that by renaming locals_offset to bytes_above_locals.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::locals_offset): Rename to...
(aarch64_frame::bytes_above_locals): ...this.
* config/aarch64/aarch64.cc (aarch64_layout_frame)
(aarch64_initial_elimination_offset): Update accordingly.
|
|
After previous patches, it is no longer necessary to calculate
a chain_offset in cases where there is no chain record.
gcc/
* config/aarch64/aarch64.cc (aarch64_expand_prologue): Move the
calculation of chain_offset into the emit_frame_chain block.
|
|
aarch64_save_callee_saves and aarch64_restore_callee_saves took
a parameter called start_offset that gives the offset of the
bottom of the saved register area from the current stack pointer.
However, it's more convenient for later patches if we use the
bottom of the entire frame as the reference point, rather than
the bottom of the saved registers.
Doing that removes the need for the callee_offset field.
Other than that, this is not a win on its own. It only really
makes sense in combination with the follow-on patches.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::callee_offset): Delete.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Remove
callee_offset handling.
(aarch64_save_callee_saves): Replace the start_offset parameter
with a bytes_below_sp parameter.
(aarch64_restore_callee_saves): Likewise.
(aarch64_expand_prologue): Update accordingly.
(aarch64_expand_epilogue): Likewise.
|
|
Following on from the previous bytes_below_saved_regs patch, this one
records the number of bytes that are below the hard frame pointer.
This eventually replaces below_hard_fp_saved_regs_size.
If a frame pointer is not needed, the epilogue adds final_adjust
to the stack pointer before restoring registers:
aarch64_add_sp (tmp1_rtx, tmp0_rtx, final_adjust, true);
Therefore, if the epilogue needs to restore the stack pointer from
the hard frame pointer, the directly corresponding offset is:
-bytes_below_hard_fp + final_adjust
i.e. go from the hard frame pointer to the bottom of the frame,
then add the same amount as if we were using the stack pointer
from the outset.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::bytes_below_hard_fp): New
field.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize it.
(aarch64_expand_epilogue): Use it instead of
below_hard_fp_saved_regs_size.
|
|
The frame layout code currently hard-codes the assumption that
the number of bytes below the saved registers is equal to the
size of the outgoing arguments. This patch abstracts that
value into a new field of aarch64_frame.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::bytes_below_saved_regs): New
field.
* config/aarch64/aarch64.cc (aarch64_layout_frame): Initialize it,
and use it instead of crtl->outgoing_args_size.
(aarch64_get_separate_components): Use bytes_below_saved_regs instead
of outgoing_args_size.
(aarch64_process_components): Likewise.
|