aboutsummaryrefslogtreecommitdiff
path: root/gcc
AgeCommit message (Collapse)AuthorFilesLines
2024-04-11aarch64: Remove FMV features whose names may changeAndrew Carlotti1-12/+2
Some architecture features have been combined under a single command line flag, but have been assigned multiple FMV feature names with the command line flag name enabling only a subset of these features in the FMV specification. I've proposed reallocating names in the FMV specification to match the command line flags [1], but for GCC 14 we'll just remove them from the FMV feature list. [1] https://github.com/ARM-software/acle/pull/315 gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def: Remove "memtag", "memtag2", "ssbs", "ssbs2", "ls64", "ls64_v" and "ls64_accdata" FMV features.
2024-04-11aarch64: Remove unsupported FMV featuresAndrew Carlotti1-38/+0
It currently isn't possible to support function multiversioning features properly in GCC without also enabling the extension in the command line options (with the exception of features such as "rpres" that do not require assembler support). We therefore remove unsupported features from GCC's list of FMV features. Some of these features ("fcma", "jscvt", "frintts", "flagm2", "wfxt", "rcpc2", and perhaps "dpb" and "dpb2") will be added back in the future once support for the command line option has been added. The rest of the removed features I have proposed removing from the ACLE specification as well, since it doesn't seem worthwhile to include support for them; see the ACLE pull request for more detailed justification: https://github.com/ARM-software/acle/pull/315 gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def: Remove "flagm2", "sha1", "pmull", "dit", "dpb", "dpb2", "jscvt", "fcma", "rcpc2", "frintts", "dgh", "ebf16", "sve-bf16", "sve-ebf16", "sve-i8mm", "sve2-pmull128", "memtag3", "bti" and "wfxt" entries.
2024-04-11aarch64: Fix typo and make rdma/rdm alias for FMVAndrew Carlotti2-2/+7
gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def: Fix "rmd"->"rdm", and add FMV to "rdma". * config/aarch64/aarch64.cc (FEAT_RDMA): Define as FEAT_RDM.
2024-04-11aarch64: Fix FMV array iteration boundsAndrew Carlotti2-3/+43
There was an assumption in some places that the aarch64_fmv_feature_data array contained FEAT_MAX elements. While this assumption held up till now, it is safer and more flexible to use the array size directly. Also fix the lower bound in compare_feature_masks to use ">=0" instead of ">0", and add a test using the features at index 0 and 1. However, the test already passed, because the earlier popcount check makes it impossible to reach the loop if the masks differ in exactly one location. gcc/ChangeLog: * config/aarch64/aarch64.cc (compare_feature_masks): Use ARRAY_SIZE and >=0 for iteration bounds. (aarch64_mangle_decl_assembler_name): Use ARRAY_SIZE. gcc/testsuite/ChangeLog: * g++.target/aarch64/mv-1.C: New test.
2024-04-11aarch64: Reorder FMV feature prioritiesAndrew Carlotti3-13/+9
Some higher priority FMV features were dependent subsets of lower priority features. Fix this, using the new priorities specified in https://github.com/ARM-software/acle/pull/279. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def: Reorder FMV entries. gcc/testsuite/ChangeLog: * gcc.target/aarch64/cpunative/native_cpu_21.c: Reorder features. * gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.
2024-04-11c++: build_extra_args recapturing local specs [PR114303]Patrick Palka3-1/+47
r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated contexts first so that we prefer processing a local specialization in an evaluated context even if its first use is in an unevaluated context. But this means we need to avoid walking a tree that already has extra args/specs saved because the list of saved specs appears to be an evaluated context which we'll now walk first. It seems then that we should be calculating the saved specs from scratch each time, rather than potentially walking the saved specs list from an earlier partial instantiation when calling build_extra_args a second time around. PR c++/114303 gcc/cp/ChangeLog: * constraint.cc (tsubst_requires_expr): Clear REQUIRES_EXPR_EXTRA_ARGS before calling build_extra_args. * pt.cc (tree_extra_args): Define. (extract_locals_r): Assert *_EXTRA_ARGS is empty. (tsubst_stmt) <case IF_STMT>: Clear IF_SCOPE on the new IF_STMT. Call build_extra_args on the new IF_STMT instead of t which might already have IF_STMT_EXTRA_ARGS. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/constexpr-if-lambda6.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com>
2024-04-11modula2: add modula-2 language section to languages supported by GCCGaius Mulley1-0/+11
This patch introduces a small modula-2 language section to the Language Standards Supported by GCC node. gcc/ChangeLog: * doc/standards.texi (Language Standards Supported by GCC): Add Modula-2 language section. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-04-11asan, v3: Fix up handling of > 32 byte aligned variables with ↵Jakub Jelinek2-3/+73
-fsanitize=address -fstack-protector* [PR110027] On Tue, Mar 26, 2024 at 02:08:02PM +0800, liuhongt wrote: > > > So, try to add some other variable with larger size and smaller alignment > > > to the frame (and make sure it isn't optimized away). > > > > > > alignb above is the alignment of the first partition's var, if > > > align_frame_offset really needs to depend on the var alignment, it probably > > > should be the maximum alignment of all the vars with alignment > > > alignb * BITS_PER_UNIT <=3D MAX_SUPPORTED_STACK_ALIGNMENT > > > > > In asan_emit_stack_protection, when it allocated fake stack, it assume > bottom of stack is also aligned to alignb. And the place violated this > is the first var partition. which is 32 bytes offsets, it should be > BIGGEST_ALIGNMENT / BITS_PER_UNIT. > So I think we need to use MAX (BIGGEST_ALIGNMENT / > BITS_PER_UNIT, ASAN_RED_ZONE_SIZE) for the first var partition. Your first patch aligned offsets[0] to maximum of alignb and ASAN_RED_ZONE_SIZE. But as I wrote in the reply to that mail, alignb there is the alignment of just a single variable which is the first one to appear in the sorted list and is placed in the highest spot in the stack frame. That is not necessarily the largest alignment, the sorting ensures that it is a variable with the largest size in the frame (and only if several of them have equal size, largest alignment from the same sized ones). Your second patch used maximum of BIGGEST_ALIGNMENT / BITS_PER_UNIT and ASAN_RED_ZONE_SIZE. That doesn't change anything at all when using -mno-avx512f - offsets[0] is still just 32-byte aligned in that case relative to top of frame, just changes the -mavx512f case to be 64-byte aligned offsets[0] (aka offsets[0] is then either 0 or -64 instead of either 0 or -32). That will not help if any variable in the frame needs 128-byte, 256-byte, 512-byte ... 4096-byte alignment. If you want to fix the bug in the spot you've touched, you'd need to walk all the stack_vars[stack_vars_sorted[si2]] for si2 [si + 1, n - 1] and for those where the loop would do anything (i.e. stack_vars[i2].representative == i2 && TREE_CODE (decl2) == SSA_NAME ? SA.partition_to_pseudo[var_to_partition (SA.map, decl2)] == NULL_RTX : DECL_RTL (decl2) == pc_rtx and the pred applies (but that means also walking the earlier ones! because with -fstack-protector* the vars can be processed in several calls) and alignb2 * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT and compute maximum of those alignments. That maximum is already computed, data->asan_alignb = MAX (data->asan_alignb, alignb); computes that, but you get the final result only after you do all the expand_stack_vars calls. You'd need to compute it before. Though, that change would be still in the wrong place. The thing is, it would be a waste of the precious stack space when it isn't needed at all (e.g. when asan will not at compile time do the use after return checking, or if it won't do it at runtime, or even if it will do at runtime it will waste the space on the stack). The following patch fixes it solely for the __asan_stack_malloc_N allocations, doesn't enlarge unnecessarily further the actual stack frame. Because asan is only supported on FRAME_GROWS_DOWNWARD architectures (mips, rs6000 and xtensa are conditional FRAME_GROWS_DOWNWARD arches, which for -fsanitize=address or -fstack-protector* use FRAME_GROWS_DOWNWARD 1, otherwise 0, others supporting asan always just use 1), the assumption for the dynamic stack realignment is that the top of the stack frame (aka offset 0) is aligned to alignb passed to the function (which is the maximum of alignb of all the vars in the frame). As checked by the assertion in the patch, offsets[0] is 0 most of the time and so that assumption is correct, the only case when it is not 0 is if -fstack-protector* is on together with -fsanitize=address and cfgexpand.cc (create_stack_guard) created a stack guard. That is the only variable which is allocated in the stack frame right away, for all others with -fsanitize=address defer_stack_allocation (or -fstack-protector*) returns true and so they aren't allocated immediately but handled during the frame layout phases. So, the original frame_offset of 0 is changed because of the stack guard to -pointer_size_in_bytes and later at the if (data->asan_vec.is_empty ()) { align_frame_offset (ASAN_RED_ZONE_SIZE); prev_offset = frame_offset.to_constant (); } to -ASAN_RED_ZONE_SIZE. The asan_emit_stack_protection code wasn't taking this into account though, so essentially assumed in the __asan_stack_malloc_N allocated memory it needs to align it such that pointer corresponding to offsets[0] is alignb aligned. But that isn't correct if alignb > ASAN_RED_ZONE_SIZE, in that case it needs to ensure that pointer corresponding to frame offset 0 is alignb aligned. The following patch fixes that. Unlike the previous case where we knew that asan_frame_size + base_align_bias falls into the same bucket as asan_frame_size, this isn't in some cases true anymore, so the patch recomputes which bucket to use and if going to bucket 11 (because there is no __asan_stack_malloc_11 function in the library) disables the after return sanitization. 2024-04-11 Jakub Jelinek <jakub@redhat.com> PR middle-end/110027 * asan.cc (asan_emit_stack_protection): Assert offsets[0] is zero if there is no stack protect guard, otherwise -ASAN_RED_ZONE_SIZE. If alignb > ASAN_RED_ZONE_SIZE and there is stack pointer guard, take the ASAN_RED_ZONE_SIZE bytes allocated at the top of the stack into account when computing base_align_bias. Recompute use_after_return_class from asan_frame_size + base_align_bias and set to -1 if that would overflow to 11. * gcc.dg/asan/pr110027.c: New test.
2024-04-11tree-optimization/109596 - wrong debug stmt move by copyheaderRichard Biener1-1/+1
The following fixes an omission in r14-162-gcda246f8b421ba causing wrong-debug and a bunch of guality regressions. PR tree-optimization/109596 * tree-ssa-loop-ch.cc (ch_base::copy_headers): Propagate debug stmts to nonexit->dest rather than exit->dest.
2024-04-11middle-end/114681 - condition coverage and inliningRichard Biener2-1/+19
When inlining a gcond it can map to multiple stmts, esp. with non-call EH. The following makes sure to pick up the remapped condition when dealing with condition coverage. PR middle-end/114681 * tree-inline.cc (copy_bb): Key on the remapped stmt to identify gconds to have condition coverage data remapped. * gcc.misc-tests/gcov-pr114681.c: New testcase.
2024-04-11c++: Fix ANNOTATE_EXPR instantiation [PR114409]Jakub Jelinek2-13/+53
The following testcase ICEs starting with the r14-4229 PR111529 change which moved ANNOTATE_EXPR handling from tsubst_expr to tsubst_copy_and_build. ANNOTATE_EXPR is only allowed in the IL to wrap a loop condition, and the loop condition of while/for loops can be a COMPOUND_EXPR with DECL_EXPR in the first operand and the corresponding VAR_DECL in the second, as created by finish_cond else if (!empty_expr_stmt_p (cond)) expr = build2 (COMPOUND_EXPR, TREE_TYPE (expr), cond, expr); Since then Patrick reworked the instantiation, so that we have now tsubst_stmt and tsubst_expr and ANNOTATE_EXPR ended up in the latter, while only tsubst_stmt can handle DECL_EXPR. Now, the reason why the while/for loops with variable declaration in the condition works in templates without the pragmas (i.e. without ANNOTATE_EXPR) is that both the FOR_STMT and WHILE_STMT handling uses RECUR aka tsubst_stmt in handling of the *_COND operand: case FOR_STMT: stmt = begin_for_stmt (NULL_TREE, NULL_TREE); RECUR (FOR_INIT_STMT (t)); finish_init_stmt (stmt); tmp = RECUR (FOR_COND (t)); finish_for_cond (tmp, stmt, false, 0, false); and case WHILE_STMT: stmt = begin_while_stmt (); tmp = RECUR (WHILE_COND (t)); finish_while_stmt_cond (tmp, stmt, false, 0, false); Therefore, it will handle DECL_EXPR embedded in COMPOUND_EXPR of the {WHILE,FOR}_COND just fine. But if that COMPOUND_EXPR with DECL_EXPR is wrapped with one or more ANNOTATE_EXPRs, because ANNOTATE_EXPR is now done solely in tsubst_expr and uses RECUR there (i.e. tsubst_expr), it will ICE on DECL_EXPR in there. This could be fixed by keeping ANNOTATE_EXPR handling in tsubst_expr but using tsubst_stmt for the first operand, but this patch instead moves ANNOTATE_EXPR handling to tsubst_stmt (and uses tsubst_expr for the second/third operand). 2024-04-11 Jakub Jelinek <jakub@redhat.com> PR c++/114409 * pt.cc (tsubst_expr) <case ANNOTATE_EXPR>: Move to ... (tsubst_stmt) <case ANNOTATE_EXPR>: ... here. Use tsubst_expr instead of RECUR for the last 2 arguments. * g++.dg/ext/pr114409-2.C: New test.
2024-04-11RISC-V: Remove -Wno-psabi for test build option [NFC]Pan Li62-62/+62
Just notice there are some test case still have -Wno-psabi option, which is deprecated now. Remove them all for riscv test cases. The below test are passed for this patch. * The riscv rvv regression test. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/pr109244.C: Remove deprecated -Wno-psabi option. * g++.target/riscv/rvv/base/pr109535.C: Ditto. * gcc.target/riscv/rvv/autovec/fixed-vlmax-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/compress_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/consecutive_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge-7.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/merge_run-7.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-7.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-5.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-6.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/perm_run-7.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-1u.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-2u.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-3u.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-4u.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-run.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_extract-runu.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-1.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-2.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-3.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-4.c: Ditto. * gcc.target/riscv/rvv/autovec/vls-vlmax/vec_set-run.c: Ditto. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-04-11RISC-V: Bugfix ICE for the vector return arg in mode switchPan Li4-2/+79
This patch would like to fix a ICE in mode sw for below example code. during RTL pass: mode_sw test.c: In function ‘vbool16_t j(vuint64m4_t)’: test.c:15:1: internal compiler error: in create_pre_exit, at mode-switching.cc:451 15 | } | ^ 0x3978f12 create_pre_exit __RISCV_BUILD__/../gcc/mode-switching.cc:451 0x3979e9e optimize_mode_switching __RISCV_BUILD__/../gcc/mode-switching.cc:849 0x397b9bc execute __RISCV_BUILD__/../gcc/mode-switching.cc:1324 extern size_t get_vl (); vbool16_t test (vuint64m4_t a) { unsigned long b; return __riscv_vmsne_vx_u64m4_b16 (a, b, get_vl ()); } The create_pre_exit would like to find a return value copy. If not, there will be a reason in assert but not available for above sample code when vector calling convension is enabled by default. This patch would like to override the TARGET_FUNCTION_VALUE_REGNO_P for vector register and then we will have hard_regno_nregs for copy_num, aka there is a return value copy. As a side-effect of allow vector in TARGET_FUNCTION_VALUE_REGNO_P, the TARGET_GET_RAW_RESULT_MODE will have vector mode and which is sizeless cannot be converted to fixed_size_mode. Thus override the hook TARGET_GET_RAW_RESULT_MODE and return VOIDmode when the regno is-not-a fixed_size_mode. The below tests are passed for this patch. * The fully riscv regression tests. * The reproducing test in bugzilla PR114639. PR target/114639 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_function_value_regno_p): New func impl for hook TARGET_FUNCTION_VALUE_REGNO_P. (riscv_get_raw_result_mode): New func imple for hook TARGET_GET_RAW_RESULT_MODE. (TARGET_FUNCTION_VALUE_REGNO_P): Impl the hook. (TARGET_GET_RAW_RESULT_MODE): Ditto. * config/riscv/riscv.h (V_RETURN): New macro for vector return. (GP_RETURN_FIRST): New macro for the first GPR in return. (GP_RETURN_LAST): New macro for the last GPR in return. (FP_RETURN_FIRST): Diito but for FPR. (FP_RETURN_LAST): Ditto. (FUNCTION_VALUE_REGNO_P): Remove as deprecated and replace by TARGET_FUNCTION_VALUE_REGNO_P. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/pr114639-1.C: New test. * gcc.target/riscv/rvv/base/pr114639-1.c: New test. Signed-off-by: Pan Li <pan2.li@intel.com>
2024-04-10btf: do not skip members of data type with type id BTF_VOID_TYPEIDIndu Bhagat3-12/+8
The previous fix in gen_ctf_sou_type () exposes an issue in BTF generation, however: BTF emission was currently decrementing the vlen (indicating the number of members) to skip members of type CTF_K_UNKNOWN altogether, but still emitting the BTF for the corresponding member (in output_asm_btf_sou_fields ()). One can see malformed BTF by executing the newly added CTF testcase (gcc.dg/debug/ctf/ctf-bitfields-5.c) with -gbtf instead or even existing btf-struct-2.c without this patch. To fix the issue, it makes sense to rather _not_ skip members of data type of type id BTF_VOID_TYPEID. gcc/ChangeLog: * btfout.cc (btf_asm_type): Do not skip emitting members of unknown type. gcc/testsuite/ChangeLog: * gcc.dg/debug/btf/btf-bitfields-4.c: Update the vlen check. * gcc.dg/debug/btf/btf-struct-2.c: Check that member named 'f' with void data type is emitted.
2024-04-10ctf: fix PR debug/112878Indu Bhagat2-5/+27
PR debug/112878: ICE: in ctf_add_slice, at ctfc.cc:499 with _BitInt > 255 in a struct and -gctf1 The CTF generation in GCC does not have a mechanism to roll-back an already added type. In this testcase presented in the PR, we hit a representation limit in CTF slices (for a member of a struct) and ICE, after the type for struct (CTF_K_STRUCT) has already been added to the container. To exit gracefully instead, we now check for both the offset and size of the bitfield to be explicitly <= 255. If the check fails, we emit the member with type CTF_K_UNKNOWN. Note that, the value 255 stems from the existing binutils libctf checks which were motivated to guard against malformed inputs. Although it is not accurate to say that this is a CTF representation limit, mark the code with TBD_CTF_REPRESENTATION_LIMIT for now so that this can be taken care of with the next format version bump, when libctf's checks for the slice data can be lifted as well. gcc/ChangeLog: PR debug/112878 * dwarf2ctf.cc (gen_ctf_sou_type): Check for conditions before call to ctf_add_slice. Use CTF_K_UNKNOWN type if fail. gcc/testsuite/ChangeLog: PR debug/112878 * gcc.dg/debug/ctf/ctf-bitfields-5.c: New test.
2024-04-11Daily bump.GCC Administrator7-1/+260
2024-04-11Revert "testsuite/gcc.target/cris/pr93372-2.c: Handle xpass from combine ↵Hans-Peter Nilsson1-8/+7
improvement" This reverts commit 4c8b3600c4856f7915281ae3ff4d97271c83a540.
2024-04-10target: missing -Whardened with -fcf-protection=none [PR114606]Marek Polacek3-1/+17
-Whardened warns when -fhardened couldn't enable a hardening option because that option was disabled on the command line, e.g.: $ ./cc1plus -quiet g.C -fhardened -O2 -fstack-protector cc1plus: warning: '-fstack-protector-strong' is not enabled by '-fhardened' because it was specified on the command line [-Whardened] but it doesn't work as expected with -fcf-protection=none: $ ./cc1plus -quiet g.C -fhardened -O2 -fcf-protection=none because we're checking == CF_NONE which doesn't distinguish between nothing and -fcf-protection=none. I should have used opts_set, like below. PR target/114606 gcc/ChangeLog: * config/i386/i386-options.cc (ix86_option_override_internal): Use opts_set rather than checking == CF_NONE. gcc/testsuite/ChangeLog: * gcc.target/i386/fhardened-1.c: New test. * gcc.target/i386/fhardened-2.c: New test. Reviewed-by: Jakub Jelinek <jakub@redhat.com>
2024-04-10analyzer: fix ICE on negative values for size_t [PR114472]David Malcolm4-5/+38
I made several attempts to fix this properly, but for now apply a band-aid to at least prevent crashing on such cases. gcc/analyzer/ChangeLog: PR analyzer/114472 * access-diagram.cc (bit_size_expr::maybe_get_formatted_str): Reject attempts to print sizes that are too large. * region.cc (region_offset::calc_symbolic_bit_offset): Use a typeless svalue for the bit offset. * store.cc (bit_range::intersects_p): Replace assertion with test. (bit_range::exceeds_p): Likewise. (bit_range::falls_short_of_p): Likewise. gcc/testsuite/ChangeLog: * c-c++-common/analyzer/out-of-bounds-pr114472.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-04-10analyzer: add SARIF property bag to -Wanalyzer-infinite-loopDavid Malcolm1-0/+22
gcc/analyzer/ChangeLog: * infinite-loop.cc: Include "diagnostic-format-sarif.h". (infinite_loop::to_json): New. (infinite_loop_diagnostic::maybe_add_sarif_properties): New. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-04-10analyzer: add SARIF property bag to -Wanalyzer-infinite-recursionDavid Malcolm1-0/+13
gcc/analyzer/ChangeLog: * infinite-recursion.cc: Include "diagnostic-format-sarif.h". (infinite_recursion_diagnostic::maybe_add_sarif_properties): New. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-04-10analyzer: add SARIF property bags to -Wanalyzer-overlapping-buffersDavid Malcolm3-3/+49
gcc/analyzer/ChangeLog: * call-details.cc: Include "diagnostic-format-sarif.h". (overlapping_buffers::overlapping_buffers): Add params for new fields. (overlapping_buffers::maybe_add_sarif_properties): New. (overlapping_buffers::m_byte_range_a): New field. (overlapping_buffers::byte_range_b): New field. (overlapping_buffers::m_num_bytes_read_sval): New field. (call_details::complain_about_overlap): Pass new params to overlapping_buffers ctor. * ranges.cc (symbolic_byte_offset::to_json): New. (symbolic_byte_range::to_json): New. * ranges.h (symbolic_byte_offset::to_json): New decl. (symbolic_byte_range::to_json): New decl. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-04-10analyzer: show size in SARIF property bag for -Wanalyzer-tainted-allocation-sizeDavid Malcolm1-1/+14
gcc/analyzer/ChangeLog: * sm-taint.cc (tainted_allocation_size::tainted_allocation_size): Add "size_in_bytes" param. (tainted_allocation_size::maybe_add_sarif_properties): New. (tainted_allocation_size::m_size_in_bytes): New field. (region_model::check_dynamic_size_for_taint): Pass size_in_bytes to tainted_allocation_size ctor. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-04-10analyzer: fixes to internal docsDavid Malcolm1-2/+8
gcc/ChangeLog: * doc/analyzer.texi: Various tweaks. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-04-10analyzer, testuite: comment fixesDavid Malcolm1-2/+2
gcc/testsuite/ChangeLog: * c-c++-common/analyzer/memset-1.c: Clarify some comments. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-04-10testsuite: add some missing -fanalyzer to plugin testsDavid Malcolm7-8/+15
gcc/testsuite/ChangeLog: * gcc.dg/plugin/copy_from_user-1.c: Add missing directives for an analyzer test. * gcc.dg/plugin/taint-CVE-2011-0521-1-fixed.c: Add missing -fanalyzer to options. * gcc.dg/plugin/taint-CVE-2011-0521-1.c: Likewise. * gcc.dg/plugin/taint-CVE-2011-0521-2-fixed.c: Likewise. (dvb_usercopy): Add default case to avoid complaints about NULL derefs. * gcc.dg/plugin/taint-CVE-2011-0521-2.c: Likewise. * gcc.dg/plugin/taint-CVE-2011-0521-3-fixed.c: Add missing -fanalyzer to options. * gcc.dg/plugin/taint-CVE-2011-0521-3.c: Likewise. Drop xfail. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2024-04-10Regenerate gcc.potJoseph Myers1-8044/+8369
* gcc.pot: Regenerate.
2024-04-10Fortran: fix argument checking of intrinsics C_SIZEOF, C_F_POINTER [PR106500]Harald Anlauf5-13/+96
The interpretation of the F2018 standard regarding valid arguments to the intrinsic C_SIZEOF(X) was clarified in an edit to 18-007r1: https://j3-fortran.org/doc/year/22/22-101r1.txt loosening restrictions and giving examples. The F2023 text has: ! F2023:18.2.3.8 C_SIZEOF (X) ! ! X shall be a data entity with interoperable type and type parameters, ! and shall not be an assumed-size array, an assumed-rank array that ! is associated with an assumed-size array, an unallocated allocatable ! variable, or a pointer that is not associated. where ! 3.41 data entity ! data object, result of the evaluation of an expression, or the ! result of the execution of a function reference Update the checking code for interoperable arguments accordingly, and extend to reject functions returning pointer as FPTR argument to C_F_POINTER. gcc/fortran/ChangeLog: PR fortran/106500 * check.cc (is_c_interoperable): Fix checks for C_SIZEOF. (gfc_check_c_f_pointer): Reject function returning a pointer as FPTR, and improve an error message. gcc/testsuite/ChangeLog: PR fortran/106500 * gfortran.dg/c_sizeof_6.f90: Remove wrong dg-error. * gfortran.dg/sizeof_2.f90: Adjust pattern. * gfortran.dg/c_f_pointer_tests_9.f90: New test. * gfortran.dg/c_sizeof_7.f90: New test.
2024-04-10tree-optimization/114672 - WIDEN_MULT_PLUS_EXPR type mismatchRichard Biener2-2/+17
The following makes sure to restrict WIDEN_MULT*_EXPR to a mode precision final compute type as the mode is used to find the optab and type checking chokes when seeing bit-precisions later which would likely also not properly expanded to RTL. PR tree-optimization/114672 * tree-ssa-math-opts.cc (convert_plusminus_to_widen): Only allow mode-precision results. * gcc.dg/torture/pr114672.c: New testcase.
2024-04-10aarch64: Add support for _BitIntAndre Vieira7-0/+1139
This patch adds support for C23's _BitInt for the AArch64 port when compiling for little endianness. Big Endianness requires further target-agnostic support and we therefor disable it for now. gcc/ChangeLog: * config/aarch64/aarch64.cc (TARGET_C_BITINT_TYPE_INFO): Declare MACRO. (aarch64_bitint_type_info): New function. (aarch64_return_in_memory_1): Return large _BitInt's in memory. (aarch64_function_arg_alignment): Adapt to correctly return the ABI mandated alignment of _BitInt(N) where N > 128 as the alignment of TImode. (aarch64_composite_type_p): Return true for _BitInt(N), where N > 128. libgcc/ChangeLog: * config/aarch64/t-softfp (softfp_extras): Add floatbitinthf, floatbitintbf, floatbitinttf and fixtfbitint. * config/aarch64/libgcc-softfp.ver (GCC_14.0.0): Add __floatbitinthf, __floatbitintbf, __floatbitinttf and __fixtfbitint. gcc/testsuite/ChangeLog: * gcc.target/aarch64/bitint-alignments.c: New test. * gcc.target/aarch64/bitint-args.c: New test. * gcc.target/aarch64/bitint-sizes.c: New test. * gcc.target/aarch64/bitfield-bitint-abi.h: New header. * gcc.target/aarch64/bitfield-bitint-abi-align16.c: New test. * gcc.target/aarch64/bitfield-bitint-abi-align8.c: New test.
2024-04-10aarch64: Do not give ABI change diagnostics for _BitInt(N)Andre Vieira1-9/+52
This patch makes sure we do not give ABI change diagnostics for the ABI breaks of GCC 9, 13 and 14 for any type involving _BitInt(N), since that type did not exist before this GCC version. gcc/ChangeLog: * config/aarch64/aarch64.cc (bitint_or_aggr_of_bitint_p): New function. (aarch64_layout_arg): Don't emit diagnostics for types involving _BitInt(N).
2024-04-10c++: Implement C++26 P2809R3 - Trivial infinite loops are not Undefined BehaviorJakub Jelinek8-6/+527
The following patch attempts to implement P2809R3, which has been voted in as a DR. The middle-end has its behavior documented: '-ffinite-loops' Assume that a loop with an exit will eventually take the exit and not loop indefinitely. This allows the compiler to remove loops that otherwise have no side-effects, not considering eventual endless looping as such. This option is enabled by default at '-O2' for C++ with -std=c++11 or higher. So, the following patch attempts to detect trivial infinite loops by detecting trivially empty loops, if their condition is not INTEGER_CST (that case is handled by the middle-end right already) trying to constant evaluate with mce=true their condition and if it evaluates to true (and -ffinite-loops and not processing_template_decl) wraps the condition into an ANNOTATE_EXPR which tells the middle-end that the loop shouldn't be loop->finite_p despite -ffinite-loops). Furthermore, the patch adds -Wtautological-compare warnings for loop conditions containing std::is_constant_evaluated(), either if those always evaluate to true, or always evaluate to false, or will evaluate to true just when checking if it is trivial infinite loop (and if in non-constexpr function also say that it will evaluate to false otherwise). The user is doing something weird in all those cases. 2024-04-10 Jakub Jelinek <jakub@redhat.com> PR c++/114462 gcc/ * tree-core.h (enum annot_expr_kind): Add annot_expr_maybe_infinite_kind enumerator. * gimplify.cc (gimple_boolify): Handle annot_expr_maybe_infinite_kind. * tree-cfg.cc (replace_loop_annotate_in_block): Likewise. (replace_loop_annotate): Likewise. Move loop->finite_p initialization before the replace_loop_annotate_in_block calls. * tree-pretty-print.cc (dump_generic_node): Handle annot_expr_maybe_infinite_kind. gcc/cp/ * semantics.cc: Implement C++26 P2809R3 - Trivial infinite loops are not Undefined Behavior. (maybe_warn_for_constant_evaluated): Add trivial_infinite argument and emit special diagnostics for that case. (finish_if_stmt_cond): Adjust caller. (finish_loop_cond): New function. (finish_while_stmt): Use it. (finish_do_stmt): Likewise. (finish_for_stmt): Likewise. gcc/testsuite/ * g++.dg/cpp26/trivial-infinite-loop1.C: New test. * g++.dg/cpp26/trivial-infinite-loop2.C: New test. * g++.dg/cpp26/trivial-infinite-loop3.C: New test.
2024-04-10testsuite: Adjust pr113359-2_*.c with unsigned long long [PR114662]Kewen Lin2-8/+8
pr113359-2_*.c define a struct having unsigned long type members ay and az which have 4 bytes size at -m32, while the related constants CL1 and CL2 used for equality check are always 8 bytes, it makes compiler consider the below 69 if (a.ay != CL1) 70 __builtin_abort (); always to abort and optimize away the following call to getb, which leads to the expected wpa dumping on "Semantic equality" missing. This patch is to modify the types with unsigned long long accordingly. PR testsuite/114662 gcc/testsuite/ChangeLog: * gcc.dg/lto/pr113359-2_0.c: Use unsigned long long instead of unsigned long. * gcc.dg/lto/pr113359-2_1.c: Likewise.
2024-04-10Revert "combine: Don't combine if I2 does not change"Richard Biener1-11/+0
This reverts commit 839bc42772ba7af66af3bd16efed4a69511312ae.
2024-04-09rs6000: Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR [PR101865]Peter Bergner5-28/+7
This is a cleanup patch in preparation to fixing the real bug in PR101865. TARGET_DIRECT_MOVE is redundant with TARGET_P8_VECTOR, so alias it to that. Also replace all usages of OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR and delete the now dead mask. 2024-04-09 Peter Bergner <bergner@linux.ibm.com> gcc/ PR target/101865 * config/rs6000/rs6000.h (TARGET_DIRECT_MOVE): Define. * config/rs6000/rs6000.cc (rs6000_option_override_internal): Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR. Delete redundant OPTION_MASK_DIRECT_MOVE usage. Delete TARGET_DIRECT_MOVE dead code. (rs6000_opt_masks): Neuter the "direct-move" option. * config/rs6000/rs6000-c.cc (rs6000_target_modify_macros): Replace OPTION_MASK_DIRECT_MOVE with OPTION_MASK_P8_VECTOR. Delete useless comment. * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Delete OPTION_MASK_DIRECT_MOVE. (OTHER_VSX_VECTOR_MASKS): Likewise. (POWERPC_MASKS): Likewise. * config/rs6000/rs6000.opt (mdirect-move): Remove Mask and Var.
2024-04-10c++: Keep DECL_SAVED_TREE of cdtor instantiations in modules [PR104040]Nathaniel Shead3-2/+28
A template instantiation still needs to have its DECL_SAVED_TREE so that its definition is emitted into the CMI. This way it can be emitted in the object file of any importers that use it, in case it doesn't end up getting emitted in this TU. This is true even for maybe-in-charge functions, because we don't currently stream the clones directly but instead regenerate them from this function. PR c++/104040 gcc/cp/ChangeLog: * semantics.cc (expand_or_defer_fn_1): Keep DECL_SAVED_TREE for all vague linkage cdtors with modules. gcc/testsuite/ChangeLog: * g++.dg/modules/pr104040_a.C: New test. * g++.dg/modules/pr104040_b.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-04-10[APX] Prohibit SHA/KEYLOCKER usage of EGPR when APX enabledHongyu Wang1-13/+13
The latest APX spec announced removal of SHA/KEYLOCKER evex promotion, which means the SHA/KEYLOCKER insn does not support EGPR when APX enabled. Update the corresponding constraints to their EGPR-disabled counterparts. gcc/ChangeLog: * config/i386/sse.md (sha1msg1): Use "ja" instead of "Bm" for memory constraint. (sha1msg2): Likewise. (sha1nexte): Likewise. (sha1rnds4): Likewise. (sha256msg1): Likewise. (sha256msg2): Likewise. (sha256rnds2): Likewise. (aes<aesklvariant>u8): Use "jm" instead of "m" for memory constraint. (*aes<aeswideklvariant>u8): Likewise. (*encodekey128u32): Use "jr" instead of "r" for register constraints. (*encodekey256u32): Likewise.
2024-04-10c++: Track declarations imported from partitions [PR99377]Nathaniel Shead5-0/+53
The testcase in comment 15 of the linked PR is caused because the following assumption in depset::hash::make_dependency doesn't hold: if (DECL_LANG_SPECIFIC (not_tmpl) && DECL_MODULE_IMPORT_P (not_tmpl)) { /* Store the module number and index in cluster/section, so we don't have to look them up again. */ unsigned index = import_entity_index (decl); module_state *from = import_entity_module (index); /* Remap will be zero for imports from partitions, which we want to treat as-if declared in this TU. */ if (from->remap) { dep->cluster = index - from->entity_lwm; dep->section = from->remap; dep->set_flag_bit<DB_IMPORTED_BIT> (); } } This is because at least for template specialisations, we first see the declaration in the header unit imported from the partition, and then the instantiation provided by the partition itself. This means that the 'import_entity_index' lookup doesn't report that the specialisation was declared in the partition and thus should be considered as-if it was part of the TU, and get emitted into the CMI. We always need to emit definitions from module partitions into the primary module interface's CMI, as unlike with other kinds of transitive imports the built CMIs for module partitions are not visible to importers. To fix this, this patch allows, as a special case for installing an entity from a partition, to overwrite the entity_map entry with the (later) index into the partition so that this assumption holds again. We only do this for the first time we override with a partition, so that entities are at least still reported as originating from the first imported partition that declares them (rather than the last); existing tests check for this and this seems to be a friendlier approach to go for, albeit slightly more expensive. PR c++/99377 gcc/cp/ChangeLog: * module.cc (trees_in::install_entity): Overwrite entity map index if installing from a partition. gcc/testsuite/ChangeLog: * g++.dg/modules/pr99377-3_a.H: New test. * g++.dg/modules/pr99377-3_b.C: New test. * g++.dg/modules/pr99377-3_c.C: New test. * g++.dg/modules/pr99377-3_d.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-04-10Daily bump.GCC Administrator10-1/+351
2024-04-09btf: improve btf-datasec-3.c test [PR114642]David Faust1-8/+8
This test failed on powerpc --target_board=unix'{-m32}' because two variables were not placed in sections where the test silently (and incorrectly) assumed they would be. The important thing for the test is only that BTF_KIND_DATASEC entries are NOT generated for the extern variable declarations without an explicit section attribute. Make the test more robust by placing the non-extern variables in explicit sections, and invert the checks to more accurately verify what we care about in this test. gcc/testsuite/ PR testsuite/114642 * gcc.dg/debug/btf/btf-datasec-3.c: Make test more robust on different architectures.
2024-04-09s390x: Optimize vector permute with constant indexesJuergen Christ3-1/+94
Loop vectorizer can generate vector permutes with constant indexes where all indexes are equal. Optimize this case to use vector replicate instead of vector permute. gcc/ChangeLog: * config/s390/s390.cc (expand_perm_as_replicate): Implement. (vectorize_vec_perm_const_1): Call new function. * config/s390/vx-builtins.md (vec_splat<mode>): Change to... (@vec_splat<mode>): ...this. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-expand-replicate.c: New test. Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
2024-04-09btf: emit symbol refs in DATASEC entries only for BPF [PR114608]David Faust4-9/+13
The behavior introduced in fa60ac54964 btf: Emit labels in DATASEC bts_offset entries. is only fully correct when compiling for the BPF target with BPF CO-RE enabled. In other cases, depending on optimizations, it can result in an incorrect symbol reference in the entry bts_offset field for a symbol which may not be emitted at all, causing link-time undefined symbol reference errors like in PR114608. The offending bts_offset field of BTF_KIND_DATASEC entries is in reality only currently useful to consumers of BTF information for BPF programs anyway. Correct the regression by only emitting symbol references in these entries when compiling for the BPF target. For other targets, the behavior returns to that prior to fa60ac54964. The underlying cause is related to PR 113566 "btf: incorrect BTF_KIND_DATASEC entries for variables which are optimized out." A complete fix for 113566 is more involved and unsuitable for stage 4, but will be addressed in the near future. gcc/ PR debug/114608 * btfout.cc (btf_asm_datasec_entry): Only emit a symbol reference when generating BTF for BPF CO-RE target. gcc/testsuite/ PR debug/114608 * gcc.dg/debug/btf/btf-datasec-1.c: Check bts_offset symbol references only for BPF target. * gcc.dg/debug/btf/btf-datasec-2.c: Likewise. * gcc.dg/debug/btf/btf-pr106773.c: Likewise.
2024-04-09aarch64: Fix ACLE SME streaming mode error in neon-sve-bridgeRichard Ball4-42/+75
When using LTO, handling the pragma for sme before the pragma for the neon-sve-bridge caused the following error on svset_neonq, in the neon-sve-bridge.c test. error: ACLE function '0' can only be called when SME streaming mode is enabled. This has been resolved by changing the pragma handlers to accept two modes. One where they add functions normally and a second in which registered_functions is filled with a placeholder value. By using this, the ordering of the functions can be maintained. gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch64_pragma_aarch64): Add functions_nulls parameter to pragma_handlers. * config/aarch64/aarch64-protos.h: Likewise. * config/aarch64/aarch64-sve-builtins.h (enum handle_pragma_index): Add enum to count number of pragmas to be handled. * config/aarch64/aarch64-sve-builtins.cc (GTY): Add global variable for initial indexes and change overload_names to an array. (function_builder::function_builder): Add pragma handler information. (function_builder::add_function): Add code for overwriting previous registered_functions entries. (add_unique_function): Use an array to register overload_names for both pragma handler modes. (add_overloaded_function): Likewise. (init_builtins): Add functions_nulls parameter to pragma_handlers. (handle_arm_sve_h): Initialize pragma handler information. (handle_arm_neon_sve_bridge_h): Likewise. (handle_arm_sme_h): Likewise.
2024-04-09Fortran: Fix ICE in trans-stmt.cc(gfc_trans_call) [PR114535]Paul Thomas3-9/+60
2024-04-09 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/114535 * resolve.cc (resolve_symbol): Remove last chunk that checked for finalization of unreferenced symbols. gcc/testsuite/ PR fortran/114535 * gfortran.dg/pr114535d.f90: New test. * gfortran.dg/pr114535iv.f90: Additional source.
2024-04-09Fortran: Fix ICE in gfc_trans_pointer_assignment [PR113956]Paul Thomas2-6/+24
2024-04-09 Paul Thomas <pault@gcc.gnu.org> gcc/fortran PR fortran/113956 * trans-expr.cc (gfc_trans_pointer_assignment): Remove assert causing the ICE since it was unnecesary. gcc/testsuite/ PR fortran/113956 * gfortran.dg/pr113956.f90: New test.
2024-04-09lto/114655 - -flto=4 at link time doesn't override -flto=auto at compile timeRichard Biener1-5/+8
The following adjusts -flto option processing in lto-wrapper to have link-time -flto override any compile time setting. PR lto/114655 * lto-wrapper.cc (merge_flto_options): Add force argument. (merge_and_complain): Do not force here. (run_gcc): But here to make the link-time -flto option override any compile-time one.
2024-04-09RTEMS: Fix powerpc configurationSebastian Huber1-0/+4
gcc/ChangeLog: * config/rs6000/rtems.h (OS_MISSING_POWERPC64): Define.
2024-04-09Guard function->cond_uids access [PR114601]Jørgen Kvalsvik2-2/+18
PR114601 shows that it is possible to reach the condition_uid lookup without having also created the fn->cond_uids, through compiler-generated conditionals. Consider all lookups on non-existing maps misses, which they are from the perspective of the source code, to avoid the NULL access. PR gcov-profile/114601 gcc/ChangeLog: * tree-profile.cc (condition_uid): Guard fn->cond_uids access. gcc/testsuite/ChangeLog: * gcc.misc-tests/gcov-pr114601.c: New test.
2024-04-09i386: Fix aes/vaes patterns [PR114576]Jakub Jelinek3-40/+121
On Wed, Apr 19, 2023 at 02:40:59AM +0000, Jiang, Haochen via Gcc-patches wrote: > > > (define_insn "aesenc" > > > - [(set (match_operand:V2DI 0 "register_operand" "=x,x") > > > - (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x") > > > - (match_operand:V2DI 2 "vector_operand" "xBm,xm")] > > > + [(set (match_operand:V2DI 0 "register_operand" "=x,x,v") > > > + (unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0,x,v") > > > + (match_operand:V2DI 2 "vector_operand" > > > + "xBm,xm,vm")] > > > UNSPEC_AESENC))] > > > - "TARGET_AES" > > > + "TARGET_AES || (TARGET_VAES && TARGET_AVX512VL)" > > > "@ > > > aesenc\t{%2, %0|%0, %2} > > > + vaesenc\t{%2, %1, %0|%0, %1, %2} > > > vaesenc\t{%2, %1, %0|%0, %1, %2}" > > > - [(set_attr "isa" "noavx,avx") > > > + [(set_attr "isa" "noavx,aes,avx512vl") > > Shouldn't it be vaes_avx512vl and then remove " || (TARGET_VAES && > > TARGET_AVX512VL)" from condition. > > Since VAES should not imply AES, we need that "|| (TARGET_VAES && > TARGET_AVX512VL)" > > And there is no need to add vaes_avx512vl since the last alternative will only > be hit when there is no aes. When there is no aes, the pattern will need vaes > and avx512vl both or we could not use this pattern. avx512vl here is just like > a placeholder. As the following testcase shows, the above change was incorrect. Using aes isa for the second alternative is obviously wrong, aes is enabled whenever -maes is, regardless of -mavx or -mno-avx, so the above change means that for -maes -mno-avx RA can choose, either it matches the first alternative with the dup operand, or it matches the second one (but that is of course wrong because vaesenc VEX encoded insn needs AES & AVX CPUID). The big question is if "Since VAES should not imply AES" is the case or not. Looking around at what LLVM does on godbolt, seems since clang 6 which added -mvaes support -mvaes there implies -maes, but GCC treats those two independent. Now, if we'd take the LLVM path of making -mvaes imply -maes and -mno-aes imply -mno-vaes, then we should probably just revert the above patch and tweak common/config/i386/ to do the implications (+ add the testcase from this patch). If we keep the current behavior, where AES and VAES are completely independent extensions, then we need to do more changes as the following patch attempts to do. We should use the aesenc etc. insns for noavx as before, we know at that point that TARGET_AES must be true because (TARGET_VAES && TARGET_AVX512VL) won't be true when !TARGET_AVX - TARGET_AVX512VL implies TARGET_AVX. For the second alternative, i.e. the AVX AES VEX or VAES AVX512F EVEX case without using %xmm16+/EGPR regs, the patch uses avx isa, but we need to emit {evex} prefix in the assembly if AES ISA is not enabled. For the last alternative, we need to use a new vaes_avx512vl isa attribute, because the %xmm16+/EGPR support is there only if both VAES and AVX512VL is enabled, not just AVX and AES. Still, I wonder if -mvaes shouldn't imply at least -mavx512f and -mno-avx512f shouldn't imply -mno-vaes, because otherwise can't see how it could use 512-bit registers (this part not done in the patch). 2024-04-09 Jakub Jelinek <jakub@redhat.com> PR target/114576 * config/i386/i386.md (isa): Remove aes, add vaes_avx512vl. (enabled): Remove aes isa check, add vaes_avx512vl. * config/i386/sse.md (aesenc, aesenclast, aesdec, aesdeclast): Use jm instead of m for second alternative and emit {evex} prefix for it if !TARGET_AES. Use noavx,avx,vaes_avx512vl isa attribute. (vaesdec_<mode>, vaesdeclast_<mode>, vaesenc_<mode>, vaesenclast_<mode>): Add second alternative with x instead of v and jm instead of m. * gcc.target/i386/aes-pr114576.c: New test.
2024-04-09modula2: remove description of fdebug-trace-quad, fdebug-trace-api and add ↵Gaius Mulley1-8/+5
fm2-debug-trace= This documentation fix removes the descriptions of -fdebug-trace-quad and -fdebug-trace-api. It adds a description of -fm2-debug-trace= together with the trace alternatives: line,token,quad,all. gcc/ChangeLog: * doc/gm2.texi (Compiler options): Remove -fdebug-trace-quad. Remove -fdebug-trace-api. Add -fm2-debug-trace=. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>