riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2024-07-15	RISC-V: Allow adding enabled extension via target arch attributes	Christoph Müllner	5	-8/+47
	The set of enabled extensions can be extended via target arch function attributes by listing each extension with a '+' prefix and a comma as list separator. E.g.: __attribute__((target("arch=+zba,+zbb"))) void foo(); The programmer intends to ensure that one or more extensions are enabled when building the code. This is independent of the arch string that is passed at build time via the -march= option. Therefore, it is reasonable to allow enabling extensions via target arch attributes, which have already been enabled via the -march= string. The subset list code already supports such duplication for implied extensions. This patch adds an interface so the subset list parser can be switched into a mode where duplication is allowed. This commit fixes the following regressed test cases: * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c gcc/ChangeLog: * common/config/riscv/riscv-common.cc (riscv_subset_list::add): Allow adding enabled extension if m_allow_adding_dup is set. * config/riscv/riscv-subset.h: Add m_allow_adding_dup and setter. * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Allow adding enabled extensions. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr115554.c: Change expected fail to expected pass. * gcc.target/riscv/target-attr-16.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
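A minimal sketch of how the attribute described above might be used, assuming a RISC-V GCC that contains this change. The macro name `ENABLE_ZBA_ZBB` and both function names are hypothetical; the `#if defined(__riscv)` guard is only there so the file still builds on other hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper macro: apply the RISC-V target attribute only
   when compiling for RISC-V.  */
#if defined(__riscv)
# define ENABLE_ZBA_ZBB __attribute__((target("arch=+zba,+zbb")))
#else
# define ENABLE_ZBA_ZBB
#endif

/* When built with -march=rv64gc_zbb, the '+zbb' in the attribute names
   an extension that is already enabled.  Per the commit message, this
   duplication is accepted once the patch is applied.  */
ENABLE_ZBA_ZBB
uint64_t and_not(uint64_t a, uint64_t b)
{
    return a & ~b;
}

ENABLE_ZBA_ZBB
unsigned count_trailing_zeros64(uint64_t x)
{
    assert(x != 0);   /* __builtin_ctzll is undefined for 0 */
    return (unsigned) __builtin_ctzll(x);
}
```

Whether the compiler actually selects Zba or Zbb instructions for these bodies depends on the optimization level and target; the sketch only shows where the attribute goes.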
2024-07-15	RISC-V: Rewrite target attribute handling	Christoph Müllner	23	-203/+371
	The target-arch attribute handling in RISC-V is only a few months old, but already saw a rewrite (9941f0295a14), which addressed an important issue. This rewrite introduced a hash table in the backend, which is used to keep track of target-arch attributes of all functions. The index of this hash table is the pointer to the function declaration object (fndecl). However, objects like these don't have the lifetime that is assumed here, which resulted in observing two fndecl objects with the same address for different objects (triggering the assertion in riscv_func_target_put() -- see also PR115562). This patch removes the hash table approach in favor of storing target specific options using the DECL_FUNCTION_SPECIFIC_TARGET() macro, which is also used by other backends and is specifically designed for this purpose (https://gcc.gnu.org/onlinedocs/gccint/Function-Properties.html). To have an accessible field in the target options, we need to adjust riscv.opt and introduce the field riscv_arch_string (for the already existing option '-march='). Using this macro allows to remove much code from riscv-common.cc, which controls access to the objects 'func_target_table' and 'current_subset_list'. One thing to mention is, that we had two subset lists: current_subset_list and cmdline_subset_list, with the latter being introduced recently for target attribute handling. This patch reduces them back to one (cmdline_subset_list) which contains the list of extensions that have been enabled by the command line arguments. Note that the patch keeps the existing behavior of rejecting duplications of extensions when added via the '+' operator in a function target attribute. E.g. "-march=rv64gc_zbb" and "arch=+zbb" will trigger an error (see pr115554.c). 
However, at the same time this patch breaks the acceptance of adding implied extensions, which causes the following six regressions (with the error "extension 'EXT' appear more than one time"): * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-39.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-42.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-43.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-44.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-45.c * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-46.c New tests were added to document the behavior and to ensure it won't regress. This patch did not show any regressions for rv32/rv64 and fixes the ICEs from PR115554 and PR115562. PR target/115554 PR target/115562 gcc/ChangeLog: * common/config/riscv/riscv-common.cc (struct riscv_func_target_info): Remove. (struct riscv_func_target_hasher): Likewise. (riscv_func_decl_hash): Likewise. (riscv_func_target_hasher::hash): Likewise. (riscv_func_target_hasher::equal): Likewise. (riscv_current_subset_list): Likewise. (riscv_cmdline_subset_list): Remove obsolete space. (riscv_func_target_table_lazy_init): Remove. (riscv_func_target_get): Likewise. (riscv_func_target_put): Likewise. (riscv_func_target_remove_and_destory): Likewise. (riscv_arch_str): Generate from cmdline_subset_list. (riscv_set_arch_by_subset_list): Don't set current_subset_list. (riscv_parse_arch_string): Remove current_subset_list. * config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Get subset list via riscv_cmdline_subset_list(). * config/riscv/riscv-subset.h (riscv_current_subset_list): Remove prototype. (riscv_func_target_get): Likewise. (riscv_func_target_put): Likewise. (riscv_func_target_remove_and_destory): Likewise. * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Build base arch string from existing target options, if any. 
(riscv_target_attr_parser::update_settings): Store new arch string in target options. (riscv_process_one_target_attr): Whitespace fix. (riscv_process_target_attr): Drop opts argument. (riscv_option_valid_attribute_p): Properly save, change and restore target options. * config/riscv/riscv.cc (get_arch_str): New function. (riscv_declare_function_name): Get arch string for option-arch directive from function's target options. * config/riscv/riscv.opt: Add riscv_arch_string variable to march option. gcc/testsuite/ChangeLog: * gcc.target/riscv/target-attr-01.c: Add test for option-arch directive. * gcc.target/riscv/target-attr-02.c: Likewise. * gcc.target/riscv/target-attr-03.c: Likewise. * gcc.target/riscv/target-attr-04.c: Likewise. * gcc.target/riscv/target-attr-05.c: Fix formatting. * gcc.target/riscv/target-attr-06.c: Likewise. * gcc.target/riscv/target-attr-07.c: Likewise. * gcc.target/riscv/pr115554.c: New test. * gcc.target/riscv/pr115562.c: New test. * gcc.target/riscv/target-attr-08.c: New test. * gcc.target/riscv/target-attr-09.c: New test. * gcc.target/riscv/target-attr-10.c: New test. * gcc.target/riscv/target-attr-11.c: New test. * gcc.target/riscv/target-attr-12.c: New test. * gcc.target/riscv/target-attr-13.c: New test. * gcc.target/riscv/target-attr-14.c: New test. * gcc.target/riscv/target-attr-15.c: New test. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
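The lifetime problem described in this commit (a side table keyed by object address returning data for a different object that later occupies the same address) can be illustrated with a toy model. Everything below is hypothetical: `struct decl`, the one-slot pool and the side table are stand-ins, not GCC data structures. The one-slot pool makes address reuse deterministic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a function declaration object (hypothetical).  */
struct decl {
    int in_use;
    const char *name;
    const char *arch;   /* per-object storage, analogous to keeping the
                           arch string in the decl's target options */
};

/* One-slot pool, so a freed object's address is reused immediately.  */
static struct decl pool[1];

static struct decl *decl_alloc(const char *name)
{
    assert(!pool[0].in_use);
    pool[0].in_use = 1;
    pool[0].name = name;
    pool[0].arch = NULL;
    return &pool[0];
}

static void decl_free(struct decl *d)
{
    d->in_use = 0;
}

/* Side table keyed by the object's address.  */
struct side_entry { const struct decl *key; const char *arch; };
static struct side_entry side_table[4];
static size_t side_count;

static void side_put(const struct decl *d, const char *arch)
{
    assert(side_count < 4);
    side_table[side_count].key = d;
    side_table[side_count].arch = arch;
    side_count++;
}

static const char *side_get(const struct decl *d)
{
    for (size_t i = 0; i < side_count; i++)
        if (side_table[i].key == d)
            return side_table[i].arch;
    return NULL;
}

/* Returns 1 if the side table hands back a stale entry for a new object.  */
int side_table_goes_stale(void)
{
    struct decl *foo = decl_alloc("foo");
    side_put(foo, "rv64gc_zba");
    decl_free(foo);                        /* the table is not told */

    struct decl *bar = decl_alloc("bar");  /* same address as foo */
    const char *stale = side_get(bar);     /* finds foo's entry */
    int result = stale != NULL && strcmp(stale, "rv64gc_zba") == 0;
    decl_free(bar);
    return result;
}

/* Returns 1 if per-object storage starts out clean for a new object.  */
int per_decl_storage_is_clean(void)
{
    struct decl *foo = decl_alloc("foo");
    foo->arch = "rv64gc_zba";
    decl_free(foo);

    struct decl *bar = decl_alloc("bar");
    int result = bar->arch == NULL;
    decl_free(bar);
    return result;
}
```

Storing the data in the object itself ties its lifetime to the object, which is the property the commit message attributes to DECL_FUNCTION_SPECIFIC_TARGET().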
2024-07-15	RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr	Christoph Müllner	1	-6/+3
	Allocating an object on the heap with new, wrapping it in a std::unique_ptr and finally getting the buffer via buf.get() is a correct way to allocate a buffer that is automatically freed on return. However, a simple invocation of alloca() does the same with less overhead. gcc/ChangeLog: * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch): Replace new + std::unique_ptr by alloca(). (riscv_process_one_target_attr): Likewise. (riscv_process_target_attr): Likewise. Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
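As a rough illustration of the pattern this commit describes, the following sketch copies an attribute string into an alloca() buffer before tokenizing it. The function `count_arch_items` is hypothetical and is not the parser from riscv-target-attr.cc; `<alloca.h>` is assumed to be available (as on glibc).

```c
#include <alloca.h>   /* glibc; other platforms may declare alloca elsewhere */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the comma-separated items after "arch=" in an attribute string,
   e.g. "arch=+zba,+zbb" has two.  Hypothetical helper.  */
size_t count_arch_items(const char *attr)
{
    assert(attr != NULL);
    size_t len = strlen(attr);

    /* Scratch copy on the stack: it is released automatically when this
       function returns, without free/delete or a smart pointer.  */
    char *buf = (char *) alloca(len + 1);
    memcpy(buf, attr, len + 1);

    if (strncmp(buf, "arch=", 5) != 0)
        return 0;

    size_t n = 0;
    for (char *tok = strtok(buf + 5, ","); tok != NULL; tok = strtok(NULL, ","))
        n++;
    return n;
}
```

alloca() has no failure reporting, so this style is only appropriate when the input length is known to be small.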
2024-07-15	[i386] adjust flag_omit_frame_pointer in a single function [PR113719]	Alexandre Oliva	1	-6/+6
	The first two patches for PR113719 have each regressed gcc.dg/ipa/iinline-attr.c on a different target. The reason for this instability is that there are competing flag_omit_frame_pointer overriders on x86: - ix86_recompute_optlev_based_flags computes and sets a -f[no-]omit-frame-pointer default depending on USE_IX86_FRAME_POINTER and, in 32-bit mode, optimize_size - ix86_option_override_internal enables flag_omit_frame_pointer for -momit-leaf-frame-pointer to take effect ix86_option_override[_internal] calls ix86_recompute_optlev_based_flags before setting flag_omit_frame_pointer. It is called during global process_options. But ix86_recompute_optlev_based_flags is also called by parse_optimize_options, during attribute processing, and at that point, ix86_option_override is not called, so the final overrider for global options is not applied to the optimize attributes. If they differ, the testcase fails. In order to fix this, we need to process all overriders of this option whenever we process any of them. Since this setting is affected by optimization options, it makes sense to compute it in parse_optimize_options, rather than in process_options. for gcc/ChangeLog PR target/113719 * config/i386/i386-options.cc (ix86_option_override_internal): Move flag_omit_frame_pointer final overrider... (ix86_recompute_optlev_based_flags): ... here.
2024-07-15	RISC-V: Fix testcase for vector .SAT_SUB in zip benchmark	Edwin Lu	1	-0/+1
	The following testcase was not properly testing anything due to an uninitialized variable. As a result, the loop was not iterating through the testing data, but instead on undefined values which could cause an unexpected abort. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: initialize variable Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
2024-07-15	AVR: avr-md - Simplify GET_MODE and GET_MODE_BITSIZE.	Georg-Johann Lay	2	-22/+22
	gcc/ * config/avr/avr.md: Simplify mode usage. (GET_MODE_SIZE (<MODE>mode)): Use <SIZE> instead. (GET_MODE_BITSIZE (<MODE>mode) - 1): Use <MSB> instead. (GET_MODE_MASK (QImode)): Use 0xff instead. * config/avr/avr-fixed.md: Same.
2024-07-15	varasm: Add support for emitting binary data with the new gas .base64 directive	Jakub Jelinek	5	-3/+162
	Nick has implemented a new .base64 directive in gas (to be shipped in the upcoming binutils 2.43; big thanks for that). See https://sourceware.org/bugzilla/show_bug.cgi?id=31964 The following patch adjusts default_elf_asm_output_ascii (i.e. ASM_OUTPUT_ASCII elfos.h implementation) to use it if it detects binary data and gas supports it. Without this patch, we emit stuff like: .string "\177ELF\002\001\001\003" .string "" .string "" .string "" .string "" .string "" .string "" .string "" .string "\002" .string ">" ... .string "\324\001\236 0FS\202\002E\n0@\203\004\005&\202\021\337)\021\203C\020A\300\220I\004\t\b\206(\234\0132l\004b\300\bK\006\220$0\303\020P$\233\211\002D\f" etc., with this patch more compact .base64 "f0VMRgIBAQMAAAAAAAAAAAIAPgABAAAAABf3AAAAAABAAAAAAAAAAACneB0AAAAAAAAAAEAAOAAOAEAALAArAAYAAAAEAAAAQAAAAAAAAABAAEAAAAAAAEAAQAAAAAAAEAMAAAAAAAAQAwAAAAAAAAgAAAAAAAAAAwAAAAQAAABQAwAAAAAAAFADQAAAAAAAUANAAAAAAAAcAAAAAAAAABwAAAAAAAAAAQAAAAAAAAABAAAABAAAAAAAAAAAAAAAAABAAAAAAAAAAEAAAAAAADBwOQAAAAAAMHA5AAAAAAAAEAAAAAAAAAEAAAAFAAAAAIA5AAAAAAAAgHkAAAAA" .base64 "AACAeQAAAAAAxSSgAgAAAADFJKACAAAAAAAQAAAAAAAAAQAAAAQAAAAAsNkCAAAAAACwGQMAAAAAALAZAwAAAADMtc0AAAAAAMy1zQAAAAAAABAAAAAAAAABAAAABgAAAGhmpwMAAAAAaHbnAwAAAABoducDAAAAAOAMAQAAAAAA4MEeAAAAAAAAEAAAAAAAAAIAAAAGAAAAkH2nAwAAAACQjecDAAAAAJCN5wMAAAAAQAIAAAAAAABAAgAAAAAAAAgAAAAAAAAABAAAAAQAAABwAwAAAAAAAHADQAAAAAAAcANAAAAAAABAAAAAAAAAAEAAAAAAAAAACAAAAAAA" .base64 "AAAEAAAABAAAALADAAAAAAAAsANAAAAAAACwA0AAAAAAACAAAAAAAAAAIAAAAAAAAAAEAAAAAAAAAAcAAAAEAAAAaGanAwAAAABoducDAAAAAGh25wMAAAAAAAAAAAAAAAAQAAAAAAAAAAgAAAAAAAAAU+V0ZAQAAABwAwAAAAAAAHADQAAAAAAAcANAAAAAAABAAAAAAAAAAEAAAAAAAAAACAAAAAAAAABQ5XRkBAAAAAw/WAMAAAAADD+YAwAAAAAMP5gDAAAAAPy7CgAAAAAA/LsKAAAAAAAEAAAAAAAAAFHldGQGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA" .base64 
"AAAAAAAAAAAAAAAAAAAAAAAAABAAAAAAAAAAUuV0ZAQAAABoZqcDAAAAAGh25wMAAAAAaHbnAwAAAACYGQAAAAAAAJgZAAAAAAAAAQAAAAAAAAAvbGliNjQvbGQtbGludXgteDg2LTY0LnNvLjIAAAAAAAQAAAAwAAAABQAAAEdOVQACgADABAAAAAEAAAAAAAAAAQABwAQAAAAJAAAAAAAAAAIAAcAEAAAAAwAAAAAAAAAEAAAAEAAAAAEAAABHTlUAAAAAAAMAAAACAAAAAAAAAAOAAACsqAAAgS0AAOJWAAAjNwAAXjAAAAAAAAAAAAAAF1gAAHsxAABBBwAA" .base64 "G0kAALGmAACwoAAAAAAAAAAAAACQhAAAAAAAAOw1AACNYgAAAAAAAFQoAAAAAAAAx3UAALZAAAAAAAAAiIUAALGeAABBlAAAWEsAAPmRAACmOgAAAAAAADh3AAAAAAAAlCAAAAAAAABymgAAaosAAMIjAAAKMQAAMkIAADU0AAAAAAAA5ZwAAAAAAAAAAAAAAAAAAFIdAAAIGQAAAAAAAMFbAAAoTQAAGDcAAIRgAAA6HgAAlxwAAAAAAADOlgAAAAAAAEhPAAARiwAAMGgAAOVtAADMFgAAAAAAAAAAAACrjgAAYl4AACZVAAA/HgAAAAAAAAAAAABqPwAAAAAA" The patch attempts to juggle between readability and compactness, so if it detects some hunk of the initializer that would be shorter to be emitted as .string/.ascii directive, it does so, but if it previously used .base64 directive it switches mode only if there is a 16+ char ASCII-ish string. On my #embed testcase from yesterday unsigned char a[] = { #embed "cc1plus" }; without this patch it emits 2.4GB of assembly, while with this patch 649M. Compile times (trunk, so yes,rtl,extra checking) are: time ./xgcc -B ./ -S -std=c23 -O2 embed-11.c real 0m13.647s user 0m7.157s sys 0m2.597s time ./xgcc -B ./ -c -std=c23 -O2 embed-11.c real 0m28.649s user 0m26.653s sys 0m1.958s without the patch and time ./xgcc -B ./ -S -std=c23 -O2 embed-11.c real 0m4.283s user 0m2.288s sys 0m0.859s time ./xgcc -B ./ -c -std=c23 -O2 embed-11.c real 0m6.888s user 0m5.876s sys 0m1.002s with the patch, so that feels like significant improvement. The resulting embed-11.o is identical between the two ways of expressing the mostly binary data in the assembly. 
But note that there are portions like: .base64 "nAAAAAAAAAAvZRcAIgAOAFAzMwEAAAAABgAAAAAAAACEQBgAEgAOAFBHcAIAAAAA7AAAAAAAAAAAX19nbXB6X2dldF9zaQBtcGZyX3NldF9zaV8yZXhwAG1wZnJfY29zaABtcGZyX3RhbmgAbXBmcl9zZXRfbmFuAG1wZnJfc3ViAG1wZnJfdGFuAG1wZnJfc3RydG9mcgBfX2dtcHpfc3ViX3VpAF9fZ21wX2dldF9tZW1vcnlfZnVuY3Rpb25zAF9fZ21wel9zZXRfdWkAbXBmcl9wb3cAX19nbXB6X3N1YgBfX2dtcHpfZml0c19zbG9uZ19wAG1wZnJfYXRh" .base64 "bjIAX19nbXB6X2RpdmV4YWN0AG1wZnJfc2V0X2VtaW4AX19nbXB6X3NldABfX2dtcHpfbXVsAG1wZnJfY2xlYXIAbXBmcl9sb2cAbXBmcl9hdGFuaABfX2dtcHpfc3dhcABtcGZyX2FzaW5oAG1wZnJfYXNpbgBtcGZyX2NsZWFycwBfX2dtcHpfbXVsXzJleHAAX19nbXB6X2FkZG11bABtcGZyX3NpbmgAX19nbXB6X2FkZF91aQBfX2dtcHFfY2xlYXIAX19nbW9uX3N0YXJ0X18AbXBmcl9hY29zAG1wZnJfc2V0X2VtYXgAbXBmcl9jb3MAbXBmcl9zaW4A" .string "__gmpz_ui_pow_ui" .string "mpfr_get_str" .string "mpfr_acosh" .string "mpfr_sub_ui" .string "__gmpq_set_ui" .string "mpfr_set_inf" ... .string "GLIBC_2.14" .string "GLIBC_2.11" .base64 "AAABAAIAAQADAAMAAwADAAMAAwAEAAUABgADAAEAAQADAAMABwABAAEAAwADAAMAAwAIAAEAAwADAAEAAwABAAMAAwABAAMAAQADAAMAAwADAAMAAwADAAYAAwADAAEAAQAIAAMAAwADAAMAAwABAAMAAQADAAMAAQABAAEAAwAIAAEAAwADAAEAAwABAAMAAQADAAEABgADAAMAAQAHAAMAAwADAAMAAwABAAMAAQABAAMAAwADAAkAAQABAAEAAwAKAAEAAwADAAMAAQABAAMAAwALAAEAAwADAAEAAQADAAMAAwABAAMAAwABAAEAAwADAAMABwABAAMAAwAB" .base64 "AAEAAwADAAEAAwABAAMAAQADAAMAAwADAAEAAQABAAEAAwADAAMAAQABAAEAAQABAAEAAQADAAMAAwADAAMAAQABAAwAAwADAA0AAwADAAMAAwADAAEAAQADAAMAAQABAAMAAwADAAEAAwADAAEAAwAIAAMAAwADAAMABgABAA4ACwAGAAEAAQADAAEAAQADAAEAAwABAAMAAwABAAEAAwABAAMAAwABAAEAAwADAAMAAwABAAMAAQABAAEAAQABAAMADwABAAMAAQADAAMAAwABAAEAAQAIAAEADAADAAMAAQABAAMAAwADAAEAAQABAAEAAQADAAEAAwADAAEA" .base64 "AwABAAMAAQADAAMAAQABAAEAAwADAAMAAwADAAMAAQADAAMACAAQAA8AAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQABAAEAAQA=" so it isn't all just totally unreadable stuff. 2024-07-15 Jakub Jelinek <jakub@redhat.com> * configure.ac (HAVE_GAS_BASE64): New check. 
* config/elfos.h (BASE64_ASM_OP): Define if HAVE_GAS_BASE64 is defined. * varasm.cc (assemble_string): Bump maximum from 2000 to 16384 if BASE64_ASM_OP is defined. (default_elf_asm_output_limited_string): Emit opening '"' together with STRING_ASM_OP. (default_elf_asm_output_ascii): Use BASE64_ASM_OP if defined and beneficial. Remove UB when last_null is NULL. * configure: Regenerate. * config.in: Regenerate.
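For readers unfamiliar with the encoding behind the .base64 directive, here is a small standalone base64 encoder. It is a sketch of the standard encoding only, not the code added to varasm.cc; the function name `base64_encode` is an assumption. Encoding the first nine bytes of the ELF header shown above reproduces the start of the first .base64 string in the commit message.

```c
#include <assert.h>
#include <stddef.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode LEN bytes from SRC into DST as NUL-terminated base64.
   DST must hold at least 4 * ((LEN + 2) / 3) + 1 bytes.
   Returns the number of characters written, excluding the NUL.  */
size_t base64_encode(const unsigned char *src, size_t len,
                     char *dst, size_t dst_size)
{
    assert(dst_size >= 4 * ((len + 2) / 3) + 1);

    size_t o = 0;
    size_t i = 0;

    /* Full 3-byte groups become 4 characters each.  */
    for (; i + 2 < len; i += 3) {
        unsigned v = (unsigned) src[i] << 16
                     | (unsigned) src[i + 1] << 8
                     | (unsigned) src[i + 2];
        dst[o++] = b64_alphabet[(v >> 18) & 63];
        dst[o++] = b64_alphabet[(v >> 12) & 63];
        dst[o++] = b64_alphabet[(v >> 6) & 63];
        dst[o++] = b64_alphabet[v & 63];
    }

    /* A trailing group of 1 or 2 bytes is padded with '='.  */
    if (i < len) {
        int two = i + 1 < len;
        unsigned v = (unsigned) src[i] << 16;
        if (two)
            v |= (unsigned) src[i + 1] << 8;
        dst[o++] = b64_alphabet[(v >> 18) & 63];
        dst[o++] = b64_alphabet[(v >> 12) & 63];
        dst[o++] = two ? b64_alphabet[(v >> 6) & 63] : '=';
        dst[o++] = '=';
    }

    dst[o] = '\0';
    return o;
}
```

Each 3 input bytes map to 4 output characters, whereas an octal escape such as `\177` spends 4 characters on a single byte, which is consistent with the size reduction reported above.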
2024-07-15	Fix SSA_NAME leak due to def_stmt is removed before use_stmt.	liuhongt	2	-4/+24
	- _5 = __atomic_fetch_or_8 (&set_work_pending_p, 1, 0); - # DEBUG old => (long int) _5 + _6 = .ATOMIC_BIT_TEST_AND_SET (&set_work_pending_p, 0, 1, 0, __atomic_fetch_or_8); + # DEBUG old => NULL # DEBUG BEGIN_STMT - # DEBUG D#2 => _5 & 1 + # DEBUG D#2 => NULL ... - _10 = ~_5; - _8 = (_Bool) _10; - # DEBUG ret => _8 + _8 = _6 == 0; + # DEBUG ret => (_Bool) _10 confirmed. convert_atomic_bit_not does this, it checks for single_use and removes the def, failing to release the name (which would fix this up IIRC). Note the function removes stmts in "wrong" order (before uses of LHS are removed), so it requires larger surgery. And it leaks SSA names. gcc/ChangeLog: PR target/115872 * tree-ssa-ccp.cc (convert_atomic_bit_not): Remove use_stmt after use_nop_stmt is removed. (optimize_atomic_bit_test_and): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr115872.c: New test.
2024-07-15	[APX NF] Add a pass to convert legacy insn to NF insns	Hongyu Wang	5	-5/+259
	For APX ccmp, current infrastructure will always generate cstore for the ccmp flag user, like cmpe %rcx, %r8 ccmpnel %rax, %rbx seta %dil add %rcx, %r9 add %r9, %rdx testb %dil, %dil je .L2 For such a case, the legacy add clobbers FLAGS_REG, so an extra cstore is needed to keep the flag from being reset before it is used. If the instructions between flag producer and user are NF insns, the setcc/test sequence is not required. Add a pass to convert legacy flag clobber insns to their NF counterpart. The conversion only happens when 1. APX_NF is enabled. 2. For a BB, a cstore was found, and there are insns between such cstore and the next explicit set insn to FLAGS_REG (test or cmp). 3. All the insns found have an NF counterpart. The pass was added after rtl-ifcvt, which eliminates some branches when profitable and could cause some flag-clobbering insns to be put between cstore and jcc. gcc/ChangeLog: * config/i386/i386.md (has_nf): New define_attr, add to all nf related patterns. * config/i386/i386-features.cc (apx_nf_convert): New function to convert Non-NF insns to their NF counterparts. (class pass_apx_nf_convert): New pass class. (make_pass_apx_nf_convert): New. * config/i386/i386-passes.def: Add pass_apx_nf_convert after rtl_ifcvt. * config/i386/i386-protos.h (make_pass_apx_nf_convert): Declare. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-nf-2.c: New test.
2024-07-15	arm: Fix the expected output of the test pr111235.c [PR115894]	Surya Kumari Jangala	1	-1/+1
	With r15-1619-g3b9b8d6cfdf593, pr111235.c fails due to different registers used in ldrexd instruction. The key part of this test is that the compiler generates LDREXD. The registers used for that are pretty much irrelevant as they are not matched with any other operations within the test. This patch changes the test to test only for the mnemonic and not for any of the operands. 2024-07-15 Surya Kumari Jangala <jskumari@linux.ibm.com> gcc/testsuite: PR testsuite/115894 * gcc.target/arm/pr111235.c: Update expected output.
2024-07-15	RISC-V: Implement locality for __builtin_prefetch	Monk Chiang	4	-3/+69
	This patch adds the Zihintntl instructions to the prefetch pattern. Zicbop has prefetch instructions. Zihintntl has NTL instructions. Insert NTL instructions before the prefetch instruction if the target has the Zihintntl extension. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_print_operand): Add 'L' letter to print zihintntl instructions string. * config/riscv/riscv.md (prefetch): Add zihintntl instructions. gcc/testsuite/ChangeLog: * gcc.target/riscv/prefetch-zicbop.c: New test. * gcc.target/riscv/prefetch-zihintntl.c: New test.
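The locality in question is the third argument of GCC's `__builtin_prefetch (addr, rw, locality)`. A minimal sketch of a caller follows; the function name and the prefetch distance of 16 elements are illustrative assumptions, and the commit message does not say which NTL hint is emitted for which locality value.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array that is read once and not expected to be reused soon.  */
long sum_streaming(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* rw = 0 requests a prefetch for reading; locality = 0 says the
           data has no temporal locality.  Both must be constants.  */
        if (i + 16 < n)
            __builtin_prefetch(&data[i + 16], 0, 0);
        total += data[i];
    }
    return total;
}
```

On targets without prefetch support the builtin is simply a no-op, so the function returns the same result everywhere.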
2024-07-14	aarch64: Fix the expected output of the test cpy_1.c [PR115892]	Surya Kumari Jangala	1	-0/+6
	The fix at r15-1619-g3b9b8d6cfdf593 results in a rearrangement of instructions generated for cpy_1.c. This patch fixes the expected output. 2024-07-12 Surya Kumari Jangala <jskumari@linux.ibm.com> gcc/testsuite: PR testsuite/115892 * gcc.target/aarch64/sve/acle/general/cpy_1.c: Update expected output.
2024-07-15	CRIS: Adjust gcc.dg/tree-ssa/loop-1.c	Hans-Peter Nilsson	1	-3/+2
	With r15-1619-g3b9b8d6cfdf593, there's an XPASS and a FAIL for this test-case for cris-elf. Looking at the generated code, _foo is indeed no longer saved in a register for CRIS. While that looks like a regression, coremark results are the same around this revision, so simply adjust the test-case: remove the target-specific exceptions for cris-*-*. * gcc.dg/tree-ssa/loop-1.c: Remove target-specific test and xfail to adjust for recent changes in register allocation.
2024-07-15	RISC-V: Add md files for vector BFloat16	Feng Wang	5	-17/+407
	V3: Add Bfloat16 vector insn in generic-vector-ooo.md v2: Rebase According to the BFloat16 spec, some vector iterators and new patterns are added in md files. Signed-off-by: Feng Wang <wangfeng@eswincomputing.com> gcc/ChangeLog: * config/riscv/generic-vector-ooo.md: Add def_insn_reservation for vector BFloat16. * config/riscv/riscv.md: Add new insn name for vector BFloat16. * config/riscv/vector-iterators.md: Add some iterators for vector BFloat16. * config/riscv/vector.md: Add some attribute for vector BFloat16. * config/riscv/vector-bfloat16.md: New file. Add insn pattern vector BFloat16.
2024-07-15	RISC-V: Add Zvfbfmin and Zvfbfwma intrinsic	Feng Wang	8	-17/+232
	v3: Modify warning message in riscv.cc v2: Rebase According to the intrinsic doc, the 'Zvfbfmin' and 'Zvfbfwma' intrinsic functions are added by this patch. Signed-off-by: Feng Wang <wangfeng@eswincomputing.com> gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc (class vfncvtbf16_f): Add 'Zvfbfmin' intrinsic in bases. (class vfwcvtbf16_f): Ditto. (class vfwmaccbf16): Add 'Zvfbfwma' intrinsic in bases. (BASE): Add BASE macro for 'Zvfbfmin' and 'Zvfbfwma'. * config/riscv/riscv-vector-builtins-bases.h: Add declaration for 'Zvfbfmin' and 'Zvfbfwma'. * config/riscv/riscv-vector-builtins-functions.def (REQUIRED_EXTENSIONS): Add builtins def for 'Zvfbfmin' and 'Zvfbfwma'. (vfncvtbf16_f): Ditto. (vfncvtbf16_f_frm): Ditto. (vfwcvtbf16_f): Ditto. (vfwmaccbf16): Ditto. (vfwmaccbf16_frm): Ditto. * config/riscv/riscv-vector-builtins-shapes.cc (supports_vectype_p): Add vector intrinsic build judgment for BFloat16. (build_all): Ditto. (BASE_NAME_MAX_LEN): Adjust max length. * config/riscv/riscv-vector-builtins-types.def (DEF_RVV_F32_OPS): Add new operand type for BFloat16. (vfloat32mf2_t): Ditto. (vfloat32m1_t): Ditto. (vfloat32m2_t): Ditto. (vfloat32m4_t): Ditto. (vfloat32m8_t): Ditto. * config/riscv/riscv-vector-builtins.cc (DEF_RVV_F32_OPS): Ditto. (validate_instance_type_required_extensions): Add required_ext checking for 'Zvfbfmin' and 'Zvfbfwma'. * config/riscv/riscv-vector-builtins.h (enum required_ext): Add required_ext declaration for 'Zvfbfmin' and 'Zvfbfwma'. (reqired_ext_to_isa_name): Ditto. (required_extensions_specified): Ditto. (struct function_group_info): Add match case for 'Zvfbfmin' and 'Zvfbfwma'. * config/riscv/riscv.cc (riscv_validate_vector_type): Add required_ext checking for 'Zvfbfmin' and 'Zvfbfwma'.
2024-07-15	AVX512BF16: Do not allow permutation with vcvtne2ps2bf16 [PR115889]	Hongyu Wang	3	-48/+1
	According to the instruction spec of AVX512BF16, the convert from float to BF16 is not a simple truncation. It has special handling for denormal/nan, even for normal float it will add an extra bias according to the least significant bit for bf number. This means we cannot use the vcvtne2ps2bf16 for any bf16 vector shuffle. The optimization introduced in r15-1368 adds a specific split to convert HImode permutation with this instruction, so remove it and treat the BFmode permutation same as HFmode. gcc/ChangeLog: PR target/115889 * config/i386/predicates.md (vcvtne2ps2bf_parallel): Remove. * config/i386/sse.md (hi_cvt_bf): Remove. (HI_CVT_BF): Likewise. (vpermt2_sepcial_bf16_shuffle_<mode>):Likewise. gcc/testsuite/ChangeLog: PR target/115889 * gcc.target/i386/vpermt2-special-bf16-shufflue.c: Adjust output scan.
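To see why a rounding conversion cannot stand in for a plain bit shuffle, compare truncation of a binary32 bit pattern with a round-to-nearest-even conversion that adds a bias depending on the least significant kept bit. This is a simplified model under stated assumptions: it is not the exact vcvtne2ps2bf16 semantics, both function names are hypothetical, and the NaN and denormal handling mentioned in the commit message is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Keep the upper 16 bits of an IEEE binary32 bit pattern unchanged.  */
uint16_t bf16_truncate(uint32_t f32_bits)
{
    return (uint16_t) (f32_bits >> 16);
}

/* Round to nearest even: add 0x7fff plus the least significant bit of
   the would-be result, then drop the low 16 bits.  Simplified model:
   Inf/NaN inputs are rejected and denormals get no special treatment.  */
uint16_t bf16_round_nearest_even(uint32_t f32_bits)
{
    assert((f32_bits & 0x7f800000u) != 0x7f800000u);
    uint32_t lsb = (f32_bits >> 16) & 1u;
    uint32_t bias = 0x7fffu + lsb;
    return (uint16_t) ((f32_bits + bias) >> 16);
}
```

For the input pattern 0x3f80c000, truncation yields 0x3f80 while the rounding variant yields 0x3f81, so the upper 16 bits do not survive the conversion unchanged.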
2024-07-15	RISC-V: Add vector type of BFloat16 format	Feng Wang	7	-3/+291
	v3: Rebase v2: Rebase The vector type of BFloat16 format is added in this patch, subsequent extensions to zvfbfmin and zvfwma need to be based on this patch. Signed-off-by: Feng Wang <wangfeng@eswincomputing.com> gcc/ChangeLog: * config/riscv/genrvv-type-indexer.cc (bfloat16_type): Generate bf16 vector_type and scalar_type in DEF_RVV_TYPE_INDEX. (bfloat16_wide_type): Ditto. (same_ratio_eew_bf16_type): Ditto. (main): Ditto. * config/riscv/riscv-modes.def (ADJUST_BYTESIZE): Add vector type for BFloat16. (RVV_WHOLE_MODES): Add vector type for BFloat16. (RVV_FRACT_MODE): Ditto. (RVV_NF4_MODES): Ditto. (RVV_NF8_MODES): Ditto. (RVV_NF2_MODES): Ditto. * config/riscv/riscv-vector-builtins-types.def (vbfloat16mf4_t): Add builtin vector type for BFloat16. (vbfloat16mf2_t): Add builtin vector type for BFloat16. (vbfloat16m1_t): Ditto. (vbfloat16m2_t): Ditto. (vbfloat16m4_t): Ditto. (vbfloat16m8_t): Ditto. (vbfloat16mf4x2_t): Ditto. (vbfloat16mf4x3_t): Ditto. (vbfloat16mf4x4_t): Ditto. (vbfloat16mf4x5_t): Ditto. (vbfloat16mf4x6_t): Ditto. (vbfloat16mf4x7_t): Ditto. (vbfloat16mf4x8_t): Ditto. (vbfloat16mf2x2_t): Ditto. (vbfloat16mf2x3_t): Ditto. (vbfloat16mf2x4_t): Ditto. (vbfloat16mf2x5_t): Ditto. (vbfloat16mf2x6_t): Ditto. (vbfloat16mf2x7_t): Ditto. (vbfloat16mf2x8_t): Ditto. (vbfloat16m1x2_t): Ditto. (vbfloat16m1x3_t): Ditto. (vbfloat16m1x4_t): Ditto. (vbfloat16m1x5_t): Ditto. (vbfloat16m1x6_t): Ditto. (vbfloat16m1x7_t): Ditto. (vbfloat16m1x8_t): Ditto. (vbfloat16m2x2_t): Ditto. (vbfloat16m2x3_t): Ditto. (vbfloat16m2x4_t): Ditto. (vbfloat16m4x2_t): Ditto. * config/riscv/riscv-vector-builtins.cc (check_required_extensions): Add required_ext checking for BFloat16. * config/riscv/riscv-vector-builtins.def (vbfloat16mf4_t): Add vector_type for BFloat16 in builtins.def. (vbfloat16mf4x2_t): Ditto. (vbfloat16mf4x3_t): Ditto. (vbfloat16mf4x4_t): Ditto. (vbfloat16mf4x5_t): Ditto. (vbfloat16mf4x6_t): Ditto. (vbfloat16mf4x7_t): Ditto. (vbfloat16mf4x8_t): Ditto. 
(vbfloat16mf2_t): Ditto. (vbfloat16mf2x2_t): Ditto. (vbfloat16mf2x3_t): Ditto. (vbfloat16mf2x4_t): Ditto. (vbfloat16mf2x5_t): Ditto. (vbfloat16mf2x6_t): Ditto. (vbfloat16mf2x7_t): Ditto. (vbfloat16mf2x8_t): Ditto. (vbfloat16m1_t): Ditto. (vbfloat16m1x2_t): Ditto. (vbfloat16m1x3_t): Ditto. (vbfloat16m1x4_t): Ditto. (vbfloat16m1x5_t): Ditto. (vbfloat16m1x6_t): Ditto. (vbfloat16m1x7_t): Ditto. (vbfloat16m1x8_t): Ditto. (vbfloat16m2_t): Ditto. (vbfloat16m2x2_t): Ditto. (vbfloat16m2x3_t): Ditto. (vbfloat16m2x4_t): Ditto. (vbfloat16m4_t): Ditto. (vbfloat16m4x2_t): Ditto. (vbfloat16m8_t): Ditto. (double_trunc_bfloat_scalar): Add scalar_type def for BFloat16. (double_trunc_bfloat_vector): Add vector_type def for BFloat16. * config/riscv/riscv-vector-builtins.h (RVV_REQUIRE_ELEN_BF_16): Add required definition of BFloat16 ext. * config/riscv/riscv-vector-switch.def (ENTRY): Add vector_type information for BFloat16. (TUPLE_ENTRY): Add tuple vector_type information for BFloat16.
2024-07-15	Daily bump.	GCC Administrator	5	-1/+45

2024-07-14	i386: Tweak i386-expand.cc to restore bootstrap on RHEL.	Roger Sayle	1	-13/+13
	This is a minor change to restore bootstrap on systems using gcc 4.8 as a host compiler. The fatal error is: In file included from gcc/gcc/coretypes.h:471:0, from gcc/gcc/config/i386/i386-expand.cc:23: gcc/gcc/config/i386/i386-expand.cc: In function 'void ix86_expand_fp_absneg_operator(rtx_code, machine_mode, rtx_def*)': ./insn-modes.h:315:75: error: temporary of non-literal type 'scalar_float_mode' in a constant expression #define HFmode (scalar_float_mode ((scalar_float_mode::from_int) E_HFmode)) ^ gcc/gcc/config/i386/i386-expand.cc:2179:8: note: in expansion of macro 'HFmode' case HFmode: ^ The solution is to use the E_?Fmode enumeration constants as case values in switch statements. 2024-07-14 Roger Sayle <roger@nextmovesoftware.com> * config/i386/i386-expand.cc (ix86_expand_fp_absneg_operator): Use E_?Fmode enumeration constants in switch statement. (ix86_expand_copysign): Likewise. (ix86_expand_xorsign): Likewise.
2024-07-14	c, objc: Add -Wunterminated-string-initialization	Alejandro Colomar	5	-5/+33
	Warn about the following: char s[3] = "foo"; Initializing a char array with a string literal of the same length as the size of the array is usually a mistake. It is rarely the case that one wants to create a non-terminated character sequence from a string literal. In some cases, for writing faster code, one may want to use arrays instead of pointers, since that removes the need for storing an array of pointers apart from the strings themselves. char *log_levels[] = { "info", "warning", "err" }; vs. char log_levels[][7] = { "info", "warning", "err" }; This forces the programmer to specify a size, which might change if a new entry is later added. Having no way to enforce null termination is very dangerous, however, so it is useful to have a warning for this, so that the compiler can make sure that the programmer didn't make any mistakes. This warning catches the bug above, so that the programmer will be able to fix it and write: char log_levels[][8] = { "info", "warning", "err" }; This warning already existed as part of -Wc++-compat, but this patch allows enabling it separately. It is also included in -Wextra, since it may not always be desired (when unterminated character sequences are wanted), but it's likely to be desired in most cases. Since Wc++-compat now includes this warning, the test has to be modified to expect the text of the new warning too, in <gcc.dg/Wcxx-compat-14.c>. Link: https://lists.gnu.org/archive/html/groff/2022-11/msg00059.html Link: https://lists.gnu.org/archive/html/groff/2022-11/msg00063.html Link: https://inbox.sourceware.org/gcc/36da94eb-1cac-5ae8-7fea-ec66160cf413@gmail.com/T/ PR c/115185 gcc/c-family/ChangeLog: * c.opt: Add -Wunterminated-string-initialization. gcc/c/ChangeLog: * c-typeck.cc (digest_init): Separate warnings about character arrays being initialized as unterminated character sequences with string literals, from -Wc++-compat, into a new warning, -Wunterminated-string-initialization. 
gcc/ChangeLog: * doc/invoke.texi: Document the new -Wunterminated-string-initialization. gcc/testsuite/ChangeLog: * gcc.dg/Wcxx-compat-14.c: Adapt the test to match the new text of the warning, which doesn't say anything about C++ anymore. * gcc.dg/Wunterminated-string-initialization.c: New test. Acked-by: Doug McIlroy <douglas.mcilroy@dartmouth.edu> Acked-by: Mike Stump <mikestump@comcast.net> Reviewed-by: Sandra Loosemore <sloosemore@baylibre.com> Reviewed-by: Martin Uecker <uecker@tugraz.at> Signed-off-by: Alejandro Colomar <alx@kernel.org> Reviewed-by: Marek Polacek <polacek@redhat.com>
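The row-width problem discussed in this commit can be checked at run time with a few lines of C. This is only an illustration: `is_terminated` and `fill_unterminated_row` are hypothetical helpers, and the 7-byte row is built with memcpy so that the sketch itself does not trigger the new warning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Rows are 8 bytes wide: "warning" has 7 characters plus the NUL.  */
const char log_levels[][8] = { "info", "warning", "err" };

/* Return 1 if the first N bytes of BUF contain a terminating NUL.  */
int is_terminated(const char *buf, size_t n)
{
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}

/* Produce the contents that `char row[7] = "warning";` would have:
   all seven characters and no room left for the NUL.  */
void fill_unterminated_row(char row[7])
{
    memcpy(row, "warning", 7);
}
```

Passing a row filled by `fill_unterminated_row` to `is_terminated` with size 7 returns 0, whereas every row of the 8-wide `log_levels` is terminated.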
2024-07-14	CRIS: Fix up last comment.	Hans-Peter Nilsson	1	-4/+3
	* config/cris/cris.cc (cris_option_override_after_change): Fix up comment regarding disabling late_combine.
2024-07-14	CRIS: Disable late-combine by default, related PR115883	Hans-Peter Nilsson	1	-0/+37
	With late-combine, performance for coremark compiled for cris-elf regresses 2.6% by performance and by size 0.4%, measured at r15-2005-g13757e50ff0b, when compiled with "-O2 -march=v10". Earlier, at r15-1880-gce34fcc572a0, numbers were by performance 3.2% and by size 0.4%, even with the proposed patch to PR115883 (TL;DR: a presumed bug in LRA or combine exposed by late-combine). Without that patch, about the same performance results (at that revision). Similarly around the late-combine commit (r15-1579-g792f97b44ffc5e). I briefly looked at the performance regression for coremark at r15-2005-g13757e50ff0b (with/without this patch) as far as seeing that the stack-frame grew larger (maxing out on hard registers and needing one more slot) for at least two of the top three* functions that regressed the most in terms of cycles per call: matrix_mul_matrix_bitextract (in coremark, 17% slower) and __subdf3 (in libgcc, 6.7% slower). That makes sense when considering that late-combine "naturally" stretches register life-times. But, looking at late_combine::combine_into_uses and late_combine::optimizable_set, nothing stood out to me. I guess there's improvement opportunities in late_combine::check_register_pressure. (*) I opted not to look at _dtoa_r (in newlib) mostly because it's boring and always shows up when something in gcc goes sideways. (It maxes out on hard registers and is big. End of story.) Note that the change of default is done in the TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE worker, not in the TARGET_OPTION_OVERRIDE worker for reasons stated in the comment. * config/cris/cris.cc (cris_option_override_after_change): New function. Disable late-combine by default. (cris_option_override): Call the new function.
2024-07-14	Daily bump.	GCC Administrator	10	-1/+307

2024-07-13	Document return value in write_cv_integer	Mark Harmstone	1	-1/+1
	gcc/ * dwarf2codeview.cc (write_lf_modifier): Expand upon comment.
2024-07-13	Make sure CodeView symbols are aligned	Mark Harmstone	1	-0/+2
	CodeView symbols have to be multiples of four bytes; add an alignment directive to write_data_symbol to ensure this. Note that these can be zeroes, so we can rely on GAS to do this for us; it's only types that need f3, f2, f1 values. gcc/ * dwarf2codeview.cc (write_data_symbol): Add alignment directive.
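The "multiple of four bytes" requirement above amounts to rounding a record length up to the next 4-byte boundary. The following is a minimal sketch of that arithmetic only; `align4` and `padding_for` are hypothetical helper names, and the patch itself is described as emitting an assembler alignment directive and letting GAS insert the zero bytes rather than computing anything like this by hand.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers, for illustration only; not taken from
// dwarf2codeview.cc.

// Round LEN up to the next multiple of four.
constexpr std::size_t align4 (std::size_t len)
{
  return (len + 3) & ~static_cast<std::size_t> (3);
}

// Number of padding bytes needed after a record of LEN bytes.
constexpr std::size_t padding_for (std::size_t len)
{
  return align4 (len) - len;
}

static_assert (align4 (0) == 0, "zero is already aligned");
static_assert (align4 (13) == 16, "13 rounds up to 16");
static_assert (padding_for (13) == 3, "13 needs three pad bytes");
```

For symbols the padding bytes can be zero, which is why an alignment directive is enough there.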
2024-07-13	Avoid magic numbers when writing CodeView padding	Mark Harmstone	1	-4/+7
	Adds names for the padding magic numbers to enum cv_leaf_type. gcc/ * dwarf2codeview.cc (enum cv_leaf_type): Add padding constants. (write_cv_padding): Use names for padding constants.
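To illustrate what naming the padding magic numbers buys, here is a small sketch, not the code from dwarf2codeview.cc. It assumes the CodeView convention that type records are padded with the bytes 0xf3, 0xf2, 0xf1 (the "f3, f2, f1 values" mentioned in the alignment commit) and that these are called `LF_PAD3`, `LF_PAD2` and `LF_PAD1`; the function name `append_cv_padding` and the byte-vector output are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed names and values for the CodeView padding bytes.
enum cv_pad : std::uint8_t
{
  LF_PAD1 = 0xf1,
  LF_PAD2 = 0xf2,
  LF_PAD3 = 0xf3
};

// Append the padding for a type record that is PADDING bytes short of
// a 4-byte boundary (0 means the record is already aligned).
void append_cv_padding (std::vector<std::uint8_t> &out, std::size_t padding)
{
  assert (padding < 4);
  switch (padding)
    {
    case 3:
      out.push_back (LF_PAD3);
      /* fall through */
    case 2:
      out.push_back (LF_PAD2);
      /* fall through */
    case 1:
      out.push_back (LF_PAD1);
      break;
    default:
      break;
    }
}
```

With named constants a reader (or a debugger session) sees `LF_PAD2` rather than a bare `0xf2`.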
2024-07-13	Add CodeView enum cv_sym_type	Mark Harmstone	1	-5/+11
	Make everything more gdb-friendly by using an enum for symbol constants rather than #defines. gcc/ * dwarf2codeview.cc (S_LDATA32, S_GDATA32, S_COMPILE3): Undefine. (enum cv_sym_type): Define. (struct codeview_symbol): Use enum cv_sym_type. (write_codeview_symbols): Add default to switch.
2024-07-13	Add CodeView enum cv_leaf_type	Mark Harmstone	2	-25/+35
	Make everything more gdb-friendly by using an enum for type constants rather than #defines. gcc/ * dwarf2codeview.cc (enum cv_leaf_type): Define. (struct codeview_subtype): Use enum cv_leaf_type. (struct codeview_custom_type): Use enum cv_leaf_type. (write_lf_fieldlist): Add default to switch. (write_custom_types): Add default to switch. * dwarf2codeview.h (LF_MODIFIER, LF_POINTER): Undefine. (LF_PROCEDURE, LF_ARGLIST, LF_FIELDLIST, LF_BITFIELD): Likewise. (LF_INDEX, LF_ENUMERATE, LF_ARRAY, LF_CLASS): Likewise. (LF_STRUCTURE, LF_UNION, LF_ENUM, LF_MEMBER, LF_CHAR): Likewise. (LF_SHORT, LF_USHORT, LF_LONG, LF_ULONG, LF_QUADWORD): Likewise. (LF_UQUADWORD): Likewise.
2024-07-13	fortran: Correctly evaluate scalar MASK arguments of MINLOC/MAXLOC	Mikael Morin	2	-0/+34
	Add the preliminary code that the generated expression for MASK may depend on when generating the inline code to evaluate MINLOC or MAXLOC with a scalar MASK. The generated code was only keeping the generated expression but not the preliminary code, which was sufficient for simple cases such as data references or simple (scalar) function calls, but was bogus with more complicated ones. gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Add the preliminary code generated for MASK to the preliminary code of MINLOC/MAXLOC. gcc/testsuite/ChangeLog: * gfortran.dg/minmaxloc_17.f90: New test.
2024-07-13	Add gcc.gnu.org account names to MAINTAINERS	Richard Sandiford	2	-791/+969
	As discussed in the thread starting at: https://gcc.gnu.org/pipermail/gcc/2024-June/244199.html it would be useful to have the @gcc.gnu.org bugzilla account names in MAINTAINERS. This is because: (a) Not every non-@gcc.gnu.org email listed in MAINTAINERS is registered as a bugzilla user. (b) Only @gcc.gnu.org accounts tend to have full rights to modify tickets. (c) A maintainer's name and email address aren't always enough to guess the bugzilla account name. (d) The users list on bugzilla has many blank entries for "real name". However, including @gcc.gnu.org to the account name might encourage people to use it for ordinary email, rather than just for bugzilla. This patch goes for the compromise of using the unqualified account name, with some text near the top of the file to explain its usage. There isn't room in the area maintainer sections for a new column, so it seemed better to have the account name only in the Write After Approval section. It's then necessary to list all maintainers there, even if they have more specific roles as well. Also, there were some entries that didn't line up with the prevailing columns (they had one tab too many or one tab too few). It seemed easier to check for and report this, and other things, if the file used spaces rather than tabs. There was one instance of an email address without the trailing ">". The updates to check-MAINTAINERS.py includes a test for that. The account names in the file were taken from a trawl of the gcc-cvs archives, with a very small number of manual edits for ambiguities. There are a handful of names that I couldn't find; the new column has "-" for those. The names were then filtered against the bugzilla @gcc.gnu.org user list, with those not present again being blanked out with "-". ChangeLog: * MAINTAINERS: Replace tabs with spaces. Add a bugzilla account name column to the Write After Approval section. Line up the email column and fix an entry that was missing the trailing ">". 
contrib/ChangeLog: * check-MAINTAINERS.py (sort_by_surname): Replace with... (get_surname): ...this. (has_tab, is_empty): Delete. (check_group): Take a list of column positions as argument. Check that lines conform to these column numbers. Check that the final column is an email in angle brackets. Record surnames on the fly. (top level): Reject tabs. Use paragraph counts to identify which groups of lines should be checked. Report missing sections.
2024-07-13	diagnostics: add highlight-a vs highlight-b in colorization and pp_markup	David Malcolm	54	-196/+1580
	Since r6-4582-g8a64515099e645 (which added class rich_location), ranges of quoted source code have been colorized using the following rules: - the primary range used the same color of the kind of the diagnostic i.e. "error" vs "warning" etc (defaulting to bold red and bold magenta respectively) - secondary ranges alternate between "range1" and "range2" (defaulting to green and blue respectively) This works for cases with large numbers of highlighted ranges, but is suboptimal for common cases. The following patch adds a pair of color names: "highlight-a" and "highlight-b", and uses them whenever it makes sense to highlight and contrast two different things in the source code (e.g. a type mismatch). These are used by diagnostic-show-locus.cc for highlighting quoted source. In addition the patch adds colorization to fragments within the corresponding diagnostic messages themselves, using consistent colorization between the message and the quoted source code for the two different things being contrasted. For example, consider: demo.c: In function ‘test_bad_format_string_args’: ../../src/demo.c:25:18: warning: format ‘%i’ expects argument of type ‘int’, but argument 2 has type ‘const char *’ [-Wformat=] 25 \| printf("hello %i", msg); \| ~^ ~~~ \| \| \| \| int const char * \| %s Previously, the types within the message in quotes would be in bold but not colorized, and the labelled ranges of quoted source code would use bold magenta for the "int" and non-bold green for the "const char *". With this patch: - the "%i" and "int" in the message and the "int" in the quoted source are all colored bold green - the "const char *" in the message and in the quoted source are both colored bold blue so that the consistent use of contrasting color draws the reader's eyes to the relationships between the diagnostic message and the source. 
I've tried this with gnome-terminal with many themes, including a variety of light versus dark backgrounds, solarized versus non-solarized themes, etc, and it was readable in all. My initial version of the patch used the existing %r and %R facilities within pretty-print.cc for the messages, but this turned out to be very uncomfortable, leading to error-prone format strings such as: error_at (richloc, "invalid operands to binary %s (have %<%r%T%R%> and %<%r%T%R%>)", opname, "highlight-a", type0, "highlight-b", type1); To avoid requiring monstrosities such as the above, the patch adds a new "%e" format code to pretty-print.cc, which expects a pp_element *, where pp_element is a new abstract base class (actually a pp_markup::element), along with various useful subclasses. This lets the above be written as: pp_markup::element_quoted_type element_0 (type0, highlight_colors::lhs); pp_markup::element_quoted_type element_1 (type1, highlight_colors::rhs); error_at (richloc, "invalid operands to binary %s (have %e and %e)", opname, &element_0, &element_1); which I feel is maintainable and clear to translators; the use of %e and pp_element * captures the type-unsafe part of the variadic call, and the subclasses allow for type-safety (so e.g. an element_quoted_type expects a type and a highlighting color). This approach allows for some nice simplifications within c-format.cc. The patch also extends -Wformat to "teach" it about the new %e and pp_element *. Doing so requires c-format.cc to be able to determine if a T * is a pp_element * (i.e. if T is a subclass). To do so I added a new comp_types callback for comparing types, where the C++ frontend supplies a suitable implementation (and %e will always be wrong for C). I've manually tested this on many diagnostics with both C and C++ and it seems a subtle but significant improvement in readability. I've added a new option -fno-diagnostics-show-highlight-colors in case people prefer the old behavior. 
gcc/c-family/ChangeLog: * c-common.cc: Include "tree-pretty-print-markup.h". (binary_op_error): Use pp_markup::element_quoted_type and %e. (check_function_arguments): Add "comp_types" param and pass it to check_function_format. * c-common.h (check_function_arguments): Add "comp_types" param. (check_function_format): Likewise. * c-format.cc: Include "tree-pretty-print-markup.h". (local_pp_element_ptr_node): New. (PP_FORMAT_CHAR_TABLE): Add entry for %e. (struct format_check_context): Add "m_comp_types" field. (check_function_format): Add "comp_types" param and pass it to check_format_info. (check_format_info): Likewise, passing it to format_ctx's ctor. (check_format_arg): Extract m_comp_types from format_ctx and pass it to check_format_info_main. (check_format_info_main): Add "comp_types" param and pass it to arg_parser's ctor. (class argument_parser): Add "m_comp_types" field. (argument_parser::check_argument_type): Pass m_comp_types to check_format_types. (handle_subclass_of_pp_element_p): New. (check_format_types): Add "comp_types" param, and use it to call handle_subclass_of_pp_element_p. (class element_format_substring): New. (class element_expected_type_with_indirection): New. (format_type_warning): Use element_expected_type_with_indirection to unify the if (wanted_type_name) branches, reducing from four emit_warning calls to two. Simplify these further using %e. Doing so also gives suitable colorization of the text within the diagnostics. (init_dynamic_diag_info): Initialize local_pp_element_ptr_node. (selftest::test_type_mismatch_range_labels): Add nullptr for new param of gcc_rich_location label overload. * c-format.h (T_PP_ELEMENT_PTR): New. * c-type-mismatch.cc: Include "diagnostic-highlight-colors.h". (binary_op_rich_location::binary_op_rich_location): Use highlight_colors::lhs and highlight_colors::rhs for the ranges. * c-type-mismatch.h (class binary_op_rich_location): Add comment about highlight_colors. 
gcc/c/ChangeLog: * c-objc-common.cc: Include "tree-pretty-print-markup.h". (print_type): Add optional "highlight_color" param and use it to show highlight colors in "aka" text. (pp_markup::element_quoted_type::print_type): New. * c-typeck.cc: Include "tree-pretty-print-markup.h". (comp_parm_types): New. (build_function_call_vec): Pass it to check_function_arguments. (inform_for_arg): Use %e and highlight colors to contrast actual versus expected. (convert_for_assignment): Use highlight_colors::actual for the rhs_label. (build_binary_op): Use highlight_colors::lhs and highlight_colors::rhs for the ranges. gcc/ChangeLog: * common.opt (fdiagnostics-show-highlight-colors): New option. * common.opt.urls: Regenerate. * coretypes.h (pp_markup::element): New forward decl. (pp_element): New typedef. * diagnostic-color.cc (gcc_color_defaults): Add "highlight-a" and "highlight-b". * diagnostic-format-json.cc (diagnostic_output_format_init_json): Disable highlight colors. * diagnostic-format-sarif.cc (diagnostic_output_format_init_sarif): Likewise. * diagnostic-highlight-colors.h: New file. * diagnostic-path.cc (struct event_range): Pass nullptr for highlight color of m_rich_loc. * diagnostic-show-locus.cc (colorizer::set_range): Handle ranges with m_highlight_color. (colorizer::STATE_NAMED_COLOR): New. (colorizer::m_richloc): New field. (colorizer::colorizer): Add richloc param for initializing m_richloc. (colorizer::set_named_color): New. (colorizer::begin_state): Add case STATE_NAMED_COLOR. (layout::layout): Pass richloc to m_colorizer's ctor. (selftest::test_one_liner_labels): Pass nullptr for new param of gcc_rich_location ctor for labels. (selftest::test_one_liner_labels_utf8): Likewise. * diagnostic.h (diagnostic_context::set_show_highlight_colors): New. * doc/invoke.texi: Add option -fdiagnostics-show-highlight-colors and highlight-a and highlight-b color caps. * doc/ux.texi (Use color consistently when highlighting mismatches): New subsection. 
* gcc-rich-location.cc (gcc_rich_location::add_expr): Add "highlight_color" param. (gcc_rich_location::maybe_add_expr): Likewise. * gcc-rich-location.h (gcc_rich_location::gcc_rich_location): Split out into a pair of ctors, where if a range_label is supplied the caller must also supply a highlight color. (gcc_rich_location::add_expr): Add "highlight_color" param. (gcc_rich_location::maybe_add_expr): Likewise. * gcc.cc (driver_handle_option): Handle OPT_fdiagnostics_show_highlight_colors. * lto-wrapper.cc (merge_and_complain): Likewise. (append_compiler_options): Likewise. (append_diag_options): Likewise. (run_gcc): Likewise. * opts-common.cc (decode_cmdline_options_to_array): Add comment about -fno-diagnostics-show-highlight-colors. * opts-global.cc (init_options_once): Preserve pp_show_highlight_colors in case the global_dc's printer is recreated. * opts.cc (common_handle_option): Handle OPT_fdiagnostics_show_highlight_colors. (gen_command_line_string): Likewise. * pretty-print-markup.h: New file. * pretty-print.cc: Include "pretty-print-markup.h" and "diagnostic-highlight-colors.h". (pretty_printer::format): Handle %e. (pretty_printer::pretty_printer): Handle new field m_show_highlight_colors. (pp_string_n): New. (pp_markup::context::begin_quote): New. (pp_markup::context::end_quote): New. (pp_markup::context::begin_color): New. (pp_markup::context::end_color): New. (highlight_colors::expected): New. (highlight_colors::actual): New. (highlight_colors::lhs): New. (highlight_colors::rhs): New. (class selftest::test_element): New. (selftest::test_pp_format): Add tests of %e. (selftest::test_urlification): Likewise. * pretty-print.h (pp_markup::context): New forward decl. (class chunk_info): Add friend class pp_markup::context. (class pretty_printer): Add friend pp_show_highlight_colors. (pretty_printer::m_show_highlight_colors): New field. (pp_show_highlight_colors): New inline function. (pp_string_n): New decl. 
* substring-locations.cc: Include "diagnostic-highlight-colors.h". (format_string_diagnostic_t::highlight_color_format_string): New. (format_string_diagnostic_t::highlight_color_param): New. (format_string_diagnostic_t::emit_warning_n_va): Use highlight colors. * substring-locations.h (format_string_diagnostic_t::highlight_color_format_string): New. (format_string_diagnostic_t::highlight_color_param): New. * toplev.cc (general_init): Initialize global_dc's show_highlight_colors. * tree-pretty-print-markup.h: New file. gcc/cp/ChangeLog: * call.cc: Include "tree-pretty-print-markup.h". (implicit_conversion_error): Use highlight_colors::percent_h for the labelled range. (op_error_string): Split out into... (concat_op_error_string): ...this. (binop_error_string): New. (op_error): Use %e, binop_error_string, highlight_colors::lhs, and highlight_colors::rhs. (maybe_inform_about_fndecl_for_bogus_argument_init): Add "highlight_color" param; use it for the richloc. (convert_like_internal): Use highlight_colors::percent_h for the labelled_range, and highlight_colors::percent_i for the call to maybe_inform_about_fndecl_for_bogus_argument_init. (build_over_call): Pass cp_comp_parm_types for new "comp_types" param of check_function_arguments. (complain_about_bad_argument): Use highlight_colors::percent_h for the labelled_range, and highlight_colors::percent_i for the call to maybe_inform_about_fndecl_for_bogus_argument_init. * cp-tree.h (maybe_inform_about_fndecl_for_bogus_argument_init): Add optional highlight_color param. (cp_comp_parm_types): New decl. (highlight_colors::const percent_h): New decl. (highlight_colors::const percent_i): New decl. * error.cc: Include "tree-pretty-print-markup.h". (highlight_colors::const percent_h): New defn. (highlight_colors::const percent_i): New defn. (type_to_string): Add param "highlight_color" and use it. (print_nonequal_arg): Likewise. (print_template_differences): Add params "highlight_color_a" and "highlight_color_b". 
(type_to_string_with_compare): Add params "this_highlight_color" and "peer_highlight_color". (print_template_tree_comparison): Add params "highlight_color_a" and "highlight_color_b". (cxx_format_postprocessor::handle): Use highlight_colors::percent_h and highlight_colors::percent_i. (pp_markup::element_quoted_type::print_type): New. (range_label_for_type_mismatch::get_text): Pass nullptr for new params of type_to_string_with_compare. * typeck.cc (cp_comp_parm_types): New. (cp_build_function_call_vec): Pass it to check_function_arguments. (convert_for_assignment): Use highlight_colors::percent_h for the labelled_range. gcc/testsuite/ChangeLog: * g++.dg/diagnostic/bad-binary-ops-highlight-colors.C: New test. * g++.dg/diagnostic/bad-binary-ops-no-highlight-colors.C: New test. * g++.dg/plugin/plugin.exp (plugin_test_list): Add show-template-tree-color-no-highlight-colors.C to show_template_tree_color_plugin.c. * g++.dg/plugin/show-template-tree-color-labels.C: Update expected output to reflect use of highlight-a and highlight-b to contrast mismatches. * g++.dg/plugin/show-template-tree-color-no-elide-type.C: Likewise. * g++.dg/plugin/show-template-tree-color-no-highlight-colors.C: New test. * g++.dg/plugin/show-template-tree-color.C: Update expected output to reflect use of highlight-a and highlight-b to contrast mismatches. * g++.dg/warn/Wformat-gcc_diag-1.C: New test. * g++.dg/warn/Wformat-gcc_diag-2.C: New test. * g++.dg/warn/Wformat-gcc_diag-3.C: New test. * gcc.dg/bad-binary-ops-highlight-colors.c: New test. * gcc.dg/format/colors.c: New test. * gcc.dg/plugin/diagnostic_plugin_show_trees.c (show_tree): Pass nullptr for new param of gcc_rich_location::add_expr. libcpp/ChangeLog: * include/rich-location.h (location_range::m_highlight_color): New field. (rich_location::rich_location): Add optional label_highlight_color param. (rich_location::set_highlight_color): New decl. (rich_location::add_range): Add optional label_highlight_color param. 
(rich_location::set_range): Likewise. * line-map.cc (rich_location::rich_location): Add "label_highlight_color" param and pass it to add_range. (rich_location::set_highlight_color): New. (rich_location::add_range): Add "label_highlight_color" param. (rich_location::set_range): Add "highlight_color" param. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
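The "%e" idea described in the diagnostics commit above (a format code that consumes a pointer to an abstract element, with subclasses knowing how to print and colorize themselves) can be sketched outside GCC as follows. This is a toy model under stated assumptions, not the pretty-print.cc implementation: `element`, `quoted_colored_text` and `format_with_elements` are hypothetical names, colors are rendered as bracketed tags instead of terminal escapes, and only "%e" is interpreted.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <utility>

// Hypothetical stand-in for an abstract element base class.
struct element
{
  virtual ~element () = default;
  // Append this element's text, including any markup, to OUT.
  virtual void add_to (std::string &out) const = 0;
};

// A quoted piece of text wrapped in a named highlight color.
class quoted_colored_text : public element
{
public:
  quoted_colored_text (std::string text, std::string color)
    : m_text (std::move (text)), m_color (std::move (color)) {}

  void add_to (std::string &out) const override
  {
    out += "'[" + m_color + "]" + m_text + "[/" + m_color + "]'";
  }

private:
  std::string m_text;
  std::string m_color;
};

// Expand each "%e" in FMT using the next element in ELEMS; every other
// character is copied through unchanged.
std::string
format_with_elements (const std::string &fmt,
                      std::initializer_list<const element *> elems)
{
  std::string out;
  auto next = elems.begin ();
  for (std::size_t i = 0; i < fmt.size (); ++i)
    {
      if (fmt[i] == '%' && i + 1 < fmt.size () && fmt[i + 1] == 'e'
          && next != elems.end ())
        {
          (*next++)->add_to (out);
          ++i; // skip the 'e'
        }
      else
        out += fmt[i];
    }
  return out;
}
```

The format string stays short and translatable, while the pairing of text with color lives in a typed object instead of in extra variadic arguments.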
2024-07-13	tree-optimization/115868 - ICE with .MASK_CALL in simdclone	Richard Biener	1	-3/+8
	The following adjusts mask recording which didn't take into account that we can merge call arguments from two vectors like _50 = {vect_d_1.253_41, vect_d_1.254_43}; _51 = VIEW_CONVERT_EXPR<unsigned char>(mask__19.257_49); _52 = (unsigned int) _51; _53 = _Z3bazd.simdclone.7 (_50, _52); _54 = BIT_FIELD_REF <_53, 256, 0>; _55 = BIT_FIELD_REF <_53, 256, 256>; The testcase g++.dg/vect/pr68762-2.cc exercises this on x86_64 with partial vector usage enabled and AVX512 support. PR tree-optimization/115868 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Correctly compute the number of mask copies required for vect_record_loop_mask.
2024-07-13	Daily bump.	GCC Administrator	10	-1/+396

2024-07-13	doc: Update GNU Modula 2 mailing list links	Gerald Pfeifer	1	-2/+2
	gcc: * doc/gm2.texi (Community): Update lists.nongnu.org and lists.gnu.org links.
2024-07-12	[PR rtl-optimization/115876] Fix one of two ubsan reported issues in new ext-dce.cc code	Jeff Law	1	-2/+2
	David Binderman did a bootstrap build with ubsan enabled which triggered a few errors in the new ext-dce.cc code. This fixes the trivial case of shifting negative values. Bootstrapped and regression tested on x86. Pushing to the trunk. gcc/ PR rtl-optimization/115876 * ext-dce.cc (carry_backpropagate): Make mask and mmask unsigned.
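As background for the "make mask and mmask unsigned" fix: in C, and in C++ before C++20, left-shifting a negative signed value is undefined behavior, which is the kind of thing ubsan reports, whereas shifts of unsigned values are well defined as long as the shift count is smaller than the type's width. The sketch below shows mask arithmetic done entirely in an unsigned type; the function names are hypothetical and this is not the ext-dce.cc code.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low N bits set. Using an unsigned type keeps every
// shift well defined; N >= 64 is handled explicitly because shifting
// by the full width would itself be undefined.
constexpr std::uint64_t low_bits_mask (unsigned n)
{
  return n >= 64 ? ~std::uint64_t (0) : (std::uint64_t (1) << n) - 1;
}

// Shift MASK left by AMOUNT bits; bits shifted past bit 63 are
// simply discarded.
constexpr std::uint64_t shift_mask (std::uint64_t mask, unsigned amount)
{
  return amount >= 64 ? 0 : mask << amount;
}

static_assert (low_bits_mask (8) == 0xff, "eight low bits");
static_assert (shift_mask (0xff, 8) == 0xff00, "moved up one byte");
```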
2024-07-12	doc: remove @opindex for fconcepts-ts	Marek Polacek	1	-3/+1
	We're getting complaints from the CI system about this removed option. I suspect I should have removed the @opindex and @itemx for it. This patch does that. gcc/ChangeLog: * doc/invoke.texi: Remove @opindex and @itemx for -fconcepts-ts.
2024-07-12	Fix Xcode 16 build break with NULL != nullptr	Daniel Bertalan	6	-14/+14
	As of Xcode 16 beta 2 with the macOS 15 SDK, each re-inclusion of the stddef.h header causes the NULL macro in C++ to be re-defined to an integral constant (__null). This makes the workaround in d59a576b8 ("Redefine NULL to nullptr") ineffective, as other headers that are typically included after system.h (such as obstack.h) do include stddef.h too. This can be seen by running the sample below through `clang++ -E` #include <stddef.h> #define NULL nullptr #include <stddef.h> NULL The relevant libc++ change is here: https://github.com/llvm/llvm-project/commit/2950283dddab03c183c1be2d7de9d4999cc86131 Filed as FB14261859 to Apple and added a comment about it on LLVM PR 86843. This fixes the cases in --enable-languages=c,c++,objc,obj-c++,rust build where NULL being an integral constant instead of a null pointer literal (therefore no longer implicitly converting to a pointer when used as a template function's argument) caused issues. gcc/value-pointer-equiv.cc:65:43: error: no viable conversion from `pair<typename __unwrap_ref_decay<long>::type, typename __unwrap_ref_decay<long>::type>' to 'const pair<tree, tree>' 65 \| const std::pair <tree, tree> m_marker = std::make_pair (NULL, NULL); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~ As noted in the previous commit though, the proper solution would be to phase out the usages of NULL in GCC's C++ source code. gcc/analyzer/ChangeLog: * diagnostic-manager.cc (saved_diagnostic::saved_diagnostic): Change NULL to nullptr. (struct null_assignment_sm_context): Likewise. * infinite-loop.cc: Likewise. * infinite-recursion.cc: Likewise. * varargs.cc (va_list_state_machine::on_leak): Likewise. gcc/rust/ChangeLog: * metadata/rust-imports.cc (Import::try_package_in_directory): Change NULL to nullptr. gcc/ChangeLog: * value-pointer-equiv.cc: Change NULL to nullptr. Signed-off-by: Daniel Bertalan <dani@danielbertalan.dev>
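The make_pair failure quoted above comes down to template argument deduction: an integral null constant is deduced as an integer type, and a pair of integers does not convert to a pair of pointers, while `nullptr` is deduced as `std::nullptr_t`, which does. A minimal sketch using plain `int *` in place of GCC's `tree` (the alias `int_ptr_pair` and the function `make_marker` are hypothetical names for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

using int_ptr_pair = std::pair<int *, int *>;

// nullptr is deduced as std::nullptr_t, and a pair of std::nullptr_t
// converts implicitly to a pair of pointers.
static_assert (std::is_convertible<decltype (std::make_pair (nullptr, nullptr)),
                                   int_ptr_pair>::value,
               "pair of nullptr_t converts to pair of pointers");

// An integral null constant is deduced as an integer type (long in the
// error message quoted above), and long does not convert to a pointer.
static_assert (!std::is_convertible<std::pair<long, long>,
                                    int_ptr_pair>::value,
               "pair of long does not convert to pair of pointers");

// Safe regardless of how the NULL macro happens to be defined.
inline int_ptr_pair make_marker ()
{
  return std::make_pair (nullptr, nullptr);
}
```

The example deliberately avoids the `NULL` macro itself, since its definition is exactly what varies between headers and SDK versions.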
2024-07-12	rtl-ssa: Fix prev_any_insn [PR115785]	Richard Sandiford	4	-41/+747
	Bit of a brown paper bag issue, but: due to the representation of the insn chain, insn_info::prev_any_insn would sometimes skip over instructions. This led to an invalid update in the PR when adding and removing instructions. I think one of the reasons I failed to spot this when checking the code is that m_prev_insn_or_last_debug_insn is misnamed: it's the previous instruction of the same type or the last debug instruction in a group. The patch therefore renames it to m_prev_sametype_or_last_debug_insn (with the term prev_sametype already being used in some accessors). The reason this didn't show up earlier is that (a) prev_any_insn is rarely used directly, (b) no instructions were lost from the def-use chains, and (c) only consecutive debug instructions were skipped when walking the insn chain. The chaining scheme makes prev_any_insn more complicated than next_any_insn, prev_nondebug_insn and next_nondebug_insn, but the object code produced is still relatively simple. gcc/ PR rtl-optimization/115785 * rtl-ssa/insns.h (insn_info::prev_insn_or_last_debug_insn) (insn_info::next_nondebug_or_debug_insn): Remove typedefs. (insn_info::m_prev_insn_or_last_debug_insn): Rename to... (insn_info::m_prev_sametype_or_last_debug_insn): ...this. * rtl-ssa/internals.inl (insn_info::insn_info): Update after above renaming. (insn_info::copy_prev_from): Likewise. (insn_info::set_prev_sametype_insn): Likewise. (insn_info::set_last_debug_insn): Likewise. (insn_info::clear_insn_links): Likewise. (insn_info::has_insn_links): Likewise. * rtl-ssa/member-fns.inl (insn_info::prev_nondebug_insn): Likewise. (insn_info::prev_any_insn): Fix moves from non-debug to debug insns. gcc/testsuite/ PR rtl-optimization/115785 * g++.dg/torture/pr115785.C: New test.
2024-07-12	modula2: bootstrap fix for string and vector headers.	FX Coudert	2	-3/+3
	This patch fixes the include of headers (<string> and <vector>) which are included after GCC's system.h has been included. It defines INCLUDE_STRING before including "system.h". This allows gcc to bootstrap with Apple clang 15. gcc/m2/ChangeLog: * gm2-gcc/m2linemap.cc (INCLUDE_STRING): Define before include of gcc-consolidation.h. * gm2spec.cc (INCLUDE_STRING): Define before include of system.h. (INCLUDE_VECTOR): Ditto. Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
2024-07-12	[RISC-V] Avoid unnecessary sign extension after memcmp	Jeff Law	2	-11/+18
	Similar to the str[n]cmp work, this adjusts the block compare expansion to do its work in X mode with an appropriate lowpart extraction of the results at the end of the sequence. This has gone through my tester on rv32 and rv64, but that's it. Waiting on pre-commit testing before moving forward. gcc/ * config/riscv/riscv-string.cc (emit_memcmp_scalar_load_and_compare): Set RESULT directly rather than using a temporary. (emit_memcmp_scalar_result_calculation): Similarly. (riscv_expand_block_compare_scalar): Use CONST0_RTX rather than generating new RTL. * config/riscv/riscv.md (cmpmemsi): Pass an X mode temporary to the expansion routines. If necessary extract low part of the word to store in final result location.
2024-07-12	c++/modules: Add testcase for fixed issue with usings [PR115798]	Nathaniel Shead	3	-0/+34
	This issue was fixed by r15-2003-gd6bf4b1c932211, but seems worth adding to the testsuite. PR c++/115798 gcc/testsuite/ChangeLog: * g++.dg/modules/using-26_a.C: New test. * g++.dg/modules/using-26_b.C: New test. * g++.dg/modules/using-26_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-07-12	c++/modules: Handle redefinitions of using-decls	Nathaniel Shead	4	-12/+54
	This fixes an ICE exposed by supporting exported non-function using-decls. Sometimes when preparing to define a class, xref_tag will find a using-decl belonging to a different namespace, which triggers the checking_assert in modules handling. Ideally I feel that 'lookup_and_check_tag' should be told whether we're about to define the type and handle erroring on redefinitions itself to avoid this issue (and provide better diagnostics by acknowledging the using-declaration), but this is complicated with the current fragmentation of definition checking. So for this patch we just fixup the assertion and ensure that pushdecl properly errors on the conflicting declaration later. gcc/cp/ChangeLog: * decl.cc (xref_tag): Move assertion into condition. * name-lookup.cc (check_module_override): Check for conflicting types and using-decls. gcc/testsuite/ChangeLog: * g++.dg/modules/using-19_a.C: New test. * g++.dg/modules/using-19_b.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
2024-07-12	c++: Introduce USING_DECLs for non-function usings [PR114683]	Nathaniel Shead	17	-135/+308
	With modules, a non-function using-declaration is not completely interchangeable with the declaration that it refers to; in particular, such a using-declaration may be exported without revealing the name of the entity it refers to. This patch fixes this by building USING_DECLs for all using-declarations that bind a non-function from a different scope. These new decls can then have purviewness and exportingness attached to them without affecting the decl that they refer to. We do this for all such usings, not just usings that may be revealed in a module; this way we can verify the change in representation against the (more comprehensive) non-modules testsuites, and in a future patch we can use the locations of these using-decls to enhance relevant diagnostics. Another possible approach would be to reuse OVERLOADs for this, as is already done within add_binding_entity for modules. I didn't do this because lots of code (as well as the names of the accessors) makes assumptions that OVERLOADs refer to function overload sets, and so splitting this up reduced semantic burden and made it easier to avoid unintentional changes. This did mean that we need to move out the definitions of ovl_iterator::{purview,exporting}_p, because the structures for module decls are declared later on in cp-tree.h. Building USING_DECLs changed a couple of code paths when adjusting bindings; in particular, pushdecl recognises global using-declarations as usings now, and so checks fall through to update_binding. To not regress g++.dg/lookup/linkage2.C the checks for 'extern' declarations no longer were sufficient (they don't handle 'extern "C"'); but duplicate_decls performed all the relevant checks anyway. Otherwise in general we strip using-decls from all lookup_* functions where necessary. Over time for diagnostics purposes it would probably be good to slowly revert this (especially e.g. 
lookup_elaborated_type causes some diagnostic quality regressions here) but this patch doesn't do so to minimise churn. This patch also tries not to build USING_DECLs when just redeclaring an existing declaration, and instead reveals that declaration in-place. This requires reworking some logic handling CONST_DECLs in module streaming, since a non-using CONST_DECL may now be exported independently of its containing enum. 'add_binding_entity' needs to explicitly write the names of unscoped enumerators so that lazy loading will trigger when the name is found by name lookup; it does this by pretending that the enum declarations are always usings so that it doesn't double-write definitions. By also checking if the enumerator was marked purview/exported we can use that to override a non-purview/non-exported TYPE_DECL and ensure it's made visible regardless. When reading we should get the exported flag on the enumeration constant, and so should properly create a binding for it. We don't need to do anything to handle importedness as that checking is skipped for EK_USINGs. Some other places assume that module information for a CONST_DECL inherits module information from its containing type. This includes: - get_originating_module_decl, for determining if the name was imported or has module attachment; I don't /think/ this change should affect that, so I'm leaving this untouched. - binding_cmp, for sorting by exportedness; since now an enumerator could be exported without the containing decl being exported, we need to handle this here too. PR c++/114683 gcc/cp/ChangeLog: * cp-tree.h (class ovl_iterator): Move definitions of purview_p and exporting_p to name-lookup.cc. * module.cc (depset::hash::add_binding_entity): Strip using-decls. Remove workarounds. Handle CONST_DECLs with different purview/exported from their enum. (enum ct_bind_flags): Remove unnecessary cbf_wrapped flag. (module_state::write_cluster): Likewise. 
(module_state::read_cluster): Build USING_DECL for non-function usings. (binding_cmp): Handle CONST_DECLs with different purview and/or exported from their enum. (set_instantiating_module): Support CONST_DECLs. * name-lookup.cc (get_fixed_binding_slot): Strip USING_DECLs. (name_lookup::process_binding): Strip USING_DECLs. (name_lookup::process_module_binding): Remove workaround. (update_binding): Strip USING_DECLs, remove incorrect check for non-extern variables. (ovl_iterator::purview_p): Support USING_DECLs. (ovl_iterator::exporting_p): Support USING_DECLs. (walk_module_binding): Handle stat hack type. (do_nonmember_using_decl): Strip USING_DECLs when comparing; build USING_DECLs for non-function usings in different scope rather than binding directly. (get_namespace_binding): Strip USING_DECLs. (lookup_name): Strip USING_DECLs. (lookup_elaborated_type): Strip USING_DECLs. * decl.cc (poplevel): Still support -Wunused for using-decls. (lookup_and_check_tag): Remove unnecessary strip_using_decl. * parser.cc (cp_parser_template_name): Likewise. (cp_parser_nonclass_name): Likewise. (cp_parser_class_name): Likewise. gcc/testsuite/ChangeLog: * g++.dg/lookup/using29.C: Update errors. * g++.dg/lookup/using53.C: Update errors, add XFAILs. * g++.dg/modules/using-22_b.C: Remove xfails. * g++.dg/warn/Wunused-var-18.C: Update error, add check. * g++.dg/lookup/using68.C: New test. * g++.dg/modules/using-24_a.C: New test. * g++.dg/modules/using-24_b.C: New test. * g++.dg/modules/using-25_a.C: New test. * g++.dg/modules/using-25_b.C: New test. * g++.dg/modules/using-enum-4_a.C: New test. * g++.dg/modules/using-enum-4_b.C: New test. * g++.dg/modules/using-enum-4_c.C: New test. Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com> Reviewed-by: Jason Merrill <jason@redhat.com>
2024-07-12	s390: Fully exploit vgm, vgbm, vrepi	Stefan Schulze Frielinghaus	42	-178/+2577
	Currently instructions vgm and vrepi are utilized only for constant vectors where the element mode equals the element mode of the corresponding instruction. This patch lifts this restriction by making use of those instructions for constant vectors even if element modes do not coincide. For example, the constant vector (v2di){0x7ffffffe7ffffffe, 0x7ffffffe7ffffffe} can be loaded via vgmf %v0,1,30. Similarly, the constant vector (v4si){0xaaaaaaaa, 0xaaaaaaaa, 0xaaaaaaaa, 0xaaaaaaaa} can be loaded via vrepiq %v0,-86. Analogously, if the element mode of a constant vector is smaller than the element mode of a corresponding instruction, we still may make use of those instructions. For example, the constant vector (v4si){0x7fff, 0xfffe0000, 0x7fff, 0xfffe0000} can be loaded via vgmg %v0,17,46. Similarly, the constant vector (v4si){-1, -16643, -1, -16643} can be loaded via vrepig %v0,-16643. Additionally this patch enables vgm, vgbm, vrepi for partial vectors, i.e., vectors of size less than 16 bytes. Basically this is done by treating a vector as a full vector resulting in replicating constants into the ignored bits whereas vgbm sets those to zero. Furthermore, there is no restriction to integer vectors anymore, i.e., supporting scalars of mode up to and including TI and TF and also floating-point vectors. Here are the counts of how often each instruction is emitted for SPEC 2017 (without patch / with patch): vgbm 140 / 365, vgm 17508 / 24452, vrepi 1360 / 2775. I expect most (maybe even all) to save us a load from the literal pool. gcc/ChangeLog: * config/s390/2964.md: Remove extended mnemonics for vgm. * config/s390/3906.md: Remove extended mnemonics for vgm. * config/s390/3931.md: Remove extended mnemonics for vgm. * config/s390/8561.md: Remove extended mnemonics for vgm. * config/s390/constraints.md (jKK): Remove constraint. (jzz): Add constraint. * config/s390/s390-protos.h (s390_contiguous_bitmask_vector_p): Add prototype. (s390_constant_via_vgm_p): Add prototype. 
(s390_constant_via_vrepi_p): Add prototype. * config/s390/s390.cc (s390_contiguous_bitmask_vector_p): New function. (s390_constant_via_vgm_vrepi_helper): New function. (s390_constant_via_vgm_p): New function. (s390_constant_via_vgbm_p): For the sake of symmetry rename s390_bytemask_vector_p into s390_constant_via_vgbm_p. (s390_bytemask_vector_p): Deal with non-integer and partial vectors. (s390_constant_via_vrepi_p): New function. (s390_legitimate_constant_p): Allow partial vectors. (legitimate_reload_constant_p): Fix indentation. (legitimate_reload_vector_constant_p): Restrict to constraints j00, jm1, jxx, jyy, jzz only, i.e., allow partial vectors. (s390_expand_vec_init): Also make use of vrepi if possible. (print_operand): Add q,p,r for vgm,vrepi,vgbm, respectively. Remove e,s,t for constant vectors. * config/s390/s390.md (movti): Add variants utilizing vgbm,vgm,vrepi. * config/s390/vector.md (mov<mode><tf_vr>): Adapt variants for vgbm,vgm,vrepi for the new scheme. (mov<mode>): Adapt variants for vgbm,vgm for the new scheme and add vrepi variant for modes V_8,V_16,V_32,V_64. gcc/testsuite/ChangeLog: * gcc.target/s390/vector/vec-copysign.c: Change to non-extended mnemonic. * gcc.target/s390/vector/vec-genmask-1.c: Change to non-extended mnemonic. * gcc.target/s390/vector/vec-init-1.c: Change to non-extended mnemonic. * gcc.target/s390/vector/vec-vrepi-1.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/autovec-double-quiet-uneq.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/autovec-float-quiet-uneq.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/vec-genmask-1.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/vec-splat-1.c: Change to non-extended mnemonic. * gcc.target/s390/zvector/vec-splat-2.c: Change to non-extended mnemonic. * gcc.target/s390/vector/vgbm-double-1.c: New test. * gcc.target/s390/vector/vgbm-float-1.c: New test. * gcc.target/s390/vector/vgbm-int128-1.c: New test. 
* gcc.target/s390/vector/vgbm-integer-1.c: New test. * gcc.target/s390/vector/vgbm-longdouble-1.c: New test. * gcc.target/s390/vector/vgm-df-1.c: New test. * gcc.target/s390/vector/vgm-di-1.c: New test. * gcc.target/s390/vector/vgm-hi-1.c: New test. * gcc.target/s390/vector/vgm-int128-1.c: New test. * gcc.target/s390/vector/vgm-longdouble-1.c: New test. * gcc.target/s390/vector/vgm-qi-1.c: New test. * gcc.target/s390/vector/vgm-sf-1.c: New test. * gcc.target/s390/vector/vgm-si-1.c: New test. * gcc.target/s390/vector/vgm-tf-1.c: New test. * gcc.target/s390/vector/vgm-ti-1.c: New test. * gcc.target/s390/vector/vrepi-df-1.c: New test. * gcc.target/s390/vector/vrepi-di-1.c: New test. * gcc.target/s390/vector/vrepi-hi-1.c: New test. * gcc.target/s390/vector/vrepi-int128-1.c: New test. * gcc.target/s390/vector/vrepi-qi-1.c: New test. * gcc.target/s390/vector/vrepi-sf-1.c: New test. * gcc.target/s390/vector/vrepi-si-1.c: New test. * gcc.target/s390/vector/vrepi-tf-1.c: New test. * gcc.target/s390/vector/vrepi-ti-1.c: New test.
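The constants quoted in the commit message above can be checked with a small model of a single vector element. The following is only a sketch and is not code from the patch: the function names `vgm_element` and `vrepi_element` are hypothetical, and the model assumes that vgm sets the bits from the start position to the end position inclusive (bit 0 being the most significant bit, wrapping around when start exceeds end) and that vrepi sign-extends a 16-bit immediate to the element width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one vgm element: set bits start..end inclusive,
   counting bit 0 as the most significant bit of a width-bit element.
   When start > end the range wraps around the end of the element.  */
uint64_t vgm_element(unsigned width, unsigned start, unsigned end)
{
    assert(width >= 8 && width <= 64 && start < width && end < width);
    uint64_t mask = 0;
    for (unsigned bit = start;; bit = (bit + 1) % width) {
        mask |= (uint64_t)1 << (width - 1 - bit);
        if (bit == end)
            break;
    }
    return mask;
}

/* Hypothetical model of one vrepi element: sign-extend the 16-bit
   immediate, then truncate to the element width.  */
uint64_t vrepi_element(unsigned width, int16_t imm)
{
    assert(width >= 8 && width <= 64);
    uint64_t value = (uint64_t)(int64_t)imm;
    return width == 64 ? value : value & (((uint64_t)1 << width) - 1);
}
```

Under these assumptions, `vgm_element(64, 17, 46)` yields a doubleword whose high and low words are 0x7fff and 0xfffe0000, which is the (v4si) pattern from the message, and `vrepi_element(64, -16643)` yields the word pair {-1, -16643}.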
2024-07-12	s390: Fix output template for movv1qi	Stefan Schulze Frielinghaus	1	-2/+2
	Although for instructions MVI and MVIY it does not make a difference whether the immediate is interpreted as signed or unsigned, GAS expects unsigned immediates for instruction format SI_URD. gcc/ChangeLog: * config/s390/vector.md (mov<mode>): Fix output template for movv1qi.
2024-07-12	i386: Some AVX512 ternlog expansion refinements.	Roger Sayle	1	-48/+78
	This patch replaces the calls to force_reg in ix86_expand_ternlog_binop and ix86_expand_ternlog with gen_reg_rtx and emit_move_insn. This patch also cleans up whitespace, consistently uses CONST_VECTOR_P instead of GET_CODE and tweaks checks for ix86_ternlog_leaf_p (for example where vpandn may take a memory operand). 2024-07-12 Roger Sayle <roger@nextmovesoftware.com> Hongtao Liu <hongtao.liu@intel.com> gcc/ChangeLog * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Use CONST_VECTOR_P instead of comparison against GET_CODE. (ix86_gen_bcst_mem): Likewise. (ix86_ternlog_leaf_p): Likewise. (ix86_ternlog_operand_p): ix86_ternlog_leaf_p is always true for vector_all_ones_operand. (ix86_expand_ternlog_bin_op): Use CONST_VECTOR_P instead of equality comparison against GET_CODE. Replace call to force_reg with gen_reg_rtx and emit_move_insn (for VEC_DUPLICATE broadcast). Check for !register_operand instead of memory_operand. Support CONST_VECTORs by calling force_const_mem. (ix86_expand_ternlog): Fix indentation whitespace. Allow ix86_ternlog_leaf_p as ix86_expand_ternlog_andnot's second operand. Use CONST_VECTOR_P instead of equality against GET_CODE. Use gen_reg_rtx and emit_move_insn for ~a, ~b and ~c cases.
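For readers unfamiliar with ternlog, the sketch below models how an 8-bit immediate selects a three-input boolean function, which is the operation these expanders ultimately emit. It is not code from i386-expand.cc: the names `ternlog32`, `TERN_A`, `TERN_B` and `TERN_C` are hypothetical, and the model assumes the usual convention that each result bit is the immediate bit indexed by (a << 2) | (b << 1) | c.

```c
#include <stdint.h>

/* Truth-table columns for the three inputs.  Evaluating a boolean
   expression on these constants gives the immediate for that expression.  */
enum { TERN_A = 0xF0, TERN_B = 0xCC, TERN_C = 0xAA };

/* Hypothetical bitwise model of a ternary-logic operation on 32-bit
   values: each result bit is looked up in the 8-entry table imm.  */
uint32_t ternlog32(uint8_t imm, uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t result = 0;
    for (int i = 0; i < 32; i++) {
        unsigned idx = (((a >> i) & 1u) << 2)
                     | (((b >> i) & 1u) << 1)
                     | ((c >> i) & 1u);
        result |= (uint32_t)((imm >> idx) & 1u) << i;
    }
    return result;
}
```

With this model, the immediate for an and-not of the first two inputs is `~TERN_A & TERN_B` truncated to eight bits, and the immediate for ~a alone is `~TERN_A` truncated to eight bits.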
2024-07-12	s390: Align cjump_64 and icjump_64	Stefan Schulze Frielinghaus	1	-1/+2
	During machine reorg we optimize backward jumps and transform insns as e.g. (jump_insn 118 117 119 (set (pc) (if_then_else (ne (reg:CCRAW 33 %cc) (const_int 8 [0x8])) (label_ref 134) (pc))) "dec_math_1.f90":204:8 discrim 1 2161 {cjump_64} (expr_list:REG_DEAD (reg:CCRAW 33 %cc) (int_list:REG_BR_PROB 719407028 (nil))) -> 134) into (jump_insn 118 117 432 (set (pc) (if_then_else (ne (reg:CCRAW 33 %cc) (const_int 8 [0x8])) (pc) (label_ref 433))) "dec_math_1.f90":204:8 discrim 1 -1 (expr_list:REG_DEAD (reg:CCRAW 33 %cc) (int_list:REG_BR_PROB 719407028 (nil))) -> 433) The latter is not recognized anymore since icjump_64 only matches CC_REGNUM against zero. Fixed by aligning cjump_64 and icjump_64. gcc/ChangeLog: * config/s390/s390.md (*icjump_64): Allow raw CC comparisons, i.e., any constant integer between 0 and 15 for CC comparisons.
2024-07-12	aarch64: Avoid alloca in target attribute parsing	Richard Sandiford	1	-4/+8
	The handling of the target attribute used alloca to allocate a copy of unverified user input, which could exhaust the stack if the input is too long. This patch converts it to auto_vecs instead. I wondered about converting it to use std::string, which we already use elsewhere, but that would be more invasive and controversial. gcc/ * config/aarch64/aarch64.cc (aarch64_process_one_target_attr) (aarch64_process_target_attr): Avoid alloca.
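The general idea of the fix can be illustrated in plain C. The patch itself uses GCC's auto_vec; the sketch below swaps in malloc and free instead, and the function name `count_attr_tokens` and the comma-separated input format are assumptions made for illustration only. The point is that the working copy of unverified input lives on the heap, so an overly long string causes an allocation failure that can be reported rather than a stack overflow.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the non-empty comma-separated tokens in
   input.  The scratch copy is heap-allocated rather than taken from the
   stack with alloca.  Returns -1 if the allocation fails.  */
int count_attr_tokens(const char *input)
{
    size_t len = strlen(input);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, input, len + 1);

    int count = 0;
    char *token = copy;
    while (token != NULL) {
        char *comma = strchr(token, ',');
        if (comma != NULL)
            *comma = '\0';          /* terminate the current token */
        if (*token != '\0')
            count++;
        token = comma != NULL ? comma + 1 : NULL;
    }

    free(copy);
    return count;
}
```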
2024-07-12	[libstdc++] [testsuite] require dfprt on some tests	Alexandre Oliva	9	-9/+9
	On a target that doesn't enable decimal float components in libgcc (because the libc doesn't define all required FE_* macros), but whose compiler supports _Decimal* types, the effective target requirement dfp passes, but several tests won't link because the runtime support they depend on is missing. State their dfprt requirement. for libstdc++-v3/ChangeLog * testsuite/decimal/binary-arith.cc: Require dfprt. * testsuite/decimal/comparison.cc: Likewise. * testsuite/decimal/compound-assignment.cc: Likewise. * testsuite/decimal/compound-assignment-memfunc.cc: Likewise. * testsuite/decimal/make-decimal.cc: Likewise. * testsuite/decimal/pr54036-1.cc: Likewise. * testsuite/decimal/pr54036-2.cc: Likewise. * testsuite/decimal/pr54036-3.cc: Likewise. * testsuite/decimal/unary-arith.cc: Likewise.
2024-07-12	[alpha] adjust MEM alignment for block move [PR115459]	Alexandre Oliva	1	-0/+12
	Before issuing loads or stores for a block move, adjust the MEM alignments if analysis of the addresses enabled the inference of stricter alignment. This ensures that the MEMs are sufficiently aligned for the corresponding insns, which avoids trouble in case of e.g. substitutions into SUBREGs. for gcc/ChangeLog PR target/115459 * config/alpha/alpha.cc (alpha_expand_block_move): Adjust MEMs to match inferred alignment.