riscv-gnu-toolchain/gcc.git

Age	Commit message	Author	Files	Lines
2021-12-13	aarch64: Add command-line support for Armv8.8-a	Kyrylo Tkachov	1	-1/+3
	This final patch in the series is much simpler and adds command-line support for -march=armv8.8-a, making use of the +mops features added in the previous patches. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-arches.def (armv8.8-a): Define. * config/aarch64/aarch64.h (AARCH64_FL_V8_8): Define. (AARCH64_FL_FOR_ARCH8_8): Define. * doc/invoke.texi: Document -march=armv8.8-a.
2021-12-13	aarch64: Add support for Armv8.8-a memory operations and memcpy expansion	Kyrylo Tkachov	1	-0/+3
	This patch adds the +mops architecture extension flag from the 2021 Arm Architecture extensions, Armv8.8-a. The +mops extension introduces instructions to accelerate the memcpy, memset and memmove standard functions. The first patch here uses the instructions in the inline memcpy expansion. Further patches in the series will use similar instructions to inline memmove and memset. A new param, aarch64-mops-memcpy-size-threshold, is introduced to control the size threshold above which to emit the new sequence. Its default setting is 256 bytes, which is the same as the current threshold above which we'd emit a libcall. Bootstrapped and tested on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (mops): Define. * config/aarch64/aarch64.c (aarch64_expand_cpymem_mops): Define. (aarch64_expand_cpymem): Define. * config/aarch64/aarch64.h (AARCH64_FL_MOPS): Define. (AARCH64_ISA_MOPS): Define. (TARGET_MOPS): Define. (MOVE_RATIO): Adjust for TARGET_MOPS. * config/aarch64/aarch64.md ("unspec"): Add UNSPEC_CPYMEM. (aarch64_cpymemdi): New pattern. (cpymemdi): Adjust for TARGET_MOPS. * config/aarch64/aarch64.opt (aarch64-mops-memcpy-size-threshold): New param. * doc/invoke.texi (AArch64 Options): Document +mops. gcc/testsuite/ChangeLog: * gcc.target/aarch64/mops_1.c: New test.
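As a rough illustration of the kind of source the inline memcpy expansion is aimed at, the sketch below copies a fixed-size buffer with `std::memcpy`. The `Packet` type and `copy_packet` function are hypothetical names invented for this example. Whether the compiler emits the +mops sequence, some other inline expansion, or a libcall for this copy depends on the target flags (for example -march=armv8.8-a) and on the aarch64-mops-memcpy-size-threshold param described above, so this is not a guaranteed code-generation result.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical payload type. 512 bytes is above the 256-byte default
// threshold mentioned in the commit message.
struct Packet {
  unsigned char bytes[512];
};

// A plain fixed-size copy. The size is a compile-time constant, which is
// the situation in which an inline memcpy expansion can be considered.
void copy_packet(Packet *dst, const Packet *src) {
  std::memcpy(dst, src, sizeof(Packet));
}
```

The function behaves identically on every target; only the instructions chosen for the copy may differ.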
2021-12-13	docs: add missing @item for the first item	Martin Liska	1	-1/+1
	gcc/ChangeLog: * doc/extend.texi: Use @item for the first @itemx entry.
2021-12-09	Limit inlining functions called once	Jan Hubicka	1	-0/+8
	As discussed in PR ipa/103454, several benchmarks regress with -finline-functions-called-once. Runtimes: - tramp3d with -Ofast: 31% - exchange2 with -Ofast: 11-21% - roms with -O2: 9-10% - tonto: 2.5-3.5% with LTO. Build times: - specfp2006: 41% (mostly wrf, which builds 71% faster) - specint2006: 1.5-3% - specfp2017: 64% (again mostly wrf) - specint2017: 2.5-3.5%. This patch adds two params to tweak the behaviour: 1) max-inline-functions-called-once-loop-depth, limiting the loop depth (this is useful primarily for exchange, where the inlined function is at loop depth 9) 2) max-inline-functions-called-once-insns. We already have the large-function-insns/growth parameters, but these also limit inlining of small functions, so reducing them would regress very large functions that are hot. Because inlining functions called once is meant just as a cleanup pass, I think it makes sense to have a separate limit for it. gcc/ChangeLog: 2021-12-09 Jan Hubicka <hubicka@ucw.cz> * doc/invoke.texi (max-inline-functions-called-once-loop-depth, max-inline-functions-called-once-insns): New parameters. * ipa-inline.c (check_callers): Handle param_inline_functions_called_once_loop_depth and param_inline_functions_called_once_insns. (edge_badness): Fix linebreaks. * params.opt (param=max-inline-functions-called-once-loop-depth, param=max-inline-functions-called-once-insn): New params.
2021-12-09	Avoid expecting nonzero size for access none void* arguments [PR101751].	Martin Sebor	1	-2/+3
	Resolves: PR middle-end/101751 - attribute access none with void pointer expects nonzero size gcc/ChangeLog: PR middle-end/101751 * doc/extend.texi (attribute access): Adjust. * gimple-ssa-warn-access.cc (pass_waccess::maybe_check_access_sizes): Treat access mode none on a void* argument as expecting as few as zero bytes. gcc/testsuite/ChangeLog: PR middle-end/101751 * gcc.dg/Wstringop-overflow-86.c: New test.
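A minimal sketch of the situation this fix concerns: a function that takes a `void *` purely as an opaque key and never dereferences it, declared with access mode none. The names `ACCESS_NONE`, `remember` and `is_remembered` are hypothetical. The macro expands to nothing on compilers that are not GCC 11 or later, on the assumption that the none mode is unavailable there.

```cpp
// Hypothetical helper macro: apply the attribute only where it is assumed
// to be supported (GCC 11 or later), otherwise expand to nothing.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
#define ACCESS_NONE(idx) __attribute__((access(none, idx)))
#else
#define ACCESS_NONE(idx)
#endif

static const void *remembered_key = nullptr;

// The pointer is stored and compared but never dereferenced, so the
// function needs zero accessible bytes behind it.
ACCESS_NONE(1) void remember(const void *key) { remembered_key = key; }

ACCESS_NONE(1) bool is_remembered(const void *key) {
  return key == remembered_key;
}
```

Passing a one-past-the-end pointer such as `buf + sizeof buf` is a typical call where no bytes are accessible behind the argument.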
2021-12-09	pch: Add support for relocation of the PCH data [PR71934]	Jakub Jelinek	1	-2/+2
	The following patch adds support for relocation of the PCH blob on PCH restore if we don't manage to get the preferred map slot for it. The GTY stuff knows where all the pointers are, after all it relocates it once during PCH save from the addresses where it was initially allocated to addresses in the preferred map slot. But, if we were to do it solely using GTY info upon PCH restore, we'd need another set of GTY functions, which I think would make it less maintainable and I think it would also be more costly at PCH restore time. Those functions would need to call something to add bias to pointers that haven't been marked yet and make sure not to add bias to any pointer twice. So, this patch instead builds a relocation table (sorted list of addresses in the blob which needs relocation) at PCH save time, stores it in a very compact form into the gch file and upon restore, adjusts pointers in GTY roots (that is right away in the root structures) and the addresses in the relocation table. The cost on stdc++.gch/O2g.gch (previously 85MB large) is about 3% file size growth, there are 2.5 million pointers that need relocation in the gch blob and the relocation table uses uleb128 for address deltas and needs ~1.01 bytes for one address that needs relocation, and about 20% compile time during PCH save (I think it is mainly because of the need to qsort those 2.5 million pointers). On PCH restore, if it doesn't need relocation (the usual case), it is just an extra fread of sizeof (size_t) data and fseek (in my tests real time on vanilla tree for #include <bits/stdc++.h> CU was ~0.175s and with the patch but no relocation ~0.173s), while if it needs relocation it took ~0.193s, i.e. 11.5% slower. Without PCH that #include <bits/stdc++.h> int i; testcase compiles with -O2 -g in ~1.199s, i.e. 6.2 times slower than PCH with relocation and 6.9 times than PCH without relocation. 
The discovery of the pointers in the blob that need relocation is done in the relocate_ptrs hook which does the pointer relocation during PCH save. Unfortunately, I had to make one change to the gengtype stuff due to the nested_ptr feature of GTY, which some libcpp headers and stringpool.c use. The relocate_ptrs hook had 2 arguments, pointer to the pointer and a cookie. When relocate_ptrs is done, in most cases it is called solely on the subfields of the current object, so e.g. if ((void *)(x) == this_obj) op (&((*x).u.fld[0].rt_rtx), cookie); so relocate_ptrs can assert that ptr_p is within the state->ptrs[state->ptrs_i]->obj .. state->ptrs[state->ptrs_i]->obj+state->ptrs[state->ptrs_i]->size-sizeof(void*) range and compute from that the address in the blob which will need relocation (state->ptrs[state->ptrs_i]->new_addr is the new address given to it and ptr_p-state->ptrs[state->ptrs_i]->obj is the relative offset). Unfortunately, for nested_ptr gengtype emits something like: { union tree_node * x0 = ((*x).val.node.node) ? HT_IDENT_TO_GCC_IDENT (HT_NODE (((*x).val.node.node))) : NULL; if ((void *)(x) == this_obj) op (&(x0), cookie); (*x).val.node.node = (x0) ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT ((x0))) : NULL; } so relocate_ptrs is called with an address of some temporary variable and so doesn't know where the pointer will finally be. So, I've added another argument to relocate_ptrs (and to gt_pointer_operator). For the most common case I pass NULL as the new middle argument to that function, first one remains pointer to the pointer that needs adjustment and last the cookie. The NULL seems to be cheap to compute and short in the gt*.[ch] files and stands for ptr_p is an address within the this_obj's range, remember its address. For the nested_ptr case, the new middle argument contains actual address of the pointer that might need to be relocated, so instead of the above op (&(x0), &((*x).val.node.node), cookie); in there. And finally, e.g. 
for the reorder case I need a way to tell restore_ptrs to ignore a particular address for the relocation purposes and only treat it the old way. I've used for that the case when the first and second arguments are equal. In order to enable support for mapping PCH as fallback at different addresses than the preferred ones, a small change is needed to the host pch_use_address hooks. One change I've done to all of them is the change of the type of the first argument from void * to void *&, such that the actual address can be told to the callers (or shall I instead use void **?), but another change that still needs to be done in them if they want the relocation is to actually not fail if they couldn't get a preferred address, but instead modify what the first argument refers to. I've done that only for host-linux.c and Iain is testing similar change for host-darwin.c. Didn't change hpux, netbsd, openbsd, solaris, mingw32 or the fallbacks because I can't test those. Tested also with the: --- gcc/config/host-linux.c.jj 2021-12-06 22:22:42.007777367 +0100 +++ gcc/config/host-linux.c 2021-12-07 00:21:53.052674040 +0100 @@ -191,6 +191,8 @@ linux_gt_pch_use_address (void *&base, s if (size == 0) return -1; +base = (char *) base + ((size + 8191) & (size_t) -4096); + /* Try to map the file with MAP_PRIVATE. */ addr = mmap (base, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, offset); hack which forces all PCH restores to be relocated. An earlier version of the patch has been also regtested with base = (char *) base + 16384; in that spot, so both relocation to a non-overlapping spot and to an overlapping spot have been tested. 2021-12-09 Jakub Jelinek <jakub@redhat.com> PR pch/71934 * coretypes.h (gt_pointer_operator): Use 3 pointer arguments instead of two. * gengtype.c (struct walk_type_data): Add in_nested_ptr argument. (walk_type): Temporarily set d->in_nested_ptr around nested_ptr handling. 
(write_types_local_user_process_field): Pass a new middle pointer to gt_pointer_operator op calls, if d->in_nested_ptr pass there address of d->prev_val[2], otherwise NULL. (write_types_local_process_field): Likewise. * ggc-common.c (relocate_ptrs): Add real_ptr_p argument. If equal to ptr_p, do nothing, otherwise if NULL remember ptr_p's or if non-NULL real_ptr_p's corresponding new address in reloc_addrs_vec. (reloc_addrs_vec): New variable. (compare_ptr, read_uleb128, write_uleb128): New functions. (gt_pch_save): When iterating over objects through relocate_ptrs, save current i into state.ptrs_i. Sort reloc_addrs_vec and emit it as uleb128 of differences between pointer addresses into the PCH file. (gt_pch_restore): Allow restoring of PCH to a different address than the preferred one, in that case adjust global pointers by bias and also adjust by bias addresses read from the relocation table as uleb128 differences. Otherwise fseek over it. Perform gt_pch_restore_stringpool only after adjusting callbacks and for callback adjustments also take into account the bias. (default_gt_pch_use_address): Change type of first argument from void * to void *&. (mmap_gt_pch_use_address): Likewise. * ggc-tests.c (gt_pch_nx): Pass NULL as new middle argument to op. * hash-map.h (hash_map::pch_nx_helper): Likewise. (gt_pch_nx): Likewise. * hash-set.h (gt_pch_nx): Likewise. * hash-table.h (gt_pch_nx): Likewise. * hash-traits.h (ggc_remove::pch_nx): Likewise. * hosthooks-def.h (default_gt_pch_use_address): Change type of first argument from void * to void *&. (mmap_gt_pch_use_address): Likewise. * hosthooks.h (struct host_hooks): Change type of first argument of gt_pch_use_address hook from void * to void *&. * machmode.h (gt_pch_nx): Expect a callback with 3 pointers instead of two in the middle argument. * poly-int.h (gt_pch_nx): Likewise. * stringpool.c (gt_pch_nx): Pass NULL as new middle argument to op. 
* tree-cfg.c (gt_pch_nx): Likewise, except for LOCATION_BLOCK pass the same &(block) twice. * value-range.h (gt_pch_nx): Pass NULL as new middle argument to op. * vec.h (gt_pch_nx): Likewise. * wide-int.h (gt_pch_nx): Likewise. * config/host-darwin.c (darwin_gt_pch_use_address): Change type of first argument from void * to void *&. * config/host-darwin.h (darwin_gt_pch_use_address): Likewise. * config/host-hpux.c (hpux_gt_pch_use_address): Likewise. * config/host-linux.c (linux_gt_pch_use_address): Likewise. If it couldn't succeed to mmap at the preferred location, set base to the actual one. Update addr in the manual reading loop instead of base. * config/host-netbsd.c (netbsd_gt_pch_use_address): Change type of first argument from void * to void *&. * config/host-openbsd.c (openbsd_gt_pch_use_address): Likewise. * config/host-solaris.c (sol_gt_pch_use_address): Likewise. * config/i386/host-mingw32.c (mingw32_gt_pch_use_address): Likewise. * config/rs6000/rs6000-gen-builtins.c (write_init_file): Pass NULL as new middle argument to op in the generated code. * doc/gty.texi: Adjust samples for the addition of middle pointer to gt_pointer_operator callback. gcc/ada/ * gcc-interface/decl.c (gt_pch_nx): Pass NULL as new middle argument to op. gcc/c-family/ * c-pch.c (c_common_no_more_pch): Pass a temporary void * var with NULL value instead of NULL to host_hooks.gt_pch_use_address. gcc/c/ * c-decl.c (resort_field_decl_cmp): Pass the same pointer twice to resort_data.new_value. gcc/cp/ * module.cc (nop): Add another void * argument. * name-lookup.c (resort_member_name_cmp): Pass the same pointer twice to resort_data.new_value.
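The relocation-table idea described in this commit message (sort the addresses that need relocation, store the differences between neighbours as uleb128, and on restore add a bias to each recorded slot) can be sketched as follows. This is not the code in ggc-common.c. All names here (`encode_uleb128`, `decode_uleb128`, `encode_reloc_table`, `decode_reloc_table`, `apply_bias`) are hypothetical, the blob is modelled as a vector of 64-bit words, and the table holds byte offsets into that blob rather than real addresses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` in unsigned LEB128 form: 7 payload bits per byte, with the
// high bit set on every byte except the last.
void encode_uleb128(std::vector<std::uint8_t> &out, std::uint64_t value) {
  do {
    std::uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0) byte |= 0x80;
    out.push_back(byte);
  } while (value != 0);
}

// Read one unsigned LEB128 value starting at `pos` and advance `pos`.
// Assumes well-formed input (no bounds or overflow checking in this sketch).
std::uint64_t decode_uleb128(const std::vector<std::uint8_t> &in,
                             std::size_t &pos) {
  std::uint64_t value = 0;
  unsigned shift = 0;
  for (;;) {
    std::uint8_t byte = in[pos++];
    value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) break;
    shift += 7;
  }
  return value;
}

// Sort the offsets and store each one as the delta from its predecessor.
std::vector<std::uint8_t> encode_reloc_table(std::vector<std::uint64_t> offsets) {
  std::sort(offsets.begin(), offsets.end());
  std::vector<std::uint8_t> table;
  std::uint64_t prev = 0;
  for (std::uint64_t off : offsets) {
    encode_uleb128(table, off - prev);
    prev = off;
  }
  return table;
}

// Rebuild the sorted offsets by accumulating the stored deltas.
std::vector<std::uint64_t> decode_reloc_table(const std::vector<std::uint8_t> &table) {
  std::vector<std::uint64_t> offsets;
  std::size_t pos = 0;
  std::uint64_t prev = 0;
  while (pos < table.size()) {
    prev += decode_uleb128(table, pos);
    offsets.push_back(prev);
  }
  return offsets;
}

// Add `bias` to every 8-byte slot whose byte offset is listed in the table.
void apply_bias(std::vector<std::uint64_t> &blob,
                const std::vector<std::uint8_t> &table, std::uint64_t bias) {
  for (std::uint64_t off : decode_reloc_table(table))
    blob[off / 8] += bias;
}
```

Because the offsets are sorted and pointers tend to sit close together, most deltas fit in a single byte, which is consistent with the roughly 1.01 bytes per relocated address reported in the message.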
2021-12-07	MIPS: R6: load/store can process unaligned address	YunQiang Su	1	-0/+10
	MIPS release 6 requires that lw/ld/sw/sd work with unaligned addresses; this may be implemented fully in hardware or by trap-and-emulate. Since it doesn't have to be done fully in hardware, we add a pair of options, -m(no-)unaligned-access. Kernels may need them. gcc/ChangeLog: * config/mips/mips.h (ISA_HAS_UNALIGNED_ACCESS, STRICT_ALIGNMENT): R6 can unaligned access. * config/mips/mips.md (movmisalign<mode>): Likewise. * config/mips/mips.opt: add -m(no-)unaligned-access * doc/invoke.texi: Likewise. gcc/testsuite/ChangeLog: * gcc.target/mips/mips.exp: add unaligned-access * gcc.target/mips/unaligned-2.c: New test. * gcc.target/mips/unaligned-3.c: New test.
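For context, a portable way to express an unaligned load in source is to copy through `std::memcpy`, as in the sketch below. The function name `load_u32_unaligned` is hypothetical. How the compiler lowers the copy (a single word load, or a sequence that tolerates misalignment) is a target decision that options such as -m(no-)unaligned-access may influence, and this example makes no claim about the exact instructions chosen.

```cpp
#include <cstdint>
#include <cstring>

// Read a 32-bit value from an address that may not be 4-byte aligned.
// Going through memcpy avoids dereferencing a misaligned pointer in source.
std::uint32_t load_u32_unaligned(const unsigned char *p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}
```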
2021-12-03	x86: Add -mmove-max=bits and -mstore-max=bits	H.J. Lu	1	-0/+13
	Add -mmove-max=bits and -mstore-max=bits to enable 256-bit/512-bit move and store, independent of -mprefer-vector-width=bits: 1. Add X86_TUNE_AVX512_MOVE_BY_PIECES and X86_TUNE_AVX512_STORE_BY_PIECES which are enabled for Intel Sapphire Rapids processor. 2. Add -mmove-max=bits to set the maximum number of bits can be moved from memory to memory efficiently. The default value is derived from X86_TUNE_AVX512_MOVE_BY_PIECES, X86_TUNE_AVX256_MOVE_BY_PIECES, and the preferred vector width. 3. Add -mstore-max=bits to set the maximum number of bits can be stored to memory efficiently. The default value is derived from X86_TUNE_AVX512_STORE_BY_PIECES, X86_TUNE_AVX256_STORE_BY_PIECES and the preferred vector width. gcc/ PR target/103269 * config/i386/i386-expand.c (ix86_expand_builtin): Pass PVW_NONE and PVW_NONE to ix86_target_string. * config/i386/i386-options.c (ix86_target_string): Add arguments for move_max and store_max. (ix86_target_string::add_vector_width): New lambda. (ix86_debug_options): Pass ix86_move_max and ix86_store_max to ix86_target_string. (ix86_function_specific_print): Pass ptr->x_ix86_move_max and ptr->x_ix86_store_max to ix86_target_string. (ix86_valid_target_attribute_tree): Handle x_ix86_move_max and x_ix86_store_max. (ix86_option_override_internal): Set the default x_ix86_move_max and x_ix86_store_max. * config/i386/i386-options.h (ix86_target_string): Add prefer_vector_width and prefer_vector_width. * config/i386/i386.h (TARGET_AVX256_MOVE_BY_PIECES): Removed. (TARGET_AVX256_STORE_BY_PIECES): Likewise. (MOVE_MAX): Use 64 if ix86_move_max or ix86_store_max == PVW_AVX512. Use 32 if ix86_move_max or ix86_store_max >= PVW_AVX256. (STORE_MAX_PIECES): Use 64 if ix86_store_max == PVW_AVX512. Use 32 if ix86_store_max >= PVW_AVX256. * config/i386/i386.opt: Add -mmove-max=bits and -mstore-max=bits. * config/i386/x86-tune.def (X86_TUNE_AVX512_MOVE_BY_PIECES): New. (X86_TUNE_AVX512_STORE_BY_PIECES): Likewise. 
* doc/invoke.texi: Document -mmove-max=bits and -mstore-max=bits. gcc/testsuite/ PR target/103269 * gcc.target/i386/pieces-memcpy-17.c: New test. * gcc.target/i386/pieces-memcpy-18.c: Likewise. * gcc.target/i386/pieces-memcpy-19.c: Likewise. * gcc.target/i386/pieces-memcpy-20.c: Likewise. * gcc.target/i386/pieces-memcpy-21.c: Likewise. * gcc.target/i386/pieces-memset-45.c: Likewise. * gcc.target/i386/pieces-memset-46.c: Likewise. * gcc.target/i386/pieces-memset-47.c: Likewise. * gcc.target/i386/pieces-memset-48.c: Likewise. * gcc.target/i386/pieces-memset-49.c: Likewise.
2021-12-03	Add TARGET_IFUNC_REF_LOCAL_OK	H.J. Lu	2	-0/+7
	1. On some targets, like PowerPC, reference to ifunc function resolver must be non-local so that compiler will properly emit PLT call. Add TARGET_IFUNC_REF_LOCAL_OK to allow binding indirect function resolver locally for targets which don't require special PLT call sequence. 2. Add ix86_call_use_plt_p to call local ifunc function resolvers via PLT. gcc/ PR target/51469 PR target/83782 * target.def (ifunc_ref_local_ok): Add a target hook. * varasm.c (default_binds_local_p_3): Force indirect function resolver non-local only if targetm.ifunc_ref_local_ok returns false. * config/i386/i386-expand.c (ix86_expand_call): Call ix86_call_use_plt_p to check if PLT should be used. * config/i386/i386-protos.h (ix86_call_use_plt_p): New. * config/i386/i386.c (output_pic_addr_const): Call ix86_call_use_plt_p to check if "@PLT" is needed. (ix86_call_use_plt_p): New. (TARGET_IFUNC_REF_LOCAL_OK): New. * doc/tm.texi.in: Add TARGET_IFUNC_REF_LOCAL_OK. * doc/tm.texi: Regenerated. gcc/testsuite/ PR target/51469 PR target/83782 * gcc.target/i386/pr83782-1.c: New test. * gcc.target/i386/pr83782-2.c: Likewise.
2021-12-03	pch: Add support for PCH for relocatable executables [PR71934]	Jakub Jelinek	1	-0/+9
	So, if we want to make PCH work for PIEs, I'd say we can: 1) add a new GTY option, say callback, which would act like skip for non-PCH and for PCH would make us skip it but remember for address bias translation 2) drop the skip for tree_translation_unit_decl::language 3) change get_unnamed_section to have const char * as last argument instead of const void *, change unnamed_section::data also to const char * and update everything related to that 4) maybe add a host hook whether it is ok to support binaries changing addresses (the only thing I'm worried is if some host that uses function descriptors allocates them dynamically instead of having them somewhere in the executable) 5) maybe add a gengtype warning if it sees in GTY tracked structure a function pointer without that new callback option Here is 1), 2), 3) implemented. Note, on stdc++.h.gch/O2g.gch there are just those 10 relocations without the second patch, with it a few more, but nothing huge. And for non-PIEs there isn't really any extra work on the load side except freading two scalar values and fseek. 2021-12-03 Jakub Jelinek <jakub@redhat.com> PR pch/71934 gcc/ * ggc.h (gt_pch_note_callback): Declare. * gengtype.h (enum typekind): Add TYPE_CALLBACK. (callback_type): Declare. * gengtype.c (dbgprint_count_type_at): Handle TYPE_CALLBACK. (callback_type): New variable. (process_gc_options): Add CALLBACK argument, handle callback option. (set_gc_used_type): Adjust process_gc_options caller, if callback, set type to &callback_type. (output_mangled_typename): Handle TYPE_CALLBACK. (walk_type): Likewise. Handle callback option. (write_types_process_field): Handle TYPE_CALLBACK. (write_types_local_user_process_field): Likewise. (write_types_local_process_field): Likewise. (write_root): Likewise. (dump_typekind): Likewise. (dump_type): Likewise. * gengtype-state.c (type_lineloc): Handle TYPE_CALLBACK. (state_writer::write_state_callback_type): New method. (state_writer::write_state_type): Handle TYPE_CALLBACK. 
(read_state_callback_type): New function. (read_state_type): Handle TYPE_CALLBACK. * ggc-common.c (callback_vec): New variable. (gt_pch_note_callback): New function. (gt_pch_save): Stream out gt_pch_save function address and relocation table. (gt_pch_restore): Stream in saved gt_pch_save function address and relocation table and apply relocations if needed. * doc/gty.texi (callback): Document new GTY option. * varasm.c (get_unnamed_section): Change callback argument's type and last argument's type from const void * to const char *. (output_section_asm_op): Change argument's type from const void * to const char *, remove unnecessary cast. * tree-core.h (struct tree_translation_unit_decl): Drop GTY((skip)) from language member. * output.h (unnamed_section_callback): Change argument type from const void * to const char *. (struct unnamed_section): Use GTY((callback)) instead of GTY((skip)) for callback member. Change data member type from const void * to const char *. (struct noswitch_section): Use GTY((callback)) instead of GTY((skip)) for callback member. (get_unnamed_section): Change callback argument's type and last argument's type from const void * to const char *. (output_section_asm_op): Change argument's type from const void * to const char *. * config/avr/avr.c (avr_output_progmem_section_asm_op): Likewise. Remove unneeded cast. * config/darwin.c (output_objc_section_asm_op): Change argument's type from const void * to const char *. * config/pa/pa.c (som_output_text_section_asm_op): Likewise. (som_output_comdat_data_section_asm_op): Likewise. * config/rs6000/rs6000.c (rs6000_elf_output_toc_section_asm_op): Likewise. (rs6000_xcoff_output_readonly_section_asm_op): Likewise. Instead of dereferencing directive hardcode variable names and decide based on whether directive is NULL or not. (rs6000_xcoff_output_readwrite_section_asm_op): Change argument's type from const void * to const char *. (rs6000_xcoff_output_tls_section_asm_op): Likewise. 
Instead of dereferencing directive hardcode variable names and decide based on whether directive is NULL or not. (rs6000_xcoff_output_toc_section_asm_op): Change argument's type from const void * to const char *. (rs6000_xcoff_asm_init_sections): Adjust get_unnamed_section callers. gcc/c-family/ * c-pch.c (struct c_pch_validity): Remove pch_init member. (pch_init): Don't initialize v.pch_init. (c_common_valid_pch): Don't warn and punt if .text addresses change. libcpp/ * include/line-map.h (class line_maps): Add GTY((callback)) to reallocator and round_alloc_size members.
2021-12-02	doc: Remove references to FreeBSD 1 and 2	Gerald Pfeifer	1	-4/+0
	FreeBSD 1 and FreeBSD 2, both still a.out, have been end of life for over two decades and GCC has not been supporting them for ages, too, so simply remove references. gcc: * doc/install.texi (--freebsd*): Remove references to FreeBSD 1 and FreeBSD 2.
2021-12-02	Implement -fprofile-prefix-map.	Martin Liska	1	-2/+12
	PR gcov-profile/96092 gcc/ChangeLog: * common.opt: New option. * coverage.c (coverage_begin_function): Emit filename with remap_profile_filename. * doc/invoke.texi: Document the new option. * file-prefix-map.c (add_profile_prefix_map): New. (remap_profile_filename): Likewise. * file-prefix-map.h (add_profile_prefix_map): Likewise. (remap_profile_filename): Likewise. * lto-opts.c (lto_write_options): Handle OPT_fprofile_prefix_map_. * opts-global.c (handle_common_deferred_options): Likewise. * opts.c (common_handle_option): Likewise. (gen_command_line_string): Likewise. * profile.c (output_location): Emit filename with remap_profile_filename.
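Conceptually, a prefix map rewrites the leading directory of a recorded filename. The sketch below shows that idea in isolation. `remap_prefix` is a hypothetical helper and not GCC's remap_profile_filename. The first-match-wins rule is an assumption made for this example and says nothing about the order in which GCC consults multiple maps.

```cpp
#include <string>
#include <utility>
#include <vector>

// Each map entry is an (old prefix, new prefix) pair. The first entry whose
// old prefix matches the start of `filename` is applied; non-matching
// filenames are returned unchanged.
std::string remap_prefix(
    const std::string &filename,
    const std::vector<std::pair<std::string, std::string>> &maps) {
  for (const auto &m : maps) {
    const std::string &old_prefix = m.first;
    if (filename.compare(0, old_prefix.size(), old_prefix) == 0)
      return m.second + filename.substr(old_prefix.size());
  }
  return filename;
}
```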
2021-12-01	c++: constexpr, fold, weak redecl, fp/0 [PR103310]	Jason Merrill	1	-0/+14
	For PR61825, honza changed tree_single_nonzero_warnv_p to prevent a later declaration from marking a function as weak after we've determined that it wasn't weak before. But we shouldn't do that for speculative folding; we should only do it when we actually need a constant value. In C++, such a context is called "manifestly constant-evaluated". In fold, this seems to correspond to the folding_initializer flag, since in C this situation only occurs in static initializers. This change makes nonzero-1.c well-formed; I've added a nonzero-1a.c to verify that we delete the null check eventually if there is no weak redeclaration. The varasm.c change is so that if we do get the weak redeclaration error, we get it at the position of the weak declaration rather than the previous declaration. Using the FOLD_INIT paths also affects floating point arithmetic: notably, this makes floating point division by zero in a manifestly constant-evaluated context constant, as in a C static initializer. I've had some success convincing CWG that this is the right direction; C++ should follow C's floating point semantics more than we have been doing, and Joseph says that the C policy is that Annex F overrides other parts of the standard that say that some operations are undefined. But since we're in stage 3, I'm only making this change with the new flag -fconstexpr-fp-except. It may turn on by default in a future release. I think this distinction is only relevant for binary operations; arithmetic for the floating point case, comparison for possibly non-zero addresses. PR c++/103310 gcc/ChangeLog: * fold-const.c (maybe_nonzero_address): Use get_create or get depending on folding_initializer. (fold_binary_initializer_loc): New. * fold-const.h (fold_binary_initializer_loc): Declare. * varasm.c (mark_weak): Don't use the decl location. * doc/invoke.texi: Document -fconstexpr-fp-except. gcc/c-family/ChangeLog: * c.opt: Add -fconstexpr-fp-except. 
gcc/cp/ChangeLog: * constexpr.c (cxx_eval_binary_expression): Use fold_binary_initializer_loc if manifestly cxeval. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-fp-except1.C: New test. * g++.dg/cpp1z/constexpr-if36.C: New test. * gcc.dg/tree-ssa/nonzero-1.c: Now well-formed. * gcc.dg/tree-ssa/nonzero-1a.c: New test.
2021-12-01	doc, d: Add note that D front end now requires GDC installed in order to bootstrap.	Iain Buclaw	1	-0/+28
	gcc/ChangeLog: * doc/install.texi (Prerequisites): Add note that D front end now requires GDC installed in order to bootstrap. (Building): Add D compiler section, referencing prerequisites.
2021-11-30	vect: Support gather loads with SLP	Richard Sandiford	1	-0/+4
	This patch adds SLP support for IFN_GATHER_LOAD. Like the SLP support for IFN_MASK_LOAD, it works by treating only some of the arguments as child nodes. Unlike IFN_MASK_LOAD, it requires the other arguments (base, scale, and extension type) to be the same for all calls in the group. It does not require/expect the loads to be in a group (which probably wouldn't make sense for gathers). I was worried about the possible alias effect of moving gathers around to be part of the same SLP group. The patch therefore makes vect_analyze_data_ref_dependence treat gathers and scatters as a top-level concern, punting if the accesses aren't completely independent and if the user hasn't told us that a particular VF is safe. I think in practice we already punted in the same circumstances; the idea is just to make it more explicit. gcc/ PR tree-optimization/102467 * doc/sourcebuild.texi (vect_gather_load_ifn): Document. * tree-vect-data-refs.c (vect_analyze_data_ref_dependence): Commonize safelen handling. Punt for anything involving gathers and scatters unless safelen says otherwise. * tree-vect-slp.c (arg1_map): New variable. (vect_get_operand_map): Handle IFN_GATHER_LOAD. (vect_build_slp_tree_1): Likewise. (vect_build_slp_tree_2): Likewise. (compatible_calls_p): If vect_get_operand_map returns nonnull, check that any skipped arguments are equal. (vect_slp_analyze_node_operations_1): Tighten reduction check. * tree-vect-stmts.c (check_load_store_for_partial_vectors): Take an ncopies argument. (vect_get_gather_scatter_ops): Take slp_node and ncopies arguments. Handle SLP nodes. (vectorizable_store, vectorizable_load): Adjust accordingly. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_gather_load_ifn): New target test. * gcc.dg/vect/vect-gather-1.c: New test. * gcc.dg/vect/vect-gather-2.c: Likewise. * gcc.target/aarch64/sve/gather_load_11.c: Likewise.
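As a hedged illustration of the kind of source that involves grouped indexed loads, the loop below fills two adjacent output elements per iteration, each loaded through an index array. The function name `gather_pairs` is hypothetical. Whether a given compiler and target actually vectorize this loop using gather loads and SLP depends on the target's gather support and on the vectorizer's cost and dependence analysis, so nothing here is a promise about generated code.

```cpp
#include <cstddef>

// For each i, load two table elements selected by consecutive entries of
// idx and store them to consecutive entries of out. `n` counts pairs, so
// out and idx must each hold at least 2 * n elements.
void gather_pairs(double *out, const double *table, const int *idx,
                  std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[2 * i] = table[idx[2 * i]];
    out[2 * i + 1] = table[idx[2 * i + 1]];
  }
}
```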
2021-11-30	vect: Add support for fmax and fmin reductions	Richard Sandiford	1	-0/+8
	This patch adds support for reductions involving calls to fmax() and fmin(), without the -ffast-math flags that allow them to be converted to MAX_EXPR and MIN_EXPR. gcc/ * doc/md.texi (reduc_fmin_scal_@var{m}): Document. (reduc_fmax_scal_@var{m}): Likewise. * optabs.def (reduc_fmax_scal_optab): New optab. (reduc_fmin_scal_optab): Likewise * internal-fn.def (REDUC_FMAX, REDUC_FMIN): New functions. * tree-vect-loop.c (reduction_fn_for_scalar_code): Handle CASE_CFN_FMAX and CASE_CFN_FMIN. (neutral_op_for_reduction): Likewise. (needs_fold_left_reduction_p): Likewise. * config/aarch64/iterators.md (FMAXMINV): New iterator. (fmaxmin): Handle UNSPEC_FMAXNMV and UNSPEC_FMINNMV. * config/aarch64/aarch64-simd.md (reduc_<optab>_scal_<mode>): Fix unspec mode. (reduc_<fmaxmin>_scal_<mode>): New pattern. * config/aarch64/aarch64-sve.md (reduc_<fmaxmin>_scal_<mode>): Likewise. gcc/testsuite/ * gcc.dg/vect/vect-fmax-1.c: New test. * gcc.dg/vect/vect-fmax-2.c: Likewise. * gcc.dg/vect/vect-fmax-3.c: Likewise. * gcc.dg/vect/vect-fmin-1.c: New test. * gcc.dg/vect/vect-fmin-2.c: Likewise. * gcc.dg/vect/vect-fmin-3.c: Likewise. * gcc.target/aarch64/fmaxnm_1.c: Likewise. * gcc.target/aarch64/fmaxnm_2.c: Likewise. * gcc.target/aarch64/fminnm_1.c: Likewise. * gcc.target/aarch64/fminnm_2.c: Likewise. * gcc.target/aarch64/sve/fmaxnm_2.c: Likewise. * gcc.target/aarch64/sve/fmaxnm_3.c: Likewise. * gcc.target/aarch64/sve/fminnm_2.c: Likewise. * gcc.target/aarch64/sve/fminnm_3.c: Likewise.
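A reduction of the shape this commit message describes might look like the sketch below, which folds an array with `std::fmax`. The function name `max_of` is hypothetical. Because `fmax` returns the other operand when one operand is a NaN, it is not the same operation as a plain maximum comparison, which is why the message distinguishes it from the MAX_EXPR form allowed under -ffast-math. Whether the loop is vectorized depends on the target and compiler version.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Reduce an array with fmax. Starting from negative infinity means an empty
// array yields -inf, and NaN elements are ignored whenever the running
// value is a number.
double max_of(const double *a, std::size_t n) {
  double m = -std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < n; ++i)
    m = std::fmax(m, a[i]);
  return m;
}
```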
2021-11-25	docs: Add missing @option keyword.	Martin Liska	1	-2/+2
	gcc/ChangeLog: * doc/invoke.texi: Use @option for -Wuninitialized.
2021-11-25	c++: Implement C++23 P2128R6 - Multidimensional subscript operator [PR102611]	Jakub Jelinek	1	-4/+12
	The following patch implements the C++23 Multidimensional subscript operator P2128R6 paper. As C++20 and older only allow a single expression in between []s (albeit for C++20 with a deprecation warning if it is a comma expression) and even in C++23 and for the coming years I think the vast majority of subscript expressions will still have a single expression and even in C++23 it is quite special, as e.g. the builtin operator requires exactly one assignment expression, the patch attempts to optimize for that case and if possible not to slow down that common case (or use more memory for it). So, already during parsing it differentiates between that (uses a single index_exp tree in that case) and the new cases (zero or two+ expressions in the list), for which it sets index_exp to NULL_TREE and uses a releasing_vec instead similarly to how e.g. finish_call_expr uses it. In call.c it introduces new functions build_op_subscript{,_1} which are something in between build_new_op{,_1} and build_op_call{,_1}. The former requires fixed number of arguments (and the patch still uses it for the common case of subscript with exactly one index expression), the latter handles variable number of arguments but is too CALL_EXPR specific and handles various cases that are unnecessary for the subscript. Right now the subscript for 0 or 2+ expressions doesn't need to deal with builtin candidates and so is quite simple. As discussed in the paper, for backwards compatibility, if for 2+ index expressions build_op_subscript fails (called with tf_none) and the expressions together form a valid comma expression (again checked with tf_none), it is used that C++20-ish way with a pedwarn about it, but if even that fails, build_op_subscript is called again with standard complain flags to diagnose it in the new way. And similarly for the builtin case. The -Wcomma-subscript warning used to be enabled by default unless -Wno-deprecated. 
Since the C/C++98..20 behavior is no longer deprecated, but ill-formed or changed meaning, it is now for C++23 enabled by default regardless of -Wno-deprecated and controls the pedwarn (but not the errors emitted if something wasn't valid before and isn't valid in C++23 either). 2021-11-25 Jakub Jelinek <jakub@redhat.com> PR c++/102611 gcc/ * doc/invoke.texi (-Wcomma-subscript): Document that for -std=c++20 the option isn't enabled by default with -Wno-deprecated but for -std=c++23 it is. gcc/c-family/ * c-opts.c (c_common_post_options): Enable -Wcomma-subscript by default for C++23 regardless of warn_deprecated. * c-cppbuiltin.c (c_cpp_builtins): Predefine __cpp_multidimensional_subscript=202110L for C++23. gcc/cp/ * cp-tree.h (build_op_subscript): Implement P2128R6 - Multidimensional subscript operator. Declare. (class releasing_vec): Add release method. (grok_array_decl): Remove bool argument, add vec<tree, va_gc> ** and tsubst_flags_t arguments. (build_min_non_dep_op_overload): Declare another overload. * parser.c (cp_parser_parenthesized_expression_list_elt): New function. (cp_parser_postfix_open_square_expression): Mention C++23 syntax in function comment. For C++23 parse zero or more than one initializer clauses in expression list, adjust grok_array_decl caller. (cp_parser_parenthesized_expression_list): Use cp_parser_parenthesized_expression_list_elt. (cp_parser_builtin_offsetof): Adjust grok_array_decl caller. * decl.c (grok_op_properties): For C++23 don't check number of arguments of operator[]. * decl2.c (grok_array_decl): Remove decltype_p argument, add index_exp_list and complain arguments. If index_exp is NULL, handle index_exp_list as the subscript expression list. tree.c (build_min_non_dep_op_overload): New overload. * call.c (add_operator_candidates, build_over_call): Adjust comments for removal of build_new_op_1. (build_op_subscript): New function. * pt.c (tsubst_copy_and_build_call_args): New function. 
(tsubst_copy_and_build) <case ARRAY_REF>: If second operand is magic CALL_EXPR with ovl_op_identifier (ARRAY_REF) as CALL_EXPR_FN, tsubst CALL_EXPR arguments including expanding pack expressions in it and call grok_array_decl instead of build_x_array_ref. <case CALL_EXPR>: Use tsubst_copy_and_build_call_args. * semantics.c (handle_omp_array_sections_1): Adjust grok_array_decl caller. gcc/testsuite/ * g++.dg/cpp2a/comma1.C: Expect different diagnostics for C++23. * g++.dg/cpp2a/comma3.C: Likewise. * g++.dg/cpp2a/comma4.C: Expect diagnostics for C++23. * g++.dg/cpp2a/comma5.C: Expect different diagnostics for C++23. * g++.dg/cpp23/feat-cxx2b.C: Test __cpp_multidimensional_subscript predefined macro. * g++.dg/cpp23/subscript1.C: New test. * g++.dg/cpp23/subscript2.C: New test. * g++.dg/cpp23/subscript3.C: New test. * g++.dg/cpp23/subscript4.C: New test. * g++.dg/cpp23/subscript5.C: New test. * g++.dg/cpp23/subscript6.C: New test.
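As a rough illustration of what P2128R6 allows at the source level, the sketch below defines a two-argument `operator[]`. The `Grid` type and the `store_and_load` helper are hypothetical names invented for this example and do not come from the patch. The C++23 form is guarded by the `__cpp_multidimensional_subscript` macro mentioned in the ChangeLog, so older language modes fall back to a named accessor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D grid type, used only for illustration.
class Grid {
public:
    Grid(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0) {}

    // Portable accessor, valid in every C++ standard.
    int& at(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }

#if defined(__cpp_multidimensional_subscript) && __cpp_multidimensional_subscript >= 202110L
    // C++23 (P2128R6): operator[] may take zero or more than one argument.
    int& operator[](std::size_t r, std::size_t c) { return at(r, c); }
#endif

private:
    std::size_t cols_;
    std::vector<int> data_;
};

// Writes one cell and reads it back, using g[r, c] when the feature is available.
int store_and_load(Grid& g, std::size_t r, std::size_t c, int value) {
#if defined(__cpp_multidimensional_subscript) && __cpp_multidimensional_subscript >= 202110L
    g[r, c] = value;   // parsed as a two-argument subscript, not a comma expression
    return g[r, c];
#else
    g.at(r, c) = value;
    return g.at(r, c);
#endif
}
```

With `-std=c++23` the helper goes through the new subscript form. In earlier modes `g[r, c]` would be a comma expression, which is why the example keeps it behind the feature-test macro.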
2021-11-23	Implement -Winfinite-recursion [PR88232].	Martin Sebor	1	-0/+9
	Resolves: PR middle-end/88232 - Please implement -Winfinite-recursion gcc/ChangeLog: PR middle-end/88232 * Makefile.in (OBJS): Add gimple-warn-recursion.o. * common.opt: Add -Winfinite-recursion. * doc/invoke.texi (-Winfinite-recursion): Document. * passes.def (pass_warn_recursion): Schedule a new pass. * tree-pass.h (make_pass_warn_recursion): Declare. * gimple-warn-recursion.c: New file. gcc/c-family/ChangeLog: PR middle-end/88232 * c.opt: Add -Winfinite-recursion. gcc/testsuite/ChangeLog: PR middle-end/88232 * c-c++-common/attr-used-5.c: Suppress valid warning. * c-c++-common/attr-used-6.c: Same. * c-c++-common/attr-used-9.c: Same. * g++.dg/warn/Winfinite-recursion-2.C: New test. * g++.dg/warn/Winfinite-recursion-3.C: New test. * g++.dg/warn/Winfinite-recursion.C: New test. * gcc.dg/Winfinite-recursion-2.c: New test. * gcc.dg/Winfinite-recursion.c: New test.
2021-11-23	docs: Remove 2 more duplicate param descriptions.	Martin Liska	1	-8/+0
	gcc/ChangeLog: * doc/invoke.texi: Remove 2 more duplicate param descriptions.
2021-11-22	docs: remove duplicate param documentation	Martin Liska	1	-12/+0
	gcc/ChangeLog: * doc/invoke.texi: Remove duplicate documentation for 3 params.
2021-11-19	gcc, doc: Fix Darwin bootstrap: Amend an @option command to elide a space.	Iain Sandoe	1	-1/+1
	At least some version(s) of makeinfo (4.8) do not like @option {-xxxx}; the brace has to follow the @option without any whitespace. makeinfo 4.8 is installed on Darwin systems and this breaks bootstrap. The amendment follows the style of the surrounding code. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> gcc/ChangeLog: * doc/invoke.texi: Remove whitespace after an @option.
2021-11-19	options: Make -Ofast switch off -fsemantic-interposition	Martin Jambor	1	-0/+1
	Using -fno-semantic-interposition has been reported by various people to bring about a considerable speedup at the cost of strict compliance with the ELF symbol interposition rules. See for example https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup As such I believe it should be implied by our -Ofast optimization level, not only so that benchmarks that can benefit run faster, but also so that people looking at -Ofast documentation for options that could speed up their programs find it. gcc/ChangeLog: 2021-11-12 Martin Jambor <mjambor@suse.cz> * opts.c (default_options_table): Switch off flag_semantic_interposition at Ofast. * doc/invoke.texi (Optimize Options): Document that Ofast switches off -fsemantic-interposition.
2021-11-19	Restore ancient -Waddress for weak symbols [PR33925].	Martin Sebor	1	-0/+2
	Resolves: PR c/33925 - gcc -Waddress lost some useful warnings PR c/102867 - -Waddress from macro expansion in readelf.c gcc/c-family/ChangeLog: PR c++/33925 PR c/102867 * c-common.c (decl_with_nonnull_addr_p): Call maybe_nonzero_address and improve handling of defined symbols. gcc/c/ChangeLog: PR c++/33925 PR c/102867 * c-typeck.c (maybe_warn_for_null_address): Suppress warnings for code resulting from macro expansion. gcc/cp/ChangeLog: PR c++/33925 PR c/102867 * typeck.c (warn_for_null_address): Suppress warnings for code resulting from macro expansion. gcc/ChangeLog: PR c++/33925 PR c/102867 * doc/invoke.texi (-Waddress): Update. gcc/testsuite/ChangeLog: PR c++/33925 PR c/102867 * g++.dg/warn/Walways-true-2.C: Adjust to avoid a valid warning. * c-c++-common/Waddress-5.c: New test. * c-c++-common/Waddress-6.c: New test. * g++.dg/warn/Waddress-7.C: New test. * gcc.dg/Walways-true-2.c: Adjust to avoid a valid warning. * gcc.dg/weak/weak-3.c: Expect a warning.
2021-11-19	Do not abort compilation when dump file is /dev/*	Giuliano Belinassi	1	-1/+2
	The `configure` scripts generated with autoconf often test compiler features by setting the output to `/dev/null`, which then sets the dump folder to /dev/* and the compilation halts with an error because GCC cannot create files in /dev/. This is a problem when configure is testing for compiler features because it cannot tell whether the failure was due to an unsupported feature or to some other problem, and it disables the feature even if it works. As an example, running configure overriding CFLAGS="-fdump-ipa-clones" will result in several compiler features being disabled because of gcc halting with an error creating files in /dev/. This commit fixes this issue by checking if the output file is /dev/null or /dev/zero. In this case we use the current working directory for dump output instead of the directory of the output file because we cannot write to /dev/. gcc/ChangeLog 2021-11-16 Giuliano Belinassi <gbelinassi@suse.de> * gcc.c (process_command): Skip dumpdir override if file is a not_actual_file_p. * doc/invoke.texi: Update -dumpdir documentation. gcc/testsuite/ChangeLog 2021-11-16 Giuliano Belinassi <gbelinassi@suse.de> * gcc.dg/devnull-dump.c: New. Signed-off-by: Giuliano Belinassi <gbelinassi@suse.de>
2021-11-18	c++: Implement -Wuninitialized for mem-initializers (redux) [PR19808]	Marek Polacek	1	-0/+12
	2021 update: Last year I posted a version of this patch: <https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559162.html> but it didn't make it in. The main objection seemed to be that the patch tried to do too much, and overlapped with the ME uninitialized warnings. Since the patch used walk_tree without any data flow info, it issued false positives for things like a(0 ? b : 42) and similar. I'll admit I've been dreading resurrecting this because of the lack of clarity about where we should warn about what. On the other hand, I think we really should do something about this. So I've simplified the original patch as much as it seemed reasonable. For instance, it doesn't even attempt to handle cases like "a((b = 42)), c(b)" -- for these I simply give up for the whole mem-initializer (but who writes code like that, anyway?). I also give up when a member is initialized with a function call, because we don't know what the call could do. See Wuninitialized-17.C, for which clang emits a false positive but we don't. I remember having a hard time dealing with initializer lists in my previous patch, so now I only handle simple a{b} cases, but no more. It turned out that this abridged version still warns about 90% of cases where users would expect a warning. More complicated cases are left for the ME, which, for unused inline functions, will only warn with -fkeep-inline-functions, but so be it. (This is bug 21678.) This patch implements the long-desired -Wuninitialized warning for member initializer lists, so that the front end can detect bugs like struct A { int a; int b; A() : b(1), a(b) { } }; where the field 'b' is used uninitialized because the order of member initializers in the member initializer list is irrelevant; what matters is the order of declarations in the class definition. I've implemented this by keeping a hash set holding fields that are not initialized yet, so at first it will be {a, b}, and after initializing 'a' it will be {b} and so on. 
Then I use walk_tree to walk the initializer and if we see that an uninitialized object is used, we warn. Of course, when we use the address of the object, we may not warn: struct B { int &r; int p; int a; B() : r(a), p(&a), a(1) { } // ok }; Likewise, don't warn in unevaluated contexts such as sizeof. Classes without an explicit initializer may still be initialized by their default constructors; whether or not something is considered initialized is handled in perform_member_init, see member_initialized_p. PR c++/19808 PR c++/96121 gcc/cp/ChangeLog: * init.c (perform_member_init): Remove a forward declaration. Walk the initializer using find_uninit_fields_r. New parameter to track uninitialized fields. If a member is initialized, remove it from the hash set. (perform_target_ctor): Return the initializer. (struct find_uninit_data): New class. (find_uninit_fields_r): New function. (find_uninit_fields): New function. (emit_mem_initializers): Keep and initialize a set holding fields that are not initialized. When handling delegating constructors, walk the constructor tree using find_uninit_fields_r. Also when initializing base classes. Pass uninitialized down to perform_member_init. gcc/ChangeLog: * doc/invoke.texi: Update documentation for -Wuninitialized. * tree.c (stabilize_reference): Set location. gcc/testsuite/ChangeLog: * g++.dg/warn/Wuninitialized-14.C: New test. * g++.dg/warn/Wuninitialized-15.C: New test. * g++.dg/warn/Wuninitialized-16.C: New test. * g++.dg/warn/Wuninitialized-17.C: New test. * g++.dg/warn/Wuninitialized-18.C: New test. * g++.dg/warn/Wuninitialized-19.C: New test. * g++.dg/warn/Wuninitialized-20.C: New test. * g++.dg/warn/Wuninitialized-21.C: New test. * g++.dg/warn/Wuninitialized-22.C: New test. * g++.dg/warn/Wuninitialized-23.C: New test. * g++.dg/warn/Wuninitialized-24.C: New test. * g++.dg/warn/Wuninitialized-25.C: New test. * g++.dg/warn/Wuninitialized-26.C: New test. * g++.dg/warn/Wuninitialized-27.C: New test. 
* g++.dg/warn/Wuninitialized-28.C: New test. * g++.dg/warn/Wuninitialized-29.C: New test. * g++.dg/warn/Wuninitialized-30.C: New test.
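The set-based bookkeeping described in this commit message can be modelled outside the compiler. The following is only a toy sketch and not the code in init.c: `MemberInit` and `find_uninit_uses` are hypothetical names, members are plain strings, and the special cases the message mentions (address-of, unevaluated operands, function calls, default constructors) are deliberately left out.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one data member: its name and the members
// that its initializer expression reads.
struct MemberInit {
    std::string name;
    std::vector<std::string> reads;
};

// Members must be passed in class declaration order, since that is the
// order in which they are actually initialized.
// Returns (member being initialized, uninitialized member it reads) pairs.
std::vector<std::pair<std::string, std::string>>
find_uninit_uses(const std::vector<MemberInit>& members_in_decl_order) {
    // Start with every field in the "not yet initialized" set.
    std::set<std::string> uninit;
    for (const auto& m : members_in_decl_order) uninit.insert(m.name);

    std::vector<std::pair<std::string, std::string>> warnings;
    for (const auto& m : members_in_decl_order) {
        // Walk the initializer: any read of a still-uninitialized field is suspect.
        for (const auto& used : m.reads)
            if (uninit.count(used)) warnings.emplace_back(m.name, used);
        // Once the member has been initialized, drop it from the set.
        uninit.erase(m.name);
    }
    return warnings;
}
```

Feeding it the `struct A` example from the message (`a` declared first and initialized from `b`) reports one use of `b`, while declaring `b` before `a` reports nothing.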
2021-11-18	x86: Add -mindirect-branch-cs-prefix	H.J. Lu	1	-1/+9
	Add -mindirect-branch-cs-prefix to add CS prefix to call and jmp to indirect thunk with branch target in r8-r15 registers so that the call and jmp instruction length is 6 bytes to allow them to be replaced with "lfence; call %r8-r15" or "lfence; jmp %r8-r15" at run-time. gcc/ PR target/102952 * config/i386/i386.c (ix86_output_jmp_thunk_or_indirect): Emit CS prefix for -mindirect-branch-cs-prefix. (ix86_output_indirect_branch_via_reg): Likewise. * config/i386/i386.opt: Add -mindirect-branch-cs-prefix. * doc/invoke.texi: Document -mindirect-branch-cs-prefix. gcc/testsuite/ PR target/102952 * gcc.target/i386/indirect-thunk-cs-prefix-1.c: New test. * gcc.target/i386/indirect-thunk-cs-prefix-2.c: Likewise.
2021-11-18	c-family: Add __builtin_assoc_barrier	Matthias Kretz	1	-0/+18
	New builtin to enable explicit use of PAREN_EXPR in C & C++ code. Signed-off-by: Matthias Kretz <m.kretz@gsi.de> gcc/testsuite/ChangeLog: * c-c++-common/builtin-assoc-barrier-1.c: New test. gcc/cp/ChangeLog: * constexpr.c (cxx_eval_constant_expression): Handle PAREN_EXPR via cxx_eval_constant_expression. * cp-objcp-common.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER. * cp-tree.h: Adjust TREE_LANG_FLAG documentation to include PAREN_EXPR in REF_PARENTHESIZED_P. (REF_PARENTHESIZED_P): Add PAREN_EXPR. * parser.c (cp_parser_postfix_expression): Handle RID_BUILTIN_ASSOC_BARRIER. * pt.c (tsubst_copy_and_build): If the PAREN_EXPR is not a parenthesized initializer, build a new PAREN_EXPR. * semantics.c (force_paren_expr): Simplify conditionals. Set REF_PARENTHESIZED_P on PAREN_EXPR. (maybe_undo_parenthesized_ref): Test PAREN_EXPR for REF_PARENTHESIZED_P. gcc/c-family/ChangeLog: * c-common.c (c_common_reswords): Add __builtin_assoc_barrier. * c-common.h (enum rid): Add RID_BUILTIN_ASSOC_BARRIER. gcc/c/ChangeLog: * c-decl.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER. * c-parser.c (c_parser_postfix_expression): Likewise. gcc/ChangeLog: * doc/extend.texi: Document __builtin_assoc_barrier.
2021-11-17	x86: Add -mharden-sls=[none\|all\|return\|indirect-branch]	H.J. Lu	1	-1/+9
	Add -mharden-sls= to mitigate against straight line speculation (SLS) for function return and indirect branch by adding an INT3 instruction after function return and indirect branch. gcc/ PR target/102952 * config/i386/i386-opts.h (harden_sls): New enum. * config/i386/i386.c (output_indirect_thunk): Mitigate against SLS for function return. (ix86_output_function_return): Likewise. (ix86_output_jmp_thunk_or_indirect): Mitigate against indirect branch. (ix86_output_indirect_jmp): Likewise. (ix86_output_call_insn): Likewise. * config/i386/i386.opt: Add -mharden-sls=. * doc/invoke.texi: Document -mharden-sls=. gcc/testsuite/ PR target/102952 * gcc.target/i386/harden-sls-1.c: New test. * gcc.target/i386/harden-sls-2.c: Likewise. * gcc.target/i386/harden-sls-3.c: Likewise. * gcc.target/i386/harden-sls-4.c: Likewise. * gcc.target/i386/harden-sls-5.c: Likewise.
2021-11-17	doc: document -fimplicit-constexpr	Jason Merrill	1	-0/+7
	I forgot this in the implementation patch. gcc/ChangeLog: * doc/invoke.texi (C++ Dialect Options): Document -fimplicit-constexpr.
2021-11-17	Add IFN_COND_FMIN/FMAX functions	Richard Sandiford	1	-0/+4
	This patch adds conditional forms of FMAX and FMIN, following the pattern for existing conditional binary functions. gcc/ * doc/md.texi (cond_fmin@var{mode}, cond_fmax@var{mode}): Document. * optabs.def (cond_fmin_optab, cond_fmax_optab): New optabs. * internal-fn.def (COND_FMIN, COND_FMAX): New functions. * internal-fn.c (first_commutative_argument): Handle them. (FOR_EACH_COND_FN_PAIR): Likewise. * match.pd (UNCOND_BINARY, COND_BINARY): Likewise. * config/aarch64/aarch64-sve.md (cond_<fmaxmin><mode>): New pattern. gcc/testsuite/ * gcc.target/aarch64/sve/cond_fmaxnm_5.c: New test. * gcc.target/aarch64/sve/cond_fmaxnm_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_6.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_6_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_7.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_7_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_8.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_8_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_5.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_6.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_6_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_7.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_7_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_8.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_8_run.c: Likewise.
2021-11-16	libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]	Marek Polacek	1	-1/+20
	From a link below: "An issue was discovered in the Bidirectional Algorithm in the Unicode Specification through 14.0. It permits the visual reordering of characters via control sequences, which can be used to craft source code that renders different logic than the logical ordering of tokens ingested by compilers and interpreters. Adversaries can leverage this to encode source code for compilers accepting Unicode such that targeted vulnerabilities are introduced invisibly to human reviewers." More info: https://nvd.nist.gov/vuln/detail/CVE-2021-42574 https://trojansource.codes/ This is not a compiler bug. However, to mitigate the problem, this patch implements -Wbidi-chars=[none\|unpaired\|any] to warn about possibly misleading Unicode bidirectional control characters the preprocessor may encounter. The default is =unpaired, which warns about improperly terminated bidirectional control characters; e.g. a LRE without its corresponding PDF. The level =any warns about any use of bidirectional control characters. This patch handles both UCNs and UTF-8 characters. UCNs designating bidi characters in identifiers are accepted since r204886. Then r217144 enabled -fextended-identifiers by default. Extended characters in C/C++ identifiers have been accepted since r275979. However, this patch still warns about mixing UTF-8 and UCN bidi characters; there seems to be no good reason to allow mixing them. We warn in different contexts: comments (both C and C++-style), string literals, character constants, and identifiers. Expectedly, UCNs are ignored in comments and raw string literals. The bidirectional control characters can nest so this patch handles that as well. I have not included nor tested this at all with Fortran (which also has string literals and line comments). Dave M. posted patches improving diagnostic involving Unicode characters. This patch does not make use of this new infrastructure yet. 
PR preprocessor/103026 gcc/c-family/ChangeLog: * c.opt (Wbidi-chars, Wbidi-chars=): New option. gcc/ChangeLog: * doc/invoke.texi: Document -Wbidi-chars. libcpp/ChangeLog: * include/cpplib.h (enum cpp_bidirectional_level): New. (struct cpp_options): Add cpp_warn_bidirectional. (enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL. * internal.h (struct cpp_reader): Add warn_bidi_p member function. * init.c (cpp_create_reader): Set cpp_warn_bidirectional. * lex.c (bidi): New namespace. (get_bidi_utf8): New function. (get_bidi_ucn): Likewise. (maybe_warn_bidi_on_close): Likewise. (maybe_warn_bidi_on_char): Likewise. (_cpp_skip_block_comment): Implement warning about bidirectional control characters. (skip_line_comment): Likewise. (forms_identifier_p): Likewise. (lex_identifier): Likewise. (lex_string): Likewise. (lex_raw_string): Likewise. gcc/testsuite/ChangeLog: * c-c++-common/Wbidi-chars-1.c: New test. * c-c++-common/Wbidi-chars-2.c: New test. * c-c++-common/Wbidi-chars-3.c: New test. * c-c++-common/Wbidi-chars-4.c: New test. * c-c++-common/Wbidi-chars-5.c: New test. * c-c++-common/Wbidi-chars-6.c: New test. * c-c++-common/Wbidi-chars-7.c: New test. * c-c++-common/Wbidi-chars-8.c: New test. * c-c++-common/Wbidi-chars-9.c: New test. * c-c++-common/Wbidi-chars-10.c: New test. * c-c++-common/Wbidi-chars-11.c: New test. * c-c++-common/Wbidi-chars-12.c: New test. * c-c++-common/Wbidi-chars-13.c: New test. * c-c++-common/Wbidi-chars-14.c: New test. * c-c++-common/Wbidi-chars-15.c: New test. * c-c++-common/Wbidi-chars-16.c: New test. * c-c++-common/Wbidi-chars-17.c: New test.
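To make the `=unpaired` level more concrete, here is a simplified sketch of detecting unterminated bidirectional control characters in a UTF-8 string. It is not the libcpp implementation: `classify` and `has_unterminated_bidi` are hypothetical helpers, UCNs are not handled, and the pairing rules are reduced to "PDF closes the innermost embedding or override, PDI closes the nearest isolate and everything opened inside it".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class BidiKind { None, Embedding, Isolate, PopEmbedding, PopIsolate };

// Classify the three-byte UTF-8 sequence starting at s[i], if any.
BidiKind classify(const std::string& s, std::size_t i) {
    if (i + 2 >= s.size() || static_cast<unsigned char>(s[i]) != 0xE2)
        return BidiKind::None;
    const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
    const unsigned char b2 = static_cast<unsigned char>(s[i + 2]);
    if (b1 == 0x80) {
        // U+202A LRE, U+202B RLE, U+202D LRO, U+202E RLO
        if (b2 == 0xAA || b2 == 0xAB || b2 == 0xAD || b2 == 0xAE)
            return BidiKind::Embedding;
        if (b2 == 0xAC) return BidiKind::PopEmbedding;   // U+202C PDF
    } else if (b1 == 0x81) {
        // U+2066 LRI, U+2067 RLI, U+2068 FSI
        if (b2 >= 0xA6 && b2 <= 0xA8) return BidiKind::Isolate;
        if (b2 == 0xA9) return BidiKind::PopIsolate;     // U+2069 PDI
    }
    return BidiKind::None;
}

// Returns true if some embedding, override, or isolate is still open
// at the end of the text (for example, one comment or string literal).
bool has_unterminated_bidi(const std::string& text) {
    std::vector<BidiKind> open;   // stack of currently open contexts
    for (std::size_t i = 0; i < text.size(); ++i) {
        const BidiKind kind = classify(text, i);
        switch (kind) {
        case BidiKind::Embedding:
        case BidiKind::Isolate:
            open.push_back(kind);
            break;
        case BidiKind::PopEmbedding:
            if (!open.empty() && open.back() == BidiKind::Embedding)
                open.pop_back();
            break;
        case BidiKind::PopIsolate: {
            // Find the nearest open isolate and close it with its contents.
            std::size_t n = open.size();
            while (n > 0 && open[n - 1] != BidiKind::Isolate) --n;
            if (n > 0) open.resize(n - 1);
            break;
        }
        case BidiKind::None:
            break;
        }
    }
    return !open.empty();
}
```

A caller would run this once per lexical context (comment, string literal, identifier), which mirrors the contexts listed in the message, and warn when it returns true.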
2021-11-15	IPA: Provide a mechanism to register static DTORs via cxa_atexit.	Iain Sandoe	2	-0/+10
	For at least one target (Darwin) the platform convention is to register static destructors (i.e. __attribute__((destructor))) with __cxa_atexit rather than placing them into a list that is run by some other mechanism. This patch provides a target hook that allows a target to opt into this and handling for the process in ipa_cdtor_merge (). When the mode is enabled (dtors_from_cxa_atexit is set) we: * Generate new CTORs to register static destructors with __cxa_atexit and add them to the existing list of CTORs; we then process the revised CTORs list. * We sort the DTORs into priority and then TU order, this means that they are registered in that order with __cxa_atexit () and therefore will be run in the reverse order. * Likewise, CTORs are sorted into priority and then TU order, which means that they will run in that order. This matches the behavior of using init/fini (or mod_init_func/mod_term_func) sections. This also fixes a bug where Fortran needs a DTOR to be run to close IO. Signed-off-by: Iain Sandoe <iain@sandoe.co.uk> PR fortran/102992 gcc/ChangeLog: * config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New. * doc/tm.texi: Regenerated. * doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook. * ipa.c (cgraph_build_static_cdtor_1): Return the built function decl. (build_cxa_atexit_decl): New. (build_dso_handle_decl): New. (build_cxa_dtor_registrations): New. (compare_cdtor_tu_order): New. (build_cxa_atexit_fns): New. (ipa_cdtor_merge): If dtors_from_cxa_atexit is set, process the DTORs/CTORs accordingly. (pass_ipa_cdtor_merge::gate): Also run if dtors_from_cxa_atexit is set. * target.def (dtors_from_cxa_atexit): New hook.
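The ordering rule for destructors described above (sort by priority, then by TU order, register in that order, run in reverse) can be sketched as a small model. `StaticDtor` and `execution_order` are hypothetical names for illustration only. The code does not call `__cxa_atexit`; it just assumes registered handlers run last-in, first-out, as the message states.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record for one static destructor.
struct StaticDtor {
    int priority;      // destructor priority
    int tu_order;      // position of its translation unit
    std::string name;  // label used to observe the resulting order
};

// Returns the names in the order the destructors would run.
std::vector<std::string> execution_order(std::vector<StaticDtor> dtors) {
    // Registration order: priority first, then TU order.
    std::stable_sort(dtors.begin(), dtors.end(),
                     [](const StaticDtor& a, const StaticDtor& b) {
                         return std::tie(a.priority, a.tu_order) <
                                std::tie(b.priority, b.tu_order);
                     });
    // Handlers registered this way run in reverse registration order.
    std::vector<std::string> names;
    for (auto it = dtors.rbegin(); it != dtors.rend(); ++it)
        names.push_back(it->name);
    return names;
}
```

In this model the destructor registered last, meaning the one with the highest priority value and latest TU, runs first.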
2021-11-15	PR target/103069: Relax cmpxchg loop for x86 target	Hongyu Wang	1	-1/+8
	From the CPU's point of view, getting a cache line for writing is more expensive than reading. See Appendix A.2 Spinlock in: https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf The full compare and swap will grab the cache line exclusive and cause excessive cache line bouncing. The atomic_fetch_{or,xor,and,nand} builtins generate a cmpxchg loop under -march=x86-64 like: movl v(%rip), %eax .L2: movl %eax, %ecx movl %eax, %edx orl $1, %ecx lock cmpxchgl %ecx, v(%rip) jne .L2 movl %edx, %eax andl $1, %eax ret To relax the above loop, GCC should first emit a normal load, check and jump to .L2 if cmpxchgl may fail. Before jumping to .L2, PAUSE should be inserted to yield the CPU to another hyperthread and to save power, so the code is like .L84: movl (%rdi), %ecx movl %eax, %edx orl %esi, %edx cmpl %eax, %ecx jne .L82 lock cmpxchgl %edx, (%rdi) jne .L84 .L82: rep nop jmp .L84 This patch adds corresponding atomic_fetch_op expanders to insert load/compare and pause for all the atomic logic fetch builtins. Add flag -mrelax-cmpxchg-loop to control whether to generate the relaxed loop. gcc/ChangeLog: PR target/103069 * config/i386/i386-expand.c (ix86_expand_atomic_fetch_op_loop): New expand function. * config/i386/i386-options.c (ix86_target_string): Add -mrelax-cmpxchg-loop flag. (ix86_valid_target_attribute_inner_p): Likewise. * config/i386/i386-protos.h (ix86_expand_atomic_fetch_op_loop): New expand function prototype. * config/i386/i386.opt: Add -mrelax-cmpxchg-loop. * config/i386/sync.md (atomic_fetch_<logic><mode>): New expander for SI,HI,QI modes. (atomic_<logic>_fetch<mode>): Likewise. (atomic_fetch_nand<mode>): Likewise. (atomic_nand_fetch<mode>): Likewise. (atomic_fetch_<logic><mode>): New expander for DI,TI modes. (atomic_<logic>_fetch<mode>): Likewise. (atomic_fetch_nand<mode>): Likewise. (atomic_nand_fetch<mode>): Likewise. * doc/invoke.texi: Document -mrelax-cmpxchg-loop. 
gcc/testsuite/ChangeLog: PR target/103069 * gcc.target/i386/pr103069-1.c: New test. * gcc.target/i386/pr103069-2.c: Ditto.
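The load-then-compare idea behind the relaxed loop can also be written at the source level. The sketch below is a hand-written analogue using `std::atomic`, not what the new expanders emit. `relaxed_fetch_or` and `cpu_relax` are hypothetical names, and `cpu_relax` is a no-op placeholder standing in for the PAUSE instruction.

```cpp
#include <atomic>
#include <cassert>

// Placeholder for a spin-wait hint. Real x86 code would issue PAUSE here
// (for example through _mm_pause); this sketch does nothing.
inline void cpu_relax() {}

// Atomically ORs `mask` into `v` and returns the previous value.
unsigned relaxed_fetch_or(std::atomic<unsigned>& v, unsigned mask) {
    unsigned expected = v.load(std::memory_order_relaxed);
    for (;;) {
        // Plain load first: reading does not need the cache line exclusively.
        const unsigned current = v.load(std::memory_order_relaxed);
        if (current != expected) {
            // The compare-and-swap would fail, so skip it, back off, and retry.
            expected = current;
            cpu_relax();
            continue;
        }
        // Only now attempt the locked compare-and-swap.
        if (v.compare_exchange_weak(expected, expected | mask,
                                    std::memory_order_seq_cst,
                                    std::memory_order_relaxed))
            return expected;   // unchanged on success, so this is the old value
        // On failure `expected` has been refreshed with the current value.
    }
}
```

The design point is that the contended path only performs loads, so the cache line is requested for writing only when the exchange is likely to succeed.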
2021-11-15	VAX: Implement the `-mlra' command-line option	Maciej W. Rozycki	1	-1/+8
	Add the `-mlra' command-line option for the VAX target, with the usual semantics of enabling Local Register Allocation, off by default. LRA remains unstable with the VAX target, with numerous ICEs throughout the testsuite and worse code produced overall where successful, however the presence of a command line option to enable it makes it easier to experiment with it as the compiler does not have to be rebuilt to flip between the old reload and LRA. gcc/ * config/vax/vax.c (vax_lra_p): New prototype and function. (TARGET_LRA_P): Wire it. * config/vax/vax.opt (mlra): New option. * doc/invoke.texi (Option Summary, VAX Options): Document the new option.
2021-11-13	analyzer: add four new taint-based warnings	David Malcolm	1	-1/+65
	The initial commit of the analyzer in GCC 10 had a single warning, -Wanalyzer-tainted-array-index and required manually enabling the taint checker with -fanalyzer-checker=taint (due to scaling issues). This patch extends the taint detection to add four new taint-based warnings: -Wanalyzer-tainted-allocation-size for e.g. attacker-controlled malloc/alloca -Wanalyzer-tainted-divisor for detecting where an attacker can inject a divide-by-zero -Wanalyzer-tainted-offset for attacker-controlled pointer offsets -Wanalyzer-tainted-size for e.g. attacker-controlled memset and rewords all the warnings to talk about "attacker-controlled" values rather than "tainted" values. Unfortunately I haven't yet addressed the scaling issues, so all of these still require -fanalyzer-checker=taint (in addition to -fanalyzer). gcc/analyzer/ChangeLog: * analyzer.opt (Wanalyzer-tainted-allocation-size): New. (Wanalyzer-tainted-divisor): New. (Wanalyzer-tainted-offset): New. (Wanalyzer-tainted-size): New. * engine.cc (impl_region_model_context::get_taint_map): New. * exploded-graph.h (impl_region_model_context::get_taint_map): New decl. * program-state.cc (sm_state_map::get_state): Call alt_get_inherited_state. (sm_state_map::impl_set_state): Modify states within compound svalues. (program_state::impl_call_analyzer_dump_state): Undo casts. (selftest::test_program_state_1): Update for new context param of create_region_for_heap_alloc. (selftest::test_program_state_merging): Likewise. * region-model-impl-calls.cc (region_model::impl_call_alloca): Likewise. (region_model::impl_call_calloc): Likewise. (region_model::impl_call_malloc): Likewise. (region_model::impl_call_operator_new): Likewise. (region_model::impl_call_realloc): Likewise. * region-model.cc (region_model::check_region_access): Call check_region_for_taint. (region_model::get_representative_path_var_1): Handle binops. (region_model::create_region_for_heap_alloc): Add "ctxt" param and pass it to set_dynamic_extents. 
(region_model::create_region_for_alloca): Likewise. (region_model::set_dynamic_extents): Add "ctxt" param and use it to call check_dynamic_size_for_taint. (selftest::test_state_merging): Update for new context param of create_region_for_heap_alloc. (selftest::test_malloc_constraints): Likewise. (selftest::test_malloc): Likewise. (selftest::test_alloca): Likewise for create_region_for_alloca. * region-model.h (region_model::create_region_for_heap_alloc): Add "ctxt" param. (region_model::create_region_for_alloca): Likewise. (region_model::set_dynamic_extents): Likewise. (region_model::check_dynamic_size_for_taint): New decl. (region_model::check_region_for_taint): New decl. (region_model_context::get_taint_map): New vfunc. (noop_region_model_context::get_taint_map): New. * sm-taint.cc: Remove include of "diagnostic-event-id.h"; add includes of "gimple-iterator.h", "tristate.h", "selftest.h", "ordered-hash-map.h", "cgraph.h", "cfg.h", "digraph.h", "analyzer/supergraph.h", "analyzer/call-string.h", "analyzer/program-point.h", "analyzer/store.h", "analyzer/region-model.h", and "analyzer/program-state.h". (enum bounds): Move to top of file. (class taint_diagnostic): New. (class tainted_array_index): Convert to subclass of taint_diagnostic. (tainted_array_index::emit): Add CWE-129. Reword warning to use "attacker-controlled" rather than "tainted". (tainted_array_index::describe_state_change): Move to taint_diagnostic::describe_state_change. (tainted_array_index::describe_final_event): Reword to use "attacker-controlled" rather than "tainted". (class tainted_offset): New. (class tainted_size): New. (class tainted_divisor): New. (class tainted_allocation_size): New. (taint_state_machine::alt_get_inherited_state): New. (taint_state_machine::on_stmt): In assignment handling, remove ARRAY_REF handling in favor of check_region_for_taint. Add detection of tainted divisors. (taint_state_machine::get_taint): New. (taint_state_machine::combine_states): New. 
(region_model::check_region_for_taint): New. (region_model::check_dynamic_size_for_taint): New. * sm.h (state_machine::alt_get_inherited_state): New. gcc/ChangeLog: * doc/invoke.texi (Static Analyzer Options): Add -Wno-analyzer-tainted-allocation-size, -Wno-analyzer-tainted-divisor, -Wno-analyzer-tainted-offset, and -Wno-analyzer-tainted-size to list. Add -Wanalyzer-tainted-allocation-size, -Wanalyzer-tainted-divisor, -Wanalyzer-tainted-offset, and -Wanalyzer-tainted-size to list of options effectively enabled by -fanalyzer. (-Wanalyzer-tainted-allocation-size): New. (-Wanalyzer-tainted-array-index): Tweak wording; add link to CWE. (-Wanalyzer-tainted-divisor): New. (-Wanalyzer-tainted-offset): New. (-Wanalyzer-tainted-size): New. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/pr93382.c: Tweak expected wording. * gcc.dg/analyzer/taint-alloc-1.c: New test. * gcc.dg/analyzer/taint-alloc-2.c: New test. * gcc.dg/analyzer/taint-divisor-1.c: New test. * gcc.dg/analyzer/taint-1.c: Rename to... * gcc.dg/analyzer/taint-read-index-1.c: ...this. Tweak expected wording. Mark some events as xfail. * gcc.dg/analyzer/taint-read-offset-1.c: New test. * gcc.dg/analyzer/taint-size-1.c: New test. * gcc.dg/analyzer/taint-write-index-1.c: New test. * gcc.dg/analyzer/taint-write-offset-1.c: New test. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
2021-11-10	rs6000/doc: Rename future cpu with power10	Kewen Lin	1	-12/+12
	Commit 5d9d0c94588 renamed future to power10 and ace60939fd2 updated the documentation for "future" renaming. This patch is to rename the remaining "future architecture" references in documentation and polish the words for float128. gcc/ChangeLog: * doc/invoke.texi: Change references to "future cpu" to "power10", "-mcpu=future" to "-mcpu=power10". Adjust words for float128.
2021-11-10	attribs: Implement -Wno-attributes=vendor::attr [PR101940]	Marek Polacek	2	-0/+39
	It is desirable for -Wattributes to warn about e.g. [[deprecate]] void g(); // typo, should warn However, -Wattributes also warns about vendor-specific attributes (that's because lookup_scoped_attribute_spec -> find_attribute_namespace finds nothing), which, with -Werror, causes grief. We don't want the -Wattributes warning for [[company::attr]] void f(); GCC warns because it doesn't know the "company" namespace; it only knows the "gnu" and "omp" namespaces. We could entirely disable warning about attributes in unknown scopes but then the compiler would also miss typos like [[company::attrx]] void f(); or [[gmu::warn_used_result]] int write(); so that is not a viable solution. A workaround is to use a #pragma: #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wattributes" [[company::attr]] void f() {} #pragma GCC diagnostic pop but that's a mouthful and awkward to use and could also hide typos. In fact, any macro-based solution doesn't seem like a way forward. This patch implements -Wno-attributes=, which takes these arguments: company::attr company:: This option should go well with using @file: the user could have a file containing -Wno-attributes=vendor::attr1,vendor::attr2 and then invoke gcc with '@attrs' or similar. I've also added a new pragma which has the same effect: The pragma along with the new option should help with various static analysis tools. PR c++/101940 gcc/ChangeLog: * attribs.c (struct scoped_attributes): Add a bool member. (lookup_scoped_attribute_spec): Forward declare. (register_scoped_attributes): New bool parameter, defaulted to false. Use it. (handle_ignored_attributes_option): New function. (free_attr_data): New function. (init_attributes): Call handle_ignored_attributes_option. (attr_namespace_ignored_p): New function. (decl_attributes): Check attr_namespace_ignored_p before warning. * attribs.h (free_attr_data): Declare. (register_scoped_attributes): Adjust declaration. (handle_ignored_attributes_option): Declare. 
(canonicalize_attr_name): New function template. (canonicalize_attr_name): Use it. * common.opt (Wattributes=): New option with a variable. * doc/extend.texi: Document #pragma GCC diagnostic ignored_attributes. * doc/invoke.texi: Document -Wno-attributes=. * opts.c (common_handle_option) <case OPT_Wattributes_>: Handle. * plugin.h (register_scoped_attributes): Adjust declaration. * toplev.c (compile_file): Call free_attr_data. gcc/c-family/ChangeLog: * c-pragma.c (handle_pragma_diagnostic): Handle #pragma GCC diagnostic ignored_attributes. gcc/testsuite/ChangeLog: * c-c++-common/Wno-attributes-1.c: New test. * c-c++-common/Wno-attributes-2.c: New test. * c-c++-common/Wno-attributes-3.c: New test.
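The matching rule described in this commit message (an argument of the form `company::attr` silences one attribute, `company::` silences a whole vendor namespace) can be illustrated with a small toy model. This is only a sketch of the rule as described above, not the code in attribs.c; the function name `attribute_ignored` and its parameters are hypothetical.

```cpp
#include <string>
#include <vector>

// Toy model, NOT GCC's implementation: returns true if the scoped
// attribute `ns::name` is covered by one of the arguments given to
// -Wno-attributes=.  Each argument is assumed to be either
// "vendor::attr" (one attribute) or "vendor::" (the whole namespace).
bool attribute_ignored(const std::vector<std::string>& args,
                       const std::string& ns,
                       const std::string& name)
{
    for (const std::string& arg : args) {
        if (arg == ns + "::" + name)  // exact attribute match
            return true;
        if (arg == ns + "::")         // every attribute in this namespace
            return true;
    }
    return false;
}
```

Under this model, `-Wno-attributes=company::attr` keeps the warning for the typo `[[company::attrx]]`, while `-Wno-attributes=company::` silences it, which is the trade-off between the two argument forms.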
2021-11-10	arm: enable cortex-a710 CPU	Przemyslaw Wirkus	1	-1/+1
	This patch adds support for the Cortex-A710 CPU in Arm. gcc/ChangeLog: * config/arm/arm-cpus.in (cortex-a710): New CPU. * config/arm/arm-tables.opt: Regenerate. * config/arm/arm-tune.md: Regenerate. * doc/invoke.texi: Update docs.
2021-11-09	Document --param=threader-debug.	Aldy Hernandez	1	-0/+3
	gcc/ChangeLog: * doc/invoke.texi (Invoking GCC): Document --param=threader-debug.
2021-11-09	Get rid of infinite recursion for 'typedef' used with GTY-marked 'gcc/diagnostic-spec.h:nowarn_map' [PR101204, PR103157]	Thomas Schwinge	1	-0/+8
	Reproduced with clang version 10.0.0-4ubuntu1: gtype-desc.c:11333:1: warning: all paths through this function will call itself [-Winfinite-recursion] ... as well as some GCC's '-O2 -fdump-tree-optimized': void gt_pch_nx(int_hash<unsigned int, 0u, 4294967295u>, gt_pointer_operator, void) ([...]) { <bb 2>: <bb 3>: goto <bb 3>; } That three-arguments 'gt_pch_nx' function as well as two one-argument 'gt_ggc_mx', 'gt_pch_nx' functions now turn empty: [...] void -gt_ggc_mx (int_hash<location_t,0,UINT_MAX>& x_r ATTRIBUTE_UNUSED) +gt_ggc_mx (struct xint_hash_t& x_r ATTRIBUTE_UNUSED) { - int_hash<location_t,0,UINT_MAX> * ATTRIBUTE_UNUSED x = &x_r; - gt_ggc_mx (&((x))); + struct xint_hash_t ATTRIBUTE_UNUSED x = &x_r; } [...] void -gt_pch_nx (int_hash<location_t,0,UINT_MAX>& x_r ATTRIBUTE_UNUSED) +gt_pch_nx (struct xint_hash_t& x_r ATTRIBUTE_UNUSED) { - int_hash<location_t,0,UINT_MAX> * ATTRIBUTE_UNUSED x = &x_r; - gt_pch_nx (&((x))); + struct xint_hash_t ATTRIBUTE_UNUSED x = &x_r; } [...] void -gt_pch_nx (int_hash<location_t,0,UINT_MAX>* x ATTRIBUTE_UNUSED, +gt_pch_nx (struct xint_hash_t* x ATTRIBUTE_UNUSED, ATTRIBUTE_UNUSED gt_pointer_operator op, ATTRIBUTE_UNUSED void cookie) { - gt_pch_nx (&((x)), op, cookie); } [...] gcc/ PR middle-end/101204 PR other/103157 * diagnostic-spec.h (typedef xint_hash_t): Turn into... (struct xint_hash_t): ... this. * doc/gty.texi: Update.
2021-11-09	arm: add armv9-a architecture to -march	Przemyslaw Wirkus	1	-0/+1
	In this patch: + Add `armv9-a` to -march. + Update multilib with armv9-a and armv9-a+simd. gcc/ChangeLog: * config/arm/arm-cpus.in (armv9): New define. (ARMv9a): New group. (armv9-a): New arch definition. * config/arm/arm-tables.opt: Regenerate. * config/arm/arm.h (BASE_ARCH_9A): New arch enum value. * config/arm/t-aprofile: Added armv9-a and armv9+simd. * config/arm/t-arm-elf: Added arm9-a, v9_fps and all_v9_archs to MULTILIB_MATCHES. * config/arm/t-multilib: Added v9_a_nosimd_variants and v9_a_simd_variants to MULTILIB_MATCHES. * doc/invoke.texi: Update docs. gcc/testsuite/ChangeLog: * gcc.target/arm/multilib.exp: Update test with armv9-a entries. * lib/target-supports.exp (v9a): Add new armflag. (__ARM_ARCH_9A__): Add new armdef.
2021-11-08	Update documentation for -ftree-loop-vectorize and -ftree-slp-vectorize which are enabled by default at -O2.	liuhongt	1	-2/+2
	gcc/ChangeLog: PR tree-optimization/103077 * doc/invoke.texi (Options That Control Optimization): Update documentation for -ftree-loop-vectorize and -ftree-slp-vectorize which are enabled by default at -O2.
2021-11-05	doc: No longer generate old.html	Gerald Pfeifer	1	-3/+3
	Commit 431d26e1dd18c1146d3d4dcd3b45a3b04f7f7d59 removed doc/install-old.texi, alas we still tried to generate the associated web page old.html - which then turned out empty. Simply remove this from the list of pages to be generated. gcc: * doc/install.texi2html: Do not generate old.html any longer.
2021-11-04	vect: Convert cost hooks to classes	Richard Sandiford	2	-31/+2
	The current vector cost interface has quite a bit of redundancy built in. Each target that defines its own hooks has to replicate the basic unsigned[3] management. Currently each target also duplicates the cost adjustment for inner loops. This patch instead defines a vector_costs class for holding the scalar or vector cost and allows targets to subclass it. There is then only one costing hook: to create a new costs structure of the appropriate type. Everything else can be virtual functions, with common concepts implemented in the base class rather than in each target's derivation. This might seem like excess C++-ification, but it shaves ~100 LOC. I've also got some follow-on changes that become significantly easier with this patch. Maybe it could help with things like weighting blocks based on frequency too. This will clash with Andre's unrolling patches. His patches have priority so this patch should queue behind them. The x86 and rs6000 parts fully convert to a self-contained class. The equivalent aarch64 changes are more complex, so this patch just does the bare minimum. A later patch will rework the aarch64 bits. gcc/ * target.def (targetm.vectorize.init_cost): Replace with... (targetm.vectorize.create_costs): ...this. (targetm.vectorize.add_stmt_cost): Delete. (targetm.vectorize.finish_cost): Likewise. (targetm.vectorize.destroy_cost_data): Likewise. * doc/tm.texi.in (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * doc/tm.texi: Regenerate. * tree-vectorizer.h (vec_info::vec_info): Remove target_cost_data parameter. (vec_info::target_cost_data): Change from a void * to a vector_costs *. (vector_costs): New class. (init_cost): Take a vec_info and return a vector_costs. (dump_stmt_cost): Remove data parameter. (add_stmt_cost): Replace vinfo and data parameters with a vector_costs. 
(add_stmt_costs): Likewise. (finish_cost): Replace data parameter with a vector_costs. (destroy_cost_data): Delete. tree-vectorizer.c (dump_stmt_cost): Remove data argument and don't print it. (vec_info::vec_info): Remove the target_cost_data parameter and initialize the member variable to null instead. (vec_info::~vec_info): Delete target_cost_data instead of calling destroy_cost_data. (vector_costs::add_stmt_cost): New function. (vector_costs::finish_cost): Likewise. (vector_costs::record_stmt_cost): Likewise. (vector_costs::adjust_cost_for_freq): Likewise. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update call to vec_info::vec_info. (vect_compute_single_scalar_iteration_cost): Update after above changes to costing interface. (vect_analyze_loop_operations): Likewise. (vect_estimate_min_profitable_iters): Likewise. (vect_analyze_loop_2): Initialize LOOP_VINFO_TARGET_COST_DATA at the start_over point, where it needs to be recreated after trying without slp. Update retry code accordingly. * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Update call to vec_info::vec_info. (vect_slp_analyze_operation): Update after above changes to costing interface. (vect_bb_vectorization_profitable_p): Likewise. * targhooks.h (default_init_cost): Replace with... (default_vectorize_create_costs): ...this. (default_add_stmt_cost): Delete. (default_finish_cost, default_destroy_cost_data): Likewise. * targhooks.c (default_init_cost): Replace with... (default_vectorize_create_costs): ...this. (default_add_stmt_cost): Delete, moving logic to vector_costs instead. (default_finish_cost, default_destroy_cost_data): Delete. * config/aarch64/aarch64.c (aarch64_vector_costs): Inherit from vector_costs. Add a constructor. (aarch64_init_cost): Replace with... (aarch64_vectorize_create_costs): ...this. (aarch64_add_stmt_cost): Replace with... (aarch64_vector_costs::add_stmt_cost): ...this. Use record_stmt_cost to adjust the cost for inner loops. (aarch64_finish_cost): Replace with... 
(aarch64_vector_costs::finish_cost): ...this. (aarch64_destroy_cost_data): Delete. (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * config/i386/i386.c (ix86_vector_costs): New structure. (ix86_init_cost): Replace with... (ix86_vectorize_create_costs): ...this. (ix86_add_stmt_cost): Replace with... (ix86_vector_costs::add_stmt_cost): ...this. Use adjust_cost_for_freq to adjust the cost for inner loops. (ix86_finish_cost, ix86_destroy_cost_data): Delete. (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * config/rs6000/rs6000.c (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. (rs6000_cost_data): Inherit from vector_costs. Add a constructor. Drop loop_info, cost and costing_for_scalar in favor of the corresponding vector_costs member variables. Add "m_" to the names of the remaining member variables and initialize them. (rs6000_density_test): Replace with... (rs6000_cost_data::density_test): ...this. (rs6000_init_cost): Replace with... (rs6000_vectorize_create_costs): ...this. (rs6000_update_target_cost_per_stmt): Replace with... (rs6000_cost_data::update_target_cost_per_stmt): ...this. (rs6000_add_stmt_cost): Replace with... (rs6000_cost_data::add_stmt_cost): ...this. Use adjust_cost_for_freq to adjust the cost for inner loops. (rs6000_adjust_vect_cost_per_loop): Replace with... (rs6000_cost_data::adjust_vect_cost_per_loop): ...this. (rs6000_finish_cost): Replace with... (rs6000_cost_data::finish_cost): ...this. 
Group loop code into a single if statement and pass the loop_vinfo down to subroutines. (rs6000_destroy_cost_data): Delete.
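The design described in this commit message (one hook that creates a costs object, with everything else expressed as virtual functions and the shared logic kept in the base class) can be sketched roughly as follows. This is a simplified illustration, not the real interface: the `toy_` names, the reduced signatures, the `create_costs` factory, and the inner-loop weight of 50 are all assumptions made for the example.

```cpp
#include <memory>

// Where a statement's cost is charged; mirrors the "unsigned[3]" the
// commit message mentions (prologue, body, epilogue).
enum cost_location { cost_prologue = 0, cost_body = 1, cost_epilogue = 2 };

// Hypothetical base class: common bookkeeping lives here once.
class toy_vector_costs {
public:
    virtual ~toy_vector_costs() = default;

    // Default accounting; targets may override and call back into it.
    virtual unsigned add_stmt_cost(unsigned count, unsigned stmt_cost,
                                   cost_location where, bool in_inner_loop)
    {
        unsigned cost = adjust_cost_for_freq(count * stmt_cost, where,
                                             in_inner_loop);
        m_costs[where] += cost;
        return cost;
    }

    unsigned total() const { return m_costs[0] + m_costs[1] + m_costs[2]; }

protected:
    // Shared inner-loop adjustment (the weight 50 is an arbitrary
    // illustrative value), so targets no longer duplicate it.
    unsigned adjust_cost_for_freq(unsigned cost, cost_location where,
                                  bool in_inner_loop) const
    {
        return (where == cost_body && in_inner_loop) ? cost * 50 : cost;
    }

    unsigned m_costs[3] = {0, 0, 0};
};

// Hypothetical target subclass: charges one extra unit per statement.
class toy_target_costs : public toy_vector_costs {
public:
    unsigned add_stmt_cost(unsigned count, unsigned stmt_cost,
                           cost_location where, bool in_inner_loop) override
    {
        return toy_vector_costs::add_stmt_cost(count, stmt_cost + 1, where,
                                               in_inner_loop);
    }
};

// The single "hook": create a costs object of the appropriate type.
std::unique_ptr<toy_vector_costs> create_costs(bool use_target_model)
{
    if (use_target_model)
        return std::make_unique<toy_target_costs>();
    return std::make_unique<toy_vector_costs>();
}
```

A caller only ever talks to the base-class interface, so a target changes costing behaviour by returning a different subclass from the creation hook rather than by registering several separate hooks.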
2021-11-04	Update TARGET_MEM_REF documentation	Richard Biener	1	-27/+31
	This updates the internals manual documentation of TARGET_MEM_REF and amends MEM_REF. The former was seriously out of date. 2021-11-04 Richard Biener <rguenther@suse.de> gcc/ * doc/generic.texi: Update TARGET_MEM_REF and MEM_REF documentation.
2021-11-03	aarch64: enable Ampere-1 CPU	Philipp Tomsich	1	-1/+1
	This adds support and a basic tuning model for the Ampere Computing "Ampere-1" CPU. The Ampere-1 implements the ARMv8.6 architecture in A64 mode and is modelled as a 4-wide issue (as with all modern micro-architectures, the chosen issue rate is a compromise between the maximum dispatch rate and the maximum rate of uops issued to the scheduler). This adds the -mcpu=ampere1 command-line option and the relevant cost information/tuning tables for the Ampere-1. gcc/ChangeLog: * config/aarch64/aarch64-cores.def (AARCH64_CORE): New Ampere-1 core. * config/aarch64/aarch64-tune.md: Regenerate. * config/aarch64/aarch64-cost-tables.h: Add extra costs for Ampere-1. * config/aarch64/aarch64.c: Add tuning structures for Ampere-1. * doc/invoke.texi: Add documentation for Ampere-1 core.
2021-11-02	Adjust testcase for O2 vect.	liuhongt	1	-19/+41
	Adjust code in check_vect_slp_store_usage to make it an exact pattern match of the corresponding testcases. These new target/xfail selectors are added as a temporary solution, and should be removed after the real issue is fixed for Wstringop-overflow. gcc/ChangeLog: * doc/sourcebuild.texi (vect_slp_v4qi_store_unalign, vect_slp_v2hi_store_unalign, vect_slp_v4hi_store_unalign, vect_slp_v4si_store_unalign): Document efficient target. (vect_slp_v4qi_store_unalign_1, vect_slp_v8qi_store_unalign_1, vect_slp_v16qi_store_unalign_1): Ditto. (vect_slp_v2hi_store_align, vect_slp_v2qi_store_align, vect_slp_v2si_store_align, vect_slp_v4qi_store_align): Ditto. (struct_4char_block_move, struct_8char_block_move, struct_16char_block_move): Ditto. gcc/testsuite/ChangeLog: PR testsuite/102944 * c-c++-common/Wstringop-overflow-2.c: Adjust target/xfail selector. * gcc.dg/Warray-bounds-48.c: Ditto. * gcc.dg/Warray-bounds-51.c: Ditto. * gcc.dg/Warray-parameter-3.c: Ditto. * gcc.dg/Wstringop-overflow-14.c: Ditto. * gcc.dg/Wstringop-overflow-21.c: Ditto. * gcc.dg/Wstringop-overflow-68.c: Ditto * gcc.dg/Wstringop-overflow-76.c: Ditto * gcc.dg/Wzero-length-array-bounds-2.c: Ditto. * lib/target-supports.exp (vect_slp_v4qi_store_unalign): New efficient target. (vect_slp_v4qi_store_unalign_1): Ditto. (struct_4char_block_move): Ditto. (struct_8char_block_move): Ditto. (struct_16char_block_move): Ditto. (vect_slp_v2hi_store_align): Ditto. (vect_slp_v2qi_store): Rename to .. (vect_slp_v2qi_store_align): .. this. (vect_slp_v4qi_store): Rename to .. (vect_slp_v4qi_store_align): .. This. (vect_slp_v8qi_store): Rename to .. (vect_slp_v8qi_store_unalign_1): .. This. (vect_slp_v16qi_store): Rename to .. (vect_slp_v16qi_store_unalign_1): .. This. (vect_slp_v2hi_store): Rename to .. (vect_slp_v2hi_store_unalign): .. This. (vect_slp_v4hi_store): Rename to .. (vect_slp_v4hi_store_unalign): This. (vect_slp_v2si_store): Rename to .. (vect_slp_v2si_store_align): .. This. (vect_slp_v4si_store): Rename to .. 
(vect_slp_v4si_store_unalign): Ditto. (check_vect_slp_aligned_store_usage): Rename to .. (check_vect_slp_store_usage): .. this and adjust code to make it an exact pattern match of corresponding testcase.
2021-11-01	diagnostics: escape non-ASCII source bytes for certain diagnostics	David Malcolm	1	-1/+42
	This patch adds support to GCC's diagnostic subsystem for escaping certain bytes and Unicode characters when quoting source code. Specifically, this patch adds a new flag rich_location::m_escape_on_output which is a hint from a diagnostic that non-ASCII bytes in the pertinent lines of the user's source code should be escaped when printed. The patch sets this for the following diagnostics: - when complaining about stray bytes in the program (when these are non-printable) - when complaining about "null character(s) ignored" - for -Wnormalized= (and generate source ranges for such warnings) The escaping is controlled by a new option: -fdiagnostics-escape-format=[unicode|bytes] For example, consider a diagnostic involving a source line containing the string "before" followed by the Unicode character U+03C0 ("GREEK SMALL LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF (a stray UTF-8 trailing byte), followed by the string "after", where the diagnostic highlights the U+03C0 character. By default, this line will be printed verbatim to the user when reporting a diagnostic at it, as: beforeπXafter ^ (using X for the stray byte to avoid putting invalid UTF-8 in this commit message) If the diagnostic sets the "escape" flag, it will be printed as: before<U+03C0><BF>after ^~~~~~~~ with -fdiagnostics-escape-format=unicode (the default), or as: before<CF><80><BF>after ^~~~~~~~ if the user supplies -fdiagnostics-escape-format=bytes. This only affects how the source is printed; it does not affect how column numbers are printed (as per -fdiagnostics-column-unit= and -fdiagnostics-column-origin=). gcc/c-family/ChangeLog: * c-lex.c (c_lex_with_flags): When complaining about non-printable CPP_OTHER tokens, set the "escape on output" flag. gcc/ChangeLog: * common.opt (fdiagnostics-escape-format=): New. (diagnostics_escape_format): New enum. (DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value. (DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise. 
* diagnostic-format-json.cc (json_end_diagnostic): Add "escape-source" attribute. * diagnostic-show-locus.c (exploc_with_display_col::exploc_with_display_col): Replace "tabstop" param with a cpp_char_column_policy and add an "aspect" param. Use these to compute m_display_col accordingly. (struct char_display_policy): New struct. (layout::m_policy): New field. (layout::m_escape_on_output): New field. (def_policy): New function. (make_range): Update for changes to exploc_with_display_col ctor. (default_print_decoded_ch): New. (width_per_escaped_byte): New. (escape_as_bytes_width): New. (escape_as_bytes_print): New. (escape_as_unicode_width): New. (escape_as_unicode_print): New. (make_policy): New. (layout::layout): Initialize new fields. Update m_exploc ctor call for above change to ctor. (layout::maybe_add_location_range): Update for changes to exploc_with_display_col ctor. (layout::calculate_x_offset_display): Update for change to cpp_display_width. (layout::print_source_line): Pass policy to cpp_display_width_computation. Capture cpp_decoded_char when calling process_next_codepoint. Move printing of source code to m_policy.m_print_cb. (line_label::line_label): Pass in policy rather than context. (layout::print_any_labels): Update for change to line_label ctor. (get_affected_range): Pass in policy rather than context, updating calls to location_compute_display_column accordingly. (get_printed_columns): Likewise, also for cpp_display_width. (correction::correction): Pass in policy rather than tabstop. (correction::compute_display_cols): Pass m_policy rather than m_tabstop to cpp_display_width. (correction::m_tabstop): Replace with... (correction::m_policy): ...this. (line_corrections::line_corrections): Pass in policy rather than context. (line_corrections::m_context): Replace with... (line_corrections::m_policy): ...this. (line_corrections::add_hint): Update to use m_policy rather than m_context. (line_corrections::add_hint): Likewise. 
(layout::print_trailing_fixits): Likewise. (selftest::test_display_widths): New. (selftest::test_layout_x_offset_display_utf8): Update to use policy rather than tabstop. (selftest::test_one_liner_labels_utf8): Add test of escaping source lines. (selftest::test_diagnostic_show_locus_one_liner_utf8): Update to use policy rather than tabstop. (selftest::test_overlapped_fixit_printing): Likewise. (selftest::test_overlapped_fixit_printing_utf8): Likewise. (selftest::test_overlapped_fixit_printing_2): Likewise. (selftest::test_tab_expansion): Likewise. (selftest::test_escaping_bytes_1): New. (selftest::test_escaping_bytes_2): New. (selftest::diagnostic_show_locus_c_tests): Call the new tests. * diagnostic.c (diagnostic_initialize): Initialize context->escape_format. (convert_column_unit): Update to use default character width policy. (selftest::test_diagnostic_get_location_text): Likewise. * diagnostic.h (enum diagnostics_escape_format): New enum. (diagnostic_context::escape_format): New field. * doc/invoke.texi (-fdiagnostics-escape-format=): New option. (-fdiagnostics-format=): Add "escape-source" attribute to examples of JSON output, and document it. * input.c (location_compute_display_column): Pass in "policy" rather than "tabstop", passing to cpp_byte_column_to_display_column. (selftest::test_cpp_utf8): Update to use cpp_char_column_policy. * input.h (class cpp_char_column_policy): New forward decl. (location_compute_display_column): Pass in "policy" rather than "tabstop". * opts.c (common_handle_option): Handle OPT_fdiagnostics_escape_format_. * selftest.c (temp_source_file::temp_source_file): New ctor overload taking a size_t. * selftest.h (temp_source_file::temp_source_file): Likewise. gcc/testsuite/ChangeLog: * c-c++-common/diagnostic-format-json-1.c: Add regexp to consume "escape-source" attribute. * c-c++-common/diagnostic-format-json-2.c: Likewise. * c-c++-common/diagnostic-format-json-3.c: Likewise. * c-c++-common/diagnostic-format-json-4.c: Likewise, twice. 
* c-c++-common/diagnostic-format-json-5.c: Likewise. * gcc.dg/cpp/warn-normalized-4-bytes.c: New test. * gcc.dg/cpp/warn-normalized-4-unicode.c: New test. * gcc.dg/encoding-issues-bytes.c: New test. * gcc.dg/encoding-issues-unicode.c: New test. * gfortran.dg/diagnostic-format-json-1.F90: Add regexp to consume "escape-source" attribute. * gfortran.dg/diagnostic-format-json-2.F90: Likewise. * gfortran.dg/diagnostic-format-json-3.F90: Likewise. libcpp/ChangeLog: * charset.c (convert_escape): Use encoding_rich_location when complaining about nonprintable unknown escape sequences. (cpp_display_width_computation::::cpp_display_width_computation): Pass in policy rather than tabstop. (cpp_display_width_computation::process_next_codepoint): Add "out" param and populate out if non-NULL. (cpp_display_width_computation::advance_display_cols): Pass NULL to process_next_codepoint. (cpp_byte_column_to_display_column): Pass in policy rather than tabstop. Pass NULL to process_next_codepoint. (cpp_display_column_to_byte_column): Pass in policy rather than tabstop. errors.c (cpp_diagnostic_get_current_location): New function, splitting out the logic from... (cpp_diagnostic): ...here. (cpp_warning_at): New function. (cpp_pedwarning_at): New function. * include/cpplib.h (cpp_warning_at): New decl for rich_location. (cpp_pedwarning_at): Likewise. (struct cpp_decoded_char): New. (struct cpp_char_column_policy): New. (cpp_display_width_computation::cpp_display_width_computation): Replace "tabstop" param with "policy". (cpp_display_width_computation::process_next_codepoint): Add "out" param. (cpp_display_width_computation::m_tabstop): Replace with... (cpp_display_width_computation::m_policy): ...this. (cpp_byte_column_to_display_column): Replace "tabstop" param with "policy". (cpp_display_width): Likewise. (cpp_display_column_to_byte_column): Likewise. * include/line-map.h (rich_location::escape_on_output_p): New. (rich_location::set_escape_on_output): New. 
(rich_location::m_escape_on_output): New. * internal.h (cpp_diagnostic_get_current_location): New decl. (class encoding_rich_location): New. * lex.c (skip_whitespace): Use encoding_rich_location when complaining about null characters. (warn_about_normalization): Generate a source range when complaining about improperly normalized tokens, rather than just a point, and use encoding_rich_location so that the source code is escaped on printing. * line-map.c (rich_location::rich_location): Initialize m_escape_on_output. Signed-off-by: David Malcolm <dmalcolm@redhat.com>
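The two escape formats described in this commit message can be reproduced with a short standalone sketch. It is not the code from diagnostic-show-locus.c; `escape_source`, `escape_format`, and `append_byte` are hypothetical names, and the UTF-8 decoding is deliberately simplified (it checks only the lead byte and continuation bytes, not overlong forms or surrogates).

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

enum class escape_format { unicode, bytes };

// Hypothetical helper: append one raw byte as "<XX>".
static void append_byte(std::string& out, unsigned char b)
{
    char buf[8];
    std::snprintf(buf, sizeof buf, "<%02X>", static_cast<unsigned>(b));
    out += buf;
}

// Escape every non-ASCII byte of `line`.  Well-formed multi-byte UTF-8
// sequences become "<U+XXXX>" (unicode) or one "<XX>" per byte (bytes);
// bytes that do not start a well-formed sequence are always "<XX>".
std::string escape_source(const std::string& line, escape_format fmt)
{
    std::string out;
    std::size_t i = 0;
    while (i < line.size()) {
        unsigned char b0 = static_cast<unsigned char>(line[i]);
        if (b0 < 0x80) {            // plain ASCII is copied through
            out += line[i++];
            continue;
        }
        std::size_t len = 0;        // expected sequence length, 0 = invalid lead
        unsigned cp = 0;            // decoded code point
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
        bool ok = len != 0 && i + len <= line.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            unsigned char bk = static_cast<unsigned char>(line[i + k]);
            if ((bk & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (bk & 0x3F);
        }
        if (!ok) {                  // stray byte: escape it on its own
            append_byte(out, b0);
            ++i;
            continue;
        }
        if (fmt == escape_format::unicode) {
            char buf[16];
            std::snprintf(buf, sizeof buf, "<U+%04X>", cp);
            out += buf;
        } else {
            for (std::size_t k = 0; k < len; ++k)
                append_byte(out, static_cast<unsigned char>(line[i + k]));
        }
        i += len;
    }
    return out;
}
```

Feeding it the line from the commit message ("before", U+03C0 encoded as 0xCF 0x80, a stray 0xBF, then "after") yields `before<U+03C0><BF>after` in the unicode format and `before<CF><80><BF>after` in the bytes format.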
2021-11-01	Rename duplicate_loop_to_header_edge to duplicate_loop_body_to_header_edge	Xionghu Luo	1	-2/+2
	gcc/ChangeLog: 2021-11-01 Xionghu Luo <luoxhu@linux.ibm.com> * cfghooks.c (cfg_hook_duplicate_loop_to_header_edge): Rename duplicate_loop_to_header_edge to duplicate_loop_body_to_header_edge. (cfg_hook_duplicate_loop_body_to_header_edge): Likewise. * cfghooks.h (struct cfg_hooks): Likewise. (cfg_hook_duplicate_loop_body_to_header_edge): Likewise. * cfgloopmanip.c (duplicate_loop_body_to_header_edge): Likewise. (clone_loop_to_header_edge): Likewise. * cfgloopmanip.h (duplicate_loop_body_to_header_edge): Likewise. * cfgrtl.c (struct cfg_hooks): Likewise. * doc/loop.texi: Likewise. * loop-unroll.c (unroll_loop_constant_iterations): Likewise. (unroll_loop_runtime_iterations): Likewise. (unroll_loop_stupid): Likewise. (apply_opt_in_copies): Likewise. * tree-cfg.c (struct cfg_hooks): Likewise. * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Likewise. (try_peel_loop): Likewise. * tree-ssa-loop-manip.c (copy_phi_node_args): Likewise. (gimple_duplicate_loop_body_to_header_edge): Likewise. (tree_transform_and_unroll_loop): Likewise. * tree-ssa-loop-manip.h (gimple_duplicate_loop_body_to_header_edge): Likewise.