|
This final patch in the series is much simpler and adds command-line support for -march=armv8.8-a,
making use of the +mops features added in the previous patches.
Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/ChangeLog:
* config/aarch64/aarch64-arches.def (armv8.8-a): Define.
* config/aarch64/aarch64.h (AARCH64_FL_V8_8): Define.
(AARCH64_FL_FOR_ARCH8_8): Define.
* doc/invoke.texi: Document -march=armv8.8-a.
|
|
This patch adds the +mops architecture extension flag from the 2021 Arm Architecture extensions, Armv8.8-a.
The +mops extensions introduce instructions to accelerate the memcpy, memset, memmove standard functions.
The first patch here uses the instructions in the inline memcpy expansion.
Further patches in the series will use similar instructions to inline memmove and memset.
A new param, aarch64-mops-memcpy-size-threshold, is introduced to control the size threshold above which to
emit the new sequence. Its default setting is 256 bytes, which is the same as the current threshold above
which we'd emit a libcall.
Bootstrapped and tested on aarch64-none-linux-gnu.
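As a rough illustration of the kind of code this affects, the sketch below performs a fixed-size copy that is larger than the default 256-byte threshold. The names copy_block and kBlockSize and the 512-byte size are made up for this example and do not come from the patch. Whether the compiler actually emits the new sequence depends on +mops being enabled and on the value of aarch64-mops-memcpy-size-threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical block size, chosen only because it is above the default
// 256-byte threshold mentioned in the commit message.
constexpr std::size_t kBlockSize = 512;

// A fixed-size copy: a plausible candidate for the inline memcpy expansion
// when the target has +mops (an assumption, not verified here).
void copy_block(unsigned char *dst, const unsigned char *src)
{
  assert(dst != nullptr && src != nullptr);
  std::memcpy(dst, src, kBlockSize);
}
```

On targets without +mops the same source simply compiles to the existing expansion or a libcall, so the example stays portable.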
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (mops): Define.
* config/aarch64/aarch64.c (aarch64_expand_cpymem_mops): Define.
(aarch64_expand_cpymem): Define.
* config/aarch64/aarch64.h (AARCH64_FL_MOPS): Define.
(AARCH64_ISA_MOPS): Define.
(TARGET_MOPS): Define.
(MOVE_RATIO): Adjust for TARGET_MOPS.
* config/aarch64/aarch64.md ("unspec"): Add UNSPEC_CPYMEM.
(aarch64_cpymemdi): New pattern.
(cpymemdi): Adjust for TARGET_MOPS.
* config/aarch64/aarch64.opt (aarch64-mops-memcpy-size-threshold):
New param.
* doc/invoke.texi (AArch64 Options): Document +mops.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/mops_1.c: New test.
|
|
gcc/ChangeLog:
* doc/extend.texi: Use @item for the first @itemx entry.
|
|
As discussed in PR ipa/103454, there are several benchmarks that regress
with -finline-functions-called-once. Runtimes:
- tramp3d with -Ofast: 31%
- exchange2 with -Ofast: 11-21%
- roms with -O2: 9-10%
- tonto with LTO: 2.5-3.5%
Build times:
- specfp2006 41% (mostly wrf that builds 71% faster)
- specint2006 1.5-3%
- specfp2017 64% (again mostly wrf)
- specint2017 2.5-3.5%
This patch adds two params to tweak the behaviour:
1) max-inline-functions-called-once-loop-depth limiting the loop depth
(this is useful primarily for exchange where the inlined function is in
loop depth 9)
2) max-inline-functions-called-once-insns
We already have large-function-insns/growth parameters, but these
also limit inlining of small functions, so reducing them will regress
very large functions that are hot.
Because inlining functions called once is meant just as a cleanup pass,
I think it makes sense to have a separate limit for it.
gcc/ChangeLog:
2021-12-09 Jan Hubicka <hubicka@ucw.cz>
* doc/invoke.texi (max-inline-functions-called-once-loop-depth,
max-inline-functions-called-once-insns): New parameters.
* ipa-inline.c (check_callers): Handle
param_inline_functions_called_once_loop_depth and
param_inline_functions_called_once_insns.
(edge_badness): Fix linebreaks.
* params.opt (param=max-inline-functions-called-once-loop-depth,
param=max-inline-functions-called-once-insns): New params.
|
|
Resolves:
PR middle-end/101751 - attribute access none with void pointer expects nonzero size
gcc/ChangeLog:
PR middle-end/101751
* doc/extend.texi (attribute access): Adjust.
* gimple-ssa-warn-access.cc (pass_waccess::maybe_check_access_sizes):
Treat access mode none on a void* argument as expecting as few as
zero bytes.
gcc/testsuite/ChangeLog:
PR middle-end/101751
* gcc.dg/Wstringop-overflow-86.c: New test.
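A minimal sketch of the situation the PR title describes: a function that only looks at the address of its void* argument, called with a pointer that refers to zero accessible bytes. The helper names (SKETCH_ACCESS_NONE, address_bits, span_in_bytes) are invented for this example, and the version guard is an assumption meant to keep the code compiling on compilers that lack the none access mode.

```cpp
#include <cassert>
#include <cstdint>

// Only apply the attribute where the "none" access mode is assumed to exist.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
# define SKETCH_ACCESS_NONE(n) __attribute__((access(none, n)))
#else
# define SKETCH_ACCESS_NONE(n)
#endif

// Never dereferences P, it only inspects the address value.
SKETCH_ACCESS_NONE(1)
std::uintptr_t address_bits(void *p)
{
  return reinterpret_cast<std::uintptr_t>(p);
}

// LAST may be a one-past-the-end pointer, i.e. point to zero bytes.
std::uintptr_t span_in_bytes(char *first, char *last)
{
  assert(first <= last);
  return address_bits(last) - address_bits(first);
}
```

Calling span_in_bytes (buf, buf + sizeof buf) passes a past-the-end pointer to address_bits, which is the kind of zero-size argument the adjusted check is meant to accept.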
|
|
The following patch adds support for relocation of the PCH blob on PCH
restore if we don't manage to get the preferred map slot for it.
The GTY stuff knows where all the pointers are, after all it relocates
it once during PCH save from the addresses where it was initially allocated
to addresses in the preferred map slot.
But, if we were to do it solely using GTY info upon PCH restore, we'd need
another set of GTY functions, which I think would make it less maintainable
and I think it would also be more costly at PCH restore time. Those
functions would need to call something to add bias to pointers that haven't
been marked yet and make sure not to add bias to any pointer twice.
So, this patch instead builds a relocation table (sorted list of addresses
in the blob which needs relocation) at PCH save time, stores it in a very
compact form into the gch file and upon restore, adjusts pointers in GTY
roots (that is right away in the root structures) and the addresses in the
relocation table.
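The idea of a sorted relocation table stored as uleb128 deltas can be sketched as follows. This is a simplified model written for illustration, not the code from ggc-common.c: the function names encode_reloc_table and apply_relocations, the use of byte offsets into a std::vector, and the 64-bit pointer width are all assumptions of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Append VALUE in unsigned LEB128 form: 7 bits per byte, low bits first,
// with the high bit set on every byte except the last.
void write_uleb128(std::vector<std::uint8_t> &out, std::uint64_t value)
{
  do {
    std::uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80;
    out.push_back(byte);
  } while (value != 0);
}

// Decode one ULEB128 value starting at POS and advance POS past it.
std::uint64_t read_uleb128(const std::vector<std::uint8_t> &in, std::size_t &pos)
{
  std::uint64_t value = 0;
  unsigned shift = 0;
  for (;;) {
    assert(pos < in.size());
    std::uint8_t byte = in[pos++];
    value |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0)
      return value;
    shift += 7;
  }
}

// Save side: sort the offsets that hold pointers and store the differences.
std::vector<std::uint8_t> encode_reloc_table(std::vector<std::uint64_t> offsets)
{
  std::sort(offsets.begin(), offsets.end());
  std::vector<std::uint8_t> table;
  std::uint64_t prev = 0;
  for (std::uint64_t off : offsets) {
    write_uleb128(table, off - prev);
    prev = off;
  }
  return table;
}

// Restore side: walk the table and add BIAS to each pointer in the blob.
void apply_relocations(std::vector<std::uint8_t> &blob,
                       const std::vector<std::uint8_t> &table,
                       std::uint64_t bias)
{
  std::size_t pos = 0;
  std::uint64_t offset = 0;
  while (pos < table.size()) {
    offset += read_uleb128(table, pos);
    assert(offset + sizeof(std::uint64_t) <= blob.size());
    std::uint64_t ptr;
    std::memcpy(&ptr, &blob[offset], sizeof ptr);
    ptr += bias;
    std::memcpy(&blob[offset], &ptr, sizeof ptr);
  }
}
```

Because the offsets are sorted, each delta is small, and any delta below 128 fits in a single byte, which is why such a table can stay compact.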
The cost on stdc++.gch/O2g.gch (previously 85MB large) is about 3% file size
growth, there are 2.5 million pointers that need relocation in the gch blob
and the relocation table uses uleb128 for address deltas and needs ~1.01 bytes
for one address that needs relocation, and about 20% compile time during
PCH save (I think it is mainly because of the need to qsort those 2.5
million pointers). On PCH restore, if it doesn't need relocation (the usual
case), it is just an extra fread of sizeof (size_t) data and fseek
(in my tests real time on vanilla tree for #include <bits/stdc++.h> CU
was ~0.175s and with the patch but no relocation ~0.173s), while if it needs
relocation it took ~0.193s, i.e. 11.5% slower.
Without PCH that
#include <bits/stdc++.h>
int i;
testcase compiles with -O2 -g in ~1.199s, i.e. 6.2 times slower than PCH with
relocation and 6.9 times slower than PCH without relocation.
The discovery of the pointers in the blob that need relocation is done
in the relocate_ptrs hook which does the pointer relocation during PCH save.
Unfortunately, I had to make one change to the gengtype stuff due to the
nested_ptr feature of GTY, which some libcpp headers and stringpool.c use.
The relocate_ptrs hook had 2 arguments, pointer to the pointer and a cookie.
When relocate_ptrs is done, in most cases it is called solely on the
subfields of the current object, so e.g.
if ((void *)(x) == this_obj)
op (&((*x).u.fld[0].rt_rtx), cookie);
so relocate_ptrs can assert that ptr_p is within the
state->ptrs[state->ptrs_i]->obj ..
state->ptrs[state->ptrs_i]->obj+state->ptrs[state->ptrs_i]->size-sizeof(void*)
range and compute from that the address in the blob which will need
relocation (state->ptrs[state->ptrs_i]->new_addr is the new address
given to it and ptr_p-state->ptrs[state->ptrs_i]->obj is the relative
offset). Unfortunately, for nested_ptr gengtype emits something like:
{
union tree_node * x0 =
((*x).val.node.node) ? HT_IDENT_TO_GCC_IDENT (HT_NODE (((*x).val.node.node))) : NULL;
if ((void *)(x) == this_obj)
op (&(x0), cookie);
(*x).val.node.node = (x0) ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT ((x0))) : NULL;
}
so relocate_ptrs is called with an address of some temporary variable and
so doesn't know where the pointer will finally be.
So, I've added another argument to relocate_ptrs (and to
gt_pointer_operator). For the most common case I pass NULL as the new middle
argument to that function, first one remains pointer to the pointer that
needs adjustment and last the cookie. The NULL seems to be cheap to compute
and short in the gt*.[ch] files and stands for "ptr_p is an address within
this_obj's range, remember its address". For the nested_ptr case, the
new middle argument contains the actual address of the pointer that might need
to be relocated, so instead of the above there is
op (&(x0), &((*x).val.node.node), cookie);
in there. And finally, e.g. for the reorder case I need a way to tell
relocate_ptrs to ignore a particular address for relocation purposes
and only treat it the old way. For that I've used the case when
the first and second arguments are equal.
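The three cases for the middle argument can be modelled with a small sketch. It only records which addresses would need relocation and leaves out the actual pointer adjustment; sketch_state and note_pointer are hypothetical names for this illustration and do not match the real ggc-common.c code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for the per-object state used during PCH save.
struct sketch_state {
  char *obj;                                // start of the object being walked
  std::size_t size;                         // size of that object in bytes
  std::uintptr_t new_addr;                  // address the object gets in the blob
  std::vector<std::uintptr_t> reloc_addrs;  // blob addresses needing relocation
};

// Callback with (pointer to the pointer, middle argument, cookie).
void note_pointer(void *ptr_p, void *real_ptr_p, void *cookie)
{
  sketch_state *state = static_cast<sketch_state *>(cookie);

  // First and second argument equal: treat it the old way, record nothing.
  if (real_ptr_p == ptr_p)
    return;

  // NULL middle argument: PTR_P itself lies within the object.
  // Non-NULL middle argument: it names where the pointer really lives.
  char *where = static_cast<char *>(real_ptr_p ? real_ptr_p : ptr_p);
  assert(where >= state->obj
         && where + sizeof(void *) <= state->obj + state->size);
  state->reloc_addrs.push_back(
      state->new_addr + static_cast<std::uintptr_t>(where - state->obj));
}
```

In the nested_ptr situation the first argument would be the address of a temporary, so only the middle argument tells the callback which field inside the object to remember.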
In order to enable support for mapping PCH as fallback at different
addresses than the preferred ones, a small change is needed to the
host pch_use_address hooks. One change I've done to all of them is
the change of the type of the first argument from void * to void *&,
such that the actual address can be told to the callers (or shall I
instead use void **?), but another change that still needs to be done
in them if they want the relocation is to actually not fail if they couldn't
get a preferred address, but instead modify what the first argument
refers to. I've done that only for host-linux.c and Iain is testing
similar change for host-darwin.c. Didn't change hpux, netbsd, openbsd,
solaris, mingw32 or the fallbacks because I can't test those.
Tested also with the:
--- gcc/config/host-linux.c.jj 2021-12-06 22:22:42.007777367 +0100
+++ gcc/config/host-linux.c 2021-12-07 00:21:53.052674040 +0100
@@ -191,6 +191,8 @@ linux_gt_pch_use_address (void *&base, s
if (size == 0)
return -1;
+base = (char *) base + ((size + 8191) & (size_t) -4096);
+
/* Try to map the file with MAP_PRIVATE. */
addr = mmap (base, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, offset);
hack which forces all PCH restores to be relocated. An earlier version of the
patch has also been regtested with base = (char *) base + 16384; in that spot,
so both relocation to a non-overlapping spot and to an overlapping spot have
been tested.
2021-12-09 Jakub Jelinek <jakub@redhat.com>
PR pch/71934
* coretypes.h (gt_pointer_operator): Use 3 pointer arguments instead
of two.
* gengtype.c (struct walk_type_data): Add in_nested_ptr argument.
(walk_type): Temporarily set d->in_nested_ptr around nested_ptr
handling.
(write_types_local_user_process_field): Pass a new middle pointer
to gt_pointer_operator op calls, if d->in_nested_ptr pass there
address of d->prev_val[2], otherwise NULL.
(write_types_local_process_field): Likewise.
* ggc-common.c (relocate_ptrs): Add real_ptr_p argument. If equal
to ptr_p, do nothing, otherwise if NULL remember ptr_p's
or if non-NULL real_ptr_p's corresponding new address in
reloc_addrs_vec.
(reloc_addrs_vec): New variable.
(compare_ptr, read_uleb128, write_uleb128): New functions.
(gt_pch_save): When iterating over objects through relocate_ptrs,
save current i into state.ptrs_i. Sort reloc_addrs_vec and emit
it as uleb128 of differences between pointer addresses into the
PCH file.
(gt_pch_restore): Allow restoring of PCH to a different address
than the preferred one, in that case adjust global pointers by bias
and also adjust by bias addresses read from the relocation table
as uleb128 differences. Otherwise fseek over it. Perform
gt_pch_restore_stringpool only after adjusting callbacks and for
callback adjustments also take into account the bias.
(default_gt_pch_use_address): Change type of first argument from
void * to void *&.
(mmap_gt_pch_use_address): Likewise.
* ggc-tests.c (gt_pch_nx): Pass NULL as new middle argument to op.
* hash-map.h (hash_map::pch_nx_helper): Likewise.
(gt_pch_nx): Likewise.
* hash-set.h (gt_pch_nx): Likewise.
* hash-table.h (gt_pch_nx): Likewise.
* hash-traits.h (ggc_remove::pch_nx): Likewise.
* hosthooks-def.h (default_gt_pch_use_address): Change type of first
argument from void * to void *&.
(mmap_gt_pch_use_address): Likewise.
* hosthooks.h (struct host_hooks): Change type of first argument of
gt_pch_use_address hook from void * to void *&.
* machmode.h (gt_pch_nx): Expect a callback with 3 pointers instead of
two in the middle argument.
* poly-int.h (gt_pch_nx): Likewise.
* stringpool.c (gt_pch_nx): Pass NULL as new middle argument to op.
* tree-cfg.c (gt_pch_nx): Likewise, except for LOCATION_BLOCK pass
the same &(block) twice.
* value-range.h (gt_pch_nx): Pass NULL as new middle argument to op.
* vec.h (gt_pch_nx): Likewise.
* wide-int.h (gt_pch_nx): Likewise.
* config/host-darwin.c (darwin_gt_pch_use_address): Change type of
first argument from void * to void *&.
* config/host-darwin.h (darwin_gt_pch_use_address): Likewise.
* config/host-hpux.c (hpux_gt_pch_use_address): Likewise.
* config/host-linux.c (linux_gt_pch_use_address): Likewise. If
it couldn't succeed to mmap at the preferred location, set base
to the actual one. Update addr in the manual reading loop instead of
base.
* config/host-netbsd.c (netbsd_gt_pch_use_address): Change type of
first argument from void * to void *&.
* config/host-openbsd.c (openbsd_gt_pch_use_address): Likewise.
* config/host-solaris.c (sol_gt_pch_use_address): Likewise.
* config/i386/host-mingw32.c (mingw32_gt_pch_use_address): Likewise.
* config/rs6000/rs6000-gen-builtins.c (write_init_file): Pass NULL
as new middle argument to op in the generated code.
* doc/gty.texi: Adjust samples for the addition of middle pointer
to gt_pointer_operator callback.
gcc/ada/
* gcc-interface/decl.c (gt_pch_nx): Pass NULL as new middle argument
to op.
gcc/c-family/
* c-pch.c (c_common_no_more_pch): Pass a temporary void * var
with NULL value instead of NULL to host_hooks.gt_pch_use_address.
gcc/c/
* c-decl.c (resort_field_decl_cmp): Pass the same pointer twice
to resort_data.new_value.
gcc/cp/
* module.cc (nop): Add another void * argument.
* name-lookup.c (resort_member_name_cmp): Pass the same pointer twice
to resort_data.new_value.
|
|
MIPS release 6 requires that lw/ld/sw/sd work with
unaligned addresses, though this can be implemented either
fully in hardware or by trap-and-emulate.
Since it doesn't have to be fully done by hardware, we add a
pair of options -m(no-)unaligned-access. Kernels may need them.
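For context, the kind of source code where this matters is an access to a word at an address that is not known to be aligned. The sketch below uses the portable memcpy idiom; load_u32 and store_u32 are names invented for this example. How the compiler expands these copies with or without -munaligned-access is not shown here and would have to be checked in the generated assembly.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a 32-bit value from a possibly unaligned address.
std::uint32_t load_u32(const unsigned char *p)
{
  assert(p != nullptr);
  std::uint32_t value;
  std::memcpy(&value, p, sizeof value);
  return value;
}

// Write a 32-bit value to a possibly unaligned address.
void store_u32(unsigned char *p, std::uint32_t value)
{
  assert(p != nullptr);
  std::memcpy(p, &value, sizeof value);
}
```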
gcc/ChangeLog:
* config/mips/mips.h (ISA_HAS_UNALIGNED_ACCESS, STRICT_ALIGNMENT):
R6 supports unaligned access.
* config/mips/mips.md (movmisalign<mode>): Likewise.
* config/mips/mips.opt: Add -m(no-)unaligned-access.
* doc/invoke.texi: Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/mips/mips.exp: Add unaligned-access.
* gcc.target/mips/unaligned-2.c: New test.
* gcc.target/mips/unaligned-3.c: New test.
|
|
Add -mmove-max=bits and -mstore-max=bits to enable 256-bit/512-bit move
and store, independent of -mprefer-vector-width=bits:
1. Add X86_TUNE_AVX512_MOVE_BY_PIECES and X86_TUNE_AVX512_STORE_BY_PIECES
which are enabled for Intel Sapphire Rapids processor.
2. Add -mmove-max=bits to set the maximum number of bits that can be moved from
memory to memory efficiently. The default value is derived from
X86_TUNE_AVX512_MOVE_BY_PIECES, X86_TUNE_AVX256_MOVE_BY_PIECES, and the
preferred vector width.
3. Add -mstore-max=bits to set the maximum number of bits that can be stored to
memory efficiently. The default value is derived from
X86_TUNE_AVX512_STORE_BY_PIECES, X86_TUNE_AVX256_STORE_BY_PIECES and the
preferred vector width.
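A small sketch of code whose expansion these options could influence: a 64-byte (512-bit) memory-to-memory copy and a 64-byte store of zeros. The packet type and function names are invented for this example. Whether a given compiler turns each of them into a single 512-bit move or store depends on the target and on the -mmove-max=bits and -mstore-max=bits values, which this sketch does not verify.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical 64-byte payload, i.e. 512 bits.
struct packet {
  unsigned char bytes[64];
};

// Memory-to-memory move of 64 bytes.
void copy_packet(packet *dst, const packet *src)
{
  assert(dst != nullptr && src != nullptr);
  *dst = *src;
}

// Store of 64 zero bytes.
void clear_packet(packet *p)
{
  assert(p != nullptr);
  std::memset(p, 0, sizeof *p);
}
```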
gcc/
PR target/103269
* config/i386/i386-expand.c (ix86_expand_builtin): Pass PVW_NONE
and PVW_NONE to ix86_target_string.
* config/i386/i386-options.c (ix86_target_string): Add arguments
for move_max and store_max.
(ix86_target_string::add_vector_width): New lambda.
(ix86_debug_options): Pass ix86_move_max and ix86_store_max to
ix86_target_string.
(ix86_function_specific_print): Pass ptr->x_ix86_move_max and
ptr->x_ix86_store_max to ix86_target_string.
(ix86_valid_target_attribute_tree): Handle x_ix86_move_max and
x_ix86_store_max.
(ix86_option_override_internal): Set the default x_ix86_move_max
and x_ix86_store_max.
* config/i386/i386-options.h (ix86_target_string): Add
prefer_vector_width and prefer_vector_width.
* config/i386/i386.h (TARGET_AVX256_MOVE_BY_PIECES): Removed.
(TARGET_AVX256_STORE_BY_PIECES): Likewise.
(MOVE_MAX): Use 64 if ix86_move_max or ix86_store_max ==
PVW_AVX512. Use 32 if ix86_move_max or ix86_store_max >=
PVW_AVX256.
(STORE_MAX_PIECES): Use 64 if ix86_store_max == PVW_AVX512.
Use 32 if ix86_store_max >= PVW_AVX256.
* config/i386/i386.opt: Add -mmove-max=bits and -mstore-max=bits.
* config/i386/x86-tune.def (X86_TUNE_AVX512_MOVE_BY_PIECES): New.
(X86_TUNE_AVX512_STORE_BY_PIECES): Likewise.
* doc/invoke.texi: Document -mmove-max=bits and -mstore-max=bits.
gcc/testsuite/
PR target/103269
* gcc.target/i386/pieces-memcpy-17.c: New test.
* gcc.target/i386/pieces-memcpy-18.c: Likewise.
* gcc.target/i386/pieces-memcpy-19.c: Likewise.
* gcc.target/i386/pieces-memcpy-20.c: Likewise.
* gcc.target/i386/pieces-memcpy-21.c: Likewise.
* gcc.target/i386/pieces-memset-45.c: Likewise.
* gcc.target/i386/pieces-memset-46.c: Likewise.
* gcc.target/i386/pieces-memset-47.c: Likewise.
* gcc.target/i386/pieces-memset-48.c: Likewise.
* gcc.target/i386/pieces-memset-49.c: Likewise.
|
|
1. On some targets, like PowerPC, reference to ifunc function resolver
must be non-local so that compiler will properly emit PLT call. Add
TARGET_IFUNC_REF_LOCAL_OK to allow binding indirect function resolver
locally for targets which don't require special PLT call sequence.
2. Add ix86_call_use_plt_p to call local ifunc function resolvers via
PLT.
gcc/
PR target/51469
PR target/83782
* target.def (ifunc_ref_local_ok): Add a target hook.
* varasm.c (default_binds_local_p_3): Force indirect function
resolver non-local only if targetm.ifunc_ref_local_ok returns
false.
* config/i386/i386-expand.c (ix86_expand_call): Call
ix86_call_use_plt_p to check if PLT should be used.
* config/i386/i386-protos.h (ix86_call_use_plt_p): New.
* config/i386/i386.c (output_pic_addr_const): Call
ix86_call_use_plt_p to check if "@PLT" is needed.
(ix86_call_use_plt_p): New.
(TARGET_IFUNC_REF_LOCAL_OK): New.
* doc/tm.texi.in: Add TARGET_IFUNC_REF_LOCAL_OK.
* doc/tm.texi: Regenerated.
gcc/testsuite/
PR target/51469
PR target/83782
* gcc.target/i386/pr83782-1.c: New test.
* gcc.target/i386/pr83782-2.c: Likewise.
|
|
So, if we want to make PCH work for PIEs, I'd say we can:
1) add a new GTY option, say callback, which would act like
skip for non-PCH and for PCH would make us skip it but
remember for address bias translation
2) drop the skip for tree_translation_unit_decl::language
3) change get_unnamed_section to have const char * as
last argument instead of const void *, change
unnamed_section::data also to const char * and update
everything related to that
4) maybe add a host hook whether it is ok to support binaries
changing addresses (the only thing I'm worried about is whether
some host that uses function descriptors allocates them
dynamically instead of having them somewhere in the
executable)
5) maybe add a gengtype warning if it sees in GTY tracked
structure a function pointer without that new callback
option
Here is 1), 2), 3) implemented.
Note, on stdc++.h.gch/O2g.gch there are just those 10 relocations without
the second patch, with it a few more, but nothing huge. And for non-PIEs
there isn't really any extra work on the load side except freading two scalar
values and fseek.
2021-12-03 Jakub Jelinek <jakub@redhat.com>
PR pch/71934
gcc/
* ggc.h (gt_pch_note_callback): Declare.
* gengtype.h (enum typekind): Add TYPE_CALLBACK.
(callback_type): Declare.
* gengtype.c (dbgprint_count_type_at): Handle TYPE_CALLBACK.
(callback_type): New variable.
(process_gc_options): Add CALLBACK argument, handle callback
option.
(set_gc_used_type): Adjust process_gc_options caller, if callback,
set type to &callback_type.
(output_mangled_typename): Handle TYPE_CALLBACK.
(walk_type): Likewise. Handle callback option.
(write_types_process_field): Handle TYPE_CALLBACK.
(write_types_local_user_process_field): Likewise.
(write_types_local_process_field): Likewise.
(write_root): Likewise.
(dump_typekind): Likewise.
(dump_type): Likewise.
* gengtype-state.c (type_lineloc): Handle TYPE_CALLBACK.
(state_writer::write_state_callback_type): New method.
(state_writer::write_state_type): Handle TYPE_CALLBACK.
(read_state_callback_type): New function.
(read_state_type): Handle TYPE_CALLBACK.
* ggc-common.c (callback_vec): New variable.
(gt_pch_note_callback): New function.
(gt_pch_save): Stream out gt_pch_save function address and relocation
table.
(gt_pch_restore): Stream in saved gt_pch_save function address and
relocation table and apply relocations if needed.
* doc/gty.texi (callback): Document new GTY option.
* varasm.c (get_unnamed_section): Change callback argument's type and
last argument's type from const void * to const char *.
(output_section_asm_op): Change argument's type from const void *
to const char *, remove unnecessary cast.
* tree-core.h (struct tree_translation_unit_decl): Drop GTY((skip))
from language member.
* output.h (unnamed_section_callback): Change argument type from
const void * to const char *.
(struct unnamed_section): Use GTY((callback)) instead of GTY((skip))
for callback member. Change data member type from const void *
to const char *.
(struct noswitch_section): Use GTY((callback)) instead of GTY((skip))
for callback member.
(get_unnamed_section): Change callback argument's type and
last argument's type from const void * to const char *.
(output_section_asm_op): Change argument's type from const void *
to const char *.
* config/avr/avr.c (avr_output_progmem_section_asm_op): Likewise.
Remove unneeded cast.
* config/darwin.c (output_objc_section_asm_op): Change argument's type
from const void * to const char *.
* config/pa/pa.c (som_output_text_section_asm_op): Likewise.
(som_output_comdat_data_section_asm_op): Likewise.
* config/rs6000/rs6000.c (rs6000_elf_output_toc_section_asm_op):
Likewise.
(rs6000_xcoff_output_readonly_section_asm_op): Likewise. Instead
of dereferencing directive hardcode variable names and decide based on
whether directive is NULL or not.
(rs6000_xcoff_output_readwrite_section_asm_op): Change argument's type
from const void * to const char *.
(rs6000_xcoff_output_tls_section_asm_op): Likewise. Instead
of dereferencing directive hardcode variable names and decide based on
whether directive is NULL or not.
(rs6000_xcoff_output_toc_section_asm_op): Change argument's type
from const void * to const char *.
(rs6000_xcoff_asm_init_sections): Adjust get_unnamed_section callers.
gcc/c-family/
* c-pch.c (struct c_pch_validity): Remove pch_init member.
(pch_init): Don't initialize v.pch_init.
(c_common_valid_pch): Don't warn and punt if .text addresses change.
libcpp/
* include/line-map.h (class line_maps): Add GTY((callback)) to
reallocator and round_alloc_size members.
|
|
FreeBSD 1 and FreeBSD 2, both still a.out, have been end of life for
over two decades and GCC has not been supporting them for ages, too,
so simply remove references.
gcc:
* doc/install.texi (*-*-freebsd*): Remove references to
FreeBSD 1 and FreeBSD 2.
|
|
PR gcov-profile/96092
gcc/ChangeLog:
* common.opt: New option.
* coverage.c (coverage_begin_function): Emit filename with
remap_profile_filename.
* doc/invoke.texi: Document the new option.
* file-prefix-map.c (add_profile_prefix_map): New.
(remap_profile_filename): Likewise.
* file-prefix-map.h (add_profile_prefix_map): Likewise.
(remap_profile_filename): Likewise.
* lto-opts.c (lto_write_options): Handle
OPT_fprofile_prefix_map_.
* opts-global.c (handle_common_deferred_options): Likewise.
* opts.c (common_handle_option): Likewise.
(gen_command_line_string): Likewise.
* profile.c (output_location): Emit filename with
remap_profile_filename.
|
|
For PR61825, honza changed tree_single_nonzero_warnv_p to prevent a later
declaration from marking a function as weak after we've determined that it
wasn't weak before. But we shouldn't do that for speculative folding; we
should only do it when we actually need a constant value. In C++, such a
context is called "manifestly constant-evaluated". In fold, this seems to
correspond to the folding_initializer flag, since in C this situation only
occurs in static initializers.
This change makes nonzero-1.c well-formed; I've added a nonzero-1a.c to
verify that we delete the null check eventually if there is no weak
redeclaration.
The varasm.c change is so that if we do get the weak redeclaration error, we
get it at the position of the weak declaration rather than the previous
declaration.
Using the FOLD_INIT paths also affects floating point arithmetic: notably,
this makes floating point division by zero in a manifestly
constant-evaluated context constant, as in a C static initializer. I've had
some success convincing CWG that this is the right direction; C++ should
follow C's floating point semantics more than we have been doing, and Joseph
says that the C policy is that Annex F overrides other parts of the standard
that say that some operations are undefined. But since we're in stage 3,
I'm only making this change with the new flag -fconstexpr-fp-except. It may
turn on by default in a future release.
I think this distinction is only relevant for binary operations; arithmetic
for the floating point case, comparison for possibly non-zero addresses.
PR c++/103310
gcc/ChangeLog:
* fold-const.c (maybe_nonzero_address): Use get_create or get
depending on folding_initializer.
(fold_binary_initializer_loc): New.
* fold-const.h (fold_binary_initializer_loc): Declare.
* varasm.c (mark_weak): Don't use the decl location.
* doc/invoke.texi: Document -fconstexpr-fp-except.
gcc/c-family/ChangeLog:
* c.opt: Add -fconstexpr-fp-except.
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_binary_expression): Use
fold_binary_initializer_loc if manifestly cxeval.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/constexpr-fp-except1.C: New test.
* g++.dg/cpp1z/constexpr-if36.C: New test.
* gcc.dg/tree-ssa/nonzero-1.c: Now well-formed.
* gcc.dg/tree-ssa/nonzero-1a.c: New test.
|
|
bootstrap.
gcc/ChangeLog:
* doc/install.texi (Prerequisites): Add note that D front end now
requires GDC installed in order to bootstrap.
(Building): Add D compiler section, referencing prerequisites.
|
|
This patch adds SLP support for IFN_GATHER_LOAD. Like the SLP
support for IFN_MASK_LOAD, it works by treating only some of the
arguments as child nodes. Unlike IFN_MASK_LOAD, it requires the
other arguments (base, scale, and extension type) to be the same
for all calls in the group. It does not require/expect the loads
to be in a group (which probably wouldn't make sense for gathers).
I was worried about the possible alias effect of moving gathers
around to be part of the same SLP group. The patch therefore
makes vect_analyze_data_ref_dependence treat gathers and scatters
as a top-level concern, punting if the accesses aren't completely
independent and if the user hasn't told us that a particular
VF is safe. I think in practice we already punted in the same
circumstances; the idea is just to make it more explicit.
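A loop of roughly the following shape contains two indexed loads per iteration that share the same base and element size, which is the sort of group the description is about. The function name gather_pairs is invented for this example, and whether the vectorizer really forms an SLP gather here depends on the target and options; the sketch makes no claim about the generated code.

```cpp
#include <cassert>

// Two indexed loads per iteration from the same base array DATA.
void gather_pairs(double *__restrict out, const double *data,
                  const int *idx, int n)
{
  assert(n >= 0);
  for (int i = 0; i < n; ++i) {
    out[2 * i] = data[idx[2 * i]];
    out[2 * i + 1] = data[idx[2 * i + 1]];
  }
}
```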
gcc/
PR tree-optimization/102467
* doc/sourcebuild.texi (vect_gather_load_ifn): Document.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependence):
Commonize safelen handling. Punt for anything involving
gathers and scatters unless safelen says otherwise.
* tree-vect-slp.c (arg1_map): New variable.
(vect_get_operand_map): Handle IFN_GATHER_LOAD.
(vect_build_slp_tree_1): Likewise.
(vect_build_slp_tree_2): Likewise.
(compatible_calls_p): If vect_get_operand_map returns nonnull,
check that any skipped arguments are equal.
(vect_slp_analyze_node_operations_1): Tighten reduction check.
* tree-vect-stmts.c (check_load_store_for_partial_vectors): Take
an ncopies argument.
(vect_get_gather_scatter_ops): Take slp_node and ncopies arguments.
Handle SLP nodes.
(vectorizable_store, vectorizable_load): Adjust accordingly.
gcc/testsuite/
* lib/target-supports.exp
(check_effective_target_vect_gather_load_ifn): New target test.
* gcc.dg/vect/vect-gather-1.c: New test.
* gcc.dg/vect/vect-gather-2.c: Likewise.
* gcc.target/aarch64/sve/gather_load_11.c: Likewise.
|
|
This patch adds support for reductions involving calls to fmax*()
and fmin*(), without the -ffast-math flags that allow them to be
converted to MAX_EXPR and MIN_EXPR.
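For illustration, a reduction of the kind described might look like the sketch below. The function name max_element_fmax is made up for this example. It relies only on the standard behaviour of fmax, which returns the numeric operand when the other one is a quiet NaN; whether a particular compiler vectorizes the loop is not shown here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Maximum of N doubles computed with fmax, so NaN inputs are skipped
// rather than propagated.
double max_element_fmax(const double *x, std::size_t n)
{
  assert(x != nullptr || n == 0);
  double m = -std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < n; ++i)
    m = std::fmax(m, x[i]);
  return m;
}
```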
gcc/
* doc/md.texi (reduc_fmin_scal_@var{m}): Document.
(reduc_fmax_scal_@var{m}): Likewise.
* optabs.def (reduc_fmax_scal_optab): New optab.
(reduc_fmin_scal_optab): Likewise.
* internal-fn.def (REDUC_FMAX, REDUC_FMIN): New functions.
* tree-vect-loop.c (reduction_fn_for_scalar_code): Handle
CASE_CFN_FMAX and CASE_CFN_FMIN.
(neutral_op_for_reduction): Likewise.
(needs_fold_left_reduction_p): Likewise.
* config/aarch64/iterators.md (FMAXMINV): New iterator.
(fmaxmin): Handle UNSPEC_FMAXNMV and UNSPEC_FMINNMV.
* config/aarch64/aarch64-simd.md (reduc_<optab>_scal_<mode>): Fix
unspec mode.
(reduc_<fmaxmin>_scal_<mode>): New pattern.
* config/aarch64/aarch64-sve.md (reduc_<fmaxmin>_scal_<mode>):
Likewise.
gcc/testsuite/
* gcc.dg/vect/vect-fmax-1.c: New test.
* gcc.dg/vect/vect-fmax-2.c: Likewise.
* gcc.dg/vect/vect-fmax-3.c: Likewise.
* gcc.dg/vect/vect-fmin-1.c: New test.
* gcc.dg/vect/vect-fmin-2.c: Likewise.
* gcc.dg/vect/vect-fmin-3.c: Likewise.
* gcc.target/aarch64/fmaxnm_1.c: Likewise.
* gcc.target/aarch64/fmaxnm_2.c: Likewise.
* gcc.target/aarch64/fminnm_1.c: Likewise.
* gcc.target/aarch64/fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/fmaxnm_3.c: Likewise.
* gcc.target/aarch64/sve/fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/fminnm_3.c: Likewise.
|
|
gcc/ChangeLog:
* doc/invoke.texi: Use @option for -Wuninitialized.
|
|
The following patch implements the C++23 Multidimensional subscript operator
P2128R6 paper.
As C++20 and older only allow a single expression in between []s (albeit
for C++20 with a deprecation warning if it is a comma expression) and even
in C++23 and for the coming years I think the vast majority of subscript
expressions will still have a single expression and even in C++23 it is
quite special, as e.g. the builtin operator requires exactly one
assignment expression, the patch attempts to optimize for that case and
if possible not to slow down that common case (or use more memory for it).
So, already during parsing it differentiates between that (uses a single
index_exp tree in that case) and the new cases (zero or two+ expressions
in the list), for which it sets index_exp to NULL_TREE and uses a
releasing_vec instead similarly to how e.g. finish_call_expr uses it.
In call.c it introduces new functions build_op_subscript{,_1} which are
something in between build_new_op{,_1} and build_op_call{,_1}.
The former requires fixed number of arguments (and the patch still uses
it for the common case of subscript with exactly one index expression),
the latter handles variable number of arguments but is too CALL_EXPR specific
and handles various cases that are unnecessary for the subscript.
Right now the subscript for 0 or 2+ expressions doesn't need to deal with
builtin candidates and so is quite simple.
As discussed in the paper, for backwards compatibility, if for 2+ index
expressions build_op_subscript fails (called with tf_none) and the
expressions together form a valid comma expression (again checked with
tf_none), it is used in that C++20-ish way with a pedwarn about it, but if
even that fails, build_op_subscript is called again with standard complain
flags to diagnose it in the new way. And similarly for the builtin case.
The -Wcomma-subscript warning used to be enabled by default unless
-Wno-deprecated. Since the C/C++98..20 behavior is no longer deprecated,
but ill-formed or changed meaning, it is now for C++23 enabled by
default regardless of -Wno-deprecated and controls the pedwarn (but not the
errors emitted if something wasn't valid before and isn't valid in C++23
either).
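A short sketch of what the feature looks like from the user side. The grid class and the cell helper are invented for this example. The two-argument operator[] is only compiled when the __cpp_multidimensional_subscript feature macro is available, so that the code still builds in older language modes, where it falls back to a named accessor.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class grid {
public:
  grid(std::size_t rows, std::size_t cols)
    : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

#if defined(__cpp_multidimensional_subscript) \
    && __cpp_multidimensional_subscript >= 202110L
  // C++23: operator[] may take more than one argument.
  int &operator[](std::size_t r, std::size_t c) { return at(r, c); }
#endif

  // Named accessor usable in every language mode.
  int &at(std::size_t r, std::size_t c)
  {
    assert(r < rows_ && c < cols_);
    return data_[r * cols_ + c];
  }

private:
  std::size_t rows_;
  std::size_t cols_;
  std::vector<int> data_;
};

// Uses g[r, c] where the compiler supports it, g.at (r, c) otherwise.
int &cell(grid &g, std::size_t r, std::size_t c)
{
#if defined(__cpp_multidimensional_subscript) \
    && __cpp_multidimensional_subscript >= 202110L
  return g[r, c];
#else
  return g.at(r, c);
#endif
}
```

With a class like this, g[r, c] calls the two-argument operator in C++23, whereas in C++20 the same spelling would have been parsed as a comma expression.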
2021-11-25 Jakub Jelinek <jakub@redhat.com>
PR c++/102611
gcc/
* doc/invoke.texi (-Wcomma-subscript): Document that for
-std=c++20 the option isn't enabled by default with -Wno-deprecated
but for -std=c++23 it is.
gcc/c-family/
* c-opts.c (c_common_post_options): Enable -Wcomma-subscript by
default for C++23 regardless of warn_deprecated.
* c-cppbuiltin.c (c_cpp_builtins): Predefine
__cpp_multidimensional_subscript=202110L for C++23.
gcc/cp/
* cp-tree.h (build_op_subscript): Implement P2128R6
- Multidimensional subscript operator. Declare.
(class releasing_vec): Add release method.
(grok_array_decl): Remove bool argument, add vec<tree, va_gc> **
and tsubst_flags_t arguments.
(build_min_non_dep_op_overload): Declare another overload.
* parser.c (cp_parser_parenthesized_expression_list_elt): New function.
(cp_parser_postfix_open_square_expression): Mention C++23 syntax in
function comment. For C++23 parse zero or more than one initializer
clauses in expression list, adjust grok_array_decl caller.
(cp_parser_parenthesized_expression_list): Use
cp_parser_parenthesized_expression_list_elt.
(cp_parser_builtin_offsetof): Adjust grok_array_decl caller.
* decl.c (grok_op_properties): For C++23 don't check number
of arguments of operator[].
* decl2.c (grok_array_decl): Remove decltype_p argument, add
index_exp_list and complain arguments. If index_exp is NULL,
handle *index_exp_list as the subscript expression list.
* tree.c (build_min_non_dep_op_overload): New overload.
* call.c (add_operator_candidates, build_over_call): Adjust comments
for removal of build_new_op_1.
(build_op_subscript): New function.
* pt.c (tsubst_copy_and_build_call_args): New function.
(tsubst_copy_and_build) <case ARRAY_REF>: If second
operand is magic CALL_EXPR with ovl_op_identifier (ARRAY_REF)
as CALL_EXPR_FN, tsubst CALL_EXPR arguments including expanding
pack expressions in it and call grok_array_decl instead of
build_x_array_ref.
<case CALL_EXPR>: Use tsubst_copy_and_build_call_args.
* semantics.c (handle_omp_array_sections_1): Adjust grok_array_decl
caller.
gcc/testsuite/
* g++.dg/cpp2a/comma1.C: Expect different diagnostics for C++23.
* g++.dg/cpp2a/comma3.C: Likewise.
* g++.dg/cpp2a/comma4.C: Expect diagnostics for C++23.
* g++.dg/cpp2a/comma5.C: Expect different diagnostics for C++23.
* g++.dg/cpp23/feat-cxx2b.C: Test __cpp_multidimensional_subscript
predefined macro.
* g++.dg/cpp23/subscript1.C: New test.
* g++.dg/cpp23/subscript2.C: New test.
* g++.dg/cpp23/subscript3.C: New test.
* g++.dg/cpp23/subscript4.C: New test.
* g++.dg/cpp23/subscript5.C: New test.
* g++.dg/cpp23/subscript6.C: New test.
|
|
Resolves:
PR middle-end/88232 - Please implement -Winfinite-recursion
gcc/ChangeLog:
PR middle-end/88232
* Makefile.in (OBJS): Add gimple-warn-recursion.o.
* common.opt: Add -Winfinite-recursion.
* doc/invoke.texi (-Winfinite-recursion): Document.
* passes.def (pass_warn_recursion): Schedule a new pass.
* tree-pass.h (make_pass_warn_recursion): Declare.
* gimple-warn-recursion.c: New file.
gcc/c-family/ChangeLog:
PR middle-end/88232
* c.opt: Add -Winfinite-recursion.
gcc/testsuite/ChangeLog:
PR middle-end/88232
* c-c++-common/attr-used-5.c: Suppress valid warning.
* c-c++-common/attr-used-6.c: Same.
* c-c++-common/attr-used-9.c: Same.
* g++.dg/warn/Winfinite-recursion-2.C: New test.
* g++.dg/warn/Winfinite-recursion-3.C: New test.
* g++.dg/warn/Winfinite-recursion.C: New test.
* gcc.dg/Winfinite-recursion-2.c: New test.
* gcc.dg/Winfinite-recursion.c: New test.
|
|
gcc/ChangeLog:
* doc/invoke.texi: Remove 2 more duplicate param descriptions.
|
|
gcc/ChangeLog:
* doc/invoke.texi: Remove duplicate documentation for 3 params.
|
|
At least some version(s) of makeinfo (4.8) do not like @option {-xxxx};
the brace has to follow the @option without any whitespace.
makeinfo 4.8 is installed on Darwin systems and this breaks bootstrap.
The amendment follows the style of the surrounding code.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:
* doc/invoke.texi: Remove whitespace after an @option.
|
|
Using -fno-semantic-interposition has been reported by various people
to bring about considerable speed up at the cost of strict compliance
to the ELF symbol interposition rules. See for example
https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup
As such I believe it should be implied by our -Ofast optimization
level, not only so that benchmarks that can benefit run faster, but
also so that people looking at -Ofast documentation for options that
could speed their programs find it.
gcc/ChangeLog:
2021-11-12 Martin Jambor <mjambor@suse.cz>
* opts.c (default_options_table): Switch off
flag_semantic_interposition at Ofast.
* doc/invoke.texi (Optimize Options): Document that Ofast switches off
-fsemantic-interposition.
|
|
Resolves:
PR c/33925 - gcc -Waddress lost some useful warnings
PR c/102867 - -Waddress from macro expansion in readelf.c
gcc/c-family/ChangeLog:
PR c++/33925
PR c/102867
* c-common.c (decl_with_nonnull_addr_p): Call maybe_nonzero_address
and improve handling of defined symbols.
gcc/c/ChangeLog:
PR c++/33925
PR c/102867
* c-typeck.c (maybe_warn_for_null_address): Suppress warnings for
code resulting from macro expansion.
gcc/cp/ChangeLog:
PR c++/33925
PR c/102867
* typeck.c (warn_for_null_address): Suppress warnings for code
resulting from macro expansion.
gcc/ChangeLog:
PR c++/33925
PR c/102867
* doc/invoke.texi (-Waddress): Update.
gcc/testsuite/ChangeLog:
PR c++/33925
PR c/102867
* g++.dg/warn/Walways-true-2.C: Adjust to avoid a valid warning.
* c-c++-common/Waddress-5.c: New test.
* c-c++-common/Waddress-6.c: New test.
* g++.dg/warn/Waddress-7.C: New test.
* gcc.dg/Walways-true-2.c: Adjust to avoid a valid warning.
* gcc.dg/weak/weak-3.c: Expect a warning.
|
|
The `configure` scripts generated with autoconf often test compiler
features by setting the output to `/dev/null`, which then sets the dump
folder to /dev/*, and the compilation halts with an error because
GCC cannot create files in /dev/. This is a problem when configure is
testing for compiler features because it cannot tell whether the failure
was due to an unsupported feature or to some other problem, and it
disables the feature even if it works.
As an example, running configure with CFLAGS="-fdump-ipa-clones"
results in several compiler features being disabled because
gcc halts with an error when creating files in /dev/*.
This commit fixes this issue by checking if the output file is
/dev/null or /dev/zero. In this case we use the current working
directory for dump output instead of the directory of the output
file because we cannot write to /dev/*.
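The decision described above can be sketched as follows. This is only an
illustration, not the driver code: is_null_device and dump_dir_for are
hypothetical names, and the real check in gcc.c (not_actual_file_p) may
cover more cases than the two device names mentioned here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the driver's "not an actual file" test.
bool is_null_device(const std::string &path) {
  return path == "/dev/null" || path == "/dev/zero";
}

// Returns the directory prefix under which dump files would be created.
std::string dump_dir_for(const std::string &output) {
  assert(!output.empty());
  if (is_null_device(output))
    return "./";  // cannot write to /dev/*, so use the working directory
  std::string::size_type slash = output.rfind('/');
  if (slash == std::string::npos)
    return "./";  // output name has no directory component
  return output.substr(0, slash + 1);  // directory of the output file
}
```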
gcc/ChangeLog
2021-11-16 Giuliano Belinassi <gbelinassi@suse.de>
* gcc.c (process_command): Skip dumpdir override if file is a
not_actual_file_p.
* doc/invoke.texi: Update -dumpdir documentation.
gcc/testsuite/ChangeLog
2021-11-16 Giuliano Belinassi <gbelinassi@suse.de>
* gcc.dg/devnull-dump.c: New.
Signed-off-by: Giuliano Belinassi <gbelinassi@suse.de>
|
|
2021 update: Last year I posted a version of this patch:
<https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559162.html>
but it didn't make it in. The main objection seemed to be that the
patch tried to do too much, and overlapped with the ME uninitialized
warnings. Since the patch used walk_tree without any data flow info,
it issued false positives for things like a(0 ? b : 42) and similar.
I'll admit I've been dreading resurrecting this because of the lack
of clarity about where we should warn about what. On the other hand,
I think we really should do something about this. So I've simplified
the original patch as much as it seemed reasonable. For instance, it
doesn't even attempt to handle cases like "a((b = 42)), c(b)" -- for
these I simply give up for the whole mem-initializer (but who writes
code like that, anyway?). I also give up when a member is initialized
with a function call, because we don't know what the call could do.
See Wuninitialized-17.C, for which clang emits a false positive but
we don't. I remember having a hard time dealing with initializer lists
in my previous patch, so now I only handle simple a{b} cases, but no
more. It turned out that this abridged version still warns in about 90%
of the cases where users would expect a warning.
More complicated cases are left for the ME, which, for unused inline
functions, will only warn with -fkeep-inline-functions, but so be it.
(This is bug 21678.)
This patch implements the long-desired -Wuninitialized warning for
member initializer lists, so that the front end can detect bugs like
  struct A {
    int a;
    int b;
    A() : b(1), a(b) { }
  };
where the field 'b' is used uninitialized because the order of member
initializers in the member initializer list is irrelevant; what matters
is the order of declarations in the class definition.
I've implemented this by keeping a hash set holding fields that are not
initialized yet, so at first it will be {a, b}, and after initializing
'a' it will be {b} and so on. Then I use walk_tree to walk the
initializer and if we see that an uninitialized object is used, we warn.
Of course, when we use the address of the object, we may not warn:
  struct B {
    int &r;
    int *p;
    int a;
    B() : r(a), p(&a), a(1) { } // ok
  };
Likewise, don't warn in unevaluated contexts such as sizeof. Classes
without an explicit initializer may still be initialized by their
default constructors; whether or not something is considered initialized
is handled in perform_member_init, see member_initialized_p.
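The hash-set bookkeeping described above can be modelled outside the
compiler. The sketch below is a simplified simulation, not the code in
init.c: MemberInit and find_uninit_uses are hypothetical names, and each
member simply lists the members its initializer reads by value (uses of
an address or reference, as in struct B, would not be listed).

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// One data member plus the members its initializer reads by value.
struct MemberInit {
  std::string name;
  std::vector<std::string> reads;
};

// Members must be given in declaration order, which is the order in
// which they are actually initialized.
std::vector<std::string>
find_uninit_uses(const std::vector<MemberInit> &members) {
  // At first every member is uninitialized.
  std::set<std::string> uninit;
  for (const MemberInit &m : members)
    uninit.insert(m.name);

  std::vector<std::string> warnings;
  for (const MemberInit &m : members) {
    // Walk the initializer and report reads of uninitialized members.
    for (const std::string &r : m.reads)
      if (uninit.count(r))
        warnings.push_back(m.name + " uses " + r);
    // The member is initialized now, so drop it from the set.
    uninit.erase(m.name);
  }
  return warnings;
}
```

For struct A above, the input {a reads b}, {b reads nothing} yields one
warning, because 'b' is still in the set when 'a' is initialized.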
PR c++/19808
PR c++/96121
gcc/cp/ChangeLog:
* init.c (perform_member_init): Remove a forward declaration.
Walk the initializer using find_uninit_fields_r. New parameter
to track uninitialized fields. If a member is initialized,
remove it from the hash set.
(perform_target_ctor): Return the initializer.
(struct find_uninit_data): New class.
(find_uninit_fields_r): New function.
(find_uninit_fields): New function.
(emit_mem_initializers): Keep and initialize a set holding fields
that are not initialized. When handling delegating constructors,
walk the constructor tree using find_uninit_fields_r. Also when
initializing base classes. Pass uninitialized down to
perform_member_init.
gcc/ChangeLog:
* doc/invoke.texi: Update documentation for -Wuninitialized.
* tree.c (stabilize_reference): Set location.
gcc/testsuite/ChangeLog:
* g++.dg/warn/Wuninitialized-14.C: New test.
* g++.dg/warn/Wuninitialized-15.C: New test.
* g++.dg/warn/Wuninitialized-16.C: New test.
* g++.dg/warn/Wuninitialized-17.C: New test.
* g++.dg/warn/Wuninitialized-18.C: New test.
* g++.dg/warn/Wuninitialized-19.C: New test.
* g++.dg/warn/Wuninitialized-20.C: New test.
* g++.dg/warn/Wuninitialized-21.C: New test.
* g++.dg/warn/Wuninitialized-22.C: New test.
* g++.dg/warn/Wuninitialized-23.C: New test.
* g++.dg/warn/Wuninitialized-24.C: New test.
* g++.dg/warn/Wuninitialized-25.C: New test.
* g++.dg/warn/Wuninitialized-26.C: New test.
* g++.dg/warn/Wuninitialized-27.C: New test.
* g++.dg/warn/Wuninitialized-28.C: New test.
* g++.dg/warn/Wuninitialized-29.C: New test.
* g++.dg/warn/Wuninitialized-30.C: New test.
|
|
Add -mindirect-branch-cs-prefix to add CS prefix to call and jmp to
indirect thunk with branch target in r8-r15 registers so that the call
and jmp instruction length is 6 bytes to allow them to be replaced with
"lfence; call *%r8-r15" or "lfence; jmp *%r8-r15" at run-time.
gcc/
PR target/102952
* config/i386/i386.c (ix86_output_jmp_thunk_or_indirect): Emit
CS prefix for -mindirect-branch-cs-prefix.
(ix86_output_indirect_branch_via_reg): Likewise.
* config/i386/i386.opt: Add -mindirect-branch-cs-prefix.
* doc/invoke.texi: Document -mindirect-branch-cs-prefix.
gcc/testsuite/
PR target/102952
* gcc.target/i386/indirect-thunk-cs-prefix-1.c: New test.
* gcc.target/i386/indirect-thunk-cs-prefix-2.c: Likewise.
|
|
New builtin to enable explicit use of PAREN_EXPR in C & C++ code.
Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
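One place where such a barrier is useful is compensated (Kahan)
summation, where reassociation under -ffast-math can simplify the
correction term away. The sketch below is an assumption-laden
illustration: ASSOC_BARRIER and kahan_sum are hypothetical names, and
the macro degrades to plain parentheses when the compiler does not
report __builtin_assoc_barrier via __has_builtin.

```cpp
#include <cassert>
#include <vector>

#ifdef __has_builtin
# if __has_builtin(__builtin_assoc_barrier)
#  define ASSOC_BARRIER(x) __builtin_assoc_barrier(x)
# endif
#endif
#ifndef ASSOC_BARRIER
# define ASSOC_BARRIER(x) (x)  // fallback: no reassociation barrier
#endif

// Kahan summation: comp carries the rounding error of the previous step.
float kahan_sum(const std::vector<float> &xs) {
  float sum = 0.0f;
  float comp = 0.0f;
  for (float x : xs) {
    float y = x - comp;
    float t = sum + y;
    // Without a barrier, (t - sum) - y may be folded to zero when
    // floating-point reassociation is allowed.
    comp = ASSOC_BARRIER(t - sum) - y;
    sum = t;
  }
  return sum;
}
```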
gcc/testsuite/ChangeLog:
* c-c++-common/builtin-assoc-barrier-1.c: New test.
gcc/cp/ChangeLog:
* constexpr.c (cxx_eval_constant_expression): Handle PAREN_EXPR
via cxx_eval_constant_expression.
* cp-objcp-common.c (names_builtin_p): Handle
RID_BUILTIN_ASSOC_BARRIER.
* cp-tree.h: Adjust TREE_LANG_FLAG documentation to include
PAREN_EXPR in REF_PARENTHESIZED_P.
(REF_PARENTHESIZED_P): Add PAREN_EXPR.
* parser.c (cp_parser_postfix_expression): Handle
RID_BUILTIN_ASSOC_BARRIER.
* pt.c (tsubst_copy_and_build): If the PAREN_EXPR is not a
parenthesized initializer, build a new PAREN_EXPR.
* semantics.c (force_paren_expr): Simplify conditionals. Set
REF_PARENTHESIZED_P on PAREN_EXPR.
(maybe_undo_parenthesized_ref): Test PAREN_EXPR for
REF_PARENTHESIZED_P.
gcc/c-family/ChangeLog:
* c-common.c (c_common_reswords): Add __builtin_assoc_barrier.
* c-common.h (enum rid): Add RID_BUILTIN_ASSOC_BARRIER.
gcc/c/ChangeLog:
* c-decl.c (names_builtin_p): Handle RID_BUILTIN_ASSOC_BARRIER.
* c-parser.c (c_parser_postfix_expression): Likewise.
gcc/ChangeLog:
* doc/extend.texi: Document __builtin_assoc_barrier.
|
|
Add -mharden-sls= to mitigate against straight line speculation (SLS)
for function return and indirect branch by adding an INT3 instruction
after function return and indirect branch.
gcc/
PR target/102952
* config/i386/i386-opts.h (harden_sls): New enum.
* config/i386/i386.c (output_indirect_thunk): Mitigate against
SLS for function return.
(ix86_output_function_return): Likewise.
(ix86_output_jmp_thunk_or_indirect): Mitigate against indirect
branch.
(ix86_output_indirect_jmp): Likewise.
(ix86_output_call_insn): Likewise.
* config/i386/i386.opt: Add -mharden-sls=.
* doc/invoke.texi: Document -mharden-sls=.
gcc/testsuite/
PR target/102952
* gcc.target/i386/harden-sls-1.c: New test.
* gcc.target/i386/harden-sls-2.c: Likewise.
* gcc.target/i386/harden-sls-3.c: Likewise.
* gcc.target/i386/harden-sls-4.c: Likewise.
* gcc.target/i386/harden-sls-5.c: Likewise.
|
|
I forgot this in the implementation patch.
gcc/ChangeLog:
* doc/invoke.texi (C++ Dialect Options): Document
-fimplicit-constexpr.
|
|
This patch adds conditional forms of FMAX and FMIN, following
the pattern for existing conditional binary functions.
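As a scalar model of what a conditional FMAX computes, assuming the same
shape as the other conditional binary functions (active lanes get the
operation, inactive lanes get a fallback value), consider the sketch
below. The function name cond_fmax and the vector-based interface are
illustrative only.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// result[i] = mask[i] ? fmax (a[i], b[i]) : fallback[i]
std::vector<float> cond_fmax(const std::vector<bool> &mask,
                             const std::vector<float> &a,
                             const std::vector<float> &b,
                             const std::vector<float> &fallback) {
  assert(mask.size() == a.size() && a.size() == b.size()
         && b.size() == fallback.size());
  std::vector<float> result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    result[i] = mask[i] ? std::fmax(a[i], b[i]) : fallback[i];
  return result;
}
```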
gcc/
* doc/md.texi (cond_fmin@var{mode}, cond_fmax@var{mode}): Document.
* optabs.def (cond_fmin_optab, cond_fmax_optab): New optabs.
* internal-fn.def (COND_FMIN, COND_FMAX): New functions.
* internal-fn.c (first_commutative_argument): Handle them.
(FOR_EACH_COND_FN_PAIR): Likewise.
* match.pd (UNCOND_BINARY, COND_BINARY): Likewise.
* config/aarch64/aarch64-sve.md (cond_<fmaxmin><mode>): New
pattern.
gcc/testsuite/
* gcc.target/aarch64/sve/cond_fmaxnm_5.c: New test.
* gcc.target/aarch64/sve/cond_fmaxnm_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_6.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_6_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_7.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_7_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_8.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_8_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_5.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_5_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_6.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_6_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_7.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_7_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_8.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_8_run.c: Likewise.
|
|
From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."
More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/
This is not a compiler bug. However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional control characters the preprocessor may
encounter.
The default is =unpaired, which warns about improperly terminated
bidirectional control characters; e.g. a LRE without its corresponding PDF.
The level =any warns about any use of bidirectional control characters.
This patch handles both UCNs and UTF-8 characters. UCNs designating
bidi characters in identifiers are accepted since r204886. Then r217144
enabled -fextended-identifiers by default. Extended characters in C/C++
identifiers have been accepted since r275979. However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.
We warn in different contexts: comments (both C and C++-style), string
literals, character constants, and identifiers. Expectedly, UCNs are ignored
in comments and raw string literals. The bidirectional control characters
can nest so this patch handles that as well.
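The "unpaired" check with nesting can be sketched as a small stack
machine over UTF-8 input. This is a simplified model, not the libcpp
implementation: classify and has_unpaired_bidi are hypothetical names,
UCNs are not handled, and the pairing rule assumed here is that PDF
closes the innermost embedding or override while PDI closes the nearest
isolate together with anything opened inside it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Bidi { None, Embed, Isolate, PDF, PDI };

// Classifies the 3-byte UTF-8 sequence starting at s[i], if any.
Bidi classify(const std::string &s, std::size_t i) {
  if (i + 2 >= s.size() || static_cast<unsigned char>(s[i]) != 0xE2)
    return Bidi::None;
  unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
  unsigned char b2 = static_cast<unsigned char>(s[i + 2]);
  if (b1 == 0x80) {
    // U+202A LRE, U+202B RLE, U+202D LRO, U+202E RLO
    if (b2 == 0xAA || b2 == 0xAB || b2 == 0xAD || b2 == 0xAE)
      return Bidi::Embed;
    if (b2 == 0xAC)  // U+202C PDF
      return Bidi::PDF;
  } else if (b1 == 0x81) {
    if (b2 >= 0xA6 && b2 <= 0xA8)  // U+2066 LRI, U+2067 RLI, U+2068 FSI
      return Bidi::Isolate;
    if (b2 == 0xA9)  // U+2069 PDI
      return Bidi::PDI;
  }
  return Bidi::None;
}

// Returns true if a bidi context is still open at the end of the text.
bool has_unpaired_bidi(const std::string &s) {
  std::vector<Bidi> open;
  for (std::size_t i = 0; i < s.size(); ++i) {
    switch (classify(s, i)) {
    case Bidi::Embed:
    case Bidi::Isolate:
      open.push_back(classify(s, i));
      break;
    case Bidi::PDF:
      if (!open.empty() && open.back() == Bidi::Embed)
        open.pop_back();
      break;
    case Bidi::PDI:
      // Close the nearest isolate and everything opened inside it.
      for (std::size_t j = open.size(); j-- > 0;)
        if (open[j] == Bidi::Isolate) {
          open.resize(j);
          break;
        }
      break;
    case Bidi::None:
      break;
    }
  }
  return !open.empty();
}
```

In this model the check would be run per context (a comment, a string
literal, an identifier), so an LRE left open at the end of a comment
counts as unpaired.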
I have not included nor tested this at all with Fortran (which also has
string literals and line comments).
Dave M. posted patches improving diagnostics involving Unicode characters.
This patch does not make use of this new infrastructure yet.
PR preprocessor/103026
gcc/c-family/ChangeLog:
* c.opt (Wbidi-chars, Wbidi-chars=): New option.
gcc/ChangeLog:
* doc/invoke.texi: Document -Wbidi-chars.
libcpp/ChangeLog:
* include/cpplib.h (enum cpp_bidirectional_level): New.
(struct cpp_options): Add cpp_warn_bidirectional.
(enum cpp_warning_reason): Add CPP_W_BIDIRECTIONAL.
* internal.h (struct cpp_reader): Add warn_bidi_p member
function.
* init.c (cpp_create_reader): Set cpp_warn_bidirectional.
* lex.c (bidi): New namespace.
(get_bidi_utf8): New function.
(get_bidi_ucn): Likewise.
(maybe_warn_bidi_on_close): Likewise.
(maybe_warn_bidi_on_char): Likewise.
(_cpp_skip_block_comment): Implement warning about bidirectional
control characters.
(skip_line_comment): Likewise.
(forms_identifier_p): Likewise.
(lex_identifier): Likewise.
(lex_string): Likewise.
(lex_raw_string): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/Wbidi-chars-1.c: New test.
* c-c++-common/Wbidi-chars-2.c: New test.
* c-c++-common/Wbidi-chars-3.c: New test.
* c-c++-common/Wbidi-chars-4.c: New test.
* c-c++-common/Wbidi-chars-5.c: New test.
* c-c++-common/Wbidi-chars-6.c: New test.
* c-c++-common/Wbidi-chars-7.c: New test.
* c-c++-common/Wbidi-chars-8.c: New test.
* c-c++-common/Wbidi-chars-9.c: New test.
* c-c++-common/Wbidi-chars-10.c: New test.
* c-c++-common/Wbidi-chars-11.c: New test.
* c-c++-common/Wbidi-chars-12.c: New test.
* c-c++-common/Wbidi-chars-13.c: New test.
* c-c++-common/Wbidi-chars-14.c: New test.
* c-c++-common/Wbidi-chars-15.c: New test.
* c-c++-common/Wbidi-chars-16.c: New test.
* c-c++-common/Wbidi-chars-17.c: New test.
|
|
For at least one target (Darwin) the platform convention is to
register static destructors (i.e. __attribute__((destructor)))
with __cxa_atexit rather than placing them into a list that is
run by some other mechanism.
This patch provides a target hook that allows a target to opt
into this and handling for the process in ipa_cdtor_merge ().
When the mode is enabled (dtors_from_cxa_atexit is set) we:
 * Generate new CTORs to register static destructors with
   __cxa_atexit and add them to the existing list of CTORs;
   we then process the revised CTORs list.
 * Sort the DTORs into priority and then TU order; this
   means that they are registered in that order with
   __cxa_atexit () and therefore will be run in the reverse
   order.
 * Likewise, sort the CTORs into priority and then TU order,
   which means that they will run in that order.
This matches the behavior of using init/fini (or
mod_init_func/mod_term_func) sections.
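The ordering rule for destructors can be modelled in a few lines. The
sketch below is a simulation only: StaticDtor, registration_order and
run_order are hypothetical names, and sorting by ascending numeric
priority is an assumption, since the text above only says "priority and
then TU order".

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// A static destructor with its priority and its position in the TU.
struct StaticDtor {
  int priority;
  int tu_order;
  std::string name;
};

// Order in which the destructors would be registered with __cxa_atexit.
std::vector<std::string> registration_order(std::vector<StaticDtor> dtors) {
  std::sort(dtors.begin(), dtors.end(),
            [](const StaticDtor &a, const StaticDtor &b) {
              if (a.priority != b.priority)
                return a.priority < b.priority;
              return a.tu_order < b.tu_order;
            });
  std::vector<std::string> names;
  for (const StaticDtor &d : dtors)
    names.push_back(d.name);
  return names;
}

// __cxa_atexit handlers run in reverse order of registration.
std::vector<std::string> run_order(std::vector<StaticDtor> dtors) {
  std::vector<std::string> names = registration_order(std::move(dtors));
  std::reverse(names.begin(), names.end());
  return names;
}
```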
This also fixes a bug where Fortran needs a DTOR to be run to
close IO.
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR fortran/102992
gcc/ChangeLog:
* config/darwin.h (TARGET_DTORS_FROM_CXA_ATEXIT): New.
* doc/tm.texi: Regenerated.
* doc/tm.texi.in: Add TARGET_DTORS_FROM_CXA_ATEXIT hook.
* ipa.c (cgraph_build_static_cdtor_1): Return the built
function decl.
(build_cxa_atexit_decl): New.
(build_dso_handle_decl): New.
(build_cxa_dtor_registrations): New.
(compare_cdtor_tu_order): New.
(build_cxa_atexit_fns): New.
(ipa_cdtor_merge): If dtors_from_cxa_atexit is set,
process the DTORs/CTORs accordingly.
(pass_ipa_cdtor_merge::gate): Also run if
dtors_from_cxa_atexit is set.
* target.def (dtors_from_cxa_atexit): New hook.
|
|
From the CPU's point of view, getting a cache line for writing is more
expensive than reading. See Appendix A.2 Spinlock in:
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf
The full compare and swap will grab the cache line exclusive and causes
excessive cache line bouncing.
The atomic_fetch_{or,xor,and,nand} builtins generate a cmpxchg loop under
-march=x86-64 like:
        movl v(%rip), %eax
.L2:
        movl %eax, %ecx
        movl %eax, %edx
        orl $1, %ecx
        lock cmpxchgl %ecx, v(%rip)
        jne .L2
        movl %edx, %eax
        andl $1, %eax
        ret
To relax the above loop, GCC should first emit a normal load, check it, and
jump to .L2 if cmpxchgl may fail. Before jumping to .L2, PAUSE should be
inserted to yield the CPU to another hyperthread and to save power, so the
code looks like
.L84:
        movl (%rdi), %ecx
        movl %eax, %edx
        orl %esi, %edx
        cmpl %eax, %ecx
        jne .L82
        lock cmpxchgl %edx, (%rdi)
        jne .L84
.L82:
        rep nop
        jmp .L84
This patch adds corresponding atomic_fetch_op expanders to insert load/
compare and pause for all the atomic logic fetch builtins, and adds the
-mrelax-cmpxchg-loop flag to control whether the relaxed loop is generated.
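The same idea can be written at the source level with std::atomic. This
is a hand-written sketch of the technique, not what the new expanders
emit: fetch_or_relaxed_loop and cpu_relax are hypothetical names, and
the PAUSE is issued through __builtin_ia32_pause only on x86 targets.

```cpp
#include <atomic>
#include <cassert>

// Spin-wait hint; a no-op on targets other than x86.
inline void cpu_relax() {
#if defined(__i386__) || defined(__x86_64__)
  __builtin_ia32_pause();
#endif
}

// Atomic fetch-or that re-reads the value with a plain load and only
// attempts the locked compare-and-swap when it is expected to succeed.
unsigned fetch_or_relaxed_loop(std::atomic<unsigned> &v, unsigned mask) {
  unsigned expected = v.load(std::memory_order_relaxed);
  for (;;) {
    unsigned current = v.load(std::memory_order_relaxed);
    if (current != expected) {
      // The cmpxchg would fail: refresh, pause, and retry without
      // taking the cache line exclusive.
      expected = current;
      cpu_relax();
      continue;
    }
    if (v.compare_exchange_weak(expected, expected | mask))
      return expected;  // old value, as fetch_or would return
  }
}
```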
gcc/ChangeLog:
PR target/103069
* config/i386/i386-expand.c (ix86_expand_atomic_fetch_op_loop):
New expand function.
* config/i386/i386-options.c (ix86_target_string): Add
-mrelax-cmpxchg-loop flag.
(ix86_valid_target_attribute_inner_p): Likewise.
* config/i386/i386-protos.h (ix86_expand_atomic_fetch_op_loop):
New expand function prototype.
* config/i386/i386.opt: Add -mrelax-cmpxchg-loop.
* config/i386/sync.md (atomic_fetch_<logic><mode>): New expander
for SI,HI,QI modes.
(atomic_<logic>_fetch<mode>): Likewise.
(atomic_fetch_nand<mode>): Likewise.
(atomic_nand_fetch<mode>): Likewise.
(atomic_fetch_<logic><mode>): New expander for DI,TI modes.
(atomic_<logic>_fetch<mode>): Likewise.
(atomic_fetch_nand<mode>): Likewise.
(atomic_nand_fetch<mode>): Likewise.
* doc/invoke.texi: Document -mrelax-cmpxchg-loop.
gcc/testsuite/ChangeLog:
PR target/103069
* gcc.target/i386/pr103069-1.c: New test.
* gcc.target/i386/pr103069-2.c: Ditto.
|
|
Add the `-mlra' command-line option for the VAX target, with the
usual semantics of enabling Local Register Allocation, off by default.
LRA remains unstable with the VAX target, with numerous ICEs throughout
the testsuite and worse code produced overall where successful, however
the presence of a command line option to enable it makes it easier to
experiment with it as the compiler does not have to be rebuilt to flip
between the old reload and LRA.
gcc/
* config/vax/vax.c (vax_lra_p): New prototype and function.
(TARGET_LRA_P): Wire it.
* config/vax/vax.opt (mlra): New option.
* doc/invoke.texi (Option Summary, VAX Options): Document the
new option.
|
|
The initial commit of the analyzer in GCC 10 had a single warning,
-Wanalyzer-tainted-array-index
and required manually enabling the taint checker with
-fanalyzer-checker=taint (due to scaling issues).
This patch extends the taint detection to add four new taint-based
warnings:
-Wanalyzer-tainted-allocation-size
for e.g. attacker-controlled malloc/alloca
-Wanalyzer-tainted-divisor
for detecting where an attacker can inject a divide-by-zero
-Wanalyzer-tainted-offset
for attacker-controlled pointer offsets
-Wanalyzer-tainted-size
for e.g. attacker-controlled memset
and rewords all the warnings to talk about "attacker-controlled" values
rather than "tainted" values.
Unfortunately I haven't yet addressed the scaling issues, so all of
these still require -fanalyzer-checker=taint (in addition to -fanalyzer).
gcc/analyzer/ChangeLog:
* analyzer.opt (Wanalyzer-tainted-allocation-size): New.
(Wanalyzer-tainted-divisor): New.
(Wanalyzer-tainted-offset): New.
(Wanalyzer-tainted-size): New.
* engine.cc (impl_region_model_context::get_taint_map): New.
* exploded-graph.h (impl_region_model_context::get_taint_map):
New decl.
* program-state.cc (sm_state_map::get_state): Call
alt_get_inherited_state.
(sm_state_map::impl_set_state): Modify states within
compound svalues.
(program_state::impl_call_analyzer_dump_state): Undo casts.
(selftest::test_program_state_1): Update for new context param of
create_region_for_heap_alloc.
(selftest::test_program_state_merging): Likewise.
* region-model-impl-calls.cc (region_model::impl_call_alloca):
Likewise.
(region_model::impl_call_calloc): Likewise.
(region_model::impl_call_malloc): Likewise.
(region_model::impl_call_operator_new): Likewise.
(region_model::impl_call_realloc): Likewise.
* region-model.cc (region_model::check_region_access): Call
check_region_for_taint.
(region_model::get_representative_path_var_1): Handle binops.
(region_model::create_region_for_heap_alloc): Add "ctxt" param and
pass it to set_dynamic_extents.
(region_model::create_region_for_alloca): Likewise.
(region_model::set_dynamic_extents): Add "ctxt" param and use it
to call check_dynamic_size_for_taint.
(selftest::test_state_merging): Update for new context param of
create_region_for_heap_alloc.
(selftest::test_malloc_constraints): Likewise.
(selftest::test_malloc): Likewise.
(selftest::test_alloca): Likewise for create_region_for_alloca.
* region-model.h (region_model::create_region_for_heap_alloc): Add
"ctxt" param.
(region_model::create_region_for_alloca): Likewise.
(region_model::set_dynamic_extents): Likewise.
(region_model::check_dynamic_size_for_taint): New decl.
(region_model::check_region_for_taint): New decl.
(region_model_context::get_taint_map): New vfunc.
(noop_region_model_context::get_taint_map): New.
* sm-taint.cc: Remove include of "diagnostic-event-id.h"; add
includes of "gimple-iterator.h", "tristate.h", "selftest.h",
"ordered-hash-map.h", "cgraph.h", "cfg.h", "digraph.h",
"analyzer/supergraph.h", "analyzer/call-string.h",
"analyzer/program-point.h", "analyzer/store.h",
"analyzer/region-model.h", and "analyzer/program-state.h".
(enum bounds): Move to top of file.
(class taint_diagnostic): New.
(class tainted_array_index): Convert to subclass of taint_diagnostic.
(tainted_array_index::emit): Add CWE-129. Reword warning to use
"attacker-controlled" rather than "tainted".
(tainted_array_index::describe_state_change): Move to
taint_diagnostic::describe_state_change.
(tainted_array_index::describe_final_event): Reword to use
"attacker-controlled" rather than "tainted".
(class tainted_offset): New.
(class tainted_size): New.
(class tainted_divisor): New.
(class tainted_allocation_size): New.
(taint_state_machine::alt_get_inherited_state): New.
(taint_state_machine::on_stmt): In assignment handling, remove
ARRAY_REF handling in favor of check_region_for_taint. Add
detection of tainted divisors.
(taint_state_machine::get_taint): New.
(taint_state_machine::combine_states): New.
(region_model::check_region_for_taint): New.
(region_model::check_dynamic_size_for_taint): New.
* sm.h (state_machine::alt_get_inherited_state): New.
gcc/ChangeLog:
* doc/invoke.texi (Static Analyzer Options): Add
-Wno-analyzer-tainted-allocation-size,
-Wno-analyzer-tainted-divisor, -Wno-analyzer-tainted-offset, and
-Wno-analyzer-tainted-size to list. Add
-Wanalyzer-tainted-allocation-size, -Wanalyzer-tainted-divisor,
-Wanalyzer-tainted-offset, and -Wanalyzer-tainted-size to list
of options effectively enabled by -fanalyzer.
(-Wanalyzer-tainted-allocation-size): New.
(-Wanalyzer-tainted-array-index): Tweak wording; add link to CWE.
(-Wanalyzer-tainted-divisor): New.
(-Wanalyzer-tainted-offset): New.
(-Wanalyzer-tainted-size): New.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/pr93382.c: Tweak expected wording.
* gcc.dg/analyzer/taint-alloc-1.c: New test.
* gcc.dg/analyzer/taint-alloc-2.c: New test.
* gcc.dg/analyzer/taint-divisor-1.c: New test.
* gcc.dg/analyzer/taint-1.c: Rename to...
* gcc.dg/analyzer/taint-read-index-1.c: ...this. Tweak expected
wording. Mark some events as xfail.
* gcc.dg/analyzer/taint-read-offset-1.c: New test.
* gcc.dg/analyzer/taint-size-1.c: New test.
* gcc.dg/analyzer/taint-write-index-1.c: New test.
* gcc.dg/analyzer/taint-write-offset-1.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
Commit 5d9d0c94588 renamed future to power10 and ace60939fd2
updated the documentation for "future" renaming. This patch
is to rename the remaining "future architecture" references in
documentation and polish the words for float128.
gcc/ChangeLog:
* doc/invoke.texi: Change references to "future cpu" to "power10",
"-mcpu=future" to "-mcpu=power10". Adjust words for float128.
|
|
It is desirable for -Wattributes to warn about e.g.
[[deprecate]] void g(); // typo, should warn
However, -Wattributes also warns about vendor-specific attributes
(that's because lookup_scoped_attribute_spec -> find_attribute_namespace
finds nothing), which, with -Werror, causes grief. We don't want the
-Wattributes warning for
[[company::attr]] void f();
GCC warns because it doesn't know the "company" namespace; it only knows
the "gnu" and "omp" namespaces. We could entirely disable warning about
attributes in unknown scopes but then the compiler would also miss typos
like
[[company::attrx]] void f();
or
[[gmu::warn_used_result]] int write();
so that is not a viable solution. A workaround is to use a #pragma:
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wattributes"
[[company::attr]] void f() {}
#pragma GCC diagnostic pop
but that's a mouthful and awkward to use and could also hide typos. In
fact, any macro-based solution doesn't seem like a way forward.
This patch implements -Wno-attributes=, which takes these arguments:
company::attr
company::
This option should go well with using @file: the user could have a file
containing
-Wno-attributes=vendor::attr1,vendor::attr2
and then invoke gcc with '@attrs' or similar.
I've also added a new pragma, #pragma GCC diagnostic ignored_attributes,
which has the same effect.
The pragma along with the new option should help with various static
analysis tools.
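A minimal sketch of the in-source form, assuming GCC 12 or later for the
pragma (hence the version guard). The "company" namespace comes from the
examples above, and the function name answer is made up for
illustration. The command-line equivalent would be to compile with
-Wno-attributes=company::attr or -Wno-attributes=company::.

```cpp
#include <cassert>

// Ask GCC not to warn about this one vendor attribute.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 12
#pragma GCC diagnostic ignored_attributes "company::attr"
#endif

// Unknown to GCC, but no longer reported by -Wattributes.
[[company::attr]] int answer() { return 42; }
```

A typo such as [[company::attrx]] is not covered by the pragma and would
still be reported.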
PR c++/101940
gcc/ChangeLog:
* attribs.c (struct scoped_attributes): Add a bool member.
(lookup_scoped_attribute_spec): Forward declare.
(register_scoped_attributes): New bool parameter, defaulted to
false. Use it.
(handle_ignored_attributes_option): New function.
(free_attr_data): New function.
(init_attributes): Call handle_ignored_attributes_option.
(attr_namespace_ignored_p): New function.
(decl_attributes): Check attr_namespace_ignored_p before
warning.
* attribs.h (free_attr_data): Declare.
(register_scoped_attributes): Adjust declaration.
(handle_ignored_attributes_option): Declare.
(canonicalize_attr_name): New function template.
(canonicalize_attr_name): Use it.
* common.opt (Wattributes=): New option with a variable.
* doc/extend.texi: Document #pragma GCC diagnostic ignored_attributes.
* doc/invoke.texi: Document -Wno-attributes=.
* opts.c (common_handle_option) <case OPT_Wattributes_>: Handle.
* plugin.h (register_scoped_attributes): Adjust declaration.
* toplev.c (compile_file): Call free_attr_data.
gcc/c-family/ChangeLog:
* c-pragma.c (handle_pragma_diagnostic): Handle #pragma GCC diagnostic
ignored_attributes.
gcc/testsuite/ChangeLog:
* c-c++-common/Wno-attributes-1.c: New test.
* c-c++-common/Wno-attributes-2.c: New test.
* c-c++-common/Wno-attributes-3.c: New test.
|
|
This patch adds support for the Cortex-A710 CPU on Arm.
gcc/ChangeLog:
* config/arm/arm-cpus.in (cortex-a710): New CPU.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Regenerate.
* doc/invoke.texi: Update docs.
|
|
gcc/ChangeLog:
* doc/invoke.texi (Invoking GCC): Document --param=threader-debug.
|
|
'gcc/diagnostic-spec.h:nowarn_map' [PR101204, PR103157]
Reproduced with clang version 10.0.0-4ubuntu1:
gtype-desc.c:11333:1: warning: all paths through this function will call itself [-Winfinite-recursion]
... as well as some GCC's '-O2 -fdump-tree-optimized':
void gt_pch_nx(int_hash<unsigned int, 0u, 4294967295u>*, gt_pointer_operator, void*) ([...])
{
<bb 2>:
<bb 3>:
goto <bb 3>;
}
That three-arguments 'gt_pch_nx' function as well as two one-argument
'gt_ggc_mx', 'gt_pch_nx' functions now turn empty:
[...]
void
-gt_ggc_mx (int_hash<location_t,0,UINT_MAX>& x_r ATTRIBUTE_UNUSED)
+gt_ggc_mx (struct xint_hash_t& x_r ATTRIBUTE_UNUSED)
{
- int_hash<location_t,0,UINT_MAX> * ATTRIBUTE_UNUSED x = &x_r;
- gt_ggc_mx (&((*x)));
+ struct xint_hash_t * ATTRIBUTE_UNUSED x = &x_r;
}
[...]
void
-gt_pch_nx (int_hash<location_t,0,UINT_MAX>& x_r ATTRIBUTE_UNUSED)
+gt_pch_nx (struct xint_hash_t& x_r ATTRIBUTE_UNUSED)
{
- int_hash<location_t,0,UINT_MAX> * ATTRIBUTE_UNUSED x = &x_r;
- gt_pch_nx (&((*x)));
+ struct xint_hash_t * ATTRIBUTE_UNUSED x = &x_r;
}
[...]
void
-gt_pch_nx (int_hash<location_t,0,UINT_MAX>* x ATTRIBUTE_UNUSED,
+gt_pch_nx (struct xint_hash_t* x ATTRIBUTE_UNUSED,
ATTRIBUTE_UNUSED gt_pointer_operator op,
ATTRIBUTE_UNUSED void *cookie)
{
- gt_pch_nx (&((*x)), op, cookie);
}
[...]
gcc/
PR middle-end/101204
PR other/103157
* diagnostic-spec.h (typedef xint_hash_t): Turn into...
(struct xint_hash_t): ... this.
* doc/gty.texi: Update.
|
|
In this patch:
+ Add `armv9-a` to -march.
+ Update multilib with armv9-a and armv9-a+simd.
gcc/ChangeLog:
* config/arm/arm-cpus.in (armv9): New define.
(ARMv9a): New group.
(armv9-a): New arch definition.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.h (BASE_ARCH_9A): New arch enum value.
* config/arm/t-aprofile: Added armv9-a and armv9+simd.
* config/arm/t-arm-elf: Added arm9-a, v9_fps and all_v9_archs
to MULTILIB_MATCHES.
* config/arm/t-multilib: Added v9_a_nosimd_variants and
v9_a_simd_variants to MULTILIB_MATCHES.
* doc/invoke.texi: Update docs.
gcc/testsuite/ChangeLog:
* gcc.target/arm/multilib.exp: Update test with armv9-a entries.
* lib/target-supports.exp (v9a): Add new armflag.
(__ARM_ARCH_9A__): Add new armdef.
|
|
which are enabled by default at -O2.
gcc/ChangeLog:
PR tree-optimization/103077
* doc/invoke.texi (Options That Control Optimization):
Update documentation for -ftree-loop-vectorize and
-ftree-slp-vectorize which are enabled by default at -O2.
|
|
Commit 431d26e1dd18c1146d3d4dcd3b45a3b04f7f7d59 removed
doc/install-old.texi, alas we still tried to generate the
associated web page old.html - which then turned out empty.
Simply remove this from the list of pages to be generated.
gcc:
* doc/install.texi2html: Do not generate old.html any longer.
|
|
The current vector cost interface has quite a bit of redundancy
built in. Each target that defines its own hooks has to replicate
the basic unsigned[3] management. Currently each target also
duplicates the cost adjustment for inner loops.
This patch instead defines a vector_costs class for holding
the scalar or vector cost and allows targets to subclass it.
There is then only one costing hook: to create a new costs
structure of the appropriate type. Everything else can be
virtual functions, with common concepts implemented in the
base class rather than in each target's derivation.
This might seem like excess C++-ification, but it shaves
~100 LOC. I've also got some follow-on changes that become
significantly easier with this patch. Maybe it could help
with things like weighting blocks based on frequency too.
This will clash with Andre's unrolling patches. His patches
have priority so this patch should queue behind them.
The x86 and rs6000 parts fully convert to a self-contained class.
The equivalent aarch64 changes are more complex, so this patch
just does the bare minimum. A later patch will rework the
aarch64 bits.
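As a rough illustration of the design described above, the following is a minimal sketch only, not GCC's actual code. All names prefixed with `sketch_` are hypothetical, the inner-loop weight of 50 is an assumed illustrative value, and the real class takes different parameters. It shows the general shape: a base class owns the three-slot cost array and the inner-loop adjustment, a target subclasses it and overrides a virtual function, and a single factory function plays the role of the one remaining hook.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the prologue/body/epilogue cost slots.
enum sketch_cost_location { SKETCH_PROLOGUE = 0, SKETCH_BODY = 1, SKETCH_EPILOGUE = 2 };

// Hypothetical base class: owns the unsigned[3] bookkeeping so that
// targets no longer have to replicate it.
class sketch_vector_costs {
public:
  virtual ~sketch_vector_costs() = default;

  // Default costing: scale for inner loops, then record the result.
  virtual unsigned add_stmt_cost(unsigned count, unsigned stmt_cost,
                                 sketch_cost_location where, bool in_inner_loop) {
    return record_stmt_cost(where,
                            adjust_cost_for_freq(in_inner_loop, count * stmt_cost));
  }

  unsigned total(sketch_cost_location where) const { return m_costs[where]; }

protected:
  // Shared helper: accumulate a cost into the right slot.
  unsigned record_stmt_cost(sketch_cost_location where, unsigned cost) {
    assert(where <= SKETCH_EPILOGUE);
    m_costs[where] += cost;
    return cost;
  }

  // Shared helper: weight inner-loop statements more heavily.
  // The factor 50 is an assumption made for this sketch.
  unsigned adjust_cost_for_freq(bool in_inner_loop, unsigned cost) const {
    return in_inner_loop ? cost * 50 : cost;
  }

private:
  unsigned m_costs[3] = {0, 0, 0};
};

// Hypothetical target subclass: only the target-specific tweak lives here.
class sketch_target_costs : public sketch_vector_costs {
public:
  unsigned add_stmt_cost(unsigned count, unsigned stmt_cost,
                         sketch_cost_location where, bool in_inner_loop) override {
    // Assumed tweak: this made-up target charges one extra unit per statement.
    unsigned cost = adjust_cost_for_freq(in_inner_loop, count * (stmt_cost + 1));
    return record_stmt_cost(where, cost);
  }
};

// The single "hook": create a costs object of the appropriate type.
std::unique_ptr<sketch_vector_costs> sketch_create_costs(bool target_specific) {
  if (target_specific)
    return std::unique_ptr<sketch_vector_costs>(new sketch_target_costs());
  return std::unique_ptr<sketch_vector_costs>(new sketch_vector_costs());
}
```

A caller would obtain an object from `sketch_create_costs`, call `add_stmt_cost` for each statement, and read the per-slot totals at the end. Destruction replaces a separate destroy hook.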
gcc/
* target.def (targetm.vectorize.init_cost): Replace with...
(targetm.vectorize.create_costs): ...this.
(targetm.vectorize.add_stmt_cost): Delete.
(targetm.vectorize.finish_cost): Likewise.
(targetm.vectorize.destroy_cost_data): Likewise.
* doc/tm.texi.in (TARGET_VECTORIZE_INIT_COST): Replace with...
(TARGET_VECTORIZE_CREATE_COSTS): ...this.
(TARGET_VECTORIZE_ADD_STMT_COST): Delete.
(TARGET_VECTORIZE_FINISH_COST): Likewise.
(TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
* doc/tm.texi: Regenerate.
* tree-vectorizer.h (vec_info::vec_info): Remove target_cost_data
parameter.
(vec_info::target_cost_data): Change from a void * to a vector_costs *.
(vector_costs): New class.
(init_cost): Take a vec_info and return a vector_costs.
(dump_stmt_cost): Remove data parameter.
(add_stmt_cost): Replace vinfo and data parameters with a vector_costs.
(add_stmt_costs): Likewise.
(finish_cost): Replace data parameter with a vector_costs.
(destroy_cost_data): Delete.
* tree-vectorizer.c (dump_stmt_cost): Remove data argument and
don't print it.
(vec_info::vec_info): Remove the target_cost_data parameter and
initialize the member variable to null instead.
(vec_info::~vec_info): Delete target_cost_data instead of calling
destroy_cost_data.
(vector_costs::add_stmt_cost): New function.
(vector_costs::finish_cost): Likewise.
(vector_costs::record_stmt_cost): Likewise.
(vector_costs::adjust_cost_for_freq): Likewise.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update
call to vec_info::vec_info.
(vect_compute_single_scalar_iteration_cost): Update after above
changes to costing interface.
(vect_analyze_loop_operations): Likewise.
(vect_estimate_min_profitable_iters): Likewise.
(vect_analyze_loop_2): Initialize LOOP_VINFO_TARGET_COST_DATA
at the start_over point, where it needs to be recreated after
trying without slp. Update retry code accordingly.
* tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Update call
to vec_info::vec_info.
(vect_slp_analyze_operation): Update after above changes to costing
interface.
(vect_bb_vectorization_profitable_p): Likewise.
* targhooks.h (default_init_cost): Replace with...
(default_vectorize_create_costs): ...this.
(default_add_stmt_cost): Delete.
(default_finish_cost, default_destroy_cost_data): Likewise.
* targhooks.c (default_init_cost): Replace with...
(default_vectorize_create_costs): ...this.
(default_add_stmt_cost): Delete, moving logic to vector_costs instead.
(default_finish_cost, default_destroy_cost_data): Delete.
* config/aarch64/aarch64.c (aarch64_vector_costs): Inherit from
vector_costs. Add a constructor.
(aarch64_init_cost): Replace with...
(aarch64_vectorize_create_costs): ...this.
(aarch64_add_stmt_cost): Replace with...
(aarch64_vector_costs::add_stmt_cost): ...this. Use record_stmt_cost
to adjust the cost for inner loops.
(aarch64_finish_cost): Replace with...
(aarch64_vector_costs::finish_cost): ...this.
(aarch64_destroy_cost_data): Delete.
(TARGET_VECTORIZE_INIT_COST): Replace with...
(TARGET_VECTORIZE_CREATE_COSTS): ...this.
(TARGET_VECTORIZE_ADD_STMT_COST): Delete.
(TARGET_VECTORIZE_FINISH_COST): Likewise.
(TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
* config/i386/i386.c (ix86_vector_costs): New structure.
(ix86_init_cost): Replace with...
(ix86_vectorize_create_costs): ...this.
(ix86_add_stmt_cost): Replace with...
(ix86_vector_costs::add_stmt_cost): ...this. Use adjust_cost_for_freq
to adjust the cost for inner loops.
(ix86_finish_cost, ix86_destroy_cost_data): Delete.
(TARGET_VECTORIZE_INIT_COST): Replace with...
(TARGET_VECTORIZE_CREATE_COSTS): ...this.
(TARGET_VECTORIZE_ADD_STMT_COST): Delete.
(TARGET_VECTORIZE_FINISH_COST): Likewise.
(TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
* config/rs6000/rs6000.c (TARGET_VECTORIZE_INIT_COST): Replace with...
(TARGET_VECTORIZE_CREATE_COSTS): ...this.
(TARGET_VECTORIZE_ADD_STMT_COST): Delete.
(TARGET_VECTORIZE_FINISH_COST): Likewise.
(TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise.
(rs6000_cost_data): Inherit from vector_costs.
Add a constructor. Drop loop_info, cost and costing_for_scalar
in favor of the corresponding vector_costs member variables.
Add "m_" to the names of the remaining member variables and
initialize them.
(rs6000_density_test): Replace with...
(rs6000_cost_data::density_test): ...this.
(rs6000_init_cost): Replace with...
(rs6000_vectorize_create_costs): ...this.
(rs6000_update_target_cost_per_stmt): Replace with...
(rs6000_cost_data::update_target_cost_per_stmt): ...this.
(rs6000_add_stmt_cost): Replace with...
(rs6000_cost_data::add_stmt_cost): ...this. Use adjust_cost_for_freq
to adjust the cost for inner loops.
(rs6000_adjust_vect_cost_per_loop): Replace with...
(rs6000_cost_data::adjust_vect_cost_per_loop): ...this.
(rs6000_finish_cost): Replace with...
(rs6000_cost_data::finish_cost): ...this. Group loop code
into a single if statement and pass the loop_vinfo down to
subroutines.
(rs6000_destroy_cost_data): Delete.
|
|
This updates the internals manual documentation of TARGET_MEM_REF
and amends MEM_REF. The former was seriously out of date.
2021-11-04 Richard Biener <rguenther@suse.de>
gcc/
* doc/generic.texi: Update TARGET_MEM_REF and MEM_REF
documentation.
|
|
This adds support and a basic tuning model for the Ampere Computing
"Ampere-1" CPU.
The Ampere-1 implements the ARMv8.6 architecture in A64 mode and is
modelled as a 4-wide issue (as with all modern micro-architectures,
the chosen issue rate is a compromise between the maximum dispatch
rate and the maximum rate of uops issued to the scheduler).
This adds the -mcpu=ampere1 command-line option and the relevant cost
information/tuning tables for the Ampere-1.
gcc/ChangeLog:
* config/aarch64/aarch64-cores.def (AARCH64_CORE): New Ampere-1 core.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/aarch64-cost-tables.h: Add extra costs for Ampere-1.
* config/aarch64/aarch64.c: Add tuning structures for Ampere-1.
* doc/invoke.texi: Add documentation for Ampere-1 core.
|
|
Adjust code in check_vect_slp_store_usage to make it an exact
pattern match of the corresponding testcases.
These new target/xfail selectors are added as a temporary solution,
and should be removed after the real issue is fixed for Wstringop-overflow.
gcc/ChangeLog:
* doc/sourcebuild.texi (vect_slp_v4qi_store_unalign,
vect_slp_v2hi_store_unalign, vect_slp_v4hi_store_unalign,
vect_slp_v4si_store_unalign): Document efficient target.
(vect_slp_v4qi_store_unalign_1, vect_slp_v8qi_store_unalign_1,
vect_slp_v16qi_store_unalign_1): Ditto.
(vect_slp_v2hi_store_align,vect_slp_v2qi_store_align,
vect_slp_v2si_store_align, vect_slp_v4qi_store_align): Ditto.
(struct_4char_block_move, struct_8char_block_move,
struct_16char_block_move): Ditto.
gcc/testsuite/ChangeLog:
PR testsuite/102944
* c-c++-common/Wstringop-overflow-2.c: Adjust target/xfail
selector.
* gcc.dg/Warray-bounds-48.c: Ditto.
* gcc.dg/Warray-bounds-51.c: Ditto.
* gcc.dg/Warray-parameter-3.c: Ditto.
* gcc.dg/Wstringop-overflow-14.c: Ditto.
* gcc.dg/Wstringop-overflow-21.c: Ditto.
* gcc.dg/Wstringop-overflow-68.c: Ditto.
* gcc.dg/Wstringop-overflow-76.c: Ditto.
* gcc.dg/Wzero-length-array-bounds-2.c: Ditto.
* lib/target-supports.exp (vect_slp_v4qi_store_unalign): New
efficient target.
(vect_slp_v4qi_store_unalign_1): Ditto.
(struct_4char_block_move): Ditto.
(struct_8char_block_move): Ditto.
(struct_16char_block_move): Ditto.
(vect_slp_v2hi_store_align): Ditto.
(vect_slp_v2qi_store): Rename to ..
(vect_slp_v2qi_store_align): .. this.
(vect_slp_v4qi_store): Rename to ..
(vect_slp_v4qi_store_align): .. This.
(vect_slp_v8qi_store): Rename to ..
(vect_slp_v8qi_store_unalign_1): .. This.
(vect_slp_v16qi_store): Rename to ..
(vect_slp_v16qi_store_unalign_1): .. This.
(vect_slp_v2hi_store): Rename to ..
(vect_slp_v2hi_store_unalign): .. This.
(vect_slp_v4hi_store): Rename to ..
(vect_slp_v4hi_store_unalign): .. This.
(vect_slp_v2si_store): Rename to ..
(vect_slp_v2si_store_align): .. This.
(vect_slp_v4si_store): Rename to ..
(vect_slp_v4si_store_unalign): .. This.
(check_vect_slp_aligned_store_usage): Rename to ..
(check_vect_slp_store_usage): .. this and adjust code to make
it an exact pattern match of corresponding testcase.
|
|
This patch adds support to GCC's diagnostic subsystem for escaping certain
bytes and Unicode characters when quoting source code.
Specifically, this patch adds a new flag rich_location::m_escape_on_output
which is a hint from a diagnostic that non-ASCII bytes in the pertinent
lines of the user's source code should be escaped when printed.
The patch sets this for the following diagnostics:
- when complaining about stray bytes in the program (when these
are non-printable)
- when complaining about "null character(s) ignored";
- for -Wnormalized= (and generate source ranges for such warnings)
The escaping is controlled by a new option:
-fdiagnostics-escape-format=[unicode|bytes]
For example, consider a diagnostic involving a source line containing the
string "before" followed by the Unicode character U+03C0 ("GREEK SMALL
LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF
(a stray UTF-8 trailing byte), followed by the string "after", where the
diagnostic highlights the U+03C0 character.
By default, this line will be printed verbatim to the user when
reporting a diagnostic at it, as:
beforeπXafter
^
(using X for the stray byte to avoid putting invalid UTF-8 in this
commit message)
If the diagnostic sets the "escape" flag, it will be printed as:
before<U+03C0><BF>after
^~~~~~~~
with -fdiagnostics-escape-format=unicode (the default), or as:
before<CF><80><BF>after
^~~~~~~~
if the user supplies -fdiagnostics-escape-format=bytes.
This only affects how the source is printed; it does not affect
how column numbers are printed (as per -fdiagnostics-column-unit=
and -fdiagnostics-column-origin=).
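The two escape formats in the example above can be approximated with the following sketch. It is a simplified illustration, not the code added by this patch: `sketch_escape_format`, `sketch_escape_source_line` and the helper functions are hypothetical names, the UTF-8 decoder does not reject overlong encodings or surrogates, and ASCII bytes (including control characters) are passed through unchanged.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical mirror of the two documented option values.
enum class sketch_escape_format { unicode, bytes };

// Format one byte as "<XX>".
static std::string sketch_hex_byte(unsigned char b) {
  char buf[8];
  std::snprintf(buf, sizeof buf, "<%02X>", static_cast<unsigned>(b));
  return buf;
}

// Decode the UTF-8 sequence starting at s[i]. Returns its length in bytes
// and stores the code point in cp, or returns 0 if the bytes are not a
// well-formed sequence (for example a stray trailing byte).
static std::size_t sketch_decode_utf8(const std::string &s, std::size_t i,
                                      unsigned long &cp) {
  unsigned char b0 = static_cast<unsigned char>(s[i]);
  std::size_t len;
  if (b0 < 0x80) { cp = b0; return 1; }
  else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; }
  else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; }
  else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; }
  else return 0;
  if (i + len > s.size()) return 0;
  for (std::size_t k = 1; k < len; ++k) {
    unsigned char b = static_cast<unsigned char>(s[i + k]);
    if ((b & 0xC0) != 0x80) return 0;
    cp = (cp << 6) | (b & 0x3F);
  }
  return len;
}

// Escape the non-ASCII content of one source line.
std::string sketch_escape_source_line(const std::string &line,
                                      sketch_escape_format fmt) {
  std::string out;
  std::size_t i = 0;
  while (i < line.size()) {
    unsigned long cp = 0;
    std::size_t len = sketch_decode_utf8(line, i, cp);
    if (len == 1) {
      out += line[i];                       // ASCII: print verbatim
      i += 1;
    } else if (len == 0) {
      // Invalid byte: always shown as a hex byte.
      out += sketch_hex_byte(static_cast<unsigned char>(line[i]));
      i += 1;
    } else if (fmt == sketch_escape_format::unicode) {
      char buf[16];
      std::snprintf(buf, sizeof buf, "<U+%04lX>", cp);
      out += buf;                           // whole character as one escape
      i += len;
    } else {
      for (std::size_t k = 0; k < len; ++k) // each encoded byte separately
        out += sketch_hex_byte(static_cast<unsigned char>(line[i + k]));
      i += len;
    }
  }
  return out;
}
```

Applied to the line from the example (the bytes of "before", then 0xCF 0x80 0xBF, then "after"), the sketch yields `before<U+03C0><BF>after` in the unicode format and `before<CF><80><BF>after` in the bytes format.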
gcc/c-family/ChangeLog:
* c-lex.c (c_lex_with_flags): When complaining about non-printable
CPP_OTHER tokens, set the "escape on output" flag.
gcc/ChangeLog:
* common.opt (fdiagnostics-escape-format=): New.
(diagnostics_escape_format): New enum.
(DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value.
(DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise.
* diagnostic-format-json.cc (json_end_diagnostic): Add
"escape-source" attribute.
* diagnostic-show-locus.c
(exploc_with_display_col::exploc_with_display_col): Replace
"tabstop" param with a cpp_char_column_policy and add an "aspect"
param. Use these to compute m_display_col accordingly.
(struct char_display_policy): New struct.
(layout::m_policy): New field.
(layout::m_escape_on_output): New field.
(def_policy): New function.
(make_range): Update for changes to exploc_with_display_col ctor.
(default_print_decoded_ch): New.
(width_per_escaped_byte): New.
(escape_as_bytes_width): New.
(escape_as_bytes_print): New.
(escape_as_unicode_width): New.
(escape_as_unicode_print): New.
(make_policy): New.
(layout::layout): Initialize new fields. Update m_exploc ctor
call for above change to ctor.
(layout::maybe_add_location_range): Update for changes to
exploc_with_display_col ctor.
(layout::calculate_x_offset_display): Update for change to
cpp_display_width.
(layout::print_source_line): Pass policy
to cpp_display_width_computation. Capture cpp_decoded_char when
calling process_next_codepoint. Move printing of source code to
m_policy.m_print_cb.
(line_label::line_label): Pass in policy rather than context.
(layout::print_any_labels): Update for change to line_label ctor.
(get_affected_range): Pass in policy rather than context, updating
calls to location_compute_display_column accordingly.
(get_printed_columns): Likewise, also for cpp_display_width.
(correction::correction): Pass in policy rather than tabstop.
(correction::compute_display_cols): Pass m_policy rather than
m_tabstop to cpp_display_width.
(correction::m_tabstop): Replace with...
(correction::m_policy): ...this.
(line_corrections::line_corrections): Pass in policy rather than
context.
(line_corrections::m_context): Replace with...
(line_corrections::m_policy): ...this.
(line_corrections::add_hint): Update to use m_policy rather than
m_context.
(line_corrections::add_hint): Likewise.
(layout::print_trailing_fixits): Likewise.
(selftest::test_display_widths): New.
(selftest::test_layout_x_offset_display_utf8): Update to use
policy rather than tabstop.
(selftest::test_one_liner_labels_utf8): Add test of escaping
source lines.
(selftest::test_diagnostic_show_locus_one_liner_utf8): Update to
use policy rather than tabstop.
(selftest::test_overlapped_fixit_printing): Likewise.
(selftest::test_overlapped_fixit_printing_utf8): Likewise.
(selftest::test_overlapped_fixit_printing_2): Likewise.
(selftest::test_tab_expansion): Likewise.
(selftest::test_escaping_bytes_1): New.
(selftest::test_escaping_bytes_2): New.
(selftest::diagnostic_show_locus_c_tests): Call the new tests.
* diagnostic.c (diagnostic_initialize): Initialize
context->escape_format.
(convert_column_unit): Update to use default character width policy.
(selftest::test_diagnostic_get_location_text): Likewise.
* diagnostic.h (enum diagnostics_escape_format): New enum.
(diagnostic_context::escape_format): New field.
* doc/invoke.texi (-fdiagnostics-escape-format=): New option.
(-fdiagnostics-format=): Add "escape-source" attribute to examples
of JSON output, and document it.
* input.c (location_compute_display_column): Pass in "policy"
rather than "tabstop", passing to
cpp_byte_column_to_display_column.
(selftest::test_cpp_utf8): Update to use cpp_char_column_policy.
* input.h (class cpp_char_column_policy): New forward decl.
(location_compute_display_column): Pass in "policy" rather than
"tabstop".
* opts.c (common_handle_option): Handle
OPT_fdiagnostics_escape_format_.
* selftest.c (temp_source_file::temp_source_file): New ctor
overload taking a size_t.
* selftest.h (temp_source_file::temp_source_file): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-json-1.c: Add regexp to consume
"escape-source" attribute.
* c-c++-common/diagnostic-format-json-2.c: Likewise.
* c-c++-common/diagnostic-format-json-3.c: Likewise.
* c-c++-common/diagnostic-format-json-4.c: Likewise, twice.
* c-c++-common/diagnostic-format-json-5.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-bytes.c: New test.
* gcc.dg/cpp/warn-normalized-4-unicode.c: New test.
* gcc.dg/encoding-issues-bytes.c: New test.
* gcc.dg/encoding-issues-unicode.c: New test.
* gfortran.dg/diagnostic-format-json-1.F90: Add regexp to consume
"escape-source" attribute.
* gfortran.dg/diagnostic-format-json-2.F90: Likewise.
* gfortran.dg/diagnostic-format-json-3.F90: Likewise.
libcpp/ChangeLog:
* charset.c (convert_escape): Use encoding_rich_location when
complaining about nonprintable unknown escape sequences.
(cpp_display_width_computation::cpp_display_width_computation):
Pass in policy rather than tabstop.
(cpp_display_width_computation::process_next_codepoint): Add "out"
param and populate *out if non-NULL.
(cpp_display_width_computation::advance_display_cols): Pass NULL
to process_next_codepoint.
(cpp_byte_column_to_display_column): Pass in policy rather than
tabstop. Pass NULL to process_next_codepoint.
(cpp_display_column_to_byte_column): Pass in policy rather than
tabstop.
* errors.c (cpp_diagnostic_get_current_location): New function,
splitting out the logic from...
(cpp_diagnostic): ...here.
(cpp_warning_at): New function.
(cpp_pedwarning_at): New function.
* include/cpplib.h (cpp_warning_at): New decl for rich_location.
(cpp_pedwarning_at): Likewise.
(struct cpp_decoded_char): New.
(struct cpp_char_column_policy): New.
(cpp_display_width_computation::cpp_display_width_computation):
Replace "tabstop" param with "policy".
(cpp_display_width_computation::process_next_codepoint): Add "out"
param.
(cpp_display_width_computation::m_tabstop): Replace with...
(cpp_display_width_computation::m_policy): ...this.
(cpp_byte_column_to_display_column): Replace "tabstop" param with
"policy".
(cpp_display_width): Likewise.
(cpp_display_column_to_byte_column): Likewise.
* include/line-map.h (rich_location::escape_on_output_p): New.
(rich_location::set_escape_on_output): New.
(rich_location::m_escape_on_output): New.
* internal.h (cpp_diagnostic_get_current_location): New decl.
(class encoding_rich_location): New.
* lex.c (skip_whitespace): Use encoding_rich_location when
complaining about null characters.
(warn_about_normalization): Generate a source range when
complaining about improperly normalized tokens, rather than just a
point, and use encoding_rich_location so that the source code
is escaped on printing.
* line-map.c (rich_location::rich_location): Initialize
m_escape_on_output.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
|
|
gcc/ChangeLog:
2021-11-01 Xionghu Luo <luoxhu@linux.ibm.com>
* cfghooks.c (cfg_hook_duplicate_loop_to_header_edge): Rename
duplicate_loop_to_header_edge to
duplicate_loop_body_to_header_edge.
(cfg_hook_duplicate_loop_body_to_header_edge): Likewise.
* cfghooks.h (struct cfg_hooks): Likewise.
(cfg_hook_duplicate_loop_body_to_header_edge): Likewise.
* cfgloopmanip.c (duplicate_loop_body_to_header_edge): Likewise.
(clone_loop_to_header_edge): Likewise.
* cfgloopmanip.h (duplicate_loop_body_to_header_edge): Likewise.
* cfgrtl.c (struct cfg_hooks): Likewise.
* doc/loop.texi: Likewise.
* loop-unroll.c (unroll_loop_constant_iterations): Likewise.
(unroll_loop_runtime_iterations): Likewise.
(unroll_loop_stupid): Likewise.
(apply_opt_in_copies): Likewise.
* tree-cfg.c (struct cfg_hooks): Likewise.
* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Likewise.
(try_peel_loop): Likewise.
* tree-ssa-loop-manip.c (copy_phi_node_args): Likewise.
(gimple_duplicate_loop_body_to_header_edge): Likewise.
(tree_transform_and_unroll_loop): Likewise.
* tree-ssa-loop-manip.h (gimple_duplicate_loop_body_to_header_edge):
Likewise.
|