path: root/gcc/targhooks.h
Age | Commit message | Author | Files | Lines
2025-01-02 | Update copyright years. | Jakub Jelinek | 1 | -1/+1
2024-11-25 | Add target-independent store forwarding avoidance pass | Konstantinos Eleftheriou | 1 | -0/+3
This pass detects cases of expensive store forwarding and tries to avoid them by reordering the stores and using suitable bit insertion sequences. For example it can transform this:

    strb    w2, [x1, 1]
    ldr     x0, [x1]      # Expensive store forwarding to larger load.

To:

    ldr     x0, [x1]
    strb    w2, [x1]
    bfi     x0, x2, 0, 8

Assembly like this can appear with bitfields or type punning / unions. On stress-ng when running the cpu-union microbenchmark the following speedups have been observed.

    Neoverse-N1:      +29.4%
    Intel Coffeelake: +13.1%
    AMD 5950X:        +17.5%

The transformation is rejected on cases that cause store_bit_field to generate subreg expressions on different register classes. Files avoid-store-forwarding-4.c and avoid-store-forwarding-5.c contain such cases and have been marked as XFAIL.

Due to biasing of its operands in store_bit_field, there is special handling for machines with BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN. The need for this was exposed by an issue on the H8 architecture, which uses big-endian ordering, but BITS_BIG_ENDIAN is false. In that case, the START parameter of store_bit_field needs to be calculated from the end of the destination register.

gcc/ChangeLog:

* Makefile.in (OBJS): Add avoid-store-forwarding.o.
* common.opt (favoid-store-forwarding): New option.
* common.opt.urls: Regenerate.
* doc/invoke.texi: New param store-forwarding-max-distance.
* doc/passes.texi: Document new pass.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Document new pass.
* params.opt (store-forwarding-max-distance): New param.
* passes.def: Add pass_rtl_avoid_store_forwarding before pass_early_remat.
* target.def (avoid_store_forwarding_p): New DEFHOOK.
* target.h (struct store_fwd_info): Declare.
* targhooks.cc (default_avoid_store_forwarding_p): New function.
* targhooks.h (default_avoid_store_forwarding_p): Declare.
* tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
* avoid-store-forwarding.cc: New file.
* avoid-store-forwarding.h: New file.
* timevar.def (TV_AVOID_STORE_FORWARDING): New timevar.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/avoid-store-forwarding-1.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-2.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-3.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-4.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-5.c: New test.
* gcc.target/x86_64/abi/callabi/avoid-store-forwarding-1.c: New test.
* gcc.target/x86_64/abi/callabi/avoid-store-forwarding-2.c: New test.

Co-authored-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Konstantinos Eleftheriou <konstantinos.eleftheriou@vrull.eu>
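The kind of source that can lead to a narrow store followed by a wider load of the same memory can be sketched in C as follows. This is a minimal illustration, not taken from the patch or its testsuite; the names `word_or_bytes` and `set_byte_then_load` are hypothetical, and whether a given compiler emits exactly the store/load pair shown in the commit message depends on target and options.

```c
#include <stdint.h>

/* Hypothetical illustration: type punning through a union.  */
union word_or_bytes
{
  uint64_t word;
  uint8_t bytes[8];
};

/* Store one byte, then immediately load the full 64-bit word that
   overlaps it.  The load depends on the narrower store just issued.  */
uint64_t
set_byte_then_load (union word_or_bytes *p, uint8_t b)
{
  p->bytes[1] = b;   /* one-byte store */
  return p->word;    /* eight-byte load overlapping that store */
}
```

Reading a union member other than the one last written is permitted in C, which is why this shape shows up in real code and in benchmarks of the cpu-union kind.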
2024-07-22 | middle-end: Implement conditional store vectorizer pattern [PR115531] | Tamar Christina | 1 | -0/+1
This adds a conditional store optimization for the vectorizer as a pattern. The vectorizer already supports modifying memory accesses because of the pattern based gather/scatter recognition.

Doing it in the vectorizer allows us to still keep the ability to vectorize such loops for architectures that don't have MASK_STORE support, whereas doing this in ifcvt makes us commit to MASK_STORE.

Concretely for this loop:

    void foo1 (char *restrict a, int *restrict b, int *restrict c, int n, int stride)
    {
      if (stride <= 1)
        return;

      for (int i = 0; i < n; i++)
        {
          int res = c[i];
          int t = b[i+stride];
          if (a[i] != 0)
            res = t;
          c[i] = res;
        }
    }

today we generate:

    .L3:
            ld1b    z29.s, p7/z, [x0, x5]
            ld1w    z31.s, p7/z, [x2, x5, lsl 2]
            ld1w    z30.s, p7/z, [x1, x5, lsl 2]
            cmpne   p15.b, p6/z, z29.b, #0
            sel     z30.s, p15, z30.s, z31.s
            st1w    z30.s, p7, [x2, x5, lsl 2]
            add     x5, x5, x4
            whilelo p7.s, w5, w3
            b.any   .L3

which in gimple is:

    vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
    vect_t_20.12_74 = .MASK_LOAD (vectp.10_72, 32B, loop_mask_67);
    vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
    mask__34.16_79 = vect__9.15_77 != { 0, ... };
    vect_res_11.17_80 = VEC_COND_EXPR <mask__34.16_79, vect_t_20.12_74, vect_res_18.9_68>;
    .MASK_STORE (vectp_c.18_81, 32B, loop_mask_67, vect_res_11.17_80);

A MASK_STORE is already conditional, so there's no need to perform the load of the old values and the VEC_COND_EXPR. This patch makes it so we generate:

    vect_res_18.9_68 = .MASK_LOAD (vectp_c.7_65, 32B, loop_mask_67);
    vect__9.15_77 = .MASK_LOAD (vectp_a.13_75, 8B, loop_mask_67);
    mask__34.16_79 = vect__9.15_77 != { 0, ... };
    .MASK_STORE (vectp_c.18_81, 32B, mask__34.16_79, vect_res_18.9_68);

which generates:

    .L3:
            ld1b    z30.s, p7/z, [x0, x5]
            ld1w    z31.s, p7/z, [x1, x5, lsl 2]
            cmpne   p7.b, p7/z, z30.b, #0
            st1w    z31.s, p7, [x2, x5, lsl 2]
            add     x5, x5, x4
            whilelo p7.s, w5, w3
            b.any   .L3

gcc/ChangeLog:

PR tree-optimization/115531
* tree-vect-patterns.cc (vect_cond_store_pattern_same_ref): New.
(vect_recog_cond_store_pattern): New.
(vect_vect_recog_func_ptrs): Use it.
* target.def (conditional_operation_is_expensive): New.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Document it.
* targhooks.cc (default_conditional_operation_is_expensive): New.
* targhooks.h (default_conditional_operation_is_expensive): New.
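The redundancy that the pattern removes can also be seen in a scalar C model. The sketch below is illustrative only and is not part of the patch; `store_select` and `store_conditional` are hypothetical names, and the stride from the original loop is dropped for brevity. Both functions leave `c` in the same state, but the second never reads the old value of `c[i]`.

```c
#include <stddef.h>

/* Form before the pattern: always load the old value, select, store.  */
void
store_select (const char *a, const int *b, int *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      int res = c[i];      /* load of the old value */
      if (a[i] != 0)
        res = b[i];        /* select the new value where the mask is set */
      c[i] = res;          /* unconditional store */
    }
}

/* Form after the pattern: store only where the condition holds.  */
void
store_conditional (const char *a, const int *b, int *c, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (a[i] != 0)
      c[i] = b[i];         /* conditional store, no load of c[i] */
}
```

The second form corresponds to using the comparison result directly as the store mask, which is what makes the extra load and select unnecessary.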
2024-06-25 | Replace {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE with new hook mode_for_floating_type | Kewen Lin | 1 | -0/+1
Currently how we determine which mode will be used for a floating point type is that for a given type precision (size) call mode_for_size to get the first mode which has this size in the specified class. On Powerpc, we have three modes (TF/KF/IF) having the same mode precision 128 (see[1]), so the processing forces us to have to place TF at the first place, it would require us to make more adjustment in some generic code to avoid some unexpected mode conversions and it would be even worse if we get rid of TF eventually one day. And as Joseph pointed out in [2], "floating types should have their mode, not a poorly defined precision value", as Joseph and Richi suggested, this patch is to introduce one hook mode_for_floating_type which returns the corresponding mode for type float, double or long double. The default implementation returns SFmode for float and DFmode for double or long double. For ports which need special treatment, there are some other patches for their own port specific implementation (referring to how {,LONG_}DOUBLE_TYPE_SIZE get used there). For all generic uses of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE, depending on the context, some of them are replaced with TYPE_PRECISION of the according type node, some other are replaced with GET_MODE_PRECISION on the mode from mode_for_floating_type. This patch also poisons {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE, so most defines of {FLOAT,{,LONG_}DOUBLE}_TYPE_SIZE in port specific are removed, but there are still some which are good to be kept for readability then they get renamed with port specific prefix. [1] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651017.html [2] https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651209.html gcc/jit/ChangeLog: * jit-recording.cc (recording::memento_of_get_type::get_size): Update macros {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling targetm.c.mode_for_floating_type with TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE. gcc/ChangeLog: * coretypes.h (enum tree_index): Forward declaration. 
* defaults.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * doc/rtl.texi: Update document by replacing {FLOAT,DOUBLE}_TYPE_SIZE with C type {float,double}. * doc/tm.texi.in: Document new hook mode_for_floating_type, remove document entries for {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE and update document for WIDEST_HARDWARE_FP_SIZE. * doc/tm.texi: Regenerate. * emit-rtl.cc (init_emit_once): Replace DOUBLE_TYPE_SIZE by calling targetm.c.mode_for_floating_type with TI_DOUBLE_TYPE. * real.h (REAL_VALUE_TO_TARGET_LONG_DOUBLE): Use TYPE_PRECISION of long_double_type_node to replace LONG_DOUBLE_TYPE_SIZE. * system.h (FLOAT_TYPE_SIZE): Poison. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * target.def (mode_for_floating_type): New hook. * targhooks.cc (default_mode_for_floating_type): New function. (default_scalar_mode_supported_p): Update macros {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling targetm.c.mode_for_floating_type with TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE. * targhooks.h (default_mode_for_floating_type): New declaration. * tree-core.h (enum tree_index): Specify underlying type unsigned to sync with forward declaration in coretypes.h. (NUM_FLOATN_TYPES): Explicitly convert to int. (NUM_FLOATNX_TYPES): Likewise. (NUM_FLOATN_NX_TYPES): Likewise. * tree.cc (build_common_tree_nodes): Update macros {FLOAT,DOUBLE,LONG_DOUBLE}_TYPE_SIZE by calling targetm.c.mode_for_floating_type with TI_{FLOAT,DOUBLE,LONG_DOUBLE}_TYPE and set type mode accordingly. * config/arc/arc.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/bpf/bpf.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/epiphany/epiphany.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/fr30/fr30.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. 
* config/frv/frv.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/ft32/ft32.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/gcn/gcn.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/iq2000/iq2000.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/lm32/lm32.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/m32c/m32c.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/m32r/m32r.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/microblaze/microblaze.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/mmix/mmix.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/moxie/moxie.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/msp430/msp430.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/nds32/nds32.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/nios2/nios2.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/nvptx/nvptx.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/or1k/or1k.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/pdp11/pdp11.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/pru/pru.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. 
* config/stormy16/stormy16.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/visium/visium.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/xtensa/xtensa.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/rs6000/rs6000.cc (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. (rs6000_c_mode_for_floating_type): New function. * config/rs6000/rs6000.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/aarch64/aarch64.cc (aarch64_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/aarch64/aarch64.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/alpha/alpha.cc (alpha_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/alpha/alpha.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/avr/avr.cc (avr_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/avr/avr.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/i386/i386.cc (ix86_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/i386/i386.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/ia64/ia64.cc (ia64_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/ia64/ia64.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/riscv/riscv.cc (riscv_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/riscv/riscv.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. 
* config/rl78/rl78.cc (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. (rl78_c_mode_for_floating_type): New function. * config/rl78/rl78.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/rx/rx.cc (rx_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/rx/rx.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/s390/s390.cc (s390_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/s390/s390.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. * config/sh/sh.cc (sh_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/sh/sh.h (LONG_DOUBLE_TYPE_SIZE): Remove. * config/h8300/h8300.cc (h8300_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/h8300/h8300.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Remove. (LONG_DOUBLE_TYPE_SIZE): Remove. (DOUBLE_TYPE_MODE): New macro. * config/h8300/linux.h (DOUBLE_TYPE_SIZE): Remove. (DOUBLE_TYPE_MODE): New macro. * config/loongarch/loongarch.cc (loongarch_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/loongarch/loongarch.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Remove. (LONG_DOUBLE_TYPE_SIZE): Rename to ... (LA_LONG_DOUBLE_TYPE_SIZE): ... this. (UNITS_PER_FPVALUE): Replace LONG_DOUBLE_TYPE_SIZE with LA_LONG_DOUBLE_TYPE_SIZE. (MAX_FIXED_MODE_SIZE): Likewise. (STRUCTURE_SIZE_BOUNDARY): Likewise. (BIGGEST_ALIGNMENT): Likewise. * config/m68k/m68k.cc (m68k_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/m68k/m68k.h (LONG_DOUBLE_TYPE_SIZE): Remove. (LONG_DOUBLE_TYPE_MODE): New macro. * config/m68k/netbsd-elf.h (LONG_DOUBLE_TYPE_SIZE): Remove. (LONG_DOUBLE_TYPE_MODE): New macro. 
* config/mips/mips.cc (mips_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. * config/mips/mips.h (UNITS_PER_FPVALUE): Replace LONG_DOUBLE_TYPE_SIZE with MIPS_LONG_DOUBLE_TYPE_SIZE. (MAX_FIXED_MODE_SIZE): Likewise. (STRUCTURE_SIZE_BOUNDARY): Likewise. (BIGGEST_ALIGNMENT): Likewise. (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Remove. (LONG_DOUBLE_TYPE_SIZE): Rename to ... (MIPS_LONG_DOUBLE_TYPE_SIZE): ... this. * config/mips/n32-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (MIPS_LONG_DOUBLE_TYPE_SIZE): ... this. * config/pa/pa.cc (pa_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. (pa_scalar_mode_supported_p): Rename FLOAT_TYPE_SIZE to PA_FLOAT_TYPE_SIZE, rename DOUBLE_TYPE_SIZE to PA_DOUBLE_TYPE_SIZE and rename LONG_DOUBLE_TYPE_SIZE to PA_LONG_DOUBLE_TYPE_SIZE. * config/pa/pa.h (PA_FLOAT_TYPE_SIZE): New macro. (PA_DOUBLE_TYPE_SIZE): Likewise. (PA_LONG_DOUBLE_TYPE_SIZE): Likewise. * config/pa/pa-64.h (FLOAT_TYPE_SIZE): Rename to ... (PA_FLOAT_TYPE_SIZE): ... this. (DOUBLE_TYPE_SIZE): Rename to ... (PA_DOUBLE_TYPE_SIZE): ... this. (LONG_DOUBLE_TYPE_SIZE): Rename to ... (PA_LONG_DOUBLE_TYPE_SIZE): ... this. * config/pa/pa-hpux.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (PA_LONG_DOUBLE_TYPE_SIZE): ... this. * config/sparc/sparc.cc (sparc_c_mode_for_floating_type): New function. (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro. (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Likewise. (LONG_DOUBLE_TYPE_SIZE): Likewise. (sparc_type_code): Replace FLOAT_TYPE_SIZE with TYPE_PRECISION of float_type_node. * config/sparc/sparc.h (FLOAT_TYPE_SIZE): Remove. (DOUBLE_TYPE_SIZE): Remove. * config/sparc/freebsd.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (SPARC_LONG_DOUBLE_TYPE_SIZE): ... this. * config/sparc/linux.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (SPARC_LONG_DOUBLE_TYPE_SIZE): ... this. * config/sparc/linux64.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (SPARC_LONG_DOUBLE_TYPE_SIZE): ... this. 
* config/sparc/netbsd-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (SPARC_LONG_DOUBLE_TYPE_SIZE): ... this. * config/sparc/openbsd64.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (SPARC_LONG_DOUBLE_TYPE_SIZE): ... this. * config/sparc/sol2.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (SPARC_LONG_DOUBLE_TYPE_SIZE): ... this. * config/sparc/sp-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (SPARC_LONG_DOUBLE_TYPE_SIZE): ... this. * config/sparc/sp64-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ... (SPARC_LONG_DOUBLE_TYPE_SIZE): ... this. * config/bfin/bfin.h (FLOAT_TYPE_SIZE): Rename to ... (BFIN_FLOAT_TYPE_SIZE): ... this. (DOUBLE_TYPE_SIZE): Rename to ... (BFIN_DOUBLE_TYPE_SIZE): ... this. (LONG_DOUBLE_TYPE_SIZE): Remove. (UNITS_PER_FLOAT): Replace FLOAT_TYPE_SIZE with BFIN_FLOAT_TYPE_SIZE. (UNITS_PER_DOUBLE): Replace DOUBLE_TYPE_SIZE with BFIN_DOUBLE_TYPE_SIZE.
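The default behaviour described in this commit message (SFmode for float, DFmode for double and long double) can be modelled in a few lines of C. This is a sketch with stand-in types, not the code from targhooks.cc: `enum model_fp_type`, `enum model_fp_mode` and `model_default_mode_for_floating_type` are hypothetical names that only mirror the contract stated above.

```c
/* Stand-ins for the three C floating types the hook is asked about.  */
enum model_fp_type
{
  MODEL_FLOAT_TYPE,
  MODEL_DOUBLE_TYPE,
  MODEL_LONG_DOUBLE_TYPE
};

/* Stand-ins for the two machine modes the default hook can return.  */
enum model_fp_mode
{
  MODEL_SFMODE,
  MODEL_DFMODE
};

/* Model of the default: float maps to the single-precision mode,
   double and long double both map to the double-precision mode.  */
enum model_fp_mode
model_default_mode_for_floating_type (enum model_fp_type t)
{
  if (t == MODEL_FLOAT_TYPE)
    return MODEL_SFMODE;
  return MODEL_DFMODE;
}
```

A port whose long double is wider than double would supply its own hook rather than rely on this default, which is what the per-port entries in the ChangeLog above do.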
2024-06-13 | [APX CCMP] Add targetm.have_ccmp hook [PR115370] | Hongyu Wang | 1 | -0/+1
In cfgexpand, there is an optimization for branches which tests targetm.gen_ccmp_first == NULL. However, for a target like x86-64, the hook is implemented, but that does not indicate that ccmp is enabled. Add a new target hook TARGET_HAVE_CCMP and use it to replace the middle-end checks for the existence of gen_ccmp_first, to avoid misoptimization.

gcc/ChangeLog:

PR target/115370
PR target/115463
* target.def (have_ccmp): New target hook.
* targhooks.cc (default_have_ccmp): New function.
* targhooks.h (default_have_ccmp): New prototype.
* doc/tm.texi.in: Add TARGET_HAVE_CCMP.
* doc/tm.texi: Regenerate.
* cfgexpand.cc (expand_gimple_cond): Call targetm.have_ccmp instead of checking if targetm.gen_ccmp_first exists.
* expr.cc (expand_expr_real_gassign): Likewise.
* config/i386/i386.cc (ix86_have_ccmp): New target hook to check if APX_CCMP enabled.
(TARGET_HAVE_CCMP): Define.
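The difference between the old and new checks can be sketched with a small stand-alone C model. Everything here is hypothetical (`struct model_target`, `model_apx_ccmp_enabled` and the helper functions are not GCC's real data structures); the point is only that a non-NULL function pointer says the port implements the hook, not that the feature is currently enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a target hook table; not GCC's real struct.  */
struct model_target
{
  int (*gen_ccmp_first) (int);   /* non-NULL whenever the port implements it */
  bool (*have_ccmp) (void);      /* reports whether ccmp is enabled right now */
};

/* Hypothetical port: the generator exists, but the feature depends on
   a run-time option.  */
bool model_apx_ccmp_enabled = false;

int
model_gen_ccmp_first (int x)
{
  return x;
}

bool
model_have_ccmp (void)
{
  return model_apx_ccmp_enabled;
}

/* Old middle-end test: only asks whether the hook is implemented.  */
bool
model_old_check (const struct model_target *t)
{
  return t->gen_ccmp_first != NULL;
}

/* New middle-end test: asks the target whether ccmp is usable.  */
bool
model_new_check (const struct model_target *t)
{
  return t->have_ccmp ();
}
```

With the option off, the old check still answers "yes" while the new one answers "no", which is the mismatch the commit addresses.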
2024-05-10 | c++, mingw: Fix up types of dtor hooks to __cxa_{,thread_}atexit/__cxa_throw on mingw ia32 [PR114968] | Jakub Jelinek | 1 | -0/+1
__cxa_atexit/__cxa_thread_atexit/__cxa_throw functions accept function pointers that usually point directly to destructors rather than to wrappers around them.

Now, mingw ia32 implicitly uses the __attribute__((thiscall)) calling convention for METHOD_TYPE (where the this pointer is passed in the %ecx register, the rest on the stack), so these functions use:

in config/os/mingw32/os_defines.h:

    #if defined (__i386__)
    #define _GLIBCXX_CDTOR_CALLABI __thiscall
    #endif

in libsupc++/cxxabi.h

    __cxa_atexit(void (_GLIBCXX_CDTOR_CALLABI *)(void*), void*, void*) _GLIBCXX_NOTHROW;
    __cxa_thread_atexit(void (_GLIBCXX_CDTOR_CALLABI *)(void*), void*, void *) _GLIBCXX_NOTHROW;
    __cxa_throw(void*, std::type_info*, void (_GLIBCXX_CDTOR_CALLABI *) (void *)) __attribute__((__noreturn__));

Now, mingw for some weird reason uses

    #define TARGET_CXX_USE_ATEXIT_FOR_CXA_ATEXIT hook_bool_void_true

so it never actually uses __cxa_atexit, but does use __cxa_thread_atexit and __cxa_throw. Recent changes for modules result in more detailed __cxa_*atexit/__cxa_throw prototypes precreated by the compiler, and if that happens and one also includes <cxxabi.h>, the compiler complains about mismatches in the prototypes.

One problem is the missing thiscall attribute on the FUNCTION_TYPE. The other problem is that all of atexit/__cxa_atexit/__cxa_thread_atexit get function pointer types created by a single function, get_atexit_fn_ptr_type (), which creates the type as either void(*)(void) or void(*)(void *) depending on whether atexit or __cxa_atexit will be used, but when using atexit and __cxa_thread_atexit it uses the wrong function type for __cxa_thread_atexit.
The following patch adds a target hook to add the thiscall attribute to the function pointers, and splits the get_atexit_fn_ptr_type () function into get_atexit_fn_ptr_type () and get_cxa_atexit_fn_ptr_type (). The former always creates the shared void(*)(void) type; the latter creates either void(*)(void*) (on most targets) or void(__attribute__((thiscall))*)(void*) (on mingw ia32). So that we don't waste another GTY global tree for it, because cleanup_type used for the same purpose for __cxa_throw should be the same, the code changes it to use that type too.

In register_dtor_fn, based on the decision whether to use atexit, __cxa_atexit or __cxa_thread_atexit, it then picks the right function pointer type, and if it decides to emit a __tcf_* wrapper for the cleanup, it also uses that type for the wrapper so that it agrees on calling convention.

2024-05-10 Jakub Jelinek <jakub@redhat.com>

PR target/114968
gcc/
* target.def (use_atexit_for_cxa_atexit): Remove spurious space from comment.
(adjust_cdtor_callabi_fntype): New cxx target hook.
* targhooks.h (default_cxx_adjust_cdtor_callabi_fntype): Declare.
* targhooks.cc (default_cxx_adjust_cdtor_callabi_fntype): New function.
* doc/tm.texi.in (TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Add.
* doc/tm.texi: Regenerate.
* config/i386/i386.cc (ix86_cxx_adjust_cdtor_callabi_fntype): New function.
(TARGET_CXX_ADJUST_CDTOR_CALLABI_FNTYPE): Redefine.
gcc/cp/
* cp-tree.h (atexit_fn_ptr_type_node, cleanup_type): Adjust macro comments.
(get_cxa_atexit_fn_ptr_type): Declare.
* decl.cc (get_atexit_fn_ptr_type): Adjust function comment, only build type for atexit argument.
(get_cxa_atexit_fn_ptr_type): New function.
(get_atexit_node): Call get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type when using __cxa_atexit.
(get_thread_atexit_node): Call get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type.
(start_cleanup_fn): Add ob_parm argument, call get_cxa_atexit_fn_ptr_type or get_atexit_fn_ptr_type depending on it and create PARM_DECL also based on that argument.
(register_dtor_fn): Adjust start_cleanup_fn caller, use get_cxa_atexit_fn_ptr_type rather than get_atexit_fn_ptr_type for use_dtor casts.
* except.cc (build_throw): Use get_cxa_atexit_fn_ptr_type ().
2024-01-03 | Update copyright years. | Jakub Jelinek | 1 | -1/+1
2023-12-16 | Add support for target_version attribute | Andrew Carlotti | 1 | -0/+1
This patch adds support for the "target_version" attribute to the middle end and the C++ frontend, which will be used to implement function multiversioning in the aarch64 backend.

On targets that don't use the "target" attribute for multiversioning, there is no conflict between the "target" and "target_clones" attributes. This patch therefore makes the mutual exclusion in C-family, D and Ada conditional upon the value of the expanded_clones_attribute target hook.

The "target_version" attribute is only added to C++ in this patch, because this is currently the only frontend which supports multiversioning using the "target" attribute. Support for the "target_version" attribute will be extended to C at a later date.

Targets that currently use the "target" attribute for function multiversioning (i.e. i386 and rs6000) are not affected by this patch.

gcc/ChangeLog:

* attribs.cc (decl_attributes): Pass attribute name to target.
(is_function_default_version): Update comment to specify incompatibility with target_version attributes.
* cgraphclones.cc (cgraph_node::create_version_clone_with_body): Call valid_version_attribute_p for target_version attributes.
* defaults.h (TARGET_HAS_FMV_TARGET_ATTRIBUTE): New macro.
* target.def (valid_version_attribute_p): New hook.
* doc/tm.texi.in: Add new hook.
* doc/tm.texi: Regenerate.
* multiple_target.cc (create_dispatcher_calls): Remove redundant is_function_default_version check.
(expand_target_clones): Use target macro to pick attribute name.
* targhooks.cc (default_target_option_valid_version_attribute_p): New.
* targhooks.h (default_target_option_valid_version_attribute_p): New.
* tree.h (DECL_FUNCTION_VERSIONED): Update comment to include target_version attributes.

gcc/c-family/ChangeLog:

* c-attribs.cc (attr_target_exclusions): Make target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto, and add target_version.
(attr_target_version_exclusions): New.
(c_common_attribute_table): Add target_version.
(handle_target_version_attribute): New.
(handle_target_attribute): Amend comment.
(handle_target_clones_attribute): Ditto.

gcc/ada/ChangeLog:

* gcc-interface/utils.cc (attr_target_exclusions): Make target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto.

gcc/d/ChangeLog:

* d-attribs.cc (attr_target_exclusions): Make target/target_clones exclusion target-dependent.
(attr_target_clones_exclusions): Ditto.

gcc/cp/ChangeLog:

* decl2.cc (check_classfn): Update comment to include target_version attributes.
2023-12-05 | Add a new target hook: TARGET_START_CALL_ARGS | Richard Sandiford | 1 | -2/+3
We have the following two hooks into the call expansion code:

- TARGET_CALL_ARGS is called for each argument before arguments are moved into hard registers.

- TARGET_END_CALL_ARGS is called after the end of the call sequence (specifically, after any return value has been moved to a pseudo).

This patch adds a TARGET_START_CALL_ARGS hook that is called before the TARGET_CALL_ARGS sequence. This means that TARGET_START_CALL_ARGS and TARGET_END_CALL_ARGS bracket the region in which argument registers might be live. They also bracket a region in which the only call emitted by target-independent code is the call to the target function itself. (For example, TARGET_START_CALL_ARGS happens after any use of memcpy to copy arguments, and TARGET_END_CALL_ARGS happens before any use of memcpy to copy the result.)

Also, the patch adds the cumulative argument structure as an argument to the hooks, so that the target can use it to record and retrieve information about the call as a whole.

The TARGET_CALL_ARGS docs said:

    While generating RTL for a function call, this target hook is invoked once
    for each argument passed to the function, either a register returned by
    ``TARGET_FUNCTION_ARG`` or a memory location.  It is called just
    before the point where argument registers are stored.

The last bit was true for normal calls, but for libcalls the hook was invoked earlier, before stack arguments have been copied. I don't think this caused a practical difference for nvptx (the only port to use the hooks) since I wouldn't expect any libcalls to take stack parameters.

gcc/
* doc/tm.texi.in: Add TARGET_START_CALL_ARGS.
* doc/tm.texi: Regenerate.
* target.def (start_call_args): New hook.
(call_args, end_call_args): Add a parameter for the cumulative argument information.
* hooks.h (hook_void_rtx_tree): Delete.
* hooks.cc (hook_void_rtx_tree): Likewise.
* targhooks.h (hook_void_CUMULATIVE_ARGS): Declare.
(hook_void_CUMULATIVE_ARGS_rtx_tree): Likewise.
* targhooks.cc (hook_void_CUMULATIVE_ARGS): New function.
(hook_void_CUMULATIVE_ARGS_rtx_tree): Likewise.
* calls.cc (expand_call): Call start_call_args before computing and storing stack parameters. Pass the cumulative argument information to call_args and end_call_args.
(emit_library_call_value_1): Likewise.
* config/nvptx/nvptx.cc (nvptx_call_args): Add a cumulative argument parameter.
(nvptx_end_call_args): Likewise.
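The ordering of the three hooks described in this commit message can be sketched as a small trace in C. This is a toy model under stated assumptions, not calls.cc: `model_expand_call`, `model_note` and the trace buffer are hypothetical, and the real hooks emit RTL and receive the cumulative argument structure rather than appending text.

```c
#include <string.h>

/* Hypothetical trace buffer standing in for the emitted call sequence.  */
static char model_trace[128];

/* Append one event to the trace, truncating safely if the buffer fills.  */
static void
model_note (const char *event)
{
  strncat (model_trace, event,
           sizeof model_trace - strlen (model_trace) - 1);
}

/* Model of one call expansion with NARGS arguments.  */
void
model_expand_call (int nargs)
{
  model_trace[0] = '\0';
  model_note ("start;");        /* TARGET_START_CALL_ARGS */
  for (int i = 0; i < nargs; i++)
    model_note ("arg;");        /* TARGET_CALL_ARGS, once per argument */
  model_note ("call;");         /* the call to the target function */
  model_note ("end;");          /* TARGET_END_CALL_ARGS */
}

/* Read back the trace produced by the last model_expand_call.  */
const char *
model_get_trace (void)
{
  return model_trace;
}
```

The start and end events bracket everything else, which matches the description of the region in which argument registers might be live.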
2023-11-23 | gcc: Introduce -fhardened | Marek Polacek | 1 | -0/+1
In <https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628748.html> I proposed -fhardened, a new umbrella option that enables a reasonable set of hardening flags. The read of the room seems to be that the option would be useful. So here's a patch implementing that option.

Currently, -fhardened enables:

    -D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
    -D_GLIBCXX_ASSERTIONS
    -ftrivial-auto-var-init=zero
    -fPIE -pie -Wl,-z,relro,-z,now
    -fstack-protector-strong
    -fstack-clash-protection
    -fcf-protection=full (x86 GNU/Linux only)

-fhardened will not override options that were specified on the command line (before or after -fhardened). For example,

    -D_FORTIFY_SOURCE=1 -fhardened

means that _FORTIFY_SOURCE=1 will be used. Similarly,

    -fhardened -fstack-protector

will not enable -fstack-protector-strong.

Currently, -fhardened is only supported on GNU/Linux.

In DW_AT_producer it is reflected only as -fhardened; it doesn't expand to anything. This patch provides -Whardened, enabled by default, which warns when -fhardened couldn't enable a particular option. I think most often it will say that _FORTIFY_SOURCE wasn't enabled because optimizations were not enabled.

gcc/c-family/ChangeLog:

* c-opts.cc: Include "target.h".
(c_finish_options): Maybe cpp_define _FORTIFY_SOURCE and _GLIBCXX_ASSERTIONS.

gcc/ChangeLog:

* common.opt (Whardened, fhardened): New options.
* config.in: Regenerate.
* config/bpf/bpf.cc: Include "opts.h".
(bpf_option_override): If flag_stack_protector_set_by_fhardened_p, do not inform that -fstack-protector does not work.
* config/i386/i386-options.cc (ix86_option_override_internal): When -fhardened, maybe enable -fcf-protection=full.
* config/linux-protos.h (linux_fortify_source_default_level): Declare.
* config/linux.cc (linux_fortify_source_default_level): New.
* config/linux.h (TARGET_FORTIFY_SOURCE_DEFAULT_LEVEL): Redefine.
* configure: Regenerate.
* configure.ac: Check if the linker supports '-z now' and '-z relro'. Check if -fhardened is supported on $target_os.
* doc/invoke.texi: Document -fhardened and -Whardened.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_FORTIFY_SOURCE_DEFAULT_LEVEL): Add.
* gcc.cc (driver_handle_option): Remember if any link options or -static were specified on the command line.
(process_command): When -fhardened, maybe enable -pie and -Wl,-z,relro,-z,now.
* opts.cc (flag_stack_protector_set_by_fhardened_p): New global.
(finish_options): When -fhardened, enable -ftrivial-auto-var-init=zero and -fstack-protector-strong.
(print_help_hardened): New.
(print_help): Call it.
* opts.h (flag_stack_protector_set_by_fhardened_p): Declare.
* target.def (fortify_source_default_level): New target hook.
* targhooks.cc (default_fortify_source_default_level): New.
* targhooks.h (default_fortify_source_default_level): Declare.
* toplev.cc (process_options): When -fhardened, enable -fstack-clash-protection. If flag_stack_protector_set_by_fhardened_p, do not warn that -fstack-protector not supported for this target. Don't enable -fhardened when !HAVE_FHARDENED_SUPPORT.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/help.exp: Test -fhardened.
* c-c++-common/fhardened-1.S: New test.
* c-c++-common/fhardened-1.c: New test.
* c-c++-common/fhardened-10.c: New test.
* c-c++-common/fhardened-11.c: New test.
* c-c++-common/fhardened-12.c: New test.
* c-c++-common/fhardened-13.c: New test.
* c-c++-common/fhardened-14.c: New test.
* c-c++-common/fhardened-15.c: New test.
* c-c++-common/fhardened-2.c: New test.
* c-c++-common/fhardened-3.c: New test.
* c-c++-common/fhardened-4.c: New test.
* c-c++-common/fhardened-5.c: New test.
* c-c++-common/fhardened-6.c: New test.
* c-c++-common/fhardened-7.c: New test.
* c-c++-common/fhardened-8.c: New test.
* c-c++-common/fhardened-9.c: New test.
* gcc.target/i386/cf_check-6.c: New test.
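The precedence rule stated in this commit message, that -fhardened never overrides an option the user gave explicitly, can be modelled in C as below. This is a simplified sketch, not the logic in opts.cc or gcc.cc; `struct model_option` and `model_apply_hardened_default` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical option record; GCC's real option tracking is more
   involved than a single flag.  */
struct model_option
{
  int value;
  bool set_explicitly;   /* true if the user gave it on the command line */
};

/* Apply the value -fhardened would choose, but only when the user did
   not choose one.  The position of -fhardened relative to the explicit
   option does not matter in this model, matching "before or after".  */
void
model_apply_hardened_default (struct model_option *opt, int hardened_value)
{
  if (!opt->set_explicitly)
    opt->value = hardened_value;
}
```

For instance, an explicit _FORTIFY_SOURCE level of 1 survives the call unchanged, while an option left at its default picks up the hardened value.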
2023-11-18gcov: Remove TARGET_GCOV_TYPE_SIZE target hookSebastian Huber1-2/+0
This reverts commit 8cdcea51c0fd753e6a652c9b236e91b3a6e0911c. gcc/c-family/ChangeLog: * c-cppbuiltin.cc (c_cpp_builtins): Do not define __LIBGCC_GCOV_TYPE_SIZE. gcc/ChangeLog: * config/sparc/rtemself.h (SPARC_GCOV_TYPE_SIZE): Remove. * config/sparc/sparc.cc (sparc_gcov_type_size): Likewise. (TARGET_GCOV_TYPE_SIZE): Likewise. * coverage.cc (get_gcov_type): Use LONG_LONG_TYPE_SIZE instead of removed target hook. * doc/tm.texi: Regenerate. * doc/tm.texi.in (TARGET_GCOV_TYPE_SIZE): Remove. * target.def: Likewise. * targhooks.cc (default_gcov_type_size): Likewise. * targhooks.h (default_gcov_type_size): Likewise. libgcc/ChangeLog: * libgcov.h (gcov_type): Use LONG_LONG_TYPE_SIZE. (gcov_type_unsigned): Likewise.
2023-09-06Middle-end _BitInt support [PR102989]Jakub Jelinek1-0/+1
The following patch introduces the middle-end part of the _BitInt support, a new BITINT_TYPE, handling it where needed, except the lowering pass and sanitizer support. 2023-09-06 Jakub Jelinek <jakub@redhat.com> PR c/102989 * tree.def (BITINT_TYPE): New type. * tree.h (TREE_CHECK6, TREE_NOT_CHECK6): Define. (NUMERICAL_TYPE_CHECK, INTEGRAL_TYPE_P): Include BITINT_TYPE. (BITINT_TYPE_P): Define. (CONSTRUCTOR_BITFIELD_P): Return true even for BLKmode bit-fields if they have BITINT_TYPE type. (tree_check6, tree_not_check6): New inline functions. (any_integral_type_check): Include BITINT_TYPE. (build_bitint_type): Declare. * tree.cc (tree_code_size, wide_int_to_tree_1, cache_integer_cst, build_zero_cst, type_hash_canon_hash, type_cache_hasher::equal, type_hash_canon): Handle BITINT_TYPE. (bitint_type_cache): New variable. (build_bitint_type): New function. (signed_or_unsigned_type_for, verify_type_variant, verify_type): Handle BITINT_TYPE. (tree_cc_finalize): Free bitint_type_cache. * builtins.cc (type_to_class): Handle BITINT_TYPE. (fold_builtin_unordered_cmp): Handle BITINT_TYPE like INTEGER_TYPE. * cfgexpand.cc (expand_debug_expr): Punt on BLKmode BITINT_TYPE INTEGER_CSTs. * convert.cc (convert_to_pointer_1, convert_to_real_1, convert_to_complex_1): Handle BITINT_TYPE like INTEGER_TYPE. (convert_to_integer_1): Likewise. For BITINT_TYPE don't check GET_MODE_PRECISION (TYPE_MODE (type)). * doc/generic.texi (BITINT_TYPE): Document. * doc/tm.texi.in (TARGET_C_BITINT_TYPE_INFO): New. * doc/tm.texi: Regenerated. * dwarf2out.cc (base_type_die, is_base_type, modified_type_die, gen_type_die_with_usage): Handle BITINT_TYPE. (rtl_for_decl_init): Punt on BLKmode BITINT_TYPE INTEGER_CSTs or handle those which fit into shwi. * expr.cc (expand_expr_real_1): Define EXTEND_BITINT macro, reduce to bitfield precision reads from BITINT_TYPE vars, parameters or memory locations. Expand large/huge BITINT_TYPE INTEGER_CSTs into memory. 
* fold-const.cc (fold_convert_loc, make_range_step): Handle BITINT_TYPE. (extract_muldiv_1): For BITINT_TYPE use TYPE_PRECISION rather than GET_MODE_SIZE (SCALAR_INT_TYPE_MODE). (native_encode_int, native_interpret_int, native_interpret_expr): Handle BITINT_TYPE. * gimple-expr.cc (useless_type_conversion_p): Make BITINT_TYPE to some other integral type or vice versa conversions non-useless. * gimple-fold.cc (gimple_fold_builtin_memset): Punt for BITINT_TYPE. (clear_padding_unit): Mention in comment that _BitInt types don't need to fit either. (clear_padding_bitint_needs_padding_p): New function. (clear_padding_type_may_have_padding_p): Handle BITINT_TYPE. (clear_padding_type): Likewise. * internal-fn.cc (expand_mul_overflow): For unsigned non-mode precision operands force pos_neg? to 1. (expand_MULBITINT, expand_DIVMODBITINT, expand_FLOATTOBITINT, expand_BITINTTOFLOAT): New functions. * internal-fn.def (MULBITINT, DIVMODBITINT, FLOATTOBITINT, BITINTTOFLOAT): New internal functions. * internal-fn.h (expand_MULBITINT, expand_DIVMODBITINT, expand_FLOATTOBITINT, expand_BITINTTOFLOAT): Declare. * match.pd (non-equality compare simplifications from fold_binary): Punt if TYPE_MODE (arg1_type) is BLKmode. * pretty-print.h (pp_wide_int): Handle printing of large precision wide_ints which would buffer overflow digit_buffer. * stor-layout.cc (finish_bitfield_representative): For bit-fields with BITINT_TYPE, prefer representatives with precisions in multiple of limb precision. (layout_type): Handle BITINT_TYPE. Handle COMPLEX_TYPE with BLKmode element type and assert it is BITINT_TYPE. * target.def (bitint_type_info): New C target hook. * target.h (struct bitint_info): New type. * targhooks.cc (default_bitint_type_info): New function. * targhooks.h (default_bitint_type_info): Declare. * tree-pretty-print.cc (dump_generic_node): Handle BITINT_TYPE. Handle printing large wide_ints which would buffer overflow digit_buffer. * tree-ssa-sccvn.cc: Include target.h. 
(eliminate_dom_walker::eliminate_stmt): Punt for large/huge BITINT_TYPE. * tree-switch-conversion.cc (jump_table_cluster::emit): For more than 64-bit BITINT_TYPE subtract low bound from expression and cast to 64-bit integer type both the controlling expression and case labels. * typeclass.h (enum type_class): Add bitint_type_class enumerator. * varasm.cc (output_constant): Handle BITINT_TYPE INTEGER_CSTs. * vr-values.cc (check_for_binary_op_overflow): Use widest2_int rather than widest_int. (simplify_using_ranges::simplify_internal_call_using_ranges): Use unsigned_type_for rather than build_nonstandard_integer_type.
2023-08-09targhooks: Extend legitimate_address_p with code_helper [PR110248]Kewen Lin1-3/+3
As PR110248 shows, some middle-end passes like IVOPTs can query the target hook legitimate_address_p with some artificially constructed rtx to determine whether some addressing modes are supported by target for some gimple statement. But for now the existing legitimate_address_p only checks the given mode, it's unable to distinguish some special cases unfortunately, for example, for LEN_LOAD ifn on Power port, we would expand it with lxvl hardware insn, which only supports one register to hold the address (the other register is holding the length), that is we don't support base (reg) + index (reg) addressing mode for sure. But hook legitimate_address_p only considers the given mode which would be some vector mode for LEN_LOAD ifn, and we do support base + index addressing mode for normal vector load and store insns, so the hook will return true for the query unexpectedly. This patch is to introduce one extra argument of type code_helper for hook legitimate_address_p, it makes targets able to handle some special case like what's described above. PR tree-optimization/110248 gcc/ChangeLog: * coretypes.h (class code_helper): Add forward declaration. * doc/tm.texi: Regenerate. * lra-constraints.cc (valid_address_p): Call target hook targetm.addr_space.legitimate_address_p with an extra parameter ERROR_MARK as its prototype changes. * recog.cc (memory_address_addr_space_p): Likewise. * reload.cc (strict_memory_address_addr_space_p): Likewise. * target.def (legitimate_address_p, addr_space.legitimate_address_p): Extend with one more argument of type code_helper, update the documentation accordingly. * targhooks.cc (default_legitimate_address_p): Adjust for the new code_helper argument. (default_addr_space_legitimate_address_p): Likewise. * targhooks.h (default_legitimate_address_p): Likewise. (default_addr_space_legitimate_address_p): Likewise. 
* config/aarch64/aarch64.cc (aarch64_legitimate_address_hook_p): Adjust with extra unnamed code_helper argument with default ERROR_MARK. * config/alpha/alpha.cc (alpha_legitimate_address_p): Likewise. * config/arc/arc.cc (arc_legitimate_address_p): Likewise. * config/arm/arm-protos.h (arm_legitimate_address_p): Likewise. (tree.h): New include for tree_code ERROR_MARK. * config/arm/arm.cc (arm_legitimate_address_p): Adjust with extra unnamed code_helper argument with default ERROR_MARK. * config/avr/avr.cc (avr_addr_space_legitimate_address_p): Likewise. * config/bfin/bfin.cc (bfin_legitimate_address_p): Likewise. * config/bpf/bpf.cc (bpf_legitimate_address_p): Likewise. * config/c6x/c6x.cc (c6x_legitimate_address_p): Likewise. * config/cris/cris-protos.h (cris_legitimate_address_p): Likewise. (tree.h): New include for tree_code ERROR_MARK. * config/cris/cris.cc (cris_legitimate_address_p): Adjust with extra unnamed code_helper argument with default ERROR_MARK. * config/csky/csky.cc (csky_legitimate_address_p): Likewise. * config/epiphany/epiphany.cc (epiphany_legitimate_address_p): Likewise. * config/frv/frv.cc (frv_legitimate_address_p): Likewise. * config/ft32/ft32.cc (ft32_addr_space_legitimate_address_p): Likewise. * config/gcn/gcn.cc (gcn_addr_space_legitimate_address_p): Likewise. * config/h8300/h8300.cc (h8300_legitimate_address_p): Likewise. * config/i386/i386.cc (ix86_legitimate_address_p): Likewise. * config/ia64/ia64.cc (ia64_legitimate_address_p): Likewise. * config/iq2000/iq2000.cc (iq2000_legitimate_address_p): Likewise. * config/lm32/lm32.cc (lm32_legitimate_address_p): Likewise. * config/loongarch/loongarch.cc (loongarch_legitimate_address_p): Likewise. * config/m32c/m32c.cc (m32c_legitimate_address_p): Likewise. (m32c_addr_space_legitimate_address_p): Likewise. * config/m32r/m32r.cc (m32r_legitimate_address_p): Likewise. * config/m68k/m68k.cc (m68k_legitimate_address_p): Likewise. * config/mcore/mcore.cc (mcore_legitimate_address_p): Likewise. 
* config/microblaze/microblaze-protos.h (tree.h): New include for tree_code ERROR_MARK. (microblaze_legitimate_address_p): Adjust with extra unnamed code_helper argument with default ERROR_MARK. * config/microblaze/microblaze.cc (microblaze_legitimate_address_p): Likewise. * config/mips/mips.cc (mips_legitimate_address_p): Likewise. * config/mmix/mmix.cc (mmix_legitimate_address_p): Likewise. * config/mn10300/mn10300.cc (mn10300_legitimate_address_p): Likewise. * config/moxie/moxie.cc (moxie_legitimate_address_p): Likewise. * config/msp430/msp430.cc (msp430_legitimate_address_p): Likewise. (msp430_addr_space_legitimate_address_p): Adjust with extra code_helper argument with default ERROR_MARK and adjust the call to function msp430_legitimate_address_p. * config/nds32/nds32.cc (nds32_legitimate_address_p): Adjust with extra unnamed code_helper argument with default ERROR_MARK. * config/nios2/nios2.cc (nios2_legitimate_address_p): Likewise. * config/nvptx/nvptx.cc (nvptx_legitimate_address_p): Likewise. * config/or1k/or1k.cc (or1k_legitimate_address_p): Likewise. * config/pa/pa.cc (pa_legitimate_address_p): Likewise. * config/pdp11/pdp11.cc (pdp11_legitimate_address_p): Likewise. * config/pru/pru.cc (pru_addr_space_legitimate_address_p): Likewise. * config/riscv/riscv.cc (riscv_legitimate_address_p): Likewise. * config/rl78/rl78-protos.h (rl78_as_legitimate_address): Likewise. (tree.h): New include for tree_code ERROR_MARK. * config/rl78/rl78.cc (rl78_as_legitimate_address): Adjust with extra unnamed code_helper argument with default ERROR_MARK. * config/rs6000/rs6000.cc (rs6000_legitimate_address_p): Likewise. (rs6000_debug_legitimate_address_p): Adjust with extra code_helper argument and adjust the call to function rs6000_legitimate_address_p. * config/rx/rx.cc (rx_is_legitimate_address): Adjust with extra unnamed code_helper argument with default ERROR_MARK. * config/s390/s390.cc (s390_legitimate_address_p): Likewise. 
* config/sh/sh.cc (sh_legitimate_address_p): Likewise. * config/sparc/sparc.cc (sparc_legitimate_address_p): Likewise. * config/v850/v850.cc (v850_legitimate_address_p): Likewise. * config/vax/vax.cc (vax_legitimate_address_p): Likewise. * config/visium/visium.cc (visium_legitimate_address_p): Likewise. * config/xtensa/xtensa.cc (xtensa_legitimate_address_p): Likewise. * config/stormy16/stormy16-protos.h (xstormy16_legitimate_address_p): Likewise. (tree.h): New include for tree_code ERROR_MARK. * config/stormy16/stormy16.cc (xstormy16_legitimate_address_p): Adjust with extra unnamed code_helper argument with default ERROR_MARK.
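The effect of the extra argument can be illustrated with a toy model. Everything below is a stand-in invented for the sketch (toy_addr_form, toy_code, toy_legitimate_address_p); it is not the real hook, which works on rtx addresses, a machine_mode and a code_helper. It only mirrors the idea that a query without a specific operation (the ERROR_MARK default) behaves as before, while a LEN_LOAD query can reject reg + reg addressing.

```cpp
// Hypothetical address shapes a pass such as IVOPTs might ask about.
enum class toy_addr_form { reg, reg_plus_offset, reg_plus_reg };

// Hypothetical stand-in for code_helper.  error_mark means "no particular
// operation", which is what callers with nothing specific to say pass.
enum class toy_code { error_mark, len_load };

// Toy target hook.  Without a specific operation, every form is accepted,
// as for ordinary vector loads and stores.  For len_load the second
// register is needed for the length, so only a single-register address
// is accepted.
bool
toy_legitimate_address_p (toy_addr_form addr,
			  toy_code code = toy_code::error_mark)
{
  if (code == toy_code::len_load)
    return addr == toy_addr_form::reg;
  return true;
}
```

With this model, toy_legitimate_address_p (toy_addr_form::reg_plus_reg) is still accepted, whereas the same address queried with toy_code::len_load is rejected, which is the distinction the mode-only hook could not express.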
2023-04-28Add targetm.libm_function_max_errorJakub Jelinek1-0/+3
As has been discussed before, the following patch adds a target hook for math library function maximum errors measured in ulps. The default is to return ~0U which is a magic maximum value which means nothing is known about precision of the math function. The first argument is unsigned int because enum combined_fn isn't available everywhere where target hooks are included but is expected to be given the enum combined_fn value, although it should be used solely to find out which kind of math function (say sin vs. cos vs. sqrt vs. exp10) rather than its variant (f suffix, no suffix, l suffix, f128 suffix, ...), for which there is the machine_mode argument. The last argument is a bool, if it is false, the function should return maximum known error in ulps for a given function (taking -frounding-math into account if enabled), with 0.5ulps being represented as 0. If it is true, it is about whether the function can return values outside of an intrinsic finite range for the function and by how many ulps. E.g. sin/cos should return result in [-1.,1], if the function is expected to never return values outside of that finite interval, the hook should return 0. Similarly for sqrt such range is [-0.,+Inf]. The patch implements it for glibc only so far, I hope other maintainers can submit details for Solaris, musl, perhaps BSDs, etc. 
For glibc I've gathered data from: 1) https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html as latest published glibc data 2) https://www.gnu.org/software/libc/manual/2.22/html_node/Errors-in-Math-Functions.html as a few years old glibc data 3) using attached libc-ulps.sh script from glibc git 4) using attached ulp-tester.c (how to invoke in file comment; tested both x86_64, ppc64, ppc64le 50M pseudo-random values in all 4 rounding modes, plus on x86_64 float/double sin/cos using libmvec - see attached libmvec-wrapper.c as well) 5) using attached boundary-tester.c to test for whether sin/cos/sqrt return values outside of the intrinsic ranges for those functions (again, tested on x86_64, ppc64, ppc64le plus on x86_64 using libmvec as well; libmvec with non-default rounding modes is pretty much random number generator it seems) The data is added to various hooks, the generic and generic glibc versions being in targhooks.c so that the various targets can easily override it. The intent is that the generic glibc version handles most of the stuff and specific target arch overrides handle the outliers or special cases. The patch has special case for x86_64 when __FAST_MATH__ is defined (as one can use in that case either libm or libmvec and we don't know which one will be used; so it uses maximum of what libm provides and libmvec), rs6000 (had to add one because cosf has 3ulps on ppc* rather than 1-2ulps on most other targets; MODE_COMPOSITE_P could be in theory handled in the generic code too, but as we have rs6000-linux specific function, it can be done just there), arc-linux (because DFmode sin has 7ulps there compared to 1ulps on other targets, both in default rounding mode and in others) and or1k-linux (while DFmode sin has 1ulps there for default rounding mode, for other rounding modes it has up to 7ulps). 
Now, for -frounding-math I'm trying to add a few ulps more because I expect it to be much less tested, except that for boundary_p I try to use the numbers I got from the 5) tester. 2023-04-28 Jakub Jelinek <jakub@redhat.com> * target.def (libm_function_max_error): New target hook. * doc/tm.texi.in (TARGET_LIBM_FUNCTION_MAX_ERROR): Add. * doc/tm.texi: Regenerated. * targhooks.h (default_libm_function_max_error, glibc_linux_libm_function_max_error): Declare. * targhooks.cc: Include case-cfn-macros.h. (default_libm_function_max_error, glibc_linux_libm_function_max_error): New functions. * config/linux.h (TARGET_LIBM_FUNCTION_MAX_ERROR): Redefine. * config/linux-protos.h (linux_libm_function_max_error): Declare. * config/linux.cc: Include target.h and targhooks.h. (linux_libm_function_max_error): New function. * config/arc/arc.cc: Include targhooks.h and case-cfn-macros.h. (arc_libm_function_max_error): New function. (TARGET_LIBM_FUNCTION_MAX_ERROR): Redefine. * config/i386/i386.cc (ix86_libc_has_fast_function): Formatting fix. (ix86_libm_function_max_error): New function. (TARGET_LIBM_FUNCTION_MAX_ERROR): Redefine. * config/rs6000/rs6000-protos.h (rs6000_linux_libm_function_max_error): Declare. * config/rs6000/rs6000-linux.cc: Include target.h, targhooks.h, tree.h and case-cfn-macros.h. (rs6000_linux_libm_function_max_error): New function. * config/rs6000/linux.h (TARGET_LIBM_FUNCTION_MAX_ERROR): Redefine. * config/rs6000/linux64.h (TARGET_LIBM_FUNCTION_MAX_ERROR): Redefine. * config/or1k/or1k.cc: Include targhooks.h and case-cfn-macros.h. (or1k_libm_function_max_error): New function. (TARGET_LIBM_FUNCTION_MAX_ERROR): Redefine.
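The calling convention described above (~0U for "nothing known", 0 for 0.5ulps, and a separate boundary query) can be sketched as follows. All names here (toy_fn, toy_mode, toy_libm_max_error, ...) are hypothetical stand-ins, and the numbers returned for sin and cos are placeholders chosen for the illustration, not the measured glibc data the patch gathers.

```cpp
// Hypothetical stand-ins: the real hook takes an unsigned holding a
// combined_fn value, a machine_mode and a bool.
enum toy_fn { TOY_SIN, TOY_COS, TOY_SQRT, TOY_OTHER };
enum toy_mode { TOY_SF, TOY_DF };

// Magic value meaning "nothing is known about this function".
const unsigned toy_unknown_error = ~0U;

// Toy default: claim no knowledge at all.
unsigned
toy_default_max_error (unsigned, toy_mode, bool)
{
  return toy_unknown_error;
}

// Toy library-specific override.
unsigned
toy_libm_max_error (unsigned fn, toy_mode mode, bool boundary_p)
{
  if (boundary_p)
    switch (fn)
      {
      case TOY_SIN:
      case TOY_COS:
      case TOY_SQRT:
	// Assumption for the sketch: results never leave the intrinsic
	// range ([-1, 1] for sin/cos, [-0, +Inf] for sqrt).
	return 0;
      default:
	return toy_unknown_error;
      }

  switch (fn)
    {
    case TOY_SQRT:
      // Assuming a correctly rounded sqrt: 0.5ulps is represented as 0.
      return 0;
    case TOY_SIN:
    case TOY_COS:
      // Placeholder value, not a measurement.
      return 1;
    default:
      return toy_default_max_error (fn, mode, boundary_p);
    }
}
```

A caller would typically compare the result against toy_unknown_error first and only use the ulp count for widening a value range when something is actually known.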
2023-03-12middle-end: Implement preferred_div_as_shifts_over_mult [PR108583]Tamar Christina1-0/+2
This now implements a hook preferred_div_as_shifts_over_mult that indicates whether a target prefers that the vectorizer decomposes division as shifts rather than multiplication when possible. In order to be able to use this we need to check whether the current precision has enough bits to do the operation without any of the additions overflowing. We use range information to determine this and only do the operation if we're sure an overflow won't occur. This now uses ranger to do this range check. This seems to work better than vect_get_range_info which uses range_query, but I have not switched the interface of vect_get_range_info over in this PR fix. As Andy said before initializing a ranger instance is cheap but not free, and if the intention is to call it often during a pass it should be instantiated at pass startup and passed along to the places that need it. This is a big refactoring and doesn't seem right to do in this PR. But we should in GCC 14. Currently we only instantiate it after a long series of much cheaper checks. gcc/ChangeLog: PR target/108583 * target.def (preferred_div_as_shifts_over_mult): New. * doc/tm.texi.in: Document it. * doc/tm.texi: Regenerate. * targhooks.cc (default_preferred_div_as_shifts_over_mult): New. * targhooks.h (default_preferred_div_as_shifts_over_mult): New. * tree-vect-patterns.cc (vect_recog_divmod_pattern): Use it. gcc/testsuite/ChangeLog: PR target/108583 * gcc.dg/vect/vect-div-bitmask-4.c: New test. * gcc.dg/vect/vect-div-bitmask-5.c: New test.
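As a scalar illustration of what "division as shifts" and the overflow condition mean, here is a sketch for the divide-by-255 case using one well-known shift-and-add identity. It is not taken from the patch; the function names are invented, and whether the vectorizer emits exactly this sequence is not claimed here.

```cpp
#include <cstdint>

// x / 255 computed with two shifts and two additions.  The arithmetic is
// done in 32 bits so that neither addition can overflow for any 16-bit x.
constexpr std::uint32_t
div255_by_shifts (std::uint16_t x)
{
  return (static_cast<std::uint32_t> (x)
	  + ((static_cast<std::uint32_t> (x) + 257u) >> 8)) >> 8;
}

// Would both additions stay within 16 bits if x is known to be at most
// max_x?  This plays the role of the range check: the sums only grow
// with x, so checking the upper bound is enough.
constexpr bool
additions_fit_in_16_bits (std::uint32_t max_x)
{
  return max_x + 257u <= 0xFFFFu
	 && max_x + ((max_x + 257u) >> 8) <= 0xFFFFu;
}

// Exhaustively compare the shift form against real division.
bool
div255_matches_division ()
{
  for (std::uint32_t x = 0; x <= 0xFFFFu; ++x)
    if (div255_by_shifts (static_cast<std::uint16_t> (x)) != x / 255u)
      return false;
  return true;
}
```

For example, if range information says x is the product of two bytes (at most 255 * 255), additions_fit_in_16_bits reports that the sequence can stay in 16 bits, whereas for an unconstrained 16-bit x it cannot and a wider type would be needed.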
2023-03-12middle-end: Revert can_special_div_by_const changes [PR108583]Tamar Christina1-2/+0
This reverts the changes for the CAN_SPECIAL_DIV_BY_CONST hook. gcc/ChangeLog: PR target/108583 * doc/tm.texi (TARGET_VECTORIZE_CAN_SPECIAL_DIV_BY_CONST): Remove. * doc/tm.texi.in: Likewise. * explow.cc (round_push, align_dynamic_address): Revert previous patch. * expmed.cc (expand_divmod): Likewise. * expmed.h (expand_divmod): Likewise. * expr.cc (force_operand, expand_expr_divmod): Likewise. * optabs.cc (expand_doubleword_mod, expand_doubleword_divmod): Likewise. * target.def (can_special_div_by_const): Remove. * target.h: Remove tree-core.h include * targhooks.cc (default_can_special_div_by_const): Remove. * targhooks.h (default_can_special_div_by_const): Remove. * tree-vect-generic.cc (expand_vector_operation): Remove hook. * tree-vect-patterns.cc (vect_recog_divmod_pattern): Remove hook. * tree-vect-stmts.cc (vectorizable_operation): Remove hook.
2023-03-03c++, v3: Emit fundamental tinfos for _Float16/decltype(0.0bf16) types on ia32 with -mno-sse2 [PR108883]Jakub Jelinek1-0/+2
_Float16 and decltype(0.0bf16) types are on x86 supported only with -msse2. On x86_64 that is the default, but on ia32 it is not. We should still emit fundamental type tinfo for those types in libsupc++.a/libstdc++.*, regardless of whether libsupc++/libstdc++ is compiled with -msse2 or not, as user programs can be compiled with different ISA flags from libsupc++/libstdc++ and if they are compiled with -msse2 and use std::float16_t or std::bfloat16_t and need RTTI for it, it should work out of the box. Furthermore, libstdc++ ABI on ia32 shouldn't depend on whether the library is compiled with -mno-sse or -msse2. Unfortunately, just hacking up libsupc++ Makefile/configure so that a single source is compiled with -msse2 isn't appropriate, because that TU emits also code and the code should be able to run on CPUs which libstdc++ supports. We could add [[gnu::attribute ("no-sse2")]] there perhaps conditionally, but it all gets quite ugly. The following patch instead adds a target hook which allows the backend to temporarily tweak registered types such that emit_support_tinfos emits whatever is needed. Additionally, it makes emit_support_tinfos_1 call emit_tinfo_decl immediately, so that temporarily created dummy types for emit_support_tinfo purposes only can be nullified again afterwards. And removes the previous fallback_* types used for dfloat*_type_node tinfos even when decimal types aren't supported. 2023-03-03 Jakub Jelinek <jakub@redhat.com> PR target/108883 gcc/ * target.h (emit_support_tinfos_callback): New typedef. * targhooks.h (default_emit_support_tinfos): Declare. * targhooks.cc (default_emit_support_tinfos): New function. * target.def (emit_support_tinfos): New target hook. * doc/tm.texi.in (emit_support_tinfos): Document it. * doc/tm.texi: Regenerated. * config/i386/i386.cc (ix86_emit_support_tinfos): New function. (TARGET_EMIT_SUPPORT_TINFOS): Redefine. 
gcc/cp/ * cp-tree.h (enum cp_tree_index): Remove CPTI_FALLBACK_DFLOAT*_TYPE enumerators. (fallback_dfloat32_type, fallback_dfloat64_type, fallback_dfloat128_type): Remove. * rtti.cc (emit_support_tinfo_1): If not emitted already, call emit_tinfo_decl and remove from unemitted_tinfo_decls right away. (emit_support_tinfos): Move &dfloat*_type_node from fundamentals array into new fundamentals_with_fallback array. Call emit_support_tinfo_1 on elements of that array too, with the difference that if the type is NULL, use a fallback REAL_TYPE for it temporarily. Drop the !targetm.decimal_float_supported_p () handling. Call targetm.emit_support_tinfos at the end. * mangle.cc (write_builtin_type): Remove references to fallback_dfloat*_type. Handle bfloat16_type_node mangling.
2023-01-02Update copyright years.Jakub Jelinek1-1/+1
2022-11-24Adjust the symbol for SECTION_LINK_ORDER linked_to section [PR99889]Kewen.Lin1-3/+0
As discussed in PR98125, -fpatchable-function-entry with SECTION_LINK_ORDER support doesn't work well on powerpc64 ELFv1 because the filled "Symbol" in .section name,"flags"o,@type,Symbol sits in .opd section instead of in the function_section like .text or named .text*. Since we already generate one label LPFE* which sits in function_section of current_function_decl, this patch is to reuse it as the symbol for the linked_to section. It avoids the above ABI specific issue when using the symbol concluded from current_function_decl. Besides, with this support some previous workarounds can be reverted. PR target/99889 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): Adjust to call function default_print_patchable_function_entry. * targhooks.cc (default_print_patchable_function_entry_1): Remove and move the flags preparation ... (default_print_patchable_function_entry): ... here, adjust to use current_function_funcdef_no for label no. * targhooks.h (default_print_patchable_function_entry_1): Remove. * varasm.cc (default_elf_asm_named_section): Adjust code for __patchable_function_entries section support with LPFE label. gcc/testsuite/ChangeLog: * g++.dg/pr93195a.C: Remove the skip on powerpc*-*-* 64-bit. * gcc.target/aarch64/pr92424-2.c: Adjust LPFE1 with LPFE0. * gcc.target/aarch64/pr92424-3.c: Likewise. * gcc.target/i386/pr93492-2.c: Likewise. * gcc.target/i386/pr93492-3.c: Likewise. * gcc.target/i386/pr93492-4.c: Likewise. * gcc.target/i386/pr93492-5.c: Likewise.
2022-11-14middle-end: Support not decomposing specific divisions during vectorization.Tamar Christina1-0/+2
In plenty of image and video processing code it's common to modify pixel values by a widening operation and then scale them back into range by dividing by 255. e.g.: x = y / (2 ^ (bitsize (y)/2) - 1) This patch adds a new target hook can_special_div_by_const, similar to can_vec_perm which can be called to check if a target will handle a particular division in a special way in the back-end. The vectorizer will then vectorize the division using the standard tree code and at expansion time the hook is called again to generate the code for the division. A lot of the changes in the patch are to pass down the tree operands in all paths that can lead to the divmod expansion so that the target hook always has the type of the expression you're expanding since the types can change the expansion. gcc/ChangeLog: * expmed.h (expand_divmod): Pass tree operands down in addition to RTX. * expmed.cc (expand_divmod): Likewise. * explow.cc (round_push, align_dynamic_address): Likewise. * expr.cc (force_operand, expand_expr_divmod): Likewise. * optabs.cc (expand_doubleword_mod, expand_doubleword_divmod): Likewise. * target.h: Include tree-core. * target.def (can_special_div_by_const): New. * targhooks.cc (default_can_special_div_by_const): New. * targhooks.h (default_can_special_div_by_const): New. * tree-vect-generic.cc (expand_vector_operation): Use it. * doc/tm.texi.in: Document it. * doc/tm.texi: Regenerate. * tree-vect-patterns.cc (vect_recog_divmod_pattern): Check for support. * tree-vect-stmts.cc (vectorizable_operation): Likewise. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-div-bitmask-1.c: New test. * gcc.dg/vect/vect-div-bitmask-2.c: New test. * gcc.dg/vect/vect-div-bitmask-3.c: New test. * gcc.dg/vect/vect-div-bitmask.h: New file.
2022-11-04Better integrate default 'sorry' 'TARGET_ASM_CONSTRUCTOR', 'TARGET_ASM_DESTRUCTOR'Thomas Schwinge1-0/+2
... after commit 4ee35c11fd328728c12f3e086ae016ca94624bf8 "Restore default 'sorry' 'TARGET_ASM_CONSTRUCTOR', 'TARGET_ASM_DESTRUCTOR'". No functional change. gcc/ * Makefile.in (OBJS): Remove 'dbxout.o'. * config/nvptx/nvptx.cc: Don't '#include "dbxout.h"'. * dbxout.cc: Remove. * dbxout.h: Likewise. * target-def.h (TARGET_ASM_CONSTRUCTOR, TARGET_ASM_DESTRUCTOR): Default to 'default_asm_out_constructor', 'default_asm_out_destructor'. * targhooks.cc (default_asm_out_constructor) (default_asm_out_destructor): New. * targhooks.h (default_asm_out_constructor) (default_asm_out_destructor): Declare.
2022-04-25target/89125 - BSD and math functionsSteve Kargl1-0/+1
Back story: When GCC is configured and built on non-glibc platforms, it seems very little to no effort is made to enumerate the available C99 libm functions. It is all or nothing for C99 libm. The patch introduces a new function, used only on FreeBSD, to inform gcc that it has C99 libm functions (minus a few which clearly GCC does not check nor test). 2022-04-15 Steven G. Kargl <kargl@gcc.gnu.org> PR target/89125 * config/freebsd.h: Define TARGET_LIBC_HAS_FUNCTION to be bsd_libc_has_function. * targhooks.cc (bsd_libc_has_function): New function. Expand the supported math functions to include C99 libm. * targhooks.h (bsd_libc_has_function): New Prototype.
2022-01-04ipa-inline: Add target info into fn summary [PR102059]Kewen Lin1-0/+2
Power ISA 2.07 (Power8) introduces the transactional memory feature but ISA3.1 (Power10) removes it. It exposes one troublesome issue as PR102059 shows. Users define some function with target pragma cpu=power10 then it calls one function with attribute always_inline which inherits command line option -mcpu=power8 which enables HTM implicitly. The current isa_flags check doesn't allow this inlining due to "target specific option mismatch" and an error message is emitted. Normally, the callee function isn't intended to exploit HTM feature, but the default flag setting makes it look as if it does. As Richi raised in the PR, we have fp_expressions flag in function summary, and allow us to check the function actually contains any floating point expressions to avoid overkill. So this patch follows the similar idea but is more target specific, for this rs6000 port specific requirement on HTM feature check, we would like to check rs6000 specific HTM built-in functions and inline assembly, it allows targets to do their own customized checks and updates. It introduces two target hooks need_ipa_fn_target_info and update_ipa_fn_target_info. The former allows target to do some previous check and decides to collect target specific information for this function or not. For some special case, it can predict the analysis result and set it early without any scanning. The latter allows the analyze_function_body to pass gimple stmts down just like fp_expressions handlings, target can do its own tricks. I put them together as one hook initially with one boolean to indicate whether it's initial time, but the code looks a bit ugly, to separate them seems to have better readability. gcc/ChangeLog: PR ipa/102059 * config/rs6000/rs6000.c (TARGET_NEED_IPA_FN_TARGET_INFO): New macro. (TARGET_UPDATE_IPA_FN_TARGET_INFO): Likewise. (rs6000_need_ipa_fn_target_info): New function. (rs6000_update_ipa_fn_target_info): Likewise. (rs6000_can_inline_p): Adjust for ipa function summary target info. 
* config/rs6000/rs6000.h (RS6000_FN_TARGET_INFO_HTM): New macro. * ipa-fnsummary.c (ipa_dump_fn_summary): Adjust for ipa function summary target info. (analyze_function_body): Adjust for ipa function summary target info and call hook rs6000_need_ipa_fn_target_info and rs6000_update_ipa_fn_target_info. (ipa_merge_fn_summary_after_inlining): Adjust for ipa function summary target info. (inline_read_section): Likewise. (ipa_fn_summary_write): Likewise. * ipa-fnsummary.h (ipa_fn_summary::target_info): New member. * doc/tm.texi: Regenerate. * doc/tm.texi.in (TARGET_UPDATE_IPA_FN_TARGET_INFO): Document new hook. (TARGET_NEED_IPA_FN_TARGET_INFO): Likewise. * target.def (update_ipa_fn_target_info): New hook. (need_ipa_fn_target_info): Likewise. * targhooks.c (default_need_ipa_fn_target_info): New function. (default_update_ipa_fn_target_info): Likewise. * targhooks.h (default_update_ipa_fn_target_info): New declare. (default_need_ipa_fn_target_info): Likewise. gcc/testsuite/ChangeLog: PR ipa/102059 * gcc.dg/lto/pr102059-1_0.c: New test. * gcc.dg/lto/pr102059-1_1.c: New test. * gcc.dg/lto/pr102059-1_2.c: New test. * gcc.dg/lto/pr102059-2_0.c: New test. * gcc.dg/lto/pr102059-2_1.c: New test. * gcc.dg/lto/pr102059-2_2.c: New test. * gcc.target/powerpc/pr102059-1.c: New test. * gcc.target/powerpc/pr102059-2.c: New test. * gcc.target/powerpc/pr102059-3.c: New test.
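The division of labour between the two hooks can be sketched with a toy model. The types, names and signatures below (toy_stmt_kind, toy_info_htm, toy_need_ipa_fn_target_info, ...) are invented for the illustration; in particular, letting the per-statement hook return false to stop scanning early is an assumption of this sketch, and real hooks would inspect declarations and gimple statements rather than an enum.

```cpp
#include <vector>

// Hypothetical statement categories; the rs6000 case cares about HTM
// built-in calls and inline assembly.
enum class toy_stmt_kind { plain, htm_builtin, inline_asm };

// Hypothetical info bit, modelled loosely on RS6000_FN_TARGET_INFO_HTM.
const unsigned toy_info_htm = 1u << 0;

// First hook: initialise INFO and decide whether the body needs scanning.
// If HTM is not enabled for the function it cannot use HTM, so the
// result is known up front and no scan is needed.
bool
toy_need_ipa_fn_target_info (bool htm_enabled, unsigned &info)
{
  info = 0;
  return htm_enabled;
}

// Second hook: called per statement.  Returns false once INFO can no
// longer change, so that the caller may stop scanning (an assumption of
// this sketch).
bool
toy_update_ipa_fn_target_info (unsigned &info, toy_stmt_kind stmt)
{
  if (stmt == toy_stmt_kind::htm_builtin || stmt == toy_stmt_kind::inline_asm)
    {
      info |= toy_info_htm;
      return false;
    }
  return true;
}

// Toy counterpart of the summary computation: run the first hook once,
// then feed statements to the second until it says to stop.
unsigned
toy_analyze_function_body (bool htm_enabled,
			   const std::vector<toy_stmt_kind> &body)
{
  unsigned info;
  if (!toy_need_ipa_fn_target_info (htm_enabled, info))
    return info;
  for (toy_stmt_kind stmt : body)
    if (!toy_update_ipa_fn_target_info (info, stmt))
      break;
  return info;
}
```

An inlining check in this model would then compare the summary bit rather than the raw option flags: a callee whose summary is 0 does not actually depend on HTM, even if HTM happened to be enabled when it was compiled.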
2022-01-03Update copyright years.Jakub Jelinek1-1/+1
2021-11-04vect: Convert cost hooks to classesRichard Sandiford1-7/+1
The current vector cost interface has quite a bit of redundancy built in. Each target that defines its own hooks has to replicate the basic unsigned[3] management. Currently each target also duplicates the cost adjustment for inner loops. This patch instead defines a vector_costs class for holding the scalar or vector cost and allows targets to subclass it. There is then only one costing hook: to create a new costs structure of the appropriate type. Everything else can be virtual functions, with common concepts implemented in the base class rather than in each target's derivation. This might seem like excess C++-ification, but it shaves ~100 LOC. I've also got some follow-on changes that become significantly easier with this patch. Maybe it could help with things like weighting blocks based on frequency too. This will clash with Andre's unrolling patches. His patches have priority so this patch should queue behind them. The x86 and rs6000 parts fully convert to a self-contained class. The equivalent aarch64 changes are more complex, so this patch just does the bare minimum. A later patch will rework the aarch64 bits. gcc/ * target.def (targetm.vectorize.init_cost): Replace with... (targetm.vectorize.create_costs): ...this. (targetm.vectorize.add_stmt_cost): Delete. (targetm.vectorize.finish_cost): Likewise. (targetm.vectorize.destroy_cost_data): Likewise. * doc/tm.texi.in (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * doc/tm.texi: Regenerate. * tree-vectorizer.h (vec_info::vec_info): Remove target_cost_data parameter. (vec_info::target_cost_data): Change from a void * to a vector_costs *. (vector_costs): New class. (init_cost): Take a vec_info and return a vector_costs. (dump_stmt_cost): Remove data parameter. (add_stmt_cost): Replace vinfo and data parameters with a vector_costs. 
(add_stmt_costs): Likewise. (finish_cost): Replace data parameter with a vector_costs. (destroy_cost_data): Delete. * tree-vectorizer.c (dump_stmt_cost): Remove data argument and don't print it. (vec_info::vec_info): Remove the target_cost_data parameter and initialize the member variable to null instead. (vec_info::~vec_info): Delete target_cost_data instead of calling destroy_cost_data. (vector_costs::add_stmt_cost): New function. (vector_costs::finish_cost): Likewise. (vector_costs::record_stmt_cost): Likewise. (vector_costs::adjust_cost_for_freq): Likewise. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Update call to vec_info::vec_info. (vect_compute_single_scalar_iteration_cost): Update after above changes to costing interface. (vect_analyze_loop_operations): Likewise. (vect_estimate_min_profitable_iters): Likewise. (vect_analyze_loop_2): Initialize LOOP_VINFO_TARGET_COST_DATA at the start_over point, where it needs to be recreated after trying without slp. Update retry code accordingly. * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Update call to vec_info::vec_info. (vect_slp_analyze_operation): Update after above changes to costing interface. (vect_bb_vectorization_profitable_p): Likewise. * targhooks.h (default_init_cost): Replace with... (default_vectorize_create_costs): ...this. (default_add_stmt_cost): Delete. (default_finish_cost, default_destroy_cost_data): Likewise. * targhooks.c (default_init_cost): Replace with... (default_vectorize_create_costs): ...this. (default_add_stmt_cost): Delete, moving logic to vector_costs instead. (default_finish_cost, default_destroy_cost_data): Delete. * config/aarch64/aarch64.c (aarch64_vector_costs): Inherit from vector_costs. Add a constructor. (aarch64_init_cost): Replace with... (aarch64_vectorize_create_costs): ...this. (aarch64_add_stmt_cost): Replace with... (aarch64_vector_costs::add_stmt_cost): ...this. Use record_stmt_cost to adjust the cost for inner loops. (aarch64_finish_cost): Replace with... 
(aarch64_vector_costs::finish_cost): ...this. (aarch64_destroy_cost_data): Delete. (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * config/i386/i386.c (ix86_vector_costs): New structure. (ix86_init_cost): Replace with... (ix86_vectorize_create_costs): ...this. (ix86_add_stmt_cost): Replace with... (ix86_vector_costs::add_stmt_cost): ...this. Use adjust_cost_for_freq to adjust the cost for inner loops. (ix86_finish_cost, ix86_destroy_cost_data): Delete. (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. * config/rs6000/rs6000.c (TARGET_VECTORIZE_INIT_COST): Replace with... (TARGET_VECTORIZE_CREATE_COSTS): ...this. (TARGET_VECTORIZE_ADD_STMT_COST): Delete. (TARGET_VECTORIZE_FINISH_COST): Likewise. (TARGET_VECTORIZE_DESTROY_COST_DATA): Likewise. (rs6000_cost_data): Inherit from vector_costs. Add a constructor. Drop loop_info, cost and costing_for_scalar in favor of the corresponding vector_costs member variables. Add "m_" to the names of the remaining member variables and initialize them. (rs6000_density_test): Replace with... (rs6000_cost_data::density_test): ...this. (rs6000_init_cost): Replace with... (rs6000_vectorize_create_costs): ...this. (rs6000_update_target_cost_per_stmt): Replace with... (rs6000_cost_data::update_target_cost_per_stmt): ...this. (rs6000_add_stmt_cost): Replace with... (rs6000_cost_data::add_stmt_cost): ...this. Use adjust_cost_for_freq to adjust the cost for inner loops. (rs6000_adjust_vect_cost_per_loop): Replace with... (rs6000_cost_data::adjust_vect_cost_per_loop): ...this. (rs6000_finish_cost): Replace with... (rs6000_cost_data::finish_cost): ...this. 
Group loop code into a single if statement and pass the loop_vinfo down to subroutines. (rs6000_destroy_cost_data): Delete.
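The class-based design described in this commit can be illustrated with a small, self-contained sketch. This is not GCC's actual vector_costs class: every name prefixed with sketch_ or SKETCH_ is hypothetical, the inner-loop weighting factor of 50 and the per-statement penalty are assumptions made for illustration, and the real interface takes GCC-internal types that are omitted here. It only shows the idea of a single factory hook plus virtual functions, with the shared bookkeeping kept in the base class.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the three places a cost can be recorded.
enum sketch_cost_location { SKETCH_PROLOGUE = 0, SKETCH_BODY = 1, SKETCH_EPILOGUE = 2 };

// Simplified model of a base costs class; not GCC's real vector_costs.
class sketch_vector_costs
{
public:
  explicit sketch_vector_costs (bool costing_for_scalar)
    : m_costing_for_scalar (costing_for_scalar) {}
  virtual ~sketch_vector_costs () = default;

  // Default costing: weight the cost for inner loops, then record it.
  virtual unsigned add_stmt_cost (int count, unsigned stmt_cost,
                                  bool in_inner_loop,
                                  sketch_cost_location where)
  {
    unsigned cost = adjust_cost_for_freq (in_inner_loop, count * stmt_cost);
    return record_stmt_cost (where, cost);
  }

  // Called once after all statements have been costed.
  virtual void finish_cost ()
  {
    assert (!m_finished);
    m_finished = true;
  }

  unsigned total (sketch_cost_location where) const { return m_costs[where]; }

protected:
  // Shared unsigned[3] bookkeeping that each target used to replicate.
  unsigned record_stmt_cost (sketch_cost_location where, unsigned cost)
  {
    m_costs[where] += cost;
    return cost;
  }

  // Shared inner-loop weighting; the factor 50 is an assumption.
  unsigned adjust_cost_for_freq (bool in_inner_loop, unsigned cost) const
  {
    return in_inner_loop ? cost * 50 : cost;
  }

  bool m_costing_for_scalar;
  bool m_finished = false;
  unsigned m_costs[3] = { 0, 0, 0 };
};

// Hypothetical target subclass: adds one unit per vector statement.
class sketch_target_costs : public sketch_vector_costs
{
public:
  using sketch_vector_costs::sketch_vector_costs;

  unsigned add_stmt_cost (int count, unsigned stmt_cost, bool in_inner_loop,
                          sketch_cost_location where) override
  {
    unsigned per_stmt = m_costing_for_scalar ? stmt_cost : stmt_cost + 1;
    return sketch_vector_costs::add_stmt_cost (count, per_stmt,
                                               in_inner_loop, where);
  }
};

// The single remaining "hook": create a costs object of the target's type.
std::unique_ptr<sketch_vector_costs>
sketch_create_costs (bool costing_for_scalar)
{
  return std::unique_ptr<sketch_vector_costs>
    (new sketch_target_costs (costing_for_scalar));
}
```

With this shape, a target only overrides the functions whose behaviour differs, and destroying the object replaces the old destroy_cost_data hook.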
2021-10-20  calls.c: Remove some dead code and target hooks  Alex Coplan  1  -4/+0
Looking at calls.c:initialize_argument_information, I spotted some dead code that seems to have been left behind from when MPX support was removed. This change removes that code as well as the associated target hooks (which appear to be unused). gcc/ChangeLog: * calls.c (initialize_argument_information): Remove some dead code, remove handling for function_arg returning const_int. * doc/tm.texi: Delete documentation for unused target hooks. * doc/tm.texi.in: Likewise. * target.def (load_bounds_for_arg): Delete. (store_bounds_for_arg): Delete. (load_returned_bounds): Delete. (store_returned_bounds): Delete. * targhooks.c (default_load_bounds_for_arg): Delete. (default_store_bounds_for_arg): Delete. (default_load_returned_bounds): Delete. (default_store_returned_bounds): Delete. * targhooks.h (default_load_bounds_for_arg): Delete. (default_store_bounds_for_arg): Delete. (default_load_returned_bounds): Delete. (default_store_returned_bounds): Delete.
2021-08-16  gcov: Add TARGET_GCOV_TYPE_SIZE target hook  Sebastian Huber  1  -0/+2
If -fprofile-update=atomic is used, then the target must provide atomic operations for the counters of the type returned by get_gcov_type(). This is a 64-bit type for targets which have a 64-bit long long type. On 32-bit targets this could be an issue since they may not provide 64-bit atomic operations. Allow targets to override the default type size with the new TARGET_GCOV_TYPE_SIZE target hook. If a 32-bit gcov type size is used, then there is currently a warning in libgcov-driver.c in a dead code block due to sizeof (counter) == sizeof (gcov_unsigned_t): libgcc/libgcov-driver.c: In function 'dump_counter': libgcc/libgcov-driver.c:401:46: warning: right shift count >= width of type [-Wshift-count-overflow] 401 | dump_unsigned ((gcov_unsigned_t)(counter >> 32), dump_fn, arg); | ^~ gcc/c-family/ * c-cppbuiltin.c (c_cpp_builtins): Define __LIBGCC_GCOV_TYPE_SIZE if flag_building_libgcc is true. gcc/ * config/sparc/rtemself.h (SPARC_GCOV_TYPE_SIZE): Define. * config/sparc/sparc.c (sparc_gcov_type_size): New. (TARGET_GCOV_TYPE_SIZE): Redefine if SPARC_GCOV_TYPE_SIZE is defined. * coverage.c (get_gcov_type): Use targetm.gcov_type_size(). * doc/tm.texi (TARGET_GCOV_TYPE_SIZE): Add hook under "Misc". * doc/tm.texi.in: Regenerate. * target.def (gcov_type_size): New target hook. * targhooks.c (default_gcov_type_size): New. * targhooks.h (default_gcov_type_size): Declare. * tree-profile.c (gimple_gen_edge_profiler): Use precision of gcov_type_node. (gimple_gen_time_profiler): Likewise. libgcc/ * libgcov.h (gcov_type): Define using __LIBGCC_GCOV_TYPE_SIZE. (gcov_type_unsigned): Likewise.
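The decision this hook makes can be sketched in a few lines. The functions below are hypothetical models, not the code in targhooks.c or sparc.c: the default is assumed to pick a 64-bit counter whenever long long is wider than 32 bits, and the override stands for a 32-bit target that lacks 64-bit atomic operations.

```cpp
#include <cassert>

// Hypothetical model of the default hook: 64-bit counters when the
// target's long long is wider than 32 bits, otherwise 32-bit counters.
inline int sketch_default_gcov_type_size (int long_long_bits)
{
  assert (long_long_bits > 0);
  return long_long_bits > 32 ? 64 : 32;
}

// Hypothetical target override: fall back to 32-bit counters when the
// target cannot update a 64-bit counter atomically.
inline int sketch_target_gcov_type_size (bool has_64bit_atomics,
                                         int long_long_bits)
{
  if (!has_64bit_atomics)
    return 32;
  return sketch_default_gcov_type_size (long_long_bits);
}
```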
2021-07-29  Use preferred mode for doloop IV [PR61837]  Jiufu Guo  1  -0/+1
Currently, the doloop.xx variable uses the same type as niter, which may be narrower than the word size. For some targets, a word-sized type would be better. For example, on a 64-bit system, accessing a 32-bit value may require a subreg, so a 64-bit type may be better for niter if the value can be represented in both 32 and 64 bits. This patch adds a target hook to query the preferred mode for the doloop IV and updates the mode accordingly. gcc/ChangeLog: 2021-07-29 Jiufu Guo <guojiufu@linux.ibm.com> PR target/61837 * config/rs6000/rs6000.c (TARGET_PREFERRED_DOLOOP_MODE): New hook. (rs6000_preferred_doloop_mode): New hook. * doc/tm.texi: Regenerate. * doc/tm.texi.in: Add hook preferred_doloop_mode. * target.def (preferred_doloop_mode): New hook. * targhooks.c (default_preferred_doloop_mode): New hook. * targhooks.h (default_preferred_doloop_mode): New hook. * tree-ssa-loop-ivopts.c (compute_doloop_base_on_mode): New function. (add_iv_candidate_for_doloop): Call targetm.preferred_doloop_mode and compute_doloop_base_on_mode. gcc/testsuite/ChangeLog: 2021-07-29 Jiufu Guo <guojiufu@linux.ibm.com> PR target/61837 * gcc.target/powerpc/pr61837.c: New test.
2021-06-17  Add a target calls hook: TARGET_PUSH_ARGUMENT  H.J. Lu  1  -0/+1
1. Replace PUSH_ARGS with a target calls hook, TARGET_PUSH_ARGUMENT, which takes an integer argument. When it returns true, push instructions will be used to pass outgoing arguments. If the argument is nonzero, it is the number of bytes to push and indicates the PUSH instruction usage is optional so that the backend can decide if PUSH instructions should be generated. Otherwise, the argument is zero. 2. Implement x86 target hook which returns false when the number of bytes to push is no less than 16 (8 for 32-bit targets) if vector load and store can be used. 3. Remove target PUSH_ARGS definitions which return 0 as it is the same as the default. 4. Define TARGET_PUSH_ARGUMENT of cr16 and m32c to always return true. gcc/ PR target/100704 * calls.c (expand_call): Replace PUSH_ARGS with targetm.calls.push_argument (0). (emit_library_call_value_1): Likewise. * defaults.h (PUSH_ARGS): Removed. (PUSH_ARGS_REVERSED): Replace PUSH_ARGS with targetm.calls.push_argument (0). * expr.c (block_move_libcall_safe_for_call_parm): Likewise. (emit_push_insn): Pass the number bytes to push to targetm.calls.push_argument and pass 0 if ARGS_ADDR is 0. * hooks.c (hook_bool_uint_true): New. * hooks.h (hook_bool_uint_true): Likewise. * rtlanal.c (nonzero_bits1): Replace PUSH_ARGS with targetm.calls.push_argument (0). * target.def (push_argument): Add a targetm.calls hook. * targhooks.c (default_push_argument): New. * targhooks.h (default_push_argument): Likewise. * config/bpf/bpf.h (PUSH_ARGS): Removed. * config/cr16/cr16.c (TARGET_PUSH_ARGUMENT): New. * config/cr16/cr16.h (PUSH_ARGS): Removed. * config/i386/i386.c (ix86_push_argument): New. (TARGET_PUSH_ARGUMENT): Likewise. * config/i386/i386.h (PUSH_ARGS): Removed. * config/m32c/m32c.c (TARGET_PUSH_ARGUMENT): New. * config/m32c/m32c.h (PUSH_ARGS): Removed. * config/nios2/nios2.h (PUSH_ARGS): Likewise. * config/pru/pru.h (PUSH_ARGS): Likewise. * doc/tm.texi.in: Remove PUSH_ARGS documentation. Add TARGET_PUSH_ARGUMENT hook. 
* doc/tm.texi: Regenerated. gcc/testsuite/ PR target/100704 * gcc.target/i386/pr100704-1.c: New test. * gcc.target/i386/pr100704-2.c: Likewise. * gcc.target/i386/pr100704-3.c: Likewise.
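The x86 rule described in point 2 of this commit can be sketched as a pure function. This is a simplified model under stated assumptions, not the code in i386.c: the struct and its field names are hypothetical, and the real hook consults target flags rather than a configuration object. An argument of zero models the case where the caller is not offering a choice.

```cpp
#include <cassert>

// Hypothetical description of the relevant x86 configuration.
struct sketch_x86_config
{
  bool is_64bit;           // 64-bit target (threshold 16) or 32-bit (threshold 8)
  bool has_vector_moves;   // vector load and store can be used
  bool push_args_enabled;  // push instructions are allowed at all
};

// Returns true when push instructions should be used for an outgoing
// argument of NPUSH bytes.  NPUSH == 0 means no size was given.
inline bool sketch_x86_push_argument (const sketch_x86_config &cfg,
                                      unsigned npush)
{
  const unsigned vector_threshold = cfg.is_64bit ? 16u : 8u;
  // Large arguments are better stored with vector moves when available.
  if (cfg.has_vector_moves && npush >= vector_threshold)
    return false;
  return cfg.push_args_enabled;
}
```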
2021-05-11  vect: Add costing_for_scalar parameter to init_cost hook  Kewen Lin  1  -1/+1
The rs6000 port function rs6000_density_test needs to know whether the current cost model is for the scalar version of a loop or block or for the vector version. As Richi suggested, this patch adds a new parameter, costing_for_scalar, to the init_cost hook to pass this information down explicitly. gcc/ChangeLog: * doc/tm.texi: Regenerated. * target.def (init_cost): Add new parameter costing_for_scalar. * targhooks.c (default_init_cost): Adjust for new parameter. * targhooks.h (default_init_cost): Likewise. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Likewise. (vect_compute_single_scalar_iteration_cost): Likewise. (vect_analyze_loop_2): Likewise. * tree-vect-slp.c (_bb_vec_info::_bb_vec_info): Likewise. (vect_bb_vectorization_profitable_p): Likewise. * tree-vectorizer.h (init_cost): Likewise. * config/aarch64/aarch64.c (aarch64_init_cost): Likewise. * config/i386/i386.c (ix86_init_cost): Likewise. * config/rs6000/rs6000.c (rs6000_init_cost): Likewise.
2021-04-03  rs6000: Avoid -fpatchable-function-entry* regressions on powerpc64 be [PR98125]  Jakub Jelinek  1  -0/+3
The SECTION_LINK_ORDER changes broke powerpc64-linux ELFv1. It seems that the assembler/linker relies on the symbol mentioned for the "awo" section to be in the same section as the symbols mentioned in the relocations in that section (i.e. labels for the patchable area in this case). That is the case for most targets, including powerpc-linux 32-bit or powerpc64 ELFv2 (that one has -fpatchable-function-entry* support broken for other reasons and it doesn't seem to be a regression). But it doesn't work on powerpc64-linux ELFv1. We emit: .section ".opd","aw" .align 3 _Z3foov: .quad .L._Z3foov,.TOC.@tocbase,0 .previous .type _Z3foov, @function .L._Z3foov: .section __patchable_function_entries,"awo",@progbits,_Z3foov .align 3 .8byte .LPFE1 .section .text._Z3foov,"axG",@progbits,_Z3foov,comdat .LPFE1: nop .LFB0: .cfi_startproc and because _Z3foov is in the .opd section rather than the function text section, it doesn't work. I'm afraid I don't know what exactly should be done, whether e.g. it could use .section __patchable_function_entries,"awo",@progbits,.L._Z3foov instead, or whether the linker should be changed to handle it as is, or something else. But because we have a P1 regression that didn't see useful progress over the 4 months since it was filed and we don't really have much time, below is an attempt to do a targeted reversion of H.J.'s patch, basically act as if HAVE_GAS_SECTION_LINK_ORDER is never true for powerpc64-linux ELFv1, but for 32-bit or 64-bit ELFv2 keep working as is. This would give us time to resolve it for GCC 12 properly. 2021-04-03 Jakub Jelinek <jakub@redhat.com> PR testsuite/98125 * targhooks.h (default_print_patchable_function_entry_1): Declare. * targhooks.c (default_print_patchable_function_entry_1): New function, copied from default_print_patchable_function_entry with an added flags argument. (default_print_patchable_function_entry): Rewritten into a small wrapper around default_print_patchable_function_entry_1. 
* config/rs6000/rs6000.c (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Redefine. (rs6000_print_patchable_function_entry): New function. * g++.dg/pr93195a.C: Skip on powerpc*-*-* 64-bit.
2021-01-04  Update copyright years.  Jakub Jelinek  1  -1/+1
2020-12-17  Update default_estimated_poly_value prototype in targhooks.h  H.J. Lu  1  -1/+2
commit 64432b680eab0bddbe9a4ad4798457cf6a14ad60 Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Thu Dec 17 18:02:37 2020 +0000 vect, aarch64: Extend SVE vs Advanced SIMD costing decisions in vect_better_loop_vinfo_p changed default_estimated_poly_value to HOST_WIDE_INT default_estimated_poly_value (poly_int64 x, poly_value_estimate_kind) { return x.coeffs[0]; } Update default_estimated_poly_value prototype in targhooks.h to match it. * targhooks.h (default_estimated_poly_value): Updated.
2020-12-02  introduce overridable clear_cache emitter  Alexandre Oliva  1  -0/+1
This patch introduces maybe_emit_call_builtin___clear_cache for the builtin expander machinery and the trampoline initializers to use to clear the instruction cache, removing a source of inconsistencies and subtle errors in low-level machinery. I've adjusted all trampoline_init implementations that used to issue explicit calls to __clear_cache or similar to use this new primitive. Specifically on vxworks targets, we needed to drop the __clear_cache symbol in libgcc, for reasons related to linking that I didn't need to understand, and we wanted to call cacheTextUpdate directly, despite the different calling conventions: the second argument is a length rather than the end address. So I introduced a target hook to enable target OS-level overriding of builtin __clear_cache call emission, retaining nearly (*) the same logic to govern the decision on whether to emit a call (or nothing, or a machine-dependent insn) but enabling a call to a target system-defined function with different calling conventions to be issued, without having to modify .md files of the various architectures supported by the target system to introduce or modify clear_cache insns. (*) I write "nearly" mainly because, when not optimizing, we'd issue a call regardless, but since the call may now be overridden, I added it to the set of builtins that are not directly turned into calls when not optimizing, following the normal expansion path instead. It wouldn't be hard to skip the emission of cache-clearing insns when not optimizing, but it didn't seem very important, especially for the new uses from trampoline init. Another difference that might be relevant is that now we expand the begin and end arguments unconditionally. This might make a difference if they have side effects. That's pretty much impossible at expand time, but I thought I'd mention it. 
I have NOT modified targets that did not issue cache-clearing calls in trampoline init to use the new clear_cache-calling infrastructure even if it would expand to nothing. I have considered doing so, to have __builtin___clear_cache and trampoline init call cacheTextUpdate on all vxworks targets, but decided not to, since on targets that don't do any cache clearing, cacheTextUpdate ought to be a no-op, even though rs6000 seems to use icbi and dcbf instructions in the function called to initialize a trampoline, but AFAICT not in the __clear_cache builtin. Hopefully target maintainers will have a look and take advantage of this new piece of infrastructure to remove such (apparent?) inconsistencies. Not rs6000 and others that call asm-coded trampoline setup instructions, for sure, but they might wish to introduce a CLEAR_INSN_CACHE macro or a clear_cache expander if they don't have one. for gcc/ChangeLog * builtins.c (default_emit_call_builtin___clear_cache): New. (maybe_emit_call_builtin___clear_cache): New. (expand_builtin___clear_cache): Split into the above. (expand_builtin): Do not issue clear_cache call any more. * builtins.h (maybe_emit_call_builtin___clear_cache): Declare. * config/aarch64/aarch64.c (aarch64_trampoline_init): Use maybe_emit_call_builtin___clear_cache. * config/arc/arc.c (arc_trampoline_init): Likewise. * config/arm/arm.c (arm_trampoline_init): Likewise. * config/c6x/c6x.c (c6x_initialize_trampoline): Likewise. * config/csky/csky.c (csky_trampoline_init): Likewise. * config/m68k/linux.h (FINALIZE_TRAMPOLINE): Likewise. * config/tilegx/tilegx.c (tilegx_trampoline_init): Likewise. * config/tilepro/tilepro.c (tilepro_trampoline_init): Ditto. * config/vxworks.c: Include rtl.h, memmodel.h, and optabs.h. (vxworks_emit_call_builtin___clear_cache): New. * config/vxworks.h (CLEAR_INSN_CACHE): Drop. (TARGET_EMIT_CALL_BUILTIN___CLEAR_CACHE): Define. * target.def (trampoline_init): In the documentation, refer to maybe_emit_call_builtin___clear_cache. 
(emit_call_builtin___clear_cache): New. * doc/tm.texi.in: Add new hook point. (CLEAR_CACHE_INSN): Remove duplicate 'both'. * doc/tm.texi: Rebuilt. * targhooks.h (default_emit_call_builtin___clear_cache): Declare. * tree.h (BUILTIN_ASM_NAME_PTR): New. for libgcc/ChangeLog * config/t-vxworks (LIB2ADD): Drop. * config/t-vxworks7 (LIB2ADD): Likewise. * config/vxcache.c: Remove.
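The calling-convention difference mentioned in this commit (an end address for __clear_cache versus a length for cacheTextUpdate) amounts to a small argument conversion. The sketch below only models that conversion on plain integers. The struct and function names are hypothetical, and it does not call any VxWorks routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical pair of arguments for a routine that takes (address, length).
struct sketch_length_call
{
  std::uintptr_t addr;
  std::size_t length;
};

// Convert the (begin, end) pair used by __builtin___clear_cache into the
// (address, length) pair expected by a length-based routine.
inline sketch_length_call
sketch_to_length_convention (std::uintptr_t begin, std::uintptr_t end)
{
  assert (begin <= end);
  return { begin, static_cast<std::size_t> (end - begin) };
}
```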
2020-11-25  libsanitizer: mid-end: Introduce stack variable handling for HWASAN  Matthew Malcomson  1  -0/+8
Handling stack variables has three features. 1) Ensure HWASAN required alignment for stack variables When tagging shadow memory, we need to ensure that each tag granule is only used by one variable at a time. This is done by ensuring that each tagged variable is aligned to the tag granule representation size and also ensure that the end of each object is aligned to ensure the start of any other data stored on the stack is in a different granule. This patch ensures the above by forcing the stack pointer to be aligned before and after allocating any stack objects. Since we are forcing alignment we also use `align_local_variable` to ensure this new alignment is advertised properly through SET_DECL_ALIGN. 2) Put tags into each stack variable pointer Make sure that every pointer to a stack variable includes a tag of some sort on it. The way tagging works is: 1) For every new stack frame, a random tag is generated. 2) A base register is formed from the stack pointer value and this random tag. 3) References to stack variables are now formed with RTL describing an offset from this base in both tag and value. The random tag generation is handled by a backend hook. This hook decides whether to introduce a random tag or use the stack background based on the parameter hwasan-random-frame-tag. Using the stack background is necessary for testing and bootstrap. It is necessary during bootstrap to avoid breaking the `configure` test program for determining stack direction. Using the stack background means that every stack frame has the initial tag of zero and variables are tagged with incrementing tags from 1, which also makes debugging a bit easier. Backend hooks define the size of a tag, the layout of the HWASAN shadow memory, and handle emitting the code that inserts and extracts tags from a pointer. 3) For each stack variable, tag and untag the shadow stack on function prologue and epilogue. 
On entry to each function we tag the relevant shadow stack region for each stack variable. This stack region is tagged to match the tag added to each pointer to that variable. This is the first patch where we use the HWASAN shadow space, so we need to add in the libhwasan initialisation code that creates this shadow memory region into the binary we produce. This instrumentation is done in `compile_file`. When exiting a function we need to ensure the shadow stack for this function has no remaining tags. Without clearing the shadow stack area for this stack frame, later function calls could get false positives when those later function calls check untagged areas (such as parameters passed on the stack) against a shadow stack area with left-over tag. Hence we ensure that the entire stack frame is cleared on function exit. config/ChangeLog: * bootstrap-hwasan.mk: Disable random frame tags for stack-tagging during bootstrap. gcc/ChangeLog: * asan.c (struct hwasan_stack_var): New. (hwasan_sanitize_p): New. (hwasan_sanitize_stack_p): New. (hwasan_sanitize_allocas_p): New. (initialize_sanitizer_builtins): Define new builtins. (ATTR_NOTHROW_LIST): New macro. (hwasan_current_frame_tag): New. (hwasan_frame_base): New. (stack_vars_base_reg_p): New. (hwasan_maybe_init_frame_base_init): New. (hwasan_record_stack_var): New. (hwasan_get_frame_extent): New. (hwasan_increment_frame_tag): New. (hwasan_record_frame_init): New. (hwasan_emit_prologue): New. (hwasan_emit_untag_frame): New. (hwasan_finish_file): New. (hwasan_truncate_to_tag_size): New. * asan.h (hwasan_record_frame_init): New declaration. (hwasan_record_stack_var): New declaration. (hwasan_emit_prologue): New declaration. (hwasan_emit_untag_frame): New declaration. (hwasan_get_frame_extent): New declaration. (hwasan_maybe_enit_frame_base_init): New declaration. (hwasan_frame_base): New declaration. (stack_vars_base_reg_p): New declaration. (hwasan_current_frame_tag): New declaration. 
(hwasan_increment_frame_tag): New declaration. (hwasan_truncate_to_tag_size): New declaration. (hwasan_finish_file): New declaration. (hwasan_sanitize_p): New declaration. (hwasan_sanitize_stack_p): New declaration. (hwasan_sanitize_allocas_p): New declaration. (HWASAN_TAG_SIZE): New macro. (HWASAN_TAG_GRANULE_SIZE): New macro. (HWASAN_STACK_BACKGROUND): New macro. * builtin-types.def (BT_FN_VOID_PTR_UINT8_PTRMODE): New. * builtins.def (DEF_SANITIZER_BUILTIN): Enable for HWASAN. * cfgexpand.c (align_local_variable): When using hwasan ensure alignment to tag granule. (align_frame_offset): New. (expand_one_stack_var_at): For hwasan use tag offset. (expand_stack_vars): Record stack objects for hwasan. (expand_one_stack_var_1): Record stack objects for hwasan. (init_vars_expansion): Initialise hwasan state. (expand_used_vars): Emit hwasan prologue and generate hwasan epilogue. (pass_expand::execute): Emit hwasan base initialization if needed. * doc/tm.texi (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE, TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG, TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG, TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks. * doc/tm.texi.in (TARGET_MEMTAG_TAG_SIZE,TARGET_MEMTAG_GRANULE_SIZE, TARGET_MEMTAG_INSERT_RANDOM_TAG,TARGET_MEMTAG_ADD_TAG, TARGET_MEMTAG_SET_TAG,TARGET_MEMTAG_EXTRACT_TAG, TARGET_MEMTAG_UNTAGGED_POINTER): Document new hooks. * explow.c (get_dynamic_stack_base): Take new `base` argument. * explow.h (get_dynamic_stack_base): Take new `base` argument. * sanitizer.def (BUILT_IN_HWASAN_INIT): New. (BUILT_IN_HWASAN_TAG_MEM): New. * target.def (target_memtag_tag_size,target_memtag_granule_size, target_memtag_insert_random_tag,target_memtag_add_tag, target_memtag_set_tag,target_memtag_extract_tag, target_memtag_untagged_pointer): New hooks. * targhooks.c (HWASAN_SHIFT): New. (HWASAN_SHIFT_RTX): New. (default_memtag_tag_size): New default hook. (default_memtag_granule_size): New default hook. 
(default_memtag_insert_random_tag): New default hook. (default_memtag_add_tag): New default hook. (default_memtag_set_tag): New default hook. (default_memtag_extract_tag): New default hook. (default_memtag_untagged_pointer): New default hook. * targhooks.h (default_memtag_tag_size): New default hook. (default_memtag_granule_size): New default hook. (default_memtag_insert_random_tag): New default hook. (default_memtag_add_tag): New default hook. (default_memtag_set_tag): New default hook. (default_memtag_extract_tag): New default hook. (default_memtag_untagged_pointer): New default hook. * toplev.c (compile_file): Call hwasan_finish_file when finished.
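The pointer tagging scheme described in this commit can be modelled on plain 64-bit integers. The sketch below is an illustration under stated assumptions, not the RTL-emitting hooks in targhooks.c: it assumes 64-bit addresses, an 8-bit tag stored in the top byte, and a 16-byte tag granule, and every sketch_ name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: 8-bit tag in the top byte of a 64-bit address.
constexpr unsigned SKETCH_TAG_SIZE = 8;
constexpr unsigned SKETCH_TAG_SHIFT = 64 - SKETCH_TAG_SIZE;
constexpr std::uint64_t SKETCH_TAG_MASK = 0xffULL << SKETCH_TAG_SHIFT;
// Assumed tag granule: one shadow tag covers 16 bytes of memory.
constexpr std::uint64_t SKETCH_GRANULE = 16;

// Strip the tag, leaving only the address bits.
inline std::uint64_t sketch_untagged_pointer (std::uint64_t ptr)
{
  return ptr & ~SKETCH_TAG_MASK;
}

// Read the tag stored in the top byte.
inline std::uint8_t sketch_extract_tag (std::uint64_t ptr)
{
  return static_cast<std::uint8_t> (ptr >> SKETCH_TAG_SHIFT);
}

// Replace whatever tag the pointer carries with TAG.
inline std::uint64_t sketch_set_tag (std::uint64_t ptr, std::uint8_t tag)
{
  return sketch_untagged_pointer (ptr)
	 | (static_cast<std::uint64_t> (tag) << SKETCH_TAG_SHIFT);
}

// Form a reference to a stack variable from the frame base: offset the
// address and the tag independently.  The tag wraps around modulo 256.
inline std::uint64_t sketch_add_tag (std::uint64_t base, std::int64_t offset,
				     std::uint8_t tag_offset)
{
  std::uint8_t tag
    = static_cast<std::uint8_t> (sketch_extract_tag (base) + tag_offset);
  std::uint64_t addr
    = sketch_untagged_pointer (base) + static_cast<std::uint64_t> (offset);
  return sketch_set_tag (addr, tag);
}

// Round an object size up so the next object starts in a new granule.
inline std::uint64_t sketch_align_to_granule (std::uint64_t size)
{
  return (size + SKETCH_GRANULE - 1) & ~(SKETCH_GRANULE - 1);
}
```

Keeping the tag arithmetic separate from the address arithmetic mirrors the description above, where references are formed as an offset from the base "in both tag and value".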
2020-11-25  libsanitizer: options: Add hwasan flags and argument parsing  Matthew Malcomson  1  -0/+1
These flags can't be used at the same time as any of the other sanitizers. We add an equivalent flag to -static-libasan in -static-libhwasan to ensure static linking. The -fsanitize=kernel-hwaddress option is for compiling targeting the kernel. This flag has defaults to match the LLVM implementation and sets some other behaviors to work in the kernel (e.g. accounting for the fact that the stack pointer will have 0xff in the top byte and to not call the userspace library initialisation routines). The defaults are that we do not sanitize variables on the stack and always recover from a detected bug. Since we are introducing a few more conflicts between sanitizer flags we refactor the checking for such conflicts to use a helper function which makes checking for such conflicts more easy and consistent. We introduce a backend hook `targetm.memtag.can_tag_addresses` that indicates to the mid-end whether a target has a feature like AArch64 TBI where the top byte of an address is ignored. Without this feature hwasan sanitization is not done. gcc/ChangeLog: * common.opt (flag_sanitize_recover): Default for kernel hwaddress. (static-libhwasan): New cli option. * config/aarch64/aarch64.c (aarch64_can_tag_addresses): New. (TARGET_MEMTAG_CAN_TAG_ADDRESSES): New. * config/gnu-user.h (LIBHWASAN_EARLY_SPEC): hwasan equivalent of asan command line flags. * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add hwasan equivalent of __SANITIZE_ADDRESS__. * doc/invoke.texi: Document hwasan command line flags. * doc/tm.texi: Document new hook. * doc/tm.texi.in: Document new hook. * flag-types.h (enum sanitize_code): New sanitizer values. * gcc.c (STATIC_LIBHWASAN_LIBS): New macro. (LIBHWASAN_SPEC): New macro. (LIBHWASAN_EARLY_SPEC): New macro. (SANITIZER_EARLY_SPEC): Update to include hwasan. (SANITIZER_SPEC): Update to include hwasan. (sanitize_spec_function): Use hwasan options. * opts.c (finish_options): Describe conflicts between address sanitizers. 
(find_sanitizer_argument): New. (report_conflicting_sanitizer_options): New. (sanitizer_opts): Introduce new sanitizer flags. (common_handle_option): Add defaults for kernel sanitizer. * params.opt (hwasan--instrument-stack): New (hwasan-random-frame-tag): New (hwasan-instrument-allocas): New (hwasan-instrument-reads): New (hwasan-instrument-writes): New (hwasan-instrument-mem-intrinsics): New * target.def (HOOK_PREFIX): Add new hook. (can_tag_addresses): Add new hook under memtag prefix. * targhooks.c (default_memtag_can_tag_addresses): New. * targhooks.h (default_memtag_can_tag_addresses): New decl. * toplev.c (process_options): Ensure hwasan only on architectures that advertise the possibility.
2020-10-30  Add -fzero-call-used-regs option and zero_call_used_regs function attributes.  qing zhao  1  -0/+1
This new feature causes the compiler to zero a subset of all call-used registers at function return. This is used to increase program security by either mitigating Return-Oriented Programming (ROP) attacks or preventing information leakage through registers. gcc/ChangeLog: 2020-10-30 Qing Zhao <qing.zhao@oracle.com> H.J.Lu <hjl.tools@gmail.com> * common.opt: Add new option -fzero-call-used-regs * config/i386/i386.c (zero_call_used_regno_p): New function. (zero_call_used_regno_mode): Likewise. (zero_all_vector_registers): Likewise. (zero_all_st_registers): Likewise. (zero_all_mm_registers): Likewise. (ix86_zero_call_used_regs): Likewise. (TARGET_ZERO_CALL_USED_REGS): Define. * df-scan.c (df_epilogue_uses_p): New function. (df_get_exit_block_use_set): Replace EPILOGUE_USES with df_epilogue_uses_p. * df.h (df_epilogue_uses_p): Declare. * doc/extend.texi: Document the new zero_call_used_regs attribute. * doc/invoke.texi: Document the new -fzero-call-used-regs option. * doc/tm.texi: Regenerate. * doc/tm.texi.in (TARGET_ZERO_CALL_USED_REGS): New hook. * emit-rtl.h (struct rtl_data): New field must_be_zero_on_return. * flag-types.h (namespace zero_regs_flags): New namespace. * function.c (gen_call_used_regs_seq): New function. (class pass_zero_call_used_regs): New class. (pass_zero_call_used_regs::execute): New function. (make_pass_zero_call_used_regs): New function. * optabs.c (expand_asm_reg_clobber_mem_blockage): New function. * optabs.h (expand_asm_reg_clobber_mem_blockage): Declare. * opts.c (zero_call_used_regs_opts): New structure array initialization. (parse_zero_call_used_regs_options): New function. (common_handle_option): Handle -fzero-call-used-regs. * opts.h (zero_call_used_regs_opts): New structure array. * passes.def: Add new pass pass_zero_call_used_regs. * recog.c (valid_insn_p): New function. * recog.h (valid_insn_p): Declare. * resource.c (init_resource_info): Replace EPILOGUE_USES with df_epilogue_uses_p. * target.def (zero_call_used_regs): New hook. 
* targhooks.c (default_zero_call_used_regs): New function. * targhooks.h (default_zero_call_used_regs): Declare. * tree-pass.h (make_pass_zero_call_used_regs): Declare. gcc/c-family/ChangeLog: 2020-10-30 Qing Zhao <qing.zhao@oracle.com> H.J.Lu <hjl.tools@gmail.com> * c-attribs.c (c_common_attribute_table): Add new attribute zero_call_used_regs. (handle_zero_call_used_regs_attribute): New function. gcc/testsuite/ChangeLog: 2020-10-30 Qing Zhao <qing.zhao@oracle.com> H.J.Lu <hjl.tools@gmail.com> * c-c++-common/zero-scratch-regs-1.c: New test. * c-c++-common/zero-scratch-regs-10.c: New test. * c-c++-common/zero-scratch-regs-11.c: New test. * c-c++-common/zero-scratch-regs-2.c: New test. * c-c++-common/zero-scratch-regs-3.c: New test. * c-c++-common/zero-scratch-regs-4.c: New test. * c-c++-common/zero-scratch-regs-5.c: New test. * c-c++-common/zero-scratch-regs-6.c: New test. * c-c++-common/zero-scratch-regs-7.c: New test. * c-c++-common/zero-scratch-regs-8.c: New test. * c-c++-common/zero-scratch-regs-9.c: New test. * c-c++-common/zero-scratch-regs-attr-usages.c: New test. * gcc.target/i386/zero-scratch-regs-1.c: New test. * gcc.target/i386/zero-scratch-regs-10.c: New test. * gcc.target/i386/zero-scratch-regs-11.c: New test. * gcc.target/i386/zero-scratch-regs-12.c: New test. * gcc.target/i386/zero-scratch-regs-13.c: New test. * gcc.target/i386/zero-scratch-regs-14.c: New test. * gcc.target/i386/zero-scratch-regs-15.c: New test. * gcc.target/i386/zero-scratch-regs-16.c: New test. * gcc.target/i386/zero-scratch-regs-17.c: New test. * gcc.target/i386/zero-scratch-regs-18.c: New test. * gcc.target/i386/zero-scratch-regs-19.c: New test. * gcc.target/i386/zero-scratch-regs-2.c: New test. * gcc.target/i386/zero-scratch-regs-20.c: New test. * gcc.target/i386/zero-scratch-regs-21.c: New test. * gcc.target/i386/zero-scratch-regs-22.c: New test. * gcc.target/i386/zero-scratch-regs-23.c: New test. * gcc.target/i386/zero-scratch-regs-24.c: New test. 
* gcc.target/i386/zero-scratch-regs-25.c: New test. * gcc.target/i386/zero-scratch-regs-26.c: New test. * gcc.target/i386/zero-scratch-regs-27.c: New test. * gcc.target/i386/zero-scratch-regs-28.c: New test. * gcc.target/i386/zero-scratch-regs-29.c: New test. * gcc.target/i386/zero-scratch-regs-30.c: New test. * gcc.target/i386/zero-scratch-regs-31.c: New test. * gcc.target/i386/zero-scratch-regs-3.c: New test. * gcc.target/i386/zero-scratch-regs-4.c: New test. * gcc.target/i386/zero-scratch-regs-5.c: New test. * gcc.target/i386/zero-scratch-regs-6.c: New test. * gcc.target/i386/zero-scratch-regs-7.c: New test. * gcc.target/i386/zero-scratch-regs-8.c: New test. * gcc.target/i386/zero-scratch-regs-9.c: New test.
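A usage sketch of the function attribute added by this commit follows. The function name and the SKETCH_ZERO_REGS macro are hypothetical, and the choice of "used-gpr" is only one of the values documented for the attribute. The guard makes the attribute expand to nothing on compilers that do not recognise it, so this is a portability sketch rather than a statement about any particular compiler version.

```cpp
#include <cassert>
#include <cstring>

// Use the attribute only where the compiler reports support for it.
#if defined(__has_attribute)
# if __has_attribute(zero_call_used_regs)
#  define SKETCH_ZERO_REGS __attribute__((zero_call_used_regs("used-gpr")))
# endif
#endif
#ifndef SKETCH_ZERO_REGS
# define SKETCH_ZERO_REGS
#endif

// Hypothetical function handling sensitive data: with the attribute in
// effect, the call-used general-purpose registers it used are zeroed
// before it returns.
SKETCH_ZERO_REGS
int sketch_check_pin (const char *candidate, const char *secret)
{
  return std::strcmp (candidate, secret) == 0;
}
```

The same effect can be requested for a whole translation unit with the -fzero-call-used-regs option instead of annotating individual functions.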
2020-09-30 | [nvptx] Add type arg to TARGET_LIBC_HAS_FUNCTION | Tom de Vries | 1 | -3/+3
GCC has a target hook TARGET_LIBC_HAS_FUNCTION, which tells the compiler which functions it can expect to be present in libc.

The default target hook does not include the sincos functions.

The nvptx port of newlib does include sincos and sincosf, but not sincosl.

The target hook TARGET_LIBC_HAS_FUNCTION does not distinguish between sincos, sincosf and sincosl, so if we enable it for the sincos functions, then for test.c:
...
#include <stdio.h>
#include <math.h>

long double x, a, b;
int main (void) {
  x = 0.5;
  a = sinl (x);
  b = cosl (x);
  printf ("a: %f\n", (double)a);
  printf ("b: %f\n", (double)b);
  return 0;
}
...
we introduce a regression:
...
$ gcc test.c -lm -O2
unresolved symbol sincosl
collect2: error: ld returned 1 exit status
...

Add a type argument to target hook TARGET_LIBC_HAS_FUNCTION, and use it in nvptx_libc_has_function to enable sincos and sincosf, but not sincosl.

Built and reg-tested on x86_64-linux.

Built and tested on nvptx.

gcc/ChangeLog:

2020-09-28 Tobias Burnus <tobias@codesourcery.com>
           Tom de Vries <tdevries@suse.de>

  * builtins.c (expand_builtin_cexpi, fold_builtin_sincos): Update targetm.libc_has_function call.
  * builtins.def (DEF_C94_BUILTIN, DEF_C99_BUILTIN, DEF_C11_BUILTIN):
  (DEF_C2X_BUILTIN, DEF_C99_COMPL_BUILTIN, DEF_C99_C90RES_BUILTIN): Same.
  * config/darwin-protos.h (darwin_libc_has_function): Update prototype.
  * config/darwin.c (darwin_libc_has_function): Add arg.
  * config/linux-protos.h (linux_libc_has_function): Update prototype.
  * config/linux.c (linux_libc_has_function): Add arg.
  * config/i386/i386.c (ix86_libc_has_function): Update targetm.libc_has_function call.
  * config/nvptx/nvptx.c (nvptx_libc_has_function): New function.
  (TARGET_LIBC_HAS_FUNCTION): Redefine to nvptx_libc_has_function.
  * convert.c (convert_to_integer_1): Update targetm.libc_has_function call.
  * match.pd: Same.
  * target.def (libc_has_function): Add arg.
  * doc/tm.texi: Regenerate.
  * targhooks.c (default_libc_has_function, gnu_libc_has_function)
  (no_c99_libc_has_function): Add arg.
  * targhooks.h (default_libc_has_function, no_c99_libc_has_function)
  (gnu_libc_has_function): Update prototype.
  * tree-ssa-math-opts.c (pass_cse_sincos::execute): Update targetm.libc_has_function call.

gcc/fortran/ChangeLog:

2020-09-30 Tom de Vries <tdevries@suse.de>

  * f95-lang.c (gfc_init_builtin_functions): Update targetm.libc_has_function call.
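The effect of the extra type argument can be illustrated with a small stand-alone sketch. Everything below is hypothetical: the enums and the function `sketch_libc_has_function` are illustration-only stand-ins, not GCC's real hook signature (which works on GCC's own function-class and tree types). The sketch only shows how a per-type query lets a port answer "yes" for sincos and sincosf while answering "no" for sincosl.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the function class and the floating-point
   type being asked about; these are not GCC's types.  */
enum sketch_fn_class { SKETCH_FN_SINCOS, SKETCH_FN_OTHER };
enum sketch_fp_type { SKETCH_FLOAT, SKETCH_DOUBLE, SKETCH_LONG_DOUBLE };

/* Return true if the (imaginary) libc provides the function of class FN
   for floating-point type TYPE.  Only the sincos family is modelled;
   every other class is conservatively reported as missing.  */
static bool
sketch_libc_has_function (enum sketch_fn_class fn, enum sketch_fp_type type)
{
  assert (fn == SKETCH_FN_SINCOS || fn == SKETCH_FN_OTHER);

  if (fn == SKETCH_FN_SINCOS)
    /* sincosf and sincos exist, sincosl does not.  */
    return type == SKETCH_FLOAT || type == SKETCH_DOUBLE;

  return false;
}
```

With a query of this shape, a caller that wants to combine sinl and cosl into one sincosl call would see a negative answer for the long double case and leave the two calls alone, which is the regression the commit message describes avoiding.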
2020-05-13 | add vectype parameter to add_stmt_cost hook | Richard Biener | 1 | -1/+1
This adds a vectype parameter to add_stmt_cost which avoids the need to pass down a (wrong) stmt_info just to carry this information. Useful for invariants which do not have a stmt_info associated. 2020-05-13 Richard Biener <rguenther@suse.de> * target.def (add_stmt_cost): Add new vectype parameter. * targhooks.c (default_add_stmt_cost): Adjust. * targhooks.h (default_add_stmt_cost): Likewise. * config/aarch64/aarch64.c (aarch64_add_stmt_cost): Take new vectype parameter. * config/arm/arm.c (arm_add_stmt_cost): Likewise. * config/i386/i386.c (ix86_add_stmt_cost): Likewise. * config/rs6000/rs6000.c (rs6000_add_stmt_cost): Likewise. * tree-vectorizer.h (stmt_info_for_cost::vectype): Add. (dump_stmt_cost): Add new vectype parameter. (add_stmt_cost): Likewise. (record_stmt_cost): Likewise. (record_stmt_cost): Add overload with old signature. * tree-vect-loop.c (vect_compute_single_scalar_iteration_cost): Adjust. (vect_get_known_peeling_cost): Likewise. (vect_estimate_min_profitable_iters): Likewise. * tree-vectorizer.c (dump_stmt_cost): Add new vectype parameter. * tree-vect-stmts.c (record_stmt_cost): Likewise. (vect_prologue_cost_for_slp_op): Remove stmt_vec_info parameter and pass down correct vectype and NULL stmt_info. (vect_model_simple_cost): Adjust. (vect_model_store_cost): Likewise.
2020-05-12 | RISC-V: Add shorten_memrefs pass. | Craig Blackmore | 1 | -0/+1
gcc/
  * config.gcc: Add riscv-shorten-memrefs.o to extra_objs for riscv.
  * config/riscv/riscv-passes.def: New file.
  * config/riscv/riscv-protos.h (make_pass_shorten_memrefs): Declare.
  * config/riscv/riscv-shorten-memrefs.c: New file.
  * config/riscv/riscv.c (tree-pass.h): New include.
  (riscv_compressed_reg_p): New function.
  (riscv_compressed_lw_offset_p): Likewise.
  (riscv_compressed_lw_address_p): Likewise.
  (riscv_shorten_lw_offset): Likewise.
  (riscv_legitimize_address): Attempt to convert base + large_offset to compressible new_base + small_offset.
  (riscv_address_cost): Make anticipated compressed load/stores cheaper for code size than uncompressed load/stores.
  (riscv_register_priority): Move compressed register check to riscv_compressed_reg_p.
  * config/riscv/riscv.h (C_S_BITS): Define.
  (CSW_MAX_OFFSET): Define.
  * config/riscv/riscv.opt (mshorten-memrefs): New option.
  * config/riscv/t-riscv (riscv-shorten-memrefs.o): New rule.
  (PASSES_EXTRA): Add riscv-passes.def.
  * doc/invoke.texi: Document -mshorten-memrefs.
  * config/riscv/riscv.c (riscv_new_address_profitable_p): New function.
  (TARGET_NEW_ADDRESS_PROFITABLE_P): Define.
  * doc/tm.texi: Regenerate.
  * doc/tm.texi.in (TARGET_NEW_ADDRESS_PROFITABLE_P): New hook.
  * sched-deps.c (attempt_change): Use old address if it is cheaper than new address.
  * target.def (new_address_profitable_p): New hook.
  * targhooks.c (default_new_address_profitable_p): New function.
  * targhooks.h (default_new_address_profitable_p): Declare.

gcc/testsuite/
  * gcc.target/riscv/shorten-memrefs-1.c: New test.
  * gcc.target/riscv/shorten-memrefs-2.c: New test.
  * gcc.target/riscv/shorten-memrefs-3.c: New test.
  * gcc.target/riscv/shorten-memrefs-4.c: New test.
  * gcc.target/riscv/shorten-memrefs-5.c: New test.
  * gcc.target/riscv/shorten-memrefs-6.c: New test.
  * gcc.target/riscv/shorten-memrefs-7.c: New test.
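The idea of rewriting base + large_offset as new_base + small_offset can be sketched with plain integer arithmetic. This is only an illustration, not the RISC-V port's code: the name `sketch_split_offset`, the struct, and the limit of 124 (taken here as the largest word-aligned offset that fits a 5-bit field scaled by 4) are assumptions made for the example.

```c
#include <assert.h>

/* Assumed largest offset a compressed word load/store can encode;
   an illustration-only constant, not read from the port.  */
#define SKETCH_MAX_SMALL_OFFSET 124

/* Result of splitting one large offset into two parts.  */
struct sketch_split
{
  long base_adjust;  /* Amount folded into the new base register.  */
  long small;        /* Remaining offset, in [0, SKETCH_MAX_SMALL_OFFSET].  */
};

/* Split a non-negative, word-aligned OFFSET so that
   base_adjust + small == OFFSET and small fits the compressed range.  */
static struct sketch_split
sketch_split_offset (long offset)
{
  struct sketch_split r;

  assert (offset >= 0 && offset % 4 == 0);

  /* The encodable range covers 0..124 in steps of 4, i.e. offsets
     modulo 128.  */
  r.small = offset % (SKETCH_MAX_SMALL_OFFSET + 4);
  r.base_adjust = offset - r.small;
  return r;
}
```

Several accesses whose offsets share the same `base_adjust` can then reuse one adjusted base register, which is where a code-size saving would come from under these assumptions.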
2020-05-05 | fix build of targets not implementing add_stmt_cost | Richard Biener | 1 | -1/+2
C++ makes mismatched prototype and implementation OK. 2020-05-05 Richard Biener <rguenther@suse.de> * targhooks.h (default_add_stmt_cost): Add vec_info * parameter.
2020-01-01 | Update copyright years. | Jakub Jelinek | 1 | -1/+1
From-SVN: r279813
2019-11-27 | target.def (TARGET_VECTORIZE_BUILTIN_CONVERSION): Remove. | Richard Biener | 1 | -2/+0
2019-11-27 Richard Biener <rguenther@suse.de> * target.def (TARGET_VECTORIZE_BUILTIN_CONVERSION): Remove. * targhooks.c (default_builtin_vectorized_conversion): Likewise. * targhooks.h (default_builtin_vectorized_conversion): Likewise. * optabs-tree.c (supportable_convert_operation): Do not call targetm.vectorize.builtin_conversion. Remove unused decl parameter. * optabs-tree.h (supportable_convert_operation): Adjust. * doc/tm.texi.in (TARGET_VECTORIZE_BUILTIN_CONVERSION): Remove. * doc/tm.texi: Regenerate. * tree-ssa-forwprop.c (simplify_vector_constructor): Adjust. * tree-vect-generic.c (expand_vector_conversion): Likewise. * tree-vect-stmts.c (vect_gen_widened_results_half): Remove unused decl parameter and adjust. (vect_create_vectorized_promotion_stmts): Likewise. (vectorizable_conversion): Adjust. From-SVN: r278765
2019-11-16 | Optionally pick the cheapest loop_vec_info | Richard Sandiford | 1 | -1/+1
This patch adds a mode in which the vectoriser tries each available base vector mode and picks the one with the lowest cost. The new behaviour is selected by autovectorize_vector_modes. The patch keeps the current behaviour of preferring a VF of loop->simdlen over any larger or smaller VF, regardless of costs or target preferences. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (VECT_COMPARE_COSTS): New constant. * target.def (autovectorize_vector_modes): Return a bitmask of flags. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_modes): Update accordingly. * targhooks.c (default_autovectorize_vector_modes): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes): Likewise. * config/arc/arc.c (arc_autovectorize_vector_modes): Likewise. * config/arm/arm.c (arm_autovectorize_vector_modes): Likewise. * config/i386/i386.c (ix86_autovectorize_vector_modes): Likewise. * config/mips/mips.c (mips_autovectorize_vector_modes): Likewise. * tree-vectorizer.h (_loop_vec_info::vec_outside_cost) (_loop_vec_info::vec_inside_cost): New member variables. * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize them. (vect_better_loop_vinfo_p, vect_joust_loop_vinfos): New functions. (vect_analyze_loop): When autovectorize_vector_modes returns VECT_COMPARE_COSTS, try vectorizing the loop with each available vector mode and picking the one with the lowest cost. (vect_estimate_min_profitable_iters): Record the computed costs in the loop_vec_info. From-SVN: r278336
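The "try each mode and keep the cheapest" behaviour can be sketched as a simple minimum search. The sketch is deliberately simplified and is not the vectoriser's comparison: `sketch_candidate`, `sketch_pick_cheapest`, and the rule of summing the two costs are hypothetical, and the real comparison between two loop_vec_infos may weigh the costs differently.

```c
#include <assert.h>
#include <stddef.h>

/* One vectorisation attempt: a label for the vector mode that was tried
   plus the two costs recorded for it.  Illustration-only type.  */
struct sketch_candidate
{
  const char *mode_name;
  unsigned inside_cost;   /* Cost of the vector loop body.  */
  unsigned outside_cost;  /* Cost of prologue/epilogue work.  */
};

/* Return the candidate with the lowest combined cost, or NULL if there
   are none.  Earlier candidates win ties, so the target's preferred
   ordering still matters when costs are equal.  */
static const struct sketch_candidate *
sketch_pick_cheapest (const struct sketch_candidate *cands, size_t n)
{
  const struct sketch_candidate *best = NULL;
  unsigned best_cost = 0;

  for (size_t i = 0; i < n; i++)
    {
      unsigned cost = cands[i].inside_cost + cands[i].outside_cost;
      if (best == NULL || cost < best_cost)
        {
          best = &cands[i];
          best_cost = cost;
        }
    }
  return best;
}
```

In this simplified model, a target that does not request cost comparison would correspond to always taking the first candidate that vectorises successfully.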
2019-11-14 | Replace autovectorize_vector_sizes with autovectorize_vector_modes | Richard Sandiford | 1 | -1/+1
This is another patch in the series to remove the assumption that all modes involved in vectorisation have to be the same size. Rather than have the target provide a list of vector sizes, it makes the target provide a list of vector "approaches", with each approach represented by a mode. A later patch will pass this mode to targetm.vectorize.related_mode to get the vector mode for a given element mode. Until then, the modes simply act as an alternative way of specifying the vector size. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (vector_sizes, auto_vector_sizes): Delete. (vector_modes, auto_vector_modes): New typedefs. * target.def (autovectorize_vector_sizes): Replace with... (autovectorize_vector_modes): ...this new hook. * doc/tm.texi.in (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Replace with... (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): ...this new hook. * doc/tm.texi: Regenerate. * targhooks.h (default_autovectorize_vector_sizes): Delete. (default_autovectorize_vector_modes): New function. * targhooks.c (default_autovectorize_vector_sizes): Delete. (default_autovectorize_vector_modes): New function. * omp-general.c (omp_max_vf): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. Use the number of units in the mode to calculate the maximum VF. * omp-low.c (omp_clause_aligned_alignment): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. Use a loop based on related_mode to iterate through all supported vector modes for a given scalar mode. * optabs-query.c (can_vec_mask_load_store_p): Use autovectorize_vector_modes instead of autovectorize_vector_sizes. * tree-vect-loop.c (vect_analyze_loop, vect_transform_loop): Likewise. * tree-vect-slp.c (vect_slp_bb_region): Likewise. * config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes): Replace with... (aarch64_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. 
(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/arc/arc.c (arc_autovectorize_vector_sizes): Replace with... (arc_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/arm/arm.c (arm_autovectorize_vector_sizes): Replace with... (arm_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/i386/i386.c (ix86_autovectorize_vector_sizes): Replace with... (ix86_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. * config/mips/mips.c (mips_autovectorize_vector_sizes): Replace with... (mips_autovectorize_vector_modes): ...this new function. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES): Delete. (TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Define. From-SVN: r278236
2019-11-14 | Pass the data vector mode to get_mask_mode | Richard Sandiford | 1 | -1/+1
This patch passes the data vector mode to get_mask_mode, rather than its size and nunits. This is a bit simpler and allows targets to distinguish between modes that happen to have the same size and number of elements.

2019-11-14 Richard Sandiford <richard.sandiford@arm.com>

gcc/
  * target.def (get_mask_mode): Take a vector mode itself as argument, instead of properties about the vector mode.
  * doc/tm.texi: Regenerate.
  * targhooks.h (default_get_mask_mode): Update to reflect new get_mask_mode interface.
  * targhooks.c (default_get_mask_mode): Likewise. Use related_int_vector_mode.
  * optabs-query.c (can_vec_mask_load_store_p): Update call to get_mask_mode.
  * tree-vect-stmts.c (check_load_store_masking): Likewise, checking first that the original mode really is a vector.
  * tree.c (build_truth_vector_type_for): Likewise.
  * config/aarch64/aarch64.c (aarch64_get_mask_mode): Update for new get_mask_mode interface.
  (aarch64_expand_sve_vcond): Update call accordingly.
  * config/gcn/gcn.c (gcn_vectorize_get_mask_mode): Update for new get_mask_mode interface.
  * config/i386/i386.c (ix86_get_mask_mode): Likewise.

From-SVN: r278233
2019-11-14 | Add a targetm.vectorize.related_mode hook | Richard Sandiford | 1 | -0/+3
This patch is the first of a series that tries to remove two assumptions: (1) that all vectors involved in vectorisation must be the same size (2) that there is only one vector mode for a given element mode and number of elements Relaxing (1) helps with targets that support multiple vector sizes or that require the number of elements to stay the same. E.g. if we're vectorising code that operates on narrow and wide elements, and the narrow elements use 64-bit vectors, then on AArch64 it would normally be better to use 128-bit vectors rather than pairs of 64-bit vectors for the wide elements. Relaxing (2) makes it possible for -msve-vector-bits=128 to produce fixed-length code for SVE. It also allows unpacked/half-size SVE vectors to work with -msve-vector-bits=256. The patch adds a new hook that targets can use to control how we move from one vector mode to another. The hook takes a starting vector mode, a new element mode, and (optionally) a new number of elements. The flexibility needed for (1) comes in when the number of elements isn't specified. All callers in this patch specify the number of elements, but a later vectoriser patch doesn't. 2019-11-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.def (related_mode): New hook. * doc/tm.texi.in (TARGET_VECTORIZE_RELATED_MODE): New hook. * doc/tm.texi: Regenerate. * targhooks.h (default_vectorize_related_mode): Declare. * targhooks.c (default_vectorize_related_mode): New function. * machmode.h (related_vector_mode): Declare. * stor-layout.c (related_vector_mode): New function. * expmed.c (extract_bit_field_1): Use it instead of mode_for_vector. * optabs-query.c (qimode_for_vec_perm): Likewise. * tree-vect-stmts.c (get_group_load_store_type): Likewise. (vectorizable_store, vectorizable_load): Likewise From-SVN: r278229
2019-09-30 | Add a function for getting the ABI of a call insn target | Richard Sandiford | 1 | -2/+0
This patch replaces get_call_reg_set_usage with insn_callee_abi, which returns the ABI of the target of a call insn. The ABI's full_reg_clobbers corresponds to regs_invalidated_by_call, whereas many callers instead passed call_used_or_fixed_regs, i.e.: (regs_invalidated_by_call | fixed_reg_set) The patch slavishly preserves the "| fixed_reg_set" for these callers; later patches will clean this up. 2019-09-30 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.def (insn_callee_abi): New hook. (remove_extra_call_preserved_regs): Delete. * doc/tm.texi.in (TARGET_INSN_CALLEE_ABI): New macro. (TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete. * doc/tm.texi: Regenerate. * targhooks.h (default_remove_extra_call_preserved_regs): Delete. * targhooks.c (default_remove_extra_call_preserved_regs): Delete. * config/aarch64/aarch64.c (aarch64_simd_call_p): Constify the insn argument. (aarch64_remove_extra_call_preserved_regs): Delete. (aarch64_insn_callee_abi): New function. (TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): Delete. (TARGET_INSN_CALLEE_ABI): New macro. * rtl.h (get_call_fndecl): Declare. (cgraph_rtl_info): Fix formatting. Tweak comment for function_used_regs. Remove function_used_regs_valid. * rtlanal.c (get_call_fndecl): Moved from final.c * function-abi.h (insn_callee_abi): Declare. (target_function_abi_info): Mention insn_callee_abi. * function-abi.cc (fndecl_abi): Handle flag_ipa_ra in a similar way to get_call_reg_set_usage did. (insn_callee_abi): New function. * regs.h (get_call_reg_set_usage): Delete. * final.c: Include function-abi.h. (collect_fn_hard_reg_usage): Add fixed and stack registers to function_used_regs before the main loop rather than afterwards. Use insn_callee_abi instead of get_call_reg_set_usage. Exit early if function_used_regs ends up not being useful. (get_call_fndecl): Move to rtlanal.c (get_call_cgraph_rtl_info, get_call_reg_set_usage): Delete. * caller-save.c: Include function-abi.h. 
(setup_save_areas, save_call_clobbered_regs): Use insn_callee_abi instead of get_call_reg_set_usage. * cfgcleanup.c: Include function-abi.h. (old_insns_match_p): Use insn_callee_abi instead of get_call_reg_set_usage. * cgraph.h (cgraph_node::rtl_info): Take a const_tree instead of a tree. * cgraph.c (cgraph_node::rtl_info): Likewise. Initialize function_used_regs. * df-scan.c: Include function-abi.h. (df_get_call_refs): Use insn_callee_abi instead of get_call_reg_set_usage. * ira-lives.c: Include function-abi.h. (process_bb_node_lives): Use insn_callee_abi instead of get_call_reg_set_usage. * lra-lives.c: Include function-abi.h. (process_bb_lives): Use insn_callee_abi instead of get_call_reg_set_usage. * postreload.c: Include function-abi.h. (reload_combine): Use insn_callee_abi instead of get_call_reg_set_usage. * regcprop.c: Include function-abi.h. (copyprop_hardreg_forward_1): Use insn_callee_abi instead of get_call_reg_set_usage. * resource.c: Include function-abi.h. (mark_set_resources, mark_target_live_regs): Use insn_callee_abi instead of get_call_reg_set_usage. * var-tracking.c: Include function-abi.h. (dataflow_set_clear_at_call): Use insn_callee_abi instead of get_call_reg_set_usage. From-SVN: r276309
2019-09-09 | Remove bt-load.c | Richard Sandiford | 1 | -1/+0
bt-load.c has AFAIK been dead code since the removal of the SH5 port in 2016. I have a patch series that would need to update the liveness tracking in a nontrivial way, so it seemed better to remove the pass rather than install an untested and probably bogus change. 2019-09-09 Richard Sandiford <richard.sandiford@arm.com> gcc/ * Makefile.in (OBJS): Remove bt-load.o. * doc/invoke.texi (fbranch-target-load-optimize): Delete. (fbranch-target-load-optimize2, fbtr-bb-exclusive): Likewise. * common.opt (fbranch-target-load-optimize): Mark as Ignore and document that the option no longer does anything. (fbranch-target-load-optimize2, fbtr-bb-exclusive): Likewise. * target.def (branch_target_register_class): Delete. (branch_target_register_callee_saved): Likewise. * doc/tm.texi.in (TARGET_BRANCH_TARGET_REGISTER_CLASS): Likewise. (TARGET_BRANCH_TARGET_REGISTER_CALLEE_SAVED): Likewise. * doc/tm.texi: Regenerate. * tree-pass.h (make_pass_branch_target_load_optimize1): Delete. (make_pass_branch_target_load_optimize2): Likewise. * passes.def (pass_branch_target_load_optimize1): Likewise. (pass_branch_target_load_optimize2): Likewise. * targhooks.h (default_branch_target_register_class): Likewise. * targhooks.c (default_branch_target_register_class): Likewise. * opt-suggestions.c (test_completion_valid_options): Remove -fbtr-bb-exclusive from the list of test options. * bt-load.c: Remove. From-SVN: r275521
2019-08-20 | Use function_arg_info for TARGET_CALLEE_COPIES | Richard Sandiford | 1 | -5/+3
The hook is passed the unpromoted type mode instead of the promoted mode. The aarch64 definition is redundant, but worth keeping for emphasis. 2019-08-20 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.def (callee_copies): Take a function_arg_info instead of a mode, type and named flag. * doc/tm.texi: Regenerate. * targhooks.h (hook_callee_copies_named): Take a function_arg_info instead of a mode, type and named flag. (hook_bool_CUMULATIVE_ARGS_mode_tree_bool_false): Delete. (hook_bool_CUMULATIVE_ARGS_mode_tree_bool_true): Likewise. (hook_bool_CUMULATIVE_ARGS_arg_info_true): New function. * targhooks.c (hook_callee_copies_named): Take a function_arg_info instead of a mode, type and named flag. (hook_bool_CUMULATIVE_ARGS_mode_tree_bool_false): Delete. (hook_bool_CUMULATIVE_ARGS_mode_tree_bool_true): Likewise. (hook_bool_CUMULATIVE_ARGS_arg_info_true): New function. * calls.h (reference_callee_copied): Take a function_arg_info instead of a mode, type and named flag. * calls.c (reference_callee_copied): Likewise. (initialize_argument_information): Update call accordingly. (emit_library_call_value_1): Likewise. * function.c (gimplify_parameters): Likewise. * config/aarch64/aarch64.c (TARGET_CALLEE_COPIES): Define to hook_bool_CUMULATIVE_ARGS_arg_info_false instead of hook_bool_CUMULATIVE_ARGS_mode_tree_bool_false. * config/c6x/c6x.c (c6x_callee_copies): Delete. (TARGET_CALLEE_COPIES): Define to hook_bool_CUMULATIVE_ARGS_arg_info_true instead. * config/epiphany/epiphany.c (TARGET_CALLEE_COPIES): Define to hook_bool_CUMULATIVE_ARGS_arg_info_true instead of hook_bool_CUMULATIVE_ARGS_mode_tree_bool_true. * config/mips/mips.c (mips_callee_copies): Take a function_arg_info instead of a mode, type and named flag. * config/mmix/mmix.c (TARGET_CALLEE_COPIES): Define to hook_bool_CUMULATIVE_ARGS_arg_info_true instead of hook_bool_CUMULATIVE_ARGS_mode_tree_bool_true. * config/mn10300/mn10300.c (TARGET_CALLEE_COPIES): Likewise. 
* config/msp430/msp430.c (msp430_callee_copies): Delete. (TARGET_CALLEE_COPIES): Define to hook_bool_CUMULATIVE_ARGS_arg_info_true instead. * config/pa/pa.c (pa_callee_copies): Take a function_arg_info instead of a mode, type and named flag. * config/sh/sh.c (sh_callee_copies): Likewise. * config/v850/v850.c (TARGET_CALLEE_COPIES): Define to hook_bool_CUMULATIVE_ARGS_arg_info_true instead of hook_bool_CUMULATIVE_ARGS_mode_tree_bool_true. From-SVN: r274702